
Node coloring

There are several built-in options for coloring graph nodes to give you more information about your pipeline:

| Coloring option | Description |
| --- | --- |
| No color | Removes coloring altogether. |
| Custom color | Lets you select nodes and assign them a color by clicking the Color button. |
| Data Catalog | Colors nodes by the Data Catalog collection they are in. A node that belongs to more than one collection is colored as "In multiple collections". |
| Folder | Colors nodes by the name of the folder the resource is located in. |
| Issues | Colors nodes by the number of Foundry issues assigned to them. This option also lets you filter by issue labels. |
| Permissions | Colors nodes by the level of access the user has to the data or the resource. If you have access to the resources on the graph, this view also lets you choose any Foundry user and view their permissions. |
| Project | Colors nodes by the Foundry Project they are in. |
| Repository | Colors nodes by the code repository used to create them. You can color nodes either by the name of the repository or by its type (for example, Code Repository or Code Workbook). |
| Resource overview | Colors nodes by details of the resource, which generally refer to how the resource was created (for example, in Contour, in Code Workbook, by a Fusion spreadsheet sync, or by upload). |
| Resource type | Colors nodes by Foundry resource type. |
| Build status | Indicates the current build status of each dataset on the graph. If nodes are grouped, the group shows the most severe status. |
| Data Health | Indicates the status of each resource's health checks, with the ability to filter to watched health checks only. If nodes are grouped, the color of the group indicates the most severe health check status in the group. |
| Out-of-date | Indicates whether the data or logic is out of date relative to the dataset's ancestors. "Out-of-date with parent" means a direct parent of the resource has been updated and the resource itself has not yet been updated accordingly. "Out-of-date with ancestor" means the resource is up to date with its direct parents, but a resource further upstream has been updated more recently. This option lets you filter for two types of updates, Data and Logic. "Data out-of-date" means the data was updated in an ancestor and the resource has not yet picked up the update in a build. "Logic out-of-date" means the job specs have changed. |
| Schedule count | Indicates the number of build schedules set on a dataset, with the option to filter out paused schedules. |
| Sync status | If syncs are set on the dataset, indicates the status of the sync. |
| Time last built | Indicates the time elapsed since the dataset was last built successfully. |
| Build duration | Indicates the approximate build duration, based on the most recent successful build of each resource. |
| Files | Colors nodes by file-related metrics: average file size, file count, and dataset size. |
| Row count | Colors nodes by the number of rows in each dataset. If the row count is not available, it can be calculated in the dataset details helper or in the dataset view (Dataset app) in Foundry. |
| Spark usage | Colors each node by Executor run/CPU time in a given period. |
| User views | Colors nodes by the number of user views. |
| Branch | Indicates the currently viewed branch of each node on the graph. |
| Code Status | Indicates the code status of the node or dataset. "CI running" means CI checks are currently running for the node. "CI Failed" means CI checks failed on the node. "Out of date" means the code is out of date for the node. "Unavailable" means the node or dataset is not backed by Stemma or the user lacks permissions. |
| Storage | Indicates where data is stored. This is Foundry unless you are using Virtual Tables. |
| Compute | Indicates the compute engine used by each node on the graph. For transforms run in Foundry, this shows the type of compute used (Spark or Flink, for example). For transforms that use compute pushdown, this indicates the external compute engine used (BigQuery, Databricks, or Snowflake, for example). |
| Transaction type | Indicates each node's transaction type: Append or Snapshot. |
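
The difference between "Out-of-date with parent" and "Out-of-date with ancestor" can be easier to see in a small model. The sketch below is illustrative only and is not how Foundry computes the coloring: it assumes each node carries a single last-updated timestamp, and the names `parents`, `last_updated`, `ancestors`, and `classify` are hypothetical.

```python
from typing import Dict, List, Set


def ancestors(node: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every upstream node reachable from `node` (parents, grandparents, ...)."""
    seen: Set[str] = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def classify(node: str,
             parents: Dict[str, List[str]],
             last_updated: Dict[str, int]) -> str:
    """Classify a node under the simplified, assumed timestamp model."""
    own = last_updated[node]
    direct = parents.get(node, [])
    # A direct parent changed after this node was last updated.
    if any(last_updated[p] > own for p in direct):
        return "out-of-date with parent"
    # Direct parents are not newer, but something further upstream is.
    further_upstream = ancestors(node, parents) - set(direct)
    if any(last_updated[a] > own for a in further_upstream):
        return "out-of-date with ancestor"
    return "up-to-date"


# Hypothetical three-node pipeline: raw -> clean -> report.
parents = {"raw": [], "clean": ["raw"], "report": ["clean"]}
last_updated = {"raw": 5, "clean": 3, "report": 4}

print(classify("clean", parents, last_updated))   # out-of-date with parent
print(classify("report", parents, last_updated))  # out-of-date with ancestor
print(classify("raw", parents, last_updated))     # up-to-date
```

In this example `report` was updated after its direct parent `clean`, so it is up to date with its parent, but `raw` changed later still, which makes `report` out of date with an ancestor.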
