Common scheduling configurations

Get started with some examples of common schedules:

Build datasets regularly

For this example, we want raw_taxi (cleaned) to update every weekday at 9 AM, and we want to build not just raw_taxi (cleaned) but also all of its upstream dependencies. We should configure our schedule as follows:

[Image: time-based schedule configuration, full page]
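To make the "every weekday at 9 AM" rule concrete, the following is a minimal Python sketch that computes the next time such a schedule would fire. The function name `next_weekday_run` is hypothetical and only illustrates the timing rule; it is not part of the scheduling product. In standard five-field cron notation the same rule would be written `0 9 * * 1-5`, though this page does not state whether the schedule editor accepts cron syntax.

```python
from datetime import datetime, timedelta


def next_weekday_run(after: datetime, hour: int = 9) -> datetime:
    """Return the first weekday occurrence of `hour`:00 strictly after `after`.

    Hypothetical helper for illustration only.
    """
    candidate = after.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= after:
        # Today's slot has already passed, so start from tomorrow.
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    # Friday 5 January 2024 at 10:00: the next run is Monday 8 January at 09:00.
    print(next_weekday_run(datetime(2024, 1, 5, 10, 0)))
```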

Build datasets when new data is available

For this example, we want the schedule to run whenever another dataset has been updated. We can use the same configuration as in the previous section, with one small modification: choose an event trigger, then select the dataset(s) on the graph whose updates should trigger the schedule.

[Image: event trigger configured to run when datasets update]

For more details on event-based schedules, see the event triggers documentation.
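As a rough illustration of an event-based schedule, the sketch below filters a stream of dataset updates down to the ones that would start a run. The function `event_triggered_runs`, the tuple layout, and the dataset name `other_dataset` are assumptions made for this example; they do not describe the product's internal model.

```python
from datetime import datetime


def event_triggered_runs(updates, watched):
    """Return the timestamps at which the schedule would run.

    `updates` is a list of (timestamp, dataset_name) pairs and `watched` is
    the set of dataset names selected in the event trigger (both hypothetical).
    """
    return [ts for ts, name in sorted(updates) if name in watched]


if __name__ == "__main__":
    updates = [
        (datetime(2024, 1, 8, 7, 30), "raw_taxi"),
        (datetime(2024, 1, 8, 11, 0), "other_dataset"),
        (datetime(2024, 1, 9, 7, 45), "raw_taxi"),
    ]
    # Only updates to the watched dataset start a run.
    print(event_triggered_runs(updates, {"raw_taxi"}))
```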

Advanced (multiple) trigger configurations

[Images: "Any of these triggers" configuration and the equivalent advanced OR configuration]

For this example, we want Dataset D to update at 9 AM daily and also whenever the dataset it depends on, Parent A, changes. According to the table of combinations for compound triggers, combining a time-based trigger with an event-based trigger through an OR builds the dataset at time T as well as whenever event E occurs. We therefore set Dataset D as the dataset to build, then add a time-based trigger for 9 AM and an event-based trigger that watches for any update on Parent A. In this case, choosing "Any of these triggers" is equivalent to using an advanced configuration with an OR between the two conditions.
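The OR behavior described above can be sketched in a few lines of Python: the schedule runs at every time tick T and at every event E. The function name `or_schedule_runs` and the sample timestamps are hypothetical and serve only to illustrate the combination rule.

```python
from datetime import datetime


def or_schedule_runs(time_ticks, event_times):
    """Return all run times under OR semantics: every tick T plus every event E."""
    return sorted(set(time_ticks) | set(event_times))


if __name__ == "__main__":
    ticks = [datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 9, 9, 0)]  # 9 AM daily
    parent_a_updates = [datetime(2024, 1, 8, 14, 30)]
    # Dataset D builds at both 9 AM ticks and again when Parent A updates at 14:30.
    for run in or_schedule_runs(ticks, parent_a_updates):
        print(run)
```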

Update a dataset at a specific time only if its parent has been updated

[Images: "All of these triggers" configuration and the equivalent advanced AND configuration]

For this example, we want Dataset D to update at 9 AM daily, but only if the dataset it depends on, Parent A, has changed. According to the table of combinations for compound triggers, combining a time-based trigger with an event-based trigger through an AND builds the dataset at time T only if event E has previously occurred. We therefore set Dataset D as the dataset to build, then add a time-based trigger for 9 AM and an event-based trigger that watches for any update on Parent A. In this case, choosing "All of these triggers" is equivalent to using an advanced configuration with an AND between the two conditions.

:::callout{theme="neutral"} This configuration does not limit the time window in which Parent A has been updated. Whether it was updated at 8:55 AM on the same day or at 9:10 AM the day before, the event-based trigger will evaluate to TRUE at 9 AM, causing all criteria to be met and the schedule to run. This means that if Parent A is consistently updating after 9 AM, e.g. at 9:10 AM every day, then Dataset D will be built daily at 9 AM, with data from Parent A that is 23 hours and 50 minutes old. :::
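The staleness scenario in the callout can be reproduced with a small simulation. This is a sketch under one explicit assumption: the event condition is cleared after each run, so Parent A must update again before the next 9 AM tick can fire. The function name `and_schedule_runs` is hypothetical.

```python
from datetime import datetime, timedelta


def and_schedule_runs(time_ticks, event_times):
    """Return (run_time, age_of_latest_parent_update) pairs under AND semantics.

    A tick fires only if at least one event occurred at or before the tick and
    after the previous run (assumption: the event condition resets on each run).
    """
    runs = []
    last_run = None
    for tick in sorted(time_ticks):
        pending = [
            e for e in event_times
            if e <= tick and (last_run is None or e > last_run)
        ]
        if pending:
            runs.append((tick, tick - max(pending)))
            last_run = tick
    return runs


if __name__ == "__main__":
    ticks = [datetime(2024, 1, 9, 9, 0), datetime(2024, 1, 10, 9, 0)]
    # Parent A updates at 9:10 AM every day, just after the 9 AM tick.
    parent_a_updates = [datetime(2024, 1, 8, 9, 10), datetime(2024, 1, 9, 9, 10)]
    for run_time, age in and_schedule_runs(ticks, parent_a_updates):
        print(run_time, "parent data age:", age)
```

With these inputs, each run reports a parent data age of 23 hours and 50 minutes, matching the callout. If fresher data matters, one option is to move the time-based trigger to shortly after Parent A's usual update time.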

