Time series forecasting trainer¶

The time series forecasting trainer trains multiple models, in parallel or sequentially, to determine the best performing model. It may also use ensembling if it determines that a weighted ensemble of models performs better than any single model. The trainer is built using AutoGluon's TimeSeriesPredictor class.

The time series forecasting trainer relies on accurate historical time series data to predict future data across a number of pre-defined lookahead steps. The structure of the model ensemble can be viewed on the output model's experiment page under Plots.
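
The weighted ensemble mentioned above can be pictured as a weighted average of the individual model forecasts. The following is a minimal sketch under that assumption, not the trainer's actual implementation; the function name, model forecasts, and weights are hypothetical.

```python
from typing import Dict, List


def weighted_ensemble(
    forecasts: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Combine per-model forecasts into one forecast using normalized weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    horizon = len(next(iter(forecasts.values())))
    combined = [0.0] * horizon
    for name, values in forecasts.items():
        if len(values) != horizon:
            raise ValueError(f"forecast for {name} has the wrong length")
        # Models without a weight contribute nothing to the ensemble.
        share = weights.get(name, 0.0) / total
        for step, value in enumerate(values):
            combined[step] += share * value
    return combined


# Hypothetical forecasts from two models over a 3-step horizon.
forecasts = {"Theta": [10.0, 12.0, 14.0], "DeepAR": [14.0, 16.0, 18.0]}
weights = {"Theta": 0.75, "DeepAR": 0.25}
ensemble = weighted_ensemble(forecasts, weights)
print(ensemble)  # [11.0, 13.0, 15.0]
```

An ensemble like this is only selected when it scores better on validation data than every individual model.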

Models trained¶

Internally, the trainer trains multiple models unless a model is disabled, either by excluding its model type or by restricting training with hyperparameters.

The following types of models are available for training:

AutoETS: Automatically tuned exponential smoothing model.
DeepAR: Autoregressive forecasting model.
DirectTabular: Tabular regression model (predicts all values at once).
ETS: Naive exponential smoothing model.
NPTS: Non-parametric time series forecasting model.
Naive: Baseline model that sets the value at timestamp t equal to the observed value at t-1.
PatchTST: Transformer-based time series forecasting model.
RecursiveTabular: Similar to DirectTabular, but predicts values one by one.
SeasonalNaive: Baseline model that sets the value at timestamp t equal to the last observed value from the same season (for example, the same weekday of the previous week for daily data with a weekly season).
TemporalFusionTransformer: Deep-learning model that combines an LSTM layer and transformer to predict quantiles for future values.
Theta: Theta forecasting model.
TiDE: Time series dense encoder model.
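
To make the two baseline models concrete, here is a minimal sketch of the Naive and SeasonalNaive ideas in plain Python. The function names and the fallback behavior for short histories are assumptions made for illustration; they are not taken from the trainer or from AutoGluon.

```python
from typing import List


def naive_forecast(history: List[float], horizon: int) -> List[float]:
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(
    history: List[float], horizon: int, season_length: int
) -> List[float]:
    """Repeat the last full season of observations across the horizon."""
    if len(history) < season_length:
        # Assumption: fall back to the plain naive forecast
        # when a full season of history is unavailable.
        return naive_forecast(history, horizon)
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


history = [5.0, 9.0, 7.0, 6.0, 10.0, 8.0]
print(naive_forecast(history, 3))              # [8.0, 8.0, 8.0]
print(seasonal_naive_forecast(history, 4, 3))  # [6.0, 10.0, 8.0, 6.0]
```

Baselines like these are useful reference points: a more complex model that cannot beat them on validation data is rarely worth its training cost.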

Datasets¶

The time series trainer requires more advanced configuration for input datasets than the regression or classification trainers. The regression and classification trainers only require a training dataset, while the time series forecasting trainer requires a training dataset and a corresponding static dataset.

Training dataset¶

The training dataset should be a time series dataset with at minimum a timestamp column and a target value column. The target value column is what the model will predict.

Additionally, the training dataset accepts column mappings for the following options:

Item ID: The item ID column mapping is useful when you have multiple time series tracked in a single dataset. For example, a dataset containing a history of sales may contain an item ID for the product, such as an SKU, allowing a model to be trained to predict individual items independently of each other.
Known covariates: Known covariates are columns that are known throughout the entire forecast horizon. For example, columns could indicate the following:
Holidays and weekends: A column could indicate whether a given day is a holiday or a weekend, which is known in advance for the entire forecast horizon.
Event dates: A column could indicate that a certain event, such as a sporting event, takes place on a given day.

The remaining columns are treated as past covariates, which are values that fluctuate and are not known in advance, such as temperature or sales of similar products.
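
As an illustration of why a weekend flag qualifies as a known covariate, the sketch below derives it from the calendar alone, so it can be filled in for future dates as easily as for past ones. The function name, the is_weekend column name, and the 4-day horizon are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def weekend_covariate(start: date, num_days: int) -> List[Dict[str, object]]:
    """Build one row per day with an is_weekend flag (Saturday=5, Sunday=6)."""
    rows = []
    for offset in range(num_days):
        day = start + timedelta(days=offset)
        rows.append({"date": day.isoformat(), "is_weekend": day.weekday() >= 5})
    return rows


# Three historical days followed by a hypothetical 4-day forecast horizon.
# The flag is available for all seven days before any forecast is made.
rows = weekend_covariate(date(2025, 9, 1), 7)
for row in rows:
    print(row["date"], row["is_weekend"])
```

A column such as temperature cannot be produced this way, because its future values are not known at prediction time, which is why it would be treated as a past covariate.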

Static dataset¶

The static dataset contains metadata that are time-independent.

This may include data such as:

Store locations
Customer segment age group
Pricing tier or subscription level

The static dataset must be formatted in such a way that each item ID as mapped in the training dataset corresponds to a single row in the static dataset.

If you have a training dataset that looks like the following:

date	item_id	sales
2025-09-01	STORE_1	14
2025-09-01	STORE_2	7
2025-09-02	STORE_1	16
2025-09-02	STORE_2	9
2025-09-03	STORE_1	13

The static dataset may look like the following, containing some useful metadata about the values in item_id:

item_id	zip_code	store_class
STORE_1	10016	LARGE
STORE_2	90210	POP_UP

Static data will then be associated with the training dataset and optional testing dataset, joined by the item_id column.
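
The association described above behaves like a join on item_id. The following sketch reproduces it for the two example tables in plain Python and checks the one-row-per-item requirement. It is an illustration only; the join_static helper is hypothetical and the trainer performs this step internally.

```python
from typing import Dict, List

training_rows = [
    {"date": "2025-09-01", "item_id": "STORE_1", "sales": 14},
    {"date": "2025-09-01", "item_id": "STORE_2", "sales": 7},
    {"date": "2025-09-02", "item_id": "STORE_1", "sales": 16},
    {"date": "2025-09-02", "item_id": "STORE_2", "sales": 9},
    {"date": "2025-09-03", "item_id": "STORE_1", "sales": 13},
]
static_rows = [
    {"item_id": "STORE_1", "zip_code": "10016", "store_class": "LARGE"},
    {"item_id": "STORE_2", "zip_code": "90210", "store_class": "POP_UP"},
]


def join_static(
    training: List[Dict], static: List[Dict], key: str = "item_id"
) -> List[Dict]:
    """Attach static metadata to each training row, matching on the key column."""
    lookup: Dict[str, Dict] = {}
    for row in static:
        # Each item ID must map to exactly one static row.
        if row[key] in lookup:
            raise ValueError(f"duplicate static row for {row[key]}")
        lookup[row[key]] = row
    missing = {row[key] for row in training} - set(lookup)
    if missing:
        raise ValueError(f"no static row for: {sorted(missing)}")
    return [{**row, **lookup[row[key]]} for row in training]


joined = join_static(training_rows, static_rows)
print(joined[1])
```

Running a check like this before training can surface duplicate or missing item IDs early.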

Parameters¶

Forecast horizon length: The number of future time steps to predict during inference.
Evaluation metric: The metric used to score the models during training. Different datasets may be better served by different evaluation metrics. Refer to the in-platform documentation for more details. A higher score always indicates better model performance; all evaluation metrics follow this pattern.
Time limit: Optionally set the maximum amount of time spent training. When the time limit is reached, the training portion of the job will end. Note that the job will not finish until model validation is complete and the model has been published.
Training preset: Training presets are an abstraction above more complex arguments around bagging and stacking.
best_quality: Uses a mix of statistical, deep-learning, and classical machine learning models with a longer validation step and multiple backtests. This preset will produce the most accurate model, but it may take a while to train.
high_quality: Uses a mix of statistical, deep-learning, and classical machine learning models. This preset will produce more accurate models than the medium_quality preset, but it will be slower to train.
medium_quality: Uses a mix of statistical and classical machine learning models, as well as TemporalFusionTransformer. This preset produces a decently performing model with a faster training time.
fast_training: Uses only simple statistical and tree-based models. This preset offers fast training that may not be accurate and is only recommended for prototyping.
Quantile levels: Quantile levels specify an extra set of quantiles (0.1, 0.2, 0.9, etc.) to add to the prediction output to provide a probabilistic forecast. A value predicted for the 0.1 quantile means that the actual value is expected to fall below that predicted value 10% of the time.
Resample options: Options for resampling the time series to a regular frequency.
Frequency alias: An alias for the data frequency. When set to auto, the trainer will attempt to infer the frequency of the time series data. Otherwise, each option represents a common timestamp frequency. For example, B means one timestamp per business day, W means weekly, and so on. These values are based on pandas offset aliases. The selected alias is used when resampling to align timestamps to a common frequency.
Numeric aggregation method: When resampling, several values may fall on the same new timestamp and must be combined into one. This controls how non-categorical values are aggregated across timestamps before being resampled to the new timestamps.
Categorical aggregation method: Same as the numeric aggregation method, but for categorical values.
Missing value fill strategy: Influences how to fill missing values in the target column.
Enable ensemble: When true, a weighted ensemble of all trained models will be fit and compared against individual models before determining the best model.
Excluded model types: Excludes certain model types from being used during training.
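
The resample options interact as follows: values are first aggregated onto the chosen frequency, and any gaps left in the target column are then filled. The sketch below illustrates this for a daily frequency, mean aggregation, and forward fill. These choices and the function names are assumptions made for the example, and the code does not reflect the trainer's internal implementation.

```python
from datetime import date, datetime, timedelta
from statistics import mean
from typing import Dict, List, Optional, Tuple

Series = List[Tuple[str, Optional[float]]]


def resample_daily(observations: List[Tuple[str, float]]) -> Series:
    """Aggregate timestamped values to one mean value per calendar day."""
    buckets: Dict[date, List[float]] = {}
    for timestamp, value in observations:
        day = datetime.fromisoformat(timestamp).date()
        buckets.setdefault(day, []).append(value)
    resampled: Series = []
    current, last = min(buckets), max(buckets)
    while current <= last:
        values = buckets.get(current)
        # Days without observations become gaps (None) to be filled afterwards.
        resampled.append((current.isoformat(), mean(values) if values else None))
        current += timedelta(days=1)
    return resampled


def forward_fill(series: Series) -> Series:
    """Replace each gap with the most recent non-missing value."""
    filled: Series = []
    previous: Optional[float] = None
    for timestamp, value in series:
        if value is None:
            value = previous
        filled.append((timestamp, value))
        previous = value
    return filled


observations = [
    ("2025-09-01T08:00:00", 10.0),
    ("2025-09-01T20:00:00", 14.0),
    ("2025-09-03T09:30:00", 9.0),
]
daily = forward_fill(resample_daily(observations))
print(daily)
# [('2025-09-01', 12.0), ('2025-09-02', 12.0), ('2025-09-03', 9.0)]
```

Here the two readings on 2025-09-01 are averaged to 12.0, and the missing 2025-09-02 inherits that value through the forward fill.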

Advanced: Hyperparameters¶

The optional hyperparameter field allows for deeper customization and control over each trained model, with one caveat: when this field is defined, any model not supplied in the hyperparameter field will be ignored. This field overrides the default hyperparameters chosen by AutoGluon and can produce poor results. For this reason, we recommend avoiding this field unless you have consulted the AutoGluon documentation. In the majority of cases, the default hyperparameters provide strong enough results.

In general, the arguments passed here will be sent directly to the underlying model implementation.

Take the example below:

{
    "DeepAR": {},
    "Theta": [
        {"decomposition_type": "additive"},
        {"seasonal_period": 1}
    ]
}

These hyperparameters will enforce that the following models are trained:

A DeepAR model.
A Theta model with decomposition_type set to additive.
A Theta model with seasonal_period set to 1.

With the provided hyperparameters, these are the only models that will be trained before any ensembling operations are applied.
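
The way the example expands into three models can be sketched as below: a single dictionary requests one model of that type, while a list requests one model per entry. The expand_model_configs helper is hypothetical and written only to illustrate this reading of the field; the actual expansion is handled by AutoGluon.

```python
from typing import Any, Dict, List, Tuple, Union

ModelArgs = Dict[str, Any]
Hyperparameters = Dict[str, Union[ModelArgs, List[ModelArgs]]]


def expand_model_configs(
    hyperparameters: Hyperparameters,
) -> List[Tuple[str, ModelArgs]]:
    """List every (model type, arguments) pair requested by the field."""
    configs: List[Tuple[str, ModelArgs]] = []
    for model_type, spec in hyperparameters.items():
        # A single dict requests one model; a list requests one model per entry.
        entries = spec if isinstance(spec, list) else [spec]
        for args in entries:
            configs.append((model_type, dict(args)))
    return configs


hyperparameters = {
    "DeepAR": {},
    "Theta": [
        {"decomposition_type": "additive"},
        {"seasonal_period": 1},
    ],
}
configs = expand_model_configs(hyperparameters)
for model_type, args in configs:
    print(model_type, args)
```

Model types that do not appear as keys, such as AutoETS or TiDE in this example, produce no entries and are therefore not trained.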

Outputs¶

The time series forecasting trainer will output a Foundry model that contains the best model as determined by the validation steps. Details about the model can be accessed by navigating to the experiment, which will contain parameters, metrics, and plots that provide insight into the model's performance.

You can visualize and analyze the forecasts as time series using Foundry applications like Quiver, Vertex, and Workshop. Learn more about using time series in Foundry.
