
Regression trainer

The regression trainer trains a number of models, in parallel or sequentially, to determine the best-performing model. It may also use ensembling if it determines that a weighted ensemble of models performs better than any single model. The trainer is built on AutoGluon's TabularPredictor class.

The regression trainer may also perform stacking or bagging to further improve performance. Stacking is an ensembling process in which the predictions produced by the set of models trained on the input data are used to train a further "layer" of models; this process may be repeated. Bagging is an internal optimization method in which each model architecture is trained on multiple random samples of the data and the outputs are then combined in an ensemble. Stacking and bagging are controlled by the Training preset parameter.
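The bagging idea described above can be illustrated with a minimal, self-contained sketch. It is not the trainer's implementation: the helper names (fit_line, fit_bagged, predict_bagged) are hypothetical, a simple least-squares line stands in for a real model architecture, and bootstrap resampling is assumed for illustration, so the trainer's internal sampling scheme may differ.

```python
import random
from statistics import mean


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    if var_x == 0:  # degenerate resample: fall back to a constant model
        return 0.0, y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / var_x
    return slope, y_bar - slope * x_bar


def fit_bagged(points, n_models=5, seed=0):
    """Fit one model per random resample of the data (with replacement)."""
    rng = random.Random(seed)
    return [fit_line(rng.choices(points, k=len(points))) for _ in range(n_models)]


def predict_bagged(models, x):
    """Combine the bagged models by averaging their predictions."""
    return mean(slope * x + intercept for slope, intercept in models)


# Noise-free toy data following y = 2x + 1 (illustrative only).
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
models = fit_bagged(data)
prediction = predict_bagged(models, 10.0)
```

Stacking would go one step further: the predictions of such models become input features for a further layer of models.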

The structure of the model ensemble can be viewed on the output model's experiment page under Plots.

Models trained

Internally, the trainer trains multiple models unless a model is disabled, either by excluding its model type or by omitting it from the hyperparameters field.

The following types of models are available for training:

While some of these models are strictly classification models, they can still be applied in regression when stacking is used.

Parameters

  • Evaluation metric: The metric used to score model performance during training. Different datasets may be better served by different evaluation metrics. Refer to the in-platform documentation for more details. For regression evaluation metrics that measure error (such as MAE, MSE, or RMSE), lower values indicate better performance. For metrics such as R² and Pearson correlation, higher values indicate better performance.
  • Training preset: Training presets are an abstraction above more complex arguments around bagging and stacking.
      • best_quality: Enables stacking and bagging. This should be used when the best possible model is required, even at the cost of training and inference speed.
      • high_quality: Enables stacking and bagging. Training time and inference time should be faster than the best_quality preset, but the model may be slightly less accurate.
      • good_quality: Enables stacking and bagging. This preset has a faster training and inference speed than the previous presets, with decent predictive accuracy.
      • medium_quality: Disables stacking and bagging. This preset has the fastest training time, but moderate predictive accuracy. This should be used for prototyping.
  • Training and inference limits:
      • Training time limit: Optionally set the maximum amount of time spent training. This time limit may be respected on a best effort basis when using presets that enable stacking and bagging. When the time limit is reached, the training portion of the job will end. Note that the job will not finish until model validation is complete and the model has been published.
      • Inference time limit: Optionally set the maximum amount of time that a given model may run during inference. Any model that surpasses this time limit will be skipped. We do not recommend setting this value unless strict inference time limits are required.
  • Excluded model types: Excludes certain model types from being used during training.
  • Prediction column name: An override for the prediction column name. By default, the target column name will be used.
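To make the direction of the evaluation metrics concrete, the following sketch computes MAE, RMSE, and R² by hand and compares two scores under each metric's convention. The function names and the HIGHER_IS_BETTER table are hypothetical helpers for illustration and are not part of the trainer.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: lower is better."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: higher is better."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lookup table: which direction counts as an improvement.
HIGHER_IS_BETTER = {"mae": False, "rmse": False, "r2": True}


def is_improvement(metric, new_score, old_score):
    """Return True if new_score beats old_score under the given metric."""
    if HIGHER_IS_BETTER[metric]:
        return new_score > old_score
    return new_score < old_score


y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
scores = {"mae": mae(y_true, y_pred), "rmse": rmse(y_true, y_pred), "r2": r2(y_true, y_pred)}
```

The same pair of numbers can therefore be an improvement under one metric and a regression under another, which is why scores should only be compared within a single metric.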

Advanced: Stacking configuration

The optional stacking configuration fields allow for deeper control over how the model ensemble is constructed:

  • Fit weighted ensemble: When enabled, a WeightedEnsemble model will be produced at each stacking layer. We recommend enabling this for improved model performance.
  • Fit last ensemble with all models: When enabled, the WeightedEnsemble of the last stacking layer will be fit with models from all previous layers as base models. This value has no effect when using a preset that disables stacking. We recommend enabling this for improved model performance.
  • Fit full weighted ensemble: When enabled, a secondary WeightedEnsemble model will be produced on top of the weighted ensembles produced when fit last ensemble with all models is enabled. This option has no effect if fit last ensemble with all models is disabled, or when using a preset that disables stacking.
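The dependencies between these options and the training preset can be summarized in a short sketch. The function and key names below are hypothetical and only encode the rules stated above; they are not the trainer's actual configuration schema.

```python
# Presets that enable stacking, per the Training preset descriptions.
STACKING_PRESETS = {"best_quality", "high_quality", "good_quality"}


def effective_ensemble_options(
    preset,
    fit_weighted_ensemble=True,
    fit_last_with_all_models=True,
    fit_full_weighted_ensemble=False,
):
    """Resolve which stacking options actually take effect for a preset."""
    stacking_enabled = preset in STACKING_PRESETS
    # "Fit last ensemble with all models" has no effect without stacking.
    last_with_all = fit_last_with_all_models and stacking_enabled
    # "Fit full weighted ensemble" additionally requires the option above.
    full = fit_full_weighted_ensemble and last_with_all
    return {
        "fit_weighted_ensemble": fit_weighted_ensemble,
        "fit_last_with_all_models": last_with_all,
        "fit_full_weighted_ensemble": full,
    }


prototype = effective_ensemble_options("medium_quality", True, True, True)
production = effective_ensemble_options("best_quality", True, True, True)
```

Here prototype shows both stacking-dependent options resolved to off, because medium_quality disables stacking, while production keeps all three enabled.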

Advanced: Hyperparameters

The optional hyperparameter field allows for deeper customization and control over each trained model, with one caveat: when this field is defined, any model not supplied in the hyperparameter field will be ignored. This field overrides the default hyperparameters chosen by AutoGluon and can produce poor results. For this reason, we recommend avoiding this field unless you have consulted the AutoGluon documentation. In the majority of cases, the default hyperparameters provide sufficiently strong results.

In general, the arguments passed here will be sent directly to the underlying model implementation.

Take the example below:

{
    "GBM": [
        {"extra_trees": true}, {}
    ],
    "NN_TORCH": {}
}

These hyperparameters will enforce that the following models are trained:

  1. A GBM model with the argument extra_trees set to true.
  2. A GBM model with default arguments.
  3. A PyTorch model with default arguments.

With the provided hyperparameters, these are the only models that will be trained before any ensembling or stacking operations are applied.
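One way to read the structure of this field is sketched below: each key is a model type, and its value is either a single argument dictionary or a list of them, with one model trained per dictionary. The expand_hyperparameters helper is hypothetical and only mirrors the interpretation given above; it is not how the trainer parses the field internally.

```python
import json


def expand_hyperparameters(hyperparameters):
    """List the (model_type, arguments) pairs implied by the field."""
    planned = []
    for model_type, configs in hyperparameters.items():
        # A bare dictionary means a single model of that type.
        if isinstance(configs, dict):
            configs = [configs]
        for config in configs:
            planned.append((model_type, dict(config)))
    return planned


example = json.loads('{"GBM": [{"extra_trees": true}, {}], "NN_TORCH": {}}')
planned = expand_hyperparameters(example)
for model_type, arguments in planned:
    print(model_type, arguments or "default arguments")
```

An empty dictionary requests a model with default arguments, and any model type absent from the field does not appear in the expanded list at all.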

Outputs

The regression trainer will output a Foundry model that contains the best model as determined by the validation steps. Details about the model can be accessed by navigating to the experiment, which will contain parameters, metrics, and plots that provide insight into the model's performance.

