Classification trainer¶
The classification trainer trains multiple models, in parallel or sequentially, to determine the best-performing one. It may also use ensembling if it determines that a weighted ensemble of models performs better than any single model. The trainer is built on AutoGluon's TabularPredictor class.
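To make the weighted-ensemble idea concrete, the following is a minimal pure-Python sketch. It is not the trainer's or AutoGluon's implementation: the function names, the toy probabilities, and the fixed 0.5/0.5 weights are illustrative assumptions, and a real ensemble would learn its weights from validation data.

```python
from typing import Dict, List, Sequence

Probs = List[Dict[str, float]]  # one {class_label: probability} dict per row


def weighted_ensemble(model_probs: Sequence[Probs], weights: Sequence[float]) -> Probs:
    """Blend per-row class probabilities from several models using fixed weights."""
    total = sum(weights)
    blended: Probs = []
    for rows in zip(*model_probs):  # rows = the same data row as scored by each model
        labels = rows[0].keys()
        blended.append(
            {label: sum(w * r[label] for w, r in zip(weights, rows)) / total for label in labels}
        )
    return blended


def accuracy(probs: Probs, y_true: Sequence[str]) -> float:
    """Fraction of rows whose most probable class matches the true label."""
    hits = sum(max(p, key=p.get) == y for p, y in zip(probs, y_true))
    return hits / len(y_true)


# Hypothetical validation data: two models, each wrong on a different row.
y_val = ["a", "b", "a"]
model_1 = [{"a": 0.9, "b": 0.1}, {"a": 0.6, "b": 0.4}, {"a": 0.8, "b": 0.2}]
model_2 = [{"a": 0.7, "b": 0.3}, {"a": 0.1, "b": 0.9}, {"a": 0.4, "b": 0.6}]

ensemble = weighted_ensemble([model_1, model_2], weights=[0.5, 0.5])
scores = {
    "model_1": accuracy(model_1, y_val),
    "model_2": accuracy(model_2, y_val),
    "ensemble": accuracy(ensemble, y_val),
}
best = max(scores, key=scores.get)  # keep whichever candidate scores highest
```

In this toy case each single model misclassifies one row, while the blended probabilities classify all three correctly, so the ensemble would be the one selected.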
The classification trainer may also apply training techniques such as stacking or bagging to further improve performance. Stacking is an ensembling process in which the predictions of the models trained on the input data are used to train a further "layer" of models; this process may be repeated. Bagging is an internal optimization method in which each model architecture is trained on multiple random samples of the data and the outputs are combined in an ensemble. Stacking and bagging are controlled by the Training preset parameter.
The structure of the model ensemble can be viewed on the output model's experiment page under Plots.
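The bagging idea described above can be sketched in a few lines of standard-library Python. This is an illustration only: the toy one-feature "stump" model, the per-class bootstrap resampling, the majority vote, and all names below are assumptions chosen for brevity, and the sampling scheme the trainer uses internally may differ.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[float, str]  # (feature value, class label)


def fit_stump(rows: Sequence[Row]) -> Callable[[float], str]:
    """Train a toy two-class model: split halfway between the two class means."""
    by_label: Dict[str, List[float]] = {}
    for x, label in rows:
        by_label.setdefault(label, []).append(x)
    (lo_label, lo_mean), (hi_label, hi_mean) = sorted(
        ((label, sum(xs) / len(xs)) for label, xs in by_label.items()),
        key=lambda item: item[1],
    )
    threshold = (lo_mean + hi_mean) / 2
    return lambda x: lo_label if x < threshold else hi_label


def bootstrap(rows: Sequence[Row], rng: random.Random) -> List[Row]:
    """Resample with replacement within each class so both classes stay present."""
    sample: List[Row] = []
    for label in sorted({label for _, label in rows}):
        group = [row for row in rows if row[1] == label]
        sample.extend(rng.choices(group, k=len(group)))
    return sample


def fit_bagged(rows: Sequence[Row], n_models: int = 5, seed: int = 0) -> Callable[[float], str]:
    """Train the same architecture on several random samples and combine by vote."""
    rng = random.Random(seed)
    models = [fit_stump(bootstrap(rows, rng)) for _ in range(n_models)]

    def predict(x: float) -> str:
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


train = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
         (10.0, "high"), (11.0, "high"), (12.0, "high")]
bagged = fit_bagged(train)
```

Calling `bagged(2.0)` or `bagged(11.0)` returns the majority vote of the five stumps. Stacking would go one step further and feed such predictions into another layer of models.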
Models trained¶
Internally, the trainer trains multiple models. A model type is skipped only if it is excluded through the Excluded model types parameter or left out of the hyperparameters field.
The following types of models are available for training:
- GBM: LightGBM gradient boosting model
- CAT: CatBoost model
- XGB: XGBoost model
- RF: Random Forest model
- XT: Extra Trees model
- KNN: KNearestNeighbors model
- LR: Linear model
- NN_TORCH: PyTorch model designed for tabular classification
- FASTAI: fast.ai model designed for tabular classification
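As a rough illustration of how an exclusion list narrows the set of model types above, consider the sketch below. The function name and the choice to raise an error on unknown keys are assumptions for this example; this page does not specify how the trainer handles unrecognized model types.

```python
from typing import Iterable, List

# Model type keys as listed above.
AVAILABLE_MODEL_TYPES = ["GBM", "CAT", "XGB", "RF", "XT", "KNN", "LR", "NN_TORCH", "FASTAI"]


def select_model_types(excluded: Iterable[str] = ()) -> List[str]:
    """Return the model types left to train after applying an exclusion list."""
    excluded_set = set(excluded)
    unknown = excluded_set - set(AVAILABLE_MODEL_TYPES)
    if unknown:
        # Assumption: fail loudly on typos rather than silently ignoring them.
        raise ValueError(f"Unknown model types: {sorted(unknown)}")
    return [m for m in AVAILABLE_MODEL_TYPES if m not in excluded_set]


remaining = select_model_types(excluded=["KNN", "FASTAI"])
```

Here `remaining` keeps the original ordering and contains every key except `KNN` and `FASTAI`.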
Parameters¶
- Evaluation metric: The metric used to score model performance during training. The most suitable metric can vary by dataset. Refer to the in-platform documentation for more details. For all classification evaluation metrics, a higher value indicates better model performance.
- Training preset: Training presets are an abstraction over more complex arguments that control bagging and stacking.
  - best_quality: Enables stacking and bagging. Use this preset when the best possible model is required, even at the cost of training and inference speed.
  - high_quality: Enables stacking and bagging. Training and inference should be faster than with the best_quality preset, but the model may be slightly less accurate.
  - good_quality: Enables stacking and bagging. Training and inference are faster than with the previous presets, with decent predictive accuracy.
  - medium_quality: Disables stacking and bagging. This preset has the fastest training time but only moderate predictive accuracy. Use it for prototyping.
- Training and inference limits:
  - Training time limit: Optionally sets the maximum amount of time spent training. With presets that enable stacking and bagging, this limit may only be respected on a best-effort basis. When the time limit is reached, the training portion of the job ends. Note that the job does not finish until model validation is complete and the model has been published.
  - Inference time limit: Optionally sets the maximum amount of time that a given model may run during inference. Any model that exceeds this limit is skipped. We do not recommend setting this value unless strict inference time limits are required.
- Excluded model types: Excludes certain model types from being used during training.
- Prediction column name: An override for the prediction column name. By default, the target column name will be used.
- Predict probabilities: If true, class probabilities will be returned instead of a single class prediction. Additional output columns will be added following the format {prediction_column_name}_proba_{class_label}, where prediction_column_name is the name of the prediction column and class_label is the class whose probability is being predicted. Note that by default, the prediction column name is the name of the target column.
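The probability column naming rule can be expressed as a short helper. This is a sketch of the documented format only: the function name, the example column and class names, and the ordering of the returned columns are assumptions.

```python
from typing import List, Optional, Sequence


def probability_columns(
    target_column: str,
    class_labels: Sequence[str],
    prediction_column_name: Optional[str] = None,
) -> List[str]:
    """Build {prediction_column_name}_proba_{class_label} names for each class."""
    # By default, the prediction column name falls back to the target column name.
    name = prediction_column_name or target_column
    return [f"{name}_proba_{label}" for label in class_labels]


# Hypothetical target column "churned" with classes "yes" and "no".
default_names = probability_columns("churned", ["yes", "no"])
override_names = probability_columns("churned", ["yes", "no"], prediction_column_name="pred")
```

With no override, the names are derived from the target column (`churned_proba_yes`, `churned_proba_no`); with the override, they are derived from `pred` instead.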
Advanced: Stacking configuration¶
The optional stacking configuration fields allow for deeper control over how the model ensemble is constructed:
- Fit weighted ensemble: When enabled, a WeightedEnsemble model will be produced at each stacking layer. We recommend enabling this for improved model performance.
- Fit last ensemble with all models: When enabled, the WeightedEnsemble of the last stacking layer will be fit with models from all previous layers as base models. This value has no effect when using a preset that disables stacking. We recommend enabling this for improved model performance.
- Fit full weighted ensemble: When enabled, a secondary WeightedEnsemble model will be produced on top of the weighted ensembles produced when Fit last ensemble with all models is enabled. This option has no effect if Fit last ensemble with all models is disabled, or when using a preset that disables stacking.
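The dependencies between these options and the training preset can be summarized in code. The sketch below only encodes the "has no effect" rules stated above; the class, field, and function names, as well as the default values, are hypothetical and are not the platform's API.

```python
from dataclasses import dataclass

# Which presets enable stacking, per the Training preset descriptions.
PRESET_ENABLES_STACKING = {
    "best_quality": True,
    "high_quality": True,
    "good_quality": True,
    "medium_quality": False,
}


@dataclass(frozen=True)
class StackingConfig:
    # Defaults here are assumptions for illustration only.
    fit_weighted_ensemble: bool = True
    fit_last_ensemble_with_all_models: bool = True
    fit_full_weighted_ensemble: bool = False


def effective_config(preset: str, config: StackingConfig) -> StackingConfig:
    """Return the options that actually take effect for the given preset."""
    stacking = PRESET_ENABLES_STACKING[preset]
    # "Fit last ensemble with all models" has no effect without stacking.
    last_with_all = config.fit_last_ensemble_with_all_models and stacking
    # "Fit full weighted ensemble" additionally requires the option above.
    full = config.fit_full_weighted_ensemble and last_with_all
    return StackingConfig(config.fit_weighted_ensemble, last_with_all, full)


requested = StackingConfig(True, True, True)
with_stacking = effective_config("best_quality", requested)
without_stacking = effective_config("medium_quality", requested)
```

Under `medium_quality`, only `fit_weighted_ensemble` remains active in this sketch, because the other two options depend on stacking.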
Advanced: Hyperparameters¶
The optional hyperparameter field allows for deeper customization and control over each trained model, with one caveat: when this field is defined, any model not supplied in it will be ignored. This field overrides the default hyperparameters chosen by AutoGluon and can produce poor results. For this reason, we recommend avoiding this field unless you have consulted the AutoGluon documentation. In the majority of cases, the default hyperparameters provide strong enough results.
In general, the arguments passed here will be sent directly to the underlying model implementation.
Take the example below:
    {
      "GBM": [
        {"extra_trees": true},
        {}
      ],
      "NN_TORCH": {}
    }
These hyperparameters will enforce that the following models are trained:
- A GBM model with the argument extra_trees set to true.
- A GBM model with default arguments.
- A PyTorch model with default arguments.
With the provided hyperparameters, these are the only models that will be trained before any ensembling or stacking operations are applied.
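One way to read the hyperparameter field is as a mapping that expands into one entry per model to train. The sketch below shows that expansion for the example above; the function name and the returned tuple shape are assumptions for illustration, not how the trainer represents models internally.

```python
import json
from typing import Any, Dict, List, Tuple, Union

Spec = Union[Dict[str, Any], List[Dict[str, Any]]]


def expand_hyperparameters(hyperparameters: Dict[str, Spec]) -> List[Tuple[str, Dict[str, Any]]]:
    """Flatten the field into one (model type, arguments) pair per model to train."""
    models: List[Tuple[str, Dict[str, Any]]] = []
    for model_type, spec in hyperparameters.items():
        # A single dict means one model; a list means one model per entry.
        configs = spec if isinstance(spec, list) else [spec]
        for args in configs:
            models.append((model_type, dict(args)))  # empty dict = default arguments
    return models


raw = '{"GBM": [{"extra_trees": true}, {}], "NN_TORCH": {}}'
models = expand_hyperparameters(json.loads(raw))
```

The result contains exactly three entries: two GBM configurations and one NN_TORCH configuration. Model types missing from the mapping, such as CAT or XGB, never appear.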
Outputs¶
The classification trainer will output a Foundry model that contains the best model as determined by the validation steps. Details about the model can be accessed by navigating to the experiment, which will contain parameters, metrics, and plots that provide insight into the model's performance.