
Trained model node

The trained model node lets you run user-defined machine learning models, trained either inside or outside of Foundry, directly within a Pipeline Builder pipeline. ML teams and no-code users can add model inference to their data pipelines without writing any code.

Getting started

1. Configure your pipeline

Ensure you are working with a Spark (batch) pipeline and that Warm pool is set to OFF. Create a new pipeline if your existing one is not configured to use Spark (batch) mode.

Check if you are working with a batch pipeline with warm pool disabled.

2. Import your model

Navigate to Reusables > Trained models in the import menu and follow the resource import flow to make your model available to the pipeline.

Trained models import menu.

3. Add the model node

Select a node on your pipeline canvas, then select Trained model to insert the model node.

:::callout{theme="neutral"} The Trained model option only appears after you have imported at least one model into the pipeline following step 2. :::

Selecting the trained model node on the pipeline canvas.

4. Configure inputs and outputs

Map your input and output columns to the model's expected API schema.

  • Input columns support expressions, not just direct column mappings. You can apply casts, transformations, and other expressions to your input data before it is passed to the model.
  • Output columns can be aliased in the result dataset. This allows you to rename model output fields without needing to modify the model itself.

Input and output column mapping for the trained model node.
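The mapping in step 4 can be pictured with a small plain-Python sketch. It is not Pipeline Builder's implementation: the column names, the `toy_model` function, and the two mapping dictionaries are hypothetical stand-ins, and the sketch returns only the aliased model outputs.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical input mapping: model input column -> expression over a pipeline row.
INPUT_EXPRESSIONS: Dict[str, Callable[[Row], Any]] = {
    "age": lambda row: int(row["age_text"]),             # cast string -> integer
    "income_k": lambda row: row["income_usd"] / 1000.0,  # simple transformation
}

# Hypothetical output aliases: model output column -> name in the result dataset.
OUTPUT_ALIASES: Dict[str, str] = {"prediction": "churn_score"}


def toy_model(model_input: Row) -> Row:
    """Stand-in for a real model; returns one output column named 'prediction'."""
    return {"prediction": 0.01 * model_input["age"] + 0.001 * model_input["income_k"]}


def run_node(rows: List[Row]) -> List[Row]:
    """Evaluate input expressions, call the model, then rename the outputs."""
    results = []
    for row in rows:
        model_input = {col: expr(row) for col, expr in INPUT_EXPRESSIONS.items()}
        model_output = toy_model(model_input)
        results.append({OUTPUT_ALIASES.get(col, col): val
                        for col, val in model_output.items()})
    return results


result = run_node([{"age_text": "40", "income_usd": 50000.0}])
print(result)
```

The model itself still sees `age` and `income_k` and still emits `prediction`; only the node-level mapping changes what the pipeline supplies and what the result dataset calls the output.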

Supported models

Currently, only models with a single tabular input and a single tabular output in their model API are supported. The model must have at least one required input column.

When using the model node, your model must return exactly the columns defined in the model's API. Additional columns not defined in the model API will be dropped, whereas columns that are missing may result in build errors.
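As a rough illustration of this rule, the following sketch conforms a single output row to a list of API columns. The function name and row representation are hypothetical, and the sketch always raises on a missing column, which is stricter than the "may result in build errors" behavior described above.

```python
from typing import Any, Dict, List


def conform_to_api(model_output: Dict[str, Any], api_columns: List[str]) -> Dict[str, Any]:
    """Keep only the columns declared in the model API; fail if any are missing."""
    missing = [col for col in api_columns if col not in model_output]
    if missing:
        # Assumption: this sketch treats every missing column as an error.
        raise ValueError(f"model output is missing API columns: {missing}")
    # Columns not listed in the API are dropped here.
    return {col: model_output[col] for col in api_columns}


conformed = conform_to_api({"score": 0.9, "debug_info": "x"}, ["score"])
print(conformed)  # {'score': 0.9}
```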

:::callout{theme="warning"} Timeseries models may produce unexpected results. Inference runs independently on each partition of your data, so models requiring grouped or sequenced data may operate on fragmented batches if data is randomly or incorrectly partitioned. :::
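To see why partitioning matters, consider a toy sequence-dependent model that computes a running mean over whatever rows it receives. The model and the way the data is split are illustrative assumptions, not a description of how Spark partitions your data.

```python
from typing import List


def running_mean_model(batch: List[float]) -> List[float]:
    """Toy sequence-dependent model: running mean over the rows in one batch."""
    out, total = [], 0.0
    for count, value in enumerate(batch, start=1):
        total += value
        out.append(total / count)
    return out


series = [1.0, 2.0, 3.0, 4.0]

# Whole series in one partition: the model sees the full history.
whole = running_mean_model(series)

# Same rows split across two partitions: each call sees only its own fragment.
fragmented = running_mean_model(series[:2]) + running_mean_model(series[2:])

print(whole)       # [1.0, 1.5, 2.0, 2.5]
print(fragmented)  # [1.0, 1.5, 3.0, 3.5]
```

The last two values differ because the second fragment restarts the running mean with no knowledge of the earlier rows.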

Support for additional API types and timeseries models is planned for a future release.

Supported column types

This feature supports models whose API includes the following data types for tabular input and output columns:

  • Primitives: string, boolean, integer, long, float, double, date, timestamp
  • Complex: array, map, struct (including nested fields)

Unsupported types such as objectSet are rejected at validation time.
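The type rules above can be expressed as a recursive check. This is only a sketch: the dictionary-based type description is a hypothetical representation, and requiring every nested element, key, value, and field type to be supported is an assumption based on the "including nested fields" wording.

```python
from typing import Any, Dict

PRIMITIVES = {"string", "boolean", "integer", "long",
              "float", "double", "date", "timestamp"}


def is_supported(col_type: Dict[str, Any]) -> bool:
    """Return True if the (hypothetical) type description uses only supported types."""
    kind = col_type["type"]
    if kind in PRIMITIVES:
        return True
    if kind == "array":
        return is_supported(col_type["element"])
    if kind == "map":
        return is_supported(col_type["key"]) and is_supported(col_type["value"])
    if kind == "struct":
        return all(is_supported(field) for field in col_type["fields"].values())
    return False  # for example, objectSet


nested = {"type": "struct", "fields": {
    "tags": {"type": "array", "element": {"type": "string"}},
    "seen_at": {"type": "timestamp"},
}}
print(is_supported(nested))                 # True
print(is_supported({"type": "objectSet"}))  # False
```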

Resource configuration & compute cost

Models run as isolated sidecar processes alongside your Spark executors, each with their own dedicated resources. The default resource allocation per model sidecar is:

| Resource | Default |
| --- | --- |
| CPU | 1 core |
| Memory | 8 GB |
| GPU | None |

If you experience slow builds, you can take either or both of the following actions:

  1. Increase the compute profile for your pipeline.
  2. Increase the resources allocated to your model.

To adjust your model's resources, open the Model import window (Reusables > Trained models) and select Configure resources:

Trained model import window with configure resource options.

This opens the configuration panel where you can adjust CPU, memory, and GPU allocation, as well as the model startup timeout override under the Advanced configuration section:

Model configuration dialog with resource allocation and advanced configuration sections.

:::callout{theme="neutral"} Resource configurations are set per Pipeline Builder pipeline, not per model node. All nodes of the same model within a build/pipeline share the same resource configuration. :::

Keep in mind that this sidecar process isolation comes with additional compute overhead compared to standard transforms. A sidecar is launched on every Spark executor and the driver for each model — see the diagram below for how this scales.

Diagram showing how model sidecars scale across Spark executors and the driver.

:::callout{theme="neutral"} If a model is imported into a pipeline but not used as a node, no sidecar is spun up and no additional cost is incurred. :::
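The scaling described above lends itself to a back-of-the-envelope estimate. The sketch below assumes one sidecar per model on each executor plus one per model on the driver, and counts each distinct model that is used as a node once; the function and field names are hypothetical, and actual billing may differ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SidecarResources:
    """Per-sidecar allocation; defaults follow the documented defaults."""
    cpu_cores: float = 1.0
    memory_gb: float = 8.0
    gpus: int = 0


def sidecar_footprint(num_executors: int, models_in_use: int,
                      per_sidecar: SidecarResources = SidecarResources()) -> Dict[str, float]:
    """Estimate total sidecar resources for one build."""
    # One sidecar per model on every executor, plus one per model on the driver.
    sidecars = (num_executors + 1) * models_in_use
    return {
        "sidecars": sidecars,
        "cpu_cores": sidecars * per_sidecar.cpu_cores,
        "memory_gb": sidecars * per_sidecar.memory_gb,
        "gpus": sidecars * per_sidecar.gpus,
    }


footprint = sidecar_footprint(num_executors=4, models_in_use=2)
print(footprint)
```

With 4 executors and 2 models at the default allocation, the estimate is 10 sidecars, 10 CPU cores, and 80 GB of memory on top of the Spark resources. With `models_in_use=0`, every total is zero, matching the note that imported but unused models incur no extra cost.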

Execution modes

| Execution mode | Supported |
| --- | --- |
| Batch (Spark) | ✅ |
| Streaming | ❌ (planned) |
| Faster (Lightweight/DataFusion) | ❌ (planned) |
| Preview | ❌ (planned) |

Auto-upgrades & branching

Pipeline Builder always uses the latest available version of a model on the build's branch. For example, builds on master will always use the latest model version published to master. If the model version is not found on the build's branch, the system falls back to configured fallback branches, typically the master branch unless otherwise configured. This allows machine learning teams to retrain and publish new model versions with confidence that downstream builds will automatically pick up the latest version.

:::callout{theme="warning"} Models do not yet support Global Branches. When a pipeline uses a Global Branch, model version resolution falls back to the configured fallback branches, defaulting to master if no fallback branches are configured. :::

:::callout{theme="neutral"} Static version pinning is planned as a future feature. All builds currently use the latest available model version. :::
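The resolution order can be sketched as a simple lookup. The data structure, function name, and version identifiers are hypothetical, and the sketch assumes each branch's versions are listed oldest to newest.

```python
from typing import Dict, List, Optional, Sequence


def resolve_model_version(versions_by_branch: Dict[str, List[str]],
                          build_branch: str,
                          fallback_branches: Sequence[str] = ("master",)) -> Optional[str]:
    """Return the latest version on the build branch, else on the first fallback that has one."""
    for branch in [build_branch, *fallback_branches]:
        versions = versions_by_branch.get(branch)
        if versions:
            return versions[-1]  # assumption: the last entry is the newest
    return None


published = {"master": ["v1", "v2"], "feature-x": ["v3"]}
print(resolve_model_version(published, "feature-x"))  # v3
print(resolve_model_version(published, "feature-y"))  # v2 (falls back to master)
```

A build on a Global Branch behaves like `feature-y` in this sketch: no model version is found for the build's branch, so resolution moves on to the fallback branches.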

Marketplace

Pipelines containing an active trained model node are not currently supported in Marketplace and will fail to install.

If you need to publish a pipeline that includes a model, remove the model node from the pipeline canvas. The imported model resource itself does not need to be removed — only active model nodes block installation.

Support for Marketplace is planned for a future release.

Sidecar timeouts

When a build starts, each model sidecar must download and load the model before it can serve inference requests. By default, the system allows up to 10 minutes for a sidecar to become ready. If it does not become ready within that window — whether because the model is still loading or because the sidecar has died (often due to running out of memory) — the build fails with a timeout error.

If your model requires more (or less) time to load, you can override the default startup timeout. In the Configure model dialog, under Advanced configuration, set a Model startup timeout override value in seconds. Leave the field empty to use the default.

The startup timeout override is configured per model, per pipeline, and applies to every sidecar launched for that model in the build.

Once a sidecar is ready, individual inference requests have no timeout limit. This means long-running models — such as those performing complex computations or processing large batches — are not subject to per-request time constraints and will run to completion.
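The startup behavior resembles a readiness poll with a deadline. The sketch below is a simplified model of that idea, not the platform's implementation: `is_ready`, the poll interval, and the injectable `clock` and `sleep` parameters are assumptions made for illustration and testing.

```python
import time
from typing import Callable, Optional

DEFAULT_STARTUP_TIMEOUT_S = 600  # 10 minutes, the documented default


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout_override_s: Optional[float] = None,
                     poll_interval_s: float = 1.0,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll until the sidecar reports ready, or raise once the deadline passes."""
    # An empty override (None) means "use the default".
    timeout_s = DEFAULT_STARTUP_TIMEOUT_S if timeout_override_s is None else timeout_override_s
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return
        if clock() >= deadline:
            # A sidecar that died (for example, out of memory) never becomes
            # ready, so it surfaces as the same timeout.
            raise TimeoutError(f"sidecar not ready after {timeout_s} s")
        sleep(poll_interval_s)


# A sidecar that is ready immediately returns without waiting.
wait_until_ready(lambda: True)
```

Only startup is bounded here; once `wait_until_ready` returns, nothing in the sketch limits how long an individual inference call may run, which mirrors the absence of a per-request timeout.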

Learn more about training and deploying models in Foundry.

