Managed profiles

Foundry can automatically optimize Spark profiles for jobs based on historical resource usage. When enabled, Foundry analyzes the resource consumption of recent job runs and recommends a profile configuration that right-sizes driver and executor resources, reducing unnecessary resource allocation while maintaining job reliability.

Eligibility criteria

For Foundry to apply a managed profile recommendation, the following conditions must all be met:

  • The last 5 runs of the job were all successful.
  • At least 4 of the last 5 runs have recorded metrics (at most 1 run may be missing metrics).
  • The Spark profiles and configuration of the current job are identical to those used in each of the last 5 runs.
  • The job is using the default profile, or has the MANAGED_PROFILE profile applied.

If any of these conditions are not met, the job will run with the original profile configuration.
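The criteria above can be read as a single predicate over a job's recent history. The following is a minimal sketch of that reading, not Foundry's implementation: the `RunRecord` type, the `is_eligible` function, and the way profiles and configuration are represented are all hypothetical. It also assumes that a job with fewer than 5 recorded runs is not eligible and that "using the default profile" means no profiles are explicitly applied; the text above does not state either point.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

MANAGED_PROFILE = "MANAGED_PROFILE"
LOOKBACK_RUNS = 5          # "the last 5 runs"
MIN_RUNS_WITH_METRICS = 4  # "at least 4 of the last 5 runs"


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical summary of one historical run of a job."""
    succeeded: bool
    has_metrics: bool
    profiles: FrozenSet[str]                   # Spark profiles applied to the run
    spark_conf: Tuple[Tuple[str, str], ...]    # other Spark configuration, as sorted pairs


def is_eligible(
    current_profiles: FrozenSet[str],
    current_conf: Tuple[Tuple[str, str], ...],
    history: Sequence[RunRecord],
) -> bool:
    """Return True if every eligibility condition holds.

    `history` is ordered oldest to newest; only the last 5 entries are used.
    """
    recent = list(history)[-LOOKBACK_RUNS:]
    if len(recent) < LOOKBACK_RUNS:
        return False  # assumption: too little history means no recommendation
    if not all(run.succeeded for run in recent):
        return False
    if sum(run.has_metrics for run in recent) < MIN_RUNS_WITH_METRICS:
        return False
    if any(
        run.profiles != current_profiles or run.spark_conf != current_conf
        for run in recent
    ):
        return False
    # Assumption: an empty profile set stands for "the default profile".
    uses_default_profile = len(current_profiles) == 0
    return uses_default_profile or MANAGED_PROFILE in current_profiles
```

In this sketch, a job whose five most recent runs all succeeded with the same profiles, but with two of them missing metrics, is rejected by the third check and keeps its original profile configuration.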

How profiles are recommended

When a job meets the eligibility criteria, Foundry calculates a recommended profile based on the resource usage patterns from the last 5 successful runs.

Resource calculations

For each resource dimension (executor memory, executor cores, driver memory, driver cores, and number of executors), Foundry takes the maximum observed usage across the last 5 runs, capped at the original request value. Because of this cap, a recommendation can reduce a job's resources but never raises them above the original allocation.
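The per-dimension rule amounts to `min(max(observed), original)`. The sketch below illustrates that arithmetic only; the `Resources` type, its field names, and the `recommend` function are hypothetical, and how Foundry measures "observed usage" for each dimension is not described here.

```python
from dataclasses import dataclass, fields
from typing import Sequence


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for the five resource dimensions."""
    executor_memory_gb: float
    executor_cores: int
    driver_memory_gb: float
    driver_cores: int
    num_executors: int


def recommend(original: Resources, observed: Sequence[Resources]) -> Resources:
    """Take the peak observed usage per dimension, capped at the original request."""
    if not observed:
        return original  # assumption: with no metrics, keep the original request
    values = {}
    for field in fields(Resources):
        peak = max(getattr(run, field.name) for run in observed)
        # The cap ensures the recommendation never exceeds what was requested.
        values[field.name] = min(peak, getattr(original, field.name))
    return Resources(**values)


original = Resources(8.0, 4, 4.0, 2, 10)
observed = [
    Resources(3.0, 2, 2.0, 1, 4),
    Resources(5.0, 2, 6.0, 1, 6),  # driver memory peaked above the request
]
recommended = recommend(original, observed)
```

Here the executor memory is right-sized from 8 GB to the observed peak of 5 GB, while the driver memory stays at the requested 4 GB even though one run peaked at 6 GB.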

Local mode optimization

For jobs with low parallelism requirements, Foundry may recommend running the job in local mode for improved efficiency. This optimization applies when:

numExecutors * executorCores <= 4

When local mode is recommended, the driver resources are adjusted to accommodate the workload that would have been distributed across executors:

  • Driver memory: Set to the maximum of the original driver memory and the combined executor memory (numExecutors * executorMemoryGb).
  • Driver cores: Set to the maximum of the original driver cores and the combined executor cores (numExecutors * executorCores).

This consolidation eliminates the overhead of distributed execution for small jobs while ensuring sufficient resources are available on the driver.
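The local mode rule can be worked through with a small example. This is a sketch under stated assumptions: the `Profile` type, the `local_mode` flag, and `apply_local_mode` are hypothetical names, and the sketch leaves the executor fields untouched because the text does not say what happens to executor settings once local mode is chosen. It also does not settle whether the inputs are the original or the right-sized values.

```python
from dataclasses import dataclass, replace

LOCAL_MODE_MAX_TOTAL_CORES = 4  # threshold from numExecutors * executorCores <= 4


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile shape used only for this illustration."""
    num_executors: int
    executor_cores: int
    executor_memory_gb: float
    driver_cores: int
    driver_memory_gb: float
    local_mode: bool = False


def apply_local_mode(profile: Profile) -> Profile:
    """Switch to local mode when total executor cores are at or below the threshold."""
    total_cores = profile.num_executors * profile.executor_cores
    if total_cores > LOCAL_MODE_MAX_TOTAL_CORES:
        return profile  # too much parallelism: keep distributed execution
    total_memory_gb = profile.num_executors * profile.executor_memory_gb
    return replace(
        profile,
        local_mode=True,
        # The driver absorbs the combined executor resources, never shrinking.
        driver_memory_gb=max(profile.driver_memory_gb, total_memory_gb),
        driver_cores=max(profile.driver_cores, total_cores),
    )


small = apply_local_mode(Profile(2, 2, 4.0, 1, 2.0))
large = apply_local_mode(Profile(3, 2, 4.0, 1, 2.0))
```

With 2 executors of 2 cores and 4 GB each, the total is 4 cores, so `small` runs in local mode with a driver of 4 cores and 8 GB. With 3 such executors the total is 6 cores, so `large` is returned unchanged.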

Enabling managed profiles

Jobs using the default Spark profile are automatically eligible for managed profile recommendations. No additional configuration is required.

To enable managed profiles on a job that uses custom Spark profiles, add the MANAGED_PROFILE profile to your existing profile configuration. This signals to Foundry that the job should be considered for automatic profile optimization.

Learn how to apply Spark profiles to your transforms in the Apply Spark profiles documentation.

