Spark UI

Spark has its own Web UI, which complements Foundry's Spark details page with additional information, including:

  • Executor lifecycle information, such as executor launch and shutdown.
  • Larger samples of task and executor metrics, including peak memory usage.
  • All Spark configs used during execution.
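The executor details listed above are also available as JSON from Spark's monitoring REST API (the `/api/v1/applications/[app-id]/allexecutors` endpoint). The following is a minimal sketch that summarizes such a payload once it has been saved and parsed. It assumes the field names of open-source Spark's executor summary (`addTime`, `removeTime`, `removeReason`, `peakMemoryMetrics`), which can vary by Spark version. Whether this endpoint is reachable for a Foundry debug job is not covered here and should be treated as an assumption. The helper names and the sample records are hypothetical.

```python
from datetime import datetime

# Timestamp layout used by Spark's REST API, e.g. "2024-01-01T10:00:00.000GMT".
SPARK_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fGMT"


def parse_spark_time(value):
    """Parse a Spark REST API timestamp into a naive datetime (GMT)."""
    return datetime.strptime(value, SPARK_TIME_FORMAT)


def summarize_executors(executors):
    """Return one summary dict per executor record.

    `executors` is the parsed JSON list from the allexecutors endpoint.
    Executors that are still running have no removeTime, so their
    lifetime is reported as None.
    """
    rows = []
    for ex in executors:
        added = parse_spark_time(ex["addTime"])
        removed = parse_spark_time(ex["removeTime"]) if ex.get("removeTime") else None
        peak = ex.get("peakMemoryMetrics") or {}
        rows.append({
            "id": ex["id"],
            "lifetime_seconds": (removed - added).total_seconds() if removed else None,
            "remove_reason": ex.get("removeReason"),
            "peak_jvm_heap_bytes": peak.get("JVMHeapMemory"),
        })
    return rows


# Illustrative records only; these values are made up, not real job output.
SAMPLE_EXECUTORS = [
    {
        "id": "1",
        "addTime": "2024-01-01T10:00:00.000GMT",
        "removeTime": "2024-01-01T10:05:30.000GMT",
        "removeReason": "Executor killed by driver.",
        "peakMemoryMetrics": {"JVMHeapMemory": 734003200},
    },
    {"id": "2", "addTime": "2024-01-01T10:00:05.000GMT"},
]

if __name__ == "__main__":
    for row in summarize_executors(SAMPLE_EXECUTORS):
        print(row)
```

Keeping the parsing separate from any HTTP call means the same function works on a payload downloaded by hand from the Spark UI host.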

Viewing Spark UI

To view the Spark UI for a Transforms job, re-run the job as a debug job. You will see a Spark UI button; selecting this will open Spark's Web UI.

Re-run a job as a debug job

Spark UI button

:::callout{theme="neutral"}
Spark events appear in the Spark UI after a delay of 1-2 minutes.
:::

Using the Spark UI in Foundry

Spark's Web UI is rich in detail but does not present information in a manner tailored for Foundry. Below, we provide advice on navigating Spark's Web UI for Foundry jobs.

SQL execution

Spark can break up SQL queries into a main query and one or more subqueries. In some cases, a subquery is more interesting than the main query. This is true for many dataset writes in Foundry.

When viewing a "Writing dataset ..." SQL execution in the Spark UI, you can find the query graph for the write linked under Sub Execution IDs.

Writing dataset query

Main query 0 lacks information

Subquery 1 contains query graph

Context warming

The Jobs tab in the Spark UI shows that Transforms jobs trigger an initial count job. The purpose of the count job is to request executor allocations early, while the runtime performs additional setup (including installing dependencies). This increases the likelihood of executors being available by the time the Transform is ready to run.
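The idea of context warming can be sketched as follows. This is an illustration under stated assumptions, not Foundry's actual implementation: the function name `start_context_warming` and the `num_tasks` parameter are hypothetical, and `sc` is assumed to behave like a `pyspark.SparkContext`. The sketch runs a trivial count in a background thread, so that the job's pending tasks give the cluster a reason to allocate executors while the caller continues with other setup.

```python
import threading


def start_context_warming(sc, num_tasks=8):
    """Start a trivial count job in the background.

    Returns the thread and a dict that will hold the count once the job
    finishes. `sc` is assumed to expose parallelize(), as a
    pyspark.SparkContext does.
    """
    result = {}

    def _count():
        # parallelize() splits the range into num_tasks partitions, so the
        # count() action schedules num_tasks small tasks.
        result["count"] = sc.parallelize(range(num_tasks), num_tasks).count()

    thread = threading.Thread(target=_count, name="context-warming", daemon=True)
    thread.start()
    return thread, result


# Hypothetical usage with a real SparkSession named `spark`:
#
#   thread, result = start_context_warming(spark.sparkContext)
#   ...perform other setup, such as installing dependencies...
#   thread.join()
```

The count itself is discarded; only its side effect of requesting executors matters, which is why such a job shows up first in the Jobs tab.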

Count job to request executors early

