Lightweight transforms API evolution

The transforms API for lightweight compute has evolved over time to support more streamlined syntax options.

Reference this page for details on the evolution of the lightweight API.

To get started writing lightweight transforms, visit the Getting started page or any of the Polars or pandas examples in the Python transforms documentation; these pages provide guidance on using lightweight transforms as the default compute option.

Legacy syntax: @lightweight decorator

The original syntax for lightweight compute is the @lightweight decorator. This syntax option remains fully supported.

from transforms.api import transform, lightweight, Input, Output


@lightweight
@transform(
    output=Output("/path/data/output"),
    input=Input("/path/data/input"),
)
def clean(output, input):
    df = input.pandas()
    output.write_table(df)

Updated syntax for Lightweight

The recommended syntax for writing lightweight transforms is now @transform.using. This API removes the need to import the lightweight decorator separately and makes lightweight the default compute option for a transform.

The new API is available in transforms version 3.68.0 and higher. To make this available in Code Repositories, upgrade your repository with the repository upgrade guide. Ensure that the transformsLangPythonPluginVersion is 1.978.0 or higher.

To learn more about transforms versions, see the transforms versions overview.

from transforms.api import transform, Input, Output


@transform.using(
    output=Output("/path/data/output"),
    input=Input("/path/data/input"),
)
def clean(output, input):
    df = input.pandas()
    output.write_table(df)

:::callout{theme="neutral"}
To summarize, you can create a Lightweight transform using any of the below syntax options:

  • Updated API:
    @transform.using(...)
  • Updated API explicitly referencing Lightweight:
    @transform.lightweight(...)
  • Legacy API with @lightweight decorator:
    @lightweight
    @transform(...)
:::

Updated syntax for Spark

Alongside the new, default lightweight API, there is a new transforms API for Spark. The recommended syntax is @transform.spark.using.

The new API is available in transforms version 3.95.0 and higher. To make this available in Code Repositories, upgrade your repository with the repository upgrade guide. Ensure that the transformsLangPythonPluginVersion is 1.1003.0 or higher.

from transforms.api import transform, Input, Output


@transform.spark.using(
    output=Output("/path/data/output"),
    input=Input("/path/data/input"),
)
def clean(output, input):
    df = input.dataframe()
    output.write_dataframe(df)
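
The minimum versions quoted on this page should be compared numerically, component by component, rather than as plain strings: 1.1003.0 is newer than 1.978.0, even though a string comparison suggests the opposite. The following is a minimal sketch of such a check. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of the transforms library, and the sketch assumes versions are plain dotted integers with no suffixes.

```python
from typing import Tuple

# Minimum versions quoted on this page.
MIN_TRANSFORMS_LIGHTWEIGHT = "3.68.0"  # @transform.using
MIN_TRANSFORMS_SPARK = "3.95.0"        # @transform.spark.using
MIN_PLUGIN_LIGHTWEIGHT = "1.978.0"     # transformsLangPythonPluginVersion
MIN_PLUGIN_SPARK = "1.1003.0"          # transformsLangPythonPluginVersion


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "1.978.0" into a tuple of integers.

    Assumption: every component is a plain integer (no "-rc1" style suffix).
    """
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when `installed` is at least `minimum`, compared numerically."""
    return parse_version(installed) >= parse_version(minimum)


# A string comparison looks at characters, so "9" sorts after "1" and the
# older plugin version appears to satisfy the newer minimum.
naive_result = "1.978.0" >= MIN_PLUGIN_SPARK

# The numeric comparison sees 978 < 1003 and reports that an upgrade is needed.
numeric_result = meets_minimum("1.978.0", MIN_PLUGIN_SPARK)

print(naive_result, numeric_result)
```

In this sketch, a repository on plugin version 1.978.0 satisfies the lightweight minimum but not the Spark minimum, so it would need an upgrade before using @transform.spark.using.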

Troubleshoot version errors with @transform.using

If you receive an error message similar to one of the following, you may be on an older version of the transforms library that does not support the updated syntax.

  • Usage of transform.using() on outdated repository
  • A function object does not have an attribute using
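
One plausible reading of the second message is that, in an older library version, transform is an ordinary function with no using attribute. The sketch below reproduces that situation with a stand-in function. The stand-in is hypothetical and does not reflect how the real transforms library is implemented; it only illustrates why attribute access on a plain function fails.

```python
def transform(**datasets):
    """Stand-in for a function-only decorator (hypothetical, not the real library)."""
    def wrap(compute):
        # Return the compute function unchanged; a real decorator would do more.
        return compute
    return wrap


# A plain Python function has no `using` attribute unless one is assigned.
supports_using = hasattr(transform, "using")
print(supports_using)

try:
    # On the stand-in, this attribute lookup fails before any call happens.
    transform.using(output="/path/data/output")
except AttributeError as error:
    message = str(error)
    print(message)
```

A check such as `hasattr(transform, "using")` is one way to confirm, from inside a repository, whether the installed version exposes the updated syntax.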

To resolve this issue, try the following options:

