
Foundry connectors

Connectors for interacting with Foundry from the transforms APIs.

A Connector can be used interactively to construct TransformInput and TransformOutput objects and also to run a Transform.

FoundryConnector

class transforms.foundry.connectors.FoundryConnector(service_config, auth_header, filesystem_id=None, fallback_branches=None, resolver=None)

  • Entry point for accessing Foundry services.
  • The Foundry object manages interactions with Foundry services by providing APIs for manipulating datasets.

Parameters

  • service_config (dict): A configuration dictionary conforming to the JSON spec in the Java class com.palantir.remoting.api.config.service.ServicesConfigBlock.
  • auth_header (str): The authorization string to use when connecting to Foundry services.
  • filesystem_id (str, optional): The backing filesystem to use.
  • fallback_branches (List[str], optional): Branches to try, in order, when no branch is passed to input or output.
  • resolver (Callable[[str], str], optional): Function for resolving a dataset alias into a rid. Defaults to resolving the alias as a Project path.
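
The resolver parameter accepts any callable that maps a dataset alias string to a rid string. As a minimal sketch of a custom resolver, the snippet below builds one from an in-memory lookup table. The helper name make_table_resolver, the alias, and the rid value are all hypothetical, and the default Project-path resolution is not reproduced here.

```python
from typing import Callable, Dict


def make_table_resolver(table: Dict[str, str]) -> Callable[[str], str]:
    """Build a resolver(alias) -> rid from a fixed lookup table (hypothetical helper)."""

    def resolve(alias: str) -> str:
        try:
            return table[alias]
        except KeyError:
            # Fail loudly rather than guessing a rid for an unknown alias.
            raise ValueError(f"no rid registered for alias {alias!r}") from None

    return resolve


# Placeholder alias and rid values, for illustration only.
resolver = make_table_resolver({"/Demo/raw_events": "ri.example.dataset.0001"})
print(resolver("/Demo/raw_events"))

# The callable would then be passed through the documented keyword argument:
#   FoundryConnector(service_config, auth_header, resolver=resolver)
```

A table-backed resolver like this can be handy in interactive sessions where only a handful of datasets are in play and their rids are already known.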

input(alias=None, rid=None, branch=None, end_txrid=None, start_txrid=None, schema_version=None)

  • Construct a TransformInput from the given parameters.
  • The resource identifier used to construct the TransformInput will be resolved from the given alias unless the rid parameter is passed.

Parameters

  • alias (str, optional): The alias of the dataset.
  • rid (str, optional): The resource identifier of the dataset.
  • branch (str, optional): The branch from which to read the dataset. If not set, the first branch in the fallbacks list that exists in the Catalog is used.
  • end_txrid (str, optional): The end transaction of the view. If not set, defaults to the latest transaction on the given branch.
  • start_txrid (str, optional): The starting transaction of the view.
  • schema_version (str, optional): The schema version to use when reading. If not set, defaults to the latest schema version on the given branch.

Returns

  • An input object representing the requested dataset.

Return type

  • TransformInput

Raises

  • ValueError: Raised unless exactly one of alias or rid is specified.
  • ValueError: Raised if no branch is specified and none of the fallback branches can be found in the Catalog.
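
The branch rule for input (use the explicit branch if given, otherwise the first fallback branch that exists in the Catalog, otherwise raise ValueError) can be sketched as a small standalone function. This is an illustration of the documented behavior, not the library's own code: the function name choose_input_branch is hypothetical, and the existing_branches argument stands in for a real Catalog lookup.

```python
from typing import Iterable, Optional, Sequence


def choose_input_branch(
    branch: Optional[str],
    fallback_branches: Sequence[str],
    existing_branches: Iterable[str],
) -> str:
    """Pick the branch to read from (illustrative, not the library's code)."""
    if branch is not None:
        # An explicitly requested branch always wins.
        return branch
    existing = set(existing_branches)  # stand-in for asking the Catalog
    for candidate in fallback_branches:
        if candidate in existing:
            return candidate
    raise ValueError(
        f"none of the fallback branches {list(fallback_branches)} exist"
    )


# "feature-x" does not exist, so the next fallback that does exist is chosen.
print(choose_input_branch(None, ["feature-x", "develop", "master"], {"develop", "master"}))
# prints: develop
```

The order of fallback_branches matters: listing a feature branch first lets reads prefer in-progress data and fall back to a stable branch only when the feature branch is absent.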

output(alias=None, rid=None, branch=None, txrid=None, filesystem_id=None)

Parameters

  • alias (str, optional): The alias of the dataset.
  • rid (str, optional): The resource identifier of the dataset.
  • branch (str, optional): The branch to which to write the dataset. If not set, the first branch in the fallbacks list is used.
  • txrid (str, optional): The transaction into which data should be written.
  • filesystem_id (str, optional): The filesystem in which to create the dataset if it doesn't already exist.

Returns

  • An output object representing the requested dataset.

Return type

  • TransformOutput

Raises

  • ValueError: Raised unless exactly one of alias or rid is specified.
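
Both input and output document a ValueError tied to the alias and rid arguments. Read literally, the condition is that exactly one of the two must be supplied. The sketch below encodes that reading as a standalone check. The function name check_dataset_reference is hypothetical, and the library's actual validation may differ in detail.

```python
from typing import Optional


def check_dataset_reference(
    alias: Optional[str] = None, rid: Optional[str] = None
) -> str:
    """Return which argument identifies the dataset, or raise ValueError.

    Assumes the rule is "exactly one of alias or rid" (an interpretation
    of the documented ValueError, not confirmed library behavior).
    """
    if (alias is None) == (rid is None):
        # Either both were given or neither was.
        raise ValueError("specify exactly one of alias or rid")
    return "alias" if alias is not None else "rid"


print(check_dataset_reference(alias="/Demo/raw_events"))
print(check_dataset_reference(rid="ri.example.dataset.0001"))
```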

run(transform)

  • Run the given Transform using the latest inputs and outputs.

Parameters

  • transform (Transform): The transform to run.


auth_header

  • str
  • The auth header used to contact Foundry.

fallback_branches

  • List[str]
  • The fallback branches used to retrieve datasets.

spark_session
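
To show how the input and output methods might be called in an interactive session, the sketch below uses only the keyword arguments documented above. Because a real FoundryConnector needs a live service configuration, the example runs against RecordingConnector, a hypothetical stand-in that simply records each call. The alias, rid, and branch values are placeholders.

```python
class RecordingConnector:
    """Stand-in that records calls; a real session would use FoundryConnector."""

    def __init__(self):
        self.calls = []

    def input(self, **kwargs):
        self.calls.append(("input", kwargs))
        return ("input", kwargs)

    def output(self, **kwargs):
        self.calls.append(("output", kwargs))
        return ("output", kwargs)


def open_datasets(connector):
    """Request one input and one output using the documented keyword arguments."""
    # Read by alias from an explicit branch.
    source = connector.input(alias="/Demo/raw_events", branch="develop")
    # Write by rid to the same branch.
    target = connector.output(rid="ri.example.dataset.0002", branch="develop")
    return source, target


connector = RecordingConnector()
source, target = open_datasets(connector)
print(connector.calls)
```

Since open_datasets only relies on the input and output method names, the same function could be handed a real connector in place of the stand-in.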

