Foundry connectors¶
Connectors for interacting with Foundry from the transforms APIs.
A Connector can be used interactively to construct TransformInput and TransformOutput objects and also to run a Transform.
FoundryConnector¶
class transforms.foundry.connectors.FoundryConnector(service_config, auth_header, filesystem_id=None, fallback_branches=None, resolver=None)¶
- Entry point for accessing Foundry services.
- The Foundry object manages interactions with Foundry services by providing APIs for manipulating datasets.
Parameters¶
- service_config (dict ↗)
- A configuration dictionary conforming to the JSON spec in the Java class com.palantir.remoting.api.config.service.ServicesConfigBlock.
- auth_header (str ↗)
- The authorization string to use when connecting to Foundry services.
- filesystem_id (str ↗, optional)
- The backing filesystem to use.
- fallback_branches (List[str ↗], optional)
- Fallback branches.
- resolver (Callable[[str ↗], str ↗], optional)
- Function for resolving a dataset alias into a rid. Defaults to resolving the alias as a Project path.
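The resolver parameter accepts any callable that takes an alias string and returns a rid string. As a minimal sketch of that shape, assuming a hand-maintained lookup table: the helper `make_dict_resolver`, the alias path, and the rid value below are hypothetical placeholders, not part of the transforms API.

```python
from typing import Callable, Dict


def make_dict_resolver(mapping: Dict[str, str]) -> Callable[[str], str]:
    """Build a resolver that looks aliases up in a fixed mapping.

    Hypothetical helper for illustration; only the Callable[[str], str]
    shape comes from the documented `resolver` parameter.
    """
    def resolve(alias: str) -> str:
        try:
            return mapping[alias]
        except KeyError:
            # Surface unknown aliases as a ValueError (a choice made for this sketch).
            raise ValueError(f"Unknown dataset alias: {alias!r}") from None
    return resolve


# Placeholder alias and rid values, invented for this example.
resolver = make_dict_resolver({
    "/Project/datasets/flights": "ri.foundry.main.dataset.example-flights",
})
print(resolver("/Project/datasets/flights"))
```

A callable of this shape could then be passed as `resolver=` when constructing a FoundryConnector, in place of the default Project-path resolution.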
input(alias=None, rid=None, branch=None, end_txrid=None, start_txrid=None, schema_version=None)¶
- Construct a TransformInput from the given parameters.
- The resource identifier used to construct the TransformInput will be resolved from the given alias unless the rid parameter is passed.
Parameters¶
- alias (str ↗, optional)
- The alias of the dataset.
- rid (str ↗, optional)
- The resource identifier of the dataset.
- branch (str ↗, optional)
- The branch from which to read the dataset. If not set, the first branch in the fallbacks list that exists in the Catalog is used.
- end_txrid (str ↗, optional)
- The end transaction of the view. If not set, defaults to the latest transaction on the given branch.
- start_txrid (str ↗, optional)
- The starting transaction of the view.
- schema_version (str ↗, optional)
- The schema version to use when reading. If not set, defaults to the latest schema version on the given branch.
Returns¶
- An input object representing the requested dataset.
Return type¶
- TransformInput
Raises¶
- ValueError ↗
- If either the alias or rid (but not both) is not specified.
- ValueError ↗
- If a branch is not specified and a fallback branch cannot be found in the Catalog.
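The branch fallback rule described for input can be sketched as a small standalone function. This is an illustration of the documented behavior, not the library's implementation: `choose_read_branch` and its `existing_branches` argument (standing in for a Catalog lookup) are hypothetical names.

```python
from typing import Iterable, Optional, Sequence


def choose_read_branch(
    branch: Optional[str],
    fallback_branches: Sequence[str],
    existing_branches: Iterable[str],
) -> str:
    """Return the explicit branch, else the first fallback that exists.

    `existing_branches` is a stand-in for asking the Catalog which
    branches exist for the dataset (an assumption of this sketch).
    """
    if branch is not None:
        return branch
    existing = set(existing_branches)
    for candidate in fallback_branches:
        if candidate in existing:
            return candidate
    # Mirrors the documented ValueError when no fallback branch is found.
    raise ValueError("No branch given and no fallback branch exists in the Catalog")


# "feature-x" does not exist, so the next fallback is chosen.
print(choose_read_branch(None, ["feature-x", "master"], {"master", "develop"}))  # prints "master"
```

Ordering the fallbacks list from most to least specific means a feature branch is read when it exists and the shared branch is read otherwise.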
output(alias=None, rid=None, branch=None, txrid=None, filesystem_id=None)¶
- Construct a TransformOutput from the given alias or rid.
- The resource identifier used to construct the transforms.api.TransformOutput will be resolved from the given alias unless the rid parameter is passed.
Parameters¶
- alias (str ↗, optional)
- The alias of the dataset.
- rid (str ↗, optional)
- The resource identifier of the dataset.
- branch (str ↗, optional)
- The branch to which to write the dataset. If not set, the first branch in the fallbacks list is used.
- txrid (str ↗, optional)
- The transaction into which data should be written.
- filesystem_id (str ↗, optional)
- The filesystem in which to create the dataset if it doesn’t already exist.
Returns¶
- An output object representing the requested dataset.
Return type¶
- TransformOutput
Raises¶
- ValueError ↗
- If either the alias or rid (but not both) is not specified.
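The alias and rid handling shared by input and output can be sketched as follows. This assumes one reading of the Raises entry, namely that exactly one of alias or rid must be supplied; the function `resolve_dataset_rid`, the lookup table, and the rid values are hypothetical and only illustrate the documented rule.

```python
from typing import Callable, Optional


def resolve_dataset_rid(
    alias: Optional[str],
    rid: Optional[str],
    resolver: Callable[[str], str],
) -> str:
    """Pick the rid directly, or resolve it from the alias."""
    # Assumption of this sketch: passing neither or both is an error.
    if (alias is None) == (rid is None):
        raise ValueError("Specify exactly one of alias or rid")
    if rid is not None:
        return rid
    return resolver(alias)


# Placeholder lookup table standing in for Project-path resolution.
known = {"/Project/out": "ri.foundry.main.dataset.example-out"}
print(resolve_dataset_rid(alias="/Project/out", rid=None, resolver=known.__getitem__))
```

If the real connector accepts both arguments and lets rid take precedence, the first check would relax to rejecting only the case where neither is given.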
run(transform)¶
- Run the given Transform using the latest inputs and outputs.
Parameters¶
- transform (transforms.api.Transform)
- The transform to run.
auth_header¶
- str
- The auth header used to contact Foundry.
fallback_branches¶
- List[str]
- The fallback branches used to retrieve datasets.
spark_session¶
- pyspark.sql.SparkSession ↗
- The SparkSession created by FoundrySparkManager.