Mapping join

Supported in: Batch, Faster

Replaces values from the target columns in the source dataset with values in the mapping dataset.

Transform categories: Join

Declared arguments

  • Input dataset: Source dataset containing columns to be mapped.
    Table
  • Key column for mapping values: Column in the mapping dataset whose values are matched against the values in the target columns.
    Column
  • Mapping dataset: Dataset containing values to use for mapping.
    Table
  • Target columns: List of columns from the input dataset that will have their values replaced.
    List<Column>
  • Values to use for mapping: Column in the mapping dataset whose values replace the matched values in the target columns.
    Column
  • optional Assume unique mappings: If true, a distinct operation will be applied to the key column of the mapping table. If false, and the mapping table contains duplicate keys, the resulting dataset will contain duplicate rows based on each mapping. By default, this operation is applied. Note: setting this to false may result in better performance.
    Literal<Boolean>
  • optional Default value: Value written to a target column when no mapping is found in the mapping table. If empty, unmatched values in the target columns remain unchanged. By default, this is empty.
    Expression

Type variable bounds: T1 accepts AnyType; T2 accepts AnyType
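
The effect of Assume unique mappings can be illustrated with a minimal plain-Python sketch. This is not the Foundry implementation: the function names, the single-target-column simplification, the duplicate mapping value MT-445, and the choice to keep the first row seen for each key are all assumptions made for illustration.

```python
def dedupe_mapping(mapping, key):
    """Keep one mapping row per key value.

    Assumption: the first row seen for a key wins. The source does not say
    which duplicate the distinct operation keeps.
    """
    seen = set()
    unique = []
    for row in mapping:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def map_one_column(rows, mapping, target, key, value, assume_unique_mappings=True):
    """Replace values in a single target column (hypothetical helper)."""
    if assume_unique_mappings:
        mapping = dedupe_mapping(mapping, key)
    out = []
    for row in rows:
        matches = [m[value] for m in mapping if m[key] == row[target]]
        if not matches:
            # No mapping found and no default value: leave the row unchanged.
            out.append(dict(row))
            continue
        # One output row per matching mapping row.
        for mapped in matches:
            out.append({**row, target: mapped})
    return out


rows = [{"flight_no": "533"}]
# Hypothetical mapping table with a duplicate key (MT-445 is invented here).
mapping = [
    {"flight_code": "533", "flight_number": "MT-444"},
    {"flight_code": "533", "flight_number": "MT-445"},
]

deduped = map_one_column(rows, mapping, "flight_no", "flight_code", "flight_number")
duplicated = map_one_column(
    rows, mapping, "flight_no", "flight_code", "flight_number",
    assume_unique_mappings=False,
)
print(len(deduped), len(duplicated))
```

With the distinct step enabled, the single input row stays a single row; with it disabled, the duplicate key yields one output row per mapping.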

Examples

Example 1: Base case

Argument values:

  • Input dataset: ri.foundry.main.dataset.input
  • Key column for mapping values: flight_code
  • Mapping dataset: ri.foundry.main.dataset.mapping
  • Target columns: [flight_no, next_flight]
  • Values to use for mapping: flight_number
  • Assume unique mappings: null
  • Default value: unknown

Inputs:

ri.foundry.main.dataset.input

flight_no  next_flight  departure_time
533        112          2022-01-20T10:45:00Z
934        533          2022-01-20T11:20:00Z
222        934          2022-01-20T11:20:00Z

ri.foundry.main.dataset.mapping

flight_code  flight_number  airline
112          XB-123         foundry airlines
533          MT-444         foundry airlines
934          KK-123         new air

Output:

flight_no  next_flight  departure_time
MT-444     XB-123       2022-01-20T10:45:00Z
KK-123     MT-444       2022-01-20T11:20:00Z
unknown    KK-123       2022-01-20T11:20:00Z
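
Example 1 can be reproduced outside Foundry with a short plain-Python sketch. The function `mapping_join` and its parameter names are hypothetical, the values are treated as strings (the source does not state the column types), and the sketch assumes the mapping keys are unique.

```python
def mapping_join(rows, mapping, key_column, value_column, target_columns, default=None):
    """Replace values in target_columns using a key -> value lookup.

    Assumes unique keys in the mapping; with duplicates, this sketch would
    silently keep the last value for each key.
    """
    lookup = {m[key_column]: m[value_column] for m in mapping}
    result = []
    for row in rows:
        new_row = dict(row)
        for column in target_columns:
            if row[column] in lookup:
                new_row[column] = lookup[row[column]]
            elif default is not None:
                # No mapping found: fall back to the default value.
                new_row[column] = default
            # Otherwise the original value is left unchanged.
        result.append(new_row)
    return result


flights = [
    {"flight_no": "533", "next_flight": "112", "departure_time": "2022-01-20T10:45:00Z"},
    {"flight_no": "934", "next_flight": "533", "departure_time": "2022-01-20T11:20:00Z"},
    {"flight_no": "222", "next_flight": "934", "departure_time": "2022-01-20T11:20:00Z"},
]
codes = [
    {"flight_code": "112", "flight_number": "XB-123", "airline": "foundry airlines"},
    {"flight_code": "533", "flight_number": "MT-444", "airline": "foundry airlines"},
    {"flight_code": "934", "flight_number": "KK-123", "airline": "new air"},
]

output = mapping_join(
    flights, codes,
    key_column="flight_code",
    value_column="flight_number",
    target_columns=["flight_no", "next_flight"],
    default="unknown",
)
for row in output:
    print(row["flight_no"], row["next_flight"], row["departure_time"])
```

Flight 222 has no entry in the mapping dataset, so its `flight_no` becomes the default value `unknown`, while `departure_time` is untouched because it is not a target column.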