Left lookup join(左查找连接(Left lookup join))¶
Supported in: Streaming
Joins two datasets together, keeping all rows from the left table and only matching rows from the right dataset.
Transform categories: Join
Declared arguments¶
- Condition for columns to select on the left: All columns in the left input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
ColumnPredicate - Condition for columns to select on the right: All columns in the right input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
ColumnPredicate - Join condition: A list of columns from left and right input to join on.
List\, Column\\ >> - Left dataset: Left dataset to use in join.
Table - Max rows to join with a single row: Places a limit on the maximum number of rows from the right input that a single row from the left can be joined with. The highest this limit can go is ten. Setting too high of a limit can have negative performance implications.
Literal\ - Right dataset: Right dataset to use in join.
Table - optional Prefix for columns from right: Prefix to add to all columns on the right hand side.
Literal\
Examples¶
Example 1: Base case¶
Argument values:
- Condition for columns to select on the left:
columnNameIsIn(
columnNames: [tail_number, airline],
) - Condition for columns to select on the right:
columnNameIsIn(
columnNames: [home_airport],
) - Join condition: [(
tail_number,tail_number)] - Left dataset: ri.foundry.main.dataset.left
- Max rows to join with a single row: 10
- Right dataset: ri.foundry.main.dataset.right
- Prefix for columns from right: null
Inputs:
ri.foundry.main.dataset.left
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| PA-452 | new air | 212 | 2 |
| XB-123 | foundry airline | 1134 | 2 |
ri.foundry.main.dataset.right
| tail_number | home_airport |
|---|---|
| XB-123 | LHR |
| MT-222 | CPH |
| KK-452 | JFK |
| JR-201 | IAD |
Output:
| tail_number | airline | home_airport |
|---|---|---|
| XB-123 | foundry air | LHR |
| MT-222 | new airline | CPH |
| XB-123 | foundry airline | LHR |
| MT-222 | new air | CPH |
| KK-452 | new air | JFK |
| PA-452 | new air | null |
| XB-123 | foundry airline | LHR |
Example 2: Base case¶
Argument values:
- Condition for columns to select on the left:
columnNameIsIn(
columnNames: [tail_number, airline, factor],
) - Condition for columns to select on the right:
columnNameIsIn(
columnNames: [home_airport],
) - Join condition: [(
tail_number,tail_number), (factor,factor)] - Left dataset: ri.foundry.main.dataset.left
- Max rows to join with a single row: 10
- Right dataset: ri.foundry.main.dataset.right
- Prefix for columns from right: null
Inputs:
ri.foundry.main.dataset.left
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| PA-452 | new air | 212 | 2 |
| XB-123 | foundry airline | 1134 | 2 |
ri.foundry.main.dataset.right
| tail_number | home_airport | factor |
|---|---|---|
| XB-123 | LHR | 2 |
| MT-222 | CPH | 1 |
| KK-452 | JFK | 10 |
| JR-201 | IAD | 4 |
Output:
| tail_number | airline | factor | home_airport |
|---|---|---|---|
| XB-123 | foundry air | 2 | LHR |
| MT-222 | new airline | 5 | null |
| XB-123 | foundry airline | 5 | null |
| MT-222 | new air | 4 | null |
| KK-452 | new air | 1 | null |
| PA-452 | new air | 2 | null |
| XB-123 | foundry airline | 2 | LHR |
Example 3: Base case¶
Argument values:
- Condition for columns to select on the left:
allColumns(
) - Condition for columns to select on the right:
columnNameIsIn(
columnNames: [home_airport],
) - Join condition: [(
tail_number,tail_number)] - Left dataset: ri.foundry.main.dataset.left
- Max rows to join with a single row: 10
- Right dataset: ri.foundry.main.dataset.right
- Prefix for columns from right: null
Inputs:
ri.foundry.main.dataset.left
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| PA-452 | new air | 212 | 2 |
| XB-123 | foundry airline | 1134 | 2 |
ri.foundry.main.dataset.right
| tail_number | home_airport |
|---|---|
| XB-123 | LHR |
| XB-123 | LGW |
| MT-222 | CPH |
| KK-452 | JFK |
| JR-201 | IAD |
Output:
| tail_number | airline | miles | factor | home_airport |
|---|---|---|---|---|
| XB-123 | foundry air | 124 | 2 | LHR |
| XB-123 | foundry air | 124 | 2 | LGW |
| MT-222 | new airline | 1123 | 5 | CPH |
| XB-123 | foundry airline | 335 | 5 | LHR |
| XB-123 | foundry airline | 335 | 5 | LGW |
| MT-222 | new air | 565 | 4 | CPH |
| KK-452 | new air | 222 | 1 | JFK |
| PA-452 | new air | 212 | 2 | null |
| XB-123 | foundry airline | 1134 | 2 | LHR |
| XB-123 | foundry airline | 1134 | 2 | LGW |
中文翻译¶
左查找连接(Left lookup join)¶
支持:流式处理(Streaming)
将两个数据集连接在一起,保留左表中的所有行以及右数据集中匹配的行。
转换类别:连接(Join)
声明参数(Declared arguments)¶
- 左侧列选择条件(Condition for columns to select on the left): 左侧输入模式中的所有列都将被测试是否匹配此条件。如果匹配,该列将被选中并输出。
列谓词(ColumnPredicate) - 右侧列选择条件(Condition for columns to select on the right): 右侧输入模式中的所有列都将被测试是否匹配此条件。如果匹配,该列将被选中并输出。
列谓词(ColumnPredicate) - 连接条件(Join condition): 用于连接的左右输入列列表。
列表\<元组\<列\<布尔型(Boolean) | 字节型(Byte) | 日期型(Date) | 双精度浮点型(Double) | 单精度浮点型(Float) | 整型(Integer) | 长整型(Long) | 短整型(Short) | 字符串型(String) | 时间戳型(Timestamp)>, 列\<布尔型(Boolean) | 字节型(Byte) | 日期型(Date) | 双精度浮点型(Double) | 单精度浮点型(Float) | 整型(Integer) | 长整型(Long) | 短整型(Short) | 字符串型(String) | 时间戳型(Timestamp)>>> - 左侧数据集(Left dataset): 用于连接的左侧数据集。
表(Table) - 单行最大连接行数(Max rows to join with a single row): 限制左侧单行最多可与右侧输入中的多少行进行连接。此限制最高可设为十。设置过高的限制可能会对性能产生负面影响。
字面量\<整型(Literal\)> - 右侧数据集(Right dataset): 用于连接的右侧数据集。
表(Table) - 可选 右侧列前缀(Prefix for columns from right): 为右侧所有列添加的前缀。
字面量\<字符串(Literal\)>
示例(Examples)¶
示例 1:基本情况¶
参数值:
- 左侧列选择条件:
columnNameIsIn(
columnNames: [tail_number, airline],
) - 右侧列选择条件:
columnNameIsIn(
columnNames: [home_airport],
) - 连接条件: [(
tail_number,tail_number)] - 左侧数据集: ri.foundry.main.dataset.left
- 单行最大连接行数: 10
- 右侧数据集: ri.foundry.main.dataset.right
- 右侧列前缀: null
输入:
ri.foundry.main.dataset.left
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| PA-452 | new air | 212 | 2 |
| XB-123 | foundry airline | 1134 | 2 |
ri.foundry.main.dataset.right
| tail_number | home_airport |
|---|---|
| XB-123 | LHR |
| MT-222 | CPH |
| KK-452 | JFK |
| JR-201 | IAD |
输出:
| tail_number | airline | home_airport |
|---|---|---|
| XB-123 | foundry air | LHR |
| MT-222 | new airline | CPH |
| XB-123 | foundry airline | LHR |
| MT-222 | new air | CPH |
| KK-452 | new air | JFK |
| PA-452 | new air | null |
| XB-123 | foundry airline | LHR |
示例 2:基本情况¶
参数值:
- 左侧列选择条件:
columnNameIsIn(
columnNames: [tail_number, airline, factor],
) - 右侧列选择条件:
columnNameIsIn(
columnNames: [home_airport],
) - 连接条件: [(
tail_number,tail_number), (factor,factor)] - 左侧数据集: ri.foundry.main.dataset.left
- 单行最大连接行数: 10
- 右侧数据集: ri.foundry.main.dataset.right
- 右侧列前缀: null
输入:
ri.foundry.main.dataset.left
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| PA-452 | new air | 212 | 2 |
| XB-123 | foundry airline | 1134 | 2 |
ri.foundry.main.dataset.right
| tail_number | home_airport | factor |
|---|---|---|
| XB-123 | LHR | 2 |
| MT-222 | CPH | 1 |
| KK-452 | JFK | 10 |
| JR-201 | IAD | 4 |
输出:
| tail_number | airline | factor | home_airport |
|---|---|---|---|
| XB-123 | foundry air | 2 | LHR |
| MT-222 | new airline | 5 | null |
| XB-123 | foundry airline | 5 | null |
| MT-222 | new air | 4 | null |
| KK-452 | new air | 1 | null |
| PA-452 | new air | 2 | null |
| XB-123 | foundry airline | 2 | LHR |
示例 3:基本情况¶
参数值:
- 左侧列选择条件:
allColumns(
) - 右侧列选择条件:
columnNameIsIn(
columnNames: [home_airport],
) - 连接条件: [(
tail_number,tail_number)] - 左侧数据集: ri.foundry.main.dataset.left
- 单行最大连接行数: 10
- 右侧数据集: ri.foundry.main.dataset.right
- 右侧列前缀: null
输入:
ri.foundry.main.dataset.left
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| PA-452 | new air | 212 | 2 |
| XB-123 | foundry airline | 1134 | 2 |
ri.foundry.main.dataset.right
| tail_number | home_airport |
|---|---|
| XB-123 | LHR |
| XB-123 | LGW |
| MT-222 | CPH |
| KK-452 | JFK |
| JR-201 | IAD |
输出:
| tail_number | airline | miles | factor | home_airport |
|---|---|---|---|---|
| XB-123 | foundry air | 124 | 2 | LHR |
| XB-123 | foundry air | 124 | 2 | LGW |
| MT-222 | new airline | 1123 | 5 | CPH |
| XB-123 | foundry airline | 335 | 5 | LHR |
| XB-123 | foundry airline | 335 | 5 | LGW |
| MT-222 | new air | 565 | 4 | CPH |
| KK-452 | new air | 222 | 1 | JFK |
| PA-452 | new air | 212 | 2 | null |
| XB-123 | foundry airline | 1134 | 2 | LHR |
| XB-123 | foundry airline | 1134 | 2 | LGW |