跳转至

Anti join(反连接(Anti join))

Supported in: Batch, Faster

Anti joins left and right dataset inputs, removing all rows from the left relation that match the provided condition.

Transform categories: Join

Declared arguments

  • Condition for columns to select on the left: All columns in the left input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
    ColumnPredicate
  • Join condition: Condition on which to join.
    Expression\
  • Left dataset: Left dataset to use in join.
    Table
  • Right dataset: Right dataset to use in join.
    Table

Examples

Example 1: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [tail_number, airline],
    )
  • Join condition:
    equals(
     left: tail_number,
     right: tail_number,
    )
  • Left dataset: ri.foundry.main.dataset.left
  • Right dataset: ri.foundry.main.dataset.right

Inputs:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport
XB-123 LHR
MT-222 CPH
KK-452 JFK
JR-201 IAD

Output:

tail_number airline
PA-452 new air

Example 2: Base case

Description: Simple complex join condition.

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [tail_number, airline, factor],
    )
  • Join condition:
    and(
     conditions: [
    lessThan(
     left: factor,
     right: factor,
    ),
    equals(
     left: tail_number,
     right: tail_number,
    )],
    )
  • Left dataset: ri.foundry.main.dataset.left
  • Right dataset: ri.foundry.main.dataset.right

Inputs:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport factor
XB-123 LHR 2
MT-222 CPH 1
KK-452 JFK 10
JR-201 IAD 4

Output:

tail_number airline factor
XB-123 foundry air 2
MT-222 new airline 5
XB-123 foundry airline 5
MT-222 new air 4
PA-452 new air 2
XB-123 foundry airline 2

Example 3: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [tail_number, airline, factor],
    )
  • Join condition:
    and(
     conditions: [
    equals(
     left: tail_number,
     right: tail_number,
    ),
    equals(
     left: factor,
     right: factor,
    )],
    )
  • Left dataset: ri.foundry.main.dataset.left
  • Right dataset: ri.foundry.main.dataset.right

Inputs:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport factor
XB-123 LHR 2
MT-222 CPH 1
KK-452 JFK 10
JR-201 IAD 4

Output:

tail_number airline factor
MT-222 new airline 5
XB-123 foundry airline 5
MT-222 new air 4
KK-452 new air 1
PA-452 new air 2

Example 4: Base case

Argument values:

  • Condition for columns to select on the left:
    allColumns(

    )
  • Join condition:
    equals(
     left: tail_number,
     right: tail_number,
    )
  • Left dataset: ri.foundry.main.dataset.left
  • Right dataset: ri.foundry.main.dataset.right

Inputs:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport
XB-123 LHR
MT-222 CPH
KK-452 JFK
JR-201 IAD

Output:

tail_number airline miles factor
PA-452 new air 212 2


中文翻译


反连接(Anti join)

支持:批处理(Batch)、快速处理(Faster)

反连接对左右两个数据集输入进行操作,从左关系中移除所有满足指定条件的行。

转换类别:连接(Join)

声明的参数

  • 左侧列选择条件(Condition for columns to select on the left): 左侧输入模式中的所有列都将被测试是否匹配此条件。如果匹配,该列将被选中并包含在输出中。
    列谓词(ColumnPredicate)
  • 连接条件(Join condition): 用于执行连接的条件。
    表达式\<布尔值>(Expression\
  • 左侧数据集(Left dataset): 用于连接的左侧数据集。
    表(Table)
  • 右侧数据集(Right dataset): 用于连接的右侧数据集。
    表(Table)

示例

示例 1:基础情况

参数值:

  • 左侧列选择条件:
    columnNameIsIn(
     columnNames: [tail_number, airline],
    )
  • 连接条件:
    equals(
     left: tail_number,
     right: tail_number,
    )
  • 左侧数据集: ri.foundry.main.dataset.left
  • 右侧数据集: ri.foundry.main.dataset.right

输入:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport
XB-123 LHR
MT-222 CPH
KK-452 JFK
JR-201 IAD

输出:

tail_number airline
PA-452 new air

示例 2:基础情况

描述: 简单的复杂连接条件。

参数值:

  • 左侧列选择条件:
    columnNameIsIn(
     columnNames: [tail_number, airline, factor],
    )
  • 连接条件:
    and(
     conditions: [
    lessThan(
     left: factor,
     right: factor,
    ),
    equals(
     left: tail_number,
     right: tail_number,
    )],
    )
  • 左侧数据集: ri.foundry.main.dataset.left
  • 右侧数据集: ri.foundry.main.dataset.right

输入:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport factor
XB-123 LHR 2
MT-222 CPH 1
KK-452 JFK 10
JR-201 IAD 4

输出:

tail_number airline factor
XB-123 foundry air 2
MT-222 new airline 5
XB-123 foundry airline 5
MT-222 new air 4
PA-452 new air 2
XB-123 foundry airline 2

示例 3:基础情况

参数值:

  • 左侧列选择条件:
    columnNameIsIn(
     columnNames: [tail_number, airline, factor],
    )
  • 连接条件:
    and(
     conditions: [
    equals(
     left: tail_number,
     right: tail_number,
    ),
    equals(
     left: factor,
     right: factor,
    )],
    )
  • 左侧数据集: ri.foundry.main.dataset.left
  • 右侧数据集: ri.foundry.main.dataset.right

输入:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport factor
XB-123 LHR 2
MT-222 CPH 1
KK-452 JFK 10
JR-201 IAD 4

输出:

tail_number airline factor
MT-222 new airline 5
XB-123 foundry airline 5
MT-222 new air 4
KK-452 new air 1
PA-452 new air 2

示例 4:基础情况

参数值:

  • 左侧列选择条件:
    allColumns(

    )
  • 连接条件:
    equals(
     left: tail_number,
     right: tail_number,
    )
  • 左侧数据集: ri.foundry.main.dataset.left
  • 右侧数据集: ri.foundry.main.dataset.right

输入:

ri.foundry.main.dataset.left

tail_number airline miles factor
XB-123 foundry air 124 2
MT-222 new airline 1123 5
XB-123 foundry airline 335 5
MT-222 new air 565 4
KK-452 new air 222 1
PA-452 new air 212 2
XB-123 foundry airline 1134 2

ri.foundry.main.dataset.right

tail_number home_airport
XB-123 LHR
MT-222 CPH
KK-452 JFK
JR-201 IAD

输出:

tail_number airline miles factor
PA-452 new air 212 2