
Flatten struct

Supported in: Batch, Faster, Streaming

Takes all fields in a struct and turns them into columns in the output dataset. As the examples below show, the original struct column is kept alongside the new columns.

Transform categories: Struct

Declared arguments

  • Dataset: Dataset containing the struct column.
    Table
  • Expression: Expression evaluating to the struct column that will be flattened.
    Expression<Struct>
  • Max depth: The number of nesting levels to flatten within a nested struct.
    Literal<Integer>
  • optional Column prefix: Prefix added to all columns created during the flatten.
    Literal<String>
  • optional Separator: String placed between field names coming from nested structs. When left null, an underscore is used, as in Example 1.
    Literal<String>
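
The effect of Max depth can be sketched in plain Python. This is only an illustration, not the transform's implementation: the schema notation (a dict mapping field names to nested dicts or to None for leaf fields) and the function name field_paths are hypothetical, and the behavior at Max depth 1 (nested structs left intact) is an assumption, since the examples on this page only show a depth of 2.

```python
from typing import Iterator

# Hypothetical schema notation (not a Foundry API): each field name maps to a
# nested dict for a struct field, or to None for a leaf field.
SCHEMA = {"airline": {"id": None, "name": None}, "tail_no": None}


def field_paths(schema: dict, max_depth: int, depth: int = 1,
                parent: tuple = ()) -> Iterator[tuple]:
    """Yield the field paths that would become columns for a given max depth."""
    for name, sub in schema.items():
        path = parent + (name,)
        if isinstance(sub, dict) and depth < max_depth:
            # Nested struct still within the depth limit: descend one level.
            yield from field_paths(sub, max_depth, depth + 1, path)
        else:
            # Leaf field, or a struct kept whole because the limit was reached
            # (assumed behavior, not confirmed by the examples).
            yield path


print(list(field_paths(SCHEMA, max_depth=2)))
# [('airline', 'id'), ('airline', 'name'), ('tail_no',)]
print(list(field_paths(SCHEMA, max_depth=1)))
# [('airline',), ('tail_no',)]
```

With a depth of 2, the three paths correspond to the three new columns shown in the examples below; the column order produced by the actual transform may differ from this sketch.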

Examples

Example 1: Base case

Argument values:

  • Dataset: ri.foundry.main.dataset.a
  • Expression: raw
  • Max depth: 2
  • Column prefix: new_
  • Separator: null

Input:

| raw |
| --- |
| { airline: { id: NA, name: new air }, tail_no: NA-123 } |
| { airline: { id: FA, name: foundry airways }, tail_no: FA-123 } |

Output:

| new_airline_name | new_airline_id | new_tail_no | raw |
| --- | --- | --- | --- |
| new air | NA | NA-123 | { airline: { id: NA, name: new air }, tail_no: NA-123 } |
| foundry airways | FA | FA-123 | { airline: { id: FA, name: foundry airways }, tail_no: FA-123 } |

Example 2: Custom separator

Argument values:

  • Dataset: ri.foundry.main.dataset.a
  • Expression: raw
  • Max depth: 2
  • Column prefix: new_
  • Separator: #SEPARATOR#

Input:

| raw |
| --- |
| { airline: { id: NA, name: new air }, tail_no: NA-123 } |
| { airline: { id: FA, name: foundry airways }, tail_no: FA-123 } |

Output:

| new_airline#SEPARATOR#name | new_airline#SEPARATOR#id | new_tail_no | raw |
| --- | --- | --- | --- |
| new air | NA | NA-123 | { airline: { id: NA, name: new air }, tail_no: NA-123 } |
| foundry airways | FA | FA-123 | { airline: { id: FA, name: foundry airways }, tail_no: FA-123 } |
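
The column names in Examples 1 and 2 are consistent with a simple rule: the prefix is prepended as-is, and the separator is used only between nested field names. A minimal Python sketch of that rule follows. The helper column_name is hypothetical, and treating a null separator as an underscore is inferred from Example 1 rather than from a documented default.

```python
from typing import Optional, Sequence


def column_name(path: Sequence[str], prefix: str = "",
                separator: Optional[str] = None) -> str:
    """Build an output column name from a nested field path."""
    # Inferred from Example 1: a null separator behaves like "_".
    sep = "_" if separator is None else separator
    # The prefix is attached directly; the separator only joins field names.
    return prefix + sep.join(path)


print(column_name(["airline", "name"], prefix="new_"))
# new_airline_name
print(column_name(["airline", "name"], prefix="new_", separator="#SEPARATOR#"))
# new_airline#SEPARATOR#name
print(column_name(["tail_no"], prefix="new_", separator="#SEPARATOR#"))
# new_tail_no
```

Because tail_no is a top-level field, its column name is the same in both examples: there is no nested field name to join, so the separator never appears.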

Example 3: Null case

Argument values:

  • Dataset: ri.foundry.main.dataset.a
  • Expression: raw
  • Max depth: 2
  • Column prefix: new_
  • Separator: null

Input:

| raw |
| --- |
| null |
| { airline: null, tail_no: NA-123 } |
| { airline: { id: FA, name: null }, tail_no: FA-123 } |

Output:

| new_airline_name | new_airline_id | new_tail_no | raw |
| --- | --- | --- | --- |
| null | null | null | null |
| null | null | NA-123 | { airline: null, tail_no: NA-123 } |
| null | FA | FA-123 | { airline: { id: FA, name: null }, tail_no: FA-123 } |
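
The null handling in Example 3 can be read as null propagation: if the struct, or any nested struct along a field path, is null, the resulting column value is null. The Python sketch below reproduces the three output rows under that reading. The helper get_nested and the use of dicts to stand in for structs are illustrative assumptions, not part of the transform.

```python
from typing import Any, Optional, Sequence


def get_nested(value: Optional[dict], path: Sequence[str]) -> Any:
    """Follow a field path through nested dicts, returning None once a level is null."""
    for key in path:
        if value is None:
            return None
        value = value.get(key)
    return value


# Rows from Example 3, with Python dicts standing in for structs.
ROWS = [
    None,
    {"airline": None, "tail_no": "NA-123"},
    {"airline": {"id": "FA", "name": None}, "tail_no": "FA-123"},
]
# Paths in the column order of the example output.
PATHS = [("airline", "name"), ("airline", "id"), ("tail_no",)]

for row in ROWS:
    print([get_nested(row, path) for path in PATHS])
# [None, None, None]
# [None, None, 'NA-123']
# [None, 'FA', 'FA-123']
```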

