Pivot(数据透视(Pivot))¶
Supported in: Batch, Faster
Performs the specified aggregations on the input dataset grouped by a set of columns. Unique values to pivot on must be provided such that the output schema is known ahead of runtime. This improves runtime stability over time.
Transform categories: Aggregate, Popular
Declared arguments¶
- Aggregations: List of aggregations to perform on the dataset.
List\> - Dataset: Dataset to perform aggregate on.
Table - Group by columns: List of columns to group the dataset by when aggregating.
List\> - Pivot by column: Column to pivot on.
Column\ - Pivot by values: List of unique values used to pivot and aliases for the output. Alias values are used to construct the output column name according to the prefix / suffix argument.
List\, Literal\\ >> - optional Prefix or suffix alias: If prefix, the output column name will be 'alias''aggregate', if suffix it will be 'aggregate'
alias.
Enum\
Type variable bounds: T accepts Boolean | Byte | Integer | Long | Short | String
Examples¶
Example 1: Base case¶
Argument values:
- Aggregations: [
alias(
alias: miles,
expression:
mean(
expression:miles,
),
)] - Dataset: ri.foundry.main.dataset.a
- Group by columns: [
airline] - Pivot by column:
airport - Pivot by values: [(JFK, new_york), (LHR, london)]
- Prefix or suffix alias: null
Input:
| airline | airport | miles |
|---|---|---|
| foundry airways | JFK | 1002345 |
| foundry airways | LHR | 2221324 |
| new air | SFO | 21356673 |
| new air | JFK | 12323456 |
| foundry airways | LHR | 12542352 |
| new air | JFK | 12232355 |
Output:
| airline | new_york_miles | london_miles |
|---|---|---|
| foundry airways | 1002345.0 | 7381838.0 |
| new air | 1.22779055E7 | null |
Example 2: Base case¶
Argument values:
- Aggregations: [
alias(
alias: miles,
expression:
mean(
expression:miles,
),
)] - Dataset: ri.foundry.main.dataset.a
- Group by columns: [
airline] - Pivot by column:
airport - Pivot by values: [(JFK, new_york), (LHR, london)]
- Prefix or suffix alias:
SUFFIX
Input:
| airline | airport | miles |
|---|---|---|
| foundry airways | JFK | 1002345 |
| foundry airways | LHR | 2221324 |
| new air | SFO | 21356673 |
| new air | JFK | 12323456 |
| foundry airways | LHR | 12542352 |
| new air | JFK | 12232355 |
Output:
| airline | miles_new_york | miles_london |
|---|---|---|
| foundry airways | 1002345.0 | 7381838.0 |
| new air | 1.22779055E7 | null |
中文翻译¶
数据透视(Pivot)¶
支持:批处理(Batch)、快速处理(Faster)
对输入数据集按指定列分组后执行指定的聚合操作。必须提供用于透视的唯一值,以便在运行前确定输出模式(schema)。这有助于提高运行时的长期稳定性。
转换类别:聚合(Aggregate)、常用(Popular)
声明参数¶
- 聚合操作(Aggregations): 对数据集执行的聚合操作列表。
List\> - 数据集(Dataset): 执行聚合操作的数据集。
Table - 分组列(Group by columns): 聚合时用于对数据集进行分组的列列表。
List\> - 透视列(Pivot by column): 用于透视的列。
Column\ - 透视值(Pivot by values): 用于透视的唯一值列表及其输出别名。别名值用于根据前缀/后缀参数构建输出列名。
List\, Literal\\ >> - 可选 前缀或后缀别名(Prefix or suffix alias): 若选择前缀,输出列名格式为"别名""聚合操作";若选择后缀,则格式为"聚合操作"
别名。
Enum\
类型变量边界: T 接受 Boolean | Byte | Integer | Long | Short | String
示例¶
示例 1:基础情况¶
参数值:
- 聚合操作: [
alias(
alias: miles,
expression:
mean(
expression:miles,
),
)] - 数据集: ri.foundry.main.dataset.a
- 分组列: [
airline] - 透视列:
airport - 透视值: [(JFK, new_york), (LHR, london)]
- 前缀或后缀别名: null
输入:
| airline | airport | miles |
|---|---|---|
| foundry airways | JFK | 1002345 |
| foundry airways | LHR | 2221324 |
| new air | SFO | 21356673 |
| new air | JFK | 12323456 |
| foundry airways | LHR | 12542352 |
| new air | JFK | 12232355 |
输出:
| airline | new_york_miles | london_miles |
|---|---|---|
| foundry airways | 1002345.0 | 7381838.0 |
| new air | 1.22779055E7 | null |
示例 2:基础情况¶
参数值:
- 聚合操作: [
alias(
alias: miles,
expression:
mean(
expression:miles,
),
)] - 数据集: ri.foundry.main.dataset.a
- 分组列: [
airline] - 透视列:
airport - 透视值: [(JFK, new_york), (LHR, london)]
- 前缀或后缀别名:
SUFFIX
输入:
| airline | airport | miles |
|---|---|---|
| foundry airways | JFK | 1002345 |
| foundry airways | LHR | 2221324 |
| new air | SFO | 21356673 |
| new air | JFK | 12323456 |
| foundry airways | LHR | 12542352 |
| new air | JFK | 12232355 |
输出:
| airline | miles_new_york | miles_london |
|---|---|---|
| foundry airways | 1002345.0 | 7381838.0 |
| new air | 1.22779055E7 | null |