Keeps duplicates¶
Supported in: Batch, Faster
Keeps only the rows that have at least one duplicate in the input; rows that occur once are dropped.
Transform categories: Other
Declared arguments¶
- Column subset: If any columns are specified, only those columns are used when determining uniqueness. Type: Set<Column>.
- Dataset: Dataset to keep duplicate rows from. Type: Table.
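The behavior described by these two arguments can be sketched in plain Python. This is only an illustration under stated assumptions, not the transform's actual implementation: the function name `keep_duplicates`, the list-of-dicts table representation, and the preservation of input order are all hypothetical choices made for the sketch.

```python
from collections import Counter


def keep_duplicates(rows, column_subset=()):
    """Return the rows whose key occurs more than once, in input order.

    rows: list of dicts with identical keys (a stand-in for a table).
    column_subset: columns that form the key; empty means all columns.
    """
    if not rows:
        return []
    # Assumption: an empty subset falls back to comparing every column.
    columns = sorted(column_subset) or sorted(rows[0])

    def key(row):
        return tuple(row[c] for c in columns)

    counts = Counter(key(row) for row in rows)  # first pass: group sizes
    return [row for row in rows if counts[key(row)] > 1]  # second pass: filter


rows = [
    {"tail_number": "XB-123", "miles": 124},
    {"tail_number": "KK-452", "miles": 222},
    {"tail_number": "XB-123", "miles": 335},
]
kept = keep_duplicates(rows, {"tail_number"})
```

Here `kept` holds the two `XB-123` rows, because only that tail number appears more than once. Calling `keep_duplicates(rows)` with no subset returns an empty list, since no two rows match on every column.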
Examples¶
Example 1: Base case¶
Argument values:
- Column subset: {tail_number}
- Dataset: ri.foundry.main.dataset.aggregate
Input:
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| XB-123 | foundry airline | 1134 | 3 |
Output:
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 5 |
| XB-123 | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| XB-123 | foundry airline | 1134 | 3 |
Example 2: Base case¶
Description: With no column subset, rows are compared on all columns, so only exact duplicates are kept.
Argument values:
- Column subset: {}
- Dataset: ri.foundry.main.dataset.aggregate
Input:
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| XB-123 | foundry air | 124 | 2 |
| XB-123 | foundry air | 124 | 2 |
| MT-222 | new airline | 1123 | 6 |
| MT-222 | new airline | 1123 | 5 |
Output:
| tail_number | airline | miles | factor |
|---|---|---|---|
| XB-123 | foundry air | 124 | 2 |
| XB-123 | foundry air | 124 | 2 |
| XB-123 | foundry air | 124 | 2 |
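To see why the two `MT-222` rows are dropped in this example, it may help to count how often each complete row occurs. The snippet below is a small sketch that models the input as Python tuples (an assumption made for illustration); with an empty column subset, the whole row acts as the key.

```python
from collections import Counter

# Input rows of Example 2 as (tail_number, airline, miles, factor) tuples.
rows = [
    ("XB-123", "foundry air", 124, 2),
    ("XB-123", "foundry air", 124, 2),
    ("XB-123", "foundry air", 124, 2),
    ("MT-222", "new airline", 1123, 6),
    ("MT-222", "new airline", 1123, 5),
]

# With an empty column subset, the entire row is the grouping key.
group_sizes = Counter(rows)

# Keep every row whose exact copy appears at least twice.
kept = [row for row in rows if group_sizes[row] > 1]
```

The `XB-123` row has a group size of 3, so all three copies are kept. The two `MT-222` rows differ in `factor` (6 and 5), so each forms a group of size 1 and is dropped.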
Example 3: Null case¶
Argument values:
- Column subset: {tail_number}
- Dataset: ri.foundry.main.dataset.aggregate
Input:
| tail_number | airline | miles | factor |
|---|---|---|---|
| null | foundry air | 124 | 2 |
| null | new airline | 1123 | 5 |
| null | foundry airline | 335 | 5 |
| MT-222 | new air | 565 | 4 |
| KK-452 | new air | 222 | 1 |
| XB-123 | foundry airline | 1134 | 3 |
Output:
| tail_number | airline | miles | factor |
|---|---|---|---|
| null | foundry air | 124 | 2 |
| null | new airline | 1123 | 5 |
| null | foundry airline | 335 | 5 |
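The output above indicates that null values in the subset column are treated as equal to one another. Reproducing this in SQL-style tooling needs some care, because `NULL = NULL` does not evaluate to true. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine; the table name `flights` and the query shape are assumptions for illustration and say nothing about how the transform itself is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights "
    "(tail_number TEXT, airline TEXT, miles INTEGER, factor INTEGER)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?)",
    [
        (None, "foundry air", 124, 2),
        (None, "new airline", 1123, 5),
        (None, "foundry airline", 335, 5),
        ("MT-222", "new air", 565, 4),
        ("KK-452", "new air", 222, 1),
        ("XB-123", "foundry airline", 1134, 3),
    ],
)

# GROUP BY places all NULLs in one group. The join back to the full rows
# uses SQLite's null-safe IS operator so that NULL keys still match.
query = """
SELECT f.tail_number, f.airline, f.miles, f.factor
FROM flights AS f
JOIN (
    SELECT tail_number FROM flights
    GROUP BY tail_number HAVING COUNT(*) > 1
) AS d ON f.tail_number IS d.tail_number
ORDER BY f.rowid
"""
kept = conn.execute(query).fetchall()

# The same join with `=` silently loses the NULL group.
naive = conn.execute(
    query.replace("IS d.tail_number", "= d.tail_number")
).fetchall()
```

With the null-safe join, `kept` contains the three rows whose `tail_number` is null, matching the output table. With a plain `=` join, `naive` is empty, because the only duplicated key is null and never compares equal to itself.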