Frequent pattern growth¶
Supported in: Batch
Frequent pattern growth (FP-growth) finds frequent patterns in your dataset: combinations of items that appear together in at least a minimum fraction of rows.
Transform categories: Aggregate, Other
Declared arguments¶
- Input dataset (Table): Source dataset containing items columns and transaction column.
- Items column (Column<Array<String>>): Array column containing the items for the patterns.
- Minimum support (Literal<Double>): Minimum fraction of transactions in which a pattern must be present to be included in the output.
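The role of the minimum support argument can be illustrated with a small pure-Python sketch. This is a brute-force enumeration, not the FP-growth algorithm and not the transform's actual implementation. The function name `frequent_patterns` is hypothetical, and the inclusive threshold (`>=`) is an assumption that happens to agree with the examples on this page.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(transactions, min_support):
    """Brute-force frequent itemset search (illustrative, not FP-growth itself).

    Returns (pattern, pattern_occurrence, total_count) rows for every non-empty
    item combination whose occurrence / total_count is at least min_support.
    """
    total_count = len(transactions)
    occurrences = Counter()
    for items in transactions:
        unique_items = sorted(set(items))  # count each item once per transaction
        for size in range(1, len(unique_items) + 1):
            for pattern in combinations(unique_items, size):
                occurrences[pattern] += 1
    return [
        (list(pattern), count, total_count)
        for pattern, count in occurrences.items()
        if count / total_count >= min_support  # assumption: inclusive threshold
    ]


# Same input as Example 1 on this page.
rows = [
    ["age_group: 20-30", "country: Germany", "gender: Female"],
    ["age_group: 20-30", "country: Germany", "gender: Male"],
]
for row in frequent_patterns(rows, min_support=0.6):
    print(row)
```

With a minimum support of 0.6, only the patterns present in both rows pass the filter, which matches the three patterns listed in Example 1 (row order and item order within a pattern may differ). Enumerating every combination is exponential in the number of items per row, so this sketch is only suitable for tiny inputs.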
Examples¶
Example 1: Base case¶
Argument values:
- Input dataset: ri.foundry.main.dataset.a
- Items column: customer_attributes
- Minimum support: 0.6
Input:
| customer_attributes |
|---|
| [ age_group: 20-30, country: Germany, gender: Female ] |
| [ age_group: 20-30, country: Germany, gender: Male ] |
Output:
| pattern | pattern_occurrence | total_count |
|---|---|---|
| [ country: Germany, age_group: 20-30 ] | 2 | 2 |
| [ age_group: 20-30 ] | 2 | 2 |
| [ country: Germany ] | 2 | 2 |
Example 2: Null case¶
Argument values:
- Input dataset: ri.foundry.main.dataset.a
- Items column: customer_attributes
- Minimum support: 0.0
Input:
| customer_attributes |
|---|
| null |
Output:
| pattern | pattern_occurrence | total_count |
|---|---|---|
Example 3: Null item case¶
Argument values:
- Input dataset: ri.foundry.main.dataset.a
- Items column: customer_attributes
- Minimum support: 0.0
Input:
| customer_attributes |
|---|
| [ age_group: 20-30, country: Germany, gender: Female ] |
| [ null ] |
Output:
| pattern | pattern_occurrence | total_count |
|---|---|---|
| [ country: Germany ] | 1 | 2 |
| [ country: Germany, age_group: 20-30 ] | 1 | 2 |
| [ null ] | 1 | 2 |
| [ age_group: 20-30 ] | 1 | 2 |
| [ gender: Female ] | 1 | 2 |
| [ gender: Female, country: Germany ] | 1 | 2 |
| [ gender: Female, country: Germany, age_group: 20-30 ] | 1 | 2 |
| [ gender: Female, age_group: 20-30 ] | 1 | 2 |
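The difference between Examples 2 and 3 can be modeled with a short sketch: a row whose array is null produces no patterns, while a null inside an array behaves like any other item. The helper `patterns_in_row` is hypothetical and only mirrors the outputs shown on this page. How the transform handles nulls internally is not stated here, so treat this as an assumption.

```python
from itertools import combinations


def patterns_in_row(items):
    """Non-empty item combinations for one row; a null row yields none."""
    if items is None:  # Example 2: the whole array is null
        return []
    # Deduplicate while keeping order; None is kept as a valid item (Example 3).
    unique = list(dict.fromkeys(items))
    return [
        list(combo)
        for size in range(1, len(unique) + 1)
        for combo in combinations(unique, size)
    ]


print(patterns_in_row(None))    # []
print(patterns_in_row([None]))  # [[None]]
```

In Example 3 the row containing `[ null ]` still counts toward `total_count`, which is why every pattern there is reported with a total of 2.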
Example 4: Edge case¶
Argument values:
- Input dataset: ri.foundry.main.dataset.a
- Items column: customer_attributes
- Minimum support: 0.0
Input:
| customer_attributes |
|---|
| [ age_group: 20-30, country: Germany, gender: Female ] |
| [ age_group: 20-30, country: Germany, gender: Male ] |
Output:
| pattern | pattern_occurrence | total_count |
|---|---|---|
| [ gender: Male ] | 1 | 2 |
| [ gender: Male, country: Germany ] | 1 | 2 |
| [ gender: Male, country: Germany, age_group: 20-30 ] | 1 | 2 |
| [ gender: Male, age_group: 20-30 ] | 1 | 2 |
| [ age_group: 20-30 ] | 2 | 2 |
| [ country: Germany ] | 2 | 2 |
| [ country: Germany, age_group: 20-30 ] | 2 | 2 |
| [ gender: Female ] | 1 | 2 |
| [ gender: Female, country: Germany ] | 1 | 2 |
| [ gender: Female, country: Germany, age_group: 20-30 ] | 1 | 2 |
| [ gender: Female, age_group: 20-30 ] | 1 | 2 |
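Example 4 shows that a minimum support of 0.0 returns every pattern that occurs at least once. The sketch below counts those patterns by brute force to show how quickly the output can grow. The function name `distinct_patterns` is hypothetical, and the code illustrates the counting argument only, not the transform's implementation.

```python
from itertools import combinations


def distinct_patterns(transactions):
    """Collect every non-empty item combination that occurs in any transaction."""
    seen = set()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, len(unique) + 1):
            seen.update(combinations(unique, size))
    return seen


rows = [
    ["age_group: 20-30", "country: Germany", "gender: Female"],
    ["age_group: 20-30", "country: Germany", "gender: Male"],
]
print(len(distinct_patterns(rows)))  # 11, the number of output rows in Example 4

# A single transaction with n distinct items contributes 2**n - 1 patterns.
print(2**3 - 1, 2**20 - 1)  # 7 1048575
```

Each of the two rows contributes 7 patterns, and 3 of them (those built from `age_group: 20-30` and `country: Germany`) are shared, giving 7 + 7 - 3 = 11. Because the count grows exponentially with the number of items per row, a very low minimum support can produce very large outputs on wide item arrays.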