Coalesce data

Supported in: Batch, Faster

Reduces the number of partitions in a dataset. For example, coalescing 1000 partitions down to 100 does not trigger a shuffle; instead, each of the 100 new partitions claims 10 of the current partitions. If the requested number of partitions is larger than the current number, the dataset keeps its current number of partitions.
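
The partition arithmetic described above can be sketched in plain Python. This is only an illustrative model, not the transform's implementation: the function names `coalesced_partition_count` and `assign_partitions` are hypothetical, and the contiguous grouping is an assumption made for simplicity (a real engine may pick which partitions to merge differently, for example based on data locality).

```python
from typing import List


def coalesced_partition_count(current: int, requested: int) -> int:
    """Return the partition count after coalescing.

    Coalescing can only reduce the count, so a larger request
    leaves the current count unchanged.
    """
    if current < 0 or requested < 1:
        raise ValueError("current must be >= 0 and requested must be >= 1")
    return min(current, requested)


def assign_partitions(current: int, requested: int) -> List[List[int]]:
    """Group existing partition indices into the new partitions.

    Each existing partition is claimed whole by exactly one new
    partition, so no rows are redistributed (no shuffle).
    Contiguous grouping is an assumption of this sketch.
    """
    target = coalesced_partition_count(current, requested)
    groups: List[List[int]] = [[] for _ in range(target)]
    for idx in range(current):
        # Map old partition idx to a new partition in [0, target).
        groups[idx * target // current].append(idx)
    return groups


groups = assign_partitions(1000, 100)
print(len(groups))                           # 100
print(len(groups[0]))                        # 10
print(coalesced_partition_count(100, 1000))  # 100 (request is larger, count unchanged)
```

When the current count is not an exact multiple of the target, the groups in this sketch differ in size by at most one partition; the source does not say how the actual operation handles that case.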

Transform categories: Other

Declared arguments

  • Dataset: Dataset to perform coalesce on.
    Table
  • Number of partitions: Number of partitions to coalesce to.
    Literal<Integer>
