
Troubleshooting automation performance

This guide helps you identify the root cause of unexpected performance issues and provides solutions. This page covers common performance spike patterns, a systematic diagnostic process, and immediate mitigation steps to reduce resource consumption.

:::callout{theme="info"} For proactive guidance on designing efficient automations, see Performance best practices. :::

Common performance spike patterns

Below are common patterns to help identify and diagnose performance issues with automations.

Pattern 1: Active automation on frequently updating objects

Symptoms: The automation ran many times in a short period, generating unexpected resource consumption. For example, an automation that monitors an object type receiving a large number of updates can generate many automation runs, each potentially triggering downstream effects.

Root cause: An object type updates very frequently, and the automation is configured with an "on object update" condition without a time-based cap. Often this happens when an automation that was previously paused gets unpaused.

How to fix:

  1. Immediately pause the automation to stop further executions.
  2. Add a time-based condition with an appropriate interval to cap execution frequency.
  3. Verify that Single execution mode is enabled.
  4. Resume the automation and monitor execution counts closely for the next day.
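To illustrate why a time-based cap matters, here is a minimal sketch that compares run counts with and without batching updates into fixed windows. It is a simplified model, not Automate's actual scheduling logic; the function names and the 15-minute interval are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def runs_without_cap(update_times):
    """Model: every object update triggers its own automation run."""
    return len(update_times)


def runs_with_cap(update_times, interval):
    """Model: all updates in the same interval window share one run."""
    if not update_times:
        return 0
    start = min(update_times)
    # timedelta // timedelta yields the integer index of the window.
    windows = {(t - start) // interval for t in update_times}
    return len(windows)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0, 0)
    # Hypothetical workload: one object update per second for an hour.
    updates = [start + timedelta(seconds=i) for i in range(3600)]
    print(runs_without_cap(updates))                      # 3600
    print(runs_with_cap(updates, timedelta(minutes=15)))  # 4
```

Under this model, the same hour of updates produces 4 runs instead of 3,600, which is why the interval you choose in step 2 largely determines resource consumption.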

Pattern 2: Chained automations snowball

Symptoms: Multiple automations are running in sequence, with execution counts growing exponentially.

Root cause: Automations form a chain where each automation edits objects, triggering the next automation in the sequence:

  • Automation A edits objects
  • These edits trigger Automation B, which processes each object separately
  • Automation B edits more objects, triggering Automation C
  • The multiplicative effect can turn initial updates into exponentially more executions

How to fix:

  1. Pause all downstream automations in the chain.
  2. Evaluate whether you can consolidate the logic into fewer automations.
  3. Ensure that any remaining automations use bulk processing (ObjectSet inputs and Single execution mode).
  4. Re-enable automations one at a time, monitoring execution counts after each.
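The multiplicative effect can be estimated with simple arithmetic. The sketch below is a rough model under stated assumptions: each downstream automation either runs once per edited object or once in bulk, and each processed object leads to a fixed number of further edits. The function and parameter names are hypothetical.

```python
def estimate_chain_runs(initial_edits, edits_per_object, per_object=True):
    """Estimate run counts for each downstream automation in a chain.

    initial_edits: number of objects edited by the first automation.
    edits_per_object: one entry per downstream automation, giving how many
        objects it edits for each object it processes.
    per_object: True if each edited object triggers a separate run,
        False if each automation processes all pending edits in one bulk run.
    """
    runs = []
    pending = initial_edits
    for edits in edits_per_object:
        runs.append(pending if per_object else 1)
        # Every processed object produces `edits` new edits for the next stage.
        pending *= edits
    return runs


if __name__ == "__main__":
    # Automation A edits 100 objects; B, C, and D each edit 5 objects
    # for every object they process.
    print(estimate_chain_runs(100, [5, 5, 5]))                    # [100, 500, 2500]
    print(estimate_chain_runs(100, [5, 5, 5], per_object=False))  # [1, 1, 1]
```

In this example, 100 initial edits become 3,100 downstream runs when each object is processed separately, versus 3 runs with bulk processing.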

Pattern 3: Inefficient function operations

Symptoms: Function execution time is high, and resource consumption scales poorly with object count.

Root cause: Functions contain loops that query the Ontology once per iteration instead of processing objects in bulk.

How to fix:

  1. Review the function code for loops with Ontology queries inside.
  2. Refactor to load all required objects upfront or use backend aggregations.
  3. Change action inputs to accept ObjectSets instead of individual objects when possible.

For comprehensive function optimization guidance, see Optimize function performance.
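The refactor described in step 2 can be sketched as follows. `FakeOntology` is a hypothetical in-memory stand-in that only counts queries; it is not the real Ontology API, and the method names are assumptions chosen for illustration.

```python
class FakeOntology:
    """Hypothetical stand-in for an object store that counts queries."""

    def __init__(self, orders_by_customer):
        self._orders = orders_by_customer
        self.query_count = 0

    def get_orders(self, customer_id):
        """One query per call: expensive when called inside a loop."""
        self.query_count += 1
        return self._orders.get(customer_id, [])

    def get_orders_bulk(self, customer_ids):
        """One query for all requested customers."""
        self.query_count += 1
        return {cid: self._orders.get(cid, []) for cid in customer_ids}


def totals_per_iteration(store, customer_ids):
    """Inefficient pattern: queries the store once per loop iteration."""
    return {cid: sum(store.get_orders(cid)) for cid in customer_ids}


def totals_bulk(store, customer_ids):
    """Preferred pattern: loads everything upfront, then loops in memory."""
    orders = store.get_orders_bulk(customer_ids)
    return {cid: sum(orders[cid]) for cid in customer_ids}
```

Both functions return the same totals, but the first issues one query per customer while the second issues a single query regardless of how many customers are passed in.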

Pattern 4: Function self-calling loop

Symptoms: A single function is being called many times, often recursively.

Root cause: A function edits objects, and those edits trigger the same automation that calls the function, creating a loop. Without guards in place, this can continue until manually stopped.

How to fix:

  1. Add a status flag or timestamp to objects to prevent re-processing.
  2. Add conditional logic to check whether processing is needed before making edits.
  3. Consider whether the logic should be moved outside of Automate entirely.
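A status-flag guard along the lines of steps 1 and 2 might look like the sketch below. The `Ticket` type, its `status` field, and the trigger simulation are hypothetical and only model the loop; in a real setup you would likely also filter on the flag in the automation's condition so the second run never starts.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    """Hypothetical object type with a status flag used as a guard."""
    priority: str = "LOW"
    status: str = "NEW"


def process_ticket(ticket):
    """Edit the ticket only if needed. Returns True if an edit was made."""
    if ticket.status == "PROCESSED":
        # Guard: already handled, so make no edit and trigger nothing.
        return False
    ticket.priority = "HIGH"
    ticket.status = "PROCESSED"
    return True


def simulate_trigger_loop(ticket, max_runs=10):
    """Model: each edit re-triggers the automation until no edit occurs."""
    runs = 0
    edited = True
    while edited and runs < max_runs:
        edited = process_ticket(ticket)
        runs += 1
    return runs


if __name__ == "__main__":
    print(simulate_trigger_loop(Ticket()))  # 2: one edit, then one skipped run
```

Without the guard, the simulated loop would only stop at `max_runs`; with it, the second run exits without editing and the cycle ends.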

Diagnostic process

To investigate a performance spike, follow these steps:

  1. Check automation execution history: Open the automation and review the execution history. Key questions to consider:
     • How many times did it run in the last day?
     • When did the spike in executions begin?
     • Is the frequency increasing over time?

  2. Identify condition frequency: Examine the condition configuration and object update patterns. Key questions to consider:
     • How often are objects being updated that meet the conditions?
     • Is a time-based condition configured? If so, what is the interval?
     • Are updates happening more frequently than expected?

  3. Trace automation chains: Use Autopilot or Workflow Lineage to understand dependencies. Key questions to consider:
     • Which automations trigger other automations?
     • What is the full chain of effects?
     • Are there potential snowball effects where one execution multiplies?

  4. Review function implementation: Examine the functions being called by the automations. Key topics to investigate:
     • Do functions contain loops with Ontology queries?
     • Are bulk processing patterns being used correctly?
     • Check function execution times and external call counts.

  5. Look for recursive conditions: Determine if automations are triggering themselves. Key questions to consider:
     • Does the function edit objects that cause the same automation's conditions to be met again?
     • Are there status flags or guards to prevent recursive processing?
     • Does the execution history show rapid repeated calls?
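If you can copy run timestamps out of the execution history, the questions in the first diagnostic step (how many runs, and when the spike began) can be answered with a short script. This is a minimal sketch that assumes you already have the timestamps as `datetime` values; how you obtain them is outside its scope, and the function names and threshold are hypothetical.

```python
from collections import Counter
from datetime import datetime


def runs_per_hour(timestamps):
    """Count runs in each clock hour, returned in chronological order."""
    counts = Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in timestamps
    )
    return dict(sorted(counts.items()))


def first_spike_hour(hourly_counts, threshold):
    """Return the first hour whose run count exceeds threshold, or None."""
    for hour, count in hourly_counts.items():
        if count > threshold:
            return hour
    return None
```

For example, passing the result of `runs_per_hour` to `first_spike_hour` with a threshold slightly above your normal hourly volume gives a rough estimate of when the spike began.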

Immediate mitigation steps

When a performance or resource consumption spike is found, take these actions in priority order:

  1. Stop the bleeding
     • Immediately pause the automation that is causing the spike. This prevents further resource consumption during investigation.

  2. Assess impact
     • Check Resource Management to understand the total resource impact.
     • Identify any downstream automations that may also need to be paused.
     • Determine how far back the issue extends.

  3. Apply quick fix
     • Add a time-based condition if one is missing.
     • Change execution mode to Single execution if it is set to multiple.
     • Add conditional logic to the function to skip unnecessary processing.

  4. Monitor recovery
     • Resume the automation with reduced frequency or limited scope.
     • Watch execution counts closely for the next day.
     • Verify that resource consumption returns to expected levels.

Tools and resources

Below are several resources for diagnostic information.

For execution history and cost breakdowns

Automation workflow overview

  • Autopilot: Control center for managing and monitoring automation workflows at scale
  • Workflow Lineage: Visualizes automation dependencies and chains
