
Function monitoring

Functions in Foundry can be monitored to track performance and reliability. This page explains the available monitoring capabilities for functions.

Available monitoring rules

Function monitoring in Foundry supports the following rule types:

  1. Function duration p95: Alerts when the 95th percentile execution time exceeds thresholds.
  2. Number of function failures in window: Alerts when the total failure count exceeds thresholds within a timeframe. This rule tracks all failure types.
  3. Number of user-facing function failures in window: Alerts when the count of user-facing failures exceeds thresholds within a timeframe. This rule tracks only user-facing errors thrown by function code.
  4. Number of non-user-facing function failures in window: Alerts when the count of non-user-facing failures exceeds thresholds within a timeframe. This rule excludes user-facing errors, making it useful for monitoring infrastructure and system-level failures.
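To make the four rule types concrete, the following is a minimal sketch of the quantities they describe, computed over a list of execution records. It is not Foundry's implementation: the `Execution` record, its field names, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Execution:
    """One function execution (hypothetical shape, not a Foundry type)."""
    finished_at: datetime
    duration_ms: float
    failed: bool = False
    user_facing: bool = False  # only meaningful when failed is True


def p95(durations: list[float]) -> Optional[float]:
    """Nearest-rank 95th percentile; None when there are no samples."""
    if not durations:
        return None
    ordered = sorted(durations)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def failure_counts(
    executions: list[Execution], window_end: datetime, window: timedelta
) -> dict[str, int]:
    """Count failures that finished within (window_end - window, window_end]."""
    window_start = window_end - window
    failures = [
        e for e in executions
        if e.failed and window_start < e.finished_at <= window_end
    ]
    user_facing = sum(1 for e in failures if e.user_facing)
    return {
        "all": len(failures),                          # rule 2
        "user_facing": user_facing,                    # rule 3
        "non_user_facing": len(failures) - user_facing,  # rule 4
    }
```

In this sketch the user-facing and non-user-facing counts always sum to the total, which mirrors how rules 3 and 4 partition the failures tracked by rule 2.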

For detailed configuration options and parameters for each rule type, review the monitoring rules reference documentation.

Set up function monitoring

To set up monitoring for your functions, follow the standard process for creating monitoring views and rules:

  1. Create a monitoring view as described in the monitoring views overview documentation.
  2. Add a monitoring rule for functions as described in the section on adding a monitoring rule.
  3. Configure appropriate thresholds and severity levels.
  4. Set up alert notifications following the alert subscription guide.

Example monitoring alert setup.
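Step 3 pairs thresholds with severity levels. One way to reason about that pairing is sketched below, assuming that a higher threshold corresponds to a more severe level and that an alert fires only when the value strictly exceeds the threshold. The function name `severity_for` and the level names `warning` and `critical` are hypothetical and do not come from Foundry.

```python
from typing import Optional


def severity_for(value: float, thresholds: dict[str, float]) -> Optional[str]:
    """Return the level with the highest breached threshold, or None.

    Assumes higher thresholds map to more severe levels and that
    "exceeds" means strictly greater than the threshold.
    """
    breached = [(limit, name) for name, limit in thresholds.items() if value > limit]
    if not breached:
        return None
    # The largest breached limit identifies the most severe level.
    return max(breached)[1]


# Hypothetical thresholds for a failures-in-window rule.
failure_thresholds = {"warning": 5, "critical": 20}
```

With these example thresholds, 7 failures in the window would map to `warning`, 25 to `critical`, and exactly 5 would raise nothing because the threshold is not exceeded.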

Dynamic scopes

Function monitors support Workflow Lineage, Workshop, and OSDK applications as dynamic scopes. When you select one of these scopes, the monitor automatically tracks every function the scoped resource uses and adjusts as functions are added or removed, with no further intervention required.

Select scope dialog showing dynamic scope options for function monitors.
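The behavior of a dynamic scope can be pictured as re-resolving the set of functions each time the monitor evaluates, rather than storing a fixed list. The sketch below illustrates that idea only. `DynamicScopeMonitor`, `refresh`, and the resolver callback are hypothetical names, and how Foundry actually resolves the functions behind a scoped resource is not described here.

```python
from typing import Callable, Iterable


class DynamicScopeMonitor:
    """Tracks whatever functions the scoped resource currently uses."""

    def __init__(self, resolve_functions: Callable[[], Iterable[str]]) -> None:
        # The resolver stands in for "ask the scoped resource which
        # functions it uses right now" (an assumption for illustration).
        self._resolve = resolve_functions
        self.tracked: set[str] = set()

    def refresh(self) -> tuple[set[str], set[str]]:
        """Re-resolve the scope and return (added, removed) function names."""
        current = set(self._resolve())
        added = current - self.tracked
        removed = self.tracked - current
        self.tracked = current
        return added, removed


# Hypothetical application whose function usage changes over time.
app_functions = ["computeTotals", "validateOrder"]
monitor = DynamicScopeMonitor(lambda: app_functions)
monitor.refresh()

app_functions.append("sendReceipt")
app_functions.remove("validateOrder")
added, removed = monitor.refresh()
```

After the second refresh, `added` contains `sendReceipt`, `removed` contains `validateOrder`, and no one had to edit the monitor itself.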

