
Monitoring FAQ

Do all health checks now exist as monitoring rules?

Not all health checks exist as monitoring rules, but the most important ones have analogous monitoring rules. We recommend combining monitoring rules and health checks in a monitoring view. Coverage from monitoring views and health checks breaks down as follows:

  • Resources that can only be monitored with monitoring views: Data connection agents, objects and links in Object Storage V2 (OSv2), streaming datasets, live deployments of models, time series syncs, and automations.
  • Dataset-level checks that only exist as health checks: Content, freshness, and schema checks; data expectations; Object Storage V1 (Phonograph) and foundry-sync checks.
  • Monitoring rules that replace functionality from health checks: Consecutive schedule failures (replacing schedule status checks), schedule duration, and dataset time since job last succeeded monitors.

Why use monitors over health checks?

Monitors cover an entire scope rather than a single resource. This means that when an additional resource is added to that scope, it is automatically covered by the rule. For example, a monitoring rule that is set up to monitor all agents in a Project will also monitor any further agents added into that Project at a later time.
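
The difference between scope-based and per-resource coverage can be sketched in a few lines of Python. This is only an illustrative model, not a platform API: the `Project` and `ScopedRule` classes and the agent names are hypothetical. The rule resolves its targets from the scope each time it is evaluated, so an agent added later is covered without editing the rule.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for a Project that holds named agents."""
    name: str
    agents: set[str] = field(default_factory=set)


@dataclass
class ScopedRule:
    """Hypothetical rule whose targets are resolved from its scope on demand."""
    scope: Project

    def covered_agents(self) -> set[str]:
        # Resolve membership on every call instead of storing a fixed list,
        # so agents added to the scope later are picked up automatically.
        return set(self.scope.agents)


project = Project("x", {"agent-1", "agent-2"})
rule = ScopedRule(scope=project)
before = rule.covered_agents()

project.agents.add("agent-3")  # added after the rule was created
after = rule.covered_agents()
```

Here `before` holds the two original agents and `after` also includes `agent-3`, although the rule itself was never modified. A per-resource check would need a new check created for the third agent.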

When should I create a new monitoring view instead of adding new rules to an existing one?

Each monitoring view should correspond to a set of users who care about the monitors in that view. If a specific set of users [a, b, c] cares about specific Projects [x, y, z], create a single monitoring view with all of the resources in those Projects. If a specific set of users cares only about monitoring agents, create a single monitoring view that monitors all agents in all Projects.

What permissions are required for a monitoring view?

Since a monitoring view is a filesystem resource, a user needs permission on the Project or folder in which the view is saved. To receive alerts or set up monitoring rules on a resource, the user also needs access to the Project resources they wish to monitor. Even if a user with all necessary permissions subscribes another user or group to a monitoring view, those new subscribers will not receive alerts on any resources unless they have explicit access permissions to that monitoring view.
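
As a rough sketch of the conditions described above, the following Python models alert delivery as the conjunction of three checks: the user is subscribed, the user has access to the monitoring view, and the user has access to the monitored resource. The `MonitoringView` class, the `receives_alert` function, and the user names are hypothetical and exist only for illustration; the platform's actual permission model may be more granular.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringView:
    """Hypothetical model of a monitoring view saved in a Project or folder."""
    name: str
    viewers: set[str] = field(default_factory=set)      # users with access to the view itself
    subscribers: set[str] = field(default_factory=set)  # users subscribed to its alerts


def receives_alert(user: str, view: MonitoringView, resource_readers: set[str]) -> bool:
    """Return True only if the user is subscribed and can access both the view and the resource."""
    return (
        user in view.subscribers
        and user in view.viewers
        and user in resource_readers
    )


view = MonitoringView(
    name="agents",
    viewers={"alice"},
    subscribers={"alice", "bob"},  # bob was subscribed by someone else
)
resource_readers = {"alice", "bob"}  # both can access the monitored resource

alice_alerted = receives_alert("alice", view, resource_readers)
bob_alerted = receives_alert("bob", view, resource_readers)
```

In this sketch `alice_alerted` is true, while `bob_alerted` is false: bob is subscribed and can read the resource, but he lacks access to the monitoring view itself.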

