Maintaining pipelines

As data pipelines are created and productionized to support various use cases, some reach a state where they are no longer under active development and the emphasis shifts to pipeline maintenance.

This page focuses on the responsibilities of a pipeline maintainer and the prerequisites for bringing a pipeline into maintenance mode.

The rest of this section describes best practices and approaches for pipeline maintenance.

Prerequisites and expectations

Before you begin maintaining a pipeline, it is important that you have clear expectations defined for it. This will help you set realistic alerting thresholds, prioritize maintenance work and alerts on your pipeline, delineate responsibilities between teams, and most importantly, ensure that the pipeline meets the needs of the users.

The best practices throughout this section assume that you have captured the following expectations:

  • What data is in the scope of the pipeline
  • What data is delivered
  • When data is delivered
  • When data is supposed to be built, in particular whether the pipeline should run over the weekends
  • At what frequency the data should ideally update
  • When data is considered critically out of date
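
One way to make these expectations actionable is to record them in a structured form that monitoring can read. The following is a minimal sketch, not a platform API: the `PipelineExpectations` class, its field names, and the `staleness_status` helper are all hypothetical, and the example assumes naive timestamps in a single time zone.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class PipelineExpectations:
    """Hypothetical record of the expectations agreed for one pipeline."""
    scope: str                         # what data is in scope
    delivered_datasets: tuple          # what data is delivered
    delivery_deadline: time            # when data is delivered
    runs_on_weekends: bool             # whether builds are expected on Sat/Sun
    ideal_update_interval: timedelta   # how often data should ideally update
    critical_staleness: timedelta      # age at which data is critically out of date


def weekend_overlap(start: datetime, end: datetime) -> timedelta:
    """Total time between start and end that falls on a Saturday or Sunday."""
    total = timedelta(0)
    day_start = datetime(start.year, start.month, start.day)
    while day_start < end:
        day_end = day_start + timedelta(days=1)
        if day_start.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            total += min(end, day_end) - max(start, day_start)
        day_start = day_end
    return total


def staleness_status(exp: PipelineExpectations,
                     last_updated: datetime,
                     now: datetime) -> str:
    """Classify data freshness as 'ok', 'late', or 'critical'."""
    age = now - last_updated
    if not exp.runs_on_weekends:
        # Do not count weekend hours against a pipeline that is not
        # expected to run over the weekend.
        age -= weekend_overlap(last_updated, now)
    if age > exp.critical_staleness:
        return "critical"
    if age > exp.ideal_update_interval:
        return "late"
    return "ok"
```

With an ideal interval of 24 hours and a critical threshold of 48 hours, a dataset last updated on a Friday evening and checked on Monday morning is only flagged if the pipeline is expected to run over the weekend. Writing the expectations down in this form also gives teams a single artifact to review when thresholds are disputed.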

Pipeline maintenance responsibilities

The responsibilities of a pipeline maintainer include:

  • Setting up the technical aspects of pipeline monitoring
  • Debugging the pipeline when it is broken (when health checks fail)
  • Making code changes and/or modifying the monitoring setup where necessary
  • Contacting upstream teams when data is incorrect or not received on time
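
The responsibilities above imply a first triage decision whenever a health check fails: is this a problem to debug in the pipeline itself, or one to raise with an upstream team? The sketch below illustrates that decision under stated assumptions. The `FailedCheck` type, the check kinds, and the returned action strings are hypothetical and do not correspond to any specific monitoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailedCheck:
    """Hypothetical description of a failed health check."""
    dataset: str
    kind: str  # assumed categories: "freshness", "content", or "build"


def triage(check: FailedCheck, upstream_inputs: set) -> str:
    """Suggest a first action for a failed check.

    upstream_inputs is the set of datasets that are delivered by other
    teams rather than built by this pipeline.
    """
    if check.dataset in upstream_inputs and check.kind in ("freshness", "content"):
        # Data that is late or incorrect on arrival is owned upstream.
        return "contact upstream team"
    if check.kind == "build":
        return "debug pipeline build"
    # Remaining failures are on datasets the pipeline owns; the cause may be
    # the transformation code or a monitoring threshold that needs adjusting.
    return "review pipeline code and monitoring setup"
```

For example, `triage(FailedCheck("raw_orders", "freshness"), {"raw_orders"})` suggests contacting the upstream team, while the same check on a dataset built by the pipeline points back to its own code and monitoring setup.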

In order to meet these responsibilities, the following skills and access are recommended for pipeline maintainers:

  • Data access (recommended if possible): Access to the underlying data makes it possible to investigate and debug data issues directly.
  • Technical skills (recommended): Pipeline monitoring team members should be able to read code and navigate pipeline development tools such as Code Repositories, Builds, Data Lineage, and Data Health. This ensures they can interpret and triage issues effectively across the entire pipeline.
  • Familiarity with the pipeline architecture (optional): Ideally, team members familiarize themselves with the pipeline before they begin monitoring it. This can be facilitated through documentation and infrastructure knowledge management.
