
Data Connection

Data Connection is an application used to synchronize data from external systems and other Foundry instances for use in Foundry. Users can use Data Connection to sync data into Foundry for use in the data integration, modeling, and Ontology layers. Additionally, Data Connection enables setting up outbound connections to enable writeback to external systems via webhooks and data exports.

Access the Data Connection application by selecting the icon from the workspace navigation bar.

The Data Connection application in the Foundry sidebar.

You can also find the Data Connection application by searching in the Applications Portal.

The Data Connection application found in the Applications Portal

If this is the first time your organization is connecting Foundry to its data, start with the initial setup guide.

Foundry standardizes the data connection process with the following three principles:

  • Robustness
  • Extensibility
  • Ease of use

Robustness

Connecting data between systems is prone to failures that can be difficult to recover from. Issues in the external environment that are outside the user's control (for example, poor network connectivity, disk failures, or unresponsive source systems) can disrupt data syncs and, in turn, break downstream analytics pipelines. Incomplete or corrupt data is not only a technical challenge; it can also be dangerous for the organization if it is used without anyone noticing, or if data is unavailable when urgently needed.

Foundry proactively addresses these common failure points in three ways: automatic retries when a sync fails; simple sync functions (for example, filesystem and database syncs) that pull data from source systems in small batches using low-complexity queries; and an integrated data health monitoring system that alerts on critical failures and surfaces other pipeline health issues. Combined, these features minimize the risk of incomplete or corrupt data.
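To make the retry and small-batch ideas concrete, here is a minimal, generic sketch in Python. It is not Foundry's implementation or API: the names `with_retries`, `pull_in_batches`, and `flaky_fetch` are hypothetical, and the offset/limit paging scheme is an assumption chosen only for illustration.

```python
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def with_retries(operation: Callable[[], T], max_attempts: int = 3,
                 base_delay_s: float = 0.0) -> T:
    """Run `operation`, retrying on any exception up to `max_attempts` times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the failure surface to monitoring
            time.sleep(base_delay_s * 2 ** (attempt - 1))  # exponential backoff
    raise ValueError("max_attempts must be at least 1")


def pull_in_batches(fetch_batch: Callable[[int, int], Sequence[T]],
                    batch_size: int = 2, max_attempts: int = 3,
                    base_delay_s: float = 0.0) -> Iterator[T]:
    """Yield rows by requesting small (offset, limit) pages from the source."""
    offset = 0
    while True:
        batch = with_retries(lambda: fetch_batch(offset, batch_size),
                             max_attempts, base_delay_s)
        if not batch:
            return  # an empty page means the source is exhausted
        yield from batch
        offset += len(batch)


# Hypothetical source that fails once to simulate a transient network problem.
ROWS = ["a", "b", "c", "d", "e"]
calls = {"n": 0}


def flaky_fetch(offset: int, limit: int) -> Sequence[str]:
    calls["n"] += 1
    if calls["n"] == 2:
        raise ConnectionError("simulated network blip")
    return ROWS[offset:offset + limit]


result = list(pull_in_batches(flaky_fetch, batch_size=2))
print(result)
```

In this sketch, the single simulated failure is absorbed by one retry of the affected page, so all five rows still arrive in order. Because each request covers only a small page, a failure costs one small re-read rather than a restart of the whole transfer.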

Foundry is also distinguished by the philosophy that data should be ingested "as-is" from its rawest source, with no external preprocessing. Without external preprocessing, the branched and version-controlled Foundry pipeline becomes the single source for every change made to raw data on its way to the Ontology, and any issue that arises along that path can be identified and resolved inside the platform. Data Connection adheres to this design philosophy by supporting both tabular and file-based syncs and by deliberately offering minimal options for transforming data before it arrives in the destination dataset (the starting point of the Foundry pipeline).

Extensibility

Enterprises run a diverse, complex array of systems that add value both individually and as an integrated whole. Each system has its own integration requirements, and some require unique capabilities or features that would typically make them difficult to integrate.

Foundry provides out-of-the-box integration with well-known system types (for example, relational databases, FTPS, HDFS, S3, SFTP, and local directories) as well as the flexibility to connect to and sync data from new system types. In many cases, a new system can reuse an existing plugin, either unchanged or with minor changes. Core features (for example, scheduling and uploading) are standardized so that only the connection itself needs to be adjusted.
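The separation between a system-specific connection and standardized core features can be illustrated with a small Python sketch. This is an assumption-laden illustration, not Foundry's plugin API: `Source`, `InMemorySource`, and `run_sync` are hypothetical names, and a plain dictionary stands in for the destination dataset.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class Source(ABC):
    """Hypothetical connector interface: the only part that differs per system."""

    @abstractmethod
    def list_items(self) -> Iterable[str]:
        """Return the names of the items available in the source."""

    @abstractmethod
    def read(self, item: str) -> bytes:
        """Return the raw bytes of one item."""


class InMemorySource(Source):
    """Stand-in for a real system such as an SFTP server or an S3 bucket."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files

    def list_items(self) -> Iterable[str]:
        return sorted(self._files)

    def read(self, item: str) -> bytes:
        return self._files[item]


def run_sync(source: Source, destination: Dict[str, bytes]) -> int:
    """Shared upload logic: copy every item unmodified and return the count."""
    count = 0
    for item in source.list_items():
        destination[item] = source.read(item)
        count += 1
    return count


dataset: Dict[str, bytes] = {}
copied = run_sync(InMemorySource({"b.csv": b"2", "a.csv": b"1"}), dataset)
```

Under this design, supporting a new system type means writing another `Source` subclass; `run_sync` stays unchanged, and it copies bytes without transforming them, in keeping with the "as-is" ingestion philosophy described earlier.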

Foundry also supports communication between Foundry enrollments through the dedicated Foundry connector. With this connector, one Foundry instance can be treated as a source by another, with support for batch and streaming syncs and for incremental options.
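For readers unfamiliar with incremental syncing in general, the idea is to transfer only the rows that are new since the previous run. The sketch below shows one common approach, a high-water-mark cursor. It is a generic illustration rather than the Foundry connector's behavior: `incremental_pull` is a hypothetical function, and it assumes each row carries a monotonically increasing integer key named `id`.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, int]


def incremental_pull(rows: List[Row], cursor: Optional[int],
                     key: str = "id") -> Tuple[List[Row], Optional[int]]:
    """Return rows whose `key` exceeds `cursor`, plus the advanced cursor."""
    new_rows = [r for r in rows if cursor is None or r[key] > cursor]
    if new_rows:
        cursor = max(r[key] for r in new_rows)  # remember the highest key seen
    return new_rows, cursor


source_rows: List[Row] = [{"id": 1}, {"id": 2}]

first, cursor = incremental_pull(source_rows, None)    # initial run: everything
source_rows.append({"id": 3})                          # a new row arrives later
second, cursor = incremental_pull(source_rows, cursor) # next run: only the new row
third, cursor = incremental_pull(source_rows, cursor)  # nothing new: empty result
```

The cursor must be persisted between runs for this to work, and the approach only detects appended rows; updates to existing rows would need a different key, such as a last-modified timestamp.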

Learn more about the full range of available source types.

Ease of use

Managing data connections between systems can be an involved process that puts a significant burden on administrators who are responsible for every step: syncing, authentication, scheduling and orchestration, and monitoring.

Foundry abstracts away this complexity with backend services that take on the bulk of the work, while users set up and manage pipelines through a simple frontend user interface. This lowers the barrier to entry for a typically complex technical task, making it possible for more users to set up data connections.

:::callout{theme="success" title="Palantir Learning portal"} Try creating your first data connection with a relevant course on learn.palantir.com ↗. :::

