
Outputs

Pipeline outputs are the end result of a pipeline in Pipeline Builder; outputs can be datasets, virtual tables, or Ontology components such as object types, object links, or time series.

Outputs help you describe your pipeline, from adding your data to creating transform logic. Outputs are created once you deploy your first pipeline build and allow you to expand your workflow and Project capabilities in other Foundry applications.

To add an output, select Add pipeline output in the outputs panel to the right of your graph.

New pipeline creation page

This will take you to the output type selection screen.

Output types

Review the sections below for details on each output type.

Datasets

A dataset is a fundamental element in Foundry. Workflows in Pipeline Builder start with adding datasets, continue with transforms on datasets, and can end with datasets as an output. Add a dataset as an output when you want to build a pipeline that produces clean, transformed data. You can use your final dataset as a foundation for Ontology building in the Ontology Manager.

Learn more about adding dataset outputs.

Media sets

A media set is a collection of media files with a common schema, for example, files of the same format. Media sets support a variety of forms of unstructured data, including visual media, PDF documents, audio, and more. Add a media set as an output when you want to build a pipeline that produces clean, transformed media.

Learn more about adding media set outputs.

Geotemporal series syncs

Geotemporal series can be used to track the geographic position of entities over time, and consist of different observations. Each observation contains an identifier, timestamp, position, and other user-specified properties. You can add a geotemporal series sync output to your Pipeline Builder pipeline to make batch or streaming geotemporal data in Foundry available in downstream applications such as the Map application. Geotemporal observations from Foundry are written in Pipeline Builder.
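The shape of an observation described above can be sketched in Python. This is only an illustration of the data model: the `Observation` class, its field names, and the `group_into_series` helper are hypothetical and are not part of any Foundry or Pipeline Builder API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class Observation:
    """One geotemporal observation (hypothetical shape, not a Foundry type)."""
    series_id: str        # identifier of the tracked entity
    timestamp: datetime   # when the position was recorded
    latitude: float       # position, in decimal degrees
    longitude: float
    properties: Dict[str, Any] = field(default_factory=dict)  # user-specified extras


def group_into_series(observations: List[Observation]) -> Dict[str, List[Observation]]:
    """Group observations by identifier and order each group by timestamp."""
    series: Dict[str, List[Observation]] = {}
    for obs in observations:
        series.setdefault(obs.series_id, []).append(obs)
    for track in series.values():
        track.sort(key=lambda o: o.timestamp)
    return series
```

Grouping by identifier and sorting by timestamp gives one ordered track per entity, which is the form a map-style application would typically need to draw movement over time.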

Learn more about adding geotemporal series syncs in Pipeline Builder.

:::callout{theme="neutral"} In order to use the geotemporal series sync output in Pipeline Builder, you must have geotemporal series enabled for your enrollment. Please contact your Palantir representative for any further questions about enabling this feature. :::

Virtual tables [Beta]

A virtual table acts as a pointer to a table in a source system outside of Foundry, and allows you to use that data in Foundry without ingesting it. Much like datasets, you can add virtual tables to your pipeline, continue with transforms, and end with a virtual table or a dataset as an output. Use a virtual table as an output when you want to use an external system for storage. Pipelines can include a mix of datasets and virtual tables; for example, you can start with a dataset and write your output externally, or you can start with a virtual table and write your output to Foundry.

Learn more about adding virtual table outputs.

Ontology outputs

Add Ontology elements as outputs to your pipeline to guide your workflow from raw datasets to cleaned and structured data that defines a new object type element in your global Ontology. With Pipeline Builder, you can add and edit object types, link types, and time series within one workflow interface rather than navigating back to Ontology Manager.

Learn more about adding Ontology outputs.

Object types

An object type is the schema definition of a real-world entity or event. Add an object type output to your workflow to help guide data transforms into defined elements that you can use to build applications in Foundry, including Workshop modules or Slate applications.

Object links

A link type is the schema definition of a relationship between two object types. A link refers to a single instance of that relationship between two objects. If you add two object type outputs to your pipeline, you can create a link to define their relationship and add it to your global Ontology. Use object links to build robust applications in Workshop and Slate.
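The distinction between an object type, a link type, and a link can be sketched as follows. The classes, the `Aircraft` and `Flight` examples, and the `links_from_rows` helper are all hypothetical names chosen for illustration; they do not reflect how Foundry represents the Ontology internally.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ObjectType:
    """Schema of an entity or event (hypothetical model, not a Foundry class)."""
    name: str
    primary_key: str
    properties: Tuple[str, ...]


@dataclass(frozen=True)
class LinkType:
    """Schema of a relationship between two object types."""
    name: str
    source: ObjectType
    target: ObjectType


@dataclass(frozen=True)
class Link:
    """One instance of a link type, connecting two objects by primary key."""
    link_type: LinkType
    source_key: str
    target_key: str


# Illustrative schemas only.
AIRCRAFT = ObjectType("Aircraft", "tail_number", ("tail_number", "model"))
FLIGHT = ObjectType("Flight", "flight_id", ("flight_id", "tail_number", "departure"))
OPERATES = LinkType("operates", source=AIRCRAFT, target=FLIGHT)


def links_from_rows(link_type: LinkType, rows: List[Dict[str, str]]) -> List[Link]:
    """Build one Link per row that carries both primary keys."""
    src = link_type.source.primary_key
    dst = link_type.target.primary_key
    return [Link(link_type, row[src], row[dst]) for row in rows if src in row and dst in row]
```

Here `OPERATES` plays the role of the link type (defined once), while each `Link` returned by `links_from_rows` is a single instance of that relationship between two specific objects.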

Time series syncs

Time series data is any data that consists of one or more sets of timestamp and value pairs; these pairs measure a quantity over time. Examples include sales volumes per year, total flights per day, production outputs per hour, or high-frequency temperature readings at sub-second resolution. Add a time series sync output to your pipeline to index the data backing time series properties.
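As a rough sketch of the "sets of timestamp and value pairs" idea, the Python below stores each series under an identifier as a list of `(timestamp, value)` tuples. The series name, the numbers, and the helper functions are made up for illustration and are not a Foundry data format or API.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Point = Tuple[datetime, float]

# series identifier -> unordered (timestamp, value) pairs; values are invented
flights_per_day: Dict[str, List[Point]] = {
    "flights_total": [
        (datetime(2024, 1, 2), 1180.0),
        (datetime(2024, 1, 1), 1204.0),
        (datetime(2024, 1, 3), 1251.0),
    ]
}


def sorted_points(series: Dict[str, List[Point]], series_id: str) -> List[Point]:
    """Return the points of one series ordered by timestamp."""
    return sorted(series[series_id], key=lambda point: point[0])


def latest_value(series: Dict[str, List[Point]], series_id: str) -> float:
    """Return the value attached to the most recent timestamp."""
    return sorted_points(series, series_id)[-1][1]
```

Keeping a series identifier alongside the pairs is what lets several series (for example, one per sensor or per route) live in the same table.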

Learn more about time series setup and usage.

Learn more about the Foundry Ontology.

