AIP architecture

As described in the overview of AIP, Foundry, and Apollo, AIP is Palantir’s platform for connecting generative AI to operational domains. AIP and Foundry operate together as part of a shared service mesh, powered by Apollo and deployed within Rubix; this page views AIP’s end-to-end architecture through an AI-centric lens. The architecture includes capabilities for securely connecting to the full range of LLMs, continuously integrating context into the Ontology, building agents and automations, and observing and evaluating the ongoing performance of deployed agents, along with a full developer toolchain for managing AI-driven products.

The AIP architecture can be summarized into 12 general categories of capability:

Diagram of AIP architecture, including the 12 general categories of capability.

Secure LLM integration & access: Enabling secure access to the full range of commercial LLMs (e.g., GPT, Gemini, Claude, Grok models) and open-source models (e.g., Llama), through Palantir-managed infrastructure that ensures that no transmitted data is retained by third-party providers, and no transmitted data is used for retraining by model providers. Enterprises can also integrate their existing models, whether existing model subscriptions, fine-tuned models, or domain-specific models.
End-to-end observability: Providing monitoring tools for every step of AI-driven workflows and agentic processes. This includes fine-grained monitoring for all data flows that feed the Ontology, logging for every action taken by human users or AI agents, and the ability to trace the cascade of chained executions in a workflow. This observability extends to token consumption and other aspects of resource usage.
Context engineering: Equipping developers with no-, low-, and pro-code tools for integrating the contextual data, logic, and action that power the Ontology and all dependent workflows. All modalities of data integration (e.g., batch, streaming, real-time replication via CDC) can be leveraged through any runtime (e.g., Spark, Flink, DataFusion, Polars), while adhering to cohesive security, governance, provenance-tracking, and other essential guarantees.
The Ontology system: Activating context by integrating disparate data, logic, action, and security into a unified representation of enterprise decision-making. Read more about the Ontology.
The Ontology’s language models the "nouns" and "verbs" of operational processes into a legible form for both humans and agents.
The Ontology’s engine enables querying billions of objects, orchestrating tens of thousands of actions, and continuously incorporating feedback-based learning.
The Ontology’s toolchain empowers developers to build diverse and complex AI-powered applications atop a common foundation.
Vector, compute, tool services: Providing the integrated vectorization services needed to produce and manage embeddings; an extensible compute framework that can leverage multi-node engines (e.g., Spark, Flink), efficient single-node engines (e.g., DuckDB, Polars), and any containerized “BYO” engine; and an integrated set of tool services that work with the Ontology system to function as an ever-evolving tool factory. The platform is designed to be modular and extensible with respect to models, compute engines, and interfaces.
Security & governance: Ensuring that every operation made by humans and agents abides by rigorous role-, marking-, and purpose-based controls. This requires a combination of infrastructure, platform, and enterprise security controls. These controls can be granularly configured and dynamically interrogated, and are cataloged in expressive audit logging. Governance capabilities extend uniformly across all operational, engineering, and developer activities performed within the platform interfaces as well as programmatically through the APIs/SDKs.
Agent lifecycle: Powering the interconnected building, orchestration, and evaluation processes for agents in production. Agents can be constructed using no-, low-, and pro-code workbenches. Durable orchestrations can be configured and managed through low-code interfaces like AIP Logic, or pro-code interfaces like Code Workspaces. The integrated evaluation framework (AIP Evals) operates seamlessly with the Ontology, enabling you to create test cases, debug and iterate on agent definitions, compare performance across different LLMs, examine the variance across executions, and more.
Operational automation: Facilitating the different modes of automation required within and across workflows. This includes scalable schedule-based automations, near real-time event-driven automations that process streaming data, and automations which are enmeshed with API-driven operations. Regardless of the modality, every automation can leverage the rich set of data, logic, and action primitives in the Ontology system, and a wide array of execution and notification configurations.
Development environments: Empowering developers to build agents and automations how and where they want. AIP provides integrated development environments (e.g., VS Code, JupyterLab), which provide seamless connectivity with Ontology-driven applications and integrated testing and evaluation frameworks. In equal measure, the Platform SDK and Ontology SDK, in conjunction with Palantir’s VS Code plug-in, bring the same core functionality to existing environments and developer toolchains. Additionally, Palantir MCP provides a secure interface for agentic development (analogous to what is possible in the platform with AI FDE).
Human + AI applications: Providing the full spectrum of AI-driven experiences, from object-oriented analytics, to real-time application building, to multimodal governance workflows, to the administration of core platform capabilities. Operational users, compliance teams, engineers, analysts, and other key personas have out-of-the-box applications specifically tailored for their workflows, as well as the ability to rapidly construct new Ontology-powered applications. In each case, the infusion of AI can be carefully controlled and transparently assessed, ensuring a smooth journey from augmentation to automation.
Package, release, deploy: Allowing developers to move beyond point analytics and applications to build fully featured AI products which leverage an integrated DevOps toolchain. End-to-end collections of data pipelines, Ontology definitions, automations, and prebuilt applications can be packaged, released, and deployed across heterogeneous target environments. Developers can specify allowances for "last-mile" customization, and downstream teams can securely receive updates as product definitions evolve, and as changes are validated and promoted across release channels.
Enterprise automation: Enabling builders, of all personas and backgrounds, to wield specialized AI agents (e.g., AI FDE, AIP Analyst) to construct data pipelines, write business logic, train models, build ontologies, produce analytics, and develop end-to-end applications. Critically, these agents operate atop the same foundation as human users, meaning that they abide by the same integrated change management capabilities (e.g., Global Branching), and can seamlessly weave human-in-the-loop workflows with entirely autonomous operations.
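
The categories above are described conceptually. As a purely illustrative sketch, the Python below shows one way to picture three of the ideas together: Ontology "nouns" (objects) and "verbs" (actions), role-, marking-, and purpose-based checks applied identically to humans and agents, and an audit entry for every attempted action. This is not Palantir's API. Every class, field, and function name here (`ToyOntology`, `Actor`, `ActionType`, and so on) is a hypothetical stand-in, and the check logic is a simplifying assumption rather than a description of how AIP enforces its controls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Actor:
    """A human user or an AI agent (hypothetical model)."""
    name: str
    kind: str                    # "human" or "agent"
    roles: frozenset[str]
    markings: frozenset[str]     # markings the actor is cleared for
    purposes: frozenset[str]     # purposes the actor may act under


@dataclass
class ObjectInstance:
    """A "noun": one object of some object type."""
    object_type: str
    properties: dict
    markings: frozenset[str] = frozenset()


@dataclass(frozen=True)
class ActionType:
    """A "verb": a governed way of changing an object."""
    name: str
    required_role: str
    allowed_purposes: frozenset[str]
    apply: Callable[[ObjectInstance, dict], None]


@dataclass
class AuditEntry:
    actor: str
    actor_kind: str
    action: str
    allowed: bool
    reason: str


class ToyOntology:
    """Runs actions through the same checks for humans and agents."""

    def __init__(self) -> None:
        self.audit_log: list[AuditEntry] = []

    def execute(self, actor, action, target, params, purpose) -> bool:
        reason = self._check(actor, action, target, purpose)
        allowed = reason == "ok"
        if allowed:
            action.apply(target, params)
        # Every attempt is logged, whether or not it was allowed.
        self.audit_log.append(
            AuditEntry(actor.name, actor.kind, action.name, allowed, reason)
        )
        return allowed

    @staticmethod
    def _check(actor, action, target, purpose) -> str:
        if action.required_role not in actor.roles:
            return "missing role"
        if not target.markings <= actor.markings:
            return "missing marking"
        if purpose not in action.allowed_purposes or purpose not in actor.purposes:
            return "purpose not permitted"
        return "ok"


def _reschedule(obj: ObjectInstance, params: dict) -> None:
    obj.properties["due_date"] = params["due_date"]


reschedule = ActionType(
    "reschedule-work-order", "planner",
    frozenset({"maintenance-planning"}), _reschedule,
)
order = ObjectInstance(
    "WorkOrder", {"id": "WO-1", "due_date": "2025-01-10"},
    frozenset({"restricted"}),
)
planner = Actor("alice", "human", frozenset({"planner"}),
                frozenset({"restricted"}), frozenset({"maintenance-planning"}))
# The agent has the right role and purpose but lacks the "restricted" marking.
agent = Actor("scheduling-agent", "agent", frozenset({"planner"}),
              frozenset(), frozenset({"maintenance-planning"}))

ontology = ToyOntology()
ok_human = ontology.execute(planner, reschedule, order,
                            {"due_date": "2025-01-15"}, "maintenance-planning")
ok_agent = ontology.execute(agent, reschedule, order,
                            {"due_date": "2025-02-01"}, "maintenance-planning")
```

In this toy model the human's action succeeds and the agent's is rejected for a missing marking, yet both attempts land in `audit_log`. The design choice being illustrated is that agents and humans pass through one shared execution path, so governance and logging do not need a separate agent-specific mechanism.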
