LLM-provider compatible APIs（兼容LLM提供商的API）¶

:::callout{title="Prerequisites" theme="neutral"} To use Palantir-provided language models, AIP must first be enabled on your enrollment. You must also have permissions to use AIP builder capabilities. :::

Foundry provides proxy endpoints for popular LLM providers, accepting requests in the same format as the providers' native APIs. This enables use of open-source SDKs and tooling while benefiting from Foundry capabilities such as rate limiting, data governance, and usage tracking.

The currently supported provider APIs and corresponding Foundry endpoints are as follows:

Anthropic messages ↗: /api/v2/llm/proxy/anthropic/v1/messages
OpenAI chat completions ↗: /api/v2/llm/proxy/openai/v1/chat/completions
OpenAI responses ↗: /api/v2/llm/proxy/openai/v1/responses
OpenAI embeddings ↗: /api/v2/llm/proxy/openai/v1/embeddings
[Beta] xAI chat completions ↗: /api/v2/llm/proxy/xai/v1/chat/completions
[Beta] xAI responses ↗: /api/v2/llm/proxy/xai/v1/responses
[Beta] Google generateContent ↗: /api/v2/llm/proxy/google/v1/models/{model}:generateContent
[Beta] Google streamGenerateContent ↗: /api/v2/llm/proxy/google/v1/models/{model}:streamGenerateContent?alt=sse

:::callout{theme="warning" title="Beta endpoints"} The xAI and Google (Gemini) endpoints are currently in beta and actively being developed. Not all features or fields may be supported yet. :::

:::callout{theme="neutral"} The Google streamGenerateContent endpoint currently only supports the SSE response format. The query param alt=sse must be provided. :::

Request shapes¶

Authentication is sent using the following bearer token header:

Authorization: Bearer {FOUNDRY_TOKEN}

Requests to these endpoints should have the same shape as the corresponding provider endpoint. Refer to the provider’s documentation for the expected request format.

:::callout{theme="neutral"} Some providers, for example, Anthropic, use a non-standard authentication header. When using their SDKs, you may need to configure the authentication method to use a bearer token instead. Providers that already use bearer token authentication, such as OpenAI, require no special configuration. :::

AIP integration and data governance¶

These endpoints enforce the same data governance as other AIP usage, such as zero data retention (ZDR) and georestriction requirements. We selectively enable provider API features that are compatible with these requirements.

Only models and providers that have been enabled on your enrollment will be available through these endpoints. For models served by multiple providers, requests will only be routed to enabled providers. Endpoint usage is visible in the Resource Management application, and is subject to rate limiting.

中文翻译¶

兼容LLM提供商的API¶

:::callout{title="前提条件" theme="neutral"} 要使用Palantir提供的语言模型，必须先在您的注册中启用AIP。您还需要拥有使用AIP构建者能力的权限。 :::

Foundry为流行的LLM提供商提供了代理端点，能够以与提供商原生API相同的格式接收请求。这使得在利用Foundry的速率限制、数据治理和使用追踪等功能的同时，还能使用开源SDK和工具。

目前支持的提供商API及对应的Foundry端点如下：

Anthropic消息接口 ↗： /api/v2/llm/proxy/anthropic/v1/messages
OpenAI聊天补全接口 ↗： /api/v2/llm/proxy/openai/v1/chat/completions
OpenAI响应接口 ↗： /api/v2/llm/proxy/openai/v1/responses
OpenAI嵌入接口 ↗： /api/v2/llm/proxy/openai/v1/embeddings
[Beta] xAI聊天补全接口 ↗： /api/v2/llm/proxy/xai/v1/chat/completions
[Beta] xAI响应接口 ↗： /api/v2/llm/proxy/xai/v1/responses
[Beta] Google generateContent接口 ↗： /api/v2/llm/proxy/google/v1/models/{model}:generateContent
[Beta] Google streamGenerateContent接口 ↗： /api/v2/llm/proxy/google/v1/models/{model}:streamGenerateContent?alt=sse

:::callout{title="Beta端点" theme="warning"} xAI和Google（Gemini）端点目前处于测试阶段，正在积极开发中。可能尚未支持所有功能或字段。 :::

:::callout{theme="neutral"} Google的streamGenerateContent端点目前仅支持SSE响应格式。必须提供查询参数alt=sse。 :::

请求格式¶

身份验证通过以下Bearer令牌标头发送：

Authorization: Bearer {FOUNDRY_TOKEN}

对这些端点的请求应与相应提供商端点的格式保持一致。请参考提供商的文档了解预期的请求格式。

:::callout{theme="neutral"} 某些提供商（例如Anthropic）使用非标准的身份验证标头。使用其SDK时，您可能需要将身份验证方法配置为使用Bearer令牌。而已使用Bearer令牌身份验证的提供商（如OpenAI）则无需特殊配置。 :::

AIP集成与数据治理¶

这些端点强制执行与其他AIP使用相同的数据治理策略，例如零数据保留（ZDR）和地理限制要求。我们会选择性启用与这些要求兼容的提供商API功能。

只有已在您的注册中启用的模型和提供商才能通过这些端点使用。对于由多个提供商提供的模型，请求将仅路由到已启用的提供商。端点使用情况在资源管理应用程序中可见，并受速率限制约束。