Compute usage with AIP Logic¶
AIP Logic is a Palantir tool that allows you to quickly and maintainably build LLM-driven processes while interacting with your organization's data through the Ontology and computational functionality. AIP Logic is built around the concept of "blocks" of LLM instructions that can be combined linearly to create chain-of-thought workflows that can query data, execute actions and functions, and generate net new information for your use case. In AIP Logic, a "block" is the atomic unit of usage measurement, though each block can trigger other systems within Foundry that may also use compute-seconds to return information to the AIP Logic block.
:::callout{theme="neutral"}
If you have an enterprise contract with Palantir, contact your Palantir representative before proceeding with compute usage calculations.
:::
Core concepts: Resource, blocks, and tools¶
An AIP Logic resource consists of one or more AIP Logic blocks. Running a resource runs the blocks required to produce the desired output. Blocks can use tools such as Ontology queries, functions, and actions to produce an output.
Measuring Foundry compute with AIP Logic¶
AIP LLM tokens¶
LLM tokens in AIP are measured in the manner of the underlying model (such as OpenAI). Token usage depends on the size of prompts and responses, as well as on the number of prompts that are made. For more information, consult the table of usage for each model type.
LLM block execution¶
When an AIP Logic block executes or chooses to use a tool, there is a minimum compute-second usage.
- Basic LLM block execution: 4 compute-seconds
- LLM block tool execution: 8 compute-seconds
Additional Foundry compute usage¶
When an AIP Logic block federates computation out to external tools (such as Ontology queries or functions), additional compute may be used during the execution of these applications.
Managing AIP Logic usage of Foundry compute¶
Some operations in AIP Logic can significantly affect compute usage. Below, we provide guidance on controlling compute usage by being careful about token usage, the total number of logic block executions, and usage of Foundry compute.
Token usage¶
- Processing large amounts of text can significantly increase overall compute usage. Be aware of the size of the input prompts that are being used with LLMs. This is especially relevant when pulling in large text blocks from the Ontology.
- To moderate token usage, you should work to pare down the amount of text that is injected into prompts and ensure that only relevant text is included in the prompts themselves. This is especially important when working with large documents.
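One way to pare down injected text is to filter it before it reaches the prompt. The following is a minimal sketch, not an AIP Logic feature: the function name `select_relevant_text`, the keyword match, and the character budget are all assumptions made for illustration, and character count is only a rough stand-in for token count.

```python
def select_relevant_text(document: str, keywords: list[str], max_chars: int) -> str:
    """Keep paragraphs that mention a keyword, in order, within a character budget.

    Hypothetical helper: relevance is a plain case-insensitive substring match.
    """
    lowered = [k.lower() for k in keywords]
    kept: list[str] = []
    used = 0
    for paragraph in document.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if not any(k in paragraph.lower() for k in lowered):
            continue  # skip text that is not relevant to the prompt
        cost = len(paragraph) + (2 if kept else 0)  # include the "\n\n" separator
        if used + cost > max_chars:
            break  # budget exhausted; stop adding text
        kept.append(paragraph)
        used += cost
    return "\n\n".join(kept)


doc = (
    "Pump A failed on Monday.\n\n"
    "The cafeteria menu changed.\n\n"
    "Pump B is due for service."
)
print(select_relevant_text(doc, ["pump"], max_chars=1000))
```

In practice, the relevance check could be replaced by whatever filter fits the use case; the point is that only the selected text is passed into the prompt.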
Total number of logic block executions¶
- Running many logic blocks can use large amounts of compute, especially if the blocks are triggered programmatically via a function (that is, not triggered by human action).
- To moderate logic block executions, consider combining multiple blocks into single prompts (where appropriate), and only using additional blocks when necessary. Action, function, and data transformation blocks do not incur compute usage on their own.
Foundry compute¶
- Running many calls to Foundry applications (such as Ontology queries, functions, or actions) can use large amounts of compute. This may occur if a logic resource requires many retries, or calls many functions or actions on each execution.
- To moderate compute usage from other applications, ensure that you understand the number of calls to other tools that are made in your chain of logic blocks. The potential tools that can call out to other parts of Foundry are:
- Ontology Query
- Action Execution
- Function Execution
- Data Transformation
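To understand how many calls a chain makes, it can help to tally them per tool type. The sketch below assumes a hand-written description of the chain: the `chain` structure, the block names, and `tally_tool_calls` are hypothetical and are not read from AIP Logic itself.

```python
from collections import Counter

# Hypothetical description of a chain: each block lists the tools it calls
# once per execution. Tool names follow the list above.
chain = [
    {"name": "summarize", "tool_calls": ["Ontology Query"]},
    {"name": "decide", "tool_calls": ["Function Execution", "Action Execution"]},
    {"name": "format", "tool_calls": []},
]


def tally_tool_calls(blocks, runs=1):
    """Count calls per tool type across all blocks, scaled by the number of runs."""
    counts = Counter()
    for block in blocks:
        counts.update(block["tool_calls"])
    return Counter({tool: n * runs for tool, n in counts.items()})


# Calls made if the resource is triggered programmatically 100 times.
for tool, calls in sorted(tally_tool_calls(chain, runs=100).items()):
    print(f"{tool}: {calls}")
```

Retries are not modeled here; if a resource retries, each retry would add to these counts.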
Example of AIP Logic compute usage¶
Assume a user has an AIP Logic resource that has two LLM blocks. One of the LLM blocks has an action configured and will call it on execution. The logic resource is run end-to-end twice.
Number of LLM blocks: 2
Number of LLM blocks that call actions: 1
Number of runs: 2
1 run compute-seconds = 2 LLM block executions * 4 compute-seconds + 1 tool execution * 8 compute-seconds
1 run compute-seconds = (2 * 4) + (1 * 8)
1 run compute-seconds = 16 compute-seconds
2 runs = 2 * 16 compute-seconds = 32 compute-seconds
Total = 32 compute-seconds
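The same arithmetic can be sketched as a small function. This is an illustrative estimate only: it uses the minimum rates quoted in this document, the name `minimum_compute_seconds` is an assumption, and it excludes LLM token usage and any compute used inside the called tools.

```python
# Minimum rates quoted in this document, in compute-seconds.
BASIC_BLOCK_RATE = 4
TOOL_EXECUTION_RATE = 8


def minimum_compute_seconds(llm_blocks: int, tool_executions: int, runs: int = 1) -> int:
    """Return the minimum compute-seconds for `runs` end-to-end executions.

    Excludes LLM token usage and compute used by the tools themselves.
    """
    if min(llm_blocks, tool_executions, runs) < 0:
        raise ValueError("counts must be non-negative")
    per_run = llm_blocks * BASIC_BLOCK_RATE + tool_executions * TOOL_EXECUTION_RATE
    return per_run * runs


# The example above: two LLM blocks, one of which calls an action, run twice.
print(minimum_compute_seconds(llm_blocks=2, tool_executions=1, runs=2))  # 32
```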