Modeling Objective live deployment FAQ¶
Below are some frequently asked questions about Modeling Objective live deployments, which are distinct from direct deployments configured from the model page. Learn more about creating and setting up a live deployment and the differences between live and direct deployments.
What conda packages/environment will be included by default with my released Python model?¶
Models will be packaged with the conda packages/environment configured in the model adapter of the published model. Palantir will also add necessary lightweight packages to serve your production model.
What is the maximum amount of data that can be sent in a request to live?¶
By default, a live deployment accepts up to 50 MB in a single request. This upper limit is configurable; contact your Palantir representative for more details.
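If a payload may approach the limit, one option is to split it on the client before sending. The following is a minimal sketch, not a Foundry API: it assumes the request body is a JSON-serialized list of rows, and it reads the 50 MB default as 50 * 1024 * 1024 bytes. The names `MAX_REQUEST_BYTES`, `encoded_size`, and `split_into_batches` are hypothetical helpers introduced for illustration.

```python
import json

# Assumption: the default limit is treated as 50 * 1024 * 1024 bytes.
# The FAQ does not say whether the figure is decimal or binary, and the
# limit is configurable, so adjust this to match your deployment.
MAX_REQUEST_BYTES = 50 * 1024 * 1024


def encoded_size(rows):
    """Return the size in bytes of the rows serialized as UTF-8 JSON."""
    return len(json.dumps(rows).encode("utf-8"))


def split_into_batches(rows, max_bytes=MAX_REQUEST_BYTES):
    """Greedily group rows into batches whose JSON encoding fits in max_bytes.

    Each candidate batch is re-serialized to measure it, which is simple
    but quadratic in the batch length; fine for a sketch, slow for very
    large inputs.
    """
    batches = []
    current = []
    for row in rows:
        if encoded_size([row]) > max_bytes:
            raise ValueError("a single row exceeds the request size limit")
        # Close the current batch if adding this row would exceed the limit.
        if current and encoded_size(current + [row]) > max_bytes:
            batches.append(current)
            current = []
        current.append(row)
    if current:
        batches.append(current)
    return batches
```

Each returned batch can then be sent as its own request. The sketch measures only the JSON body, so leave some headroom if your client adds an envelope around it.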
Can I create monitors for Modeling Objective live deployments?¶
Yes. Modeling Objective live deployment uptime can be monitored through a monitoring view.
Can I include private libraries in addition to public libraries as part of my environment customization?¶
Yes. Both private and public libraries can be imported into a submission environment.
- For public libraries, you need to ensure that the requested libraries are available in a public channel (such as conda-forge) and that those channels are configured so that they can be discovered within Foundry;
- For private libraries created within Foundry, the libraries must be properly published per the Python Library instructions.
Can I restrict who can create live deployments on my instance?¶
Yes. The ability to create live deployments can be permissioned separately from the ability to create batch deployments. Contact your Palantir representative for guidance.
Can I scale up my Python models?¶
Yes. Foundry currently provides traffic scaling for models deployed within Palantir's container infrastructure.
By default, each deployment is configured with 2 replicas, ensuring there is no downtime during upgrades. The default CPU and memory footprint is also low, resulting in a low default cost profile.
These defaults can be overridden for individual deployments to support larger models or higher expected load. Additionally, the default profile can be overridden for all live deployments via Control Panel.
Can I run my Python models on GPU?¶
Yes. GPU support for Python models is in the beta phase of development and may not be available on your enrollment. Functionality may change during active development.
Can I turn off my deployment to stop incurring cost?¶
Yes, you can disable your live deployment via the individual deployment page.
When you are ready to use it again, you can re-enable it from the same page. Note that the deployment will upgrade to the latest release once it has been re-enabled.
Alternatively, you can delete the deployment. However, this action cannot be reversed, and the deployment's Target RID will not be retained.
Can my Foundry ML Python models reach out to external APIs?¶
Yes. However, you (or an authorized administrator) must configure network egress for live deployments.
Will a timeout occur during a live deployment request?¶
Yes. Running inference is a synchronous process, so a request that takes longer than five minutes will time out because of the default dialog read timeout.
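On the client side, it can help to set an explicit timeout that matches this limit so that slow requests fail predictably. The sketch below uses only the Python standard library and is an illustration under stated assumptions, not the documented way to call a live deployment: the URL, the bearer-token header, the JSON payload shape, and the helper names `build_request` and `call_live` are all hypothetical placeholders.

```python
import json
import socket
import urllib.error
import urllib.request

# The five-minute figure comes from the FAQ answer above. Everything else
# in this sketch (URL, token handling, payload shape) is a placeholder.
SERVER_TIMEOUT_SECONDS = 5 * 60


def build_request(url, token, payload):
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def call_live(url, token, payload, timeout_seconds=SERVER_TIMEOUT_SECONDS):
    """Send the request and raise TimeoutError if no response arrives in time."""
    request = build_request(url, token, payload)
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return json.loads(response.read().decode("utf-8"))
    except socket.timeout as exc:
        # Raised directly when reading the response takes too long.
        raise TimeoutError(f"no response within {timeout_seconds} seconds") from exc
    except urllib.error.URLError as exc:
        # Timeouts while connecting are wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            raise TimeoutError(f"no response within {timeout_seconds} seconds") from exc
        raise
```

If inference regularly approaches five minutes, sending smaller inputs per request is usually more reliable than retrying the same large request.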