Media Sets

How can I manage large datasets in Foundry without duplicating data when Foundry does not support views for media sets?

To manage large datasets efficiently in Foundry without duplicating data, use the fast_copy_media_item() method on the output when copying items from one media set to another. This copies a reference to the same media blob rather than duplicating the blob itself, which is faster and more efficient than downloading and re-uploading the media item.
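The difference between copying a reference and copying a blob can be illustrated with a small in-memory model. This is only a conceptual sketch: ToyBlobStore, ToyMediaSet, and copy_by_reference are hypothetical names invented for the illustration, and none of them are part of the Foundry API or reflect how fast_copy_media_item() is implemented.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict


@dataclass
class ToyBlobStore:
    """Hypothetical blob storage: every put() stores a new blob under a new id."""
    blobs: Dict[int, bytes] = field(default_factory=dict)
    _ids: count = field(default_factory=count)

    def put(self, data: bytes) -> int:
        blob_id = next(self._ids)
        self.blobs[blob_id] = data
        return blob_id


@dataclass
class ToyMediaSet:
    """Hypothetical media set: maps an item path to a blob id, not to the bytes."""
    store: ToyBlobStore
    items: Dict[str, int] = field(default_factory=dict)

    def upload(self, path: str, data: bytes) -> None:
        # Uploading always writes a new blob.
        self.items[path] = self.store.put(data)

    def read(self, path: str) -> bytes:
        return self.store.blobs[self.items[path]]

    def copy_by_reference(self, source: "ToyMediaSet", path: str) -> None:
        # Only the blob id is copied; no bytes are read or written.
        self.items[path] = source.items[path]


store = ToyBlobStore()
source = ToyMediaSet(store)
source.upload("image.png", b"\x89PNG...")

# Copy by reference: the new media set points at the existing blob.
by_reference = ToyMediaSet(store)
by_reference.copy_by_reference(source, "image.png")
blobs_after_reference_copy = len(store.blobs)

# Download and re-upload: the same bytes are stored a second time.
by_reupload = ToyMediaSet(store)
by_reupload.upload("image.png", source.read("image.png"))
blobs_after_reupload = len(store.blobs)

print(blobs_after_reference_copy, blobs_after_reupload)  # 1 2
```

In this toy model the reference copy leaves a single stored blob, while the re-upload stores the same bytes twice.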

Timestamp: April 13, 2024

What could be causing the MediaSet: TooManyItemsUploadedInTransaction error when uploading a large number of files to a media set, and how can it be fixed?

The MediaSet: TooManyItemsUploadedInTransaction error occurs because at most 10,000 files can be written to a media set in a single transaction. There are two solutions:

  • Use a transaction-less media set where the restriction does not apply (note that this has side effects such as inability to fail a transaction or snapshot the media set); or
  • Break up the items into sets of 10,000 or fewer and write one set per build.
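The second option can be sketched in plain Python. This minimal example only shows how to split a list of items into groups of at most 10,000; the batch_items helper and the file names are hypothetical, and writing each group to a media set in its own build is left out because it depends on your pipeline.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Limit quoted in the answer above: items per media set transaction.
MAX_ITEMS_PER_TRANSACTION = 10_000


def batch_items(
    items: Sequence[T], batch_size: int = MAX_ITEMS_PER_TRANSACTION
) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical file names, used only for illustration.
file_paths = [f"scan_{i:05d}.pdf" for i in range(25_000)]
batches = list(batch_items(file_paths))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)  # [10000, 10000, 5000]
```

Each resulting batch stays within the limit, so it can be written in a separate build without triggering the error.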

Timestamp: March 6, 2024

How can I read from or write to transactionless media sets via transforms without encountering the 'MediaSet:CannotSnapshotNonTransactionalMediaSet' error?

You can read from or write to transactionless media sets by setting the media output argument to media_output=MediaSetOutput(should_snapshot=False). This prevents the system from attempting to snapshot the output, which is not supported for non-transactional media sets. For inputs, incremental pipelines should work the same way as they do with transactional media sets.

Timestamp: March 18, 2024

How can I extract the raw image from a dataset with a mediaReference column?

Add the media set as an input to the build so that the build token has permission to read from it, then use the media_input.get_media_item_by_path() method to read the media items from it.

Timestamp: March 6, 2024

Is it possible to use the Extract Text from PDF pipeline expression with a dataset of files that is not a media set?

No. The pipeline board only works on items in a media set.

Timestamp: March 6, 2024

How do I connect media items from a media reference dataset to the actual object?

The media items in a media reference dataset can be linked to the respective objects from the Capabilities section of the object type in Ontology Manager.

Timestamp: March 6, 2024

Is it possible to write to a media reference property on an object using an action?

Yes. A function can write to a media reference property in the same way that it writes to other object properties, and an action can be configured to run that function.

Timestamp: November 1, 2024

When media sets are packaged in Marketplace, which branch is selected?

When media sets are packaged in Marketplace, the 'default' branch is selected. It is currently not possible to select a different branch to package. The branch name is preserved during packaging and installation.

Timestamp: April 16, 2024

Is there support for the bulk upload of media items via widgets from Workshop?

There is currently no bulk upload workflow for media items in Workshop.

Timestamp: April 18, 2024

