Media sets (unstructured data)

A media set is a collection of media files with a common schema, for example, files of the same format. Media sets are designed to work with high-scale, unstructured data and enable the processing of media items such as audio, imagery, video, and documents. Media sets enable access to flexible storage, compute optimizations, and schema-specific transformations to enhance media workflows and pipelines.
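The "common schema" idea can be illustrated with a small sketch. This is a toy in-memory model, not the Foundry API: the class `ToyMediaSet`, its fields, and the file names are all hypothetical, and the schema is reduced to a set of allowed file extensions purely for illustration.

```python
from dataclasses import dataclass, field
from pathlib import PurePath


@dataclass
class ToyMediaSet:
    """Hypothetical model of a media set; not a Foundry class."""

    name: str
    # The "common schema", simplified here to a set of allowed file extensions.
    allowed_suffixes: frozenset
    items: list = field(default_factory=list)

    def add(self, path: str) -> None:
        """Accept a file only if its extension matches the set's schema."""
        suffix = PurePath(path).suffix.lower()
        if suffix not in self.allowed_suffixes:
            raise ValueError(
                f"{path!r} does not match schema {sorted(self.allowed_suffixes)}"
            )
        self.items.append(path)


# Hypothetical usage: a document media set that only accepts PDFs.
documents = ToyMediaSet("contracts", frozenset({".pdf"}))
documents.add("scan_001.pdf")
documents.add("SCAN_002.PDF")  # extension comparison is case-insensitive

try:
    documents.add("photo.tiff")  # wrong format for this set
except ValueError as err:
    rejected = str(err)
```

Rejecting mismatched files at insertion time is what lets downstream steps assume a single format for every item in the set.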

Media sets support the import of audio, imagery, spreadsheets, video, documents, and emails.

Example media set workflows include:

  • Enabling content analysis by extracting text from PDFs with Pipeline Builder
  • Performing geospatial analysis with raster tiling (TIFF, NITF) in the Map application
  • Processing medical imaging files (DICOM format) with Pipeline Builder

To find out more about media sets and how to work with media in Foundry, visit the media set documentation.

