Extract text from images (using OCR)

Supported in: Batch, Faster

Extracts text from an image using optical character recognition (OCR).

Expression categories: Media

Declared arguments

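Languages to detect: Languages to detect in the input files.
Set<Enum<Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Azerbaijani (Cyrillic), Basque, Belarusian, and more>>
Media reference: The column containing media references to image files in a media set.
Expression<Media reference>
OCR output format: Format of the extracted output, either plain text or hOCR. In both cases the output is a string.
Enum<Text, hOCR>
Scripts to detect: Scripts (writing systems) to detect in the input files.
Set<Enum<Arabic, Armenian, Bengali, Canadian Aboriginal, Cherokee, Cyrillic, Devanagari, Ethiopic, Fraktur, Georgian, and more>>
optional Error handling: Determines the behavior of the pipeline for inputs that fail to process.
Enum<FAIL, NULL>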
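The two error-handling modes can be illustrated with a small sketch. It assumes that FAIL aborts the run on the first input that cannot be processed and that NULL keeps the row and emits a null value instead; the page does not spell this out, so treat it as an interpretation. The names `run_ocr_column` and `fake_ocr` are hypothetical and are not part of the product.

```python
from typing import Callable, Iterable, List, Optional


def run_ocr_column(
    images: Iterable[bytes],
    ocr_fn: Callable[[bytes], str],
    error_handling: str = "FAIL",
) -> List[Optional[str]]:
    """Apply ocr_fn to each input under an assumed FAIL/NULL policy."""
    if error_handling not in ("FAIL", "NULL"):
        raise ValueError(f"unknown error handling mode: {error_handling}")
    results: List[Optional[str]] = []
    for image in images:
        try:
            results.append(ocr_fn(image))
        except Exception:
            if error_handling == "FAIL":
                raise  # assumed FAIL behavior: the whole run stops here
            results.append(None)  # assumed NULL behavior: keep the row, emit null
    return results


def fake_ocr(image: bytes) -> str:
    # Hypothetical stand-in for an OCR engine: rejects empty inputs.
    if not image:
        raise ValueError("unreadable image")
    return f"text from {len(image)} bytes"
```

With NULL, a failing input costs only its own row, so downstream steps can filter on null to find the images that need attention. With FAIL, bad inputs surface immediately instead of silently producing gaps.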

Output type: String
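When hOCR is chosen as the output format, the returned string is HTML-style markup rather than plain text. The sketch below shows one way to pull words and their bounding boxes out of such a string using only the Python standard library. It assumes the markup follows the common hOCR convention of `ocrx_word` spans with a `bbox` entry in the `title` attribute. The sample string is made up for illustration and is not real output from this expression.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

BBox = Tuple[int, ...]


def parse_bbox(title: str) -> Optional[BBox]:
    """Return the four bbox integers from an hOCR title attribute, if present."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) pairs for every element with class ocrx_word."""

    def __init__(self) -> None:
        super().__init__()
        self.words: List[Tuple[str, Optional[BBox]]] = []
        self._in_word = False
        self._bbox: Optional[BBox] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "ocrx_word" in (attr.get("class") or "").split():
            self._in_word = True
            self._bbox = parse_bbox(attr.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: assumes word spans do not contain nested spans.
        if tag == "span" and self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def extract_words(hocr: str) -> List[Tuple[str, Optional[BBox]]]:
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Illustrative sample only, written by hand in the usual hOCR shape.
SAMPLE_HOCR = (
    "<div class='ocr_page' title='bbox 0 0 640 480'>"
    "<span class='ocr_line' title='bbox 36 92 200 116'>"
    "<span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>This</span> "
    "<span class='ocrx_word' title='bbox 104 92 150 116; x_wconf 93'>text</span>"
    "</span></div>"
)
```

Calling `extract_words(SAMPLE_HOCR)` yields one entry per word with its pixel coordinates, which is the main reason to prefer hOCR over plain text when layout matters.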

Argument values:

mediaReference	Output
{"mimeType":"image/png","reference":{"type":"mediaSetItem","mediaSetItem":{"mediaSetRid":"ri.mio.main.media-set.a", "mediaItemRid":"ri.mio.main.media-item.a"}}}	This text came from the image in the media set.
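The media reference in the example row is a JSON object. As a minimal sketch, the snippet below parses a value of that shape and pulls out the MIME type and the two resource identifiers. The field names are taken from the example above; the `MediaItemRef` class and `parse_media_reference` function are hypothetical helpers, and real media references may carry other reference types or extra fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaItemRef:
    mime_type: str
    media_set_rid: str
    media_item_rid: str


def parse_media_reference(raw: str) -> MediaItemRef:
    """Parse a media reference JSON string shaped like the example row."""
    doc = json.loads(raw)
    ref = doc["reference"]
    if ref.get("type") != "mediaSetItem":
        # Only the shape shown in the example is handled in this sketch.
        raise ValueError(f"unsupported reference type: {ref.get('type')!r}")
    item = ref["mediaSetItem"]
    return MediaItemRef(
        mime_type=doc["mimeType"],
        media_set_rid=item["mediaSetRid"],
        media_item_rid=item["mediaItemRid"],
    )


EXAMPLE = (
    '{"mimeType":"image/png","reference":{"type":"mediaSetItem",'
    '"mediaSetItem":{"mediaSetRid":"ri.mio.main.media-set.a", '
    '"mediaItemRid":"ri.mio.main.media-item.a"}}}'
)
```

Checking `mime_type` before running OCR is a cheap way to skip items that are not images.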
