
Ontology volume usage

Foundry’s Ontology is a fast-access storage layer that binds object definitions to interactive data for queries, links, scenarios, and actions. Data in the Ontology is indexed for fast access and for safe concurrent edits by multiple users. Ontology volume measures the total size of the indexed object sets and the links between them. It is an averaged metric: an instantaneous reading is expressed in GB, while a reading taken over the course of one month is expressed in GB-Month.

Measuring Ontology volume

Ontology volume is recorded by measuring the size of the indexes that back each object type. Each object type has a number of objects and a number of properties per object, and each property can be of arbitrary size. The total size of the index is calculated by summing the size of each indexed property for every object of that object type.
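
The summation described above can be sketched in a few lines. This is an illustrative approximation only: the function name `estimate_index_bytes`, the sample `flights` records, and the choice to measure each value by its UTF-8 encoded length are all assumptions made for the example, not how Foundry measures its indexes internally.

```python
def estimate_index_bytes(objects):
    """Sum the encoded size of every non-empty property across all objects.

    Assumption: each property value is measured as the UTF-8 length of its
    string form. Real index sizes depend on Foundry's internal storage.
    """
    total = 0
    for obj in objects:
        for value in obj.values():
            if value is None:
                continue  # an empty property contributes nothing in this sketch
            total += len(str(value).encode("utf-8"))
    return total


# Hypothetical object type with two objects and three properties each.
flights = [
    {"flight_id": "AA100", "origin": "JFK", "notes": "On time"},
    {"flight_id": "AA101", "origin": "LAX", "notes": None},
]

total_bytes = estimate_index_bytes(flights)
print(total_bytes)
```

The point of the sketch is the shape of the calculation: volume scales with both the number of objects and the size of each property value.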

:::callout{theme="neutral"} Ontology volume can be larger than dataset volume because Ontology data cannot be compressed, and Ontology indexing requires additional storage to enable faster queries. :::

Every hour, the Foundry platform records a measurement of Ontology volume per object. When measuring Ontology volume over time, all hourly measurements are averaged over the given time period. Averaging over the course of one calendar month produces the GB-Month unit.
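
As a worked example of the averaging described above, the following sketch computes a GB-Month figure from hourly readings. The function name `average_volume_gb` and the sample readings are hypothetical; the only behavior taken from the text is that hourly measurements are averaged over the period.

```python
def average_volume_gb(hourly_readings_gb):
    """Average hourly volume readings (in GB) over the period they cover."""
    if not hourly_readings_gb:
        raise ValueError("at least one hourly reading is required")
    return sum(hourly_readings_gb) / len(hourly_readings_gb)


# Hypothetical 30-day month (720 hourly readings): the object type holds
# 10 GB for the first 15 days, then grows to 20 GB for the remaining 15.
readings = [10.0] * 360 + [20.0] * 360

gb_month = average_volume_gb(readings)
print(gb_month)  # 15.0
```

In this example the object type ends the month at 20 GB, but its usage for the month is 15 GB-Month, because the lower readings in the first half pull the average down.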

Investigating Ontology volume usage

Objects are managed in the Ontology Manager, the hub for all administration and monitoring of objects. The Ontology Manager allows users to configure which datasets should become objects, what types of properties are attached to these objects, and which link sets are defined between object types.

The total Ontology volume for objects and their corresponding link types is listed in the Resource Management Application.

Factors that drive Ontology volume

Ontology volume is effectively a measure of the size of all objects in the ontology, including their properties and links. The following two factors are the main drivers of total volume.

  • The number and size of objects per object type
      • When an Ontology object type is created from a dataset, one object is indexed per row of that dataset. The number of rows in the dataset therefore maps 1:1 to the number of objects in the corresponding object type.
      • Datasets that have more columns or that contain more data (for example, free-text fields) can produce larger individual objects, because each column becomes a property.
  • The number of links between objects when using join tables
      • In many-to-many relationships, the Ontology requires a join table that defines all of the links between objects based on their primary keys. These tables are indexed alongside the objects in the Ontology and count toward Ontology volume.
      • In general, these join (look-up) tables have a constant size per record, so their volume grows linearly with the number of links defined.
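
The two drivers above can be combined into a rough back-of-the-envelope estimate. This is a minimal sketch under stated assumptions: the `ObjectTypeEstimate` class, the `estimate_volume_bytes` function, and every number below (row counts, average object sizes, 32 bytes per link) are hypothetical values chosen for illustration, not figures published for Foundry.

```python
from dataclasses import dataclass


@dataclass
class ObjectTypeEstimate:
    row_count: int             # one object per row of the backing dataset
    avg_bytes_per_object: int  # grows with column count and value sizes


def estimate_volume_bytes(object_types, link_count, bytes_per_link):
    """Objects contribute rows x size; join-table links contribute linearly."""
    objects_bytes = sum(t.row_count * t.avg_bytes_per_object for t in object_types)
    links_bytes = link_count * bytes_per_link
    return objects_bytes + links_bytes


# Hypothetical many-to-many relationship between employees and projects.
employees = ObjectTypeEstimate(row_count=1_000, avg_bytes_per_object=200)
projects = ObjectTypeEstimate(row_count=50, avg_bytes_per_object=400)

base = estimate_volume_bytes([employees, projects], link_count=3_000, bytes_per_link=32)
doubled = estimate_volume_bytes([employees, projects], link_count=6_000, bytes_per_link=32)

print(base)            # 316000
print(doubled - base)  # 96000
```

Doubling the number of links adds exactly the original link contribution again, which is the linear growth the text describes; the object contribution is unchanged.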

Managing Ontology volume

The Ontology is designed to be a fast-access backend for operational usage and querying. In general, the Ontology is best used with highly refined data that is synthesized from a larger data asset. The volume of the Ontology should therefore be smaller than the total size of the raw and intermediate data in Foundry’s transformation framework.

To manage Ontology volume usage, track the number of object types defined in the Ontology, the number of objects per object type, and the number and size of the properties on each object. Deliberately managing these counts and sizes is the most effective way to keep Ontology volume under control.

