Data restrictions
Object Storage V2 (OSv2) enforces data restrictions to ensure the quality of data going into the ontology, provide more deterministic behavior, and increase legibility across the platform. These restrictions are validated during indexing. For object types backed by batch datasources, violations will cause indexing jobs to fail. For object types backed by streaming datasources, records that violate these restrictions are dropped.
Primary keys and uniqueness
OSv2 enforces unique object primary keys for datasources. If there are duplicate primary keys within a single transaction, indexing will fail and throw an error. If there are duplicate primary keys across transactions, the version in the later transaction will be used.
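The two uniqueness rules can be illustrated with a small pre-indexing check. This is a minimal Python sketch, not OSv2's implementation: it assumes each transaction is a list of dict rows in commit order, and the names `merge_transactions` and `DuplicatePrimaryKeyError` are hypothetical.

```python
from typing import Any, Dict, Hashable, List

Row = Dict[str, Any]


class DuplicatePrimaryKeyError(ValueError):
    """Hypothetical error type standing in for an indexing failure."""


def merge_transactions(transactions: List[List[Row]], pk: str) -> Dict[Hashable, Row]:
    """Mimic the documented rule: duplicates inside one transaction fail,
    duplicates across transactions resolve to the later transaction."""
    merged: Dict[Hashable, Row] = {}
    for index, rows in enumerate(transactions):
        seen = set()  # primary keys already seen in this transaction only
        for row in rows:
            key = row[pk]
            if key in seen:
                raise DuplicatePrimaryKeyError(
                    f"duplicate primary key {key!r} in transaction {index}"
                )
            seen.add(key)
            merged[key] = row  # a later transaction overwrites an earlier one
    return merged
```

Running a check like this against the input dataset before syncing can surface duplicate keys earlier than a failed indexing job would.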
OSv2 prevents certain data types from being used as primary keys in order to encourage Ontology modeling best practices. The following types cannot be used as primary keys:
- Geopoint
- Geoshapes
- Arrays
- Time series properties
- Real number types (decimal, double, float)
Property type restrictions
- OSv2 enforces data type coherence between datasource schema and object type schema on every sync. Incompatible data types for a property will cause the build to fail.
- When changing the base type of an existing property (for example, from `Double` to `Integer`), all existing values for that property must be strictly compatible with the target type. If any data entries include values incompatible with the new type (such as fractional numbers when changing to Integer, or currency symbols), the migration will fail with an error such as `A property could not be cast to the new type`. Schema migrations will not proceed if incompatible values exist, and the migration process cannot automatically clean or coerce these values.
- OSv2 does not allow `NaN` or `±infinity` as property values.
- Empty strings are not allowed in OSv2; in OSv1, empty strings were silently converted to nulls.
- `Lat, Long` should be a comma-separated string with no parentheses, for example `-29.123, 150.982`.
- OSv2 does not allow properties with nested arrays.
- OSv2 does not allow properties with array data types to have null elements within the array.
- OSv2 supports `Not` conditions in granular permissioning policies of restricted view datasources where the negated field is a collection and has a non-empty constraint. This can be configured in Ontology Manager by marking the relevant property as required.
- OSv2 has stricter validations on geopoint properties.
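Several of the value-level restrictions in the list above can be checked upstream, before data reaches indexing. The following is a minimal Python sketch under stated assumptions, not the validation OSv2 actually runs: the function names are hypothetical, the `Lat, Long` pattern and the latitude and longitude ranges are assumptions, and treating nulls as castable is also an assumption.

```python
import math
import re
from typing import Any, Iterable, List

# Assumed pattern: two signed decimals separated by a comma, no parentheses.
LAT_LONG = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$")


def property_violations(value: Any) -> List[str]:
    """Return the restrictions from the list above that a single value breaks."""
    problems: List[str] = []
    if isinstance(value, float) and not math.isfinite(value):
        problems.append("NaN or infinity")
    if isinstance(value, str) and value == "":
        problems.append("empty string")
    if isinstance(value, list):
        if any(item is None for item in value):
            problems.append("null element in array")
        if any(isinstance(item, list) for item in value):
            problems.append("nested array")
    return problems


def is_valid_geopoint(text: str) -> bool:
    """Check the 'lat, long' string form; the numeric ranges are an assumption."""
    match = LAT_LONG.match(text)
    if match is None:
        return False
    lat, lon = float(match.group(1)), float(match.group(2))
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def can_cast_double_to_integer(values: Iterable[Any]) -> bool:
    """True only if every non-null value is a finite whole number."""
    for v in values:
        if v is None or isinstance(v, int):
            continue  # assumption: nulls and integers are compatible
        if not isinstance(v, float) or not math.isfinite(v) or not v.is_integer():
            return False
    return True
```

Because a migration will not clean or coerce values itself, a check such as `can_cast_double_to_integer` is most useful when run over the full column before the base type is changed.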
Property size limits
OSv2 enforces size limits on individual properties to ensure reliable indexing performance and stability. These limits address serialization constraints and memory pressure that can occur when processing large property values.
| Property type | Maximum size |
|---|---|
| String properties | 12 MB |
| Array properties | 100,000 elements |
Properties exceeding these limits will cause indexing jobs to fail. These limits are initially enforced for object types backed by batch datasources; similar limits will be applied to object types with streaming datasources in the future.
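A size check before indexing might look like the sketch below. It is illustrative only: the source does not say how the 12 MB is measured, so the code assumes UTF-8 bytes with 1 MB read as 1024 × 1024 bytes, and it assumes values exactly at a limit are accepted. The function name `size_violation` is hypothetical.

```python
from typing import Any, Optional

MAX_STRING_BYTES = 12 * 1024 * 1024  # assumption: 12 MB measured as UTF-8 bytes
MAX_ARRAY_ELEMENTS = 100_000


def size_violation(value: Any) -> Optional[str]:
    """Return a description of the exceeded limit, or None if the value fits."""
    if isinstance(value, str):
        size = len(value.encode("utf-8"))
        if size > MAX_STRING_BYTES:
            return f"string is {size} bytes, over the {MAX_STRING_BYTES}-byte limit"
    elif isinstance(value, (list, tuple)):
        if len(value) > MAX_ARRAY_ELEMENTS:
            return f"array has {len(value)} elements, over the {MAX_ARRAY_ELEMENTS} limit"
    return None
```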
Recommendations for large data
- Large string properties: If you need to store data exceeding 12 MB, use a media reference property instead. Media references allow you to associate large files or binary content with an object without impacting indexing performance.
- Large arrays: If you need to model relationships that would exceed 100,000 elements, consider using link types instead of array properties. Links provide a more scalable and queryable way to represent relationships between objects.