Ontology design: Structural guidance
The following sections provide guidance on how to structure properties, relationships, and access control within the Ontology.
Normalization and derived properties
Store each fact once. Use derived properties for convenience.
Denormalized data (copying values from linked objects onto a parent object) can be risky. When the data source changes, every copy must be updated. Normalization keeps data consistent, and derived properties give you the convenience of denormalized access without the upkeep.
Not all computed values are the same. The right approach depends on whether a value can be safely pre-computed from stable inputs or whether it needs to stay in sync with dynamic Ontology changes.
Pre-computed vs. dynamically derived values
| Type | Characteristics | Recommended tool | Example |
|---|---|---|---|
| Pre-computed | Computed from properties on the same object; inputs rarely change or only change due to pipeline ingestion. | Pipeline transform | fullName = firstName + " " + lastName. Inputs are stable and updated in the same pipeline, so pre-computing is safe and adds no runtime overhead. |
| Dynamically derived | Depends on linked objects or values that change via actions, automations, or other Ontology-level operations. | Derived property | directReportCount. Employees are reassigned, onboarded, and offboarded through actions. A derived property that counts linked Employee objects stays correct automatically. |
:::callout{theme="warning"} If you store a value that depends on changes made through actions instead of deriving it, every action that could affect the value must also update it. If any action fails to do so, the stored value stays incorrect until someone notices the discrepancy. :::
Anti-patterns
- The same value is stored as a property on multiple object types
- Properties go stale because they are copies of values maintained elsewhere
- Updating a single real-world fact requires writes to multiple objects
- Integer or count properties are manually maintained rather than computed from links
Example
A Manager object type needs to display a count of direct reports:
✗ Avoid
Manager
- directReportCount: 5 (manually maintained integer; must be updated every time an employee joins or leaves)
Employee
- managerName: "Alice" (copied from the linked Manager; breaks if the manager's name changes)
✓ Prefer
Manager
- directReportCount (derived): counts linked Employee objects at query time
Employee
- manager (link to Manager)
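The Manager example can be sketched in plain Python to show why deriving the count is safer than storing it. This is an illustration only, not Ontology configuration: the `Employee` and `Manager` classes, the `manager_id` field, and the `direct_report_count` function are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    name: str
    manager_id: Optional[str] = None  # the link is the only stored fact


@dataclass
class Manager:
    manager_id: str
    name: str


def direct_report_count(manager: Manager, employees: list) -> int:
    """Derive the count from links at read time instead of storing it."""
    return sum(1 for e in employees if e.manager_id == manager.manager_id)


alice = Manager("m1", "Alice")
staff = [Employee("Bo", "m1"), Employee("Cy", "m1"), Employee("Di")]
count_before = direct_report_count(alice, staff)

# Reassigning an employee only changes the link; there is no counter to update.
staff[2].manager_id = "m1"
count_after = direct_report_count(alice, staff)
```

Because the count is recomputed from the links, no write path can leave it stale. The cost is that the work happens on every read.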
Performance considerations
Derived properties are evaluated at runtime. The performance characteristics vary by scale:
| Scale | Recommendation |
|---|---|
| Low to moderate (<~10k objects per query) | Use derived properties freely. Runtime evaluation is sufficiently performant for most workflows. |
| High (>~10k objects per query) | Derived properties may introduce latency due to higher-overhead query paths. Denormalization may be an appropriate tradeoff, but it should be a conscious, documented decision and not the default. |
Best practices
- Store each fact in one place, on the object where it semantically belongs.
- Use derived properties to compute or aggregate values from linked objects at query time.
- Monitor performance as scale grows. If derived properties introduce unacceptable latency at high scale, consider selective denormalization.
- Document any denormalization with the rationale, the source of truth, and the update strategy for keeping copies in sync.
Structs
Group semantically related fields into structs.
When a property is naturally multi-field (for example, an address with street, city, state, and postal code), use a struct rather than flattening into separate properties. Structs preserve semantic grouping and enable richer metadata capture.
When to use structs
| Scenario | Example |
|---|---|
| Multi-field values | Address (street, city, state, postal code), coordinates (geopoint, altitude) |
| Values with metadata | AI-generated outputs with confidence scores, source references, and reasoning |
| Multi-valued properties with selection logic | Multiple phone numbers where a reducer surfaces the primary one |
Example
Modeling an address on a Facility object type:
✗ Avoid
Facility
- addressStreet
- addressCity
- addressState
- addressPostalCode
- addressCountry
- addressGeopoint
- addressLastOccupied
- addressDatasource
- addressLlmConfidence
- addressLlmReasoning
(Ten unrelated properties with a naming convention as the only link between them)
✓ Prefer
Facility
- address (struct array)
  - street (Main field)
  - city (Main field)
  - state (Main field)
  - postalCode (Main field)
  - country (Main field)
  - geopoint
  - lastOccupied (used for reducer sorting)
  - datasource
  - llmConfidence
  - llmReasoning
(One semantic concept with main fields and structured sub-fields)
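As a rough analogy in plain Python, the preferred shape resembles a single record type plus a selection function. The sketch below is not the platform's struct or reducer API: the `Address` dataclass, `main_fields`, and `most_recent` are hypothetical names, the geopoint field is omitted for brevity, and the sample values are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Address:
    # Main fields: what most consumers want to see.
    street: str
    city: str
    state: str
    postal_code: str
    country: str
    # Metadata carried alongside the primary value.
    last_occupied: date
    datasource: str
    llm_confidence: Optional[float] = None
    llm_reasoning: Optional[str] = None

    def main_fields(self) -> dict:
        """Return only the main fields, hiding the metadata."""
        return {
            "street": self.street,
            "city": self.city,
            "state": self.state,
            "postal_code": self.postal_code,
            "country": self.country,
        }


def most_recent(addresses: list) -> Address:
    """Reducer-style selection: surface the most recently occupied address."""
    return max(addresses, key=lambda a: a.last_occupied)


addresses = [
    Address("1 Old Rd", "Boise", "ID", "83702", "US", date(2019, 5, 1), "crm"),
    Address("9 New Ave", "Denver", "CO", "80202", "US", date(2024, 8, 20),
            "llm_extract", 0.92, "Matched lease document"),
]
primary = most_recent(addresses)
```

Keeping the metadata on the same record means the selection logic and any audit of an AI-generated value read from one place rather than from ten loosely related properties.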
Key benefits
| Benefit | Details |
|---|---|
| Semantic grouping | An address is one concept, not ten unrelated properties. The Ontology reflects this. |
| Metadata capture | Structs can carry source, confidence, and timestamp information alongside the primary value. |
| Reducer support | In multi-valued scenarios, reducers can surface the most relevant value (for example, the address with the most recent lastOccupied field). |
| Main field behavior | A struct can designate one or more main fields. In interfaces and queries, a struct with a single main field behaves like a simple property, and a struct with several main fields behaves like a struct containing only those fields. |
Structs are especially valuable in AI-first workflows where large language model (LLM) outputs have both a primary result and associated metadata (reasoning, source references, confidence scores). Capture these together rather than scattering them across unrelated properties.
Best practices
- Identify multi-field properties where the fields are semantically related and always used together.
- Define the struct with clear field names and types.
- Designate a main field so the struct behaves like a simple property in most contexts.
- Use reducers for multi-valued struct properties to surface the most relevant value.
- Capture metadata (source, confidence, timestamps) in the struct alongside the primary value, especially for AI-generated outputs.
Interfaces
Use interfaces to build reusable, future-proof abstractions.
Interfaces are the primary tool for achieving the "Don't repeat yourself" design principle and open/closed extensibility. They define a shared shape (properties, links, actions) that multiple object types can implement, enabling workflows to target the interface rather than individual types.
When to use interfaces
| Scenario | Example |
|---|---|
| Common properties across types | Inspectable interface with lastInspectionDate and inspectionStatus, implemented by Vehicle, Equipment, Facility |
| Shared workflows | A scheduling workflow targeting SchedulableResource works for arenas, conference rooms, and vehicles without modification |
| Taxonomic grouping | A MilitaryAsset interface implemented by Aircraft, Vessel, GroundVehicle for drilldown aggregation workflows |
| Multi-level abstraction | SchedulableResource extends Trackable, adding scheduling-specific properties to a broader tracking abstraction |
Example
Multiple object types need inspection tracking:
✗ Avoid
Vehicle
- lastInspectionDate
- inspectionStatus
- (duplicate action: Schedule Vehicle Inspection)
Equipment
- lastInspectionDate
- inspectionStatus
- (duplicate action: Schedule Equipment Inspection)
Facility
- lastInspectionDate
- inspectionStatus
- (duplicate action: Schedule Facility Inspection)
(Three copies of the same properties and logic, maintained independently)
✓ Prefer
Interface: Inspectable
- lastInspectionDate
- inspectionStatus
- (shared action: Schedule Inspection)
Vehicle implements Inspectable
- make, model, mileage, ...
Equipment implements Inspectable
- serialNumber, warrantyExpiry, ...
Facility implements Inspectable
- address, capacity, ...
(One interface, one shared action, three implementing types)
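The same idea can be illustrated with Python's structural typing. This is an analogy rather than the platform's interface mechanism: the `Inspectable` protocol mirrors the interface above, while `schedule_inspection`, `overdue`, the `"SCHEDULED"` status value, and the sample objects are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Protocol


class Inspectable(Protocol):
    """Shared shape: any type with these two properties qualifies."""
    last_inspection_date: date
    inspection_status: str


@dataclass
class Vehicle:
    make: str
    mileage: int
    last_inspection_date: date
    inspection_status: str


@dataclass
class Facility:
    address: str
    capacity: int
    last_inspection_date: date
    inspection_status: str


def schedule_inspection(target: Inspectable) -> None:
    """Shared action, written once against the interface."""
    target.inspection_status = "SCHEDULED"


def overdue(items: Iterable[Inspectable], cutoff: date) -> list:
    """Shared workflow: works for every implementing type without changes."""
    return [i for i in items if i.last_inspection_date < cutoff]


van = Vehicle("Ford", 42000, date(2023, 1, 10), "PASSED")
depot = Facility("1 Main St", 200, date(2024, 6, 1), "PASSED")

late = overdue([van, depot], cutoff=date(2024, 1, 1))
for item in late:
    schedule_inspection(item)
```

Adding an Equipment type later would require no change to `schedule_inspection` or `overdue`, which is the open/closed property the interface is meant to provide.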
Platform considerations
Even where current platform tooling does not fully support interface-backed workflows, designing with interfaces establishes a foundation that pays off as support expands.
| Situation | Guidance |
|---|---|
| The interface is fully supported in your workflow | Target the interface directly. A single workflow covers all implementing types. |
| The interface is not yet supported in a specific context | Define the interface now and duplicate the workflow per type as a temporary measure. This approach is no less efficient than working without an interface, and it establishes a clear path to consolidation once support is available. |
Review our interface documentation for current support details.
Best practices
- Identify common shapes: If multiple object types share properties, links, or actions, define an interface that captures the shared shape.
- Design interfaces around capabilities or taxonomy: Capability interfaces may include Inspectable, Schedulable, or Billable. Taxonomic interfaces may include MilitaryAsset or MedicalDevice.
- Target interfaces in workflows: Build actions, functions, and applications against interfaces where possible.
- Extend interfaces for multi-level abstraction: Interfaces can extend other interfaces to build layered abstractions.
- Scaffold now, consolidate later: Define interfaces even if some workflows must temporarily be duplicated per-type due to current platform support gaps.
Links and object-backed link types
Links should represent semantically meaningful relationships.
Every link type should answer a clear domain question, such as:
- Which facility did this patient visit?
- Which team does this employee belong to?
- Which equipment was used in this work order?
When to use link types
| Link type | Use when | Example |
|---|---|---|
| Direct link | The relationship is meaningful but carries no metadata of its own. | Employee → Department |
| Object-backed link | The relationship carries its own metadata (dates, roles, status, allocation). | Employee → VentureStaffing → Venture (with role, startDate, allocation) |
Not every linking object needs to be visible in every context. Some workflows care about the join metadata, others just want the direct connection. Object-backed links let you expose either view depending on the workflow.
Example
Modeling the relationship between employees and ventures, where each assignment has a role and start date:
✗ Avoid
Employee → Venture (direct link)
(no way to capture role, start date, or allocation per assignment)
OR
Employee
- ventureRole
- ventureStartDate
(ambiguous if employee has multiple venture assignments)
✓ Prefer
Employee → Venture Staffing → Venture
Venture Staffing
- role
- startDate
- allocationPercentage
- status
Workflows can expose either:
- Direct: Employee → Venture
- Detailed: Employee → Staffing → Venture
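A small Python sketch shows how one linking object can serve both views. It is illustrative only: `VentureStaffing` follows the example above, while `ventures_for`, `assignments_for`, the identifiers, and the sample rows are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VentureStaffing:
    """Linking object: one record per employee-venture assignment."""
    employee_id: str
    venture_id: str
    role: str
    start_date: date
    allocation_percentage: int
    status: str


def ventures_for(employee_id: str, staffing: list) -> list:
    """Direct view: Employee -> Venture, with the linking object hidden."""
    return sorted({s.venture_id for s in staffing if s.employee_id == employee_id})


def assignments_for(employee_id: str, staffing: list) -> list:
    """Detailed view: Employee -> Staffing -> Venture, metadata included."""
    return [s for s in staffing if s.employee_id == employee_id]


staffing = [
    VentureStaffing("e1", "v1", "Lead", date(2024, 1, 15), 60, "ACTIVE"),
    VentureStaffing("e1", "v2", "Advisor", date(2024, 3, 1), 40, "ACTIVE"),
    VentureStaffing("e2", "v1", "Engineer", date(2024, 2, 1), 100, "ACTIVE"),
]

direct = ventures_for("e1", staffing)
roles = {s.venture_id: s.role for s in assignments_for("e1", staffing)}
```

Because the role lives on the assignment rather than on the employee, an employee with two ventures has two unambiguous roles.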
Impact of incorrect link design
| Problem | Impact |
|---|---|
| Lost metadata | Direct links cannot capture when, why, or in what capacity a relationship exists. |
| Ambiguous multi-links | Properties like ventureRole on the source object become ambiguous when an entity participates in multiple relationships. |
| Meaningless links | Links that exist only because two datasets share a foreign key add noise to the Ontology and confuse navigation. |
Best practices
- Validate semantic meaning: Avoid links that exist only because two datasets share a foreign key. Ask if the relationship is meaningful in the domain.
- Evaluate whether the relationship carries metadata: If it does (dates, roles, status), use an object-backed link type to capture that metadata.
- Expose the right level of detail: Design workflows to use either the direct relationship or the detailed relationship through the linking object, depending on the context.
- Name links for clarity: Link names should describe the relationship from each direction. Review the section on naming conventions for more information.
Naming conventions
Optimize for human readability and agent navigability.
Consistent, descriptive naming is one of the most impactful investments you can make in Ontology quality. Clear names make the Ontology easier for both humans and AI agents to navigate, and names are far harder to correct once the Ontology is in use.
Naming rules
| Element | Convention | Good examples | Bad examples |
|---|---|---|---|
| Object types | Singular, concrete nouns a domain expert would recognize | Patient, WorkOrder, FlightSegment | Data, Item, Record |
| Properties | Concise, self-evident; no encoded type info or implementation details | age, status, lastInspectionDate | dtLastInspMod, nVAL01, fieldX |
| Links | Read naturally from each direction | department (Employee → Dept), employees (Dept → Employee) | relatedItems, link1 |
| Dates | Follow a single convention consistently across the Ontology | createdDate, updatedDate, effectiveDate | Mixing createdDate and dateOfCreation |
| Ambiguous terms | Qualify with specific meaning | monetaryValue, quantityOnHand, riskScore | value, quantity, score |
Example
| ✗ Avoid | ✓ Prefer |
|---|---|
| Object type: Item | Object type: Product |
| Property: dtLastInspMod | Property: lastInspectionDate |
| Property: value | Property: monetaryValue |
| | Property: quantityOnHand |
| Link: Item → Related Item | Link: Product → Supplier |
| | Link: Employee → Supervisor |
Best practices
- Establish naming conventions before building: Agree on patterns for dates, statuses, identifiers, and links up front.
- Follow the Ontology's established conventions: If the Ontology already uses createdDate, do not introduce dateOfCreation.
- Qualify ambiguous properties: Use monetaryValue, quantityOnHand, and riskScore. Do not use value, quantity, or score.
- Name links by relationship: A link from Employee to Department should be department (from the employee's perspective) and employees (from the department's perspective).
- Review names with end users: Names that seem clear to the builder may be ambiguous to consumers. Validate with the people who will use the Ontology every day.
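Two of these rules are mechanical enough to check automatically. The following is a minimal sketch of such a check, not an existing tool: `lint_property_names` is a hypothetical helper, the ambiguous-term set comes from the table above, and the date check assumes the only competing conventions are a `Date` suffix and a `dateOf` prefix.

```python
# Terms the naming rules above call out as ambiguous when used unqualified.
AMBIGUOUS = {"value", "quantity", "score"}


def lint_property_names(names: list) -> list:
    """Return human-readable warnings for a list of property names."""
    warnings = []
    for name in names:
        if name in AMBIGUOUS:
            warnings.append(f"{name}: ambiguous; qualify it with a specific meaning")
    # Assumption: dates follow either a "...Date" suffix or a "dateOf..." prefix.
    suffix_style = [n for n in names if n.endswith("Date")]
    prefix_style = [n for n in names if n.startswith("dateOf")]
    if suffix_style and prefix_style:
        mixed = ", ".join(sorted(suffix_style + prefix_style))
        warnings.append(f"mixed date conventions: {mixed}")
    return warnings


found = lint_property_names(["createdDate", "dateOfCreation", "value", "riskScore"])
```

A check like this catches only the mechanical cases. Whether a name is meaningful to a domain expert still needs review with the people who use the Ontology.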
Security design
Design security semantically, following the principle of least privilege.
Security in the Ontology should be expressed in terms that make sense in the domain, not in terms of data infrastructure. Users should be able to look at a security configuration and understand what is protected and why.
Security model
Combine row-level and column-level security for fine-grained cell-level access control:
| Security layer | Controls | Example |
|---|---|---|
| Row-level | The objects a user can view | VIP patients are restricted to senior staff |
| Column-level | The properties a user can view on visible objects | Clinical notes are restricted to the care team |
| Cell-level (combined) | The intersection of row and column restrictions | VIP patients' clinical notes are visible only to the senior care team |
Example
Controlling access to sensitive patient data:
✗ Avoid
PublicPatient (object type)
- name
- dob
- diagnosis
RestrictedPatient (object type)
- name
- dob
- diagnosis
- clinicalNotes
- mentalHealthRecords
(Duplicated schemas; security achieved by splitting types. Properties added to one type are easily forgotten on the other.)
✓ Prefer
Patient (single object type)
- name
- dob
- diagnosis (column-restricted: care team only)
- clinicalNotes (column-restricted: care team only)
- mentalHealthRecords (column-restricted: psychiatry team only)
Row-level security:
- VIP patients: senior staff only
Column-level security:
- clinicalNotes: care team only
- mentalHealthRecords: psychiatry only
(One type; security achieved by policy. Domain boundaries drive access rules.)
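To make the row and column intersection concrete, here is a minimal Python sketch of the Patient policy. It models the logic only and is not how the platform enforces security: the group names, the `isVip` property, the `PUBLIC` marker, and all function names are assumptions for illustration. Properties without a policy entry are hidden, reflecting the least-privilege default.

```python
from typing import Any, Optional

PUBLIC = "public"  # hypothetical marker: no column restriction

# Column-level policy: property name -> group required to view it.
COLUMN_POLICY = {
    "name": PUBLIC,
    "dob": PUBLIC,
    "diagnosis": "care_team",
    "clinicalNotes": "care_team",
    "mentalHealthRecords": "psychiatry_team",
}


def can_see_row(patient: dict, groups: set) -> bool:
    """Row-level rule: VIP patients are visible to senior staff only."""
    return not patient.get("isVip", False) or "senior_staff" in groups


def can_see_column(prop: str, groups: set) -> bool:
    """Column-level rule: unlisted properties are denied by default."""
    required = COLUMN_POLICY.get(prop)
    if required is None:
        return False
    return required == PUBLIC or required in groups


def visible_view(patient: dict, groups: set) -> Optional[dict]:
    """Cell-level result: the row check first, then the column filter."""
    if not can_see_row(patient, groups):
        return None
    return {k: v for k, v in patient.items() if can_see_column(k, groups)}


vip: dict = {
    "name": "Ann", "dob": "1980-02-03", "diagnosis": "flu",
    "clinicalNotes": "rest", "mentalHealthRecords": "n/a", "isVip": True,
}
regular: dict = dict(vip, isVip=False)

nurse_view = visible_view(vip, {"care_team"})
senior_view = visible_view(vip, {"care_team", "senior_staff"})
psych_view = visible_view(regular, {"psychiatry_team"})
```

One type and one policy table replace the PublicPatient and RestrictedPatient split, so a new property needs a single policy entry instead of a change to two schemas.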
Impact of incorrect security design
| Problem | Impact |
|---|---|
| Duplicated types for security | Schemas drift out of sync; properties added to one type are easily forgotten on the other. Violates the "Don't repeat yourself" design principle. |
| Over-permissive defaults | Starting with broad access and restricting later risks exposing sensitive data before lockdown is complete. |
| Ad-hoc filtering instead of policy | Security logic scattered through application code rather than enforced at the Ontology layer is fragile and difficult to audit. |
| Misaligned boundaries | Security boundaries that do not follow domain boundaries are harder to reason about and more likely to have gaps. |
Best practices
- Start restrictive, open up deliberately: Default to minimal access and widen as needed, rather than starting open and restricting later.
- Use row-level and column-level security together for fine-grained cell-level access control.
- Align security with domain boundaries: If your domain has natural access boundaries (a regional manager sees their region's data; a care team sees their patients), model those boundaries using Ontology relationships and security policies rather than ad-hoc data filtering.
- Avoid duplicating object types for security: A single type with well-designed security policies is better than multiple types with duplicated schemas.
- Review new Ontology paths for access-control consistency: Ensure that added links, types, or properties preserve the intended protections around restricted data.
Use the guidance on this page to ensure security boundaries align with domain boundaries, then refer to our security and governance documentation for configuration details.