Audit logs¶
Audit logs provide a comprehensive record of every action taken in Foundry, enabling security teams to detect threats, investigate incidents, ensure compliance, and maintain accountability across the platform. Audit logs can be thought of as a distilled record of all actions taken by users in the platform. This is often a compromise between verbosity and precision, where overly verbose logs may contain more information but be more difficult to reason about.
What audit logs capture¶
Audit logs in Foundry contain enough information to answer the critical questions for any security investigation or compliance review:
- Who performed an action (user identity, session, or service account).
- What the action was (categorized by type and intent).
- When the action happened (precise timestamps for temporal analysis).
- Where the action occurred (which resources and systems were involved).
Audit logs sometimes contain contextual information about users, including personally identifiable information (PII) such as names and email addresses, as well as other potentially sensitive usage data. As such, audit log contents should be considered sensitive and viewed only by persons with the necessary security qualifications.
Audit logs (and associated detail) should generally be consumed and analyzed in a separate purpose-built system for security monitoring (a "security information and event management", or SIEM solution) owned by the customer if one is available. If no such system has been provisioned, Foundry itself is flexible enough for some light SIEM-native workflows to be performed directly in the platform instead.
This documentation explains how to access, consume, and analyze Foundry audit logs:
- Audit delivery: How to ingest logs into your SIEM or export them to Foundry for analysis.
- Audit schema: The structure and contents of audit logs, including field definitions and guarantees.
- Migrating from audit.2 to audit.3: Guidance for transitioning existing audit.2 analyses to the new audit.3 schema.
Other documentation available includes:
- The audit log categories documentation about category-based query patterns.
- Documentation on monitoring security audit logs with analysis best practices.
- The list-log-files and get-log-file-content audit API documentation for SIEM integrations.
:::callout{theme="success" title="Best practice"}
Customers are strongly encouraged to consume and monitor their own audit logs via the mechanisms presented below. All audit log analyses should use the new and improved audit.3 schema logs to maintain continuity as we are in the process of fully migrating audit log archival from audit.2 to audit.3 for new audit logs. Audit.3 logs are available via API for use in a SIEM or exportable to Foundry for in-platform analysis. Review our documentation on monitoring audit logs for additional guidance.
:::
Audit delivery¶
Foundry provides flexible mechanisms for delivering audit logs to meet diverse security infrastructure and SIEM requirements. The delivery method you choose depends on your organization's existing security tooling and analysis workflows.
Compare delivery methods¶
Audit.3 logs (recommended for all new implementations):
Audit.3 schema logs offer significant advantages for security operations:
- Low latency: Available within ~15 minutes of event occurrence, enabling timely threat detection.
- Direct API access: Ingest directly into external SIEMs through public API without requiring Foundry as an intermediary.
- Structured categories: Enforced, standardized categories provide predictable structure for automated analysis.
- Future-proof: New Foundry features automatically use existing categories, so monitoring queries do not need updates.
Audit.3 logs can be consumed through API into a SIEM or through an audit export to a Foundry dataset if necessary.
Audit.2 logs (legacy, for historical analysis only):
The historical audit.2 schema logs are compiled, compressed, and moved within about 24 hours to environment-dependent log archival storage (for example, an S3 bucket). These logs:
- Must be exported to a Foundry dataset before analysis.
- Do not support direct API ingestion to external SIEMs.
- Have optional category usage with inconsistent structure.
- Should only be used for analyzing historical periods before the audit.3 migration.
From archival storage, Foundry can deliver audit.2 logs to customers through an audit export to Foundry.
Audit ingest directly to SIEM¶
External SIEMs can ingest audit.3 logs directly from storage using Palantir's list-log-files and get-log-file-content audit API endpoints. This approach is preferred for organizations with dedicated security operations centers and established SIEM platforms, as it provides the following:
- Minimal latency: Logs available in your SIEM within 15 minutes for rapid threat detection and response.
- Native integration: Logs flow directly into your existing security tooling and alerting workflows.
- No intermediary: Eliminates the need to route logs through Foundry first, simplifying architecture and saving any potential costs spent on storage/compute of audit logs if not exported to and analyzed in Foundry itself.
- Standard protocols: Uses familiar API patterns that integrate with most major SIEM platforms.
SIEM ingestion is an advantage of audit.3 logs over the historical audit.2 logs, which do not have public APIs that allow direct ingestion by external SIEMs (audit.2 logs must first be exported to a Foundry dataset, then exported from there to a SIEM).
Pagination and polling for SIEM ingestion¶
The list-log-files endpoint uses token-based pagination. The nextPageToken field is always present in the response, even when the data field is empty or omitted, meaning no new log files are available.
To continuously poll for new audit logs, follow the steps below:
- Make an initial request with a startDate and optional endDate.
- Store the nextPageToken from the response.
- On subsequent requests, provide the stored nextPageToken to resume from where you left off. New log files produced since your last request will be returned.
- When the data field is empty, there are no new logs yet, but the nextPageToken is still valid and should be saved for the next polling cycle.
This design allows SIEM integrations to poll on any cadence without missing logs or receiving duplicates. If you omit the endDate parameter, the endpoint will always return the latest available logs, making it suitable for open-ended continuous polling.
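The polling steps above can be sketched as a small loop. This is a minimal illustration, not the official client: the `list_log_files` callable is a hypothetical wrapper around your own HTTP client, and the request parameter name `pageToken` is an assumption. Only `startDate`, `data`, and `nextPageToken` come from the description above.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical wrapper: takes request parameters and returns the parsed JSON
# response of list-log-files as a dict. Wire this to your own HTTP client.
ListLogFiles = Callable[[Dict[str, Any]], Dict[str, Any]]


def poll_once(
    list_log_files: ListLogFiles,
    page_token: Optional[str],
    start_date: str,
) -> Tuple[List[Any], str]:
    """Drain all currently available pages; return (files, token to persist)."""
    files: List[Any] = []
    while True:
        # The first request uses startDate; later requests resume from the token.
        # "pageToken" is an assumed parameter name for illustration.
        params = {"pageToken": page_token} if page_token else {"startDate": start_date}
        response = list_log_files(params)
        page = response.get("data") or []  # data may be empty or omitted
        page_token = response["nextPageToken"]  # described above as always present
        if not page:
            # Nothing new yet: keep the token for the next polling cycle.
            return files, page_token
        files.extend(page)
```

Persisting the returned token between runs (for example, in your SIEM connector's state store) is what lets the next cycle resume without gaps or duplicates.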
Authentication for API access¶
To access the audit API endpoints, you must authenticate your SIEM requests. We recommend the following approaches, in order of general preference. However, the option best for you depends on your desired security model.
Note that your API requests must include an authorization header whose token holds audit-export:view on the organization for which you are requesting logs. audit-export:view is a gatekeeper operation that must be granted to the client or user whose token is used in the header. Under Organization permissions in Control Panel, grant the generated client or user a role that includes the Create datasets with audit logs for the organization workflow for the organization whose logs you want. Review the permissions documentation for more information.
Third-party application through Developer Console (recommended):
The most secure and maintainable approach is to create a third-party application in the Developer Console application with the appropriate audit log permissions. The Developer Console allows you to create and manage applications that talk to public APIs using the Ontology SDK and OAuth.
This method provides:
- Scoped permissions specific to audit log access.
- Better security through the OAuth2 client credentials flow.
- Easier credential rotation and management.
- Clear audit trails of which application is accessing logs.
Review the Developer Console documentation and third-party applications documentation for setup instructions.
OAuth2 client credentials:
If you cannot use a Developer Console application, OAuth2 client credentials provide a secure programmatic authentication method suitable for automated SIEM ingestion.
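As a rough sketch of what the client credentials flow looks like from a SIEM connector, the snippet below builds a standard OAuth2 token request and the bearer header for later API calls. The token URL is a placeholder you must replace with the value for your enrollment, and sending credentials in the form body (rather than HTTP Basic) is an assumption about what your setup accepts.

```python
import urllib.parse
import urllib.request
from typing import Dict


def build_token_request(
    token_url: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth2 client credentials token request."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_url,  # placeholder: use the token endpoint for your enrollment
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def bearer_header(access_token: str) -> Dict[str, str]:
    """Authorization header to attach to audit API requests."""
    return {"Authorization": f"Bearer {access_token}"}
```

The request can be sent with `urllib.request.urlopen`, and the access token read from the JSON response. Keep the client secret in your SIEM's secret store rather than in connector configuration.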
User tokens (not recommended):
Administrator user tokens should be avoided for production SIEM integrations as they come with the following limitations:
- Are tied to individual user accounts, creating operational dependencies.
- May have broader permissions than necessary.
- Are harder to rotate without disrupting service.
- Create unclear audit trails when used by automated systems.
Audit export to Foundry¶
Both audit.2 and audit.3 logs can be exported, per-organization, directly into a Foundry dataset through the audit logs tooling in Control Panel. This approach is suitable for organizations that do not currently have SIEM tooling and still need to analyze their audit logs.
Once audit log data has landed in a Foundry dataset, it can be analyzed directly in Foundry. Pipeline Builder can be used for prototyping or quick, basic analyses, while Code Repositories is better suited to long-term analysis given the scale of the data. Audit log datasets may be too large to effectively analyze in Contour without filtering them first. You may choose to export from Foundry to an external SIEM through Data Connection, although audit.3 logs should be consumed directly through public APIs if a SIEM is the ultimate destination for audit log analysis.
Export permissions¶
To export audit logs, you will need the audit-export:orchestrate-v3 operation (for audit.3) on the target organization(s). This can be granted with the Organization administrator role in Control Panel, configurable from the Organization permissions tab. Review the organization permissions documentation for more details.
Export setup¶
To set up audit log exports to a Foundry dataset, follow the steps below:

- Navigate to Control Panel.
- In the top toolbar, confirm the relevant Organization is selected from the dropdown menu.
- In the left sidebar, select Search, then search for and select Audit logs.
- Choose Create export dataset.
- Select a log type: Choose audit.3 for the new schema of current audit logs, or audit.2 if you require analyzing historical audit logs from before the upgrade to the audit.3 schema. Review important limitations when migrating from audit.2 to audit.3.
- Select an export location and name for this audit log dataset to live in Foundry.
- Use markings to restrict your highly sensitive audit log dataset and specify the set of qualified platform administrators with need-to-know access who can view potentially sensitive usage details like PII and search queries. By default, audit log datasets will be marked with the organization selected above. Review our organization markings documentation for more information.
- Optionally, enable a start date filter to limit this dataset to events that occur on or after a given date.
- Optionally, enable a dataset-specific retention policy to limit the number of days logs are preserved in this particular export dataset (max 730 days). Note that retention policies are based on the transaction timestamp when logs were added to the export dataset, not the timestamp of the log entries themselves. Deleting the relevant transactions may take up to seven days after they are marked for deletion. The following are two examples to illustrate how this works in practice:
  - Example one:
    - Start date: 2025-01-01
    - Retention policy: 90 days
    - In this case, the export dataset will initially contain logs since January 1st, 2025. At 90 days after the creation of the dataset, the retention policy will take effect; only logs added in the last 90 days will be retained, rather than all logs since the start of January 2025 that were initially in the dataset. In practice, this means that the size of the dataset decreases significantly after 90 days when older logs are removed.
  - Example two:
    - Start date: 30 days ago
    - Retention policy: 90 days
    - In this scenario, the export dataset will begin with logs from the past 30 days. As new logs are added, the dataset will grow until it holds a rolling window of the most recent 90 days of data. Logs added in transactions older than 90 days will be removed according to the retention policy.
- Acknowledge you understand the security implications of creating the dataset and the configuration settings you are applying to the audit log export dataset, then select Create export.
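To make the retention rule from the steps above concrete, here is a small model of it. It is only an illustration of the stated behavior (retention keyed on the transaction timestamp, not on log entry timestamps). The field names `added_at` and `oldest_log` and the creation date are hypothetical, and the deletion delay of up to seven days is ignored.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def retained_transactions(
    transactions: List[Dict[str, datetime]],
    retention_days: int,
    now: datetime,
) -> List[Dict[str, datetime]]:
    """Keep transactions added within the retention window.

    Only the transaction timestamp ("added_at") matters; the age of the
    log entries inside the transaction does not.
    """
    cutoff = now - timedelta(days=retention_days)
    return [t for t in transactions if t["added_at"] >= cutoff]


# Example one: export created on an illustrative date, start date 2025-01-01.
created = datetime(2025, 6, 1, tzinfo=timezone.utc)
backlog = {"added_at": created, "oldest_log": datetime(2025, 1, 1, tzinfo=timezone.utc)}
recent = {"added_at": created + timedelta(days=60), "oldest_log": created + timedelta(days=59)}

# Day 89: the initial backlog transaction is still inside the window.
before = retained_transactions([backlog, recent], 90, created + timedelta(days=89))
# Day 91: the whole backlog transaction ages out at once.
after = retained_transactions([backlog, recent], 90, created + timedelta(days=91))
```

Here `before` holds both transactions, while `after` holds only the recent one, which mirrors the sharp drop in dataset size described in example one.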
Update or disable export datasets¶
Export dataset updates:
- For larger environments, builds of audit.2 datasets may produce empty append transactions in the first several hours (or longer). This is expected behavior as the pipeline processes the full backlog of audit logs. This delay is not present for new exports of audit.3 datasets.
- New logs are appended to the export dataset on a regular cadence. Each append pulls, at most, 100 GiB of log data (and 10k files) for audit.3 datasets, and 10 GiB for audit.2 datasets. Such large appends are generally only needed when an audit log dataset is first created.
- The runtime of each audit log append is directly proportional to how many new logs are being appended to an audit log dataset.
- Schedules for builds of the export dataset are managed by the audit-export service and are hidden from users.
Audit.3 export datasets use the date column as a Hive Partition column in their schema. This means the date column functions as a partition key rather than a regular data column, which affects how the data appears when queried. For example, date values will be formatted as date types rather than datetime. The primary advantage of Hive Partitioning is that for ad-hoc analyses, filtering by date first rather than by timestamp will provide much faster lookup in Spark or general builds.
Export dataset disablement: To disable an export, open the audit log dataset and select File > Move to Trash, or manually move the dataset to another project.
- Moving an audit log dataset will stop any further builds of that dataset after roughly one hour has elapsed in the Trash or different project.
- Note that there is no way to restart these builds once halted, even if the dataset is subsequently restored from the Trash or moved back to the original project.
Audit schema¶
All logs that Palantir products produce are structured logs. This means that they have a specific schema that they follow, which can be relied on by downstream systems for automated analysis and alerting.
Palantir audit logs are currently delivered in both the historical audit.2 schema and in the new and improved schema, audit.3. Audit.2 logs of new events will soon cease to be available for export, and only historical logs will be available in that schema; all new events will be available for export in only the audit.3 schema.
Within both the audit.2 and audit.3 schemas, audit logs will vary depending on the product that produces the log. This is because each product is reasoning about a different domain, and thus will have different concerns that it needs to describe. This variance is more noticeable in audit.2, as will be explained below.
Product-specific information is primarily captured within the requestFields and resultFields for audit.3 logs (or request_params and result_params for audit.2 logs). The contents of these fields will change shape depending on both the product doing the logging and the event being logged.
Audit log categories¶
Palantir logs use a concept called audit log categories to make logs easier to understand with little product-specific knowledge. Rather than needing to track hundreds of service-specific event names, categories let security analysts focus on high-level actions like data loading, permission changes, or authentication attempts, regardless of which product or feature generated the log. This abstraction enables analysts to build monitoring queries that work across all Foundry services without needing to understand implementation details. For example, filtering for dataExport captures all data export events regardless of what product was used to export the data.
With audit log categories, audit logs are described as a union of auditable events. Audit log categories are based on a set of core concepts and divided into categories that describe actions on those concepts, such as the following:
| Category | Description | Example use case |
|---|---|---|
| authenticationCheck | Checks authentication status via a programmatic or manual authentication event, such as token validation. | Detect token validation patterns that suggest credential stuffing. |
| dataCreate | Indicates the addition of some new entry of data into the platform where it did not exist prior. This event may be reflected as a dataPromote in a separate service if it is logged in the landing service. | Track data creation patterns and enforce governance policies. |
| dataDelete | Related to the deletion of data, independent of the granularity of that deletion. | Alert on deletion of critical or protected resources. |
| dataExport | Export of data from the platform. Use for cases such as downloading data from the platform to a system external to Palantir, a CSV file, and so on. If data was exported to another Palantir system, use the dataPromote category. | Alert on large exports, exports of sensitive data, or exports outside business hours. |
| dataImport | Imports to the platform. Unlike dataPromote, dataImport refers only to data being ingested from outside the platform. This means that a dataImport in one service could show up as a dataPromote in a separate service. | Monitor for malicious file uploads or policy violations. |
| dataLoad | Refers to the loading of data to be returned to a user. For purely back-end loads, use internal. | Establish baseline normal access patterns and detect anomalous bulk data access. |
| tokenGeneration | Action that leads to generation of a new token. | Detect unusual token creation that could indicate preparation for bulk data access. |
| userLogin | Login events of users. | Monitor for failed login attempts, unusual login times, or geographic anomalies. |
| userLogout | Logout events of users. | Track session durations and identify abnormally long sessions. |
Audit log categories have also gone through a versioning change, from a looser form within audit.2 logs to a stricter and richer form within audit.3 logs. In audit.3, each log must specify at least one category, and each category defines exactly which request and result fields will be present, making automated analysis far more reliable.
Refer to our documented audit log categories for a detailed list of available categories with field specifications. Also, review the monitoring security audit logs documentation for additional guidance on using categories.
Audit log attribution¶
Audit logs are written to a single log archive per environment. When audit logs are processed through the delivery pipeline, the User ID fields (uid and otherUids in the schema below) are extracted, and the users are mapped to their corresponding organizations.
An audit export orchestrated for a given organization is limited to audit logs attributed to that organization. Actions taken solely by service (non-human) users will not typically be attributed to any organization, as these users are not organization members. As a special case, service users for third-party applications that use client credentials grants and are used only by the registering organization do generate audit logs attributed to that organization.
Audit.3 logs¶
Any new log exports or analyses should use audit.3 logs rather than audit.2 logs.
Audit.3 logs are built upon a new log schema that provides a number of advantages over audit.2 logs. The key benefits to audit log consumers of the new schema and associated delivery pipeline are the following:
- Faster delivery: Latency is reduced from potentially more than 24 hours to 15 minutes or less, enabling near-real-time threat detection.
- Structured categories: Enforced, documented categories provide predictable structure for analysis. This helps to reduce the need to understand individual products when reasoning about log contents, especially if analytical needs encompass entire workflows (for example, dataExport).
- Enhanced top-level fields: New fields like product allow easier filtering by product name instead of requiring complex event name mapping filters to focus analysis on a single product's usage.
- Direct API access: Public APIs are now available for ingesting audit logs into external SIEMs without requiring Foundry as an intermediary.
- Enriched context: Additional contextualized information about users, resources, and relationships extracted during delivery is available for easier downstream analysis.
Schema guarantees¶
Audit.3 logs are produced with the following guarantees in mind:
- Explicit category definitions: Each audit category explicitly defines the values/items on which it applies; for example, dataLoad describes the precise resources that are loaded.
- Union of categories: Each log is produced strictly as a union of audit categories. This means that logs will not contain free-form data, ensuring predictable structure.
- Promoted key information: Certain important information within an audit log is promoted to the top level of the audit.3 schema. For example, all named resources are present at the top level, as well as within the request and result fields.
These guarantees mean that for any particular log it is possible to tell (1) what auditable event created it, and (2) exactly what fields it contains. These guarantees are product-agnostic, enabling security analysts to build monitoring queries that work across all Foundry services.
Audit.3 schema reference¶
The audit.3 log schema is provided below:
| Field | Type | Description |
|---|---|---|
| categories | set<string> | All audit categories produced by this audit event. |
| entities | list<any> | All entities (for example, resources) present in the request and result fields of this log. |
| environment | optional<string> | The environment that produced this log. |
| eventId | uuid | The unique identifier for an auditable event. This can be used to group log lines that are part of the same event. For example, the same eventId will be logged in lines emitted at the start and end of a large binary response streamed to the consumer. |
| host | string | The host that produced this log. |
| logEntryId | uuid | The unique identifier for this audit log line, not repeated across any other log line in the system. Note that some log lines may be duplicated during ingestion into Foundry, and there may be several rows with the same logEntryId. Rows with the same logEntryId are duplicates and can be ignored. |
| name | string | The name of the audit event, generally following a (product name)_(endpoint name) structure in ALL CAPS, snake-cased. For example: DATA_PROXY_SERVICE_GENERATED_GET_DATASET_AS_CSV2. |
| orgId | optional<string> | The organization to which the uid belongs, if available. |
| origin | optional<string> | The best-effort identifier of the originating machine. For example, an IP address, a Kubernetes node identifier, or similar. This value can be spoofed. |
| origins | list<string> | The origins of the network request, determined by request headers. This value can be spoofed. To identify audit logs for user-initiated requests, filter to audit logs that have non-empty origins. Audit logs with empty origins correspond to service-initiated requests made by the Palantir backend while fulfilling user-initiated requests. If an audit log with non-empty origins has categories including apiGatewayRequest, then the associated request was fulfilled by an API gateway. To find audit logs for the requests made by the API gateway to fulfill the user-initiated request, filter to logs with the same traceId that have a userAgent starting with the service in this audit log. |
| product | string | The product that produced this log. |
| producerType | AuditProducer | How this audit log was produced; for example, from a backend (SERVER) or frontend (CLIENT). |
| productVersion | string | The version of the product that produced this log. |
| requestFields | map<string, any> | The parameters known at method invocation time. Entries in the request and result fields will be dependent on the categories field defined above. |
| result | AuditResult | Indicates whether the request was successful or the type of failure; for example, ERROR or UNAUTHORIZED. |
| resultFields | map<string, any> | Information derived within a method, commonly parts of the return value. |
| sequenceId | uuid | A best-effort ordering field for events that share the same eventId. |
| service | optional<string> | The service that produced this log. |
| sid | optional<SessionId> | The session ID, if available. |
| sourceOrigin | optional<string> | The origin of the network request, determined by the TCP stack. |
| stack | optional<string> | The stack on which this log was generated. |
| time | datetime | The RFC3339Nano UTC datetime string, for example 2025-11-13T23:20:24.180Z. |
| tokenId | optional<TokenId> | The API token ID, if available. |
| traceId | optional<TraceId> | The Zipkin trace ID, if available. |
| uid | optional<UserId> | The user ID, if available. This is the most downstream caller. |
| userAgent | optional<string> | The user agent of the user that originated this log. |
| users | set<ContextualizedUser> | All users present in this audit log. ContextualizedUser fields: uid, userName, firstName, lastName, groups, realm. |
:::callout{theme="neutral"}
In the current audit.3 pipeline, only the uid field is populated. The userName, firstName, lastName, and realm fields are not set, and groups is always empty. Populating these fields would require real-time lookups against the identity provider, which is incompatible with the low-latency design of the audit.3 pipeline. To enrich audit logs with user profile information, perform downstream lookups against user directory data in a Foundry pipeline.
:::
We generally recommend use of the immutable product schema field to filter audit logs when working to analyze particular applications. The service field may sometimes be useful for consumers as it allows the filtering between different instances of the same product, potentially helpful for understanding a specific incident more granularly.
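Two of the field notes in the schema reference lend themselves to a short sketch: dropping rows that share a logEntryId, and keeping only user-initiated requests by filtering to non-empty origins. The example below assumes each log has already been parsed into a Python dict keyed by the field names in the table; the sample values are made up.

```python
from typing import Any, Dict, Iterable, List


def dedupe_by_log_entry_id(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first row seen for each logEntryId; later copies are duplicates."""
    seen = set()
    unique = []
    for row in rows:
        if row["logEntryId"] not in seen:
            seen.add(row["logEntryId"])
            unique.append(row)
    return unique


def user_initiated(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep logs with non-empty origins, which mark user-initiated requests."""
    return [row for row in rows if row.get("origins")]


# Made-up sample rows for illustration only.
sample = [
    {"logEntryId": "a", "origins": ["10.0.0.1"]},
    {"logEntryId": "a", "origins": ["10.0.0.1"]},  # duplicate from ingestion
    {"logEntryId": "b", "origins": []},  # service-initiated request
]
unique_rows = dedupe_by_log_entry_id(sample)
user_rows = user_initiated(unique_rows)
```

Remember that origins can be spoofed, so this filter narrows a search rather than proving where a request came from.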
Audit.2 logs¶
Only analyses of historical periods before the logging of new audit.3 logs should use audit.2 logs. Refer to important limitations when migrating from audit.2 to audit.3, as the archiving of new events as logs in the audit.2 schema will be deprecated soon. Audit.2 logs have no inter-product guarantees about the shape of the request or result parameters. As such, reasoning about audit logs must typically be performed on a product-by-product basis.
Audit.2 logs may present an audit category within them that can be useful for narrowing a search. However, this category does not contain further information or prescribe the rest of the contents of the audit log. Additionally, audit.2 logs are not guaranteed to contain an audit category. If present, categories will be included in either the _category or _categories field within request_params.
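Because the category may live in either of two keys, or be absent entirely, a defensive lookup helps when scanning historical audit.2 rows. The sketch below assumes each row is a parsed dict, that _categories holds a list of strings, and that _category holds a single string. Those value shapes are assumptions, not something the schema guarantees.

```python
from typing import Any, Dict, List


def audit2_categories(log: Dict[str, Any]) -> List[str]:
    """Return categories found in request_params, or [] if none are present."""
    params = log.get("request_params") or {}
    if "_categories" in params:
        value = params["_categories"]
        # Assumed to be a collection; wrap a lone value just in case.
        return list(value) if isinstance(value, (list, tuple)) else [value]
    if "_category" in params:
        return [params["_category"]]
    return []  # audit.2 logs are not guaranteed to carry a category
```

An empty result means the row has to be classified by its event name instead.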
The schema of audit.2 log export datasets is provided below.
| Field | Type | Description |
|---|---|---|
| filename | .log.gz | Name of the compressed file from the log archive. |
| ip | string | Best-effort identifier of the originating IP address. |
| name | string | Name of the audit event, such as PUT_FILE. |
| request_params | map<string, any> | The parameters known at method invocation time. |
| result | AuditResult | The result of the event (success, failure, and so on). |
| result_params | map<string, any> | Information derived within a method, commonly parts of the return value. |
| sid | optional<SessionId> | Session ID (if available). |
| time | datetime | RFC3339Nano UTC datetime string, for example: 2025-11-13T23:20:24.180Z. |
| token_id | optional<TokenId> | API token ID (if available). |
| trace_id | optional<TraceId> | Zipkin trace ID (if available). |
| type | string | Specifies the audit schema version: "audit.2" |
| uid | optional<UserId> | User ID (if available); this is the most downstream caller. |
Migrate from audit.2 to audit.3¶
Organizations currently using audit.2 logs for security monitoring or compliance must migrate their analyses to the audit.3 schema. This section provides guidance for transitioning your audit log workflows to take advantage of the improvements contained in audit.3.
Differences between audit.2 and audit.3¶
The audit.3 schema represents a fundamental architectural change from audit.2, not just a version update. Key differences that affect migration are described in the sections below.
Field name and structure changes¶
- request_params and result_params in audit.2 are now named requestFields and resultFields in audit.3.
- The contents of requestFields and resultFields are generally completely different from their audit.2 equivalents (request_params and result_params).
- Information found in request_params in audit.2 may now be in resultFields in audit.3, or vice versa.
- Always check both requestFields and resultFields when looking for specific information in audit.3.
- The type top level field identifies which schema produced the log: "audit.2" or "audit.3".
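Since a value may have moved between the request and result sides, a lookup helper that checks both maps can make migrated queries less brittle. This is a minimal sketch over a parsed audit.3 log dict; the key name used in the usage line (datasetRid) is purely hypothetical.

```python
from typing import Any, Dict, List, Tuple


def find_field(log: Dict[str, Any], key: str) -> List[Tuple[str, Any]]:
    """Return (location, value) pairs for `key` in requestFields and resultFields."""
    hits = []
    for location in ("requestFields", "resultFields"):
        fields = log.get(location) or {}
        if key in fields:
            hits.append((location, fields[key]))
    return hits


# Hypothetical log and key name, for illustration only.
example_log = {"requestFields": {}, "resultFields": {"datasetRid": "rid-123"}}
matches = find_field(example_log, "datasetRid")
```

Returning the location alongside the value makes it easy to see where a field actually lives for a given event while you are rewriting parsing rules.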
Event name changes¶
- Event names (in the name field) can be different between the schemas, which can lead to mismatches when parsing depends specifically on event names.
- Event names in audit.3 are much more standardized, generally following a (product name)_(endpoint name) structure in ALL CAPS, snake-cased. For example: DATA_PROXY_SERVICE_GENERATED_GET_DATASET_AS_CSV2.
- While there is not a direct 1:1 mapping from all audit.2 to audit.3 event names, most audit.2 log event names can be straightforwardly mapped to audit.3.
- Recommended approach: Use audit categories instead of specific event names for queries going forward, as categories provide consistent abstraction across both schemas (though audit.2 has optional category usage).
New capabilities¶
- In audit.3, there is a top level categories field with enforced, standardized values.
- New fields like product, service, entities, and users are available in audit.3 for easier filtering.
- The users field is available in audit.3 and contains user IDs for all users associated with a log entry. Additional user profile fields (userName, firstName, lastName, groups, realm) are defined in the schema but are not currently populated; enrich these downstream if needed.
Important limitations¶
- No historical backfill: Audit.3 only captures events that occur after it was enabled. Historical events will remain in the audit.2 log schema only.
:::callout{theme="neutral"}
Due to infrastructure updates that occurred between the 24th and 25th of November 2025 in preparation for general availability, some enrollments may have experienced potential gaps of approximately three hours in audit.3 log availability. We recommend using audit.2 logs as a fallback to cover analytical needs during this period, and full migration to audit.3 for analyses after December 4, 2025.
:::
- Concurrent logging period: Both schemas will emit logs during the transition period, allowing validation before full cutover.
Migration path¶
If you have existing analyses built on audit.2 logs, your migration approach depends on where you currently perform analysis. Follow the steps in the section below that matches your analysis approach.
If you use an external SIEM¶
We strongly recommend migrating to direct API ingestion rather than continuing to use Foundry export datasets. This approach provides:
- Significantly lower latency: Approximately 15 minutes or less, versus 24 or more hours for audit.2.
- Simpler architecture: Eliminates Foundry as an intermediary, reducing additional points of latency or failure.
- Native SIEM integration: Full logs flow directly into your existing security tooling.
Migration steps:
- Configure your SIEM to ingest from the list-log-files and get-log-file-content audit API endpoints.
- Update parsing rules to handle the new audit.3 schema structure (see the section below).
- Update detection rules to use categories instead of event names where possible.
- Run in parallel with your existing audit.2 ingestion during a validation period.
- Test alerting and detection workflows thoroughly.
- Once validated, deprecate your audit.2 Foundry export and SIEM ingestion.
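The first migration step can be sketched as a polling loop. The audit delivery documentation describes list-log-files as using token-based pagination, where nextPageToken is always present and data may be empty or omitted when no new files exist. The sketch below assumes exactly that response shape and nothing more. The transport is injected as a callable, so fetch_page, handle_file, and make_fake_transport are hypothetical names, and the real request parameters, authentication, and the call to get-log-file-content should be taken from the API reference.

```python
def poll_once(fetch_page, handle_file, start_date, page_token=None):
    """Drain every page currently available and return the token to persist.

    fetch_page(start_date, page_token) is a caller-supplied function that
    wraps the list-log-files request and returns the parsed JSON response.
    """
    while True:
        response = fetch_page(start_date, page_token)
        files = response.get("data") or []      # data may be empty or omitted
        for entry in files:
            handle_file(entry)                  # e.g. download and forward to the SIEM
        page_token = response["nextPageToken"]  # still valid when data is empty
        if not files:
            return page_token


def make_fake_transport(pages):
    """Hypothetical in-memory stand-in for the API, used only for this demo."""
    def fetch_page(start_date, page_token):
        index = 0 if page_token is None else int(page_token)
        if index < len(pages):
            return {"data": pages[index], "nextPageToken": str(index + 1)}
        return {"nextPageToken": str(index)}
    return fetch_page


fetch_page = make_fake_transport([["file-a", "file-b"], ["file-c"]])
seen = []
token = poll_once(fetch_page, seen.append, "2025-12-01")
```

Persisting the returned token between polling cycles is what lets the next cycle resume where the previous one stopped, so the polling frequency can be chosen freely.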
If you analyze logs in Foundry¶
- Create a new audit.3 export dataset in Control Panel following the export setup instructions.
- Keep your existing audit.2 export dataset running during the transition period.
- Create a new Global Branch to perform the migration by bundling together PRs (potentially across multiple code repositories, if necessary).
- Use the Data Lineage application to view your downstream consumption of audit.2 data in Foundry and understand which analyses are deprecated and which need to be migrated.
- Follow the refactoring steps below to update your analyses.
Identify category-based analysis patterns¶
Rather than migrating analyses one event name at a time, refactor them to use the standardized audit categories instead of individual product event names:
Old approach (audit.2):
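from pyspark.sql.functions import col
# audit_logs is a DataFrame read from the audit.2 export dataset
# Fragile: Depends on specific event names that may change
export_events = audit_logs.filter(
    col("name").isin(["EXPORT_DATASET", "DOWNLOAD_FILE", "CREATE_EXTERNAL_CONNECTION"])
)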
New approach (audit.3):
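from pyspark.sql.functions import array_contains, col
# audit_logs is a DataFrame read from the audit.3 export dataset
# Robust: Uses standardized categories
# categories holds a set of strings, so test membership with array_contains
export_events = audit_logs.filter(
    array_contains(col("categories"), "dataExport")
)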
This category-based approach provides several advantages:
- Future-proof: New export features released will use the dataExport category.
- Comprehensive: Captures all export mechanisms without needing to enumerate every event name.
- Cross-product: Works consistently across all Foundry services.
Validate during transition period¶
During the concurrent logging period:
- Run analyses on both audit.2 and audit.3 data sources in parallel.
- Compare outputs to ensure audit.3 analyses capture expected events.
- Verify that category-based filters provide equivalent or better coverage than event name filters.
- For SIEM users: Validate that alerts fire correctly on audit.3 data.
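One way to approach the comparison is to count matching events per day in each schema and flag the days where the category filter finds fewer events than the event name filter. The sketch below is a minimal plain-Python version over already-parsed records. The sample records and the helper names are hypothetical, and the event name list simply reuses the names from the example above. Because event names do not always map one to one, a flagged day is a prompt for review rather than proof of a missing event.

```python
from collections import Counter

AUDIT2_EXPORT_NAMES = {"EXPORT_DATASET", "DOWNLOAD_FILE", "CREATE_EXTERNAL_CONNECTION"}


def daily_counts(logs, is_match):
    """Count matching records per UTC day, using the date part of the time field."""
    return Counter(log["time"][:10] for log in logs if is_match(log))


def coverage_gaps(audit2_logs, audit3_logs):
    """Return {day: (audit2_count, audit3_count)} where audit.3 found fewer events."""
    old = daily_counts(audit2_logs, lambda log: log.get("name") in AUDIT2_EXPORT_NAMES)
    new = daily_counts(audit3_logs, lambda log: "dataExport" in log.get("categories", []))
    return {day: (old[day], new[day]) for day in sorted(old) if new[day] < old[day]}


# Hypothetical sample records covering two days of the concurrent logging period.
audit2_sample = [
    {"name": "EXPORT_DATASET", "time": "2025-12-05T10:00:00.000Z"},
    {"name": "DOWNLOAD_FILE", "time": "2025-12-06T09:00:00.000Z"},
    {"name": "PUT_FILE", "time": "2025-12-06T09:30:00.000Z"},
]
audit3_sample = [
    {"name": "EXAMPLE_SERVICE_EXPORT", "categories": ["dataExport"], "time": "2025-12-05T10:00:00.100Z"},
]

gaps = coverage_gaps(audit2_sample, audit3_sample)
```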
Complete the cutover¶
Once you have validated that your audit.3 analyses provide equivalent or better coverage:
- Switch all downstream consumers to use the audit.3 API or dataset.
- Disable your audit.2 export dataset (if using Foundry exports) by moving it to the Trash or to a different project.
- Remove audit.2 ingestion from your SIEM (if using an external SIEM).
- Note that historical audit.2 logs will remain accessible even after you stop generating new audit.2 logs. Analyses or investigations looking back in time will require querying both audit.2 (historical) and audit.3 (current) logs separately. Refer to important limitations for more information.
Common migration challenges¶
Challenge: Missing event names¶
Some event names from audit.2 may not have direct equivalents in audit.3. This typically occurs when:
- The event was deprecated or replaced by a new endpoint.
- The event was low-signal and deliberately not ported to audit.3.
Solution
- Verify the event is still actively being emitted in your current audit.2 logs.
- If it is, use the trace ID from the audit.2 log (trace_id in the audit.2 export dataset, traceId in audit.3) to find related audit logs in the audit.3 dataset that may be relevant.
- If you cannot find an equivalent, contact your Palantir Support team for assistance.
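The trace ID lookup described above can be sketched as a simple join. This assumes the audit.2 export stores the trace ID as trace_id and audit.3 stores it as traceId, as listed in the respective schema references. The sample records, event names, and helper names are hypothetical.

```python
from collections import defaultdict


def index_by_trace(audit3_logs):
    """Group audit.3 records by traceId, skipping records without one."""
    index = defaultdict(list)
    for log in audit3_logs:
        trace_id = log.get("traceId")
        if trace_id:
            index[trace_id].append(log)
    return index


def candidate_equivalents(audit2_logs, audit3_logs, event_name):
    """List audit.3 event names seen on the same traces as an audit.2 event name."""
    index = index_by_trace(audit3_logs)
    names = set()
    for log in audit2_logs:
        if log.get("name") == event_name and log.get("trace_id"):
            for match in index.get(log["trace_id"], []):
                if "name" in match:
                    names.add(match["name"])
    return sorted(names)


# Hypothetical sample records.
audit2_sample = [
    {"name": "OLD_EXPORT", "trace_id": "t1"},
    {"name": "OLD_EXPORT", "trace_id": "t2"},
    {"name": "OTHER_EVENT", "trace_id": "t3"},
]
audit3_sample = [
    {"name": "NEW_EXPORT_A", "traceId": "t1"},
    {"name": "NEW_EXPORT_B", "traceId": "t1"},
    {"name": "UNRELATED", "traceId": "t3"},
]

candidates = candidate_equivalents(audit2_sample, audit3_sample, "OLD_EXPORT")
```

The result is a list of candidates to review manually, since a single trace can contain several unrelated audit events.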
Challenge: Unpacking nested request/result fields¶
The structure of requestFields and resultFields can be complex and product-specific, differing between audit.2 and audit.3.
Solution
- If your analyses unpack request_params or result_params in audit.2 and this level of granular detail is important in your analysis, you must refactor this logic due to the new field structures in audit.3. Always check both requestFields and resultFields in audit.3, as information may have moved between the request and result fields compared to audit.2.
- Use the audit categories documentation to understand which fields are guaranteed for each category.
- Favor top-level fields (entities, users, categories, and so on) over parsing request/result fields when possible.
- Build defensive parsing logic that handles missing or unexpected fields gracefully.
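A minimal sketch of defensive parsing, assuming audit.3 records parsed into Python dictionaries: the helper checks requestFields first and then resultFields, and tolerates either container being missing or not a map. The helper name find_field and the key names datasetRid and fileCount are hypothetical and stand in for whatever product-specific keys your analysis needs.

```python
def find_field(log, key, default=None):
    """Look up key in requestFields, then resultFields, tolerating bad input."""
    for container_name in ("requestFields", "resultFields"):
        container = log.get(container_name)
        # Skip containers that are absent, null, or not a mapping.
        if isinstance(container, dict) and key in container:
            return container[key]
    return default


# Hypothetical records: the same key appears in different containers.
in_request = {"requestFields": {"datasetRid": "rid-1"}, "resultFields": None}
in_result = {"requestFields": {}, "resultFields": {"datasetRid": "rid-2"}}

rid_from_request = find_field(in_request, "datasetRid")
rid_from_result = find_field(in_result, "datasetRid")
missing_with_default = find_field(in_request, "fileCount", default=0)
```

Returning a default rather than raising keeps a single malformed record from failing an entire pipeline run, at the cost of needing a separate check for how often the default is used.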
Challenge: Continuity across cutover date¶
This challenge arises when you need to analyze a period that spans both historical audit.2 and current audit.3 data. Refer to important limitations for more information.
Solution
- For SIEM analysis: Maintain parallel ingestion during the transition, then rely on your SIEM's historical data retention for audit.2 logs.
- For Foundry analysis: Create a unified view that unions the audit.2 and audit.3 datasets.
- Use the type field to apply schema-specific parsing logic when needed (for example, categories for filtering in audit.3, and name/result_params in audit.2).
- Perfect continuity may not be possible for all metrics due to schema differences.
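The unified view can be sketched as a normalization step that branches on the type field. The sketch below assumes records parsed into Python dictionaries, that audit.2 categories (when present) sit in request_params under _category or _categories, and that _categories holds a list of strings. The common output shape, the helper names, and the sample records are hypothetical.

```python
def _as_list(value):
    """Coerce None, a single string, or an iterable into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def normalize(log):
    """Map an audit.2 or audit.3 record onto one common shape."""
    schema = log.get("type")
    if schema == "audit.3":
        categories = _as_list(log.get("categories"))
        trace_id = log.get("traceId")
    elif schema == "audit.2":
        params = log.get("request_params") or {}
        # Categories are optional in audit.2 and may be absent entirely.
        categories = _as_list(params.get("_categories")) or _as_list(params.get("_category"))
        trace_id = log.get("trace_id")
    else:
        raise ValueError(f"unexpected schema type: {schema!r}")
    return {
        "schema": schema,
        "time": log.get("time"),
        "uid": log.get("uid"),
        "name": log.get("name"),
        "categories": sorted(categories),
        "trace_id": trace_id,
    }


# Hypothetical records from either side of the cutover.
audit2_sample = [{"type": "audit.2", "time": "2025-11-20T08:00:00.000Z", "uid": "user-1",
                  "name": "EXPORT_DATASET", "request_params": {"_category": "dataExport"},
                  "trace_id": "t1"}]
audit3_sample = [{"type": "audit.3", "time": "2025-12-05T08:00:00.000Z", "uid": "user-1",
                  "name": "EXAMPLE_SERVICE_EXPORT", "categories": ["dataExport"],
                  "traceId": "t2"}]

unified = [normalize(log) for log in audit2_sample + audit3_sample]
exports = [row for row in unified if "dataExport" in row["categories"]]
```

Since audit.2 does not guarantee a category, a category filter over the unified view can undercount historical events; falling back to event names for the audit.2 portion may still be necessary.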
Additional resources¶
- Review the audit log categories documentation for category-based query patterns.
- Review the monitoring security audit logs documentation for analysis best practices.
- Consult the list-log-files and get-log-file-content audit API documentation for SIEM integration details.
- Contact your Palantir Support team if you have any product-specific migration questions.