WACP: Trail

Metadata

title: "WACP: Trail"
id: wacp-spec-trail
type: constituent-spec
tier: abstract
category: primitives
status: complete
created: 2026-02-24
lineage: PROTOCOL.md (wacp-v0.1)
protocol_sections:
  - §9
depends_on:
  - wacp-spec-clock
  - wacp-spec-identity
  - wacp-spec-roles
  - wacp-spec-workspace
  - wacp-spec-signal
  - wacp-spec-envelope
  - wacp-spec-checkpoint
  - wacp-spec-task
  - wacp-spec-integration
  - wacp-spec-human-highway
  - wacp-spec-user
authors:
  - Akil Abderrahim (Lead)
  - Claude Opus 4.6 (co-author)
tags: [wacp, trail, audit, observability, history, integrity, storage]

Table of Contents

  1. Purpose
  2. Trail Entry Schema
  3. Scopes
  4. Integrity Invariants
  5. Querying
  6. Observability Views
  7. Storage Model
  8. Compaction and Retention
  9. Event Registry
  10. Conformance Requirements
  11. Implementation Notes

1. Purpose

The trail is the protocol’s memory. AI agents have no persistent state after a run ends — their context windows close, their working memory is released, their reasoning is gone. What survives is the trail: the complete, ordered, immutable record of everything that happened. The trail is not a log. Logs are operational afterthoughts — write-behind, best-effort, grepped when something breaks. The trail is first-class: append-only, queryable, and authoritative. If an event is not in the trail, it did not happen. If it is in the trail, it is immutable.

Every other Protocol primitive produces trail entries. Workspaces record state transitions. Signals record emissions and deliveries. Envelopes record creation, delivery, rejection. Checkpoints record production and validation. Tasks record lifecycle changes. Integration records decisions. The Human Highway records every human action. This spec does not define those events — each primitive’s spec defines its own trail entries (Workspace spec §13, Signal spec §8, Envelope spec §10, Checkpoint spec §7, Task spec §8, Integration spec §8, Human Highway spec §8). This spec defines the container: the entry format, the scoping model, the integrity guarantees, the query interface, the storage lifecycle, and the consolidated event registry.

Three systems depend on the trail as their foundation. Recovery replays trail entries to reconstruct system state after failure — the trail is the single source of truth from which any workspace can be restored (PROTOCOL §10.2). Observability reads the trail to produce views — run overviews, workspace timelines, message flows, workflow comparisons. Security audits the trail to verify that every action was authorized and every actor is accountable — the hash chain makes tampering detectable (PROTOCOL §11.5). The trail serves all three because it captures the same thing: what happened, in what order, by whom.

2. Trail Entry Schema

Every protocol event produces exactly one trail entry. The entry is the atomic unit of the trail — indivisible, immutable once written, and self-describing. All entries share a common header; each event type adds a type-specific body.

trail_entry:
  id: string              # globally unique (Identity spec, rule 1)
  timestamp: datetime     # runtime-assigned (Clock spec, invariant 1)
  workspace: workspace_id # the workspace this event belongs to (null for system-level events)
  actor: string           # who caused this event (see actor semantics below)
  event_type: string      # the event type from the registry (§9)
  body: object            # type-specific payload (defined by each primitive's spec)
  prev_hash: string       # hash of the previous entry in the same scope (§4); null for the first entry in a scope

Field semantics:

  • id — globally unique, runtime-assigned, never reused (Identity spec, rules 1, 6). The entry ID is the trail’s primary key — every reference to a trail event uses this ID.
  • timestamp — assigned by the runtime at the moment the event occurs, not when the entry is written. Monotonic within a workspace (Clock spec, invariant 1). Across workspaces, timestamps provide partial ordering only (Clock spec, invariant 2) — two entries in different workspaces with the same timestamp are concurrent.
  • workspace — the workspace this event is associated with. Most events belong to exactly one workspace. System-level events (system lifecycle, capacity exhaustion) have workspace: null — they belong to the global trail but no specific workspace.
  • actor — identifies who caused the event. Three categories:
| Actor value | Meaning | Example events |
| --- | --- | --- |
| Role name (coordinator, worker, or derived) | An agent in that role caused the event | signal_emitted, checkpoint_created, envelope_created |
| protocol | The runtime or protocol rules caused the event | workspace_state_changed, budget_exceeded, envelope_rejected |
| user_id | A specific human identified by their user_id (User spec, §2) | gate_resolved, human_injection, escalation_resolved |

The actor field serves two purposes: it classifies the source (agent, runtime, or human) and, for human actions, identifies the specific person. Agent actors are recorded by role, not by agent ID — agent identity is tracked through the workspace binding. Human actors are recorded by user_id — the protocol carries the specific human identity because the trail must answer “what did user X do?” without application-layer joins. The system tokens "fallback" and "protocol" are reserved — any actor value that is not a role name or a system token is a user_id.

  • event_type — a string from the event registry (§9). The event type determines the schema of the body field. Unknown event types are rejected — the registry is closed within a protocol version.
  • body — the type-specific payload. Each primitive’s spec defines the body schema for its event types. The trail spec does not constrain body contents beyond requiring that they are structured objects (not opaque blobs). Bodies must be serializable and round-trip without loss.
  • prev_hash — cryptographic hash of the previous entry in the same scope (local trail or global trail). Forms the hash chain that makes the trail tamper-evident (§4). The first entry in a scope has prev_hash: null — it is the chain’s anchor.
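The schema and actor rule above can be sketched as a frozen record plus a classifier. This is an illustrative sketch, not a normative binding: `TrailEntry`, `classify_actor`, and the contents of `ROLE_NAMES` are assumptions (the spec allows derived roles, so a real runtime would supply its own role set).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional

# Assumed role set; derived roles would be added by the runtime.
ROLE_NAMES = {"coordinator", "worker", "observer"}
# Reserved system tokens named in this section.
SYSTEM_TOKENS = {"protocol", "fallback"}


@dataclass(frozen=True)  # frozen mirrors "immutable once written"
class TrailEntry:
    id: str
    timestamp: datetime
    workspace: Optional[str]   # None for system-level events
    actor: str
    event_type: str
    body: dict[str, Any]
    prev_hash: Optional[str]   # None for the first entry in a scope


def classify_actor(actor: str, roles: set[str] = ROLE_NAMES) -> str:
    """Classify an actor value as 'runtime', 'agent', or 'human'."""
    if actor in SYSTEM_TOKENS:
        return "runtime"
    if actor in roles:
        return "agent"
    # Anything that is neither a role name nor a system token is a user_id.
    return "human"
```

Note that `frozen=True` only blocks attribute reassignment; the `body` dict itself would still need to be treated as read-only by convention or copied on write.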

One event, one entry. Every protocol event produces exactly one trail entry. No event produces zero entries (the no-gaps invariant, §4). No event produces multiple entries — if an event has compound effects (e.g., a workspace state change that also triggers a budget check), each effect is a separate event with its own entry. The trail is a sequence of atomic facts, not a sequence of compound narratives.

Write-ahead. The trail entry is written before the event takes effect. A workspace state transition is recorded in the trail before the state machine is updated. An envelope delivery is recorded before the envelope is placed in the inbox. This ensures that if the runtime crashes mid-operation, the trail reflects the intent — recovery can determine whether the operation completed or needs to be replayed (PROTOCOL §10.2).
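A minimal sketch of the write-ahead ordering, assuming the trail write is a callable that raises on failure. `record_then_apply` and `TrailWriteError` are hypothetical names; retry policy is left out.

```python
from typing import Callable


class TrailWriteError(Exception):
    """Raised when the trail entry could not be written."""


def record_then_apply(
    write: Callable[[dict], None],
    entry: dict,
    effect: Callable[[], None],
) -> None:
    """Write the trail entry first; apply the effect only if the write succeeded."""
    try:
        write(entry)
    except Exception as exc:
        # The event must not proceed when the trail write fails.
        raise TrailWriteError("trail write failed; event does not proceed") from exc
    effect()


# Usage: a state transition is recorded before the state is updated.
trail: list[dict] = []
state = {"ws-task-03": "idle"}
record_then_apply(
    trail.append,
    {"event_type": "workspace_state_changed", "workspace": "ws-task-03",
     "body": {"to_state": "active"}},
    lambda: state.update({"ws-task-03": "active"}),
)
```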

3. Scopes

The trail exists at two scopes: local and global. The local trail is a workspace’s own history — component 7 of the internal model (Workspace spec, §2). The global trail is the system’s complete history — the union of all local trails, ordered by timestamp. The global trail is the master record; local trails are projections of it.

Local trail. Every workspace has a local trail — the ordered sequence of trail entries where workspace equals that workspace’s ID. The local trail captures everything that happened to and within the workspace: state transitions, envelope deliveries, signal emissions, checkpoint creations, budget events, migration events, suspension events, visibility grants. The agent can read its own local trail to understand its history (Workspace spec, §2, component 7). The local trail is strictly ordered — entries are monotonic by timestamp with no duplicates (Clock spec, invariant 1). This total ordering within a workspace enables deterministic replay.

Global trail. The global trail is the union of all local trails plus system-level entries (workspace: null). It is ordered by timestamp, but the ordering is partial — entries from different workspaces with the same timestamp are concurrent (Clock spec, invariant 2). The global trail is not a separate data structure that duplicates local trails. It is the master: local trails are projections derived by filtering on workspace. Implementations may store the global trail as a single sequence and derive local views at query time, or maintain local trails independently and reconstruct the global view by merging — both are valid as long as the projection relationship holds.

Access rules. Trail access follows the role hierarchy:

| Role | Local trail access | Global trail access |
| --- | --- | --- |
| Worker | Own workspace’s local trail only | No access |
| Observer | Own workspace’s local trail, plus observed workspaces’ local trails (per visibility set) | No access |
| Coordinator | All local trails in the workspace tree | Full global trail |

The coordinator reads the global trail to understand the system’s state. Workers read their own local trail to understand their history — they cannot read other workspaces’ trails unless granted visibility (Workspace spec, §8). Observers read the trails of workspaces in their visibility set. These access rules are enforced by the runtime; unauthorized trail reads are rejected and recorded as trail entries themselves (a trail_access_denied event).

System-level entries. Some events do not belong to any workspace: system initialization, forced shutdown, capacity exhaustion at the system level. These entries have workspace: null and appear only in the global trail — they have no local trail to belong to. System-level entries use actor: protocol because the runtime, not an agent, caused them.

The projection invariant. For any workspace W, the local trail of W equals the global trail filtered to entries where workspace == W.id, preserving order. This is not an optimization — it is a correctness requirement. If the local trail and the global trail disagree about what happened to a workspace, the system is in an inconsistent state. Recovery validates this invariant (PROTOCOL §10.2).
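The invariant can be checked mechanically. The sketch below assumes entries are plain dictionaries with a `workspace` key and that both trails are already in timestamp order; the function names are illustrative.

```python
def local_projection(global_trail: list[dict], workspace_id: str) -> list[dict]:
    """Filter the global trail to one workspace, preserving order."""
    return [e for e in global_trail if e["workspace"] == workspace_id]


def projection_holds(
    global_trail: list[dict], local_trail: list[dict], workspace_id: str
) -> bool:
    """True when the stored local trail equals the projection of the global trail."""
    return local_projection(global_trail, workspace_id) == local_trail


global_trail = [
    {"id": "e1", "workspace": None, "event_type": "system_degraded"},
    {"id": "e2", "workspace": "ws-a", "event_type": "workspace_created"},
    {"id": "e3", "workspace": "ws-b", "event_type": "workspace_created"},
    {"id": "e4", "workspace": "ws-a", "event_type": "checkpoint_created"},
]
local_a = [global_trail[1], global_trail[3]]
```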

4. Integrity Invariants

The trail’s value depends on trust. If entries can be deleted, the trail cannot prove what happened. If entries can be modified, the trail cannot prove what was recorded. If events can occur without producing entries, the trail cannot prove what didn’t happen. Three invariants protect these properties.

Invariant 1: Append-only. No trail entry may be deleted within the retention window (§8). The trail only grows. Entries move between storage tiers (hot → warm → cold) but are never removed until the retention policy for their tier permits it. Even then, deletion is of the physical storage — a trail_compacted summary entry proves the original entries existed. The append-only property is what makes the trail a reliable audit record. If deletion were permitted, recovery could not trust the trail to be complete, and security audits could not trust the trail to be honest.

Invariant 2: Immutable. No trail entry may be modified after creation. The id, timestamp, workspace, actor, event_type, body, and prev_hash are fixed at write time. There is no update operation — only append. If a previous entry was incorrect (e.g., a resource meter discrepancy discovered later), the correction is recorded as a new entry that references the original, not as a modification of the original. The trail is a court record, not a wiki.

Invariant 3: No gaps. Every protocol event produces exactly one trail entry. If an event occurs and no entry is recorded, the system is in violation. The no-gaps invariant is enforced by the write-ahead rule (§2): the trail entry is written before the event takes effect. If the write fails, the event does not proceed — the runtime must either retry the write or fail the operation. A gap in the trail means either the runtime has a bug or the storage layer has lost data — both are system failures that trigger recovery (PROTOCOL §10.2).

Hash chain. Each trail entry includes a prev_hash field — the cryptographic hash of the previous entry in the same scope. Local trails have their own hash chains (each entry hashes its predecessor within that workspace). The global trail has a separate chain (each entry hashes its global predecessor). The first entry in each scope has prev_hash: null.

The hash chain provides tamper evidence: modifying or removing any entry breaks the chain from that point forward. Verification is a linear scan — compute the hash of each entry and compare it to the next entry’s prev_hash. A mismatch at entry N means entries at or before N have been tampered with. The hash chain does not prevent tampering — it makes tampering detectable.

Hash algorithm. The protocol does not mandate a specific hash algorithm. The runtime selects an algorithm at initialization and records it in the first trail entry’s body. All entries in a run use the same algorithm. Implementations should use a cryptographically strong hash (SHA-256 or better). The choice is recorded, not prescribed, so that the protocol can outlive any single algorithm.
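A minimal sketch of chain construction and the linear verification scan, assuming SHA-256 and a canonical JSON serialization (sorted keys, compact separators). The spec fixes neither the serialization nor exactly which fields are hashed; here the whole entry, including its own `prev_hash`, is hashed. `entry_hash`, `append_entry`, and `first_break` are illustrative names.

```python
import hashlib
import json
from typing import Optional


def entry_hash(entry: dict) -> str:
    """SHA-256 over an assumed canonical JSON form of the entry."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], fields: dict) -> dict:
    """Append an entry whose prev_hash links to the current chain tail."""
    prev = entry_hash(chain[-1]) if chain else None  # None anchors the chain
    entry = {**fields, "prev_hash": prev}
    chain.append(entry)
    return entry


def first_break(chain: list[dict]) -> Optional[int]:
    """Index of the first entry whose prev_hash does not match, or None if intact."""
    expected = None
    for index, entry in enumerate(chain):
        if entry["prev_hash"] != expected:
            return index
        expected = entry_hash(entry)
    return None


chain: list[dict] = []
for n, event in enumerate(["workspace_created", "checkpoint_created", "workspace_state_changed"]):
    append_entry(chain, {"id": f"e{n}", "event_type": event, "body": {}})
```

Modifying the body of the second entry leaves its own link intact but invalidates the third entry's `prev_hash`, so the scan reports the break at the following index.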

Cross-scope anchoring. Each local trail’s chain is independent, but the global trail’s chain interleaves entries from all workspaces. This means the global chain implicitly orders and anchors local entries — if a local entry appears in the global chain at position N, it is anchored to the global chain’s integrity. Tampering with a local entry would break both the local chain and the global chain.

5. Querying

The trail is queryable. Every trail entry is a structured record with indexed fields — the trail is a database, not a file. Queries are read-only: they project, filter, and aggregate trail entries without modifying them.

Query dimensions. A trail query filters entries along one or more dimensions:

| Dimension | Field | Example |
| --- | --- | --- |
| Workspace | workspace | All entries for ws-task-03 |
| Actor | actor | All entries by user X (actor == user_id) or all agent entries (actor == coordinator) |
| Event type | event_type | All checkpoint_created entries |
| Time range | timestamp | All entries between T1 and T2 |
| Body field | body.* | All workspace_state_changed where to_state == failed |

Compound queries. Dimensions combine with AND logic. “All envelope_delivered events for workspace ws-task-03 between T1 and T2” filters on event type, workspace, and time range simultaneously. OR logic is expressed as multiple queries — the protocol does not define a query language, only the dimensions that must be filterable.
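One way to realize AND-combined filtering over in-memory entries is sketched below. The `query` function and its keyword names are assumptions, the time range is treated as inclusive, and body filters are limited to top-level equality. Timestamps are shown as integers for brevity; any comparable type works.

```python
from typing import Any, Optional


def query(
    trail: list[dict],
    *,
    workspace: Optional[str] = None,
    actor: Optional[str] = None,
    event_type: Optional[str] = None,
    start: Any = None,
    end: Any = None,
    body: Optional[dict] = None,
) -> list[dict]:
    """Return entries matching every supplied dimension (AND logic), in trail order."""
    body = body or {}
    matches = []
    for e in trail:
        if workspace is not None and e["workspace"] != workspace:
            continue
        if actor is not None and e["actor"] != actor:
            continue
        if event_type is not None and e["event_type"] != event_type:
            continue
        if start is not None and e["timestamp"] < start:
            continue
        if end is not None and e["timestamp"] > end:
            continue
        if any(e["body"].get(k) != v for k, v in body.items()):
            continue
        matches.append(e)
    return matches


sample = [
    {"timestamp": 1, "workspace": "ws-task-03", "actor": "protocol",
     "event_type": "envelope_delivered", "body": {}},
    {"timestamp": 2, "workspace": "ws-task-04", "actor": "protocol",
     "event_type": "envelope_delivered", "body": {}},
    {"timestamp": 3, "workspace": "ws-task-03", "actor": "protocol",
     "event_type": "workspace_state_changed", "body": {"to_state": "failed"}},
    {"timestamp": 9, "workspace": "ws-task-03", "actor": "protocol",
     "event_type": "envelope_delivered", "body": {}},
]
delivered = query(sample, workspace="ws-task-03",
                  event_type="envelope_delivered", start=1, end=5)
```

OR logic is obtained by running several such queries and merging the results. Because `None` means "no filter" here, selecting system-level entries (workspace: null) would need a separate sentinel.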

Aggregate queries. Beyond filtering, the trail supports aggregation for analytical use:

  • Count — how many entries match a filter. “How many checkpoints did workspace W produce?” “How many envelopes were rejected?”
  • Group — partition matching entries by a dimension. “Count entries by event type for workspace W.” “Count failures by workspace for the entire run.”
  • Sum — aggregate numeric body fields. “Total tokens consumed across all workspaces in group G” (summing resource_usage.tokens from checkpoint_created entries).
  • Timeline — entries ordered by timestamp for a filter. This is the default — filtering without aggregation returns a timeline.

Aggregate queries are the foundation of comparative analysis. Comparing two workflow configurations means running the same aggregate queries against two runs’ trails and comparing the results: total cost, total time, failure count, checkpoint count, human intervention count. The trail makes workflows measurable.
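The count, group, and sum aggregations can be sketched over an already-filtered list of entries. The helper names are illustrative, and the nested `resource_usage.tokens` path follows the example given for Sum above; entries missing the field are treated as contributing zero, which is an assumption.

```python
from collections import Counter


def count(entries: list[dict]) -> int:
    """Number of entries matching a filter."""
    return len(entries)


def group_count(entries: list[dict], field: str) -> Counter:
    """Count entries partitioned by a header field such as event_type or workspace."""
    return Counter(e[field] for e in entries)


def sum_body(entries: list[dict], path: tuple[str, ...]) -> float:
    """Sum a numeric body field addressed by a key path; missing fields count as 0."""
    total = 0
    for e in entries:
        value = e["body"]
        for key in path:
            value = value.get(key, {}) if isinstance(value, dict) else {}
        if isinstance(value, (int, float)):
            total += value
    return total


entries = [
    {"workspace": "ws-a", "event_type": "checkpoint_created",
     "body": {"resource_usage": {"tokens": 1200}}},
    {"workspace": "ws-b", "event_type": "checkpoint_created",
     "body": {"resource_usage": {"tokens": 800}}},
    {"workspace": "ws-a", "event_type": "envelope_rejected", "body": {}},
]
total_tokens = sum_body(entries, ("resource_usage", "tokens"))
```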

Scope-aware querying. Queries respect the access rules (§3). A worker querying its own local trail sees all entries. A worker querying a peer’s trail sees nothing (unless visibility is granted). The coordinator queries the global trail and sees everything. The runtime enforces scope restrictions at query time — a query that requests entries outside the caller’s access scope returns an empty result, not an error. This is consistent with visibility semantics elsewhere in the protocol: you simply do not see what you do not have access to.

Query interface. The protocol defines what must be queryable, not how queries are expressed. Implementations may use SQL, a custom query API, in-memory filtering, or any mechanism that supports the dimensions and aggregations above. The protocol requires only that the trail store can answer these queries efficiently enough to support real-time observability — the coordinator must be able to query the trail during a run, not only after it.

6. Observability Views

Views are structured projections of the trail designed for specific audiences and questions. They are not separate data structures — each view is a query pattern applied to the trail. The protocol defines five standard views that implementations should support.

6.1 Run Overview. A high-level snapshot of the entire run. Shows every workspace’s current state, grouped by state category (productive, suspended, resolution, terminal). Answers: “What is happening right now?” and “How far along is the run?”

Constructed from: workspace_created entries (to enumerate workspaces), the latest workspace_state_changed entry per workspace (to determine current state), and task_status_changed entries (to show task progress as X of Y completed).
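The workspace portion of this construction can be sketched as a single pass over the trail. It assumes `workspace_state_changed` bodies carry a `to_state` field (as in the query example in §5) and leaves a workspace's state as `None` until its first recorded transition; task progress is omitted because the `task_status_changed` body is defined elsewhere.

```python
def run_overview(trail: list[dict]) -> dict:
    """Map each workspace ID to its latest recorded state."""
    current: dict = {}
    # sorted() is stable, so entries with equal timestamps keep trail order.
    for e in sorted(trail, key=lambda e: e["timestamp"]):
        ws = e["workspace"]
        if e["event_type"] == "workspace_created":
            current.setdefault(ws, None)
        elif e["event_type"] == "workspace_state_changed":
            current[ws] = e["body"]["to_state"]
    return current


trail = [
    {"timestamp": 1, "workspace": "ws-a", "event_type": "workspace_created", "body": {}},
    {"timestamp": 2, "workspace": "ws-b", "event_type": "workspace_created", "body": {}},
    {"timestamp": 3, "workspace": "ws-a", "event_type": "workspace_state_changed",
     "body": {"to_state": "active"}},
    {"timestamp": 4, "workspace": "ws-a", "event_type": "checkpoint_created", "body": {}},
    {"timestamp": 5, "workspace": "ws-a", "event_type": "workspace_state_changed",
     "body": {"to_state": "closed"}},
]
overview = run_overview(trail)
```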

The run overview is the coordinator’s dashboard — the first thing read when assessing system health. It is also the human highway’s primary visibility surface (Human Highway spec, §2.1).

6.2 Workspace Timeline. A chronological view of a single workspace’s activity. Shows every trail entry for that workspace in timestamp order — state transitions, envelope deliveries, signal emissions, checkpoint creations, budget events. Answers: “What happened in this workspace, and in what order?”

Constructed from: the workspace’s local trail, unfiltered and unmodified. The workspace timeline is the simplest view — it is the local trail itself, presented chronologically. It is also the recovery target: replaying a workspace timeline reconstructs the workspace’s state (PROTOCOL §10.2).

6.3 Message Flow. A directed view of communication between workspaces. Shows envelopes and signals as edges between workspace nodes. Answers: “Who is talking to whom, and what are they saying?”

Constructed from: envelope_created and envelope_delivered entries (showing message flow), signal_emitted and signal_delivered entries (showing notification flow). Each edge carries the envelope type or signal type as a label. The message flow view reveals the communication topology — which workspaces are active collaborators, which are isolated, where bottlenecks form.

6.4 Comparison View. A side-by-side analysis of two or more runs. Shows the same aggregate metrics for each run, enabling direct comparison. Answers: “Which workflow configuration produced better results?”

Constructed from: aggregate queries (§5) applied to each run’s trail. Standard comparison metrics include: total cost, total wall time, workspace count, failure count, retry count, checkpoint count, human intervention count, and per-task completion time. The comparison view is the protocol’s core instrument for evaluating workflows — the finding that “review adds ~80% time but catches silent failures” was produced by this view.

Comparison requires that runs use the same task graph (or comparable graphs) so that metrics are meaningful. Comparing a 3-task run against a 30-task run tells you nothing about workflow quality. The protocol does not enforce comparability — it provides the instrument. The coordinator (or human) decides what to compare.

6.5 Highway Activity View. All human interactions extracted from the trail. Shows gate events (triggered, resolved, timed out), injections, and escalations in chronological order. Answers: “What did humans do, and when?”

Constructed from: all trail entries where actor is a user_id (not a role name or system token), plus gate events (gate_triggered, gate_resolved, gate_timeout), injection events (human_injection), and escalation events (escalation_received, escalation_resolved, escalation_timeout). The highway activity view gives the human supervisor a record of their own interventions — and in multi-user deployments, can be filtered by user_id to show a single user’s actions.

7. Storage Model

The trail grows monotonically. Every protocol event adds an entry, and entries are never deleted within the retention window. Without a storage model, the trail would consume unbounded resources. The three-tier storage model manages this growth by matching access latency to access frequency — recent entries are fast to read, older entries are progressively cheaper to store.

Three tiers:

| Tier | Contents | Latency | Durability |
| --- | --- | --- | --- |
| Hot | Entries for active workspaces (non-terminal state) | Sub-millisecond | Must survive runtime restart |
| Warm | Entries for closed or failed workspaces within the retention window | Milliseconds to seconds | Must survive runtime restart |
| Cold | Entries archived beyond the retention window or compacted from warm | Seconds to minutes | Must survive storage failure (replicated or backed up) |

Hot tier. Entries for workspaces in idle, active, blocked, migrating, suspended, integrating, or conflicted state. These entries are actively read — by the coordinator for observability, by agents for their local trail, by the runtime for state reconstruction. Sub-millisecond read latency is required because trail queries are in the critical path: the coordinator reads the trail to make dispatch decisions, and the runtime reads the trail to enforce the write-ahead rule. When a workspace reaches a terminal state (closed or failed), its entries are candidates for migration to warm.

Warm tier. Entries for terminal workspaces that are still within the retention window. These entries are read less frequently — during post-run analysis, comparison queries, or recovery after a delayed failure. Millisecond-to-second latency is acceptable. The warm tier is the trail’s primary storage — most entries live here for most of their lifetime, because most workspaces are terminal by the time the run completes.

Cold tier. Entries that have been archived beyond the retention window or compacted from warm storage. These entries are rarely read — only for historical analysis, compliance audits, or long-term comparisons across runs. Seconds-to-minutes latency is acceptable. Cold storage may use compressed formats, object storage, or offline media. The protocol requires only that cold entries remain queryable through the standard query interface (§5), even if queries are slow.

Tier migration. Entries move from hot to warm when their workspace reaches a terminal state. Entries move from warm to cold when compaction runs (§8) or when the retention window expires. Entries never move backward — cold entries do not return to warm, warm entries do not return to hot. Tier migration is a storage optimization; it does not change the entries themselves. An entry in cold storage has the same id, timestamp, body, and prev_hash as when it was in hot storage.
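The tier rules can be sketched as two small functions. The names are illustrative, and allowing only the two moves named above (hot to warm, warm to cold) is this sketch's reading of the text.

```python
TERMINAL_STATES = {"closed", "failed"}
TIER_ORDER = ["hot", "warm", "cold"]


def tier_for(workspace_state: str, archived: bool = False) -> str:
    """Pick the tier for a workspace's entries from its state and archive status."""
    if archived:
        return "cold"
    return "warm" if workspace_state in TERMINAL_STATES else "hot"


def can_migrate(src: str, dst: str) -> bool:
    """Entries only move one step forward: hot -> warm or warm -> cold."""
    return TIER_ORDER.index(dst) == TIER_ORDER.index(src) + 1
```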

Durability requirements. Hot and warm tiers must survive a runtime restart — if the runtime crashes and recovers, the trail must still be complete. This means hot and warm storage must be durable (written to persistent storage, not only held in memory). Cold storage must survive storage failure — replication or backup is required. The trail is the protocol’s single source of truth; losing trail data is a system failure.

8. Compaction and Retention

The trail is append-only but not infinite. Compaction reduces the storage footprint of the warm tier by archiving non-essential entries to cold storage while preserving essential entries in place. Retention policies govern how long entries persist in each tier before they may be deleted entirely.

Compaction. Compaction operates on warm-tier entries — entries for terminal workspaces. It separates essential entries from archivable entries:

Essential entries (preserved in warm, never compacted):

| Category | Event types |
| --- | --- |
| State transitions | workspace_created, workspace_state_changed (terminal) |
| Checkpoints | checkpoint_created |
| Integration decisions | integration_completed, integration_aborted |
| Conflicts | conflict_detected, conflict_resolved |
| Human decisions | gate_resolved, human_injection, escalation_resolved |
| Task milestones | task_created, task_completed, task_failed |
| Failures | Any entry where to_state == failed or reason indicates error |

Archivable entries (moved to cold during compaction):

| Category | Event types |
| --- | --- |
| Intermediate state changes | workspace_state_changed (non-terminal transitions) |
| Routine deliveries | envelope_delivered, signal_delivered |
| Budget monitoring | budget_warning, liveness_warning |
| Operational events | priority_changed, visibility_granted |

The distinction is recoverability: essential entries are sufficient to reconstruct the workspace’s outcome — what was produced, whether it succeeded, what decisions were made. Archivable entries capture the journey — how the workspace got there. Both are valuable; essential entries are more valuable per byte.

Compaction summary. When compaction runs, it produces a trail_compacted entry in the warm tier:

trail_compacted:
  workspace: workspace_id
  entries_archived: integer        # count of entries moved to cold
  entries_preserved: integer       # count of entries kept in warm
  archive_range:                   # timestamp range of archived entries
    from: datetime
    to: datetime
  archive_hash: string             # hash of the archived entries (integrity proof)

The compaction summary proves that the archived entries existed — it records the count, time range, and hash. The no-gaps invariant is preserved: the entries still exist in cold storage, and the summary in warm storage proves their existence. Compaction does not delete — it relocates.
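A sketch of one compaction pass that produces the summary body above. Several choices here are assumptions rather than spec requirements: only the event types listed as archivable are moved (anything unlisted is conservatively preserved), the failure-by-reason rule is not modelled, `archive_hash` is SHA-256 over a canonical JSON dump of the archived entries, and timestamps are ISO-8601 strings supplied in trail order.

```python
import hashlib
import json

ARCHIVABLE_TYPES = {
    "envelope_delivered", "signal_delivered",
    "budget_warning", "liveness_warning",
    "priority_changed", "visibility_granted",
}
TERMINAL_STATES = {"closed", "failed"}


def is_archivable(entry: dict) -> bool:
    """Non-terminal state changes and listed routine events are archivable."""
    if entry["event_type"] == "workspace_state_changed":
        return entry["body"].get("to_state") not in TERMINAL_STATES
    return entry["event_type"] in ARCHIVABLE_TYPES


def compact(workspace_id: str, warm_entries: list[dict]):
    """Split entries into (preserved, archived) and build a trail_compacted body."""
    archived = [e for e in warm_entries if is_archivable(e)]
    preserved = [e for e in warm_entries if not is_archivable(e)]
    if not archived:
        return preserved, archived, None
    blob = json.dumps(archived, sort_keys=True, separators=(",", ":")).encode("utf-8")
    summary = {
        "workspace": workspace_id,
        "entries_archived": len(archived),
        "entries_preserved": len(preserved),
        "archive_range": {"from": archived[0]["timestamp"],
                          "to": archived[-1]["timestamp"]},
        "archive_hash": hashlib.sha256(blob).hexdigest(),
    }
    return preserved, archived, summary


warm = [
    {"timestamp": "2026-02-24T10:00:00Z", "event_type": "workspace_created", "body": {}},
    {"timestamp": "2026-02-24T10:00:01Z", "event_type": "workspace_state_changed",
     "body": {"to_state": "active"}},
    {"timestamp": "2026-02-24T10:00:02Z", "event_type": "envelope_delivered", "body": {}},
    {"timestamp": "2026-02-24T10:00:03Z", "event_type": "checkpoint_created", "body": {}},
    {"timestamp": "2026-02-24T10:00:04Z", "event_type": "workspace_state_changed",
     "body": {"to_state": "closed"}},
]
preserved, archived, summary = compact("ws-task-03", warm)
```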

Retention policies. Retention governs how long entries persist before they may be deleted:

| Tier | Retention | Deletion permitted? |
| --- | --- | --- |
| Hot | As long as workspace is active | Never; entries move to warm and are not deleted |
| Warm | Configurable per run (recommended: run duration + analysis window) | After retention expires, entries may be moved to cold |
| Cold | Configurable per deployment (recommended: months to years) | After retention expires, entries may be permanently deleted |

Hot tier entries are never deleted. They move to warm when the workspace terminates. Deleting hot entries would violate the write-ahead rule — the runtime needs hot entries for in-flight operations.

Warm tier retention should be long enough for post-run analysis. A run that completes in 10 minutes may need warm retention of hours (for immediate analysis) or days (for batch comparison across runs). The protocol recommends warm retention of at least the run duration plus an implementation-defined analysis window.

Cold tier retention is a deployment decision. Compliance requirements may demand years of retention. Cost-sensitive deployments may delete after weeks. The protocol requires only that cold entries remain queryable until their retention expires — once expired, deletion is permitted (not required).

Snapshots. A snapshot is a point-in-time immutable copy of the trail — all entries up to a given timestamp, frozen as a single artifact. Snapshots serve two purposes: archival (a snapshot is the boundary between “retainable” and “deletable”) and replay (a snapshot is a known-good starting point for recovery). Snapshots are taken at the warm tier — they capture the post-compaction state of a completed run. A snapshot’s integrity is verifiable through the hash chain — the chain from the first entry to the snapshot boundary must be unbroken.

9. Event Registry

The event registry is the consolidated index of every trail event type defined across the Protocol tier. Each primitive’s spec defines its own trail events — this registry collects them into a single reference. The registry is closed within a protocol version: only event types listed here are valid. Unknown event types are rejected by the runtime.

Workspace events (Workspace spec, §13) — 22 events:

| Event | Category |
| --- | --- |
| workspace_created | Lifecycle |
| workspace_state_changed | Lifecycle |
| workspace_rejected | Lifecycle |
| budget_warning | Resource |
| budget_exceeded | Resource |
| budget_modified | Resource |
| liveness_warning | Resource |
| priority_changed | Resource |
| visibility_granted | Visibility |
| batch_abort | Group |
| batch_priority_changed | Group |
| migration_started | Migration |
| migration_completed | Migration |
| migration_failed | Migration |
| suspension_started | Suspension |
| suspension_resumed | Suspension |
| graceful_termination_initiated | Graceful termination |
| graceful_termination_expired | Graceful termination |
| conflict_detected | Integration |
| conflict_resolved | Integration |
| workspace_ownership_transferred | Ownership |
| workspace_reparented | Ownership |

User events (User spec, §8) — 12 events:

| Event | Category |
| --- | --- |
| user_created | User lifecycle |
| authentication_succeeded | Authentication |
| authentication_failed | Authentication |
| user_suspended | User lifecycle |
| user_resumed | User lifecycle |
| user_blocked | User lifecycle |
| user_unblocked | User lifecycle |
| user_deactivated | User lifecycle |
| user_reactivated | User lifecycle |
| capability_granted | Capability lifecycle |
| capability_revoked | Capability lifecycle |
| capability_denied | Capability enforcement |

Signal events (Signal spec, §8) — 2 events:

| Event | Category |
| --- | --- |
| signal_emitted | Signal lifecycle |
| signal_delivered | Signal lifecycle |

Envelope events (Envelope spec, §10) — 9 events:

| Event | Category |
| --- | --- |
| envelope_created | Envelope lifecycle |
| envelope_delivered | Envelope lifecycle |
| envelope_rejected | Envelope lifecycle |
| envelope_undeliverable | Envelope lifecycle |
| envelope_redelivered | Envelope lifecycle |
| port_right_created | Port right lifecycle |
| port_right_transferred | Port right lifecycle |
| port_right_revoked | Port right lifecycle |
| port_right_consumed | Port right lifecycle |

Checkpoint events (Checkpoint spec, §7) — 3 events:

| Event | Category |
| --- | --- |
| checkpoint_created | Checkpoint lifecycle |
| checkpoint_rejected | Checkpoint lifecycle |
| resource_discrepancy | Checkpoint lifecycle |

Task events (Task spec, §8) — 7 events:

| Event | Category |
| --- | --- |
| task_created | Task lifecycle |
| task_approved | Task lifecycle |
| task_assigned | Task lifecycle |
| task_status_changed | Task lifecycle |
| task_completed | Task lifecycle |
| task_failed | Task lifecycle |
| graph_created | Task lifecycle |

Integration events (Integration spec, §8) — 3 events:

| Event | Category |
| --- | --- |
| integration_started | Integration lifecycle |
| integration_completed | Integration lifecycle |
| integration_aborted | Integration lifecycle |

Human Highway events (Human Highway spec, §8) — 8 events:

| Event | Category |
| --- | --- |
| gate_triggered | Gate |
| gate_resolved | Gate |
| gate_timeout | Gate |
| gate_reentry_blocked | Gate |
| human_injection | Injection |
| escalation_received | Escalation |
| escalation_resolved | Escalation |
| escalation_timeout | Escalation |

Recovery events (Recovery spec, §4) — 2 events:

| Event | Category |
| --- | --- |
| system_degraded | System failure |
| recovery_completed | Recovery lifecycle |

Security events (Security spec, §9) — 1 event:

| Event | Category |
| --- | --- |
| integrity_violation | Integrity |

Note: authentication_failed is listed under User events (above) since credential verification is the authentication boundary’s concern. Its canonical schema is defined in the Security spec §9, which covers all entity types (agents, coordinators, and users).

Trail-own events — 3 events:

| Event | Category |
| --- | --- |
| trail_compacted | Storage lifecycle |
| trail_access_denied | Access control |
| trail_snapshot_created | Storage lifecycle |

Total: 72 unique event types across 11 sources (8 primitive specs + 2 mechanism specs + trail-own). The conflict_detected and conflict_resolved events appear in both Workspace and Integration specs — they are the same events, defined in Workspace §13 (where the state transition occurs) and referenced in Integration §8 (where the cause originates). The workspace_ownership_transferred and workspace_reparented events appear in both Workspace §13 and User §8 — they are workspace events triggered by ownership operations; listed under Workspace, not duplicated under User. The authentication_failed event appears in both User §8 and Security §9 — the User spec lists it; the Security spec defines the canonical schema. The port right events are defined in Envelope spec §10 alongside envelope events. The registry counts each once.
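The total can be checked against the per-source counts stated in the headings of this section. The dictionary below is only a tally of those headings, not a registry implementation.

```python
# Event counts per source, as stated in the section headings above.
EVENTS_PER_SOURCE = {
    "Workspace": 22,
    "User": 12,
    "Signal": 2,
    "Envelope": 9,
    "Checkpoint": 3,
    "Task": 7,
    "Integration": 3,
    "Human Highway": 8,
    "Recovery": 2,
    "Security": 1,
    "Trail-own": 3,
}

total_events = sum(EVENTS_PER_SOURCE.values())
source_count = len(EVENTS_PER_SOURCE)
print(total_events, source_count)  # 72 11
```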

Registry growth. New event types are added only through new protocol versions or new specs. The Infrastructure, System, and Runtime tiers will add their own trail events when those specs are drafted — the registry will grow accordingly. Within a protocol version, the registry is frozen.

10. Conformance Requirements

| Requirement | Level | Description |
| --- | --- | --- |
| Trail entry schema | Core | Runtime MUST produce trail entries conforming to the schema defined in §2 for every protocol event. |
| One event one entry | Core | Every protocol event MUST produce exactly one trail entry. No event may produce zero or multiple entries. |
| Write-ahead | Core | Trail entries MUST be written before the corresponding event takes effect. If the write fails, the event MUST NOT proceed. |
| Immutability | Core | Trail entries MUST NOT be modified after creation. No update operation exists. |
| Append-only | Core | Trail entries MUST NOT be deleted within the retention window. |
| Hash chain | Core | Every trail entry MUST include a prev_hash linking to the previous entry in its scope. The runtime MUST validate chain integrity. |
| Local trail projection | Core | For any workspace W, the local trail MUST equal the global trail filtered to entries where workspace == W.id, preserving order. |
| Scope-aware access | Core | Runtime MUST enforce trail access rules (§3). Workers read own local trail only. Observers read per visibility set. Coordinator reads global trail. |
| Global trail ordering | Core | The global trail MUST order entries by timestamp. Entries with equal timestamps from different workspaces are concurrent; no artificial ordering is imposed. |
| Query dimensions | Standard | Trail store MUST support filtering by workspace, actor, event type, time range, and body fields as defined in §5. |
| Compound queries | Standard | Trail store MUST support AND-combination of multiple query dimensions. |
| Aggregate queries | Standard | Trail store MUST support count, group, and sum aggregations over query results. |
| Real-time queryability | Standard | Trail store MUST be queryable during a run, not only after completion. Hot-tier entries MUST be available for query within the same run. |
| Three-tier storage | Standard | Runtime MUST implement hot, warm, and cold storage tiers as defined in §7 with the specified latency and durability requirements. |
| Tier migration | Standard | Runtime MUST migrate entries from hot to warm when workspaces reach terminal state. Entries MUST NOT move backward between tiers. |
| Hot and warm durability | Standard | Hot and warm tier entries MUST survive runtime restart. |
| Compaction | Standard | Runtime MUST support compaction of warm-tier entries, preserving essential entries and producing trail_compacted summary entries. |
| Retention policies | Standard | Runtime MUST support configurable retention windows for warm and cold tiers. |
| Observability views | Full | Runtime SHOULD support the five standard observability views defined in §6. |
| Snapshots | Full | Runtime SHOULD support point-in-time trail snapshots with hash-chain-verifiable integrity. |
| Cold tier queryability | Full | Cold-tier entries SHOULD remain queryable through the standard query interface. |
| User attribution on highway events | Standard | Human-originated trail entries MUST carry the acting human’s user_id in the actor field (User spec, §2). Trail queries MUST support filtering by specific user_id. |
| Cross-run comparison | Full | Trail store SHOULD support aggregate queries across multiple runs for workflow comparison. |
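The local trail projection requirement lends itself to a mechanical check. The following is a non-normative sketch, assuming trail entries are plain dictionaries with a `workspace` key and that trails are held as in-memory lists; the function names `project_local` and `projection_violations` are hypothetical and do not appear elsewhere in the protocol.

```python
def project_local(global_trail, workspace_id):
    """Filter the global trail to one workspace, preserving global order."""
    return [e for e in global_trail if e["workspace"] == workspace_id]


def projection_violations(global_trail, local_trails):
    """Return the workspace IDs whose local trail differs from the projection.

    local_trails maps workspace ID -> list of entries (hypothetical layout).
    """
    return sorted(
        ws
        for ws, local in local_trails.items()
        if local != project_local(global_trail, ws)
    )
```

An empty result means every supplied local trail matches its projection, including entry order.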

11. Implementation Notes

These notes are non-normative. They capture practical guidance for implementers.

Trail store as a database. The simplest conformant implementation uses an append-only table in SQLite or PostgreSQL. Each trail entry is a row. The workspace, actor, event_type, and timestamp columns are indexed for query performance. The body column stores JSON. The hash chain is maintained by the writer or by a trigger on insert; a generated column cannot do this, because it cannot read the previous row. SQLite is sufficient for single-runtime deployments; PostgreSQL or a distributed database is needed for multi-runtime federation.
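A minimal sketch of such a table, using Python's standard `sqlite3` module. The column names follow the fields mentioned in this note; the `seq` column, the table and trigger names, and the helper functions are assumptions made for illustration. Only updates are blocked here, since removing entries after the retention window (§8) needs its own path.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE trail (
    seq        INTEGER PRIMARY KEY AUTOINCREMENT,  -- hypothetical entry ID
    workspace  TEXT NOT NULL,
    actor      TEXT NOT NULL,
    event_type TEXT NOT NULL,
    timestamp  TEXT NOT NULL,
    body       TEXT NOT NULL,                      -- JSON-encoded event body
    prev_hash  TEXT
);
CREATE INDEX idx_trail_workspace  ON trail(workspace);
CREATE INDEX idx_trail_actor      ON trail(actor);
CREATE INDEX idx_trail_event_type ON trail(event_type);
CREATE INDEX idx_trail_timestamp  ON trail(timestamp);

-- Immutability: any UPDATE against the table is aborted.
CREATE TRIGGER trail_no_update BEFORE UPDATE ON trail
BEGIN
    SELECT RAISE(ABORT, 'trail entries are immutable');
END;
"""


def open_trail_store(path=":memory:"):
    """Open (or create) a trail store and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append_entry(conn, workspace, actor, event_type, timestamp, body, prev_hash=None):
    """Append one entry and return its row ID."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO trail (workspace, actor, event_type, timestamp, body, prev_hash) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (workspace, actor, event_type, timestamp,
             json.dumps(body, sort_keys=True), prev_hash),
        )
    return cur.lastrowid
```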

Hot tier as in-memory with write-through. Keep hot-tier entries in an in-memory data structure (ordered map keyed by entry ID) and write-through to persistent storage on every append. This satisfies both the sub-millisecond latency requirement and the durability requirement. The in-memory structure is the read path; the persistent store is the durability path. On restart, reload hot-tier entries from persistent storage.
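The write-through pattern above might look like the following sketch. The `HotTier` class, its method names, and the `hot_trail` table are hypothetical. A plain `dict` stands in for the ordered map, which is adequate only under the assumption that entry IDs are appended in increasing order.

```python
import json
import sqlite3


class HotTier:
    """In-memory read path backed by a persistent durability path."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS hot_trail "
            "(entry_id INTEGER PRIMARY KEY, entry TEXT NOT NULL)"
        )
        self._conn.commit()
        # On (re)start, reload every hot-tier entry in entry-ID order.
        self._entries = {}
        rows = self._conn.execute(
            "SELECT entry_id, entry FROM hot_trail ORDER BY entry_id"
        )
        for entry_id, raw in rows:
            self._entries[entry_id] = json.loads(raw)

    def append(self, entry_id, entry):
        # Persist first; the in-memory map is updated only after the commit,
        # so a failed write never becomes visible to readers.
        with self._conn:
            self._conn.execute(
                "INSERT INTO hot_trail (entry_id, entry) VALUES (?, ?)",
                (entry_id, json.dumps(entry, sort_keys=True)),
            )
        self._entries[entry_id] = entry

    def get(self, entry_id):
        """Read a single entry from memory; no disk access."""
        return self._entries.get(entry_id)

    def scan(self):
        """Return all hot-tier entries in append order."""
        return list(self._entries.values())
```

Constructing a second `HotTier` over the same database simulates a restart: the new instance rebuilds its map from the persistent table.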

Hash chain implementation. Hash the concatenation of the entry’s fields (excluding prev_hash) using the selected algorithm. The hash input should be a canonical serialization — implementations must define a deterministic byte representation to avoid hash mismatches from field reordering or whitespace differences. JSON with sorted keys and no whitespace is a simple canonical form.
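As a sketch of the canonical form suggested above, the following uses JSON with sorted keys and no whitespace. SHA-256 is an assumption standing in for "the selected algorithm", and the function names are hypothetical. The hash input excludes prev_hash, following this note as written; the normative chain definition is in §4.

```python
import hashlib
import json


def canonical_bytes(entry):
    """Deterministic byte form: sorted keys, no whitespace, prev_hash excluded."""
    fields = {k: v for k, v in entry.items() if k != "prev_hash"}
    text = json.dumps(fields, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def entry_hash(entry):
    """Hex digest of the entry's canonical form (SHA-256 is an assumption)."""
    return hashlib.sha256(canonical_bytes(entry)).hexdigest()


def append_chained(chain, entry):
    """Append a copy of entry whose prev_hash links to the current tail."""
    linked = dict(entry)
    linked["prev_hash"] = entry_hash(chain[-1]) if chain else None
    chain.append(linked)
    return linked


def verify_chain(chain):
    """Check that every entry's prev_hash matches its predecessor's hash."""
    prev = None
    for entry in chain:
        expected = entry_hash(prev) if prev is not None else None
        if entry.get("prev_hash") != expected:
            return False
        prev = entry
    return True
```

Because keys are sorted and separators are fixed, two entries with the same fields in a different insertion order produce the same bytes and therefore the same hash.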

Write-ahead with transactions. If the trail store and the state machine share a database, use a single transaction: write the trail entry and update the state in one atomic operation. This simplifies the write-ahead guarantee — either both succeed or neither does. If the trail store is separate from the state machine, write the trail entry first, then update the state. If the state update fails after the trail write, the trail contains an entry for an event that did not complete — recovery detects this (the trail says “transitioning to X” but the state machine is still at the previous state) and replays the transition.
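The shared-database case can be sketched with a single `sqlite3` transaction. The table layout, the `transition` helper, the `{"to": ...}` body shape, and the state names used in the usage note are all assumptions for illustration.

```python
import json
import sqlite3


def open_db():
    """Create a trail table and a state table in one database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE trail (
            seq        INTEGER PRIMARY KEY AUTOINCREMENT,
            workspace  TEXT NOT NULL,
            event_type TEXT NOT NULL,
            body       TEXT NOT NULL
        );
        CREATE TABLE workspace_state (
            workspace TEXT PRIMARY KEY,
            state     TEXT NOT NULL
        );
    """)
    return conn


def transition(conn, workspace, new_state, allowed_from):
    """Write the trail entry and apply the state change atomically."""
    try:
        with conn:  # commits on success, rolls back if anything raises
            conn.execute(
                "INSERT INTO trail (workspace, event_type, body) VALUES (?, ?, ?)",
                (workspace, "workspace_state_changed", json.dumps({"to": new_state})),
            )
            cur = conn.execute(
                "UPDATE workspace_state SET state = ? WHERE workspace = ? AND state = ?",
                (new_state, workspace, allowed_from),
            )
            if cur.rowcount != 1:
                # The state change did not apply, so undo the trail write too.
                raise RuntimeError("state update failed")
        return True
    except RuntimeError:
        return False
```

Calling `transition(conn, "ws-1", "completed", allowed_from="active")` twice succeeds the first time and fails the second; the failed attempt leaves no trail entry behind because the insert is rolled back with the rest of the transaction.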

Compaction as a background job. Run compaction periodically (e.g., after each workspace termination, or on a timer) as a background process. Compaction reads warm-tier entries, classifies them as essential or archivable, moves archivable entries to cold storage, and writes the trail_compacted summary. Compaction should not block hot-tier writes — it operates on a different tier.
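One compaction pass could be sketched as below. The classification rule is defined in §8, so this sketch takes it as a caller-supplied predicate. The `compact` function, the list-based tiers, and the body fields of the trail_compacted summary are assumptions, not the registered schema.

```python
def compact(warm, cold, is_essential, now):
    """Run one compaction pass over warm-tier entries.

    warm:         list of warm-tier entries (not mutated)
    cold:         list standing in for cold storage (archivable entries are appended)
    is_essential: predicate deciding which entries stay in the warm tier
    now:          timestamp to stamp on the summary entry
    Returns the new warm-tier contents.
    """
    kept = [e for e in warm if is_essential(e)]
    archived = [e for e in warm if not is_essential(e)]
    if not archived:
        return kept  # nothing moved, so no summary entry is produced

    cold.extend(archived)
    summary = {
        "event_type": "trail_compacted",
        "timestamp": now,
        # Hypothetical body fields; the real schema is set by the registry.
        "body": {
            "archived_count": len(archived),
            "archived_ids": [e["id"] for e in archived],
        },
    }
    return kept + [summary]
```

Because the pass reads only warm-tier data and returns a new list, it can run on a background thread or timer without touching the hot-tier write path.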

Observability views as materialized queries. The five standard views can be implemented as cached query results that refresh on a configurable interval. The run overview refreshes on every workspace_state_changed event. The workspace timeline is a live view of the local trail — no caching needed. The message flow and highway activity views refresh on relevant event types. The comparison view is computed on demand (it queries completed runs, not live data).

Scope enforcement at the query layer. Implement trail access control in the query layer, not the storage layer. The storage layer stores all entries uniformly. The query layer checks the caller’s workspace ID and role, computes the accessible scope (own local trail, visibility set trails, or global), and filters results accordingly. This keeps the storage layer simple and the access logic centralized.
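A sketch of that query-layer check, following the access rules in the conformance table (workers read their own local trail, observers read per visibility set, the coordinator reads the global trail). The `Caller` type, the role strings, and the function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Caller:
    workspace_id: str
    role: str  # assumed values: "worker", "observer", "coordinator"
    visibility_set: frozenset = field(default_factory=frozenset)


def accessible_workspaces(caller):
    """Return the workspace IDs the caller may read, or None for global scope."""
    if caller.role == "coordinator":
        return None
    if caller.role == "observer":
        return set(caller.visibility_set)
    if caller.role == "worker":
        return {caller.workspace_id}
    return set()  # unknown roles see nothing


def query_trail(entries, caller, predicate=lambda e: True):
    """Apply scope filtering first, then the caller's own query predicate."""
    scope = accessible_workspaces(caller)
    return [
        e for e in entries
        if (scope is None or e["workspace"] in scope) and predicate(e)
    ]
```

The storage layer hands `query_trail` every entry without inspecting the caller, so access rules live in one place.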

Cold storage as compressed archives. Cold-tier entries can be stored as compressed files (gzip, zstd) grouped by workspace or by time range. Each archive file includes a manifest listing entry IDs and timestamps for fast lookup without decompression. Querying cold storage decompresses the relevant archive, applies the query filter, and returns results. This trades latency for storage cost — acceptable for the cold tier’s seconds-to-minutes latency requirement.
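The archive-plus-manifest layout could be sketched as follows, using gzip-compressed JSON lines. Storing the manifest as an uncompressed sidecar file is an assumption of this sketch, as are the file naming, the function names, and the use of same-format ISO 8601 UTC timestamp strings, which compare correctly as text.

```python
import gzip
import json
from pathlib import Path


def write_archive(directory, name, entries):
    """Write entries as gzip-compressed JSON lines plus a sidecar manifest."""
    directory = Path(directory)
    with gzip.open(directory / f"{name}.jsonl.gz", "wt", encoding="utf-8") as fh:
        for entry in entries:
            fh.write(json.dumps(entry, sort_keys=True) + "\n")
    manifest = [{"id": e["id"], "timestamp": e["timestamp"]} for e in entries]
    (directory / f"{name}.manifest.json").write_text(
        json.dumps(manifest), encoding="utf-8"
    )


def query_archive(directory, name, start, end, predicate=lambda e: True):
    """Return archived entries with start <= timestamp <= end that match predicate."""
    directory = Path(directory)
    manifest = json.loads(
        (directory / f"{name}.manifest.json").read_text(encoding="utf-8")
    )
    wanted = {m["id"] for m in manifest if start <= m["timestamp"] <= end}
    if not wanted:
        return []  # the manifest rules this archive out; skip decompression
    with gzip.open(directory / f"{name}.jsonl.gz", "rt", encoding="utf-8") as fh:
        return [
            e for e in map(json.loads, fh)
            if e["id"] in wanted and predicate(e)
        ]
```

Only archives whose manifest overlaps the requested time range are decompressed, which is where the latency saving comes from.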

Event registry validation. At startup, load the event registry and build a lookup table from event type string to body schema. On every trail write, validate the event_type against the registry and the body against the expected schema. Reject unknown event types immediately — they indicate a protocol version mismatch or a bug. Schema validation can be strict (reject malformed bodies) or lenient (log a warning but accept) depending on the deployment’s tolerance for schema drift.
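A sketch of registry-driven validation with both strict and lenient modes. The registry contents below are placeholders: the two event names come from the registry in §9, but the body field `archived_count` and the schema representation (field name to Python type) are assumptions, as are the exception and function names.

```python
import logging

log = logging.getLogger("trail.registry")


class UnknownEventType(ValueError):
    """The event type is not in the registry for this protocol version."""


class MalformedBody(ValueError):
    """The body does not match the registered schema (strict mode only)."""


# Lookup table built at startup: event type -> {body field: expected type}.
REGISTRY = {
    "trail_compacted": {"archived_count": int},  # hypothetical body field
    "integration_started": {},
}


def validate_entry(entry, registry=REGISTRY, strict=True):
    """Validate one entry before it is written; return True if the body is clean."""
    schema = registry.get(entry["event_type"])
    if schema is None:
        # Unknown types are always rejected, in both modes.
        raise UnknownEventType(entry["event_type"])

    body = entry.get("body", {})
    problems = [
        f"{name}: expected {expected.__name__}"
        for name, expected in schema.items()
        if not isinstance(body.get(name), expected)
    ]
    if problems and strict:
        raise MalformedBody("; ".join(problems))
    if problems:
        log.warning("schema drift in %s: %s", entry["event_type"], "; ".join(problems))
    return not problems
```

In lenient mode the function logs and returns `False` so the caller can still accept the entry while recording that it drifted from the schema.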


WACP constituent specification, authored by Akil Abderrahim and Claude Opus 4.6

Protocol: PROTOCOL.md | Taxonomy: TAXONOMY.md