Bind independent safety-controller execution evidence into the audit chain (embodied agents, follow-up to #293) #301

@carloshvp

Context

The industrial-embodied-ai example (agentrust-io/examples#16, refined in #18)
deliberately leaves one boundary open and documents it under "Evidence
boundaries": the audit bundle binds request hashes and authorization decisions,
but the controller outcome is recorded only as a client-observed value, not as
independent execution evidence. The README states plainly that binding
independent controller evidence is a follow-up design question rather than
something the example should silently invent.

#293 closed the adjacent plumbing gap: the proxy now populates
response_payload_hash for the post-scan response it forwards. That is necessary
but not sufficient for embodied agents, and this issue is about the part #293
does not cover.

The gap

For an embodied agent, the authoritative outcome is decided by an independent
functional-safety controller that sits on the far side of the tool boundary.
cMCP can hash the response it forwards, but that response is the tool server's
self-report. The audit chain will faithfully hash whatever the server returns,
including a server that reports accepted when the safety controller actually
rejected motion. LIMITATIONS.md already draws this line: Phase 1 attests the
gateway boundary and not what happens on the other side, and any claim that
relies on server-side proof is Phase 2 work.

The example's third scenario makes the stakes concrete: cMCP authorizes the
request, the physical state changes, and the independent controller then rejects
motion on a stale state token. Today the audit chain can prove the request was
authorized under a specific policy. It cannot prove, against a hostile or faulty
tool server, what the controller actually decided or did.
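To make the gap concrete, here is a minimal sketch of the third scenario from the gateway's point of view. The payload shapes, the canonical JSON encoding, and the `payload_hash` helper are assumptions for illustration, not cMCP's actual hashing code.

```python
import hashlib
import json


def payload_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (the encoding is an assumption)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


# What the independent controller actually decided. The gateway never sees this.
controller_decision = {
    "call_id": "call-0003",
    "decision": "rejected",
    "reason": "stale_state_token",
}

# What a faulty or hostile tool server reports back through the gateway.
server_response = {"call_id": "call-0003", "status": "accepted"}

# The audit entry binds the forwarded response, exactly as received.
audit_entry = {
    "call_id": "call-0003",
    "response_payload_hash": payload_hash(server_response),
}

# The hash verifies against the forwarded response...
hash_matches_forwarded = (
    audit_entry["response_payload_hash"] == payload_hash(server_response)
)
# ...but says nothing about what the controller really decided.
hash_matches_controller = (
    audit_entry["response_payload_hash"] == payload_hash(controller_decision)
)
```

The chain stays internally consistent in both cases, which is exactly why response_payload_hash alone cannot be read as execution evidence.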

Why this is embodied-specific

We cannot route actuation through cMCP. The functional-safety controller must
remain an independent authority outside the AI governance plane, for the same
certification reasons that make it trustworthy. So the governance plane cannot
observe the actuation directly. It can only bind the controller's own evidence
after the fact. The design question is how to bind an independent safety
authority's signed execution record into a tamper-evident chain without (a)
implying cMCP observed it, or (b) implying functional-safety certification.

Proposed direction

Add an optional, backward-compatible way to attach independent execution evidence
to the audit entry for a tool call, distinct from response_payload_hash:

  • Option A, external attestation block on the audit entry. Extend the AuditEntry
    schema with an optional external_execution_evidence object:
    { issuer, issuer_key_id, signature, evidence_hash, evidence_type, linked_call_id }.
    The controller (or a thin signer next to it) produces a signed receipt of its
    decision and, where available, the actuation outcome. cMCP records the receipt
    hash and signature, never the actuation itself. Clear separation:
    response_payload_hash stays "what the gateway forwarded", the new field is
    "what an independent authority attested".
  • Option B, separate execution-evidence TRACE claim subtype, linked to the
    session and call_id, emitted out of band by the controller signer and
    verified alongside the gateway claim. Keeps the gateway schema untouched, at
    the cost of a second artifact and a join.
  • Option C, do nothing in cMCP, document it as permanently out of scope and
    leave it to integrators. Rejected: the example already shows the demand, and
    leaving it undefined invites integrators to overload response_payload_hash
    and quietly claim more than it proves.

Recommendation: Option A for the binding, with the receipt format specified so
trace-spec can describe verification. This keeps one chain, one join key
(call_id), and an honest field boundary.
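A rough sketch of what Option A could look like as a data model. Only the six field names inside external_execution_evidence and the response_payload_hash name come from this issue; the other `AuditEntry` fields (`call_id`, `request_hash`) and the `attach_evidence` helper are hypothetical placeholders, not the current schema.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ExternalExecutionEvidence:
    """Receipt metadata attested by an independent authority, not by the gateway."""
    issuer: str
    issuer_key_id: str
    signature: str
    evidence_hash: str
    evidence_type: str
    linked_call_id: str


@dataclass(frozen=True)
class AuditEntry:
    # Hypothetical minimal entry; the real schema has more fields.
    call_id: str
    request_hash: str
    # "What the gateway forwarded."
    response_payload_hash: Optional[str] = None
    # "What an independent authority attested." Optional, so old entries stay valid.
    external_execution_evidence: Optional[ExternalExecutionEvidence] = None


def attach_evidence(
    entry: AuditEntry, evidence: ExternalExecutionEvidence
) -> AuditEntry:
    """Return a copy of the entry with evidence bound via the call_id join key."""
    if evidence.linked_call_id != entry.call_id:
        raise ValueError("evidence is linked to a different call_id")
    return replace(entry, external_execution_evidence=evidence)
```

Keeping the new field optional and separate from response_payload_hash is what preserves backward compatibility and the field boundary described above.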

Scope for a first iteration

  • Optional schema field on the audit entry, backward compatible: verify accepts
    entries with or without it, and validates the field whenever it is present.
  • One conformance test: a populated receipt verifies, a tampered receipt fails.
  • LIMITATIONS.md and the response-inspection / audit spec updated to state
    exactly what the new field does and does not prove.
  • A companion update to the industrial-embodied-ai example so the third scenario
    shows a controller-signed reject receipt bound to the authorizing call, with
    the README evidence table updated.
  • Coordination note filed against trace-spec for the verification side.
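The conformance test in the list above could be sketched roughly as follows. Everything here is illustrative: HMAC-SHA256 over a shared demo key is a stand-in so the example runs on the standard library alone, whereas a real controller signer would use an asymmetric signature. `DEMO_KEYS`, `sign_receipt`, `verify_entry`, the receipt shape, and the `evidence_type` value are all hypothetical names.

```python
import hashlib
import hmac
import json
from typing import Optional

# Demo-only key store. A real deployment would resolve issuer_key_id to a public key.
DEMO_KEYS = {"controller-key-1": b"demo-only-shared-secret"}


def canonical(obj: dict) -> bytes:
    """Canonical JSON bytes (the encoding is an assumption)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def sign_receipt(receipt: dict, key_id: str) -> dict:
    """Build the evidence object for a controller receipt (HMAC as a stand-in)."""
    evidence_hash = hashlib.sha256(canonical(receipt)).hexdigest()
    signature = hmac.new(
        DEMO_KEYS[key_id], evidence_hash.encode(), hashlib.sha256
    ).hexdigest()
    return {
        "issuer": "safety-controller",
        "issuer_key_id": key_id,
        "signature": signature,
        "evidence_hash": evidence_hash,
        "evidence_type": "controller_decision",
        "linked_call_id": receipt["call_id"],
    }


def verify_entry(entry: dict, receipt: Optional[dict] = None) -> bool:
    """Present-or-absent check: no evidence is valid, bad evidence is not."""
    evidence = entry.get("external_execution_evidence")
    if evidence is None:
        return True  # field absent: older entries keep verifying
    if evidence["linked_call_id"] != entry["call_id"]:
        return False
    key = DEMO_KEYS.get(evidence["issuer_key_id"])
    if key is None:
        return False
    expected = hmac.new(
        key, evidence["evidence_hash"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, evidence["signature"]):
        return False
    # If the receipt itself is supplied, it must match the recorded hash.
    if receipt is not None:
        if hashlib.sha256(canonical(receipt)).hexdigest() != evidence["evidence_hash"]:
            return False
    return True


receipt = {"call_id": "call-0003", "decision": "rejected", "reason": "stale_state_token"}
entry = {
    "call_id": "call-0003",
    "external_execution_evidence": sign_receipt(receipt, "controller-key-1"),
}
tampered_entry = dict(
    entry,
    external_execution_evidence=dict(
        entry["external_execution_evidence"], evidence_hash="0" * 64
    ),
)
```

With these fixtures, `verify_entry(entry, receipt)` is expected to pass and `verify_entry(tampered_entry)` to fail; an entry without the field passes unchanged.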

Non-goals

  • This is evidence binding, not functional-safety certification. IEC 61508 /
    ISO 13849 conformance stays with the certified controller and is explicitly
    out of scope.
  • No change to how actuation is gated. cMCP does not become an actuation path.
  • Not a replacement for #293 (Populate response_payload_hash in the forwarding
    path audit entries). The forwarded-response hash and the independent execution
    receipt are different fields with different trust meanings.

I am happy to own this end to end (spec text, schema, conformance test, and the
example update), in a personal capacity.

Labels

  • attestation: TEE / hardware attestation
  • impl: Implementation task (vs spec)
  • spec: Specification or design decision
  • track:audit: Audit chain and TRACE Claim
