Bind independent safety-controller execution evidence into the audit chain (embodied agents, follow-up to #293)

## Context

The industrial-embodied-ai example (agentrust-io/examples#16, refined in #18)
deliberately leaves one boundary open and documents it under "Evidence
boundaries": the audit bundle binds request hashes and authorization decisions,
but the controller outcome is recorded only as a client-observed value, not as
independent execution evidence. The README states plainly that binding
independent controller evidence is a follow-up design question rather than
something the example should silently invent.

#293 closed the adjacent plumbing gap: the proxy now populates
`response_payload_hash` for the post-scan response it forwards. That is necessary
but not sufficient for embodied agents, and this issue is about the part #293
does not cover.

## The gap

For an embodied agent, the authoritative outcome is decided by an independent
functional-safety controller that sits on the far side of the tool boundary.
cMCP can hash the response it forwards, but that response is the tool server's
self-report. The audit chain will faithfully hash whatever the server returns,
including an `accepted` report from a server whose safety controller actually
rejected motion. LIMITATIONS.md already draws this line: Phase 1 attests the
gateway boundary and not what happens on the other side, and any claim that
relies on server-side proof is Phase 2 work.

The example's third scenario makes the stakes concrete: cMCP authorizes the
request, the physical state changes, and the independent controller then rejects
motion on a stale state token. Today the audit chain can prove the request was
authorized under a specific policy. It cannot prove, against a hostile or faulty
tool server, what the controller actually decided or did.
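
A minimal sketch of why the forwarded-response hash cannot close this gap. The hash function (SHA-256), the canonical-JSON encoding, and every payload field name below are assumptions made for illustration; none of them are taken from the cMCP schema.

```python
import hashlib
import json


def payload_hash(payload: dict) -> str:
    """Hash a payload over a canonical JSON encoding (encoding is an assumption)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


# Hypothetical third-scenario values: the controller rejects on a stale token,
# but a faulty or hostile tool server reports success.
controller_decision = {"call_id": "call-3", "decision": "rejected",
                       "reason": "stale_state_token"}
server_self_report = {"call_id": "call-3", "status": "accepted"}

# The gateway only ever sees, and therefore only ever hashes, the self-report.
response_payload_hash = payload_hash(server_self_report)

# The hash is a faithful commitment to what was forwarded...
hash_matches_forwarded = response_payload_hash == payload_hash(
    {"status": "accepted", "call_id": "call-3"}
)
# ...and carries no information about what the controller decided.
hash_matches_controller = response_payload_hash == payload_hash(controller_decision)
```

Here `hash_matches_forwarded` is true and `hash_matches_controller` is false: the chain is internally consistent while recording an outcome the controller never produced.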

## Why this is embodied-specific

We cannot route actuation through cMCP. The functional-safety controller must
remain an independent authority outside the AI governance plane, for the same
certification reasons that make it trustworthy. So the governance plane cannot
observe the actuation directly. It can only bind the controller's own evidence
after the fact. The design question is how to bind an independent safety
authority's signed execution record into a tamper-evident chain without (a)
implying cMCP observed it, or (b) implying functional-safety certification.

## Proposed direction

Add an optional, backward-compatible way to attach independent execution evidence
to the audit entry for a tool call, distinct from `response_payload_hash`:

- Option A, external attestation block on the audit entry. Extend the AuditEntry
  schema with an optional `external_execution_evidence` object:
  `{ issuer, issuer_key_id, signature, evidence_hash, evidence_type, linked_call_id }`.
  The controller (or a thin signer next to it) produces a signed receipt of its
  decision and, where available, the actuation outcome. cMCP records the receipt
  hash and signature, never the actuation itself. The two fields stay clearly
  separated: `response_payload_hash` remains "what the gateway forwarded", and
  the new field is "what an independent authority attested".
- Option B, separate execution-evidence TRACE claim subtype, linked to the
  session and `call_id`, emitted out of band by the controller signer and
  verified alongside the gateway claim. Keeps the gateway schema untouched, at
  the cost of a second artifact and a join.
- Option C, do nothing in cMCP, document it as permanently out of scope and
  leave it to integrators. Rejected: the example already shows the demand, and
  leaving it undefined invites integrators to overload `response_payload_hash`
  and quietly claim more than it proves.

Recommendation: Option A for the binding, with the receipt format specified so
trace-spec can describe verification. This keeps one chain, one join key
(`call_id`), and an honest field boundary.
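
To make Option A concrete, here is one possible shape for the optional block, sketched as Python dataclasses. The six evidence field names come from the proposal above; the `AuditEntry` fields other than `call_id` and `response_payload_hash`, all types, and the `binding_problems` helper are hypothetical and only illustrate the join on `call_id`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExternalExecutionEvidence:
    """What an independent authority attested (field names from Option A)."""
    issuer: str
    issuer_key_id: str
    signature: str
    evidence_hash: str
    evidence_type: str
    linked_call_id: str


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical, reduced audit entry; the real schema has more fields."""
    call_id: str
    request_hash: str
    # What the gateway forwarded (populated since #293).
    response_payload_hash: Optional[str] = None
    # Optional and absent by default, so existing entries stay valid.
    external_execution_evidence: Optional[ExternalExecutionEvidence] = None


def binding_problems(entry: AuditEntry) -> List[str]:
    """Return structural problems with the evidence binding, if any."""
    evidence = entry.external_execution_evidence
    if evidence is None:
        return []  # absent evidence is not an error
    problems = []
    if evidence.linked_call_id != entry.call_id:
        problems.append("linked_call_id does not match the entry's call_id")
    return problems
```

This only checks that a receipt is attached to the call it claims to describe. Signature verification is a separate step and depends on the receipt format that trace-spec would define.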

## Scope for a first iteration

- Optional schema field on the audit entry, backward compatible, with `verify`
  accepting entries that omit it and checking it when present.
- One conformance test: a populated receipt verifies, a tampered receipt fails.
- LIMITATIONS.md and the response-inspection / audit spec updated to state
  exactly what the new field does and does not prove.
- A companion update to the industrial-embodied-ai example so the third scenario
  shows a controller-signed reject receipt bound to the authorizing call, with
  the README evidence table updated.
- Coordination note filed against trace-spec for the verification side.
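
The conformance test in the list above could look roughly like the following sketch. It uses HMAC-SHA256 from the standard library purely as a stand-in so the example runs without third-party packages. A real controller signer would presumably use an asymmetric signature resolved through `issuer_key_id`. The receipt fields, the canonical-JSON encoding, and both helper functions are assumptions, not part of any existing cMCP or trace-spec API.

```python
import hashlib
import hmac
import json


def _canonical(receipt: dict) -> bytes:
    # Canonical JSON encoding; the real receipt format is still to be specified.
    return json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()


def sign_receipt(key: bytes, receipt: dict) -> dict:
    """Controller side: produce evidence_hash and signature for a receipt."""
    body = _canonical(receipt)
    return {
        "evidence_hash": hashlib.sha256(body).hexdigest(),
        # HMAC is a symmetric stand-in for the controller's real signature.
        "signature": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }


def verify_receipt(key: bytes, receipt: dict, evidence: dict) -> bool:
    """Verifier side: recompute both values and compare in constant time."""
    expected = sign_receipt(key, receipt)
    return (
        hmac.compare_digest(expected["evidence_hash"], evidence["evidence_hash"])
        and hmac.compare_digest(expected["signature"], evidence["signature"])
    )


# Hypothetical controller-signed reject receipt for the third scenario.
key = b"demo-key-not-for-production"
receipt = {"linked_call_id": "call-3", "decision": "rejected",
           "reason": "stale_state_token"}
evidence = sign_receipt(key, receipt)

populated_ok = verify_receipt(key, receipt, evidence)

# Tampering: flip the decision after signing, keep the recorded evidence.
tampered = dict(receipt, decision="accepted")
tampered_ok = verify_receipt(key, tampered, evidence)
```

With these inputs `populated_ok` is true and `tampered_ok` is false, which is the pass/fail pair the conformance test asks for.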

## Non-goals

- This is evidence binding, not functional-safety certification. IEC 61508 /
  ISO 13849 conformance stays with the certified controller and is explicitly
  out of scope.
- No change to how actuation is gated. cMCP does not become an actuation path.
- Not a replacement for #293. The forwarded-response hash and the independent
  execution receipt are different fields with different trust meanings.

I am happy to own this end to end (spec text, schema, conformance test, and the
example update), in a personal capacity.

