feat(audit): add optional external_execution_evidence on AuditEntry (#301) #314

Open
carloshvp wants to merge 3 commits into agentrust-io:main from carloshvp:feat/external-execution-evidence

@carloshvp

Member

Summary

Implements Option A for #301 (confirmed direction). Adds an optional,
independently-signed execution receipt bound to an audit entry, kept distinct
from response_payload_hash: response_payload_hash is what the gateway
forwarded, external_execution_evidence is what an independent authority (for
example a safety controller) attested.

This PR lands the data model, schema, verification, and tests. Proxy ingestion
and the example follow next (see Scope and follow-ups).

What changed

  • audit/chain.py: optional external_execution_evidence field on AuditEntry
    and an append() keyword. Serialized uniformly via asdict (null when
    absent), so receipt-less entries hash exactly as before and existing evidence
    keeps verifying. No change to the canonical body rule.
  • schemas/audit-entry.schema.json: optional receipt object (issuer,
    issuer_key_id, signature, evidence_hash, evidence_type,
    linked_call_id), not in required so entries that predate the field still
    validate.
  • cmcp_verify.verify_audit_bundle: opt-in receipt verification via an
    external_evidence_keys parameter (issuer_key_id to hex Ed25519 public key).
    When supplied, it checks linked_call_id == call_id and the issuer signature
    over the canonical receipt (all fields except signature). Receipt-less
    entries and callers without keys are unaffected.
  • LIMITATIONS.md: what the receipt does and does not prove.
  • conformance tests (TestExternalExecutionEvidence301): absent-verifies and
    keeps the old hashing, populated-verifies, tampered-fails,
    linked_call_id-mismatch-fails, unknown-issuer-key-fails.
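
As a rough illustration of the opt-in receipt check described above, here is a minimal sketch. It is not the code in `cmcp_verify`: `check_receipt`, `canonical_receipt_bytes`, and the `signature_ok` hook are hypothetical names, and sorted-key compact JSON is only an assumed stand-in for the project's canonical form. The Ed25519 step is left to the injected hook so the sketch stays stdlib-only.

```python
import json
from typing import Any, Callable, Mapping, Optional

# Hypothetical hook: (public_key_bytes, signature, message) -> bool.
# A real verifier would perform the Ed25519 check here.
SignatureCheck = Callable[[bytes, str, bytes], bool]


def canonical_receipt_bytes(receipt: Mapping[str, Any]) -> bytes:
    """Serialize every receipt field except `signature`.

    Assumption: sorted-key, compact JSON stands in for the real canonical form.
    """
    body = {k: v for k, v in receipt.items() if k != "signature"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def check_receipt(
    entry: Mapping[str, Any],
    external_evidence_keys: Optional[Mapping[str, str]],
    signature_ok: SignatureCheck,
) -> bool:
    receipt = entry.get("external_execution_evidence")
    # Receipt-less entries and callers without keys are unaffected.
    if receipt is None or external_evidence_keys is None:
        return True
    # The receipt must be bound to this entry's call.
    if receipt.get("linked_call_id") != entry.get("call_id"):
        return False
    # An issuer key the caller did not supply is not trusted.
    key_hex = external_evidence_keys.get(receipt.get("issuer_key_id"))
    if key_hex is None:
        return False
    return signature_ok(
        bytes.fromhex(key_hex),
        receipt.get("signature", ""),
        canonical_receipt_bytes(receipt),
    )
```

Because the signed bytes cover every field except `signature`, changing any other receipt field after signing makes the check fail.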

Validation

  • pytest tests/unit tests/conformance: 701 passed
  • ruff, mypy, bandit: clean
  • DCO signed off

Every existing audit and verify test passes unchanged, so the change is
backward compatible.
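
For readers unfamiliar with the `asdict` behavior mentioned under "What changed", a small sketch of it follows. `AuditEntrySketch` is a hypothetical, trimmed-down stand-in for `AuditEntry`, and the sketch only shows that an unset optional field serializes to `null`. How the canonical body rule treats that field is not shown here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class AuditEntrySketch:
    # Hypothetical stand-in; the real AuditEntry has more fields.
    call_id: str
    response_payload_hash: Optional[str] = None
    external_execution_evidence: Optional[Dict[str, Any]] = None


# An entry appended without a receipt keeps the field, with a None value.
plain = AuditEntrySketch(call_id="call-1")
serialized = json.dumps(asdict(plain), sort_keys=True)
```

The field is always present in the serialized form, so consumers see one shape whether or not a receipt was bound.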

Scope and follow-ups

Kept out of this PR on purpose, happy to do them next:

  1. Proxy ingestion: how a controller-signed receipt reaches the gateway. My
    default is to let the upstream MCP server include the receipt in its tool
    response and have the proxy bind it at the existing post-scan audit append
    (the #293 path, "Populate response_payload_hash in the forwarding path
    audit entries"). Flagging the transport convention for your input before I
    wire it.
  2. The industrial-embodied-ai example third scenario showing a controller-signed
    reject receipt (examples repo).
  3. Verification-side coordination is filed as trace-spec#34.

Aside: audit-entry.schema.json already drifts from the dataclass (detail,
workflow_id, and a few entry_type / policy_decision enum values are
missing). Left out of scope here, happy to reconcile separately.

@imran-siddique tagging you as requested.

…#301)

Introduce an optional, independently-signed execution receipt bound to an audit
entry, distinct from response_payload_hash. response_payload_hash is what the
gateway forwarded; external_execution_evidence is what an independent authority
(for example a safety controller) attested. Confirmed direction: Option A.

- chain.py: add the optional external_execution_evidence field to AuditEntry and
  an append() keyword. Serialized uniformly via asdict (null when absent), so
  receipt-less entries hash exactly as before and existing evidence keeps
  verifying.
- schemas/audit-entry.schema.json: add the optional receipt object (issuer,
  issuer_key_id, signature, evidence_hash, evidence_type, linked_call_id), not in
  required so entries that predate the field still validate.
- cmcp_verify: opt-in receipt verification. When external_evidence_keys is
  supplied, check linked_call_id == call_id and the issuer Ed25519 signature over
  the canonical receipt. Receipt-less entries and callers without keys are
  unaffected.
- LIMITATIONS.md: state what the receipt does and does not prove.
- conformance tests: absent verifies and keeps old hashing, populated verifies,
  tampered fails, linked_call_id mismatch fails, unknown issuer key fails.

Scope note: this lands the data model, schema, verification, and tests. Proxy
ingestion (how a controller receipt rides in the upstream response) and the
industrial-embodied-ai example follow next; the transport convention is flagged
for maintainer input. Pre-existing audit-entry.schema.json drift (detail,
workflow_id, extra entry_type enum values) is noted and left out of scope.

Signed-off-by: Carlos Hernandez <carloshvp@gmail.com>

@imran-siddique imran-siddique left a comment

Contributor

Thanks for this - attaching controller-signed external execution evidence to individual audit entries is exactly the kind of receipt binding that compliance use cases need. The overall approach makes sense. A few things to work through before this is ready:

Schema gaps

evidence_hash has no pattern constraint. Every other hash field in the schema uses "pattern": "^sha(256|384):[0-9a-f]+". Add one here so schema validation can reject malformed hashes before they reach the verifier.
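
To make the effect of that constraint concrete, here is a small sketch using the pattern exactly as quoted above. JSON Schema `pattern` is not implicitly anchored, so `re.search` mirrors its behavior. `ANCHORED_PATTERN` is a hypothetical stricter variant, not something the schema is known to use.

```python
import re

# Pattern as quoted in the review; note it has no trailing "$".
REVIEW_PATTERN = re.compile(r"^sha(256|384):[0-9a-f]+")
# Hypothetical stricter variant, anchored at both ends.
ANCHORED_PATTERN = re.compile(r"^sha(256|384):[0-9a-f]+$")


def matches(pattern: "re.Pattern[str]", value: str) -> bool:
    # JSON Schema "pattern" succeeds on a partial match, like re.search.
    return pattern.search(value) is not None


well_formed = "sha256:" + "ab" * 32
print(matches(REVIEW_PATTERN, well_formed))        # accepted
print(matches(REVIEW_PATTERN, "md5:abcd"))         # rejected: wrong prefix
print(matches(REVIEW_PATTERN, "sha256:abZZ"))      # accepted: prefix match only
print(matches(ANCHORED_PATTERN, "sha256:abZZ"))    # rejected by the anchored form
```

If trailing characters should also be rejected, the anchored form may be worth considering for all hash fields at once, since the quoted pattern is shared.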

issuer_key_id has no documented format. Is it a hex-encoded SHA-256 of the public key (matching agent_manifest.py's _key_id())? A DID key identifier? Something else? Pick a format and enforce it in the schema.

Specification

What is the evidence_hash pre-image exactly? The field is described as a "hash of the evidence" but there is no canonical definition of what bytes are hashed, in what encoding, with what canonicalization. An implementer reading the schema today cannot produce a verifiable evidence_hash without guessing. This needs a computation spec in docs/spec/verification-library.md (or a new doc).

evidence_type is a free-form string. Even an initial enumeration of documented values (e.g., tee-signed-receipt, controller-jwt, opaque-receipt) would help both implementers and verifiers. Free-form types make it impossible to write a strict verifier.

docs/spec/verification-library.md is not updated to document the new external_evidence_keys parameter in verify_audit_bundle(). The spec doc should describe what keys are expected, what the verification flow is, and what EXTERNAL_EVIDENCE_VERIFICATION_FAILED would look like as an error code.

Type inconsistency

verify.py declares external_evidence_keys: dict[str, str] | None. The sibling PR #315 uses trusted_agent_manifest_keys: dict[str, bytes] (raw public key bytes). If these two verification paths are going to coexist in the same library, the key type should be consistent. String-encoded keys require a documented encoding convention that currently doesn't exist.

Testing

The conformance tests cover present/absent/tampered/call_id-mismatch cases - that is a solid starting point. But there is no end-to-end test that runs the full path: chain.append(..., external_execution_evidence=...) → export bundle → verify_audit_bundle(external_evidence_keys=...). The conformance suite exercises the schema and mismatch detection, not the cryptographic verification path. Add an integration test that uses a real key pair, produces a valid signature, and confirms it passes.

TRACE Claim

When a session's audit chain contains entries with external_execution_evidence, can a verifier tell from the TRACE Claim alone that external evidence was bound? Currently the Claim carries audit_chain_tip but no evidence-presence flag. Is that intentional? If so, call it out in LIMITATIONS.md.

Signed-off-by: Carlos Hernandez <carloshvp@gmail.com>

Member Author

Addressed the review feedback in 4f82a66.

What changed:

  • Tightened external_execution_evidence schema validation:
    • evidence_hash now requires a sha256:/sha384: prefixed hash.
    • issuer_key_id is now specified/enforced as lowercase hex SHA-256 of the raw Ed25519 issuer public key.
    • evidence_type is restricted to documented initial values.
  • Updated verify_audit_bundle() so external_evidence_keys is dict[str, bytes], matching the raw-key convention used by the Agent Manifest verifier path.
  • Added key-id self-checking: issuer_key_id must match sha256(public_key_bytes).
  • Documented the detached evidence hash pre-image, receipt signing pre-image, verification flow, trusted key map format, and EXTERNAL_EVIDENCE_VERIFICATION_FAILED semantics in verification-library.md and error-codes.md.
  • Added the TRACE limitation note: the claim commits the audit chain tip but does not carry a separate external-evidence-present flag; verifiers detect evidence by fetching/verifying the committed audit bundle.
  • Added an end-to-end conformance test for chain.append(..., external_execution_evidence=...) -> exported signed bundle -> verify_audit_bundle(..., external_evidence_keys=...) with a real Ed25519 keypair.
  • Softened the receipt-less wording so it no longer implies newly emitted entries omit the null evidence field.
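
The key-id self-check above can be sketched as follows. This is an illustration under stated assumptions, not the code in 4f82a66: `key_id_for` and `mismatched_key_ids` are hypothetical helper names, and the map shape follows the `dict[str, bytes]` convention described in the list.

```python
import hashlib
from typing import Dict, List


def key_id_for(public_key: bytes) -> str:
    """Lowercase hex SHA-256 of the raw issuer public key bytes."""
    return hashlib.sha256(public_key).hexdigest()


def mismatched_key_ids(external_evidence_keys: Dict[str, bytes]) -> List[str]:
    """Return every key id that does not equal sha256(public_key_bytes)."""
    return [
        key_id
        for key_id, public_key in external_evidence_keys.items()
        if key_id != key_id_for(public_key)
    ]


# Usage: build the trusted map from raw 32-byte Ed25519 public keys.
raw_key = bytes(range(32))  # placeholder bytes, not a real key
trusted = {key_id_for(raw_key): raw_key}
```

Deriving the id from the key means a caller cannot accidentally register a key under the wrong issuer_key_id without the verifier noticing.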

Validation run locally:

  • uv run ruff check src/ tests/
  • uv run mypy src/cmcp_runtime src/cmcp_verify
  • uv run pytest tests/unit -q (645 passed)
  • uv run pytest tests/conformance -q (58 passed)

GitHub currently shows the PR mergeable with the visible gate check passing.

Signed-off-by: Carlos Hernandez <carloshvp@gmail.com>
@carloshvp

Member Author

Follow-up pushed for the #301 path:

  • added runtime ingestion for well-formed external_execution_evidence objects returned by upstream JSON tool responses
  • kept response_payload_hash bound to the exact response bytes returned to the caller
  • added proxy unit coverage for binding valid receipts and ignoring malformed receipt-like fields
  • added docs for the response-field convention
  • opened the embodied-AI examples follow-up as examples#29 ("feat(industrial-embodied-ai): bind controller execution receipts"), with a live local run reporting "controller receipts: verified (2)"
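
A rough sketch of the ingestion behavior listed above, for orientation only. The top-level `external_execution_evidence` response key, the rule that all six receipt fields must be present as strings, and the `sha256:` prefix on `response_payload_hash` are assumptions for illustration; `extract_receipt` and `audit_fields` are hypothetical names, not the proxy's actual functions.

```python
import hashlib
import json
from typing import Any, Dict, Optional

RECEIPT_FIELDS = {
    "issuer", "issuer_key_id", "signature",
    "evidence_hash", "evidence_type", "linked_call_id",
}


def extract_receipt(response_bytes: bytes) -> Optional[Dict[str, Any]]:
    """Return a well-formed receipt from a JSON tool response, else None."""
    try:
        payload = json.loads(response_bytes)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(payload, dict):
        return None
    receipt = payload.get("external_execution_evidence")
    if not isinstance(receipt, dict):
        return None
    # Assumed well-formedness rule: exactly the six fields, all strings.
    well_formed = set(receipt) == RECEIPT_FIELDS and all(
        isinstance(value, str) for value in receipt.values()
    )
    return receipt if well_formed else None


def audit_fields(response_bytes: bytes) -> Dict[str, Any]:
    """Fields bound at audit append for one forwarded response."""
    return {
        # Hash of the exact bytes returned to the caller, receipt included.
        "response_payload_hash": "sha256:" + hashlib.sha256(response_bytes).hexdigest(),
        # Malformed receipt-like fields are ignored rather than bound.
        "external_execution_evidence": extract_receipt(response_bytes),
    }
```

Hashing the untouched response bytes keeps the two claims separate: the hash covers what was forwarded, and the receipt covers what the controller attested.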

Validation rerun before push:

  • uv run pytest tests/unit/test_mcp_proxy.py tests/conformance/test_audit_conformance.py -q
  • uv run ruff check src/ tests/ && uv run mypy src/cmcp_runtime src/cmcp_verify
  • uv run pytest tests/unit -q
  • uv run pytest tests/conformance -q

@carloshvp

Member Author

@imran-siddique this should be ready for re-review when you have a chance.

The original review feedback was addressed in 4f82a66, and the follow-up runtime ingestion path landed in 29ef037. Checks are green, and the trust boundary remains explicit: response_payload_hash is what the gateway forwarded; external_execution_evidence is what the independent controller attested.

Once this lands, I can re-pin and undraft the embodied-AI example follow-up in agentrust-io/examples#29.

2 participants