Skip to content

feat(audit): privacy-preserving tool_transcript entries (#126)#316

Open
aryanputta wants to merge 1 commit into
agentrust-io:mainfrom
aryanputta:feat/tool-transcript-entries
Open

feat(audit): privacy-preserving tool_transcript entries (#126)#316
aryanputta wants to merge 1 commit into
agentrust-io:mainfrom
aryanputta:feat/tool-transcript-entries

Conversation

@aryanputta

Copy link
Copy Markdown
Member

Partially addresses #126.

What

Adds a per-call entries array to the cMCP TRACE Claim trace.tool_transcript, so a regulator can read the tool-call decision trail offline:

"tool_transcript": {
  "hash": "sha256:<audit-chain tip>",
  "call_count": 3,
  "entries": [
    {"tool_name": "document_reader", "data_class": "confidential", "decision": "allow"},
    {"tool_name": "credit_score_lookup", "data_class": "confidential", "decision": "allow"},
    {"tool_name": "risk_report_writer", "data_class": "internal", "decision": "advisory_deny"}
  ]
}

Each entry is built from the audit chain, not from raw tool-call parameters or response bodies, so the transcript stays privacy-preserving (no PII).

Acceptance criteria

Design notes / decisions for review

  • Hash semantics. The issue Design section says tool_transcript.hash = audit_chain.chain_tip; one acceptance bullet instead reads "SHA-256 of canonical JSON of the entries". These conflict. I kept hash = chain_tip (binds the full transcript via the chain) and added a separate transcript_entries_hash() so a verifier can recompute the entries digest independently. Happy to fold this into a dedicated field if maintainers prefer the second reading.
  • data_class source. Taken from the catalog entry sensitivity_level for the tool, falling back to "unknown" when the tool is not in the catalog.
  • Model placement. tool_transcript is defined in the cMCP claim schema (not pulled from agentrust-trace), so entries is added cleanly without a cross-repo model change. A matching trace-spec tool_transcript.entries field can follow if the canonical spec wants to carry the same array.

Tests

tests/unit/test_trace_claim.py: 30 pass locally (5 new). Pre-existing failures elsewhere in the suite are unrelated (missing agent_os/Cedar backend in a bare checkout; fail identically on main). Ruff clean. DCO signed-off.

Add an entries array to the cMCP TRACE Claim tool_transcript: one entry per
tool call carrying tool_name, the catalog data_class, and the policy decision,
derived from the audit chain. No request/response payloads, so the per-call view
a regulator reads leaks no PII. tool_transcript.hash still binds the full
transcript to the audit-chain tip; transcript_entries_hash() lets a verifier
recompute the entries digest offline.

- trace_claim.py: ToolTranscriptEntry, cmcp-local ToolTranscriptOut with entries,
  builder/verify helpers, generate_trace_claim transcript_entries param
- session/manager.py: derive entries from tool-call audit entries + catalog
- schemas/trace-claim.schema.json: entries array under trace.tool_transcript
- tests: ordering, hash==chain tip, no-payload privacy, hash roundtrip, optional
- CHANGELOG

Signed-off-by: Aryan Putta <aryansputta@gmail.com>
@github-actions

Copy link
Copy Markdown
Contributor

🟡 Contributor Check: MEDIUM

Check Result
Profile MEDIUM
Credential NONE
Overall MEDIUM

Automated check by AGT Contributor Check.

@github-actions github-actions Bot added the needs-review:MEDIUM Contributor check flagged MEDIUM risk label Jun 17, 2026

@imran-siddique imran-siddique left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall design is right: deriving entries from the audit chain (not from raw payloads) keeps the transcript privacy-preserving, and keeping tool_transcript.hash as the audit-chain tip preserves the existing binding semantics. A few things to address before this can merge.

1. audit.db committed

The diff includes a binary SQLite file at repo root. This is a test-run artifact and must not be committed. Strip it from the branch and add *.db (or at minimum audit.db) to .gitignore. Committing binary database files bloats history, causes spurious conflicts, and leaks local run state.

2. transcript_entries_hash() is a utility with nothing to verify against

The PR adds transcript_entries_hash(entries) and the docstring says "a verifier recomputes this from the entries and checks it against the digest the profile carries." But the profile does not carry that digest anywhere. tool_transcript.hash is the audit-chain tip, not the entries digest. There is no second field in the schema, the model, or the claim output to carry the entries hash.

This needs a decision: either add a transcript_entries_hash field to ToolTranscriptOut and trace-claim.schema.json so a verifier actually has something to compare against, or remove transcript_entries_hash() entirely (it currently promises offline verification but provides no anchor for it). Leaving an exported utility function whose docstring describes a verification step that is not implementable will confuse integrators.

You flagged this as an open question in the PR body — it needs to be resolved before merge.

3. data_class is a free-form string; decision is properly enumerated

decision has a clean Literal enum. data_class is a bare str that falls back to "unknown". The catalog's sensitivity_level values presumably have a bounded set — if so, add a Literal or Enum constraint here too, or at minimum document the expected values so a verifier knows what to expect. Otherwise a typo in a catalog entry silently passes schema validation.


Items 1 and 2 need to be resolved before merge. Item 3 is worth addressing but not a hard blocker if the catalog sensitivity level is intentionally open-ended.

@carloshvp

Copy link
Copy Markdown
Member

Adding one small follow-up suggestion to Imran’s review: would it be worth adding a session-level regression test for the actual SessionManager.close_session derivation path?

The current tests cover passing transcript_entries directly into generate_trace_claim, but #126’s runtime behavior depends on deriving entries from audit-chain tool_call entries plus catalog metadata. A test could append a few tool calls, close the session with a catalog, and assert order, call_count == len(entries), data_class from sensitivity_level, and no payload fields in serialized entries.

Small semantics question too: the schema allows "fault", but the derivation appears focused on entry_type == "tool_call". If fault or egress_denied entries with a tool name are intentionally outside tool_transcript.entries, maybe that should be captured in the test/docs so consumers know exactly what entries represents.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-review:MEDIUM Contributor check flagged MEDIUM risk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants