types: add trust identity verification to prevent LLM reviewer forgery #59

Merged
rfunix merged 5 commits into main from types/reviewer-forgery-detection
Apr 7, 2026

rfunix (Owner) commented Apr 7, 2026

Summary

  • Introduces E0263 and E0264 — compile-time errors that detect when an AI agent forges a @reviewed_by(human: "...") annotation by naming itself as a human reviewer
  • Adds a [trust] section to kodo.toml with two opt-in fields: known_agents (forbidden as human reviewers) and human_reviewers (allowlist of authorized reviewers)
  • Adds trust=verified policy criterion to kodoc audit for CI/CD gating
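
As an illustration of the second bullet, a `[trust]` section might look like the fragment below. The section name and the two keys come from this PR; the values are hypothetical, and the sketch assumes both keys take arrays of strings.

```toml
# kodo.toml (hypothetical values; key names taken from the PR description)
[trust]
# Names that may not appear as a human reviewer (reported as E0263).
known_agents = ["claude", "copilot"]
# Allowlist of authorized human reviewers (anyone else is reported as E0264).
human_reviewers = ["alice", "bob"]
```

Omitting the section entirely keeps both checks disabled, per the backward compatibility note.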

What changed

Core type system (kodo_types)

  • TrustConfig struct exported as public API
  • TypeChecker::set_trust_config() — injects trust config before check_module
  • validate_reviewer_identity() — validates reviewers case-insensitively against both lists; wired into validate_module_policies and validate_policies_collecting
  • AgentClaimsHumanReview (E0263) with auto-fix patch (human: → agent:)
  • ReviewerNotInAllowlist (E0264) with suggestion
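
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `kodo_types`: the struct layout, the `TrustError` enum, the function signature, and the order of the two checks are all assumptions. It also assumes that an empty list disables the corresponding check, which is consistent with `TrustConfig::default()` being a no-op.

```rust
/// Hypothetical stand-in for the PR's `TrustConfig`. The field names mirror
/// the `[trust]` keys; the real struct layout is not shown in this PR.
#[derive(Debug, Default, Clone)]
pub struct TrustConfig {
    pub known_agents: Vec<String>,
    pub human_reviewers: Vec<String>,
}

/// Hypothetical error type covering the two new diagnostics.
#[derive(Debug, PartialEq, Eq)]
pub enum TrustError {
    /// E0263: a known agent is named as the human reviewer.
    AgentClaimsHumanReview { reviewer: String },
    /// E0264: an allowlist is configured and the reviewer is not on it.
    ReviewerNotInAllowlist { reviewer: String },
}

/// Case-insensitive membership test.
fn contains_ignore_case(list: &[String], name: &str) -> bool {
    let needle = name.to_lowercase();
    list.iter().any(|entry| entry.to_lowercase() == needle)
}

/// Assumed shape of the validation: agents are rejected first, then the
/// allowlist is enforced only when it is non-empty.
pub fn validate_reviewer_identity(
    config: &TrustConfig,
    reviewer: &str,
) -> Result<(), TrustError> {
    if contains_ignore_case(&config.known_agents, reviewer) {
        return Err(TrustError::AgentClaimsHumanReview {
            reviewer: reviewer.to_string(),
        });
    }
    if !config.human_reviewers.is_empty()
        && !contains_ignore_case(&config.human_reviewers, reviewer)
    {
        return Err(TrustError::ReviewerNotInAllowlist {
            reviewer: reviewer.to_string(),
        });
    }
    Ok(())
}
```

Under these assumptions, a default config accepts every reviewer, which matches the opt-in behavior the PR describes.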

Manifest (kodoc)

  • TrustSection + trust: Option<TrustSection> field on Manifest
  • load_trust_config(source_file) — looks up kodo.toml in the file's parent directory
  • All CLI commands (check, build, audit, confidence-report, mir) call set_trust_config automatically
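
The manifest lookup can be pictured with the small path helper below. `manifest_path_for` is a hypothetical name used only for illustration (the PR's function is `load_trust_config`, whose body is not shown here), and the `.ko` file extension in the usage note is likewise an assumption. The sketch follows the literal reading of the description: only the source file's immediate parent directory is consulted.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: computes where a `load_trust_config`-style loader
/// would look for the manifest, given the path of the source file.
pub fn manifest_path_for(source_file: &Path) -> PathBuf {
    // `parent()` is `None` for a filesystem root and an empty path for a
    // bare file name; both cases fall back to the current directory.
    source_file
        .parent()
        .unwrap_or_else(|| Path::new(""))
        .join("kodo.toml")
}
```

For a source file at `proj/src/main.ko` this yields `proj/src/kodo.toml`; a bare `main.ko` resolves to `kodo.toml` in the current directory.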

Audit

  • PolicyCriterion::TrustVerified + "trust=verified" policy string
  • validate_policy_with_trust() with known_agents slice
  • FunctionAudit.reviewers: Vec<String> in JSON output
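
One plausible reading of how these audit pieces fit together is sketched below. The names `PolicyCriterion::TrustVerified`, the `"trust=verified"` string, and the `reviewers` field come from this PR; `parse_criterion`, `passes_trust_verified`, the reduced `FunctionAudit` shape, and the pass condition itself (at least one reviewer, none of them a known agent) are assumptions made for illustration only.

```rust
/// Reduced, hypothetical version of the audit policy enum.
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyCriterion {
    TrustVerified,
}

/// Maps the policy string from the CLI to a criterion (assumed parsing rule).
pub fn parse_criterion(s: &str) -> Option<PolicyCriterion> {
    match s.trim() {
        "trust=verified" => Some(PolicyCriterion::TrustVerified),
        _ => None,
    }
}

/// Reduced, hypothetical version of the per-function audit record.
pub struct FunctionAudit {
    pub name: String,
    pub reviewers: Vec<String>,
}

/// Assumed pass condition: the function has at least one reviewer and no
/// reviewer matches a known agent (compared case-insensitively).
pub fn passes_trust_verified(audit: &FunctionAudit, known_agents: &[String]) -> bool {
    !audit.reviewers.is_empty()
        && audit.reviewers.iter().all(|reviewer| {
            !known_agents
                .iter()
                .any(|agent| agent.to_lowercase() == reviewer.to_lowercase())
        })
}
```

In a CI/CD gate, a function failing this predicate would cause the audit to report a policy violation; the exact exit behavior of `kodoc audit` is not described in this PR.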

Test plan

  • cargo fmt --all -- --check — clean
  • cargo clippy --workspace -- -D warnings — zero warnings
  • cargo test --workspace — all pass (7 new unit tests for E0263/E0264, 4 manifest tests, 3 audit tests)
  • make ui-test — 127 tests pass including 3 new UI tests in tests/ui/traceability/trust/

Backward compatibility

Everything is opt-in. Projects without a [trust] section in kodo.toml are completely unaffected — TrustConfig::default() is a no-op.

🤖 Generated with Claude Code

rfunix and others added 5 commits April 1, 2026 19:08
…3 E0226-E0227 E0251 E0260-E0262 E0310

Covers error codes across lexer, parser, types, contracts, and annotation
phases. 16 new compile-fail UI tests bringing error-messages coverage to
40 files with 38+ distinct error codes exercised.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers closures, actors, traits/impls, annotations, Option/Result types,
tuples, break/continue, spawn, channels, select, intents, unary/binary ops,
nested collections, and idempotency (format(format(x)) == format(x)).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers: fmt roundtrip and idempotency, annotate JSON output, audit JSON
and policy enforcement, and fix dry-run — bringing all major kodoc
subcommands under automated test coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exercises BinOp (add, sub, mul, div, eq), StringConst assign, Call in
body, multi-block with jump, BoolConst return, and local assign+return
via MIR-level compile_module calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces E0263 and E0264 to detect when an AI agent attempts to forge
a @reviewed_by(human: "...") annotation by naming itself as a human reviewer.

A new [trust] section in kodo.toml configures two opt-in checks:
- known_agents: list of agent names forbidden as human reviewers (E0263)
- human_reviewers: allowlist of authorized reviewer identities (E0264)

Both checks run case-insensitively at type-check time via a new
TrustConfig struct threaded into TypeChecker via set_trust_config().
The kodoc check, build, audit, confidence-report, and mir commands all
load trust config automatically from kodo.toml in the source file's
parent directory.

The audit command gains a new trust=verified policy criterion for CI/CD
gating. FunctionAudit now exposes a reviewers field in JSON output.

Adds 7 unit tests (E0263/E0264 cases), 4 manifest tests, 3 audit tests,
and 3 UI tests (tests/ui/traceability/trust/) with a kodo.toml fixture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rfunix rfunix merged commit 6749e46 into main Apr 7, 2026
12 checks passed