Skip to content

fix: tolerate a non-dict NESTED value in agent-authored evidence (no 500)#252

Merged
branover merged 1 commit into
mainfrom
fix/evidence-nested-nondict
Jun 15, 2026
Merged

fix: tolerate a non-dict NESTED value in agent-authored evidence (no 500)#252
branover merged 1 commit into
mainfrom
fix/evidence-nested-nondict

Conversation

@branover

Copy link
Copy Markdown
Owner

What

Third in the evidence-robustness series (#250 top-level helper, #251 column boundary). Those guarantee evidence_json reads back as a dict — but a real field finding 500'd GET /api/projects/{id} again after just refresh:

is_verified -> (((ev.get("extra") or {}).get("verification") or {}).get("verified"))
AttributeError: 'str' object has no attribute 'get'

The top-level evidence is a dict here. The malformed value is nested: evidence.extra.verification is a prose string (an agent wrote free text where a {verified, …} object belongs). The (x or {}).get(y) idiom only guards a falsy intermediate — a truthy non-dict (the string) passes through or {} and then .get(...) raises.

The column boundary from #251 can't fix this: evidence.extra is intentionally free-form (DB envelope, not the frozen schema), so its children can be any JSON type. The read accessors have to navigate defensively.

The fix

Add dig_dict(obj, *keys) in db/jsontypes.py: walk nested dicts, return None the moment any level isn't a dict. Route the shared findings-read accessors through it (or guard with isinstance):

  • is_verified (findings.py) — the reported crash.
  • assurance_of (assurance.py) — the same finding crashes it too (extra.verification is a string); now returns the first non-empty assurance dict, else None, preserving "prefer extra.assurance, fall back to extra.verification.assurance".
  • list_findings._row (mcp_tools.py) — the MCP findings list 500s on the same finding (via ver.get and assurance_of); now guards extra/verification/fuzz as dicts.

Scope

The shared accessors + the UI/MCP findings-read hot paths — i.e. the reported crash (GET /api/projects/{id}) and its direct sibling (MCP list_findings). The remaining subsystem-specific inline nested digs in the fuzz/poc/reachability task paths keep the same (((…).get("extra") or {}).get("fuzz") or {}) idiom and are tracked as a follow-up rather than widening this PR's blast radius. No schema/DDL change → no migration.

Verification

  • Validated against the real project DB (read-only): is_verified/assurance_of no longer raise; finding_dict over all 194 findings → 0 crashes; MCP list_findings → OK. So the user's project loads.
  • New unit tests (dig_dict; is_verified/assurance_of vs string-extra / string-verification) + an endpoint regression (extra.verification and extra-itself as strings → 200, not 500). All revert-checked — they fail on the old idiom.
  • Full just test fast tier: 1416 passed, 4 skipped, 14 deselected. Loop untouched (read-path only) → no just demo needed.

🤖 Generated with Claude Code

…500)

Third in the evidence-robustness series (#250 top-level helper, #251 column
boundary). Those guarantee `evidence_json` reads as a dict, but a real field
finding crashed GET /api/projects/{id} again:

  is_verified -> (((ev.get("extra") or {}).get("verification") or {}).get("verified"))
  AttributeError: 'str' object has no attribute 'get'

The top-level evidence IS a dict here; the malformed value is NESTED —
`evidence.extra.verification` is a prose STRING (an agent wrote free text where a
{verified,…} object belongs). The `(x or {}).get(y)` idiom only guards a FALSY
intermediate; a truthy non-dict sails through `or {}` and then `.get` raises. The
column boundary can't fix this — `extra` is intentionally free-form, so its
children can be any JSON type — so the read accessors must navigate defensively.

Add `dig_dict(obj, *keys)` (jsontypes.py): walk nested dicts, return None the
moment any level isn't a dict. Route the shared findings-read accessors through
it / guard with isinstance:
  - is_verified (findings.py) — the reported crash.
  - assurance_of (assurance.py) — same finding also crashes it (extra.verification
    is a string); now returns the first non-empty assurance dict, else None.
  - list_findings._row (mcp_tools.py) — guards extra/verification/fuzz as dicts
    (the MCP findings list 500s on the same finding via ver.get / assurance_of).

Scope: the shared accessors + the UI/MCP findings-read hot paths (the reported
crash and its direct sibling). The remaining subsystem-specific inline nested
digs (fuzz/poc/reachability task paths) keep the same idiom and are tracked as a
follow-up. No schema/DDL change -> no migration.

Verification: validated the fixed code against the REAL project DB (read-only) —
is_verified/assurance_of no longer raise, finding_dict over all 194 findings: 0
crashes, MCP list_findings OK. New unit tests (dig_dict; is_verified/assurance_of
vs string-extra / string-verification) + an endpoint regression
(extra.verification and extra-itself as strings -> 200, not 500); all
revert-checked (fail on the old idiom). Full `just test` fast tier: 1416 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@branover branover left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge-gate review — PR #252 (independent reviewer)

Verdict: APPROVE. A tight, correct, well-tested defensive fix. Both /code-review (high) and /security-review ran on git diff main...HEAD. No blocking issues. One non-blocking follow-up noted below.

What was reviewed

The third in the #250/#251 series. A field finding 500'd GET /api/projects/{id}: top-level evidence was a dict, but evidence.extra.verification was a prose STRING. The (x or {}).get(y) idiom only guards a falsy intermediate; a truthy non-dict crashes ('str' object has no attribute 'get'). The PR adds dig_dict(obj, *keys) and routes the shared findings-read accessors through it.

Correctness — verified

  • dig_dict (jsontypes.py) — correct at every level: guards isinstance(cur, dict) before each .get, returns the leaf as-is, yields None the moment any level (including the root) isn't a dict. Empty-keys returns the object unchanged; all call sites pass >=1 key. Tests cover missing leaf, truthy non-dict at each level, non-dict root, and None.
  • is_verified (findings.py) — semantics preserved: still returns bool, True only when extra.verification.verified is truthy. A non-dict extra/verification now yields False (more conservative; never a false-positive "verified"). No trust-signal regression.
  • assurance_of (assurance.py) — preserves the original "prefer extra.assurance, else extra.verification.assurance" precedence, and now guarantees dict-or-None. The empty-dict fall-through matches the old behavior (old extra.get("assurance") returning {} was falsy and fell through; new isinstance({},dict) and {} is falsy and falls through). The original could leak a truthy non-dict (a string) to callers (compact_assurance/summary_line/merge_assurance all .get it -> would have crashed); narrowing to dict-or-None removes that latent crash. Traced all callers (mcp_tools, findings router, reachability, demo, findings.persist_finding) — every consumer treats the return as dict-or-None.
  • _row isinstance guards (mcp_tools.py) — behavior for well-formed findings is unchanged: a real verification/fuzz is a dict (truthy), so if isinstance(ver, dict) matches the old if ver. The only difference is an empty {} (falsy before, dict now) — emitting {"verified": False, "detail": None} for an empty {} is correct, not harmful. The inner (fz.get("exploitability") or {}).get("rating") is correctly replaced by dig_dict.
  • No circular importassurance.py -> db.jsontypes is a clean leaf import (jsontypes imports only stdlib + sqlalchemy).

Completeness / scope

The GET /api/projects path's only nested-extra read is is_verified (via finding_dict in api/routers/_shared.py); row_to_payload emits the whole coerce_evidence(...) object with no nested .get. So the hot path is fully covered. The PR's decision to scope OUT the poc/fuzz/reachability inline digs is reasonable — they're not on the project-load / list_findings read path.

Non-blocking follow-up (not on the GET hot path; not a regression)

src/hexgraph/api/routers/findings.py:114POST /api/findings/{id}/verify still does extra0 = (f.evidence_json or {}).get("extra") or {} then extra0.get("poc")/extra0.get("fuzz"). A truthy non-dict extra would raise AttributeError before the try/except -> 500. This is an explicit write/action endpoint (analyst clicks "verify"), not the GET read path this PR fixes, and the PR deliberately scopes the poc/fuzz digs out. Suggest routing through the new dig_dict here in a later PR (extra0 = dig_dict(f.evidence_json, "extra") or {}). Filing as discussion, not a blocker.

Security review — no findings

Pure defensive read-path hardening: json only (no pickle/yaml/eval), no subprocess, no file/network I/O, no new endpoint, no secret handling. Loopback / sandbox / opt-in-policy / secret-never-logged invariants untouched. The verified trust signal only becomes more conservative (a malformed verification reads False, never falsely True). No new attack surface.

Tests

Targeted suites re-run green in the worktree:
tests/test_evidence_coerce.py tests/test_evidence_json_column.py tests/test_assurance.py tests/test_mcp.py tests/test_hidden_targets.py -> 93 passed, 1 skipped. New tests are high quality: they assert the exact field shape that crashed (extra.verification as prose), exercise the real GET /api/projects/{id} returning 200 for both a non-dict nested value and a non-dict extra, and check assurance_of recovers a valid extra.assurance even when verification is malformed. Docs/migrations: none required (no model/UI change; dig_dict is an internal helper).

Reviewed at 3f2fa48.

@branover branover merged commit c0ae62e into main Jun 15, 2026
7 checks passed
@branover branover deleted the fix/evidence-nested-nondict branch June 15, 2026 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant