diff --git a/docs/architecture/codex-handoff-summary.md b/docs/architecture/codex-handoff-summary.md
index 039dc3a..1318b23 100644
--- a/docs/architecture/codex-handoff-summary.md
+++ b/docs/architecture/codex-handoff-summary.md
@@ -18,7 +18,7 @@ compact FastAPI service (~1,100 lines of Python) backed by synthetic data.
 | Policy evaluation | Complete | 10 policy outcomes with hard-block precedence |
 | Adjudication | Complete | Cascading status resolution |
 | Review routing | Complete | Condition-based manual review triggers |
-| Receipt generation | Partial | SHA-256 based, no verification or revocation |
+| Receipt generation | Complete | SHA-256 based, verification and revocation added |
 | Anchoring | Partial | Local deterministic, no external anchor network |
 | Case memory (shadow learning) | Complete | Jaccard similarity, feedback capture |
 | Review queue | Complete | In-memory, enqueue/list |
@@ -32,17 +32,17 @@ compact FastAPI service (~1,100 lines of Python) backed by synthetic data.
 
 Codex explicitly documented these as future work in `trustsignal-flow-refactor-plan.md`:
 
-1. **Receipt verification API** — no endpoint to verify a previously issued receipt
-2. **Receipt revocation lifecycle** — no revocation or status lookup
+1. ~~**Receipt verification API**~~ — **Done**: `POST /api/v1/oracle/receipts/verify`
+2. ~~**Receipt revocation lifecycle**~~ — **Done**: `POST /api/v1/oracle/receipts/revoke`, `GET /api/v1/oracle/receipts/{id}`, in-memory `ReceiptStore`
 3. **Durable storage** — all stores are in-memory (jobs, review queue, case memory, idempotency)
-4. **Structured logging** — no logging instrumentation
+4. ~~**Structured logging**~~ — **Done**: `get_logger` / `log_stage` throughout all pipeline stages
 5. **Metrics and tracing** — no observability hooks
-6. **Error response schema** — errors returned as unstructured HTTP exceptions
+6. ~~**Error response schema**~~ — **Done**: `OracleError` with machine-readable codes, pre-defined error constants
 7. **Retry and resilience** — HTTP connector has no retry logic
 8. **Multi-source registry configuration** — hardcoded to mock_registry
 9. **Signed key management** — receipt signatures use plain SHA-256
 10. **CI/CD pipeline** — no workflow configuration
-11. **Compliance-gap handling** — incomplete coverage not flagged as a distinct risk
+11. ~~**Compliance-gap handling**~~ — **Done**: `compliance_gap` risk flag now emits a distinct reason in review routing
 12. **External anchor integration** — stub only
 
 ### Codex's Intended Direction
diff --git a/docs/architecture/enterprise-gap-analysis.md b/docs/architecture/enterprise-gap-analysis.md
index 46651d7..8fea9dd 100644
--- a/docs/architecture/enterprise-gap-analysis.md
+++ b/docs/architecture/enterprise-gap-analysis.md
@@ -7,48 +7,47 @@ categorized by domain.
 
 ## Gaps
 
-### 1. Observability — Critical
+### 1. Observability — Partially Resolved
 
-| Gap | Description |
-|---|---|
-| No structured logging | The oracle pipeline produces no log output. There is no way to trace a request through the pipeline, diagnose failures, or audit operational behavior. |
-| No metrics collection | No counters, histograms, or gauges for request volume, latency, error rates, or risk band distributions. |
-| No distributed tracing | No correlation IDs or trace spans for cross-service observability. |
+| Gap | Description | Status |
+|---|---|---|
+| No structured logging | The oracle pipeline produces no log output. There is no way to trace a request through the pipeline, diagnose failures, or audit operational behavior. | **Resolved** — `get_logger` / `log_stage` added to all pipeline stages. |
+| No metrics collection | No counters, histograms, or gauges for request volume, latency, error rates, or risk band distributions. | Open |
+| No distributed tracing | No correlation IDs or trace spans for cross-service observability. | Open |
 
-**Impact:** An enterprise deployment without logging or metrics is opaque and
-unauditable. This is the single highest-priority gap.
+**Impact:** Structured logging now covers the full pipeline lifecycle. Metrics and
+distributed tracing remain open for a future instrumentation pass.
 
-### 2. Receipt Lifecycle — Critical
+### 2. Receipt Lifecycle — Resolved
 
-| Gap | Description |
-|---|---|
-| No receipt verification | Receipts are generated but cannot be independently verified. The collect → receipt → **verify** → review flow is incomplete. |
-| No receipt revocation | There is no mechanism to revoke a receipt, mark it as superseded, or look up receipt status. |
-| No receipt store | Receipts are embedded in oracle decisions but not independently addressable. |
+| Gap | Description | Status |
+|---|---|---|
+| No receipt verification | Receipts are generated but cannot be independently verified. | **Resolved** — `POST /api/v1/oracle/receipts/verify` endpoint added. |
+| No receipt revocation | There is no mechanism to revoke a receipt, mark it as superseded, or look up receipt status. | **Resolved** — `POST /api/v1/oracle/receipts/revoke` and `GET /api/v1/oracle/receipts/{id}` added. |
+| No receipt store | Receipts are embedded in oracle decisions but not independently addressable. | **Resolved** — in-memory `ReceiptStore` with full lifecycle support. |
 
-**Impact:** Without verification and revocation, receipts are write-only artifacts
-with no operational lifecycle.
+**Impact:** The collect → receipt → verify → review lifecycle is now complete for
+simulation scope.
 
-### 3. Error Handling — High
+### 3. Error Handling — Resolved
 
-| Gap | Description |
-|---|---|
-| Unstructured errors | API errors are raised as bare `HTTPException` with string details. No consistent error schema. |
-| No error classification | No distinction between client errors, pipeline failures, and transient errors. |
-| Silent pipeline failures | Exceptions inside the oracle pipeline may propagate as 500s without context. |
+| Gap | Description | Status |
+|---|---|---|
+| Unstructured errors | API errors are raised as bare `HTTPException` with string details. No consistent error schema. | **Resolved** — `OracleError` class with machine-readable codes, messages, and retryability flags. |
+| No error classification | No distinction between client errors, pipeline failures, and transient errors. | **Resolved** — pre-defined error constants (`RECEIPT_NOT_FOUND`, `TENANT_MISMATCH`, etc.). |
+| Silent pipeline failures | Exceptions inside the oracle pipeline may propagate as 500s without context. | **Resolved** — `log_stage` captures and logs pipeline exceptions with full context. |
 
-**Impact:** Operators and integrators cannot programmatically handle errors or
-distinguish retriable from terminal failures.
+**Impact:** Structured error responses are now consistent across all API endpoints.
 
-### 4. Resilience — High
+### 4. Resilience — Partially Resolved
 
-| Gap | Description |
-|---|---|
-| No retry logic | The HTTP connector fails immediately on timeout or error. No retry, backoff, or circuit-breaker patterns. |
-| No compliance-gap signaling | When source coverage is incomplete, this is handled generically rather than as a distinct compliance gap. |
+| Gap | Description | Status |
+|---|---|---|
+| No retry logic | The HTTP connector fails immediately on timeout or error. No retry, backoff, or circuit-breaker patterns. | Open |
+| No compliance-gap signaling | When source coverage is incomplete, this is handled generically rather than as a distinct compliance gap. | **Resolved** — `compliance_gap` risk flag now emits a distinct reason in review routing. |
 
-**Impact:** A single transient source failure causes degraded decisions with no
-recovery path.
+**Impact:** Compliance-gap signals are now correctly distinguished from partial
+source coverage. HTTP connector retry logic remains open.
 
 ### 5. Storage — Medium
 
@@ -82,12 +81,12 @@ deployment.
 
 ## Priority Order for Enterprise Hardening
 
-1. Structured logging throughout the pipeline
-2. Receipt verification and revocation endpoints
-3. Structured error response model
+1. ~~Structured logging throughout the pipeline~~ — **Done**
+2. ~~Receipt verification and revocation endpoints~~ — **Done**
+3. ~~Structured error response model~~ — **Done**
 4. Retry wrapper for connectors
-5. Compliance-gap handling in the risk pipeline
-6. Receipt store for independent addressability
+5. ~~Compliance-gap handling in the risk pipeline~~ — **Done**
+6. ~~Receipt store for independent addressability~~ — **Done**
 7. Metrics hooks (counters, histograms)
 8. Storage interface abstractions
 9. HMAC receipt signatures
diff --git a/src/trustagents/oracle/stages/review.py b/src/trustagents/oracle/stages/review.py
index 23d6759..3498b02 100644
--- a/src/trustagents/oracle/stages/review.py
+++ b/src/trustagents/oracle/stages/review.py
@@ -26,7 +26,10 @@ def route_review(
     if "identity_ambiguity" in risk_flags or "near_match_signal" in risk_flags:
         reasons.append("Ambiguity or near-match evidence present")
         needs_manual = True
-    if not source_results_complete or "compliance_gap" in risk_flags:
+    if "compliance_gap" in risk_flags:
+        reasons.append("Compliance gap: no registry source returned a successful result")
+        needs_manual = True
+    elif not source_results_complete:
         reasons.append("Registry coverage incomplete")
         needs_manual = True
     if extraction_confidence < 0.7:
diff --git a/tests/unit/test_observability_and_compliance_gap.py b/tests/unit/test_observability_and_compliance_gap.py
index 9813138..5b4f793 100644
--- a/tests/unit/test_observability_and_compliance_gap.py
+++ b/tests/unit/test_observability_and_compliance_gap.py
@@ -4,7 +4,8 @@
 import logging
 
 from trustagents.observability import get_logger, log_stage
-from trustagents.oracle.models import RetrievalStatus, SourceResult
+from trustagents.oracle.models import FraudRiskBand, RetrievalStatus, SourceResult
+from trustagents.oracle.stages.review import route_review
 from trustagents.risk.core import generate_risk_flags
 
 
@@ -58,3 +59,33 @@ def test_no_compliance_gap_flag_when_source_succeeds():
     ]
     flags = generate_risk_flags([], sources)
     assert "compliance_gap" not in flags
+
+
+def test_route_review_compliance_gap_reason_is_distinct():
+    """compliance_gap risk flag must produce a distinct reason, not the generic coverage message."""
+    _, _, needs_manual, reasons = route_review(
+        band=FraudRiskBand.LOW,
+        risk_flags=["compliance_gap"],
+        policy_results=[],
+        extraction_confidence=1.0,
+        source_results_complete=False,
+        conflicting_sources=False,
+    )
+    assert needs_manual is True
+    assert any("Compliance gap" in r for r in reasons)
+    assert not any(r == "Registry coverage incomplete" for r in reasons)
+
+
+def test_route_review_incomplete_coverage_reason_without_compliance_gap():
+    """Incomplete source coverage without the compliance_gap flag uses the generic message."""
+    _, _, needs_manual, reasons = route_review(
+        band=FraudRiskBand.LOW,
+        risk_flags=[],
+        policy_results=[],
+        extraction_confidence=1.0,
+        source_results_complete=False,
+        conflicting_sources=False,
+    )
+    assert needs_manual is True
+    assert any(r == "Registry coverage incomplete" for r in reasons)
+    assert not any("Compliance gap" in r for r in reasons)
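
Reviewer note: the behavioral change in `route_review` is the precedence of the two coverage checks. The sketch below isolates that branch so the precedence can be read and exercised without the rest of the pipeline. `coverage_reasons` is a hypothetical helper written for illustration only; it is not part of `trustagents`, and the real `route_review` takes more parameters and returns a four-element tuple. Only the two reason strings and the `if`/`elif` ordering are taken from the diff.

```python
from typing import List, Tuple

# Reason strings copied from the review.py hunk in this diff.
COMPLIANCE_GAP_REASON = "Compliance gap: no registry source returned a successful result"
INCOMPLETE_COVERAGE_REASON = "Registry coverage incomplete"


def coverage_reasons(
    risk_flags: List[str], source_results_complete: bool
) -> Tuple[bool, List[str]]:
    """Hypothetical stand-in for the coverage branch of route_review.

    The compliance_gap flag is checked first, so it wins over the generic
    incomplete-coverage message and the two reasons never appear together.
    """
    reasons: List[str] = []
    needs_manual = False
    if "compliance_gap" in risk_flags:
        reasons.append(COMPLIANCE_GAP_REASON)
        needs_manual = True
    elif not source_results_complete:
        reasons.append(INCOMPLETE_COVERAGE_REASON)
        needs_manual = True
    return needs_manual, reasons


if __name__ == "__main__":
    # Flag present: distinct reason, even though coverage is also incomplete.
    print(coverage_reasons(["compliance_gap"], source_results_complete=False))
    # Flag absent, coverage incomplete: generic reason.
    print(coverage_reasons([], source_results_complete=False))
    # Flag absent, coverage complete: no manual review from this branch.
    print(coverage_reasons([], source_results_complete=True))
```

One consequence of the `elif` worth confirming with the author: when `compliance_gap` is set and `source_results_complete` is also `False`, only the compliance-gap reason is recorded, which is what the first new test asserts.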