feat(ai): synthetic-defect harness for critic calibration (AI-044) #339
Merged
Injects KNOWN defects (hallucinated facts, banned phrases, wrong length, tone breaks) into clean drafts, runs them through the real AI-041 critic (nano), and measures catch-rate plus the clean-control false-positive rate, validating the calibration that AutoPublishCrew/SeoCrew gate publish on.

- Deterministic injector + scoring → CI-testable with a FAKE critic, no key; the live nano run is admin-triggered via POST /admin/ai-quality/evals/criticdefects/run and persists a criticdefects eval_run. Mirrors ToolCallEvalRunner (no judge, Score=catch-rate, BreakdownJson per-axis).
- Honest gate: Passed = catch-rate >= 0.80 AND false-positive <= 0.20, so a flag-everything critic (FP=1.0) correctly FAILS instead of passing.
- 23 fixtures (factual x6, banned x4, length x4, tone x4, clean x5) on a real edition-description brief; clean controls are neutral and grounded, and length defects breach by a wide margin so an LLM can actually catch them.
- Admin Evals tab: Run critic-defect button → catch%/FP%/n + PASS/FAIL.

15 tests (injector + runner w/ fake critic, fail-closed + gate cases). FP-rate enforced; golden grows later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-044 — Synthetic-defect injection harness (Phase 7)
Validates the AI-041 critic's calibration, the one AutoPublishCrew/SeoCrew gate publish on. Injects KNOWN defects into clean drafts, runs them through the real nano critic, and measures catch-rate plus a clean-control false-positive rate.

How
Mirrors ToolCallEvalRunner: deterministic injection + scoring (pure, no LLM) → CI-testable with a fake critic, no key; the live nano run is admin-triggered (POST /admin/ai-quality/evals/criticdefects/run, ~23 calls, sync) and persists a criticdefects eval_run (no judge, Score=catch-rate, BreakdownJson per-axis + FP). Reuses EvalRun, so no schema change.

Defect taxonomy (23 fixtures on a real edition-description brief)
Fixture counts: factual x6, banned x4, length x4, tone x4, clean x5.
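As an illustration of what a deterministic defect injector can look like, here is a minimal TypeScript sketch covering two of the axes (banned phrase and length). It is not the harness's .NET code: the function names, the banned phrase, and the 3x margin are assumptions made for the example.

```typescript
// Hypothetical names throughout; the real injector lives in the .NET harness.
type DefectKind = "factual" | "banned" | "length" | "tone";

interface InjectedDraft {
  kind: DefectKind;
  text: string;
}

// Assumed banned phrase, for illustration only.
const BANNED_PHRASE = "a must-read masterpiece";

// Appends the banned phrase. No randomness, so the same input
// always produces the same defective draft.
function injectBannedPhrase(clean: string): InjectedDraft {
  return { kind: "banned", text: clean + " Truly " + BANNED_PHRASE + "." };
}

// Breaches an assumed word limit by a wide margin (3x) by cycling
// through the clean draft's own words until the target is reached.
function injectLengthBreach(clean: string, maxWords: number): InjectedDraft {
  const words = clean.trim().split(/\s+/);
  const target = maxWords * 3;
  const out: string[] = [];
  for (let i = 0; out.length < target; i++) {
    out.push(words[i % words.length] ?? "");
  }
  return { kind: "length", text: out.join(" ") };
}

// Helper used to check that a length breach is real.
function wordCount(text: string): number {
  return text.trim().split(/\s+/).length;
}
```

Because both transforms are pure functions of their input, a test can assert on the exact output without calling any model.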
ParseFailed (fail-closed verdict) counts as a correct reject for any defect.

Honest gate (hardened per adversarial QA)
Passed = catchRate ≥ 0.80 AND falsePositiveRate ≤ 0.20. The original catchRate-only gate let a flag-everything critic (FP=1.0) report success; the runner's own test now proves it correctly fails. A useless critic can't masquerade as calibrated.

Admin UI
New "Run critic-defect eval" button on the AI-quality Evals tab → catch% / FP% / n + PASS/FAIL badge; result persists into the eval history.
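The numbers behind the badge (catch %, FP %, n, PASS/FAIL) follow the gate rule Passed = catchRate ≥ 0.80 AND falsePositiveRate ≤ 0.20. A minimal TypeScript sketch of that scoring is shown here. The type and function names are hypothetical, and two details are assumptions rather than facts from the runner: a ParseFailed verdict on a clean control is counted as a false positive, and an empty defect or control set fails closed.

```typescript
// Hypothetical types; the real runner is .NET and persists an EvalRun.
type FixtureKind = "factual" | "banned" | "length" | "tone" | "clean";
type Verdict = "Approve" | "Reject" | "ParseFailed";

interface Outcome {
  kind: FixtureKind;
  verdict: Verdict;
}

interface GateResult {
  catchRate: number;
  falsePositiveRate: number;
  n: number;
  passed: boolean;
}

const MIN_CATCH_RATE = 0.8;
const MAX_FALSE_POSITIVE_RATE = 0.2;

// Anything other than an explicit approval counts as "flagged",
// so ParseFailed is treated as a reject (fail-closed).
function isFlagged(verdict: Verdict): boolean {
  return verdict !== "Approve";
}

function scoreRun(outcomes: Outcome[]): GateResult {
  const defects = outcomes.filter((o) => o.kind !== "clean");
  const controls = outcomes.filter((o) => o.kind === "clean");

  const caught = defects.filter((o) => isFlagged(o.verdict)).length;
  const falsePositives = controls.filter((o) => isFlagged(o.verdict)).length;

  // Assumption: with no defects there is nothing to catch (rate 0), and
  // with no controls the FP rate is taken as 1, so the gate cannot pass.
  const catchRate = defects.length === 0 ? 0 : caught / defects.length;
  const falsePositiveRate =
    controls.length === 0 ? 1 : falsePositives / controls.length;

  return {
    catchRate,
    falsePositiveRate,
    n: outcomes.length,
    // Both conditions must hold; catch-rate alone is not enough.
    passed:
      catchRate >= MIN_CATCH_RATE &&
      falsePositiveRate <= MAX_FALSE_POSITIVE_RATE,
  };
}
```

With 18 defects and 5 clean controls, a critic that rejects everything scores catchRate 1.0 and falsePositiveRate 1.0, so the second condition fails and the run is reported as FAIL.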
Tests — 15 (full AiEvals suite 41 pass / 5 live-key skip)
Pure injector transforms (breaches are real, deterministic) + runner scoring with fake critics: catches-all → fails (FP guard), catches-none → 0.0, good → pass, FP-just-over → fail, garbage → ParseFailed-caught. StudyBuddy set-equality green; no ITool leaked.

Verify
- dotnet test tests/TextStack.AiEvals → 41 pass / 5 skip (deterministic half runs with no key)
- dotnet test tests/TextStack.UnitTests → 402 pass
- dotnet format --verify-no-changes → clean
- pnpm -C apps/admin exec tsc --noEmit + build → clean

Note: FP-rate enforced now; golden set grows later (per RAG/StudyBuddy golden TODOs). Admin button is build-verified; live click is owner-triggered (needs prod key + admin session).
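The deterministic half that runs with no key depends on fake critics and a fail-closed verdict parser. A minimal TypeScript sketch of that idea follows. The JSON verdict format, `parseVerdict`, and the fake critic names are hypothetical and not taken from the runner, and the critics are synchronous here only to keep the example short (a live critic call would be asynchronous).

```typescript
type Verdict = "Approve" | "Reject" | "ParseFailed";

// A critic takes a draft and returns its raw, unparsed response.
type Critic = (draft: string) => string;

// Assumed wire format: {"verdict": "approve" | "reject"}.
// Anything that does not match falls through to ParseFailed (fail-closed).
function parseVerdict(raw: string): Verdict {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && "verdict" in parsed) {
      const value = (parsed as { verdict: unknown }).verdict;
      if (value === "approve") return "Approve";
      if (value === "reject") return "Reject";
    }
  } catch {
    // Malformed JSON: handled by the fail-closed return below.
  }
  return "ParseFailed";
}

// Fake critics for key-less tests.
const garbageCritic: Critic = () => "<<not json>>";
const approveAllCritic: Critic = () => JSON.stringify({ verdict: "approve" });
const rejectAllCritic: Critic = () => JSON.stringify({ verdict: "reject" });

// Fraction of defective drafts the critic did not approve.
// ParseFailed counts as a catch, matching the fail-closed rule.
function catchRate(critic: Critic, defectDrafts: string[]): number {
  if (defectDrafts.length === 0) return 0;
  let caught = 0;
  for (const draft of defectDrafts) {
    if (parseVerdict(critic(draft)) !== "Approve") caught++;
  }
  return caught / defectDrafts.length;
}
```

Under these assumptions a critic that returns garbage scores a catch-rate of 1.0 on defects, which is why the clean-control false-positive check is needed alongside it.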
🤖 Generated with Claude Code