corpus: KO collection wave (measure-only, G007) by devswha · Pull Request #491 · devswha/patina

devswha · 2026-06-14T11:59:06Z

Summary

Wave 1 of the approved corpus-expansion plan (.gjc/plans/ralplan/2026-06-03-1359-33c4). Measure-only: no detector threshold change, no src/features change.

What's in the manifest

artifacts/rebaseline-2025/manifest.ko.scored.public.jsonl — 380 hash-only rows:

250 natural-human controls (5 registers × 50, reused)
120 ai-like positives across 3 model families (gpt 40 / claude 40 / gemini 40)
5 lightly-edited-ai + 5 heavily-edited-ai (one light + one heavy per register)
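The composition listed above can be checked mechanically. A minimal sketch, assuming each JSONL row carries a `label` field holding one of the four class names used in this description; the field name and label strings are assumptions for illustration, not the manifest's confirmed schema.

```python
import json
from collections import Counter

# Expected composition, taken from the PR description.
EXPECTED = {
    "natural-human": 250,
    "ai-like": 120,
    "lightly-edited-ai": 5,
    "heavily-edited-ai": 5,
}

def tally_labels(jsonl_lines):
    """Count rows per label; `label` is a hypothetical field name."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if line:  # skip blank lines
            counts[json.loads(line)["label"]] += 1
    return counts

def composition_matches(jsonl_lines):
    """True when the per-label counts equal EXPECTED exactly."""
    return dict(tally_labels(jsonl_lines)) == EXPECTED
```

Passing an open file handle for artifacts/rebaseline-2025/manifest.ko.scored.public.jsonl to `composition_matches` would run the check against the committed manifest, provided the real rows use a compatible field.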

Raw text stays in the gitignored private workspace; only hashes/metadata/scores are committed. .gitignore allowlists the new hash-only manifest.
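To illustrate what a hash-only row might look like, here is a minimal sketch. The PR does not say which hash function or which fields the manifest uses, so SHA-256 and every field name below are assumptions, and the sample text, register name, and score are invented.

```python
import hashlib
import json

def to_public_row(text, label, register, score):
    """Build a committable row: a digest of the text plus metadata, never the text itself."""
    # SHA-256 is an assumption; the manifest's actual hash function is not stated.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {"sha256": digest, "label": label, "register": register, "score": score}

# Invented sample values, for illustration only.
row = to_public_row("예시 문장입니다.", "natural-human", "register-1", 12.5)
print(json.dumps(row, ensure_ascii=False))
```

The raw string never appears in the returned dict, so a file of such rows can be committed while the text stays in the private workspace.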

Findings (docs/benchmarks/)

rebaseline-ko-latest.{md,json}: accuracy 75.0%, recall 59.2%, FP rate 16.8%; catch rate by family gpt 50% / claude 62.5% / gemini 67.5%. Public claim gate stays BLOCKED (per-family n<100 is an explicit measure-only Non-Goal).
rebaseline-low-fpr-ko-latest.{md,json}: B4 TPR@1%/5%FPR for ko and ko × register. Overall TPR at 5% FPR is 0.0%: high-scoring human controls block low-FPR operation. This is the honest "corpus is hard" outcome, and it motivates a future, separately approved calibration delta.
rebaseline-audit-ko-latest.md: operator audit of perfect/boundary samples; 0 mislabeled, 0 too-easy.
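The headline rates and the 0.0% low-FPR result can be sanity-checked with a short sketch. The first part reconciles the reported percentages with the manifest sizes; it assumes the 10 edited-ai rows count as positives and that 5 of them are caught, which is an inference that reproduces the 59.2% recall, not something the PR states. The second part shows one common way to compute TPR at a fixed FPR and why a few maximally scored human controls can drive it to zero; the function name and the toy scores are hypothetical, and the benchmark's own procedure may differ.

```python
def rates(tp, fn, fp, tn):
    """Return (accuracy, recall, fp_rate) from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return (tp + tn) / total, tp / (tp + fn), fp / (fp + tn)

# Counts implied by the reported percentages and the manifest sizes.
family_tp = {"gpt": 20, "claude": 25, "gemini": 27}  # 50%, 62.5%, 67.5% of 40
fp = 42                                              # 16.8% of 250 human controls
tn = 250 - fp

# ASSUMPTION (inferred, not stated in the PR): the 10 edited-ai rows count as
# positives and 5 of them are caught.
edited_tp = 5
tp = sum(family_tp.values()) + edited_tp             # 77
fn = (120 + 10) - tp                                 # 53

accuracy, recall, fp_rate = rates(tp, fn, fp, tn)
print(f"accuracy={accuracy:.1%} recall={recall:.1%} fp_rate={fp_rate:.1%}")

def tpr_at_fpr(pos_scores, neg_scores, target_fpr):
    """Best TPR over thresholds whose FPR on the negatives stays <= target_fpr.

    A row is flagged when score >= threshold. If no observed threshold
    qualifies, the only admissible operating point flags nothing, so TPR is 0.0.
    """
    best = 0.0
    for t in sorted(set(pos_scores) | set(neg_scores)):
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        if fpr <= target_fpr:
            best = max(best, sum(s >= t for s in pos_scores) / len(pos_scores))
    return best

# Toy scores, invented for illustration only.
ai = [60, 80, 100, 100]
calm_humans = [10] * 20
spiky_humans = [10] * 18 + [100, 100]  # two controls at the maximum score
print(tpr_at_fpr(ai, calm_humans, 0.05))
print(tpr_at_fpr(ai, spiky_humans, 0.05))
```

In the toy data, two of twenty controls sitting at the top of the score range already exceed a 5% false-positive budget at every threshold that would catch any positive, so the admissible TPR collapses to zero even though the positives score high.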

Verification

npm test: 766/766
npm run benchmark: 100% / ROC-AUC 1.000 / PR-AUC 1.000 (baseline fixtures unchanged)
benchmark:report, benchmark:robustness, check:no-private-assets, lint: all pass

vercel · 2026-06-14T11:59:11Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
patina	Ready	Preview, Comment	Jun 14, 2026 11:59am

devswha merged commit c877a97 into main Jun 14, 2026
8 checks passed

devswha deleted the bot/corpus-ko-wave1 branch June 14, 2026 11:59
