corpus: EN collection wave (measure-only, G008) by devswha · Pull Request #492 · devswha/patina

devswha · 2026-06-14T12:23:04Z

Summary

Wave 2 of the approved corpus-expansion plan. Measure-only: no detector threshold change, no src/features change.

Manifest

artifacts/rebaseline-2025/manifest.en.scored.public.jsonl — 330 hash-only rows:

200 natural-human from HAP-E (browndw/human-ai-parallel-corpus, MIT), balanced academic-summary / blog
120 ai-like across 3 families (gpt 40 / claude 40 / gemini 40)
5 lightly-edited-ai + 5 heavily-edited-ai (one light + one heavy per register)
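The composition above (200 + 120 + 5 + 5 = 330) can be sanity-checked against the manifest with a short tally. This is only a sketch: the `label` field name and the exact label strings are assumptions for illustration, and the real manifest schema may differ.

```javascript
// Expected per-label counts, taken from the manifest description above.
// ASSUMPTION: each JSONL row carries a `label` field with these exact strings.
const EXPECTED_COUNTS = {
  "natural-human": 200,
  "ai-like": 120,
  "lightly-edited-ai": 5,
  "heavily-edited-ai": 5,
};

// Count rows per label from JSONL text (one JSON object per line).
function tallyLabels(jsonlText) {
  const counts = {};
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank or trailing lines
    const row = JSON.parse(line);
    counts[row.label] = (counts[row.label] || 0) + 1;
  }
  return counts;
}

// Return human-readable mismatches; an empty array means the tally matches.
function checkComposition(counts, expected = EXPECTED_COUNTS) {
  const problems = [];
  const labels = new Set([...Object.keys(expected), ...Object.keys(counts)]);
  for (const label of labels) {
    const want = expected[label] || 0;
    const got = counts[label] || 0;
    if (want !== got) problems.push(`${label}: expected ${want}, got ${got}`);
  }
  return problems;
}
```

In practice the JSONL text would come from reading the manifest file; an empty result from `checkComposition(tallyLabels(text))` means the row counts match the description.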

Raw text stays gitignored; only hashes/metadata/scores committed.
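A minimal sketch of what a hash-only row could look like, assuming SHA-256 over the UTF-8 text. The PR does not state the hash algorithm or the row schema, so `sha256`, `label`, `register`, and `score` are hypothetical field names.

```javascript
const { createHash } = require("node:crypto");

// Build a committable row from a private sample.
// ASSUMPTION: SHA-256 and these field names are illustrative, not the repo's schema.
function toHashOnlyRow(sample) {
  const { text, label, register, score } = sample;
  const sha256 = createHash("sha256").update(text, "utf8").digest("hex");
  // Only the digest and metadata are returned; the raw text is dropped.
  return { sha256, label, register, score };
}

// Guard usable in a pre-commit style check: a public row must not carry raw text.
function hasRawText(row) {
  return Object.prototype.hasOwnProperty.call(row, "text");
}
```

Picking fields explicitly, rather than spreading the whole sample into the row, keeps a stray `text` property from leaking into the committed file.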

Findings

rebaseline-en-latest: accuracy 85.8%, recall 86.9%, FP 15.0% (EN detection is notably stronger than KO's 75%/59.2%).
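As a worked check, the three headline figures are consistent with a single confusion matrix. The counts below are inferred from the reported percentages, not stated in the PR, and assume the 200 natural-human rows are the negatives while the remaining 130 rows (ai-like plus edited) are the positives.

```javascript
// INFERRED counts (an assumption, not from the report):
// 130 positives = 113 detected + 17 missed; 200 negatives = 30 flagged + 170 cleared.
const inferredCounts = { tp: 113, fn: 17, fp: 30, tn: 170 };

// Standard definitions of accuracy, recall, and false-positive rate.
function summarize({ tp, fn, fp, tn }) {
  const total = tp + fn + fp + tn;
  return {
    accuracy: (tp + tn) / total, // 283 / 330
    recall: tp / (tp + fn),      // 113 / 130
    fpRate: fp / (fp + tn),      // 30 / 200
  };
}

// Format a fraction as a percentage with one decimal place.
const pct = (x) => (100 * x).toFixed(1) + "%";

const metrics = summarize(inferredCounts);
console.log(pct(metrics.accuracy), pct(metrics.recall), pct(metrics.fpRate));
// prints: 85.8% 86.9% 15.0%
```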
rebaseline-low-fpr-en-latest: TPR at 1% and 5% FPR for en and en×register. academic-summary and blog are supported (blog TPR 88.5% at 5% FPR); product-doc, chat-update, and technical-how-to honestly report no_negatives because HAP-E maps to only 2 registers. Overall TPR at low FPR still collapses to 0% (high-scoring human controls).
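To illustrate why a handful of high-scoring human controls can drive TPR to 0% at a low FPR, and why a register with no human controls can only report no_negatives, here is a sketch of TPR at a fixed FPR. It is not the repo's implementation: the function name, the "flag when score > threshold" rule, and the `no_positives` status are assumptions.

```javascript
// Compute TPR at a target FPR by calibrating the threshold on negatives only.
// ASSUMPTION: a sample is flagged when its score is strictly above the threshold.
function tprAtFpr(positiveScores, negativeScores, targetFpr) {
  // Without human controls there is nothing to calibrate the threshold on.
  if (negativeScores.length === 0) return { status: "no_negatives" };
  if (positiveScores.length === 0) return { status: "no_positives" }; // hypothetical status

  const sorted = [...negativeScores].sort((a, b) => b - a); // descending
  // Number of negatives allowed above the threshold (epsilon guards float error).
  const allowed = Math.floor(targetFpr * sorted.length + 1e-9);
  // Using the (allowed + 1)-th highest negative score as the threshold means
  // at most `allowed` negatives score strictly above it.
  const threshold = sorted[Math.min(allowed, sorted.length - 1)];

  const falsePositives = negativeScores.filter((s) => s > threshold).length;
  const truePositives = positiveScores.filter((s) => s > threshold).length;
  return {
    status: "supported",
    threshold,
    fpr: falsePositives / negativeScores.length,
    tpr: truePositives / positiveScores.length,
  };
}
```

With 100 negatives and a 1% target, one negative may sit above the threshold. If two human controls already score at the maximum, the threshold lands at the maximum and no positive can exceed it, so TPR is 0 regardless of how well the positives score.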
rebaseline-audit-en-latest: operator audit; 0 mislabeled, 0 too-easy. Heavy human edits evade detection in 4/5 registers.

Verification

npm test 766/766
npm run benchmark 100% / ROC-AUC 1.000 / PR-AUC 1.000
benchmark:report, benchmark:robustness, check:no-private-assets, lint all pass

Wave 2 of the approved corpus-expansion plan. Measure-only: no detector threshold change, no src/features change.

Manifest artifacts/rebaseline-2025/manifest.en.scored.public.jsonl (330 rows, hash-only): 200 natural-human controls from HAP-E (browndw/human-ai-parallel-corpus, MIT; balanced academic-summary/blog) + 120 ai-like positives across 3 model families (gpt 40 / claude 40 / gemini 40) + 5 lightly-edited-ai + 5 heavily-edited-ai (one light + one heavy per register). Raw text stays in the gitignored private workspace; only hashes/metadata/scores are committed.

Reports (docs/benchmarks/):
- rebaseline-en-latest.{md,json}: accuracy 85.8%, recall 86.9%, FP 15.0% (EN detection notably stronger than KO).
- rebaseline-low-fpr-en-latest.{md,json}: B4 TPR@1%/5%FPR for en and en x register. academic-summary/blog supported (blog TPR 88.5% at 5% FPR); other registers honestly report no_negatives (HAP-E maps only 2 registers). Overall TPR at low FPR still collapses to 0% (high-scoring human controls).
- rebaseline-audit-en-latest.md: operator audit; 0 mislabeled, 0 too-easy. Heavy human edits evade detection in 4/5 registers.

Verify: npm test 766/766; npm run benchmark 100% / ROC-AUC 1.000 / PR-AUC 1.000; benchmark:report, benchmark:robustness, check:no-private-assets, lint all pass.

vercel · 2026-06-14T12:23:06Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
patina	Ready	Preview, Comment	Jun 14, 2026 12:23pm

vercel Bot deployed to Preview June 14, 2026 12:23 View deployment

devswha merged commit 986eee6 into main Jun 14, 2026
8 checks passed

devswha deleted the bot/corpus-en-wave2 branch June 14, 2026 12:23

devswha merged 1 commit into main from bot/corpus-en-wave2
