calibrate: lexicon density_threshold 2.0 -> 3.0 (cut detector false-positives) by devswha · Pull Request #495 · devswha/patina

devswha · 2026-06-14T14:09:09Z

Summary

Approved calibration delta (ralplan run 2026-06-03-1359-33c4, Critic APPROVE). This is the deferred "separate approved-delta PR" from the corpus-expansion effort. The change is measured and fixtures-first: src/features stays deterministic, and burstiness/MATTR/ko-diagnostics are unchanged (burstiness-driven FPs are deferred to a separate delta).

Change

DEFAULT_LEXICON_DENSITY_THRESHOLD 2.0 → 3.0 (src/features/lexicon-core.js), mirrored in .patina.default.yaml, SKILL.md, core/stylometry.md (doc-sync parity tests enforce it). Single constant; minMatches unchanged.
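
To make the role of the constant concrete, here is a minimal sketch of how a density threshold combined with a minimum match count could gate a lexicon signal. It is not the code in src/features/lexicon-core.js. The function name `lexiconSignalFires`, the per-100-tokens density unit, the `>=` comparison, and the placeholder value for minMatches are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT the implementation in src/features/lexicon-core.js.
// Assumption: density is measured as lexicon matches per 100 tokens.
const DEFAULT_LEXICON_DENSITY_THRESHOLD = 3.0; // 2.0 before this PR

// Placeholder only: the PR says minMatches is unchanged but does not state its value.
const ASSUMED_MIN_MATCHES = 2;

function lexiconSignalFires(matchCount, tokenCount, opts = {}) {
  const densityThreshold = opts.densityThreshold ?? DEFAULT_LEXICON_DENSITY_THRESHOLD;
  const minMatches = opts.minMatches ?? ASSUMED_MIN_MATCHES;
  if (tokenCount <= 0) return false; // nothing to measure on empty input

  const density = (matchCount / tokenCount) * 100;
  // Both conditions must hold: enough matches in absolute terms and a high enough density.
  return matchCount >= minMatches && density >= densityThreshold;
}

// 5 matches in 200 tokens is a density of 2.5 per 100 tokens.
console.log(lexiconSignalFires(5, 200, { densityThreshold: 2.0 })); // true under the old threshold
console.log(lexiconSignalFires(5, 200)); // false under the new default
```

Under these assumptions, a text sitting between the old and new thresholds stops triggering the signal, which is the mechanism by which raising the constant would reduce false positives.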

The C1 experiment (a 16-candidate grid over the 49 fixtures plus the private KO/EN corpus, run via analyzeText opts overrides) showed that the lexicon signal is largely a false-positive generator on modern text, and that a density threshold (dT) of 3.0 is the smallest change that delivers a clean win: fewer EN human false positives with recall unchanged.
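
The selection rule described above (take the candidate closest to the current value that still gives a clean win) can be sketched as follows. The densities are toy, synthetic numbers and not data from the PR's corpus. The helper names `score` and `smallestCleanChange`, the single-parameter grid, and the `maxFp` criterion are assumptions; the actual 16-candidate grid and its scoring are not described here.

```javascript
// Toy, synthetic lexicon densities for illustration only (NOT from the PR's corpus).
const humanDensities = [0.4, 1.1, 2.2, 2.6, 0.9, 1.8];
const aiDensities = [3.4, 4.1, 5.0, 3.2];

// Score one candidate threshold: false-positive rate on human text, recall on AI text.
function score(threshold) {
  const fp = humanDensities.filter((d) => d >= threshold).length / humanDensities.length;
  const recall = aiDensities.filter((d) => d >= threshold).length / aiDensities.length;
  return { threshold, fp, recall };
}

// Pick the candidate nearest the baseline that keeps recall and meets the FP ceiling.
function smallestCleanChange(baseline, candidates, maxFp) {
  const base = score(baseline);
  const winners = candidates
    .map(score)
    .filter((s) => s.recall >= base.recall && s.fp <= maxFp)
    .sort((a, b) => Math.abs(a.threshold - baseline) - Math.abs(b.threshold - baseline));
  return winners[0] ?? null; // null when no candidate qualifies
}

const pick = smallestCleanChange(2.0, [2.0, 2.5, 3.0, 3.5, 4.0], 0);
console.log(pick); // { threshold: 3, fp: 0, recall: 1 } on this toy data
```

On the toy data, 2.5 still flags one human sample and 3.5 starts losing AI samples, so 3.0 is the nearest candidate that meets both conditions. That outcome is a property of the made-up numbers, chosen only to show the rule at work.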

Effect (re-scored manifests)

Metric	Before	After	Recall
EN human FP	15.0% (30/200)	5.0% (10/200)	86.9% (unchanged)
KO human FP	16.8%*	14.0% (35/250)	59.2% (unchanged)

*The KO 16.8% figure came from the stale 2026-05-22 analyzer; re-scoring with the current analyzer corrects it to 14.0%. The threshold change does not move KO: KO FPs are burstiness-driven and deferred.
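
As a quick check of the arithmetic, the percentages in the table follow directly from the flagged/total counts given in parentheses. The sketch below recomputes them; the variable and function names are illustrative only, and the stale 16.8% figure is left out because no counts are given for it.

```javascript
// Counts copied from the Effect table: [flagged human documents, total human documents].
const counts = {
  enBefore: [30, 200],
  enAfter: [10, 200],
  koAfter: [35, 250],
};

// False-positive rate as a percentage of human documents flagged.
function fpRatePercent(flagged, total) {
  return (flagged * 100) / total;
}

const rates = Object.fromEntries(
  Object.entries(counts).map(([name, [flagged, total]]) => [name, fpRatePercent(flagged, total)])
);

// Absolute drop in the EN false-positive rate, in percentage points.
const enDropPoints = rates.enBefore - rates.enAfter;

for (const [name, rate] of Object.entries(rates)) {
  console.log(`${name}: ${rate.toFixed(1)}%`); // 15.0%, 5.0%, 14.0%
}
console.log(`EN drop: ${enDropPoints.toFixed(1)} percentage points`); // 10.0
```

In other words, the EN change removes 20 of the 30 previously flagged human documents, a drop of 10 percentage points.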

49 suspect-zone fixtures stay at 100% accuracy, with ROC-AUC and PR-AUC at 1.000.
Refreshed rebaseline-{ko,en}-latest, rebaseline-low-fpr-{ko,en}-latest, and the audit notes.
0 raw text rows committed; check:no-private-assets OK.

Verification

npm test 766/766 · npm run benchmark 100% / AUC 1.000 · npm run lint clean · npm run check:no-private-assets clean · npm run release:check 4.3.0 sync intact.

… false-positives

Approved calibration delta (ralplan run 2026-06-03-1359-33c4, Critic APPROVE). Measured, fixtures-first: src/features stays deterministic; burstiness/MATTR/ko-diagnostics UNCHANGED (burstiness-FP is a deferred separate delta).

Lever: DEFAULT_LEXICON_DENSITY_THRESHOLD 2.0 -> 3.0 (src/features/lexicon-core.js), mirrored in .patina.default.yaml, SKILL.md, core/stylometry.md (doc-sync tests). The lexicon signal was largely a false-positive generator on modern text; the C1 experiment (16-candidate grid over the 49 fixtures + private KO/EN corpus via analyzeText opts) showed dT=3.0 is the smallest change delivering the clean win.

Effect (re-scored manifests, current analyzer + dT=3.0):
- EN human FP 15.0% (30/200) -> 5.0% (10/200), AI recall unchanged at 86.9%.
- KO human FP 14.0% (35/250), recall unchanged at 59.2% (lexicon does not move KO; KO FPs are burstiness-driven -> deferred. The prior 16.8% was the stale 2026-05-22 analyzer; re-scoring corrects it).
- 49 suspect-zone fixtures stay 100% accuracy / ROC-AUC + PR-AUC 1.000.

Refreshed docs/benchmarks/rebaseline-{ko,en}-latest, rebaseline-low-fpr-{ko,en}-latest, and audit notes. No raw text committed (0 text rows; check:no-private-assets OK).

Verify: npm test 766/766; npm run benchmark 100% / AUC 1.000; lint; release:check 4.3.0.

vercel · 2026-06-14T14:09:11Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
patina	Ready	Preview, Comment	Jun 14, 2026 2:09pm

vercel Bot deployed to Preview June 14, 2026 14:09 View deployment

devswha merged commit 4d32751 into main Jun 14, 2026
8 checks passed

devswha deleted the bot/calibration-lexicon-fp branch June 14, 2026 14:09

devswha merged 1 commit into main from bot/calibration-lexicon-fp
