docs(README): replace 'honest open question' with measured 80% real-app precision by cunninghambe · Pull Request #267 · cunninghambe/BugHunter

cunninghambe · 2026-05-14T16:19:34Z

Summary

Real-app precision on spoonworks (measured 2026-05-14 with v0.52 from PR #266) is 4/5 = 80%, roughly a 10× uplift from the 6/77 = 7.8% baseline. The README now leads with this number instead of the synthetic 127/127 framing.

Why

The prior README said "the current FP rate on real apps is the honest open question — not the kind-recall number." That sentence is no longer true: the spoonworks benchmark (docs/benchmarks/BENCHMARK_SPOONWORKS.md) measured precision empirically, and the v0.52 fixes (PRs #265, #266) brought it to 80%.

Trajectory shown in README

| run | clusters | precision |
| --- | --- | --- |
| 2026-05-11 baseline | 77 | 6/77 = 7.8 % |
| 2026-05-14 v0.51 (PR #265) | 25 | ~4/25 = ~16 % |
| 2026-05-14 v0.52 (PR #266) | 5 | 4/5 = 80 % |
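
For anyone re-checking the arithmetic in the trajectory, here is a minimal sketch that recomputes each precision value and the overall uplift from the counts in the table. The names `runs`, `precision`, and `uplift` are hypothetical and not part of BugHunter; the v0.51 count is approximate, as the table itself marks it.

```python
# Hypothetical helper for re-deriving the README trajectory; not BugHunter code.
# Each row: (run label, true-positive clusters, total clusters), copied from the table.
runs = [
    ("2026-05-11 baseline", 6, 77),
    ("2026-05-14 v0.51 (PR #265)", 4, 25),  # approximate, per the "~" in the table
    ("2026-05-14 v0.52 (PR #266)", 4, 5),
]


def precision(true_positives: int, clusters: int) -> float:
    """Fraction of reported clusters that were confirmed as real bugs."""
    if clusters == 0:
        raise ValueError("precision is undefined when no clusters were reported")
    return true_positives / clusters


def uplift(rows) -> float:
    """Ratio of the last run's precision to the first run's precision."""
    first = precision(rows[0][1], rows[0][2])
    last = precision(rows[-1][1], rows[-1][2])
    return last / first


if __name__ == "__main__":
    for label, tp, total in runs:
        print(f"{label}: {tp}/{total} = {precision(tp, total) * 100:.1f} %")
    print(f"uplift: {uplift(runs):.1f}x")
```

The uplift works out to slightly above 10×, which matches the "10×" figure in the summary once rounded.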

The single remaining unexplained cluster (`missing_state_change` on a "Remove row" button) appears to be a detector over-fire — manual code review of `components/admin/inventory/RecipeEditor.tsx` confirms the handler removes the row from state correctly. Tracked as a follow-up.

Also

- Reframed the 127/127 self-test number with an explicit caveat that it's synthetic and the real-app number is load-bearing
- Dropped the v0.50 "open question" paragraph (the question is no longer open)
- No changes to code; this is purely a documentation update reflecting empirical work done in PRs #264 ("chore(v0.51): green CI + commit pending session-recovery, datetime, bugIdentity fixes") through #266

🤖 Generated with Claude Code

…recision Per the spoonworks v0.52 benchmark, real-app precision is now 4/5 = 80 % (measured 2026-05-14, full triage in docs/benchmarks/BENCHMARK_SPOONWORKS.md). Baseline was 6/77 = 7.8 %; the three-step trajectory is now in the README so the empirical numbers tell the story rather than aspirational framing. Also reframes 127/127 self-test number with the explicit caveat that it's synthetic — real-app number is load-bearing. Dropped the v0.50 'open question' paragraph (it's no longer open).

github-actions · 2026-05-14T16:22:56Z

✅ BugHunter Calibration | | 2026-05-14

Overall: tp=0 fp=0 fn=0 precision=1 recall=1 f1=0

BugKind	Precision	Recall	F1	Status
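
The calibration line above reports `precision=1 recall=1 f1=0` with all counts at zero, which can look contradictory. The source does not show how the calibration job computes these values, so the following is only one plausible reading: a sketch of zero-division conventions that happen to reproduce that line. The function name `calibration_metrics` and the chosen defaults are assumptions, not BugHunter's actual implementation.

```python
# Assumed conventions only; the real calibration job may compute these differently.
def calibration_metrics(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, and F1 from raw counts, with explicit empty-case defaults."""
    # Assumption: nothing reported means nothing reported wrongly, so precision is 1.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    # Assumption: nothing to find means nothing missed, so recall is 1.
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    # F1 from counts: 2*tp / (2*tp + fp + fn); assumed to fall back to 0 when empty.
    denominator = 2 * tp + fp + fn
    f1 = (2 * tp / denominator) if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(calibration_metrics(0, 0, 0))
```

Under these assumptions an empty run yields precision 1, recall 1, and F1 0, so the "Overall" line would mean that the calibration run had no labelled cases to score rather than a perfect result.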

cunninghambe merged commit 22bb333 into main May 14, 2026
2 checks passed

cunninghambe merged 1 commit into main from docs/v0.52-real-app-precision-readme
