fix: kill SEO-on-admin + new-window/download false positives (real-app precision 13.6%→~67%) by cunninghambe · Pull Request #271 · cunninghambe/BugHunter

cunninghambe · 2026-06-03T13:54:29Z

Why

The README claimed 4/4 = 100% real-app precision on the spoonworks benchmark. An audit showed that number was a read-only artifact: that run was invoked with --read-only, which skipped 2738/2932 tests (every mutating one), leaving only the static dependency-audit detector to fire. The "32 infra failures" attributed to a broken browser surface were all MutatingActionRejectedError: read-only mode — the browser surface was never broken.

The honest full-surface measurement (authed as owner + anon, mutating, mcp-http, 3676 tests, all detectors incl. a11y/SEO) plus per-cluster adversarial triage (skeptic → independent refuter, 0 flips, 0 uncertain) gives:

	clusters	real	precision
Honest full-surface (run `trucp69…`)	44	6	13.6%

Per-detector: vulnerable_dependency_high 5/5 real, seo_title_duplicate_across_routes 1/1 real, missing_state_change 0/3, seo_h1_missing_or_multiple 0/35. The false positives are concentrated in two patterns this PR fixes.

What

1. fix(seo) — skip SEO hygiene rules on non-indexable routes.
SEO rules (h1/title/meta/canonical + duplicate-title) fired on auth-gated /admin/* pages that search engines never index — 34/35 of all seo_h1 findings. Adds isSeoIndexable() + an indexable field on SeoPageInput. A page is indexable only when it declares no noindex and an anonymous request fetches it with a 2xx (search engines crawl unauthenticated; auth-gated routes 3xx-redirect to login). classifySeoCorpus skips per-page hygiene rules and excludes non-indexable pages from the duplicate-title cross-check; the producer in execute.ts probes anonymous reachability per scraped page.

Validated against run trucp69…: 34/35 admin seo_h1 FPs suppressed; the 1 kept (/admin/login) is genuinely anonymously reachable (200), and the public duplicate-title true positive is preserved.

2. fix(state-change) — ignore clicks that open a new window or download.
missing_state_change flagged working controls whose click opens a new tab / triggers a download (e.g. "Download label" → window.open(url, '_blank')) — no in-page change happens, by design. Adds an openedNewWindowOrDownload signal to PostState; the producer instruments window.open before each click and inspects the clicked element's nearest anchor (target="_blank" / download); the classifier returns null when set.

Validated live in camofox: window.open → count 1, _blank/download anchor → detected, normal click → neither (still fires).

Deliberately deferred (documented, not silently skipped)

The other two missing_state_change FPs ("Create return label" POST-with-loading, "Increase quantity" async stepper) stem from deeper executor signal gaps: execute.ts hardcodes networkRequests: [] on the UI click path, and domMutationCount counts only childList mutations (not attribute/text). Fixing those touches the detector's true-positive surface, so they're documented in docs/follow-ups/missing-state-change-executor-gaps.md rather than risked here.

Impact

Projected precision on this surface: 13.6% → ~6/9 ≈ 67% (38 FPs → ~3). A full re-run to confirm the aggregate has not been done.

Verification

New TDD tests: classify/seo.test.ts (11), classify/state-change.test.ts (+2). Red→green watched.
tsc --noEmit clean; eslint clean on changed files; classify suite 316 pass; full suite 2961 pass (the 1 failure, tests/plan-probe.test.ts, fails on clean main too — a stale plan-count assertion, unrelated to this PR).
Producer behavior validated end-to-end against the real spoonworks routes and live camofox browser.

Note on the 5 real `vulnerable_dependency_high`

All 5 are genuine (npm-audit + GHSA confirmed) but severity-inflated — tagged critical while the applicable advisory is high (e.g. the critical Next.js advisories target 15.x, not the installed 16.2.2). A severity-calibration follow-up, not a precision issue.

🤖 Generated with Claude Code

SEO detectors (h1/title/meta/canonical + duplicate-title) fired on auth-gated /admin/* pages that search engines never index. On the spoonworks real-app benchmark this was 34/35 of all seo_h1 findings — the dominant false-positive category behind the honest 13.6% precision. Add isSeoIndexable() + an `indexable` field on SeoPageInput. A page is indexable only when it declares no `noindex` and an anonymous request fetches it with a 2xx (search engines crawl unauthenticated; auth-gated routes 3xx-redirect to login). classifySeoCorpus skips per-page hygiene rules and excludes non-indexable pages from the duplicate-title cross- check. The producer in execute.ts probes anonymous reachability per scraped page before classifying. Validated against run trucp69ve6r6j9vk0i4ttkfc: 34/35 admin seo_h1 FPs suppressed, the public duplicate-title true positive preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

missing_state_change fired on working controls whose click opens a new tab or triggers a download (e.g. "Download label" -> window.open(url, '_blank')) — no in-page change happens, by design. On the spoonworks real-app benchmark these were confirmed false positives by adversarial triage. Add an openedNewWindowOrDownload signal to PostState. The producer instruments window.open before each click and inspects the clicked element's nearest anchor (target="_blank" / download attribute); classifyMissingStateChange returns null when set. Validated live in camofox: window.open -> count 1, _blank/download anchor -> detected, normal click -> neither (still fires). The two remaining FPs in this cluster (POST-with-loading, async stepper) stem from deeper executor signal gaps (networkRequests hardcoded [], domMutationCount ignores attribute/text mutations) and are documented in docs/follow-ups/missing-state-change-executor-gaps.md rather than fixed here, to avoid regressing the detector's true positives. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-06-03T13:57:58Z

✅ BugHunter Calibration | | 2026-06-03

Overall: tp=0 fp=0 fn=0 precision=1 recall=1 f1=0

BugKind	Precision	Recall	F1	Status

cunninghambe and others added 2 commits June 2, 2026 22:29

cunninghambe merged commit e2c7170 into main Jun 3, 2026
2 checks passed

cunninghambe deleted the fix/real-app-fp-seo-and-state-change branch June 3, 2026 13:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: kill SEO-on-admin + new-window/download false positives (real-app precision 13.6%→~67%)#271

fix: kill SEO-on-admin + new-window/download false positives (real-app precision 13.6%→~67%)#271
cunninghambe merged 2 commits into
mainfrom
fix/real-app-fp-seo-and-state-change

cunninghambe commented Jun 3, 2026

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

cunninghambe commented Jun 3, 2026

Why

What

Deliberately deferred (documented, not silently skipped)

Impact

Verification

Note on the 5 real vulnerable_dependency_high

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Note on the 5 real `vulnerable_dependency_high`