Skip to content

fix: kill SEO-on-admin + new-window/download false positives (real-app precision 13.6%→~67%)#271

Merged
cunninghambe merged 2 commits into
mainfrom
fix/real-app-fp-seo-and-state-change
Jun 3, 2026
Merged

fix: kill SEO-on-admin + new-window/download false positives (real-app precision 13.6%→~67%)#271
cunninghambe merged 2 commits into
mainfrom
fix/real-app-fp-seo-and-state-change

Conversation

@cunninghambe

Copy link
Copy Markdown
Owner

Why

The README claimed 4/4 = 100% real-app precision on the spoonworks benchmark. An audit showed that number was a read-only artifact: that run was invoked with --read-only, which skipped 2738/2932 tests (every mutating one), leaving only the static dependency-audit detector to fire. The "32 infra failures" attributed to a broken browser surface were all MutatingActionRejectedError: read-only mode — the browser surface was never broken.

The honest full-surface measurement (authed as owner + anon, mutating, mcp-http, 3676 tests, all detectors incl. a11y/SEO) plus per-cluster adversarial triage (skeptic → independent refuter, 0 flips, 0 uncertain) gives:

clusters real precision
Honest full-surface (run trucp69…) 44 6 13.6%

Per-detector: vulnerable_dependency_high 5/5 real, seo_title_duplicate_across_routes 1/1 real, missing_state_change 0/3, seo_h1_missing_or_multiple 0/35. The false positives are concentrated in two patterns this PR fixes.

What

1. fix(seo) — skip SEO hygiene rules on non-indexable routes.
SEO rules (h1/title/meta/canonical + duplicate-title) fired on auth-gated /admin/* pages that search engines never index — 34/35 of all seo_h1 findings. Adds isSeoIndexable() + an indexable field on SeoPageInput. A page is indexable only when it declares no noindex and an anonymous request fetches it with a 2xx (search engines crawl unauthenticated; auth-gated routes 3xx-redirect to login). classifySeoCorpus skips per-page hygiene rules and excludes non-indexable pages from the duplicate-title cross-check; the producer in execute.ts probes anonymous reachability per scraped page.

  • Validated against run trucp69…: 34/35 admin seo_h1 FPs suppressed; the 1 kept (/admin/login) is genuinely anonymously reachable (200), and the public duplicate-title true positive is preserved.

2. fix(state-change) — ignore clicks that open a new window or download.
missing_state_change flagged working controls whose click opens a new tab / triggers a download (e.g. "Download label" → window.open(url, '_blank')) — no in-page change happens, by design. Adds an openedNewWindowOrDownload signal to PostState; the producer instruments window.open before each click and inspects the clicked element's nearest anchor (target="_blank" / download); the classifier returns null when set.

  • Validated live in camofox: window.open → count 1, _blank/download anchor → detected, normal click → neither (still fires).

Deliberately deferred (documented, not silently skipped)

The other two missing_state_change FPs ("Create return label" POST-with-loading, "Increase quantity" async stepper) stem from deeper executor signal gaps: execute.ts hardcodes networkRequests: [] on the UI click path, and domMutationCount counts only childList mutations (not attribute/text). Fixing those touches the detector's true-positive surface, so they're documented in docs/follow-ups/missing-state-change-executor-gaps.md rather than risked here.

Impact

Projected precision on this surface: 13.6% → ~6/9 ≈ 67% (38 FPs → ~3). A full re-run to confirm the aggregate has not been done.

Verification

  • New TDD tests: classify/seo.test.ts (11), classify/state-change.test.ts (+2). Red→green watched.
  • tsc --noEmit clean; eslint clean on changed files; classify suite 316 pass; full suite 2961 pass (the 1 failure, tests/plan-probe.test.ts, fails on clean main too — a stale plan-count assertion, unrelated to this PR).
  • Producer behavior validated end-to-end against the real spoonworks routes and live camofox browser.

Note on the 5 real vulnerable_dependency_high

All 5 are genuine (npm-audit + GHSA confirmed) but severity-inflated — tagged critical while the applicable advisory is high (e.g. the critical Next.js advisories target 15.x, not the installed 16.2.2). A severity-calibration follow-up, not a precision issue.

🤖 Generated with Claude Code

cunninghambe and others added 2 commits June 2, 2026 22:29
SEO detectors (h1/title/meta/canonical + duplicate-title) fired on
auth-gated /admin/* pages that search engines never index. On the
spoonworks real-app benchmark this was 34/35 of all seo_h1 findings —
the dominant false-positive category behind the honest 13.6% precision.

Add isSeoIndexable() + an `indexable` field on SeoPageInput. A page is
indexable only when it declares no `noindex` and an anonymous request
fetches it with a 2xx (search engines crawl unauthenticated; auth-gated
routes 3xx-redirect to login). classifySeoCorpus skips per-page hygiene
rules and excludes non-indexable pages from the duplicate-title cross-
check. The producer in execute.ts probes anonymous reachability per
scraped page before classifying.

Validated against run trucp69ve6r6j9vk0i4ttkfc: 34/35 admin seo_h1 FPs
suppressed, the public duplicate-title true positive preserved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
missing_state_change fired on working controls whose click opens a new
tab or triggers a download (e.g. "Download label" -> window.open(url,
'_blank')) — no in-page change happens, by design. On the spoonworks
real-app benchmark these were confirmed false positives by adversarial
triage.

Add an openedNewWindowOrDownload signal to PostState. The producer
instruments window.open before each click and inspects the clicked
element's nearest anchor (target="_blank" / download attribute);
classifyMissingStateChange returns null when set. Validated live in
camofox: window.open -> count 1, _blank/download anchor -> detected,
normal click -> neither (still fires).

The two remaining FPs in this cluster (POST-with-loading, async stepper)
stem from deeper executor signal gaps (networkRequests hardcoded [],
domMutationCount ignores attribute/text mutations) and are documented in
docs/follow-ups/missing-state-change-executor-gaps.md rather than fixed
here, to avoid regressing the detector's true positives.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown

BugHunter Calibration | | 2026-06-03

Overall: tp=0 fp=0 fn=0 precision=1 recall=1 f1=0

BugKind Precision Recall F1 Status

@cunninghambe cunninghambe merged commit e2c7170 into main Jun 3, 2026
2 checks passed
@cunninghambe cunninghambe deleted the fix/real-app-fp-seo-and-state-change branch June 3, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant