Architectural: SurfaceMCP single-surface limitation blocks combined UI+API smoke coverage #130

@cunninghambe

Found in smoke #8

SurfaceMCP only serves `surfaces[0]` from its config. The deliberate-bugs fixture has 6 surfaces (self-spa Vite + self-api OpenAPI + race-bad + idor-bad + pen-bad + 2 static). Smoke #4-#7 had `self-spa` first → got 6 UI kinds, 0 API kinds. PR #119 swapped to `self-api` first to populate `toolCount > 0` → smoke #8 got 5 API kinds, 0 UI kinds.

Net: this is a trade-off, not an addition. The combined potential is 6 UI + 5 API = 11 distinct kinds across the same fixture, but no single SurfaceMCP run can serve both surfaces.

Three resolution paths

A. Multi-surface SurfaceMCP — extend `/root/SurfaceMCP` to serve all surfaces from one process under different mount paths or via a routing layer. Most invasive (touches a different repo) but cleanest UX.

B. Multi-instance, multi-port — run N SurfaceMCP instances, one per surface, on adjacent ports. Update fixture's `bughunter.config.json` to support an array of `surfaceMcpUrl` values. BugHunter aggregates tool lists from all configured URLs. Bounded inside BugHunter.

C. Per-stack composite extractor in SurfaceMCP — add a synthetic stack type `composite` that itself recursively walks an array of sub-surfaces. Cleanest in the fixture config, requires SurfaceMCP work.

Recommendation: B. The change inside BugHunter is small: `surfaceMcpUrl` becomes `string | string[]` and the adapter loops over the URLs. It uses what SurfaceMCP already does without extending it. The fixture orchestrator (`bin/up.sh`) already starts one server per fixture port, so it only needs to start one SurfaceMCP instance per surface as well.
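A minimal sketch of what option B could look like on the BugHunter side. Only `surfaceMcpUrl` and its `string | string[]` type come from this issue. `Tool`, `ListTools`, `normalizeUrls`, and `aggregateTools` are hypothetical names, and `listTools` stands in for whatever call the adapter currently makes against a single SurfaceMCP instance.

```typescript
// Config value as proposed in this issue: one URL (legacy) or several.
type SurfaceMcpUrl = string | string[];

// Hypothetical tool shape; `sourceUrl` records which instance served the tool
// so that identically named tools from different surfaces stay distinguishable.
interface Tool {
  name: string;
  sourceUrl: string;
}

// Hypothetical stand-in for the adapter's existing single-instance tool listing.
type ListTools = (url: string) => Promise<{ name: string }[]>;

// Accept both the legacy single-string form and the new array form.
function normalizeUrls(value: SurfaceMcpUrl): string[] {
  return Array.isArray(value) ? value : [value];
}

// Query every configured instance and concatenate the tool lists.
async function aggregateTools(
  value: SurfaceMcpUrl,
  listTools: ListTools,
): Promise<Tool[]> {
  const urls = normalizeUrls(value);
  const perInstance = await Promise.all(
    urls.map(async (url) => {
      const tools = await listTools(url);
      return tools.map((t) => ({ name: t.name, sourceUrl: url }));
    }),
  );
  return perInstance.flat();
}
```

Existing single-URL configs keep working unchanged under this shape, since a plain string is wrapped into a one-element array. As written, `Promise.all` rejects if any one instance is unreachable. Whether a run should fail fast or continue with partial coverage is a design decision this sketch leaves open.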

Why this matters

Without this, the V33 self-test will permanently underreport recall because no run can exercise both UI-layer and API-layer detectors against the full fixture. Calibration will reflect the limitation, not the system.

Priority

High: recall measurement is unreliable until this is resolved. It blocks an honest answer to "what fraction of the 105 wired detectors actually fire on the gold fixture".
