Found in smoke #17
With PR #177's per-surface budget allocation working correctly (web=900s, api=900s), the API surface still ran 0 tests despite planning 296. Logs show:
```
multi_surface: running pipeline for surface { surface: 'comprehensive-bench-api' }
discovery → static analysis → form-reachability probe (0 probes) → XSS canaries (0) → plan (296 tests)
"Executing 296 planned tests: 296 will run (296 api, 0 ui)"
[execute phase: exited immediately, 0 tests actually run]
header-probe (0 probes) → surface complete in 30s
```
Hypothesis
The execute phase has a precondition guard that's failing for openapi/express stacks specifically. Maybe the bound adapter doesn't expose the API tools to the executor? Or the test-case array is empty by the time execute reads it?
Investigate
- Trace from "Executing 296 planned tests" to execute.ts → why doesn't the loop iterate?
- Check whether `testCases` array is populated when execute starts
- Check whether the bound adapter's tool list is non-empty when execute calls it
Priority
HIGHEST — single biggest cause of recall regression in #17. Without API tests running, all sql/xss/cmd-injection/IDOR/auth-related kinds blocked even with budget allocation working.
Found in smoke #17
With PR #177's per-surface budget allocation working correctly (web=900s, api=900s), the API surface still ran 0 tests despite planning 296. Logs show:
```
multi_surface: running pipeline for surface { surface: 'comprehensive-bench-api' }
discovery → static analysis → form-reachability probe (0 probes) → XSS canaries (0) → plan (296 tests)
"Executing 296 planned tests: 296 will run (296 api, 0 ui)"
[execute phase: exited immediately, 0 tests actually run]
header-probe (0 probes) → surface complete in 30s
```
Hypothesis
The execute phase has a precondition guard that's failing for openapi/express stacks specifically. Maybe the bound adapter doesn't expose the API tools to the executor? Or the test-case array is empty by the time execute reads it?
Investigate
Priority
HIGHEST — single biggest cause of recall regression in #17. Without API tests running, all sql/xss/cmd-injection/IDOR/auth-related kinds blocked even with budget allocation working.