Skip to content

fix(scenarios): holistic-review follow-ups — tighten assertions + docs + helpers#8369

Merged
HydraOps-T-rav merged 18 commits intomainfrom
mockworld-scenarios-review-fixes
Apr 20, 2026
Merged

fix(scenarios): holistic-review follow-ups — tighten assertions + docs + helpers#8369
HydraOps-T-rav merged 18 commits intomainfrom
mockworld-scenarios-review-fixes

Conversation

@HydraOps-T-rav
Copy link
Copy Markdown
Collaborator

Summary

Follow-up PR addressing the deep-review findings across Tier 1/2/3 scenario work (PRs #8356, #8360, #8361). Tightens weak assertions, aligns fakes with real ports, consolidates shared helpers, extends coverage, and documents conventions.

Zero production-code changes. 15 commits total (13 test-side + 2 docs).

Must-fix items (7)

  1. FakeGitHub.add_alerts signature aligned with port — was pr_number: int, now branch: str matching PRPort.fetch_code_scanning_alerts(branch). A19 updated. (commit 384d0619)
  2. init_test_worktree accepts origin=... — A14's 30-line inline workaround collapsed to 3 helper calls. Multi-worktree-under-same-parent now supported. (commit 9167f155)
  3. Removed importorskip from fuzz bead DAG testfake_beads is in main now; skip was masking future deletions. (commit 284f3faa)
  4. A11 asserts CI script queue drained — previously brittle to _pr_counter starting value; would silently pass if wait_for_ci missed the PR. Follow-up also fixed max_ci_fix_attempts config to actually exercise the fix_ci path. (commits 6a6ca5f3, 4b40266b)
  5. A20 asserts observable failure signal — was assert not merged (tautology for empty pipeline); now checks WorkerResult failure or escalation state. (commit e19079cf)
  6. A21 asserts state recovery path — was assert result is not None; now verifies StateTracker loaded and pipeline progressed. (commit ddc93d88)
  7. A22 proves wiki was consulted — pre/post log count delta replaces "file exists" check. Documented scripted-plan caveat. (commit 680e932f)

Should-fix items (2)

  1. Consolidated _seed_ports helper — was duplicated with divergent signatures across test_caretaker_loops.py and test_caretaker_loops_part2.py. Now lives at tests/scenarios/helpers/loop_port_seeding.py with a single kwargs API. (commit 9db403b6)
  2. FakeLLM.script_fix_ci APIfix_ci was hardcoded fixes_made=True. New scripting API lets scenarios exercise the "fix_ci gives up" branch. (commit 694ef7f7)

Nice-fix items (3)

  1. UI-3 asserts non-matching entries hidden — the Memory search filter test now proves filtering actually filters, not just that the matching entry renders. (commit 13843174)
  2. A20b + A20c — disk_full and branch_conflict workspace faults added; previously only permission was scenario-tested. (commit 799a0edb)
  3. vitest.config.mjs comment — explains why e2e specs are excluded. (commit 57b76ee7)

Documentation

  1. docs/scenarios/README.md updated with Tier 1/2/3 conventions: init_test_worktree, seed_ports, use_real_agent_runner, wiki_store/beads_manager, fault modes, Pattern A vs. Pattern B for loops, and extended scenario catalog covering A10–A22, B1, L9–L23. (commit 832bfaaa)

Follow-up issues filed

Follow-up corrections during the batch

  • FakeGitHub.wait_for_ci signature — updated to accept positional *_args matching production's call with ci_check_timeout, ci_poll_interval, stop_event passed positionally.
  • L18 assertionRepoWikiLoop stats now include queue_drained key; test updated to assert required keys rather than full dict equality.
  • Worktree collision test — follow-up tightening on the new init_test_worktree(origin=...) tests.

Test plan

🤖 Generated with Claude Code

T-rav-Hydra-Ops and others added 18 commits April 19, 2026 18:03
…t_for_ci signature; A20/L18 assertion corrections
…iles

Replaces the fabricated A1–A23 and L9–L23 descriptions with entries
derived directly from test names and docstrings in
test_agent_realistic.py, test_bead_workflow.py, test_caretaker_loops.py,
and test_caretaker_loops_part2.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the stale 'FakeGitHub.add_alerts(pr_number=...)' reference with
the actual API: add_alerts(branch=...) keyed by branch string, matching
PRPort.fetch_code_scanning_alerts(branch: str).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The 'or outcome.worker_result is None' branch was always true when
workspace creation raised (worker never ran → None), making the entire
compound assertion vacuous. Production wraps PermissionError in a
WorkerResult(success=False) via run_with_fatal_guard, so the real-signal
arm already fires. Dropping the tautological arm means the test fails
loudly if the signal disappears.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@HydraOps-T-rav HydraOps-T-rav merged commit dda0a69 into main Apr 20, 2026
20 checks passed
@HydraOps-T-rav HydraOps-T-rav deleted the mockworld-scenarios-review-fixes branch April 20, 2026 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant