test: consolidate scripted MockClient doubles into a shared conftest fixture by SuperMarioYL · Pull Request #76 · antoinezambelli/forge

SuperMarioYL · 2026-05-21T19:46:15Z

Summary

The runner, slot-worker, and eval-budget unit tests each ship their own
hand-rolled mock LLMClient. The three started as copies and have since
quietly diverged — they now disagree on observable behavior, so picking
"a" MockClient no longer gives a predictable contract.

Behavior	`test_runner.py`	`test_slot_worker.py`	`test_eval_budget.py`
exhausted `responses`	raises `IndexError`	raises `IndexError`	returns `TextResponse("stuck")`
`send_stream`	`TEXT_DELTA` + `FINAL`	raises `NotImplementedError`	`FINAL` only
call-spy lists	yes	no	no
`api_format` attr	absent	absent	`"ollama"`

The exhaustion split is the riskiest: a test moved between files (or a new test
copying the "wrong" double) silently changes from a hard error to a soft
fallback, which can mask a real defect.

Change

Replaces the three local classes with one configurable MockClient in
tests/conftest.py (previously empty). It is a behavioral superset, with
the variations promoted to explicit constructor knobs:

on_exhausted — "raise" (default) or a fallback response object
stream_mode — "deltas" (default) / "final" / "unsupported"
api_format, context_length — defaulted to the common case
call-spy lists (send_calls / send_stream_calls) are always populated

Each test file imports the shared class and passes only the kwargs needed to
reproduce its original behavior. No test assertion or test body was changed —
only the mock's source moved. The single-use inline NoFinalClient in
test_runner.py is intentionally left in place; it is a one-off for one
negative-path test, not a duplicate.

Why not just keep three mocks

A single shared double is only safe if it does not erase the per-test behavior
the three variants encoded — so the behavior is made configurable, not
collapsed. Each call site keeps its exact prior semantics, and the contract is
now visible at the construction site instead of hidden in three drifting class
bodies.

Test plan

pytest tests/unit/test_runner.py tests/unit/test_slot_worker.py tests/unit/test_eval_budget.py — 105 passed, unchanged from before.
pytest tests/unit/ — full suite green.

…fixture The runner, slot-worker, and eval-budget unit tests each carried their own hand-rolled mock LLM client. The three had quietly diverged: differing exhaustion behavior (IndexError vs. a fallback TextResponse), differing stream semantics (text deltas, FINAL-only, or unsupported), and uneven call-spy and api_format coverage. Replace them with one configurable MockClient in tests/conftest.py that is a behavioral superset, with knobs (on_exhausted, stream_mode, api_format, context_length) defaulting to the most common case. Each test file imports the shared class and passes only the kwargs needed to reproduce its original behavior; no test assertions or bodies changed.

Removing the three local mock classes left their type-only imports (AsyncIterator, LLMResponse, ChunkType, StreamChunk, ToolSpec) unused; drop the ones no longer referenced.

antoinezambelli · 2026-05-23T19:55:47Z

I appreciate this kind of cleanup! Doing some triage with in-flight stuff (including me, I'm literally traveling) but will circle back to this once some of the current work merges in.

antoinezambelli · 2026-05-31T20:57:01Z

Hey @SuperMarioYL — quick status, not a request for changes: I haven't forgotten this one. I took a closer look and the direction is right — consolidating the scripted MockClient doubles into a shared fixture, while keeping the runner / slot-worker / eval-budget behavior differences as explicit knobs rather than flattening them, is exactly how I'd want this done.

The reason I'm holding rather than merging: I've got a round of proxy/backend hardening still landing, and it's going to keep moving the test-suite surface for a bit. I'd rather not have you chase main through that churn and rebase repeatedly. Once the hardening settles, I'll come back to this and we can get it in against a stable base.

Appreciate the patience, and thanks for tackling the test-suite cleanup — it's genuinely useful work.

SuperMarioYL · 2026-05-31T21:07:11Z

Thanks for the detailed status — completely understood, and no rush at all. Consolidating the scripted MockClient doubles into a shared fixture while keeping the runner / slot-worker / eval-budget behavior differences as explicit knobs was exactly the intent, so I'm glad that lines up with how you'd want it. I'll keep an eye on main and am happy to rebase against a stable base whenever the proxy/backend hardening settles — just ping me when you're ready. Appreciate you taking the time to look.

SuperMarioYL added 2 commits May 22, 2026 03:40

test: drop imports orphaned by the MockClient consolidation

11bcf8d

Removing the three local mock classes left their type-only imports (AsyncIterator, LLMResponse, ChunkType, StreamChunk, ToolSpec) unused; drop the ones no longer referenced.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: consolidate scripted MockClient doubles into a shared conftest fixture#76

test: consolidate scripted MockClient doubles into a shared conftest fixture#76
SuperMarioYL wants to merge 2 commits into
antoinezambelli:mainfrom
SuperMarioYL:tests/consolidate-mockclient-fixture

SuperMarioYL commented May 21, 2026

Uh oh!

antoinezambelli commented May 23, 2026

Uh oh!

antoinezambelli commented May 31, 2026

Uh oh!

SuperMarioYL commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

SuperMarioYL commented May 21, 2026

Summary

Change

Why not just keep three mocks

Test plan

Uh oh!

antoinezambelli commented May 23, 2026

Uh oh!

antoinezambelli commented May 31, 2026

Uh oh!

SuperMarioYL commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants