Summary
Instead of running one review session per spec per review type, batch multiple specs into a single review session that checks each sequentially. This reduces slot consumption from N sessions to 1.
Problem
In the #339 scenario, 12 specs each needed a reviewer:pre-review session, creating 12 separate review nodes. Even with a concurrency cap, these 12 sessions run sequentially through the single review slot, adding ~50 minutes of wall-clock time (12 x ~4 min each). Each individual session has significant fixed overhead (LLM context setup, spec loading, session init/teardown) that could be amortized across a batch.
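The amortization argument can be sketched with a back-of-envelope model. The split between fixed overhead and per-spec review time below is an assumption for illustration, not a measurement from #339; only the 12-spec, ~4-min-per-session total comes from the scenario above.

```python
# Illustrative cost model; the overhead/review split is assumed,
# not measured. Totals are in minutes.
FIXED_OVERHEAD_MIN = 1.5   # per-session setup/teardown (assumed)
PER_SPEC_REVIEW_MIN = 2.5  # actual review work per spec (assumed)
N_SPECS = 12

# One session per spec: pay the fixed overhead N times.
individual = N_SPECS * (FIXED_OVERHEAD_MIN + PER_SPEC_REVIEW_MIN)
# One batch session: pay the fixed overhead once.
batched = FIXED_OVERHEAD_MIN + N_SPECS * PER_SPEC_REVIEW_MIN

print(f"individual sessions: {individual:.0f} min")   # 48 min
print(f"single batch:        {batched:.1f} min")      # 31.5 min
```

Under these assumptions the batch saves (N - 1) × overhead; the bigger the fixed overhead relative to per-spec work, the bigger the win.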
Proposed Solution
Introduce a review batching mode where a single review session processes multiple specs in one invocation, producing findings for each spec.
Key Changes
- Batch review node (injection.py) — Instead of injecting N individual review nodes at a barrier, inject a single batch node that references multiple specs. Node ID convention: batch-review:{review_type}:{batch_id}.
- Batch-aware session runner — The reviewer archetype session accepts a list of spec directories rather than a single spec. It iterates through each, producing per-spec findings.
- Result handler (result_handler.py) — Process batch results by splitting findings back to individual specs for blocking decisions.
- Graph modeling — The batch node depends on all constituent specs' coder groups being complete. Downstream dependencies (if any) are wired from the batch node.
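One possible shape for the batch node and its ID convention, assuming nothing about the real graph node classes (the dataclass and field names here are hypothetical, not the actual injection.py types):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BatchReviewNode:
    """Hypothetical batch node; the real graph node type may differ."""
    review_type: str          # e.g. "pre-review"
    batch_id: str             # unique per batch
    spec_dirs: tuple = ()     # specs covered by this one session

    @property
    def node_id(self) -> str:
        # Follows the proposed convention: batch-review:{review_type}:{batch_id}
        return f"batch-review:{self.review_type}:{self.batch_id}"

node = BatchReviewNode("pre-review", "b001", ("specs/a", "specs/b", "specs/c"))
print(node.node_id)  # batch-review:pre-review:b001
```

Keeping the spec list on the node itself lets the session runner and result handler recover the constituents without any side-channel lookup.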
When to Batch
Batching makes sense when multiple specs' review nodes become ready simultaneously (the exact scenario from #339). Individual review nodes are still appropriate when only one spec completes at a time.
Heuristic: if >= 3 review nodes of the same type become ready in the same ready_tasks() call, consolidate them into a batch.
Sketch
# In injection or engine, after a barrier resolves:
ready_reviews = [n for n in ready if _is_review_node(n)]
by_type = {}
for node in ready_reviews:
    by_type.setdefault(review_type(node), []).append(node)
for rtype, nodes in by_type.items():
    if len(nodes) >= BATCH_THRESHOLD:
        batch_node = create_batch_review_node(rtype, nodes)
        # Replace the individual nodes with the batch node in the graph
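The result-handling side (splitting batch findings back into per-spec blocking decisions) might look like the sketch below. The findings shape and the "blocking" severity flag are assumptions for illustration, not the real result_handler.py API:

```python
def split_batch_results(batch_findings):
    """Regroup a flat list of batch findings by spec and derive a
    per-spec blocking decision. All shapes here are hypothetical."""
    per_spec = {}
    for finding in batch_findings:
        per_spec.setdefault(finding["spec"], []).append(finding)
    # A spec blocks iff any of its findings is blocking-severity.
    decisions = {
        spec: any(f["severity"] == "blocking" for f in findings)
        for spec, findings in per_spec.items()
    }
    return per_spec, decisions

findings = [
    {"spec": "specs/a", "severity": "blocking"},
    {"spec": "specs/a", "severity": "minor"},
    {"spec": "specs/b", "severity": "minor"},
]
per_spec, decisions = split_batch_results(findings)
print(decisions)  # {'specs/a': True, 'specs/b': False}
```

Because blocking decisions stay per-spec, one failing spec in a batch does not hold back the others.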
Scope
- High effort: touches injection, session runner, result handler, and graph modeling
- Requires the reviewer archetype prompt to support multi-spec mode
- Best long-term solution for review overhead at scale
Acceptance Criteria
Trade-offs
Related