feat(engine): batch review sessions across multiple specs #492

@mickume

Summary

Instead of running one review session per spec per review type, batch multiple specs into a single review session that checks each sequentially. This reduces slot consumption from N sessions to 1.

Problem

In the #339 scenario, 12 specs each needed a reviewer:pre-review session, creating 12 separate review nodes. Even with a concurrency cap, these 12 sessions run sequentially through the single review slot, adding ~50 minutes of wall-clock time (12 × ~4 min ≈ 48 min). Each session also carries significant fixed overhead (LLM context setup, spec loading, session init/teardown) that could be amortized across a batch.
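A rough way to reason about the potential saving is a linear cost model, sketched below. It assumes (this split is not measured anywhere in this issue) that each ~4 min session is a fixed overhead plus per-spec review work, and that a batch pays the overhead only once. The function names and the `overhead_fraction` parameter are illustrative, not existing engine code.

```python
def individual_minutes(n_specs: int, session_min: float) -> float:
    """Wall-clock time when every spec gets its own sequential session."""
    return n_specs * session_min


def batched_minutes(n_specs: int, session_min: float, overhead_fraction: float) -> float:
    """Wall-clock time when one batch session pays the fixed overhead once.

    overhead_fraction is an ASSUMED share of a single session spent on
    setup/teardown; the issue does not state this number.
    """
    overhead = session_min * overhead_fraction
    per_spec = session_min - overhead
    return overhead + n_specs * per_spec


def reduction(n_specs: int, session_min: float, overhead_fraction: float) -> float:
    """Fractional wall-clock saving of batching versus individual sessions."""
    batched = batched_minutes(n_specs, session_min, overhead_fraction)
    return 1.0 - batched / individual_minutes(n_specs, session_min)


def min_overhead_fraction(n_specs: int, target_reduction: float) -> float:
    """Smallest overhead share that reaches target_reduction in this model.

    Follows from reduction = overhead_fraction * (n_specs - 1) / n_specs.
    """
    return target_reduction * n_specs / (n_specs - 1)


if __name__ == "__main__":
    for fraction in (0.25, 0.5, 0.75):
        print(fraction, round(reduction(12, 4.0, fraction), 3))
    print("needed for 50%:", round(min_overhead_fraction(12, 0.5), 3))
```

Under this model, a >= 50% reduction for 12 specs requires that roughly 55% (6/11) of each individual session be fixed overhead. If the real overhead share is lower, batching alone would not reach that target, so it may be worth measuring the split before committing to it.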

Proposed Solution

Introduce a review batching mode where a single review session processes multiple specs in one invocation, producing findings for each spec.

Key Changes

  1. Batch review node (injection.py) — Instead of injecting N individual review nodes at a barrier, inject a single batch node that references multiple specs. Node ID convention: batch-review:{review_type}:{batch_id}.

  2. Batch-aware session runner — The reviewer archetype session accepts a list of spec directories rather than a single spec. It iterates through each, producing per-spec findings.

  3. Result handler (result_handler.py) — Process batch results by splitting findings back to individual specs for blocking decisions.

  4. Graph modeling — The batch node depends on all constituent specs' coder groups being complete. Downstream dependencies (if any) are wired from the batch node.
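To make the node ID convention and the dependency rule concrete, here is a minimal sketch of what a batch node could look like. Only the `batch-review:{review_type}:{batch_id}` format comes from this issue; the `BatchReviewNode` class, its fields, the helper names, and the coder group IDs in the usage are hypothetical. It also assumes `review_type` never contains a colon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchReviewNode:
    """Hypothetical batch node; field names are illustrative only."""
    review_type: str             # e.g. "pre-review"
    batch_id: str                # assumed: any token unique per batch
    spec_names: tuple[str, ...]  # constituent specs, in review order
    depends_on: frozenset[str]   # coder group IDs that must complete first

    @property
    def node_id(self) -> str:
        # ID convention proposed in this issue.
        return f"batch-review:{self.review_type}:{self.batch_id}"


def parse_batch_node_id(node_id: str) -> tuple[str, str]:
    """Return (review_type, batch_id); raises ValueError on other IDs."""
    prefix, review_type, batch_id = node_id.split(":", 2)
    if prefix != "batch-review":
        raise ValueError(f"not a batch review node: {node_id!r}")
    return review_type, batch_id


def make_batch_node(
    review_type: str,
    batch_id: str,
    coder_groups_by_spec: dict[str, list[str]],
) -> BatchReviewNode:
    """Build a batch node that depends on every constituent spec's coder groups."""
    deps = frozenset(
        group for groups in coder_groups_by_spec.values() for group in groups
    )
    return BatchReviewNode(review_type, batch_id, tuple(coder_groups_by_spec), deps)


if __name__ == "__main__":
    node = make_batch_node(
        "pre-review", "b1", {"spec_a": ["spec_a:1"], "spec_b": ["spec_b:1"]}
    )
    print(node.node_id, sorted(node.depends_on))
```

Keeping the constituent spec names on the node is one way to let the result handler map findings back to specs without re-deriving them from the graph.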

When to Batch

Batching makes sense when multiple specs' review nodes become ready simultaneously (the exact scenario from #339). Individual review nodes are still appropriate when only one spec completes at a time.

Heuristic: if >= 3 review nodes of the same type (the default batch threshold) become ready in the same ready_tasks() call, consolidate them into a batch.

Sketch

# In injection or engine, after the barrier:
from collections import defaultdict

ready_reviews = [n for n in ready if _is_review_node(n)]
by_type = defaultdict(list)  # review type -> ready review nodes of that type
for node in ready_reviews:
    by_type[review_type(node)].append(node)
for rtype, nodes in by_type.items():
    if len(nodes) >= BATCH_THRESHOLD:
        batch_node = create_batch_review_node(rtype, nodes)
        # TODO: replace the individual nodes with batch_node in the graph
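
For experimenting outside the engine, the same grouping logic can be written as a self-contained function. This is only a sketch: `ReviewNode`, `BatchNode`, and `consolidate` are hypothetical stand-ins, not the real graph types, and the threshold default mirrors the proposed default of 3.

```python
from collections import defaultdict
from dataclasses import dataclass

BATCH_THRESHOLD = 3  # proposed default; meant to be configurable


@dataclass(frozen=True)
class ReviewNode:
    """Hypothetical stand-in for an individual review node."""
    spec: str
    review_type: str


@dataclass(frozen=True)
class BatchNode:
    """Hypothetical stand-in for a batch review node."""
    review_type: str
    specs: tuple[str, ...]


def consolidate(ready_reviews, threshold=BATCH_THRESHOLD):
    """Replace groups of >= threshold same-type review nodes with one batch node.

    Groups below the threshold are passed through unchanged, so single-spec
    completions still get an individual review session.
    """
    by_type = defaultdict(list)
    for node in ready_reviews:
        by_type[node.review_type].append(node)

    result = []
    for rtype, nodes in by_type.items():
        if len(nodes) >= threshold:
            result.append(BatchNode(rtype, tuple(n.spec for n in nodes)))
        else:
            result.extend(nodes)
    return result


if __name__ == "__main__":
    ready = [ReviewNode(f"spec_{i}", "pre-review") for i in range(12)]
    print(consolidate(ready))
```

Note that this sketch puts every ready node of a type into one batch; whether very large batches should be split further is left open here.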

Scope

  • High effort: touches injection, session runner, result handler, and graph modeling
  • Requires the reviewer archetype prompt to support multi-spec mode
  • Best long-term solution for review overhead at scale

Acceptance Criteria

  • When >= N review nodes of the same type become ready simultaneously, they are consolidated into a single batch session
  • The batch session produces per-spec findings that are individually actionable
  • Blocking decisions are still made per-spec (one spec's critical finding doesn't block unrelated specs)
  • Batch threshold N is configurable (default: 3)
  • Individual review sessions still work for single-spec completions
  • Wall-clock time for reviewing 12 specs is reduced by >= 50% compared to individual sessions
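
The per-spec blocking criterion could be checked with something like the following sketch. The `Finding` shape, the severity strings, and both function names are assumptions made for illustration; the real result handler may represent findings differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record emitted by a batch review session."""
    spec: str
    severity: str  # assumed levels, e.g. "critical", "major", "minor"
    message: str


def split_findings(findings, batch_specs):
    """Group batch findings by spec; specs without findings map to []."""
    per_spec = {spec: [] for spec in batch_specs}
    for finding in findings:
        if finding.spec not in per_spec:
            raise ValueError(f"finding for spec outside the batch: {finding.spec!r}")
        per_spec[finding.spec].append(finding)
    return per_spec


def blocked_specs(per_spec, blocking_severity="critical"):
    """Return only the specs that have at least one blocking finding."""
    return {
        spec
        for spec, spec_findings in per_spec.items()
        if any(f.severity == blocking_severity for f in spec_findings)
    }


if __name__ == "__main__":
    findings = [
        Finding("spec_a", "critical", "missing error handling"),
        Finding("spec_b", "minor", "naming nit"),
    ]
    per_spec = split_findings(findings, ["spec_a", "spec_b", "spec_c"])
    print(blocked_specs(per_spec))
```

Because the blocking decision is computed per spec after the split, a critical finding in one spec leaves the other specs in the same batch unblocked.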

Trade-offs

Related

    Labels

    enhancement (New feature or request)
