Context
The Outside Voice pattern in /plan-ceo-review, /plan-eng-review, and /design-consultation is one of the most compelling ideas in gstack: the insight that agreement between two independent AI systems is more convincing than one system's exhaustive self-review is exactly right.
The Gap
In practice, Outside Voice has a structural limitation: the review prompts are static. The same questions get asked every time, regardless of what kinds of blindspots have been found before.
Real-world example from our SFT data generation work:
- Agent self-evaluated: "100% pass rate, quality score 1.000 for all 107 samples"
- Same model (Opus 4.6), fresh context: found 5 critical issues (quality: 5.5/10)
- Agent claimed it fixed the main issue (position bias)
- Fresh context review of the "fix": the old bias disappeared, but a new bias emerged at 67% concentration
The fix created new blindspots. A static reviewer wouldn't know to look for "whack-a-mole" patterns because it doesn't learn from previous findings.
Suggestion: Feedback Loop for Outside Voice
What if Outside Voice could evolve its review strategies based on confirmed findings?
The pattern:
- Canary: a confirmed case where the main agent's self-assessment was wrong
- Critic mutation: review prompts evolve (threshold shifts, focus changes, new examples injected)
- Replay: mutated critics are tested against known canaries
- Keep/discard: only critics that improve detection rate (≥ +5%) AND maintain low FP (≤ 10%) survive
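The keep/discard gate above can be sketched in a few lines of Python. Everything here is illustrative: `Critic`, `flags`, and the stand-in predicate are hypothetical names, and a real implementation would replace `flags` with an actual model review call. Only the thresholds (≥ +5% detection gain, ≤ 10% false positives) come from the pattern as described.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Critic:
    name: str
    # Stand-in for a real model call: does this critic flag the case?
    flags: Callable[[Dict], bool]

def detection_rate(critic: Critic, canaries: List[Dict]) -> float:
    """Fraction of confirmed canaries the critic catches on replay."""
    return sum(critic.flags(c) for c in canaries) / len(canaries)

def false_positive_rate(critic: Critic, clean: List[Dict]) -> float:
    """Fraction of known-good cases the critic wrongly flags."""
    return sum(critic.flags(c) for c in clean) / len(clean)

def keep_mutation(parent: Critic, child: Critic,
                  canaries: List[Dict], clean: List[Dict]) -> bool:
    """Survival rule: detection improves by at least 5 points absolute
    AND false positives stay at or below 10%."""
    gain = detection_rate(child, canaries) - detection_rate(parent, canaries)
    return gain >= 0.05 and false_positive_rate(child, clean) <= 0.10
```

Replaying every mutated critic against the full canary set is what keeps the evolution honest: a mutation that merely trades one blindspot for another fails the gate.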
This turns Outside Voice from a static second opinion into a cognitive immune system — review strategies that get smarter over time, calibrated against real blindspots.
We've been building this pattern at tcell, inspired by Karpathy's autoresearch. Our five seed critics cold-started at 0% detection and evolved to functional detection rates with 0% false positives.
A key design constraint we found important is a noise budget: at most 1 alert per 10 tool calls. A reviewer that cries wolf is worse than no reviewer.
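A minimal sketch of that noise budget as a sliding window over tool calls. The class name and methods are invented for illustration, not gstack's API; only the 1-alert-per-10-calls ratio is from the constraint above.

```python
from collections import deque

class NoiseBudget:
    """Suppress alerts beyond `max_alerts` per `window` tool calls."""

    def __init__(self, max_alerts: int = 1, window: int = 10):
        self.max_alerts = max_alerts
        self.window = window
        self.calls = 0
        self.alert_calls = deque()  # call indices at which we alerted

    def on_tool_call(self) -> None:
        self.calls += 1

    def alert(self) -> bool:
        """Return True if the alert is allowed to fire, else suppress it."""
        # Expire alerts that have aged out of the sliding window.
        while self.alert_calls and self.calls - self.alert_calls[0] >= self.window:
            self.alert_calls.popleft()
        if len(self.alert_calls) < self.max_alerts:
            self.alert_calls.append(self.calls)
            return True
        return False
```

The budget makes the trade-off explicit: a critic that wants to fire more often has to wait out the window, so evolved critics are pressured toward precision rather than volume.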
Concrete Integration Idea
gstack's Outside Voice could optionally maintain a canaries.jsonl — confirmed cases where the outside review found something the main review missed. When enough canaries accumulate (≥ 20), the Outside Voice prompts could auto-evolve against this dataset.
This would be fully compatible with the current /codex review + /codex challenge flow — just adds a feedback loop that makes each review slightly smarter than the last.
Happy to discuss the architecture further. The core insight is the same as Outside Voice — independent context catches what shared context can't — but with evolution, it compounds.