Description
Summary
Enhance the Plan phase in /implement with multi-perspective planning — two planners with different prompt framings (conservative vs aggressive) whose outputs are debated and synthesized, producing stronger plans that catch blind spots neither perspective would find alone.
Motivation (from Kiln Analysis)
Kiln's Architecture step (Step 4) uses a 7-agent planning pipeline with dual-model debate:
The Agent Roster
| Name | Role | Model |
|---|---|---|
| aristotle | Boss: orchestrates full planning pipeline | Opus |
| architect | Persistent mind: technical authority, live consultant | Opus |
| confucius | Claude-side planner: reads architecture docs + VISION, writes claude_plan.md | Opus |
| sun-tzu | Codex wrapper: thin delegation to GPT-5.4, writes codex_plan.md | Sonnet |
| socrates | Debater: reads both plans, identifies disagreements, writes debate_resolution.md | Opus |
| plato | Synthesizer: delegates master-plan synthesis to GPT-5.4 | Sonnet |
| athena | Validator: validates master-plan on 5 dimensions, binary PASS/FAIL with retry loop | Opus |
The Debate Flow
1. Architect bootstraps → writes architecture.md → marks "complete"
2. Aristotle dispatches planners (gated by architecture.md status)
3. Confucius (Claude Opus) → writes claude_plan.md
   - May consult architect directly with technical questions
4. Sun-Tzu → delegates to GPT-5.4 → writes codex_plan.md
5. Socrates → reads BOTH plans → identifies disagreements
   - Consults architect if architectural clarification needed
   - Writes debate_resolution.md
6. Plato → synthesizes both plans + debate resolution → writes master-plan.md
7. Athena → validates on 5 dimensions → PASS or FAIL
   - FAIL: aristotle triggers retry (max 3 attempts)
   - PASS: writes architecture-handoff.md
8. Aristotle presents plan to operator for approval/edit/abort
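One way to read the flow above is as a dependency graph over the artifacts. The sketch below is illustrative only: the edges are inferred from the step descriptions, the `stages` helper is a hypothetical name, and none of this is Kiln's actual orchestration code. It uses Python's standard `graphlib` to show which artifacts could, under that reading, be produced in the same wave.

```python
from graphlib import TopologicalSorter

# Each artifact maps to the artifacts it reads.
# Assumption: edges are inferred from the flow description, not from Kiln's code.
DEPENDS_ON = {
    "architecture.md": set(),
    "claude_plan.md": {"architecture.md"},
    "codex_plan.md": {"architecture.md"},
    "debate_resolution.md": {"claude_plan.md", "codex_plan.md"},
    "master-plan.md": {"claude_plan.md", "codex_plan.md", "debate_resolution.md"},
    "plan_validation.md": {"master-plan.md"},
}


def stages(graph):
    """Group artifacts into waves; a wave only needs artifacts from earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(stages(DEPENDS_ON), start=1):
    print(number, wave)
```

Under these assumed edges the two planner artifacts land in the same wave, since both are gated only by architecture.md, while the debate, synthesis, and validation artifacts are strictly sequential.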
Artifacts Produced
- claude_plan.md: Claude's plan
- codex_plan.md: GPT-5.4's plan
- debate_resolution.md: Socrates's analysis of disagreements
- plan_validation.md: Athena's validation report
- master-plan.md: final synthesized plan
Why It Works
Different models (and different prompt framings) have different blind spots. Claude excels at reasoning about constraints; GPT excels at concrete implementation. The debate catches assumptions neither would question alone.
Current State in DevFlow
- /implement has a Plan phase that spawns a single Plan agent
- Agent Teams infrastructure exists (/implement-teams.md variant) with debate capability
- No multi-perspective planning: single planner, single viewpoint
- Teams variant uses debate for review, not planning
Technical Approach
Key Insight: Model Diversity Within Claude's Family
DevFlow doesn't need Codex CLI or GPT. Comparable diversity is achievable within Claude's model family by varying the prompt framing and the model tier (Opus vs. Sonnet):
1. Dual Planners with Different Framings
Planner-A (Opus, Conservative Framing):
You are a conservative technical planner. Your priorities:
- What could go wrong? Identify risks, edge cases, failure modes
- What are the constraints? Dependencies, backwards compatibility, performance
- What's the minimal safe change? Smallest diff that achieves the goal
- Where are the dragons? Fragile areas, implicit assumptions, hidden coupling
Planner-B (Sonnet, Aggressive Framing):
You are a pragmatic implementation planner. Your priorities:
- What's the fastest path to shipping? Direct route to working code
- What can we simplify? Remove unnecessary abstractions, YAGNI
- What's the 80/20? Which 20% of work delivers 80% of value
- Where can we leverage existing code? Patterns already in the codebase
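The two framings above could be captured as plain configuration data. This is a minimal sketch, not DevFlow's actual agent definition format: `PlannerConfig`, the `planner-a`/`planner-b` names, and the `"opus"`/`"sonnet"` model strings are hypothetical placeholders. Only the role and priority text comes from the prompts above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannerConfig:
    """Hypothetical container for one planner's model and prompt framing."""

    name: str
    model: str  # placeholder label, not a real model identifier
    role: str
    priorities: tuple[str, ...]

    def system_prompt(self) -> str:
        # Render the framing in the same shape as the prompts quoted above.
        bullets = "\n".join(f"- {p}" for p in self.priorities)
        return f"You are a {self.role}. Your priorities:\n{bullets}"


PLANNER_A = PlannerConfig(
    name="planner-a",
    model="opus",
    role="conservative technical planner",
    priorities=(
        "What could go wrong? Identify risks, edge cases, failure modes",
        "What are the constraints? Dependencies, backwards compatibility, performance",
        "What's the minimal safe change? Smallest diff that achieves the goal",
        "Where are the dragons? Fragile areas, implicit assumptions, hidden coupling",
    ),
)

PLANNER_B = PlannerConfig(
    name="planner-b",
    model="sonnet",
    role="pragmatic implementation planner",
    priorities=(
        "What's the fastest path to shipping? Direct route to working code",
        "What can we simplify? Remove unnecessary abstractions, YAGNI",
        "What's the 80/20? Which 20% of work delivers 80% of value",
        "Where can we leverage existing code? Patterns already in the codebase",
    ),
)

print(PLANNER_A.system_prompt())
```

Keeping the framing as data rather than as two hand-written prompt files would make it straightforward to add a third perspective later, though whether that fits DevFlow's agent files is an open question.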
2. Debate Phase
The synthesizer reads both plans and produces:
- Points of agreement (high confidence — both perspectives converge)
- Points of disagreement (needs resolution — different assumptions)
- Blind spots caught (one planner saw risk the other missed)
- Final synthesized plan incorporating strongest elements of each
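The first three outputs above amount to bucketing each planning topic by how the two plans relate on it. In practice the synthesizer would be an agent reading prose plans; the sketch below only illustrates the bucketing, under the simplifying assumption that each plan has already been reduced to a `topic -> position` mapping. The `compare_plans` function and the sample topics are hypothetical.

```python
def compare_plans(plan_a: dict[str, str], plan_b: dict[str, str]) -> dict:
    """Bucket topics into agreements, disagreements, and one-sided blind spots."""
    shared = plan_a.keys() & plan_b.keys()
    return {
        # Both planners raised the topic and reached the same position.
        "agreements": sorted(t for t in shared if plan_a[t] == plan_b[t]),
        # Both raised the topic but chose differently: needs resolution.
        "disagreements": sorted(t for t in shared if plan_a[t] != plan_b[t]),
        # Only one planner raised the topic: a blind spot for the other.
        "blind_spots": {
            "only_a": sorted(plan_a.keys() - plan_b.keys()),
            "only_b": sorted(plan_b.keys() - plan_a.keys()),
        },
    }


# Illustrative inputs only; these topics are made up for the example.
conservative = {
    "migration": "add nullable column first",
    "caching": "none",
    "rollback": "feature flag",
}
aggressive = {
    "migration": "add nullable column first",
    "caching": "in-memory LRU",
}

report = compare_plans(conservative, aggressive)
print(report)
```

Here `migration` is an agreement, `caching` is a disagreement, and `rollback` is a blind spot that only the conservative planner caught. Producing the final synthesized plan from this report is the part that still needs judgment.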
3. Integration with Agent Teams
This naturally extends the existing Agent Teams infrastructure:
- Teams variant already supports debate between agents
- Add a planning-specific debate protocol to /implement-teams.md
- Reuse existing TeamCreate/TeamDebate/TeamDelete lifecycle
4. Validation Loop (Optional)
After synthesis, a validation step checks the plan against:
- Does it address all requirements from the task?
- Are there untested assumptions?
- Is the scope reasonable for the change?
- Are rollback steps identified?
If validation fails, retry with feedback (max 2 retries).
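The retry-with-feedback loop can be sketched as follows. This is a hedged illustration, not DevFlow code: `plan_with_validation`, `fake_synthesize`, and `fake_validate` are hypothetical names, and the two stubs stand in for the synthesizer and validator agents. The validator returns the list of failed checks, which is fed back into the next synthesis attempt.

```python
from typing import Callable


def plan_with_validation(
    synthesize: Callable[[list[str]], str],
    validate: Callable[[str], list[str]],
    max_retries: int = 2,
) -> tuple[str, int]:
    """Return (plan, retries_used); raise if the plan never validates."""
    feedback: list[str] = []
    for retries_used in range(max_retries + 1):  # first attempt + retries
        plan = synthesize(feedback)
        feedback = validate(plan)  # empty list means every check passed
        if not feedback:
            return plan, retries_used
    raise RuntimeError(f"plan still failing after {max_retries} retries: {feedback}")


# Stub callables standing in for the synthesizer and validator agents.
def fake_synthesize(feedback: list[str]) -> str:
    return "steps + rollback" if feedback else "steps"


def fake_validate(plan: str) -> list[str]:
    return [] if "rollback" in plan else ["Are rollback steps identified?"]


plan, retries_used = plan_with_validation(fake_synthesize, fake_validate)
print(plan, retries_used)
```

With `max_retries=2` the synthesizer runs at most three times in total. What should happen when all attempts fail (abort, or present the last plan to the user with the open failures) is a design choice this sketch leaves to the caller by raising.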
Acceptance Criteria
- Two planner agents with different prompt framings (conservative + aggressive)
- Both produce independent plan artifacts
- Debate/synthesis phase identifies agreements, disagreements, and blind spots
- Final synthesized plan presented to user for approval
- Integrated with Agent Teams infrastructure (Teams variant)
- Optional validation loop with retry
- Base (non-Teams) /implement unaffected; dual planning only in Teams variant
References
- Kiln repo: Step 4 agents (aristotle, architect, confucius, sun-tzu, socrates, plato, athena)
- DevFlow Agent Teams: shared/agents/, Teams command variants
- DevFlow /implement: plugins/devflow-implement/commands/
- Priority: Medium impact, Low effort ("Quick win", leverages existing Agent Teams)