Context
Surfaced during PR #477 (#401 teachback gate, session pact-b791f352, 2026-04-20). A 6+ hour session with 3 remediation cycles, blind round 2, and a rebase-then-force-push exposed 3 orchestrator-specialist coordination failure modes that hadn't been salient in shorter sessions. All 3 are fixable at the platform or protocol level, and all 3 are observable and repeatable.
Failure Mode 1 — Name-collision on specialist reuse
Pattern: Orchestrator spawns a fresh agent with name=X while X is already alive (idle consultant from earlier phase). Platform silently auto-appends a suffix (X → X-2) and creates a distinct agent rather than failing or aliasing.
Observed instances this session (3):
secretary at harvest dispatch → secretary-2 spawned; original + new both ran harvest
blind-architect-2 at fix dispatch → blind-architect-2-2 spawned; both instances produced teachbacks
blind-backend-coder-2 at fix dispatch → blind-backend-coder-2-2 spawned; both produced nearly identical commits (one actually landed; the other's staging was a no-op)
Consequence: SendMessages + task ownership go to one instance; dispatch prompt goes to the other; orchestrator confusion about which instance is "the" fixer; wasted context on duplicate work.
Mitigation (orchestrator): Always use SendMessage for reuse-as-fixer rather than re-spawning with the same name. Alternatively, spawn with explicitly differentiated names (arch-fixer-cycle3 rather than a second blind-architect-2).
Mitigation (platform/PACT): Make TeamJoin fail or alias (not silently suffix) when name matches an existing teammate. Alternative: add documentation to pact-agent-teams skill explicitly warning against this anti-pattern.
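The fail-or-alias behaviour proposed for TeamJoin can be sketched as below. This is only an illustration under assumptions: `TeamRegistry`, `NameCollisionError`, and the `on_collision` parameter are hypothetical names invented for the example, not PACT's or the platform's actual API.

```python
class NameCollisionError(ValueError):
    """Raised when a spawn request reuses the name of a live teammate."""


class TeamRegistry:
    """Hypothetical in-memory roster; NOT the real TeamJoin implementation."""

    def __init__(self):
        self._live = {}  # teammate name -> agent handle (any object)

    def join(self, name, agent, on_collision="fail"):
        if on_collision not in ("fail", "alias"):
            raise ValueError(f"unknown collision policy: {on_collision!r}")
        existing = self._live.get(name)
        if existing is None:
            self._live[name] = agent
            return agent
        if on_collision == "alias":
            # Hand back the live teammate instead of creating a second instance.
            return existing
        # Default: refuse loudly rather than silently spawning "name-2".
        raise NameCollisionError(
            f"teammate {name!r} is already alive; message it or pick a distinct name"
        )

    def leave(self, name):
        # Free the name once the teammate has shut down.
        self._live.pop(name, None)


registry = TeamRegistry()
first = registry.join("secretary", object())
alias = registry.join("secretary", object(), on_collision="alias")
try:
    registry.join("secretary", object())
    collided = False
except NameCollisionError:
    collided = True
```

Either policy leaves exactly one live `secretary`, so messages, task ownership, and the dispatch prompt cannot end up on different instances.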
Failure Mode 2 — Async stale-state replies from teammates
Pattern: SendMessage is fire-and-forget async. When orchestrator sends "stand down" or "stop," teammate may compose their next reply based on state from 2-3 turns ago, seeing the stand-down only after they've acted.
Observed instances this session (~6):
Multiple teammates responded with "proceeding" replies that crossed my redirect messages
blind-backend-coder-2 committed despite my stand-down (their reply acknowledged the stand-down AFTER commit landed)
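blind-architect-2 composed a fix teachback after being told to stand down; it arrived after my redirect, but from their point of view it was composed before
Consequence: wasted cycles, duplicated work, confused ownership signals, erosion of orchestrator directive authority.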
Mitigation (orchestrator): For genuinely terminal directives (stop acting, shut down), use structured shutdown_request rather than text. Platform treats shutdown_request unambiguously; text replies are agent-interpreted.
Mitigation (PACT): Document "stand down vs shutdown_request" in pact-agent-teams skill. Text "stand down" is advisory; shutdown_request is structural. Teammates should treat text directives as best-effort under async conditions.
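The advisory-versus-structural distinction could look roughly like the sketch below. It assumes a teammate loop that drains its inbox immediately before each side effect and a message dict with a `type` key; both are assumptions made for illustration, since the real shape of shutdown_request and the platform's delivery semantics are not described here.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Teammate:
    """Hypothetical teammate loop; message shapes are illustrative assumptions."""

    name: str
    inbox: deque = field(default_factory=deque)
    stopped: bool = False
    actions: list = field(default_factory=list)
    advisories: list = field(default_factory=list)

    def deliver(self, message):
        # Fire-and-forget: the sender gets no confirmation the message was read.
        self.inbox.append(message)

    def _drain_inbox(self):
        while self.inbox:
            message = self.inbox.popleft()
            if message["type"] == "shutdown_request":
                self.stopped = True  # structural: always honoured
            else:
                # Text is advisory: recorded for the agent to interpret.
                self.advisories.append(message["text"])

    def act(self, action):
        """Check for a structural stop immediately before a side effect."""
        self._drain_inbox()
        if self.stopped:
            return False
        self.actions.append(action)
        return True


coder = Teammate("blind-backend-coder-2")
coder.deliver({"type": "text", "text": "stand down"})
committed_after_text = coder.act("git commit")  # text alone does not block
coder.deliver({"type": "shutdown_request"})
committed_after_shutdown = coder.act("git commit")  # structural stop blocks
```

In this sketch the text "stand down" is recorded but the commit still proceeds, which mirrors the behaviour observed in the session; only the structured request actually prevents the next action.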
Failure Mode 3 — Completion-gate ambiguity (blocked vs done)
Pattern: teammate_completion_gate.py fires on idle + HANDOFF-metadata presence, interpreting "has produced output" as "done." Agents under sustained gate pressure either procedurally mark complete despite partial work, OR push through unauthorized scope to stop the nudges.
Observed instances this session:
backend-coder-1 at 3-commit context wall: gate fired 60+ times while work was PARTIAL (12 of 15 commits remained). Coder held integrity line via SACROSANCT completion rule but eventually procedural-closed to break the idle loop.
review-architect at docs/ gitignored blocker: gate fired repeatedly while waiting for lead decision on option 1/2/3; coder flagged "marked completed procedurally to stop the loop but NO artifact produced."
Consequence: False "completed" status on task list; true state masked by procedural close; orchestrator loses visibility into partial-work situations.
Mitigation (PACT): Extend handoff metadata schema with a status field (partial | blocked | done). The completion gate should distinguish these states: only fire the "mark completed" nudge on done; on partial / blocked, surface the blocker to the orchestrator instead.
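A minimal sketch of the status-aware gate decision follows. The function name `gate_action`, the return values, and the dict shape of the handoff metadata are hypothetical; the actual internals of teammate_completion_gate.py are not shown in this issue. Escalating on a missing or unknown status is also an assumption of the sketch, not something proposed above.

```python
from enum import Enum


class HandoffStatus(str, Enum):
    PARTIAL = "partial"
    BLOCKED = "blocked"
    DONE = "done"


def gate_action(is_idle, handoff):
    """Decide what the completion gate should do for one teammate.

    `handoff` is the teammate's HANDOFF metadata as a dict, or None if absent.
    Returns "none", "nudge_complete", or "escalate".
    """
    if not is_idle or handoff is None:
        return "none"
    try:
        status = HandoffStatus(handoff.get("status"))
    except ValueError:
        # Missing or unknown status: escalate rather than assume "done".
        return "escalate"
    if status is HandoffStatus.DONE:
        return "nudge_complete"
    # partial / blocked: surface to the orchestrator instead of nudging.
    return "escalate"


decisions = {
    "done": gate_action(True, {"status": "done"}),
    "partial": gate_action(True, {"status": "partial"}),
    "blocked": gate_action(True, {"status": "blocked"}),
}
```

With this shape, a teammate stuck at a context wall reports `partial` once and is escalated, instead of being nudged repeatedly until it closes the task procedurally.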
Shared root cause
All 3 failure modes are amplified by session complexity: long sessions, multiple remediation cycles, reviewer-as-fixer reuse, and parallel teammate dispatch each add coordination risk, and those risks stack. PACT's current skills cover single-cycle orchestration well; the multi-cycle / long-running case has gaps.
Suggested scope for implementation
Rather than 3 separate fixes, I'd propose one coordinated patch:
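Documentation: update pact-agent-teams skill with an "orchestrator coordination patterns" section covering reuse-via-SendMessage, shutdown_request usage, and completion-gate semantics.
handoff.status: partial | blocked | done (small schema change; enables gate refinement).
teammate_completion_gate.py reads the new status field, fires the completion nudge only on done, and surfaces partial / blocked to the orchestrator.
Evidence artifacts from this session
~/.claude/pact-sessions/PACT-prompt/b791f352-ab87-4e6a-a618-05227e63284c/session-journal.jsonl (remediation cycle events + review_dispatch events)
Memories saved by secretary capturing the 3 failure modes as institutional knowledge (visible in pact-memory under entities shared-worktree-race, force-push-state-sync, procedural-close-pattern)
Priority
Medium. The session ran to completion and shipped correct work, so there is no viability threat. But these patterns will recur in future multi-cycle sessions and are worth fixing structurally rather than ad hoc per session.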