Redesign teachback protocol: task-graph-native blocking (replace cooperative halt) #489

@michael-wojcik

Context

Surfaced during PR #477 round-8 dogfooding session 2026-04-20. PR #477 shipped cycles 9 + 9b + 10 for teachback gate. Cycle-10 introduced cooperative-halt blocking semantics: teammate sends teachback_submit, then waits idle for lead's teachback_approved before starting work.

Problem with cooperative-halt model (observed empirically)

Cooperative halt is not well-supported by the platform. Multiple teammates got stuck in idle loops during the session:

| Teammate | Failure mode |
| --- | --- |
| cycle9-fixer | Idle-pinged "awaiting approval" because TaskGet doesn't surface metadata; couldn't verify the approval was written |
| round8-architect (orig) | 5+ identical "Idle. Awaiting teachback_approved" pings, shut down without producing review |
| round8b-architect (relaunch) | "Ignoring. Waiting." no-op loop consuming tokens per TeammateIdle hook fire |
| round8-tester (recovered) | Brief stall; self-diagnosed by reading raw task JSON file |

Root cause: TeammateIdle hook fires periodically (intentional keep-alive), teammate wakes, checks state, sees "still waiting," emits a no-op, re-enters idle. Livelock driven by hook noise × passive-wait protocol × no mechanical wake-signal distinguishing "approval arrived" from "hook false-positive."
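
The loop described above can be illustrated with a toy simulation. This is only a sketch of the failure shape, not platform code: `passive_wait`, `hook_fires`, and `approval_at` are hypothetical names, and the real TeammateIdle hook is not modeled beyond "it wakes the teammate."

```python
from typing import Optional


def passive_wait(hook_fires: int, approval_at: Optional[int]) -> dict:
    """Toy model of a teammate under cooperative halt (hypothetical).

    hook_fires:  number of times the idle hook wakes the teammate.
    approval_at: index of the first wake on which the approval is visible,
                 or None if the teammate can never observe it.
    """
    noop_turns = 0
    for wake in range(hook_fires):
        if approval_at is not None and wake >= approval_at:
            # Approval is finally visible, so the teammate can start work.
            return {"started_work": True, "noop_turns": noop_turns}
        # Every other wake is indistinguishable from a false positive:
        # check state, see "still waiting", emit a no-op, go idle again.
        noop_turns += 1
    return {"started_work": False, "noop_turns": noop_turns}
```

When the approval is never observable (the cycle9-fixer case), every hook fire costs one no-op turn and the count grows without bound.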

Additional gaps:

Proposed redesign: teachback-as-task

Replace cooperative halt with two-task dispatch pattern using PACT's native blockedBy mechanism:

Lead dispatches teammate with TWO tasks:
  Task A: teachback (small, structured)
  Task B: primary work, blockedBy=[A]

Flow:

  1. Teammate claims Task A first, submits teachback content as its HANDOFF/metadata, marks A completed
  2. Lead reviews Task A, writes approval as resolution (approve → A stays completed, B unblocks automatically; corrections → reopen A, teammate re-claims)
  3. blockedBy edge mechanically prevents teammate from starting B until A is resolved — no custom wait-state, no idle-hook loop
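
A minimal in-memory sketch of the flow above. Everything here is an assumption for illustration: `Task`, `TaskGraph`, and their methods are hypothetical stand-ins, not the PACT task API. The sketch also picks one answer to an unresolved question, namely that the lead marks Task A completed on approval; the teammate-self-completes variant would need a different gate.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a task record; not the real schema."""
    task_id: str
    subject: str
    status: str = "pending"  # pending | in_progress | completed
    blocked_by: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


class TaskGraph:
    def __init__(self):
        self.tasks = {}

    def add(self, task):
        self.tasks[task.task_id] = task
        return task

    def claimable(self):
        """Pending tasks whose blockers are all completed."""
        return [
            t.task_id
            for t in self.tasks.values()
            if t.status == "pending"
            and all(self.tasks[dep].status == "completed" for dep in t.blocked_by)
        ]

    def claim(self, task_id):
        if task_id not in self.claimable():
            raise RuntimeError(f"{task_id} is blocked or not pending")
        self.tasks[task_id].status = "in_progress"

    def submit_teachback(self, task_id, content):
        # Teammate writes teachback content onto Task A.
        self.tasks[task_id].metadata["teachback_submit"] = content

    def approve(self, task_id, approval):
        # Assumption: the lead completes Task A when approving.
        task = self.tasks[task_id]
        task.metadata["teachback_approved"] = approval
        task.status = "completed"

    def request_corrections(self, task_id, notes):
        # Reopen Task A so the teammate can re-claim and revise.
        task = self.tasks[task_id]
        task.metadata["corrections"] = notes
        task.status = "pending"


graph = TaskGraph()
graph.add(Task("A", "teachback"))
graph.add(Task("B", "primary work", blocked_by=["A"]))
```

After the teammate claims A and submits, `graph.claimable()` is empty, so the ordinary "no work, go idle" path applies; B only becomes claimable once `approve("A", ...)` runs.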

Why this is architecturally better

  1. Kills the livelock. Teammate completes A and has no claimable work (B blocked). Standard "no work → idle" path triggers; no custom wait protocol.
  2. Uses platform-native blocking. blockedBy already gates PACT phases. We're reusing a mechanism, not inventing one.
  3. Auditability by construction. Task A's state + resolution IS the approval record. TaskGet/TaskList surface it directly; no reader-path ambiguity.
  4. Corrections are natural. Reopen Task A → teammate re-claims → revises → completes.
  5. Observability is free. Stuck teachback = incomplete task with clear owner + timestamp. No #486-style hook state-awareness (handoff_gate + teammate_completion_gate learning a teachback_pending/awaiting_approval state) needed.
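
To make the observability point concrete, here is a hedged sketch of how a stuck teachback could be found from task state alone. `TaskRecord`, its `kind` and `updated_at` fields, and `stuck_teachbacks` are all hypothetical; the real task schema may expose owner and timestamps differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TaskRecord:
    """Hypothetical projection of a task; field names are assumptions."""
    task_id: str
    kind: str        # "teachback" or "primary"
    owner: str
    status: str      # pending | in_progress | completed
    updated_at: datetime


def stuck_teachbacks(tasks, now, max_age):
    """Return teachback tasks that are incomplete and older than max_age."""
    return [
        t for t in tasks
        if t.kind == "teachback"
        and t.status != "completed"
        and now - t.updated_at > max_age
    ]


now = datetime(2026, 4, 20, 12, 0, tzinfo=timezone.utc)
tasks = [
    TaskRecord("A1", "teachback", "round8-architect", "in_progress",
               now - timedelta(minutes=45)),
    TaskRecord("A2", "teachback", "round8-tester", "completed",
               now - timedelta(minutes=50)),
    TaskRecord("B1", "primary", "round8-architect", "pending",
               now - timedelta(minutes=45)),
]
stuck = stuck_teachbacks(tasks, now, timedelta(minutes=30))
```

Here `stuck` contains only `A1`: the completed teachback and the blocked primary task are not reported, and no hook needs to track a separate waiting state.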

Open design questions

  1. Handoff shape: does teachback_submit content live in Task A's description (lead writes expected structure), handoff metadata (teammate writes), or both?
  2. Approval shape: where does lead's teachback_approved 5 sub-fields live on Task A — metadata? description update? comment?
  3. Who marks Task A completed? Teammate self-completes after submitting? Or lead completes after approving? Convention change either way.
  4. Parallel dispatch: spawning 3 reviewers — create 6 tasks upfront (3 teachbacks + 3 primaries) or 3 teachbacks first + dispatch primaries after approvals?
  5. Relation to existing hook layer: cycles 1-8 built mechanical enforcement in teachback_check.py + teachback_gate.py + teachback_scan.py. Does that become redundant under the task-graph-native model, or does it stay as belt-and-suspenders?
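
For question 4, the "create everything upfront" option could look roughly like the sketch below. It illustrates that one option only and is not a recommendation; `build_dispatch`, the ID scheme, and the dict keys are hypothetical, with `blockedBy` borrowed from the proposal above.

```python
def build_dispatch(reviewers):
    """Build teachback + primary task specs for each reviewer (hypothetical).

    Each primary task is blocked by that reviewer's own teachback task, so
    one slow approval does not hold up the other reviewers.
    """
    specs = []
    for name in reviewers:
        teachback_id = f"{name}-teachback"
        specs.append({
            "id": teachback_id,
            "owner": name,
            "kind": "teachback",
            "blockedBy": [],
        })
        specs.append({
            "id": f"{name}-primary",
            "owner": name,
            "kind": "primary",
            "blockedBy": [teachback_id],
        })
    return specs


specs = build_dispatch(["reviewer-1", "reviewer-2", "reviewer-3"])
```

Three reviewers yield six task specs. The alternative in the question (teachbacks first, primaries dispatched after approval) would create the `primary` entries lazily instead.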

What survives from PR #477

  • Cycles 9 + 9b (load-bearing): structured content shape (teachback_submit 4 fields, teachback_approved 5 fields), universal-at-every-dispatch imperative, hard-rule MUST tone, Pask's CT framing. These don't depend on HOW the teammate waits — only on WHAT content gets exchanged. Proposal: cherry-pick to a clean branch.
  • Mechanical scanner/classifier (earlier cycles): stays useful as fallback/belt-and-suspenders even under new design.
  • Cycle-10 cooperative-halt mechanism: rewritten or replaced. The protocol-doc reversal (Proceeding-unless-corrected → Halting-until-approved) is partially right semantically but specifies a wait mechanism that doesn't work in practice.

What this redesign replaces

  • cycle-10 cooperative-halt semantics in pact-ct-teachback.md, pact-protocols.md, pact-agent-teams SKILL, pact-teachback SKILL, orchestration SKILL, peer-review.md
  • Queued #92 (cycle-11 immediate-idle) — not needed if there's no passive-wait to minimize
  • Queued #93 (cycle-12 reader-path) — partially obsolete; approval content goes on the task itself
  • #486 (hook state-awareness for awaiting_approval) — not needed; task state IS the state

Approach

Fresh planning session (likely /PACT:plan-mode) covering:

  • Full state-machine specification (task-graph style)
  • Dispatch API changes (what Task() prompt + dispatch hierarchy looks like)
  • Hook changes (remove teachback_gate.py cooperative-halt logic or repurpose)
  • Skill doc rewrites (pact-teachback/SKILL.md, pact-agent-teams/SKILL.md, orchestration/SKILL.md)
  • Migration from advisory-mode current state

Priority

HIGH — blocks Phase-2 activation of #401 (issue #481) since the underlying wait mechanism doesn't work. Should not re-enforce cycle-10's cooperative halt.
