Personas go silent after first response wave — Rust full_evaluate gate or InferenceCoordinator slot leak #919

@joelteply

Symptom

Live observation on M5, 2026-04-17 ~18:24-18:27 PT, on feature/prefix-reuse-and-multimodal branch (with embedding throttle cherry-picked):

  1. 6:24:00 — Joel: "oh hey guys"
  2. 6:24:43-58 — Helper AI, CodeReview AI, Teacher AI all reply with friendly greetings (~43-58s response time, totally fine)
  3. 6:25:36 — Claude Code (jtag): "@Helper phase1-probe-A" → no response
  4. 6:25:46 — Claude Code (jtag): "@Helper phase1-probe-B" → no response
  5. 6:26:48 — Joel: "Hey, we are working on some optimizations for your thinking and rag really" → no response (sustained silence beyond 5 min)

The first wave fires fine. After the first response per persona, the persona stops responding to anything.

Hypotheses (per memento's earlier dig today)

  1. Rust full_evaluate gate: PersonaCognitionEngine.full_evaluate returns should_respond=false after recent activity. Could be a recent-burst dampener that's tuned too aggressively, or stale state.
  2. InferenceCoordinator slot leak: slots claimed during the first wave aren't released after generation completes. Once 2 slots (the configured cap) are held, all subsequent personas wait forever for a free slot.
  3. AIProviderRustClient IPC reconnect race: the IPC connection silently fails under load, so calls hang.
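
To make hypothesis 2 concrete, here is a minimal sketch of how a slot can leak on an early-return path, and how tying the slot to a guard value avoids it. Everything in it is hypothetical: SlotPool, SlotGuard, try_claim and generate are illustration names, not the real InferenceCoordinator API, and the real coordinator may well be async or live on the TypeScript side. The only detail taken from this issue is the cap of 2.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the coordinator's slot pool (not the project's real type).
#[derive(Clone)]
pub struct SlotPool {
    cap: usize,
    in_use: Arc<Mutex<usize>>,
}

/// Holding a SlotGuard means holding one slot; dropping it gives the slot back.
pub struct SlotGuard {
    in_use: Arc<Mutex<usize>>,
}

impl SlotPool {
    pub fn new(cap: usize) -> Self {
        Self { cap, in_use: Arc::new(Mutex::new(0)) }
    }

    /// Non-blocking claim: returns None when every slot is already held.
    pub fn try_claim(&self) -> Option<SlotGuard> {
        let mut n = self.in_use.lock().unwrap();
        if *n >= self.cap {
            return None;
        }
        *n += 1;
        Some(SlotGuard { in_use: Arc::clone(&self.in_use) })
    }

    /// Number of slots currently held, for logging or assertions.
    pub fn in_use(&self) -> usize {
        *self.in_use.lock().unwrap()
    }
}

impl Drop for SlotGuard {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?` or `return Err(..)`.
        let mut n = self.in_use.lock().unwrap();
        *n -= 1;
    }
}

/// Simulated generation that may fail; the slot is released on success and on error.
pub fn generate(pool: &SlotPool, fail: bool) -> Result<String, String> {
    let _slot = pool.try_claim().ok_or_else(|| "no free slot".to_string())?;
    if fail {
        return Err("generation failed".to_string());
    }
    Ok("reply".to_string())
}
```

If the real code claims and releases slots with two separate calls, any error or early return between them would strand the slot, which would match "first wave fine, then silence" once both slots are stuck. Logging in_use() after each generation would confirm or rule this out quickly.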

What is NOT the cause

Acceptance for the fix

  • After 5+ messages to @helper over 30 seconds, Helper responds to all 5 (or returns explicit "low-energy/skipping" log line, not silent failure).
  • gpu/eviction-registry or equivalent shows InferenceCoordinator slots being released after each generation, not held.
  • Logs surface the gate decision: every silent persona logs why it skipped (should_respond=false: reason=...), not silent skip.
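
For the last criterion, a rough sketch of what "log the gate decision" could look like. GateDecision, evaluate, log_line and the cooldown rule are all assumptions made up for illustration; the actual return type of full_evaluate and whatever dampening it applies are not shown in this issue. The point is only that a skip carries a reason and always produces a log line.

```rust
/// Hypothetical shape for a gate result; the real full_evaluate return type may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum GateDecision {
    Respond,
    Skip { reason: String },
}

/// Hypothetical burst dampener: skip when the persona replied less than `cooldown_ms` ago.
/// `None` means the persona has not replied yet.
pub fn evaluate(ms_since_last_reply: Option<u64>, cooldown_ms: u64) -> GateDecision {
    match ms_since_last_reply {
        Some(elapsed) if elapsed < cooldown_ms => GateDecision::Skip {
            reason: format!("recent_burst elapsed_ms={} cooldown_ms={}", elapsed, cooldown_ms),
        },
        _ => GateDecision::Respond,
    }
}

/// Render exactly one log line per decision so a skip is never silent.
pub fn log_line(persona: &str, decision: &GateDecision) -> String {
    match decision {
        GateDecision::Respond => {
            format!("[gate] persona={} should_respond=true", persona)
        }
        GateDecision::Skip { reason } => {
            format!("[gate] persona={} should_respond=false: reason={}", persona, reason)
        }
    }
}
```

With this shape, a caller cannot drop a skip on the floor without at least having a reason string in hand, and a grep for should_respond=false would show whether hypothesis 1 is what is muting the personas.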

Why this matters

The PR #914 work + Phase 1 ordering + Candle eager-load fix all unblock GPU inference. None of that matters if the cognition gate is muting personas after first turn — user sees "alive once, then dead."

Cross-reference
