fix: ensure base tmux session exists at startup (v0.2.45)#132

Merged
gbasin merged 10 commits into master from fix/ensure-base-tmux-session
May 5, 2026

@gbasin gbasin commented May 4, 2026

Summary

Live wakes were silently killing the freshly-created window. The base `agentboard` session was never created in production code paths — `tmux has-session -t agentboard` returns success on a session group match, so `ensureSession()` was a silent no-op whenever any per-connection `agentboard-ws-*` session existed. Listings (`SessionManager.listWindowsForSession`, `sessionRefreshWorker.listAllWindows`) skip every `-ws-` session as a proxy artifact, leaving an empty windowSet → orphan check kills the just-woken window within ~10ms.

Symptom in the wild:

```
session_wake_success tmuxWindow=agentboard:@1 durationMs=66
session_orphaned currentWindow=agentboard:@1 windowSetSize=0 windowSetSample=[]
window_killed tmuxWindow=agentboard:@1
ERR_INVALID_WINDOW
```
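
The failure sequence above can be reproduced in miniature. The following is a minimal sketch, not the project's code: the `TmuxWindow` shape and the `buildWindowSet` / `isOrphaned` helpers are hypothetical names, and the `-ws-` substring test is an assumption about how the listings recognize per-connection sessions.

```typescript
// Hypothetical shape; the real agentboard types are not shown in this PR.
interface TmuxWindow {
  session: string; // e.g. "agentboard" or "agentboard-ws-abc123"
  id: string; // e.g. "@1"
}

// Mirrors the described listing behaviour: every `-ws-` session is skipped
// as a proxy artifact, so only windows reported under other sessions count.
function buildWindowSet(windows: TmuxWindow[]): Set<string> {
  const set = new Set<string>();
  for (const w of windows) {
    if (w.session.includes("-ws-")) continue;
    set.add(`${w.session}:${w.id}`);
  }
  return set;
}

// A session is treated as orphaned when its recorded window is not listed.
function isOrphaned(currentWindow: string, windowSet: Set<string>): boolean {
  return !windowSet.has(currentWindow);
}

// Before the fix: no base session exists, so the woken window is only
// reported through the per-connection grouped session and gets skipped.
const beforeFix = buildWindowSet([
  { session: "agentboard-ws-abc123", id: "@1" },
]);
console.log(beforeFix.size, isOrphaned("agentboard:@1", beforeFix)); // 0 true

// After the fix: the base session exists and reports the window itself.
const afterFix = buildWindowSet([
  { session: "agentboard", id: "@1" },
  { session: "agentboard-ws-abc123", id: "@1" },
]);
console.log(afterFix.size, isOrphaned("agentboard:@1", afterFix)); // 1 false
```

With an empty window set every recorded window looks orphaned, which matches the `windowSetSize=0` line in the log.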

Fix

  • `SessionManager.ensureSession()` now uses the `=` exact-match prefix on `has-session` so a session-group match doesn't satisfy the existence check.
  • Tmux requires every session to have ≥ 1 window. We create the base session with a placeholder window named `__agentboard_root__` running `tail -f /dev/null`, defined in `tmuxFormat.ts` as `BOOTSTRAP_WINDOW_NAME` / `BOOTSTRAP_WINDOW_COMMAND`.
  • `SessionManager.listWindowsForSession` and `sessionRefreshWorker.listAllWindows` filter out windows whose name matches `BOOTSTRAP_WINDOW_NAME` so the placeholder never appears in the UI/API.
  • `index.ts` calls `sessionManager.ensureSession()` at startup so the base session exists before the first listing/wake.
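
The steps in this list can be sketched roughly as follows. This is an illustration under assumptions, not the merged implementation: the `TmuxRunner` type, the argument-builder helpers, and `visibleWindows` are hypothetical, the runner is synchronous for brevity, and the placeholder name uses the value given in the commit message. The tmux flags themselves (`has-session -t =name` for an exact match, `new-session -d -s … -n …`) are standard.

```typescript
// Constant names come from the PR description; values follow the commit message.
const BOOTSTRAP_WINDOW_NAME = "__agentboard_root__";
const BOOTSTRAP_WINDOW_COMMAND = "tail -f /dev/null";

// Hypothetical runner: executes `tmux <args>` and returns its exit code.
type TmuxRunner = (args: string[]) => number;

// The `=` prefix asks tmux for an exact session-name match, so a grouped
// `agentboard-ws-*` session no longer satisfies the check.
function hasSessionArgs(session: string): string[] {
  return ["has-session", "-t", `=${session}`];
}

// Detached session whose only window is the named placeholder.
function newSessionArgs(session: string): string[] {
  return [
    "new-session", "-d",
    "-s", session,
    "-n", BOOTSTRAP_WINDOW_NAME,
    BOOTSTRAP_WINDOW_COMMAND,
  ];
}

// Returns true when the session had to be created, false if it already existed.
function ensureSession(session: string, run: TmuxRunner): boolean {
  if (run(hasSessionArgs(session)) === 0) return false;
  const code = run(newSessionArgs(session));
  if (code !== 0) {
    throw new Error(`tmux new-session failed with exit code ${code}`);
  }
  return true;
}

// Listing filter: the placeholder never reaches the UI or API.
function visibleWindows<T extends { name: string }>(windows: T[]): T[] {
  return windows.filter((w) => w.name !== BOOTSTRAP_WINDOW_NAME);
}

// Usage with a fake runner that pretends the base session is missing.
const issued: string[][] = [];
const created = ensureSession("agentboard", (args) => {
  issued.push(args);
  return args[0] === "has-session" ? 1 : 0;
});
console.log(created, issued.map((a) => a.join(" ")));
```

Injecting the runner keeps the sketch testable without a tmux server; a real caller would pass a function that spawns `tmux`.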

Bumps version to 0.2.45.

Test plan

  • `bun run lint` (0 warnings, 0 errors)
  • `bun run typecheck` (clean)
  • `bun run test` (full runner: 821 tests, 0 fail)
  • Smoke test: dev server on a fresh tmux state — base session created with placeholder, `/api/sessions` returns 0 entries, then waking a real session (`cool-harp`) produces `session_wake_success` with no follow-up `session_orphaned` / `window_killed`. Window persists; `/api/sessions` shows it as `status=working`.

Why "create the base session" rather than "make the worker tolerate -ws- views"?

Considered both. Creating the base makes the invariant uniform: `agentboard:@N` is always a real addressable target. The worker-fallback alternative would couple multiple code paths to the group-resolution semantic and rewrite `tmuxWindow` strings under the hood — robust now, footgun later. The placeholder cost is one named window plus a 1-line filter.

gbasin added 10 commits May 4, 2026 10:10
The base agentboard session was never created in production code paths.
`tmux has-session -t agentboard` returns success whenever any session
in the `agentboard` group exists (e.g. per-connection `-ws-` sessions),
so `ensureSession()` was a silent no-op. Listings (`SessionManager.listWindows`,
`sessionRefreshWorker.listAllWindows`) skip every `-ws-` session as a
proxy artifact, leaving an empty window set. Wake operations would then
create a window, the orphan check would see an empty set, and the just-
created window would be killed (`session_orphaned` followed by
`window_killed` within ~10ms).

Fix: always ensure the base session exists at startup. Tmux requires
every session to contain at least one window, so we use a known-named
placeholder window (`__agentboard_root__` running `tail -f /dev/null`)
and filter that window out of listings.

`has-session` now uses the `=` exact-match prefix so a session-group
match doesn't satisfy the existence check.

Errors observed in the wild: `tmux list-windows failed: no server
running on /private/tmp/tmux-501/default` followed by `windowSetSize=0`
and immediate kill of every freshly-woken window.

When a session's tmux window vanishes unexpectedly (server restart,
tmux kill-server, refresh-worker detection), the session row now
auto-promotes to Hibernating instead of falling to History. Deliberate
kills are unaffected because handleKill explicitly clears is_pinned
in the same write that nulls current_window — they still land in
History.
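
One way to picture the two paths, as a sketch only: the commit message names `is_pinned` and `current_window` but not how the Hibernating bucket is derived, so the `bucketFor` rule below (pinned with no window means Hibernating) and the idea that promotion sets `is_pinned` are assumptions, and all function names are hypothetical.

```typescript
// Hypothetical row shape; only the two column names appear in the commit message.
interface SessionRow {
  current_window: string | null;
  is_pinned: boolean;
}

type Bucket = "active" | "hibernating" | "history";

// Assumed derivation: a row without a window is Hibernating only if pinned.
function bucketFor(row: SessionRow): Bucket {
  if (row.current_window !== null) return "active";
  return row.is_pinned ? "hibernating" : "history";
}

// Unexpected loss (server restart, kill-server, refresh-worker detection):
// assumed to pin the row in the same write that clears the window.
function onWindowLost(row: SessionRow): SessionRow {
  return { ...row, current_window: null, is_pinned: true };
}

// Deliberate kill: clears is_pinned in the same write that nulls the window.
function onDeliberateKill(row: SessionRow): SessionRow {
  return { ...row, current_window: null, is_pinned: false };
}

const live: SessionRow = { current_window: "agentboard:@1", is_pinned: false };
console.log(bucketFor(onWindowLost(live))); // hibernating
console.log(bucketFor(onDeliberateKill(live))); // history
```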

Reverses the "never-hibernated active session falls to History"
decision from the cdefa78 hibernation refactor: in practice, losing
recent unstarred work to History was the bigger surprise than picking
up extra Hibernating rows.

Claude Code marks harness-injected entries (system-reminders, recap
notices, compaction summaries) with isMeta:true. The normalizer wasn't
filtering them, so they surfaced as the displayed "last user message"
instead of the user's actual prior turn.
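
A minimal sketch of that filter, assuming a simplified entry shape: only the `isMeta` flag comes from the commit message, while the `LogEntry` fields and the `lastUserMessage` helper are hypothetical stand-ins for the real normalizer.

```typescript
// Hypothetical, simplified log entry; the real normalizer input is richer.
interface LogEntry {
  type: "user" | "assistant";
  text: string;
  isMeta?: boolean; // set to true on harness-injected entries
}

// Walk backwards and return the newest user entry that is not harness-injected.
function lastUserMessage(entries: LogEntry[]): string | undefined {
  for (let i = entries.length - 1; i >= 0; i--) {
    const entry = entries[i];
    if (entry === undefined) continue;
    if (entry.type === "user" && entry.isMeta !== true) return entry.text;
  }
  return undefined;
}

const transcript: LogEntry[] = [
  { type: "user", text: "Rename the config loader" },
  { type: "assistant", text: "Done." },
  { type: "user", text: "<system-reminder>...</system-reminder>", isMeta: true },
];
console.log(lastUserMessage(transcript)); // Rename the config loader
```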
@gbasin gbasin merged commit 7c20865 into master May 5, 2026
5 checks passed
@gbasin gbasin deleted the fix/ensure-base-tmux-session branch May 5, 2026 12:50