Zellij to Tmux and some other fixes. #404
Merged
Adds backend/internal/adapters/runtime/tmux implementing ports.Runtime via the tmux CLI. Drop-in replacement for the zellij adapter on Darwin/Linux. Key design points:
- Handle is a plain session id string (no pane-id split needed for tmux).
- Exact-match session targeting via = prefix for kill-session and has-session.
- Keep-alive shell appended to launch command so sessions survive agent exit.
- send-keys -l chunked for literal text delivery (no key-name interpretation).
- IsAlive distinguishes definitive-dead (missing/no-server output) from probe errors so the reaper never kills a session on a transient tmux failure.
- 34 tests pass: 32 unit tests via fakeRunner seam, 2 integration tests on real tmux 3.6b (TestRuntimeIntegration, TestRuntimeIntegrationExactSessionParsing).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Remove em dash from tmux_test.go:462 (project hard rule); replace with semicolon
- Derive integration test session IDs from t.Name() so concurrent runs do not collide on the same tmux session
- Remove dead scaffolding variables (r/fr, r2/fr2) in TestCreateDestroysAndReturnsErrorWhenNotAlive
- Quote \${SHELL:-/bin/sh} in buildLaunchCommand and update all asserting tests
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ij on Windows
- New package runtimeselect: Runtime union interface (ports.Runtime + SendMessage/GetOutput/AttachCommand) with compile-time assertions for both adapters. New(log) returns tmux on non-Windows, zellij on Windows (replicating the old daemon socket-dir setup).
- daemon.go: replace zellij-specific socket-dir block with runtimeselect.New(log); update comment to be runtime-neutral.
- lifecycle_wiring.go: startSession param changed from *zellij.Runtime to runtimeselect.Runtime.
- cli/doctor.go: runtime-aware checkTerminalRuntime (tmux on Darwin/Linux, zellij on Windows); added checkTmux.
- cli/spawn.go: attach hint prints tmux attach -t <name> on non-Windows, keeps zellij attach hint on Windows.
- wiring_test.go: startSession test uses runtimeselect.New(nil); zellij direct tests retained for zellij-specific coverage.
- doctor_test.go: replaced three zellij tool tests with tmux equivalents.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
tmux creates sessions detached via new-session -d, so the Start method (carried over from the zellij runner shape, where it backs the Windows fire-and-forget spawn) is never called. Remove it from the interface and its implementations to shrink the seam. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…agnostic) Ports the ConPTY named-pipe binary framing protocol and rolling output buffer from pty-host.ts to Go. Implements EncodeMessage, MessageParser (handles arbitrary chunk boundaries, payload copy guarantee), and Ring (MaxOutputLines=1000, ANSI-safe, concurrent Append+Snapshot). All 15 unit tests pass on Darwin; GOOS=windows build is also clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Strengthen TestParserPayloadIsCopy to catch internal-buffer aliasing: feed frame1, capture its payload, feed frame2 of the same length so the parser's buffer overwrites the frame1 region, then assert frame1's bytes are unchanged. The prior test only mutated the input slice post-Feed and did not exercise the real aliasing risk. Add TestRingConcurrent: 10 writer goroutines (Append) and 10 reader goroutines (Snapshot + Tail) running concurrently with a WaitGroup. The test is meaningful only under the race detector and catches any missing mu coverage on Ring's exported methods. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds package ptyregistry under backend/internal/adapters/runtime/conpty/ptyregistry. Ports windows-pty-registry.ts: defensive read, atomic temp+rename write, delete-on-empty, register-replaces-same-ID, and auto-pruning List. PID liveness isolated behind build tags (syscall.Kill on Unix, OpenProcess on Windows). 10 tests all green on Darwin. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Ports pty-host.ts behavior to Go: ptyConn interface seam, Serve engine with ring replay, fan-out broadcast, MSG_* handlers, PTY-exit keep-alive, and graceful shutdown (ConPTY dispose first, 50ms grace, then clients and listener). Real conptyConn is Windows-only via build tag; non-Windows stub keeps the package importable on Darwin/Linux. Tests use a fake ptyConn with real loopback sockets and the B1 MessageParser, passing with -race. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Review of Task B3 found one Important bug and two minors.
Important: in handleConn the ring Snapshot and the client registration ran under two separate h.mu acquisitions. A PTY chunk arriving in that gap was in neither the snapshot nor that client's broadcast, so it was silently dropped (a hole in the client's stream). Now take the snapshot, write it to the conn, and add the conn to the clients set all under a single h.mu hold; broadcast also takes h.mu so it cannot interleave. Added TestScrollbackLiveOrdering_NoDrop, which emits a contiguous numbered stream while a client connects and asserts the client's stream has no internal gap. It reliably fails against the old two-step code and passes under -race -count=20.
Minor (faithfulness): conptyConn.Close() now also best-effort Process.Kill() (nil-guarded) so a child that ignores ConPTY EOF still exits and Done() fires, mirroring pty.kill() in pty-host.ts.
Minor (simplify): use os.Environ() instead of exec.Command(shellCmd).Environ() for the child env.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n management (B4) Implements the conpty Runtime adapter: injectable spawn seam, loopback TCP client helpers (SendMessage/GetOutput/IsAlive/Kill), and Runtime methods (Create/Destroy/IsAlive/SendMessage/GetOutput). Session resolution uses an in-memory map with B2 registry fallback for daemon-restart recovery. Windows-only detached spawn in spawn_windows.go; stub errors on other OSes. All adapter methods are unit-tested on Darwin against an in-process B3 Serve and fakePTY. 48 tests pass, all three GOOS builds succeed, vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
clientIsAlive collapsed every probe failure (dial timeout, read-deadline expiry, write error, connection-refused) to false, which the reaper turns into ProbeDead and the LCM can promote to a permanent reap. A single transient 2s loopback timeout would spuriously kill a live idle session. Now clientIsAlive returns (alive bool, transientErr error): a refused dial is definitively gone (false, nil); a timeout or any connected-then-failed I/O error is transient (false, err) so the reaper records ProbeFailed and retries. Wire IsAlive to propagate it. Add regression test covering both the refused-is-gone and timeout-is-transient paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Evolve the terminal layer from argv-based attach (PTYSource.AttachCommand + injected spawnFunc) to stream-based attach (Source embedding ports.Attacher). tmux/zellij keep spawning their attach CLI on a local PTY via the new shared ptyexec.Spawn; conpty attaches by dialing its loopback pty-host directly with a loopbackStream over the B1 framing protocol. Reattach/backoff/size/SIGWINCH/detach semantics are unchanged.
- ports: add Stream + Attacher.
- ptyexec: new shared package holding the creack/pty (unix) and ConPTY (windows) spawn, moved verbatim from terminal with its tests.
- terminal: PTYSource -> Source, drop spawnFunc/WithSpawn, run loop calls src.Attach and uses ports.Stream.
- tmux/zellij: add Attach (argv via ptyexec.Spawn); conpty: add Attach (loopbackStream); ports.Attacher assertions on all three.
- runtimeselect: union embeds ports.Attacher in place of AttachCommand.
- tests migrated; new conpty attach_test against in-process Serve+fakePTY.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…, delete zellij
- runtimeselect.New: Windows branch now returns conpty.New(conpty.Options{}) instead
of zellij; compile-time assertion updated to conpty.Runtime.
- cli/ptyhost.go: new hidden "ao pty-host" subcommand (DisableFlagParsing so agent
shell args with leading dashes survive); calls conpty.RunHost and exits with its code.
- cli/root.go: wires newPtyHostCommand alongside newLaunchCommand.
- cli/doctor.go: Windows terminal-runtime check replaced with a static ConPTY
built-in pass; zellij import and checkZellij function removed.
- cli/spawn.go: Windows attach hint updated to dashboard message (ConPTY has no
CLI attach); zellij import removed.
- daemon/lifecycle_wiring.go: stale zellij comment updated to tmux/conpty.
- daemon/wiring_test.go: zellij import and TestDaemonZellijSocketDir test removed;
TestWiring_StartLifecycleThreadsMessengerIntoLCM now uses tmux.New.
- terminal/attachment_integration_test.go: re-pointed at real tmux
(TestAttachmentStreamsRealTmuxPane + TestAttachmentReattachAdoptsNewSize);
sessions cleaned up in t.Cleanup.
- internal/adapters/runtime/zellij: deleted entirely.
All three GOOS builds pass; go test -race ./... 1607 passed; go vet clean;
grep -rn "runtime/zellij" returns nothing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bridge forge.config.ts to accept the local keychain flow (APPLE_SIGNING_IDENTITY identity + AO_NOTARY_PROFILE notarytool profile) in addition to the existing CI secrets path (CSC_LINK + APPLE_ID/app-specific-password). Enables a signed + notarized macOS build from a developer Mac without exporting a .p12 or the Apple ID app-specific password. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A Finder/Dock launch starts the supervisor under launchd with no controlling tty, so TERM is unset. The daemon inherits that, and its tmux attach client (spawned with env=nil, inheriting the daemon env) dies immediately with "open terminal failed: terminal does not support clear", so the orchestrator terminal pane never opens. Seed TERM=xterm-256color (what the renderer's xterm.js emulates) as the base of buildDaemonEnv, the same place PATH is reconstructed for the same class of "Finder launch lacks a terminal's env" bug. A real TERM from the shell/process env still wins. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Captures the intended daemon lifecycle: on shutdown save every running session (worker and orchestrator) plus its gitignore-respecting uncommitted work to refs/ao/preserved/<id>, then force-remove worktrees; on boot recreate worktrees, replay the preserved work, and restore all sessions. Reuses existing SQLite state, session_worktrees.preserved_ref, manager.Restore, and the /shutdown endpoint (no new file, migration, or route). Also gitignore the built daemon binary copied into frontend/daemon/. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Working-tree regeneration of the pnpm lockfile and TanStack Router generated route tree. No hand edits; generated output only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds ForceDestroy(ctx, info) to ports.Workspace and the gitworktree adapter. It runs `git worktree remove --force`, then prune, then os.RemoveAll as a backstop. A new worktreeForceRemoveArgs builder in commands.go emits --force; the existing worktreeRemoveArgs is untouched so Destroy still refuses dirty worktrees via ErrWorkspaceDirty. TDD: test first creates a dirty worktree, confirms Destroy refuses with ErrWorkspaceDirty, then confirms ForceDestroy succeeds and the path is gone and deregistered. All 1609 backend tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…lifecycle
Implements the correctness-critical save-on-close / restore-on-open pair in the gitworktree adapter:
- StashUncommitted: captures uncommitted work (tracked edits and new non-ignored files) via a temp GIT_INDEX_FILE into a real commit stored at refs/ao/preserved/<session-id>. Never touches the real index or stash stack. Returns empty string for clean worktrees. Logs the count of .gitignore-skipped paths.
- ApplyPreserved: replays the preserve commit onto a freshly re-added worktree via "git checkout <SHA> -- .". Deletes the ref on clean success; keeps it and returns ErrPreservedConflict (wrapped) on content conflicts.
- Adds both methods to ports.Workspace interface and stubs them in integration and session_manager test doubles.
TDD: wrote two failing tests first (RED confirmed via build failure on undefined methods), then implemented to GREEN. All 39 adapter tests pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
git checkout <sha> -- . is a path-checkout that always exits 0 for content divergence, making ErrPreservedConflict unreachable. Replace with git cherry-pick --no-commit which performs a true three-way merge, leaves textual conflict markers on conflict, and exits non-zero so the sentinel is correctly returned. Conflict detection now uses exit code only (locale-independent). Add TestWorkspaceIntegrationApplyPreservedConflict to assert: error is ErrPreservedConflict, preserve ref is kept, conflict markers appear in the file. All 40 tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…down lifecycle
Implements Task 3: capture-then-destroy on shutdown and restore-all on startup.
- Adds ErrPreservedConflict to ports as a named sentinel; gitworktree aliases it (following the same pattern as ErrBranchCheckedOutElsewhere).
- Extends the Store interface with UpsertSessionWorktree and ListSessionWorktrees so the session manager can write the shutdown-saved marker and read it back.
- SaveAndTeardownAll: for every live session with a workspace path, stash uncommitted work, write the session_worktrees row (DB commit before worktree removal, crash-safety invariant), mark terminated, destroy runtime, force-remove the worktree. Best-effort per session; no kind filter.
- RestoreAll: for every terminated session that has a session_worktrees row (the marker written by SaveAndTeardownAll), re-create the worktree, apply any preserved ref (conflict logs and continues), then relaunch via the existing single-session Restore. Sessions killed by the user before shutdown (no row) are skipped. Best-effort per session; no kind filter.
- TDD: 9 new tests (RED confirmed via build failure, GREEN confirmed 63 pass). Full suite: 1621 tests across 77 packages.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fix 1: daemon.go comment near SaveAndTeardownAll now correctly states that POST /shutdown closes the shutdownRequested channel (not cancel ctx). Also tighten the RestoreAll comment to remove the inaccurate claim.
Fix 2: remove "Task 2's" phrasing from ForceDestroy ponytail comment in workspace.go; condition still references StashUncommitted by name.
Fix 3: add note in main.ts that the 8s fetch timeout is shorter than the daemon's 30s save bound, so a SIGTERM after fetch abort does not cut the in-flight save short.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e-persistence feat: save-on-close / restore-on-open session lifecycle
fix(terminal): enable tmux mouse scroll and fix link clicking
These SDD workflow artifacts (task briefs, agent reports, progress ledger, review packages) were committed by accident in prior work, against the .superpowers/sdd/.gitignore intent. Remove them from the repo; they remain local-only scratch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fix the opaque 500 when restoring an un-resumable session (typed 409 SESSION_NOT_RESUMABLE), and add a post-failure popup that offers to recreate a fresh orchestrator on the same branch (cleaning the worktree, preserving committed history). Orchestrators only; recreate fires only after a restore attempt confirms the session cannot be resumed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ors clean=true Planning discovery: the recreate capability already ships via POST /orchestrators (clean=true), which kills the dead orchestrator and re-spawns on the canonical branch (addWorktree reattaches an existing branch). So the feature collapses to a typed-error fix plus a frontend popup. Spec updated to match. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-resumable restore Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ot be restored Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-orchestrator fix: graceful restore + recreate-orchestrator popup (no more 500)
…ctor tests
Formatting: ran gofmt and goimports (with local-prefixes) on the 8 listed
files plus ptyexec/spawn_unix.go which the linter also flagged.
Lint (25 issues fixed):
- gosec G115: EncodeMessage now returns ([]byte, error) with an explicit
bounds check before the int->uint32 conversion; all callers updated.
- govet nilness: removed dead `if lastErr == nil` branch in clientIsAlive;
lastErr is provably non-nil at that point (real bug).
- nilerr: extracted runAcceptLoop helper so Accept-error-on-close is not
flagged; listener close is normal shutdown, not a caller error.
- staticcheck SA4010: removed dead `full = append(...)` loop in host_test.
- revive var-declaration: `var prev int = -1` -> `prev := -1`.
- revive redefines-builtin-id: deleted local `min` helper; builtin covers it.
- unparam (2): dropped always-nil env return from attachCommand; dropped
unused shellPath param from buildLaunchCommand; updated callers.
- errcheck (8): deferred Close/Remove calls wrapped in func(){_ = ...}();
type assertion in host_main.go uses ok-form; fmt.Fprintf to stdout uses
_, _ = pattern; workspace.go tmpIdx.Close() uses _ =.
- gocritic nestingReduce: inverted if+continue in runtime.go resolve loop.
Windows E2E: skip TestDoctorChecksTmuxVersion,
TestDoctorChecksTmuxVersionFailsOnError, TestDoctorWarnsWhenTmuxMissing on
windows (ao doctor emits a conpty check there, not tmux).
Verified: gofmt -l . clean, golangci-lint 0 issues, go build ok,
go test -race 1624/1624 pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ttach timeouts The preserve round-trip/conflict tests commit inside a worktree of the cloned repo, which had no git identity; CI runners cannot auto-derive one, failing with "empty ident name". Set user.email/user.name on the clone in setupOriginClone so its worktrees inherit it. The tmux reattach test drives a real shell and parses stty output, which is slow under -race on CI; raise its echo-write and SIZE-output waits. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…swers Bumping timeouts was the wrong fix: a 30s wait still failed, so the cause was not slowness; the probe output deterministically never appeared. onOpen signals that the stream accepts input, not that the reattached sh -i is at a prompt, so the first echo keystroke can be dropped. Resend the probe on each poll until SIZE output lands, and on timeout dump the captured pane buffer so that any remaining failure is self-explaining. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Root cause (from the buffer dump the prior commit added): with TERM unset on CI runners, tmux refuses to attach a client and prints "open terminal failed: terminal does not support clear", so the pane never runs the size probe. The daemon defaults TERM in production; the tests bypass it. Set TERM=xterm-256color in both real-tmux tests. Reproduced locally with `env -u TERM` (fails the same way) and verified the fix passes under it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Boot-time reconcile makes live tmux + worktree state match the DB on every daemon start, so a SIGKILL/crash/force-quit that skips SaveAndTeardownAll no longer leaks an orphaned daemon, tmux sessions, or worktrees. Adopt crash-surviving tmux sessions, preserve-and-terminate dead ones, reap in-namespace orphans, and add a frontend kill+replace branch for a wedged orphan daemon. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Every leak in the incident maps to a DB row, so orphan-reap is a per-session IsAlive+Destroy over terminated rows; no runtime enumeration, no ports/conpty/runtimeselect changes. Reaping a tmux session with no DB row is deferred (YAGNI). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…leaked tmux Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When both inspectExistingDaemon and resolveDaemonFromPort return null but a process still holds the daemon port (a crashed/orphaned daemon), spawning a new Go child would collide on the port and exit 1. Detect this case, SIGTERM the holder (via the run-file PID, falling back to the probe PID), poll until the port is free (up to 8s), clear the stale run-file, then proceed to spawn fresh. The healthy-daemon reuse path is unchanged. Pure helper: src/shared/daemon-takeover.ts (planDaemonTakeover) Unit tests: src/shared/daemon-takeover.test.ts (3 tests, TDD red-green) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace planDaemonTakeover (inverted logic: ran kill block only when probe was null) with shouldReplacePortHolder(probe, holderPidAlive) which returns true when a real holder exists: non-null probe (rejected responder) OR a run-file PID that is still alive (hung holder). Update main.ts call site to compute PID liveness before gating the kill block. Update tests to cover all three distinct outcomes non-vacuously. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, Reconcile doc Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sions
The orchestrator was abandoned on every app open: a fresh orchestrator spawned each launch and the prior conversation appeared lost (it was not; the transcript stays in ~/.claude, resumable by the deterministic --session-id AO pins). Two defects combined:
1. Restore's guard rejected any session with no agentSessionId AND no prompt as ErrNotResumable. But Claude resumes via a deterministic session id regardless of those fields, so promptless orchestrators were perfectly resumable yet always rejected. Workers slipped through only because they carry a prompt. Move the resumability decision to the adapter: restoreArgv returns ErrNotResumable only when GetRestoreCommand reports it cannot resume AND there is no prompt to fresh-launch from.
2. reconcileLive marked a crash-orphaned (dead-runtime) session terminated without a restore marker, so RestoreAll skipped it and it stayed dead. It now saves-and-tears-down to the same end state a graceful shutdown produces (capture work, write the session_worktrees marker, terminate, remove the worktree), so RestoreAll relaunches it on the same boot, resuming history. Crash recovery now matches graceful restart. If work capture fails it terminates without a marker rather than risk losing un-preserved work.
Tests: promptless orchestrator restores via adapter resume; promptless session with a non-resuming adapter still returns ErrNotResumable; reconcileLive writes the marker + tears down the worktree. Full backend suite green (1632), gofmt/vet clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
002077a to
0c46172
This was referenced Jun 24, 2026
No description provided.