Skip to content

Fix tmux shell compatibility and Claude runner completion#2

Open
pbean wants to merge 7 commits intobmad-code-org:mainfrom
pbean:fix/tmux-shell-compatibility-automator
Open

Fix tmux shell compatibility and Claude runner completion#2
pbean wants to merge 7 commits intobmad-code-org:mainfrom
pbean:fix/tmux-shell-compatibility-automator

Conversation

@pbean
Copy link
Copy Markdown

@pbean pbean commented Apr 15, 2026

Summary

  • port the tmux shell-compatibility runtime rework onto the current automator pure-skill codebase
  • move commands/tmux.py and adapters/tmux.py onto the shared core/tmux_runtime.py implementation to remove shell-dependent duplicated lifecycle logic
  • fix Claude runner sessions that return to the interactive prompt after completing a command so monitor-session can advance instead of hanging
  • update the tmux changelog and packaged tmux guidance to match the new runtime behavior

Notes

  • I intentionally did not port the older _bmad / multi-location path-resolution follow-up from bmad-story-automator; automator already uses the newer pure .claude/skills layout and that older compatibility patch does not apply cleanly here.

Verification

  • PYTHONPATH=source/src python3 -m unittest source.tests.test_tmux_runtime
  • python3 -m compileall source/src

Summary by CodeRabbit

  • New Features

    • Runner-based tmux runtime with per-session persisted state, per-session command/runner artifacts, output export, and a runtime selector (legacy|runner|auto); sessions retain dead panes for postmortem inspection.
  • Bug Fixes

    • Fixed runner regression; improved session completion detection for Claude-like sessions and relaxed prompt detection.
  • Documentation

    • Updated tmux runtime and lifecycle guidance; added changelog entries.
  • Tests

    • Added integration and unit tests covering runtime behavior, lifecycle transitions, output/export, and artifact cleanup/TTL.

@pbean pbean requested a review from bma-d as a code owner April 15, 2026 02:33
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 15, 2026

Warning

Rate limit exceeded

@bma-d has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 29 minutes and 44 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 29 minutes and 44 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f3eb17b1-ddde-4925-9dfa-095337c81e07

📥 Commits

Reviewing files that changed from the base of the PR and between 99b2c95 and 948c310.

📒 Files selected for processing (2)
  • source/src/story_automator/core/tmux_runtime.py
  • source/tests/test_tmux_runtime.py

Walkthrough

Consolidates tmux session lifecycle and state into a new shared runtime (core/tmux_runtime.py), introduces a runner-based spawn path (configurable via SA_TMUX_RUNTIME), updates adapters/commands to delegate lifecycle/monitoring to the runtime, and adds tests and docs for the new behavior.

Changes

Cohort / File(s) Summary
Changelog & Documentation
docs/changelog/260401.md, payload/.claude/skills/bmad-story-automator/data/tmux-commands.md
Added changelog and detailed docs describing the new runner-based tmux execution, session state semantics, SA_TMUX_RUNTIME selector, artifact layout/permissions, and updated command/runner guidance.
Core Runtime Module
source/src/story_automator/core/tmux_runtime.py
New module implementing session naming/paths, legacy vs runner spawn flows, atomic JSON session state lifecycle, pane/pid snapshots, heartbeat/session_status/pane_status APIs, output export, cleanup/garbage-collection, and tmux wrappers.
Adapter Refactor
source/src/story_automator/adapters/tmux.py
Removed many inline helpers (naming, state IO, status/heartbeat, prompt/pid helpers); now delegates those responsibilities to core.tmux_runtime while retaining low-level tmux command wrappers.
Commands Refactor
source/src/story_automator/commands/tmux.py
Replaced local spawn/heartbeat/status logic with runtime calls (spawn_session, heartbeat_check, session_status); removed inline tmux spawn/send-key flows and temp-script execution.
Tests
source/tests/test_tmux_runtime.py
New integration + unit tests covering runner spawn flows, persisted JSON schema and permissions, pane lifecycle and prompt/completion heuristics (including Claude), output handling, failure mappings, and artifact cleanup.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI / Commands
    participant Runtime as core.tmux_runtime
    participant Tmux as tmux server
    participant FS as filesystem (/tmp state/runner/command/output)

    CLI->>Runtime: spawn_session(session, command, agent, mode)
    Runtime->>Tmux: create detached session + placeholder pane
    Runtime->>FS: write initial session state & runner/command scripts
    Runtime->>Tmux: respawn pane with runner script (remain-on-exit set)
    Tmux->>Runtime: pane pid / exit events (capture-pane used for output)
    Runtime->>FS: atomically update state (launching → running → finished)
    CLI->>Runtime: session_status/heartbeat_check(session)
    Runtime->>FS: load persisted state
    Runtime->>Tmux: capture-pane / check pane/pid liveness
    Runtime-->>CLI: structured status (active/idle/completed/crashed/...)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

Poem

🐰 In a burrow of scripts I neatly compile,
Sessions now runner‑born, and state sits in file.
Prompts tip their whiskers, outputs hop to light,
Dead panes kept for peeking, runner scripts hold tight.
I tidy stale crumbs gently through the night.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix tmux shell compatibility and Claude runner completion' directly addresses two key issues resolved in the changeset: shell compatibility rework and Claude runner completion handling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Apr 15, 2026

🤖 Augment PR Summary

Summary: This PR refactors tmux session lifecycle management to be shell-independent and improves completion detection for Claude runner sessions.

Changes:

  • Introduces core/tmux_runtime.py as the shared tmux runtime (spawn, state tracking, status/heartbeat, artifact cleanup).
  • Migrates commands/tmux.py and adapters/tmux.py to use the shared runtime APIs instead of duplicating lifecycle logic.
  • Adds a runner-based spawn path that uses a placeholder pane + respawn-pane to avoid relying on interactive shell behavior.
  • Persists explicit per-session state/runner/command/output artifacts under /tmp with TTL-based cleanup.
  • Updates Claude completion detection to treat an idle prompt + completion marker as “finished” even if the process stays interactive.
  • Relaxes prompt detection to handle prompts that appear above Claude’s status panel.
  • Adds unit/integration tests covering runner lifecycle, prompt detection, and artifact cleanup.
  • Updates tmux documentation/changelog to reflect the new runtime behavior and migration knob (SA_TMUX_RUNTIME).

Technical Notes: Status evaluation now prioritizes the session state file (runner mode) and uses pane capture primarily for exported transcript artifacts.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 2 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread source/src/story_automator/core/tmux_runtime.py Outdated
Comment thread source/src/story_automator/core/tmux_runtime.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
source/src/story_automator/adapters/tmux.py (1)

40-41: ⚠️ Potential issue | 🟡 Minor

Remove or replace the unused skill_prefix definition in adapters/tmux.py.

The adapter defines skill_prefix returning "bmad-", but this function is never imported or used. The codebase only uses core.tmux_runtime.skill_prefix (which returns "/bmad-bmm-"), imported in commands/tmux.py. The adapter should either import skill_prefix from core like it does for other functions, or remove the unused definition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/adapters/tmux.py` around lines 40 - 41, The local
skill_prefix function in adapters/tmux.py is unused and conflicts with the core
implementation; remove this local def skill_prefix(agent: str) or replace it by
importing the canonical skill_prefix from core.tmux_runtime (e.g., from
core.tmux_runtime import skill_prefix) and use that exported symbol instead so
adapters/tmux.py relies on core.tmux_runtime.skill_prefix rather than defining
its own.
🧹 Nitpick comments (2)
source/src/story_automator/core/tmux_runtime.py (2)

232-234: CPU threshold check may misclassify low-activity processes.

Using int(cpu) > 0 truncates floating-point CPU values, so a process using 0.5% CPU would be incorrectly classified as "idle" instead of "alive". Consider using a small threshold instead.

♻️ Proposed fix
     cpu = _process_cpu(child_pid)
     if _pid_alive(child_pid):
-        return (("alive" if int(cpu) > 0 else "idle"), cpu, str(child_pid), prompt)
+        return (("alive" if cpu > 0.1 else "idle"), cpu, str(child_pid), prompt)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 232 - 234, The
CPU check currently casts cpu to int which truncates fractional usage and can
mislabel low-activity processes; in the code that calls _process_cpu and
_pid_alive (the block using variables cpu, child_pid, prompt), parse cpu as a
float and compare against a small threshold (e.g. cpu_float > 0.1) instead of
int(cpu) > 0 to determine "alive" vs "idle"; update the return condition to use
the float threshold so processes with fractional CPU usage are classified
correctly.

1179-1186: Consider distinguishing EPERM from ESRCH in process liveness check.

The current implementation catches all OSError exceptions, but EPERM (permission denied) actually indicates the process exists (owned by another user), while ESRCH indicates it doesn't exist. In practice, this is unlikely to matter since tmux sessions are user-owned, but it's worth noting.

♻️ More precise error handling
 def _pid_alive(pid: int) -> bool:
     if pid <= 0:
         return False
     try:
         os.kill(pid, 0)
-    except OSError:
-        return False
     return True
+    except ProcessLookupError:
+        return False
+    except PermissionError:
+        return True  # Process exists but owned by another user
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 1179 - 1186,
The _pid_alive function currently treats any OSError from os.kill(pid, 0) as
"not alive"; change it to import errno and inspect the caught OSError.errno:
return False for errno.ESRCH (no such process), return True for errno.EPERM
(process exists but permission denied), and either re-raise or return False for
any other errno as appropriate; update the exception handling in _pid_alive to
perform this errno check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 137-143: The update_session_state function currently preserves the
existing updatedAt because it sets state["updatedAt"] =
str(state.get("updatedAt") or iso_now()); change this to always refresh
updatedAt when updating the state by assigning the current timestamp (e.g.,
str(iso_now()) or iso_now() consistent with other code) before calling
save_session_state; update the assignment in update_session_state so each call
to load_session_state/update_session_state reliably updates the updatedAt field.

---

Outside diff comments:
In `@source/src/story_automator/adapters/tmux.py`:
- Around line 40-41: The local skill_prefix function in adapters/tmux.py is
unused and conflicts with the core implementation; remove this local def
skill_prefix(agent: str) or replace it by importing the canonical skill_prefix
from core.tmux_runtime (e.g., from core.tmux_runtime import skill_prefix) and
use that exported symbol instead so adapters/tmux.py relies on
core.tmux_runtime.skill_prefix rather than defining its own.

---

Nitpick comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 232-234: The CPU check currently casts cpu to int which truncates
fractional usage and can mislabel low-activity processes; in the code that calls
_process_cpu and _pid_alive (the block using variables cpu, child_pid, prompt),
parse cpu as a float and compare against a small threshold (e.g. cpu_float >
0.1) instead of int(cpu) > 0 to determine "alive" vs "idle"; update the return
condition to use the float threshold so processes with fractional CPU usage are
classified correctly.
- Around line 1179-1186: The _pid_alive function currently treats any OSError
from os.kill(pid, 0) as "not alive"; change it to import errno and inspect the
caught OSError.errno: return False for errno.ESRCH (no such process), return
True for errno.EPERM (process exists but permission denied), and either re-raise
or return False for any other errno as appropriate; update the exception
handling in _pid_alive to perform this errno check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f1554b7c-6e7c-4b90-9825-e1b60b53f0ba

📥 Commits

Reviewing files that changed from the base of the PR and between 27e6a1c and ffd832e.

📒 Files selected for processing (6)
  • docs/changelog/260401.md
  • payload/.claude/skills/bmad-story-automator/data/tmux-commands.md
  • source/src/story_automator/adapters/tmux.py
  • source/src/story_automator/commands/tmux.py
  • source/src/story_automator/core/tmux_runtime.py
  • source/tests/test_tmux_runtime.py

Comment thread source/src/story_automator/core/tmux_runtime.py
@pbean
Copy link
Copy Markdown
Author

pbean commented Apr 15, 2026

@coderabbitai review
augment review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 15, 2026

@pbean Sure, I'll kick off a review of this PR now!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 3 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread source/src/story_automator/core/tmux_runtime.py
Comment thread source/src/story_automator/core/tmux_runtime.py Outdated
Comment thread source/src/story_automator/core/tmux_runtime.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
source/tests/test_tmux_runtime.py (1)

49-56: Polling deadline is tight for integration tests and may cause CI flakes.

A 5-second terminal-state window is brittle on slower runners. Consider increasing it (or making it configurable) to reduce nondeterministic failures.

Suggested adjustment
-        deadline = time.time() + 5
+        deadline = time.time() + 15
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/test_tmux_runtime.py` around lines 49 - 56, The test uses a very
short hardcoded polling deadline (deadline = time.time() + 5) causing CI flakes;
update the polling timeout in test_tmux_runtime.py to a larger value or make it
configurable: replace the fixed 5-second window with a constant or fixture
(e.g., TERMINAL_STATE_TIMEOUT or a pytest parameter/environment variable) and
use that when computing deadline and in the loop that calls
session_status(session, full=False, codex=codex, project_root=self.project_root,
mode="runner") so slow CI runners get more time to reach terminal states.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 878-880: The code currently truncates fractional CPU by using
int(cpu) in the heartbeat check inside tmux_runtime (variable cpu from
_process_cpu) which causes low-CPU but active sessions to be marked "idle";
change the check to use a numeric float comparison (e.g., float(cpu) > 0.0) or
otherwise treat cpu as a float/Decimal before comparing so fractional values
like 0.5 are considered active; ensure any None/invalid returns from
_process_cpu are handled (fall back to 0.0) and update the status assignment
(status = "alive" if float(cpu) > 0.0 else "idle") accordingly.
- Around line 1018-1021: The runner's exit classification treats SIGQUIT (exit
code 131) as a generic failure but _result_from_exit_code expects 131 to be
"interrupted"; update the runner script's exit-code handling (the conditional
that currently checks for 130 and 143) to also include 131 so it sets
result="interrupted" and failure_reason="signal_terminated", or alternatively
adjust _result_from_exit_code to match the runner; reference the runner
conditional that checks "$exit_code" -eq 130 || "$exit_code" -eq 143 and the
Python function _result_from_exit_code to keep both sides consistent.

---

Nitpick comments:
In `@source/tests/test_tmux_runtime.py`:
- Around line 49-56: The test uses a very short hardcoded polling deadline
(deadline = time.time() + 5) causing CI flakes; update the polling timeout in
test_tmux_runtime.py to a larger value or make it configurable: replace the
fixed 5-second window with a constant or fixture (e.g., TERMINAL_STATE_TIMEOUT
or a pytest parameter/environment variable) and use that when computing deadline
and in the loop that calls session_status(session, full=False, codex=codex,
project_root=self.project_root, mode="runner") so slow CI runners get more time
to reach terminal states.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 22641ef7-d969-42dd-a6f5-fc13c675bce7

📥 Commits

Reviewing files that changed from the base of the PR and between ffd832e and 08b2c22.

📒 Files selected for processing (3)
  • source/src/story_automator/adapters/tmux.py
  • source/src/story_automator/core/tmux_runtime.py
  • source/tests/test_tmux_runtime.py

Comment thread source/src/story_automator/core/tmux_runtime.py
Comment thread source/src/story_automator/core/tmux_runtime.py Outdated
@bma-d
Copy link
Copy Markdown
Contributor

bma-d commented Apr 16, 2026

Fixed the shell-compat regression in the tmux runner.

What changed:

  • runner mode still uses a bash wrapper for lifecycle/state management
  • the payload command now runs through the resolved command shell instead of hardcoded bash
  • shell resolution prefers tmux default-shell, then $SHELL, then falls back to bash

Why:

  • the previous runner path changed tmux-wrapper spawn --command ... semantics from "run in the user/tmux shell" to "always run in bash"
  • that broke existing shell-specific commands and setups; on this machine with tmux default-shell=/bin/zsh, a command using print failed with exit 127

Verification:

  • python3 -m compileall source/src
  • PYTHONPATH=source/src python3 -m unittest source.tests.test_tmux_runtime
  • npm run test:smoke
  • direct zsh repro now completes successfully

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
source/src/story_automator/core/tmux_runtime.py (2)

887-889: ⚠️ Potential issue | 🟠 Major

Avoid truncating fractional CPU in legacy heartbeat.

At Line 888, int(cpu) drops fractional usage (e.g., 0.5 -> 0), which can incorrectly report active sessions as idle.

🔧 Suggested fix
-    status = "alive" if int(cpu) > 0 else "idle"
+    status = "alive" if cpu > 0.1 else "idle"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 887 - 889, The
legacy heartbeat currently casts cpu to int with int(cpu) in the status
calculation (involving _process_cpu, cpu, and agent_pid), which truncates
fractional CPU usage and can mark active agents idle; change the check to use a
float comparison (e.g., float(cpu) > 0.0 or cpu > 0.0 if _process_cpu returns a
float) or apply a small threshold (e.g., cpu > 0.01) instead of int(cpu) so
fractional CPU usage correctly yields status "alive".

1033-1035: ⚠️ Potential issue | 🟠 Major

Align runner exit classification with SIGNAL_EXIT_CODES.

Line 1033 handles only 130 and 143 as interrupted, but _result_from_exit_code also treats 131 as interrupted. This mismatch can produce inconsistent terminal states.

🔧 Suggested fix
-elif [[ "$exit_code" -eq 130 || "$exit_code" -eq 143 ]]; then
+elif [[ "$exit_code" -eq 130 || "$exit_code" -eq 131 || "$exit_code" -eq 143 ]]; then
   result="interrupted"
   failure_reason="signal_terminated"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 1033 - 1035,
The interrupt check in tmux runtime is missing exit code 131 and should align
with the SIGNAL_EXIT_CODES set used by _result_from_exit_code; update the
conditional that currently checks for 130 and 143 (in the tmux command exit
handling) to include 131 or, better, replace the hard-coded list with a
membership test against SIGNAL_EXIT_CODES so the runner's classification
(interrupted / signal_terminated) matches _result_from_exit_code.
🧹 Nitpick comments (2)
source/src/story_automator/core/tmux_runtime.py (1)

432-437: Remove redundant updatedAt argument when calling update_session_state.

Line 436 is unnecessary because update_session_state() always refreshes updatedAt internally (Line 149). Keeping both is harmless but noisy.

♻️ Small cleanup
     update_session_state(
         paths.state,
         paneId=pane_id,
         panePid=respawned_pane_pid,
-        updatedAt=iso_now(),
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 432 - 437, The
call to update_session_state(...) is passing updatedAt=iso_now() redundantly;
remove the updatedAt keyword argument from the update_session_state invocation
(leave paneId and panePid as-is) because update_session_state itself refreshes
updatedAt internally; locate the call that currently passes iso_now() and delete
that parameter so only the necessary paneId/panePid/state args are passed.
source/tests/test_tmux_runtime.py (1)

50-58: Make terminal wait timeout less brittle for CI.

Line 51’s fixed 5-second deadline can intermittently fail under slower runners. This helper is used by integration tests, so this is a reliability hotspot.

🔧 Suggested change
     def _wait_for_terminal(self, session: str, *, codex: bool) -> dict[str, str | int]:
-        deadline = time.time() + 5
+        timeout_seconds = float(os.environ.get("TMUX_TEST_TIMEOUT_SECONDS", "15"))
+        deadline = time.time() + timeout_seconds
         last = session_status(session, full=False, codex=codex, project_root=self.project_root, mode="runner")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/test_tmux_runtime.py` around lines 50 - 58, The fixed 5s timeout
in _wait_for_terminal is too short for slow CI; update the helper to use a
configurable, longer timeout (e.g., read TEST_WAIT_TIMEOUT from the environment
with a sensible default like 30 seconds) and slightly increase the poll interval
to reduce flakiness; modify the _wait_for_terminal function to derive deadline
from that env/config value and keep using session_status(...) as before so tests
wait up to the configured timeout before failing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 90-97: The session_paths function currently interpolates the raw
session string into file paths (SessionPaths.state, .command, .runner, .output),
which allows path-traversal; sanitize or validate session before use — e.g., in
session_paths, ensure session contains only a safe whitelist (alphanumeric,
dash, underscore) and/or replace any disallowed characters with an underscore,
enforce a max length, or raise an error for invalid values; then use the
sanitized value when calling project_hash and building the Path objects so
cleanup flows can't escape the intended /tmp namespace.

---

Duplicate comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 887-889: The legacy heartbeat currently casts cpu to int with
int(cpu) in the status calculation (involving _process_cpu, cpu, and agent_pid),
which truncates fractional CPU usage and can mark active agents idle; change the
check to use a float comparison (e.g., float(cpu) > 0.0 or cpu > 0.0 if
_process_cpu returns a float) or apply a small threshold (e.g., cpu > 0.01)
instead of int(cpu) so fractional CPU usage correctly yields status "alive".
- Around line 1033-1035: The interrupt check in tmux runtime is missing exit
code 131 and should align with the SIGNAL_EXIT_CODES set used by
_result_from_exit_code; update the conditional that currently checks for 130 and
143 (in the tmux command exit handling) to include 131 or, better, replace the
hard-coded list with a membership test against SIGNAL_EXIT_CODES so the runner's
classification (interrupted / signal_terminated) matches _result_from_exit_code.

---

Nitpick comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 432-437: The call to update_session_state(...) is passing
updatedAt=iso_now() redundantly; remove the updatedAt keyword argument from the
update_session_state invocation (leave paneId and panePid as-is) because
update_session_state itself refreshes updatedAt internally; locate the call that
currently passes iso_now() and delete that parameter so only the necessary
paneId/panePid/state args are passed.

In `@source/tests/test_tmux_runtime.py`:
- Around line 50-58: The fixed 5s timeout in _wait_for_terminal is too short for
slow CI; update the helper to use a configurable, longer timeout (e.g., read
TEST_WAIT_TIMEOUT from the environment with a sensible default like 30 seconds)
and slightly increase the poll interval to reduce flakiness; modify the
_wait_for_terminal function to derive deadline from that env/config value and
keep using session_status(...) as before so tests wait up to the configured
timeout before failing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 48a5ebea-6f51-49eb-a760-b74967f0b051

📥 Commits

Reviewing files that changed from the base of the PR and between 08b2c22 and 82a83ec.

📒 Files selected for processing (2)
  • source/src/story_automator/core/tmux_runtime.py
  • source/tests/test_tmux_runtime.py

Comment thread source/src/story_automator/core/tmux_runtime.py
bma-d added 2 commits April 16, 2026 03:59
…patibility-automator

# Conflicts:
#	source/src/story_automator/commands/tmux.py
@bma-d
Copy link
Copy Markdown
Contributor

bma-d commented Apr 17, 2026

Addressed the remaining tmux runtime review items in 99b2c95.

What changed:

  • protect active runner sessions from TTL cleanup deleting command/runner/output artifacts
  • validate session names before building /tmp artifact paths
  • align legacy heartbeat CPU handling with fractional CPU usage
  • align runner signal handling for exit code 131
  • clear failureReason on successful interactive Claude completion
  • make the runner terminal wait in tests configurable (TMUX_TEST_TIMEOUT_SECONDS)

Verification:

  • python3 -m compileall source/src
  • PYTHONPATH=source/src python3 -m unittest source.tests.test_tmux_runtime source.tests.test_runtime_policy source.tests.test_success_verifiers source.tests.test_orchestrator_parse source.tests.test_state_policy_metadata
  • npm run test:smoke

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
source/src/story_automator/core/tmux_runtime.py (3)

169-199: Stale-artifact cleanup correctness is fine; minor hardening opportunity.

_session_name_from_state_path uses a hash-agnostic regex (\.sa-[^-]+-session-...) even though the glob is already scoped to root_hash. Low practical risk (glob filters first), but tying the regex to the passed-in hash would make the helper robust to being called from other contexts:

♻️ Optional tightening
-def _session_name_from_state_path(path: Path) -> str:
-    match = re.match(r"\.sa-[^-]+-session-(.+)-state\.json$", path.name)
+def _session_name_from_state_path(path: Path, root_hash: str | None = None) -> str:
+    hash_part = re.escape(root_hash) if root_hash else r"[^-]+"
+    match = re.match(rf"\.sa-{hash_part}-session-(.+)-state\.json$", path.name)

And pass root_hash at the call site (Line 171).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 169 - 199, The
helper _session_name_from_state_path currently uses a hash-agnostic regex which
can mismatch when used outside the current glob; update
_session_name_from_state_path to accept root_hash (or root_hash_str) and
incorporate that hash into its regex (e.g., include the exact root_hash token in
the pattern) so it only matches state files for that root, and update the caller
in tmux_runtime.py (the loop that calls
_session_name_from_state_path(state_path)) to pass the root_hash argument;
ensure any other callers of _session_name_from_state_path are updated similarly
or kept backward-compatible.

449-481: Legacy spawn path still hardcodes bash — intentional, but worth a brief note.

_spawn_legacy uses #!/bin/bash for long-command scripts and bash {paths.command} via send-keys, independent of resolve_command_shell(). Given the PR scope (legacy = pre-fix behavior) this looks deliberate, but a one-line comment here explaining that the shell-resolution fix applies only to runner mode would prevent future "why wasn't legacy also patched?" churn.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 449 - 481, Add
a one-line comment inside _spawn_legacy explaining that long commands are
intentionally executed with bash (#!/bin/bash and bash {paths.command}) to
preserve legacy behavior and that the shell-resolution change using
resolve_command_shell() only applies to the newer runner mode; reference the
_spawn_legacy function and the use of paths.command/send-keys so reviewers
understand this is deliberate and not a bug.

430-446: Fixed time.sleep(0.1) after respawn-pane is racy.

respawn-pane is effectively synchronous for pane creation, but the new pane_pid (the just-forked bash runner) may or may not be the value tmux reports within 100 ms on a loaded system, and you immediately fail hard (tmux_kill_session) if the lookup returns ≤ 0. Consider polling #{pane_pid} for a short bounded window instead of a single fixed sleep, so a slow fork doesn't tear the session down.

♻️ Suggested poll loop
-    time.sleep(0.1)
-    respawned_pane_pid = _safe_int(tmux_display(session, "#{pane_pid}"))
+    respawned_pane_pid = 0
+    deadline = time.time() + 1.0
+    while time.time() < deadline:
+        respawned_pane_pid = _safe_int(tmux_display(session, "#{pane_pid}"))
+        if respawned_pane_pid > 0:
+            break
+        time.sleep(0.05)
     if respawned_pane_pid <= 0:
         tmux_kill_session(session, str(root))
         return ("failed to resolve respawned tmux pane pid\n", 1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/core/tmux_runtime.py` around lines 430 - 446,
Replace the racy fixed sleep after run_cmd("tmux", "respawn-pane", ...) with a
short bounded poll that repeatedly calls tmux_display(session, "#{pane_pid}")
and parses with _safe_int until a positive pid is returned or a timeout elapses
(e.g. total ~1–3s with 50–200ms intervals); use the existing respawned_pane_pid
logic and only call tmux_kill_session(session, str(root)) and return the failure
tuple if the poll expires without a positive pid, otherwise proceed to
update_session_state(paneId=..., panePid=...) and return success.
source/tests/test_tmux_runtime.py (1)

117-123: Consider adding a negative shell-dispatch assertion too.

The test correctly verifies COMMAND_SHELL=/bin/zsh and the payload line, but it doesn't guard against future regressions that re-introduce a hardcoded bash "$COMMAND_FILE". A one-line assertNotIn('bash "$COMMAND_FILE"', content) (and similar for the shebang of COMMAND_FILE) would cheaply pin the shell-compat fix that motivated this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/test_tmux_runtime.py` around lines 117 - 123, Update the
test_runner_file_content_uses_interactive_shell_for_zsh_payloads test to also
assert negative cases so we don't regress to a hardcoded bash dispatch: after
calling _runner_file_content(paths, "/bin/bash", "/bin/zsh", temp_dir) and the
existing positive assertions for COMMAND_SHELL and the payload line, add
assertNotIn('bash "$COMMAND_FILE"', content) and assertNotIn('#!/bin/bash',
content) (or a corresponding shebang string for COMMAND_FILE) to ensure the
runner uses the COMMAND_SHELL variable rather than a hardcoded bash invocation.
source/src/story_automator/commands/tmux.py (1)

50-60: kill / kill-all don't propagate project_root to runtime cleanup.

tmux_kill_session(args[1]) (and the loop on Line 58) rely on tmux_runtime.tmux_kill_session's project_root=None default, which then falls back to get_project_root() via project_hash(None). That works in practice, but every other command path in this file explicitly passes get_project_root(). Passing it here too would make artifact cleanup deterministic if the CLI is ever invoked with a PROJECT_ROOT override that differs from cwd.

♻️ Suggested consistency tweak
     if action == "kill":
         if len(args) < 2:
             return _usage(1)
-        tmux_kill_session(args[1])
+        tmux_kill_session(args[1], get_project_root())
         return 0
     if action == "kill-all":
         sessions, _ = tmux_list_sessions("--project-only" in args[1:])
+        root = get_project_root()
         for session in sessions:
-            tmux_kill_session(session)
+            tmux_kill_session(session, root)
         print(f"Killed {len(sessions)} sessions")
         return 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/src/story_automator/commands/tmux.py` around lines 50 - 60, The kill
and kill-all branches call tmux_kill_session and tmux_list_sessions without
forwarding the explicit project root; update the calls to pass
project_root=get_project_root() so cleanup uses the same PROJECT_ROOT override
as other commands: change tmux_kill_session(args[1]) to
tmux_kill_session(args[1], project_root=get_project_root()), change
tmux_list_sessions("--project-only" in args[1:]) to include
project_root=get_project_root(), and change the loop's
tmux_kill_session(session) to tmux_kill_session(session,
project_root=get_project_root()).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 706-714: The regex in _claude_completion_marker_present is too
permissive because the second alternative `\bfor\s+\d+m\s+\d+s\b` can
false-positive on generic output like "for 3m 10s"; narrow the matcher by
removing that generic alternative and only keep the explicit completion phrase
alternative `(Baked|Done|Finished)\s+for\s+\d+m(?:\s+\d+s)?`, or if you must
preserve the generic form, anchor it to a Claude footer or glyph (e.g., require
line-start or a specific prefix like `✻`/`⏺`) so only the actual Claude
completion footer triggers a true result.
- Around line 624-639: The code currently treats pane_snapshot.dead_status ==
None as success by calling _result_from_exit_code(None) and coercing exitCode to
0; change this so when pane_snapshot.dead_status is None you map result to a
non-success value (e.g., "unknown" or "crashed") and persist exitCode as None
(or another sentinel) instead of 0; update the block that calls
_result_from_exit_code (refer to pane_snapshot, _result_from_exit_code,
save_session_state, and dead_status) to branch on dead_status is None and set
{"result": "unknown"/"crashed", "exitCode": None, "failureReason": "<no exit
status>"} before calling save_session_state and returning
load_session_state(paths.state).

---

Nitpick comments:
In `@source/src/story_automator/commands/tmux.py`:
- Around line 50-60: The kill and kill-all branches call tmux_kill_session and
tmux_list_sessions without forwarding the explicit project root; update the
calls to pass project_root=get_project_root() so cleanup uses the same
PROJECT_ROOT override as other commands: change tmux_kill_session(args[1]) to
tmux_kill_session(args[1], project_root=get_project_root()), change
tmux_list_sessions("--project-only" in args[1:]) to include
project_root=get_project_root(), and change the loop's
tmux_kill_session(session) to tmux_kill_session(session,
project_root=get_project_root()).

In `@source/src/story_automator/core/tmux_runtime.py`:
- Around line 169-199: The helper _session_name_from_state_path currently uses a
hash-agnostic regex which can mismatch when used outside the current glob;
update _session_name_from_state_path to accept root_hash (or root_hash_str) and
incorporate that hash into its regex (e.g., include the exact root_hash token in
the pattern) so it only matches state files for that root, and update the caller
in tmux_runtime.py (the loop that calls
_session_name_from_state_path(state_path)) to pass the root_hash argument;
ensure any other callers of _session_name_from_state_path are updated similarly
or kept backward-compatible.
- Around line 449-481: Add a one-line comment inside _spawn_legacy explaining
that long commands are intentionally executed with bash (#!/bin/bash and bash
{paths.command}) to preserve legacy behavior and that the shell-resolution
change using resolve_command_shell() only applies to the newer runner mode;
reference the _spawn_legacy function and the use of paths.command/send-keys so
reviewers understand this is deliberate and not a bug.
- Around line 430-446: Replace the racy fixed sleep after run_cmd("tmux",
"respawn-pane", ...) with a short bounded poll that repeatedly calls
tmux_display(session, "#{pane_pid}") and parses with _safe_int until a positive
pid is returned or a timeout elapses (e.g. total ~1–3s with 50–200ms intervals);
use the existing respawned_pane_pid logic and only call
tmux_kill_session(session, str(root)) and return the failure tuple if the poll
expires without a positive pid, otherwise proceed to
update_session_state(paneId=..., panePid=...) and return success.

In `@source/tests/test_tmux_runtime.py`:
- Around line 117-123: Update the
test_runner_file_content_uses_interactive_shell_for_zsh_payloads test to also
assert negative cases so we don't regress to a hardcoded bash dispatch: after
calling _runner_file_content(paths, "/bin/bash", "/bin/zsh", temp_dir) and the
existing positive assertions for COMMAND_SHELL and the payload line, add
assertNotIn('bash "$COMMAND_FILE"', content) and assertNotIn('#!/bin/bash',
content) (or a corresponding shebang string for COMMAND_FILE) to ensure the
runner uses the COMMAND_SHELL variable rather than a hardcoded bash invocation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 733e9e42-1056-469c-9dac-3c429677e0a8

📥 Commits

Reviewing files that changed from the base of the PR and between 82a83ec and 99b2c95.

📒 Files selected for processing (3)
  • source/src/story_automator/commands/tmux.py
  • source/src/story_automator/core/tmux_runtime.py
  • source/tests/test_tmux_runtime.py

Comment thread source/src/story_automator/core/tmux_runtime.py
Comment thread source/src/story_automator/core/tmux_runtime.py
@bma-d
Copy link
Copy Markdown
Contributor

bma-d commented Apr 17, 2026

Verified the two additional CodeRabbit findings; both were real.

Fixed in 948c310:

  • dead tmux panes with missing pane_dead_status no longer reconcile as a false success; they now persist as result="unknown" with failureReason="pane_dead_unknown_status"
  • Claude completion detection no longer accepts any generic for Xm Ys text from payload output; it now only matches explicit completion phrasing (Baked|Done|Finished for ...)

Verification:

  • python3 -m compileall source/src
  • PYTHONPATH=source/src python3 -m unittest source.tests.test_tmux_runtime source.tests.test_runtime_policy source.tests.test_success_verifiers source.tests.test_orchestrator_parse source.tests.test_state_policy_metadata
  • npm run test:smoke

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants