Skip to content

fix(codex): self-heal missed Codex-format commands in hooks.json#745

Merged
kaitranntt merged 2 commits intodevfrom
kai/fix/codex-selfheal-command-format
Apr 24, 2026
Merged

fix(codex): self-heal missed Codex-format commands in hooks.json#745
kaitranntt merged 2 commits intodevfrom
kai/fix/codex-selfheal-command-format

Conversation

@kaitranntt
Copy link
Copy Markdown
Collaborator

Summary

Follow-up hotfix to #739 caught during E2E validation on Mac. After merging PR #743, I purged `~/.codex/hooks/`, injected a stale `node "/Users/kaitran/.codex/hooks/deadbeef-stale-hook.cjs"` entry into `hooks.json`, and ran `ck migrate --agent codex --global --hooks --yes`. The stale entry survived.

Root cause: Codex rewrites CK hook registrations as `node "/abs/path"`. The earlier extractor (`extractLeadingAbsolutePath`) took the first whitespace-split token — `node` — saw it wasn't absolute, and returned null → the entry was classified as "not a file-path command" and preserved. So the self-heal silently no-op'd for every real Codex entry, defeating the whole fix.

Fix

Rewrote the extractor to scan the full command for every absolute-path reference (`/abs/path`, `"/abs/path"`, `node "/abs/path"`, `/usr/bin/env node /abs/path`). A hook is considered stale when it references at least one CK-managed path AND every such reference points at a missing file. Shell expressions, PATH-resolved binaries, and user-owned absolute-path hooks (outside `/.claude/hooks/`, `/.codex/hooks/`, `~/.gemini/hooks/`) are preserved verbatim.

Test plan

  • New regression test using the exact `node "/abs/path"` shape production Codex writes
  • Existing self-heal tests still pass
  • `bun test` — 4530 pass, 0 fail
  • `bun run typecheck` / `lint:fix` / `build` — clean
  • Post-merge: re-run the E2E reproduction on Mac — inject stale `node "/…/.codex/hooks/deadbeef.cjs"` entry, migrate, verify pruned
  • Mirror on `ssh i9-bootcamp` Windows machine

Related: #739 (original self-heal). No new issue filed — this is a direct regression fix.

Follow-up to #739. During E2E validation on Mac we observed that after
purging ~/.codex/hooks/ and re-migrating, a manually-injected stale
entry was NOT pruned. Root cause: Codex rewrites hook commands as
`node "/abs/path"`, whose first whitespace-split token is "node" rather
than an absolute path, so the earlier single-leading-token extractor
returned null and the entry was preserved as "not a file-path command".

Extractor rewritten to scan the full command for every absolute-path
reference, then filter to CK-managed locations. A hook is stale iff it
references at least one CK-managed path AND every such reference is
missing. Shell expressions, PATH-resolved binaries, and user-owned
absolute-path hooks continue to be preserved verbatim.

Added regression test using the exact `node "/abs/path"` shape that
production Codex writes.
@claude
Copy link
Copy Markdown

claude Bot commented Apr 24, 2026

Claude finished @kaitranntt's task in 3m 13s —— View job


Code Review: PR #745 — fix(codex): self-heal missed Codex-format commands in hooks.json

  • Gather context and understand the request
  • Read changed files
  • Analyze implementation correctness
  • Check tests
  • Review edge cases and security
  • Post final review

Summary

Solid, targeted fix. The root cause diagnosis is accurate, the replacement logic is correct, and the regression test directly reproduces the failure shape. A few notes below ranging from a minor logic subtlety to a known limitation worth documenting.


Core fix — extractAbsolutePaths (hooks-settings-merger.ts:358)

The old extractLeadingAbsolutePath was genuinely broken for this case — taking the first whitespace-split token (node) and returning null caused every Codex-format entry to silently pass through the stale check. The new scanner approach is the right fix.

Regex analysis: /(?:^|[\s"'(])(\/[^\s"'()]+)/g

All documented shapes work correctly:

Command Extracted paths
node "/path/.codex/hooks/hook.cjs" ["/path/.codex/hooks/hook.cjs"]
/path/.codex/hooks/bare.cjs ["/path/.codex/hooks/bare.cjs"]
/usr/bin/env node /home/user/.codex/hooks/hook.cjs ["/usr/bin/env", "/path/.codex/hooks/hook.cjs"]
npm run lint []
https://example.com/path [] ✓ (preceding chars not in anchor set)

One known limitation to document: Paths with spaces — node "/Users/John Doe/.codex/hooks/hook.cjs" — would be truncated at the first space, extracting only /Users/John, which doesn't match isCkManagedHookPath. The hook would be silently kept even if stale. This is extremely unlikely in practice (home directories rarely contain spaces on macOS/Linux), but worth a comment on extractAbsolutePaths. Fix this →


Stale detection logic (hooks-settings-merger.ts:385-390)

const ckPaths = paths.filter(isCkManagedHookPath);
if (ckPaths.length === 0) return true;           // no CK ref → keep
return ckPaths.some((p) => existsSync(p));        // keep if ANY ck path exists

The "keep if ANY CK-managed path exists" semantic is the right call. The alternative ("prune if ANY is missing") would cause false positives on commands referencing multiple CK paths where one was intentionally deleted. Good choice.

Minor nuance: isCkManagedHookPath checks for /.claude/hooks/ with a trailing slash, so a path exactly equal to /home/user/.claude/hooks (no trailing slash — the directory itself) wouldn't match. In practice hook entries always reference files, so this is theoretical. No action needed.


Test — regression for node "/path" shape (hooks-settings-merger.test.ts:309)

The test is correctly structured:

  • Creates a live file on disk → should survive pruning
  • Leaves the stale file absent → should be pruned
  • Both commands use the node "/abs/path" shape that triggered the bug

One stylistic note: the test mixes Bun.write (for dir creation) and writeFileSync (for file creation), which is consistent with the surrounding tests — not a problem, just inheriting the existing pattern.

Missing coverage: extractAbsolutePaths has no direct unit tests. It's only exercised via pruneStaleFileHooks. Given the regex complexity, a small table-driven test for the extractor itself would make future changes safer. Not a blocker. Add tests →


Cleanup

Correctly removed the unused isAbsolute import from node:path. Clean.


Verdict

Approve. The fix is correct, minimal, and well-tested for the failure scenario. The space-in-path limitation is pre-existing behavior (the old function also couldn't handle it), so it's not a regression. The post-merge E2E on Mac is the right final validation gate.

Test previously referenced $HOME/.claude/hooks/subagent-init.cjs which
only exists on developer machines. The sharpened self-heal extractor
correctly prunes this entry on CI runners where the file is missing,
breaking the unrelated assertion about unsupported-event handling.

Point the command at a real file inside the test's scoped dir so the
test exercises pipeline behavior for unsupported events, not the
self-heal path (which has its own dedicated tests).
@claude
Copy link
Copy Markdown

claude Bot commented Apr 24, 2026

Claude finished @kaitranntt's task in 1m 39s —— View job


Code Review: PR #745 — fix(codex): self-heal missed Codex-format commands in hooks.json

  • Gather context and understand the request
  • Read changed files
  • Analyze implementation correctness
  • Check tests
  • Review edge cases and security
  • Post final review

Summary

Correct, minimal, well-targeted fix. The root cause diagnosis is accurate, the replacement logic is sound, and the regression test directly reproduces the failure shape. The integration test fixup is also necessary and correct. Ready to merge.


Core fix — extractAbsolutePaths (hooks-settings-merger.ts:358)

The old extractLeadingAbsolutePath was broken for this case. It split on whitespace and checked only the first token. For node "/path/.codex/hooks/hook.cjs", the first token is "node" — not absolute — so it returned null, classifying every Codex-format entry as "not a file-path command" and silently preserving all stale entries.

The new scanner approach is the right fix. Regex: /(?:^|[\s"'(])(\/[^\s"'()]+)/g

Command Extracted paths
node "/path/.codex/hooks/hook.cjs" ["/path/.codex/hooks/hook.cjs"]
/path/.codex/hooks/bare.cjs ["/path/.codex/hooks/bare.cjs"]
/usr/bin/env node /home/user/.codex/hooks/hook.cjs ["/usr/bin/env", "/home/user/.codex/hooks/hook.cjs"]
npm run lint []
https://example.com/path [] ✓ (: chars preceding / not in anchor set)

The anchor (?:^|[\s"'(]) correctly blocks URL matching — //example.com is preceded by : which is not in the anchor character class.

Known limitation (pre-existing, not a regression): Paths containing spaces — e.g. /Users/John Doe/.codex/hooks/hook.cjs — will be truncated at the first space by the [^\s"'()]+ terminator, causing the truncated path to not match isCkManagedHookPath. The stale hook would be silently kept. The old extractLeadingAbsolutePath had the same blind spot. Home directories with spaces are extremely rare on macOS/Linux, and Codex hooks on Windows are disabled entirely. Worth a // Known limitation: comment on extractAbsolutePaths for future maintainers, but not a blocker. Add comment →


Stale detection logic (hooks-settings-merger.ts:385–390)

const ckPaths = paths.filter(isCkManagedHookPath);
if (ckPaths.length === 0) return true;        // no CK ref → keep
return ckPaths.some((p) => existsSync(p));    // keep if ANY ck path exists

Semantics are correct: a hook is only pruned when all its CK-managed path references are missing. "Keep if any exists" prevents false positives on commands referencing multiple CK paths where one was intentionally deleted. The prior review noted this was the right call — confirmed.


Integration test fix (codex-hook-compat-integration.test.ts:305–330)

The upgrade test previously used homedir()/.claude/hooks/subagent-init.cjs as the stale entry's command path — a path that doesn't exist on disk in CI. Before this PR, pruneStaleFileHooks would not have caught it (old extractor saw node as first token → null → keep). After this PR's fix, the new extractor correctly identifies this as a stale CK-managed entry and prunes it, making subagentEntries.length drop to 0 and breaking the assertion expect(subagentEntries.length).toBe(1).

The fix is correct: create a real file at legacySubagentHook and use that path in the command. Self-heal sees the file exists → keeps the entry → assertion holds. The test comment clearly documents the intent (asserting pipeline behavior for unsupported events, not self-heal behavior).


Regression test (hooks-settings-merger.test.ts:309)

Well structured. Creates a live file → should survive. Leaves stale file absent → should be pruned. Both commands use node "/abs/path" shape — exactly the shape that triggered the bug. No issues.

The test uses Bun.write for directory creation and writeFileSync for file creation, which is consistent with the surrounding test suite.


Cleanup

isAbsolute import correctly removed (no longer used after extractLeadingAbsolutePath was deleted). The old function and its JSDoc are fully removed — no orphaned references.


Verdict

Approve. The fix is correct, minimal, and directly tested. No behavioral changes beyond the intended scope. The post-merge E2E on Mac remains the right final gate.

@kaitranntt kaitranntt merged commit 764417a into dev Apr 24, 2026
4 checks passed
@kaitranntt kaitranntt deleted the kai/fix/codex-selfheal-command-format branch April 24, 2026 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant