Skip to content

fix: align auto-skills context passing and MCP handler defaults#10

Open
ellisjr wants to merge 23 commits intojrenaldi79:mainfrom
ellisjr:fix/auto-skills-context-passing
Open

fix: align auto-skills context passing and MCP handler defaults#10
ellisjr wants to merge 23 commits intojrenaldi79:mainfrom
ellisjr:fix/auto-skills-context-passing

Conversation

@ellisjr
Copy link
Contributor

@ellisjr ellisjr commented Mar 13, 2026

Summary

Stacked on PR #9 (feat/activity-monitoring-hooks). Merge that first.

  • Context passing: All four auto-skill SKILL.md files now pass parentSession to sidecar_start for accurate context matching. Hooks (post-tool-use.sh, pre-bash.sh) extract the session ID from hook JSON and include it in additionalContext. includeContext changed from false to true for auto-review (reviewer needs conversation context about WHY changes were made).
  • MCP handler fixes: Removed hardcoded --client cowork from sidecar_start, sidecar_resume, and sidecar_continue — now only sets it when coworkProcess is explicitly provided, letting Claude Code users auto-detect as code-local. Fixed sidecar_continue agent defaulting to match sidecar_start (only override chat→build in headless, respect Plan). Added agent parameter to sidecar_continue tool schema.
  • Guide text: Strengthened session matching section and added auto-skills to "MUST Include Context" red flags.

Changes (on top of PR #9)

File Change
skill/auto-*/SKILL.md (4 files) Add parentSession to sidecar_start calls
hooks/post-tool-use.sh Extract session ID, enrich additionalContext
hooks/pre-bash.sh Extract session ID, enrich additionalContext
src/mcp-server.js Remove --client cowork hardcode, fix agent defaulting
src/mcp-tools.js Add agent to sidecar_continue schema, strengthen guide text
tests/mcp-server.test.js Update client test for conditional behavior

Test plan

  • All 70 test suites pass (1521 tests)
  • Lint clean
  • Updated --client cowork test to verify conditional behavior
  • Manual: Verify MCP-spawned sidecar from Claude Code auto-detects as code-local
  • Manual: Verify sidecar_continue with agent: Plan + noUi: true respects Plan agent

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Auto-Skills system with four contextual triggers and CLI controls; activity-monitoring hooks to collect events and trigger skills; CLI usage/help improvements.
  • Documentation

    • Large Auto-Skills, BMAD workflow, invocation proposal and publishing updates; social card meta updates.
  • Bug Fixes

    • Safer key import/error handling, improved thinking-level fallbacks, more robust hook/install/uninstall flows, safer spawn/debug behavior.
  • Tests

    • Extensive new/updated tests for auto-skills, CLI handlers, postinstall/hooks, model validation, and related helpers.

ellisjr and others added 22 commits March 12, 2026 01:30
Adds a skill that triggers after completing a feature/fix, offering to
spawn a headless sidecar (Plan mode, read-only) to review changes for
bugs, missed edge cases, and quality issues before claiming done.

Key design decisions:
- Dynamic model discovery via sidecar_guide (not hardcoded aliases)
- Timeout omitted — uses sidecar platform default (15 min)
- Polling cadence follows sidecar_status responses
- Explicit terminal states: complete, timeout, crashed, error, aborted
- Always unions staged + unstaged diffs before empty check
- includeContext controls conversation history, not file access
- Proof of failure required for each finding (reduces false positives)
- git diff --stat for large diffs; Plan agent has read_file access
- Briefing includes validation context and hardcoded-value checks
- Dependency impact flagging for manifest/lockfile changes
- Non-blocking: review runs in background while user gets completion msg

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ging

Adds a skill that triggers after 5+ failed distinct approaches, offering
an iterative brainstorming loop with a sidecar. The sidecar suggests
ideas, Claude evaluates them against codebase knowledge, and feeds back
what won't work — prompting deeper suggestions.

Key design decisions:
- Uses repeated sidecar_start (not sidecar_continue) to stay in Plan mode
  (headless sidecar_continue forces Build mode per mcp-server.js:275)
- Dynamic model discovery via sidecar_guide (not hardcoded aliases)
- Timeout omitted — uses sidecar platform default (15 min)
- Polling cadence follows sidecar_status responses
- Explicit terminal states: complete, timeout, crashed, error, aborted
- Multi-model narrowing: all models round 1, best model only from round 2
- User check-in every 5 rounds to prevent theoretical spiraling
- Rubber duck effect: abort sidecar if briefing prep solves the problem
- Sidecar designs diagnostic probes, not just suggestions
- Broadened context requests (dir listings, env vars, dep trees)
- Token management guidance in follow-up briefings
- Per-model task ID tracking for multi-model brainstorms
- Max 20 iterations with clear termination conditions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a skill that triggers when user asks to commit/push/PR, offering
a headless sidecar security audit before changes enter the repository.
Sequential (blocks commit) to catch critical vulnerabilities first.

Key design decisions:
- Dynamic base branch resolution with full fallback chain
  (origin/HEAD -> @{upstream} -> main -> master -> ask user)
- Audits union of staged + unstaged for commits; full branch diff for PRs
- Dynamic model discovery via sidecar_guide (not hardcoded aliases)
- Timeout omitted — uses sidecar platform default (15 min)
- Polling cadence follows sidecar_status responses
- Explicit terminal states: complete, timeout, crashed, error, aborted
- Attack vector required for each finding (reduces false positives)
- CI/CD files (.github/workflows/, etc.) in high-priority list
- Suspicious dependency detection (typosquatting, unmaintained packages)
- Security-relevant file prioritization for large diffs
- Critical/high findings block commit; medium findings are user's choice
- Does not fire on slash commands (/commit) — isolated context

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Updates postinstall.js to also install auto-review, auto-unblock, and
auto-security skills to ~/.claude/skills/sidecar/<name>/SKILL.md
alongside the main SKILL.md. Without this, npm install would not
deploy the auto-skills to the user's skill directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of hardcoding the list of auto-skills, scan the skill/
directory for auto-* subdirectories at install time. This ensures
new auto-skills are picked up without updating the script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…acts

Adds a new auto-skill that offers second-opinion sidecar reviews at natural
BMAD-METHOD workflow checkpoints. Covers 9 artifact types (PRD, architecture,
epics, stories, etc.) with sequential multi-model review where each model
sees improvements from prior rounds. Includes bmad-workflow.md reference doc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add sharded epics handling (concatenate shards with headers in briefing)
- Remove Implementation Readiness from scope (gate, not persisted artifact)
- Add partial pipeline failure handling (preserve changes, continue to next model)
- Add explicit sleep 25 polling rule in Step 4a
- Add prioritization guidance for large input document sets
- Add scope rationale note explaining intentional artifact selectivity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…catch

- Add `text` language tags to bare code fences in bmad-workflow.md
- Wrap AUTO_SKILLS readdirSync in try-catch for resilience

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an "Auto-Skills" section to the main sidecar SKILL.md listing all
four auto-skills with trigger conditions. Since the main skill appears
in Claude's available skills list, this gives Claude passive awareness
of auto-skill triggers without requiring new hook infrastructure.

Also adds docs/auto-skill-invocation-research.md with findings on how
Superpowers achieves reliable skill invocation and recommendations for
improving sidecar auto-skill reliability (SessionStart hook, etc.).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Install auto-skills as top-level skill directories
(~/.claude/skills/sidecar-auto-*/) instead of nested under sidecar/.
This makes them appear in Claude Code's available skills list, enabling
both auto-fire on trigger conditions AND manual invocation via slash
commands (/sidecar-auto-review, /sidecar-auto-security, etc.).

- Rename frontmatter `name` fields to sidecar-auto-* prefix
- Update postinstall.js to install to top-level with cleanup of old path
- Update main SKILL.md reference to new location

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reflects that Options B and D are now implemented. Updates the gap
analysis table, adds verified skills list output, documents the
nesting depth discovery, and refocuses remaining recommendations
on SessionStart hook and config framework.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds Claude Code hooks as a second triggering path for auto-skills,
alongside existing description-matching. Phase 1 delivers shell hooks
for auto-security (PreToolUse gate on git commit/push/PR) and
auto-bmad-check (PostToolUse on _bmad-output/ writes), plus event
collection for Phase 2 stuck-loop and review-completion detection.

Includes:
- autoSkills config namespace with master + per-skill enable/disable
- `sidecar auto-skills` CLI command for managing auto-skill state
- Shell hooks (pre-bash.sh, post-tool-use.sh, post-failure.sh)
- Hook registration in postinstall with absolute path resolution
- Cleanup on npm uninstall (preuninstall.js)
- Step 0 config check in all 4 auto-skill SKILL.md files
- Research doc analyzing 3 approaches (recommends Approach B)
- 23 unit tests for config module

Reviewed by Gemini Pro (9 findings) and ChatGPT (12 findings),
all accepted fixes applied before commit.

Builds on PR jrenaldi79#7 (feat/auto-sidecar-skills) which introduced the
four auto-skills. Addresses jrenaldi79#8 (activity monitoring for auto-skill
triggers).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use jq for safe JSON construction in BMAD trigger (prevent injection
  from filenames with quotes/backslashes)
- Make mktemp failure non-fatal in all hooks (exit 0 instead of
  blocking git operations)
- Fix CLAUDE.md auto-generated description for preuninstall.js
- Update postinstall.js JSDoc to mention hook registration
- Gate first-run notice on hooks actually being registered

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Split registerHooks into mergeHooks + registerHooks (both under 50 lines)
- Deduplicate hooks by basename (not full path) to prevent accumulation on upgrades
- Fix TOCTOU race: use umask for atomic file creation with restrictive permissions
- Extract getUsage() to cli-usage.js (cli.js now 289 lines, under 300 limit)
- Handle --key=value syntax in CLI arg parser
- Sanitize SESSION_ID to prevent path traversal in event file paths
- Fix jq slice syntax ([:500] -> [0:500]) for broader compatibility
- Cap event JSONL files at 5MB to prevent unbounded growth
- Add Phase 2 forward-reference comment for stop-hook.sh in preuninstall

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes 14 issues found during structured code review:

High-confidence fixes (score 100):
- Replace console.error with logger.error in start-helpers.js (lint failure)
- Fix JSON injection in registerHooks via JSON.stringify escaping
- Prevent mergeHooks from unconditionally rewriting settings.json
- Scope basename dedup to claude-sidecar paths to protect user hooks

Sub-threshold fixes (score 50-75):
- Move cli-handlers.js no-console override to .eslintrc.js
- Extract parseAutoSkillsArgs to bring handleAutoSkills under 50 lines
- Decouple monitoring.enabled from auto-skill triggers in hooks
- Wrap saveConfig in try/catch in setAutoSkillsEnabled
- Consolidate double progress.json read into single parse
- Remove phantom stop-hook.sh from preuninstall.js
- Remove broken docs/plans/index.md link from CLAUDE.md
- Add unit tests for start-helpers, cli-handlers, and mergeHooks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes 4 issues found by multi-model sidecar review:

- Fix String.replace() $-interpolation in registerHooks by using
  replacer function (() => safeHooksDir) instead of string argument
- Add typeof guard for h.command in mergeHooks dedup filter to prevent
  TypeError crash on malformed settings.json entries
- Return boolean from setAutoSkillsEnabled so CLI can detect and report
  save failures instead of printing false success (logger.warn -> error)
- Compare full matcher object in mergeHooks alreadyExists check so
  upgraded matchers (e.g. adding MultiEdit) are properly replaced

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security: make remote debugging opt-in (setup-window), fix TOCTOU race
conditions in hook event files (post-tool-use, post-failure), add request
timeout to API key validation, handle Anthropic 5xx/429 as errors, add
curl timeout to publish workflow, fix pre-push hook to block on test
failure.

Bugs: fix validateOpenRouterKey backwards-compat alias signature, add
await to handleAutoSkills, add try-catch and result checking to IPC
get-api-keys handler, guard against null m.name in model-validator,
default unknown skills to disabled, validate skill names before saving,
use smarter thinking level fallback for high/xhigh, handle first release
in publish workflow.

Tests: fix process.env restoration in config-null-alias, fix nested
model id test case, remove silent-pass risk in setup-ui-model test,
update setup-window and api-key-validation tests for new behavior.

Docs: add lang/title to social card HTML, clarify publishing auth method.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Redact bash commands in telemetry (truncate + strip long base64)
- Fix stat fallback order to GNU first in post-tool-use.sh
- Use POSIX [[:space:]] instead of \s in pre-bash.sh grep patterns
- Fix postinstall mergeHooks to remove only sidecar hooks within
  matchers, not entire matcher objects with mixed ownership
- Use path-based detection in preuninstall to avoid removing user hooks
  with same basename
- Scope auto-security commit scans to staged changes only
- Split getUsage() into sub-50-line helper functions
- Guard model-validator sort against null name
- Guard progress.js against invalid updatedAt dates
- Warn when boolean CLI flags receive inline values
- Add uninstall script as fallback for preuninstall lifecycle

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-skills were not passing context to sidecar the same way the
primary skill does, leading to degraded review quality and unreliable
session matching.

Changes:
- Add parentSession parameter to all four auto-skill sidecar_start calls
- Change auto-review from includeContext:false to true (reviewer needs
  conversation context about WHY changes were made)
- Enrich hook additionalContext with session ID and trigger data so
  Claude can pass parentSession through to sidecar_start
- Strengthen sidecar_guide text to recommend parentSession when
  includeContext is true
- Update parentSession schema description to note hook integration
- Add auto-skills to guide's "must include context" red flags

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The MCP server was hardcoding --client cowork for all spawned sidecars,
which broke client auto-detection for Claude Code users. Now only passes
--client cowork when coworkProcess is explicitly set.

Also fixes sidecar_continue and sidecar_resume to not blindly override
the agent to build in headless mode — uses the same smart defaulting as
sidecar_start (only override chat→build, respect Plan).

Adds agent parameter to sidecar_continue tool schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Walkthrough

Adds auto-skills infrastructure: CLI management, runtime hooks (pre-bash, post-tool-use, post-failure), postinstall/uninstall scripts to register hooks and install auto-skills, config utilities, skill docs, tests, and related CLI/sidecar adjustments.

Changes

Cohort / File(s) Summary
Config & CI
\.eslintrc.js, .github/workflows/publish.yml, .husky/pre-push, package.json
ESLint override extended to src/cli-handlers.js; publish workflow guards empty PREV_TAG and adds curl timeouts; pre-push enforces test failure abort; package includes hooks and preuninstall lifecycle scripts.
CLI & Usage
src/cli.js, src/cli-usage.js, src/cli-handlers.js, bin/sidecar.js
Inline --flag=value parsing; usage text moved to src/cli-usage.js; new handleAutoSkills CLI command and auto-skills routing in bin/sidecar.js; exports updated.
Auto-Skills Config
src/utils/auto-skills-config.js, CLAUDE.md
New module with VALID_SKILL_NAMES, SKILL_LABELS, config load/save, getters (isSkillEnabled, isMonitoringEnabled), setters, status formatting, and name resolution utilities.
Hooks & Scripts
hooks/hooks.json, hooks/pre-bash.sh, hooks/post-tool-use.sh, hooks/post-failure.sh
Adds hook registry JSON and three robust shell hook scripts that consume JSON input, respect per-user config, append session events, optionally emit BMAD triggers, and provide pre-commit security guidance.
Install / Uninstall
scripts/postinstall.js, scripts/preuninstall.js
Postinstall discovers/top-level-installs auto-skills, merges/registers hooks (mergeHooks, registerHooks); preuninstall removes sidecar hook entries (removeHooks). New exported functions added.
Skills & Docs
skill/..., docs/...
Adds Auto-Skills section to main SKILL and four new skill docs (auto-review, auto-security, auto-unblock, auto-bmad-method-check); research and proposal docs (activity-monitoring, invocation proposal, BMAD workflow, publishing updates).
Sidecar Core & Tools
src/mcp-server.js, src/mcp-tools.js, src/sidecar/progress.js, src/sidecar/setup-window.js
Conditional --client cowork and agent handling for start/resume/continue; expanded guide/context text and agent option; unified stall detection (lastActivityMs); remote-debugging opt-in via SIDECAR_DEBUG_PORT.
Validation & Utilities
src/utils/api-key-validation.js, src/utils/model-validator.js, src/utils/start-helpers.js, src/utils/thinking-validators.js, electron/ipc-setup.js, scripts/generate-docs-helpers.js, site/social-card-render.html
API key timeouts and 5xx handling; model filter hardening; logger usage for errors; thinking-level fallback with adjustedLevel; IPC handlers wrapped with try/catch; docs generator formatting tweak; social card lang/title added.
Tests
tests/* (many files)
Adds wide test coverage: auto-skills config, CLI handlers, postinstall merging, start-helpers, and updates tests for conditional cowork flag, remote-debug opt-in, env cleanup, API key timeout handling, and model id normalization.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CLI/bin/sidecar.js
    participant Config as Auto-Skills Config
    participant Postinstall as Postinstall/Register
    participant HookExec as Hook Executor
    participant SessionFS as Session JSONL
    participant Sidecar as Sidecar Core

    User->>CLI: auto-skills --on / --off / list
    CLI->>Config: setAutoSkillsEnabled / getAutoSkillsStatus
    Config-->>CLI: status / success

    User->>Postinstall: install/update package
    Postinstall->>Postinstall: discover auto-skills, read hooks/hooks.json
    Postinstall->>~/.claude/settings.json: merge hooks
    Postinstall-->>User: hooks registered

    Sidecar->>HookExec: Trigger PreToolUse / PostToolUse / PostToolUseFailure
    HookExec->>HookExec: read stdin JSON, check config, build event
    HookExec->>SessionFS: append session-specific JSONL event
    HookExec-->>Sidecar: emit additionalContext / BMAD trigger

    Sidecar->>Sidecar: evaluate hook output -> match auto-skill
    Sidecar-->>User: spawn sidecar for matched auto-skill
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • jrenaldi79

Poem

🐰 I hopped through hooks and docs tonight,
I nudged configs, set defaults right,
I stitched the scripts with careful paws,
Now auto-skills fire without a pause!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: align auto-skills context passing and MCP handler defaults' accurately summarizes the main changes: correcting context passing for auto-skills (via parentSession) and fixing MCP handler defaults (removing hardcoded --client cowork and aligning agent behavior).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tests/api-key-store-validation.test.js (1)

43-232: ⚠️ Potential issue | 🟠 Major

Add tests for the new validator branches (timeout + Anthropic 429/5xx).

The mock updates align with setTimeout, but the new behavior in validation logic is still unverified in this suite. Please add explicit assertions for:

  • timeout path resolving { valid: false, error: 'Request timed out' }
  • Anthropic 429/5xx resolving as server-error invalid responses.

As per coding guidelines: "Write tests first (TDD approach) for all new business logic".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api-key-store-validation.test.js` around lines 43 - 232, Add unit tests
in tests/api-key-store-validation.test.js to cover the new timeout and Anthropic
server-error branches: mock https.get to simulate a request timeout by returning
a req object whose setTimeout triggers the 'timeout' handler and assert
validateApiKey('openrouter' or relevant provider, '<key>') resolves to { valid:
false, error: 'Request timed out' }; add two Anthropic mocks with statusCode 429
and a 5xx (e.g., 500) that call the response 'data'/'end' handlers and assert
validateApiKey('anthropic', '<key>') resolves to { valid: false, error: 'Server
error (429)' } and { valid: false, error: 'Server error (500)' } respectively;
use the existing pattern of setting https.get.mockImplementation and capturing
the callback response to keep tests consistent with other cases.
tests/setup-ui-model.test.js (1)

187-198: ⚠️ Potential issue | 🟠 Major

Move this UI-picker assertion out of unit tests.

This case is still asserting UI-picker DOM markup/state (route-pill active classes). Please move this verification to the manual CDP flow instead of expanding unit coverage here.

As per coding guidelines: "In tests, do not write unit tests for DOM manipulation, UI picker components, or Electron window configuration - these require manual CDP verification".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/setup-ui-model.test.js` around lines 187 - 198, In the test "should
mark first available pill as active when openrouter key missing", remove the UI
DOM state assertions that inspect route-pill active classes (the expects using
/route-pill active...data-provider="google"/ and /route-pill
active...data-provider="openrouter"/) and instead keep only non-UI assertions
(e.g. that buildModelStepHTML(MODEL_CHOICES, undefined, configuredKeys) returns
a non-empty string and that the gemini toggle span exists via geminiToggle not
being null); move the specific active-class verification to the manual CDP flow
as requested by the reviewer.
src/mcp-tools.js (1)

254-261: ⚠️ Potential issue | 🟠 Major

sidecar_guide still can’t answer the enablement check the auto-skills depend on.

The new SKILL.md flows call sidecar_guide to decide whether a skill is disabled, but this tool still has an empty input schema and its guide text never includes the current autoSkills state. As written, Step 0 cannot observe the master/per-skill switches, so disabled auto-skills will still fire.

🛠️ Suggested direction
 const { z } = require('zod');
 const { formatAliasNames } = require('./utils/config');
+const { getAutoSkillsStatus } = require('./utils/auto-skills-config');
@@
 function getGuideText() {
   const { getEffectiveAliases } = require('./utils/config');
+  const autoSkillsStatus = getAutoSkillsStatus();
   const aliases = getEffectiveAliases();
@@
   return `# Sidecar Usage Guide
@@
+## Auto-Skills Status
+${autoSkillsStatus}
+
 ## Session Matching (Important for Context Accuracy)

Also applies to: 324-343

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mcp-tools.js` around lines 254 - 261, The sidecar_guide tool currently
has an empty inputSchema and its description doesn't include the current
autoSkills state, so it cannot perform the enablement check used by SKILL.md;
update the tool (symbol: sidecar_guide) to accept an inputSchema containing the
runtime enablement flags (e.g., autoSkills or a map of master/per-skill
switches) and ensure the guide text/response generation explicitly reads that
input and reports whether each relevant auto-skill is enabled or disabled (so
Step 0 can observe the switches); apply the same change to the other identical
tool definition occurrences referenced in the diff.
src/cli.js (1)

77-85: ⚠️ Potential issue | 🟠 Major

Don't coerce missing-value options to true.

For value-bearing flags, this leaves the parser returning the wrong type. A simple case is --model --prompt test: validateStartArgs() will reach model.split('/') and throw instead of returning a clean validation error. Please record a parse error here or leave the option unset, but don't turn it into true.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli.js` around lines 77 - 85, The parser currently coerces value-bearing
flags to true when a value is missing (in the branch that sets result[key] =
true), causing downstream errors like validateStartArgs() doing
model.split('/'). Instead, when inlineValue is undefined and the next token is
missing or startsWith('--'), do not set result[key] = true; instead mark a parse
error or leave the key unset: e.g., add logic in the parsing block that detects
missing value for the specific option (using the same option metadata used by
parseValue) and either push a descriptive error into a parseErrors array (or
throw a ParseError) referencing the option name (key), or omit setting
result[key] so validation can produce a clean error; update the parsing branch
that currently assigns result[key] = true (the code using inlineValue, next,
parseValue and result[key]) to implement this behavior.
🧹 Nitpick comments (5)
src/utils/api-key-validation.js (1)

35-88: Refactor validateApiKey into smaller helpers.

validateApiKey is now over the 50-line function limit. Please split URL/header construction, response classification, and timeout/error wiring into separate helpers for maintainability.

As per coding guidelines: "Any function must not exceed 50 lines".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/api-key-validation.js` around lines 35 - 88, Split validateApiKey
into three small helpers: (1) create a buildValidationRequest(provider, key)
that returns { url, headers } including Google query param logic and using
VALIDATION_ENDPOINTS and authHeader, (2) implement
classifyValidationResponse(provider, statusCode) that returns the normalized
result objects ({ valid: true } or { valid: false, error: '...' }) encoding
Anthropic-specific rules and the general 200/401/403/other mapping, and (3)
extract the network wiring into makeRequestWithTimeout(url, headers, timeout)
which returns a Promise that resolves with the response statusCode or rejects
with errors/timeouts; then refactor validateApiKey to call
buildValidationRequest, call makeRequestWithTimeout, and pass the returned
statusCode into classifyValidationResponse, preserving current timeout (10000ms)
and error messages.
src/sidecar/progress.js (1)

114-200: Consider extracting progress.json parsing into a helper.

The readProgress function is approximately 85 lines, exceeding the 50-line guideline. While the logic is cohesive, extracting the progress.json parsing block (lines 156-193) into a separate helper like parseProgressFile(progressPath, convStat) would improve readability and align with the function length guideline.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sidecar/progress.js` around lines 114 - 200, The readProgress function is
long; extract the progress.json parsing block into a helper
parseProgressFile(progressPath, convStat) that returns an object with {stage,
latestOverride, messagesOverride, lastActivityOverride, lastActivityMsOverride}
(or similar), then replace the inline parsing (the fs.existsSync(progressPath)
try/catch block) with a call to parseProgressFile and apply its returned
overrides to messages, latest, lastActivity, lastActivityMs and stage; make sure
parseProgressFile uses computeLastActivity and convStat.mtime comparisons
exactly as in the original logic and preserves error swallowing for malformed
files.
tests/mcp-server.test.js (1)

37-67: Add a regression for headless sidecar_continue with agent: "Plan".

This PR changes that branch in src/mcp-server.js, but the suite only expands the cowork flag case. Please assert that noUi: true, agent: 'Plan' keeps --agent Plan instead of silently coercing to build.

🧪 Suggested test
+  test('sidecar_continue preserves Plan in headless mode when explicitly requested', async () => {
+    let capturedArgs;
+    await jest.isolateModulesAsync(async () => {
+      jest.doMock('child_process', () => ({
+        spawn: jest.fn((_cmd, args) => {
+          capturedArgs = args;
+          return { pid: 12345, unref: jest.fn(), stdout: { on: jest.fn() }, stderr: { on: jest.fn() } };
+        }),
+      }));
+      const { handlers: h } = require('../src/mcp-server');
+      await h.sidecar_continue(
+        { taskId: 'old-parent', prompt: 'new task', noUi: true, agent: 'Plan' },
+        '/tmp'
+      );
+      const idx = capturedArgs.indexOf('--agent');
+      expect(idx).toBeGreaterThan(-1);
+      expect(capturedArgs[idx + 1]).toBe('Plan');
+    });
+  });
As per coding guidelines, "tests/**/*.js: Write tests first (TDD approach) for all new business logic".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/mcp-server.test.js` around lines 37 - 67, Add a regression test to
assert headless sidecar_continue preserves agent "Plan": update
tests/mcp-server.test.js to add a case that calls handlers.sidecar_continue with
noUi: true and agent: 'Plan' (similar style to the existing sidecar_start cowork
tests), mock child_process.spawn to capture args, then assert that the captured
args include '--agent' followed by 'Plan' (and not coerced to 'build');
reference the handler function sidecar_continue and the request field agent in
your test to locate the code under test.
src/mcp-server.js (1)

84-85: Extract the headless agent-selection rule into one helper.

sidecar_start and sidecar_continue now carry the same chat -> build when headless policy. That duplication is what let these paths drift before, so the next tweak can split them again unless it is centralized.

♻️ Suggested refactor
+function resolveSidecarAgent(agent, noUi) {
+  return (noUi && (!agent || agent.toLowerCase() === 'chat')) ? 'build' : agent;
+}
+
 const handlers = {
   async sidecar_start(input, project) {
@@
-    const agent = (input.noUi && (!input.agent || input.agent.toLowerCase() === 'chat'))
-      ? 'build' : input.agent;
+    const agent = resolveSidecarAgent(input.agent, input.noUi);
@@
-    const continueAgent = (input.noUi && (!input.agent || input.agent.toLowerCase() === 'chat'))
-      ? 'build' : input.agent;
+    const continueAgent = resolveSidecarAgent(input.agent, input.noUi);
As per coding guidelines, "Extract duplicate logic to shared utilities".

Also applies to: 276-279

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mcp-server.js` around lines 84 - 85, Extract the duplicated "chat ->
build when headless" rule into a shared helper (e.g., selectAgent or
isHeadlessAgent) and replace the inline logic that sets the local variable agent
with a call to that helper; specifically, implement a helper that returns
'build' when input.noUi && (!input.agent || input.agent.toLowerCase() ===
'chat') otherwise returns input.agent, then update the agent assignment in the
sidecar_start and sidecar_continue code paths to use this helper so the
selection logic is centralized and reused.
src/cli-handlers.js (1)

127-127: JSDoc return type is out of sync with actual return object.

Line 127 documents only enabled and skillArgs, but Line 145 also returns on and off.

Doc fix
- * `@returns` {{ enabled: boolean, skillArgs: string[] }}
+ * `@returns` {{ on: boolean, off: boolean, enabled: boolean, skillArgs: string[] }}

Also applies to: 145-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli-handlers.js` at line 127, Update the JSDoc for the CLI handler
function that currently documents the return as "{ enabled: boolean, skillArgs:
string[] }" so it matches the actual returned object which also includes "on"
and "off" properties; modify the `@returns` annotation to list all returned fields
(enabled, skillArgs, on, off) with their types and brief descriptions to keep
docs and implementation in sync (look for the function in src/cli-handlers.js
that returns the object containing enabled, skillArgs, on, and off).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/activity-monitoring-research.md`:
- Line 109: The in-doc anchor [Future Extensions](`#future-extensions`) referenced
in the sentence about hook types ("See [Future Extensions](`#future-extensions`)
in the design spec.") is broken because that heading doesn't exist; fix it by
either adding a matching heading "## Future Extensions" (or equivalent) to
docs/activity-monitoring-research.md or by updating the link text/anchor to
point to an existing section (e.g., change the link target to an existing
heading name); ensure the link text and the actual heading slug match so the
anchor resolves correctly.
- Around line 296-300: The markdown table row containing the regex
`/^\s*(git\s+(commit|push)|gh\s+pr\s+create)/` (under the Detection Method
column) is breaking table parsing due to unescaped pipe characters; fix by
escaping pipe characters inside the inline code spans (e.g., replace `|` with
`\|` or use the HTML entity &#124;) or move the regex into a fenced code block
so the `|` does not split table cells; apply the same treatment for the `git
add` regex in the PostToolUse row so both regex/code spans no longer create
extra table columns.
- Around line 120-125: The fenced code blocks in
docs/activity-monitoring-research.md are missing language tags; update each
opening triple-backtick fence for the blocks that contain the PostToolUse /
PostToolUseFailure / Stop / PreToolUse(Bash) and SessionStart/SessionEnd
summaries (around the blocks containing the lines with "PostToolUse →
post-tool-use.sh …", the alternate block with "fast event append …", and the
SessionStart/PostToolUse/Stop/SessionEnd block) to include a language tag such
as text (i.e., change ``` to ```text) so the markdown linter and renderer treat
them consistently.

In `@docs/publishing.md`:
- Around line 17-20: The Publishing setup mixes two different flows
(OIDC/Trusted Publisher and token-based publish), causing confusion; update the
docs to clearly separate the primary OIDC Trusted Publisher flow (mentioning
publish.yml, id-token: write, and --provenance for sigstore attestation) from a
fallback token-based flow (describe when NPM_TOKEN is required, its scope and
that it is stored as a GitHub secret), and remove any statements that imply both
are simultaneously required (e.g., the line mentioning NPM_TOKEN alongside
Trusted Publisher and the claude-sidecar access URL) so the release steps become
unambiguous about which credentials are used in each path.

In `@hooks/pre-bash.sh`:
- Around line 73-74: The jq invocation is echoing the raw $COMMAND into
additionalContext and may inject sensitive data; instead change the message
built by the jq call (the object with hookSpecificOutput.additionalContext) to
avoid including $COMMAND directly—use a fixed phrase like "a git commit/push/PR
operation was detected" or a short redacted/validated token (e.g.,
"[REDACTED_COMMAND]" or first N chars only) rather than $COMMAND, and ensure you
do not insert an empty session hint: when $SESSION_ID is empty/undefined omit
the parentSession suggestion (or only append "pass parentSession:
'<SESSION_ID>'" when $SESSION_ID is non-empty). Reference symbols to change: the
jq command that constructs additionalContext, the $COMMAND and $SESSION_ID
variables, and the sidecar_start/parentSession fragment.

In `@package.json`:
- Around line 49-50: Remove the redundant "uninstall" lifecycle entry in
package.json so cleanup runs only once across npm versions: keep the
"preuninstall" script (preserving its value "node scripts/preuninstall.js") and
delete the "uninstall" key; ensure only "preuninstall" remains to avoid double
execution on older npm and to maintain behavior on newer npm versions.

In `@scripts/postinstall.js`:
- Around line 218-230: When reading settingsPath in scripts/postinstall.js you
currently swallow all errors and proceed with an empty settings object, which
causes malformed existing files to be overwritten; instead, catch the read/parse
error and only treat a missing file (ENOENT) as "start fresh"—for other
exceptions log/warn about the malformed file (including the error), skip calling
mergeHooks(settings, hooksConfig) and do not write out settingsPath. Update the
try/catch around JSON.parse(fs.readFileSync(settingsPath, 'utf-8')) to inspect
the error, only fallback to {} for ENOENT, and return/skip the
registration/write path (the code that computes registered and writes
settingsPath and settingsDir) when parse fails.

In `@skill/auto-security/SKILL.md`:
- Around line 92-97: Replace the current guidance that allows deriving
parentSession by scanning the newest ~/.claude/projects/*.jsonl with explicit
instructions to only set the parentSession parameter when you can obtain the
exact session UUID from the hook input (e.g., session_id); if session_id is not
present, do not set parentSession at all—do not fall back to guessing the
most-recent session file. Ensure references to parentSession and session_id are
updated in the SKILL.md text to reflect this strict behavior.

In `@skill/auto-unblock/SKILL.md`:
- Around line 169-181: The doc incorrectly claims headless sidecar_continue
forces Build mode; update the paragraph to reflect that sidecar_continue now
supports agent: "Plan" per mcp__sidecar__sidecar_start and sidecar_continue
implementations (see mcp-server/mcp-tools changes), and if you prefer
sidecar_start keep the justification focused on prompt-isolation and explicit
read-only intent rather than old behavior; edit the text to remove the stale
claim about Build mode, mention that both APIs accept agent: "Plan", and state
the real rationale (prompt isolation / explicit session separation) for using
sidecar_start when applicable.

In `@src/cli.js`:
- Around line 61-63: The code currently treats any inline value for boolean
flags as ignored (logger.warn with inlineValue), which causes "--off=security"
to act like a global --off; update the handling in the flag parsing branch that
uses inlineValue and key so that when key === "on" or key === "off" you either
(a) convert the inlineValue into a positional skill name and inject it back into
the parsed args/skills list (e.g., push inlineValue into the same array used for
skill names), or (b) explicitly reject the syntax by throwing/logging an error
and aborting; implement the chosen behavior inside the same block that
references inlineValue and logger.warn so the check for key === "on"/"off" is
applied before warning and the inlineValue is not discarded.

In `@src/sidecar/setup-window.js`:
- Around line 32-41: The code passes process.env.SIDECAR_DEBUG_PORT directly
into Electron args (const debugPort and args) without validation; validate
SIDECAR_DEBUG_PORT is a positive integer in range 1..65535 before using it:
parse it (e.g., Number or parseInt), check isFinite/integer and 1 <= value <=
65535, log a warning via logger if invalid and fall back to not including the
remote-debugging arg, and only construct args with
`--remote-debugging-port=${debugPort}` when the validated numericPort is valid
(keep existing logging behavior but include whether the env was ignored).

In `@src/utils/model-validator.js`:
- Around line 94-100: filterRelevantModels currently can return model objects
that match by name but lack an id, which later breaks normalizeModelId(provider,
m.id); update filterRelevantModels to filter out entries with missing/empty id
before returning candidates (i.e., ensure filtered = filtered.filter(m => m.id)
and when falling back to models use models.filter(m => m.id)); reference the
function filterRelevantModels and the variables filtered and models so the
returned array always contains only models with a truthy id.

In `@tests/auto-skills-config.test.js`:
- Around line 145-155: The test currently simulates a write failure by changing
filesystem permissions (in tests/auto-skills-config.test.js) which is flaky
across environments; instead, mock fs.writeFileSync to deterministically throw
when setAutoSkillsEnabled(true) runs. Replace the chmod/writeFileSync permission
block with a mock (e.g., jest.spyOn(fs, 'writeFileSync').mockImplementation(()
=> { throw new Error('mock write failure') })) before calling
loadModule().setAutoSkillsEnabled(true), assert the function returns false, and
then restore the original implementation (fs.writeFileSync.mockRestore or
equivalent) for cleanup to avoid side effects.

---

Outside diff comments:
In `@src/cli.js`:
- Around line 77-85: The parser currently coerces value-bearing flags to true
when a value is missing (in the branch that sets result[key] = true), causing
downstream errors like validateStartArgs() doing model.split('/'). Instead, when
inlineValue is undefined and the next token is missing or startsWith('--'), do
not set result[key] = true; instead mark a parse error or leave the key unset:
e.g., add logic in the parsing block that detects missing value for the specific
option (using the same option metadata used by parseValue) and either push a
descriptive error into a parseErrors array (or throw a ParseError) referencing
the option name (key), or omit setting result[key] so validation can produce a
clean error; update the parsing branch that currently assigns result[key] = true
(the code using inlineValue, next, parseValue and result[key]) to implement this
behavior.

In `@src/mcp-tools.js`:
- Around line 254-261: The sidecar_guide tool currently has an empty inputSchema
and its description doesn't include the current autoSkills state, so it cannot
perform the enablement check used by SKILL.md; update the tool (symbol:
sidecar_guide) to accept an inputSchema containing the runtime enablement flags
(e.g., autoSkills or a map of master/per-skill switches) and ensure the guide
text/response generation explicitly reads that input and reports whether each
relevant auto-skill is enabled or disabled (so Step 0 can observe the switches);
apply the same change to the other identical tool definition occurrences
referenced in the diff.

In `@tests/api-key-store-validation.test.js`:
- Around line 43-232: Add unit tests in tests/api-key-store-validation.test.js
to cover the new timeout and Anthropic server-error branches: mock https.get to
simulate a request timeout by returning a req object whose setTimeout triggers
the 'timeout' handler and assert validateApiKey('openrouter' or relevant
provider, '<key>') resolves to { valid: false, error: 'Request timed out' }; add
two Anthropic mocks with statusCode 429 and a 5xx (e.g., 500) that call the
response 'data'/'end' handlers and assert validateApiKey('anthropic', '<key>')
resolves to { valid: false, error: 'Server error (429)' } and { valid: false,
error: 'Server error (500)' } respectively; use the existing pattern of setting
https.get.mockImplementation and capturing the callback response to keep tests
consistent with other cases.

In `@tests/setup-ui-model.test.js`:
- Around line 187-198: In the test "should mark first available pill as active
when openrouter key missing", remove the UI DOM state assertions that inspect
route-pill active classes (the expects using /route-pill
active...data-provider="google"/ and /route-pill
active...data-provider="openrouter"/) and instead keep only non-UI assertions
(e.g. that buildModelStepHTML(MODEL_CHOICES, undefined, configuredKeys) returns
a non-empty string and that the gemini toggle span exists via geminiToggle not
being null); move the specific active-class verification to the manual CDP flow
as requested by the reviewer.

---

Nitpick comments:
In `@src/cli-handlers.js`:
- Line 127: Update the JSDoc for the CLI handler function that currently
documents the return as "{ enabled: boolean, skillArgs: string[] }" so it
matches the actual returned object which also includes "on" and "off"
properties; modify the `@returns` annotation to list all returned fields (enabled,
skillArgs, on, off) with their types and brief descriptions to keep docs and
implementation in sync (look for the function in src/cli-handlers.js that
returns the object containing enabled, skillArgs, on, and off).

In `@src/mcp-server.js`:
- Around line 84-85: Extract the duplicated "chat -> build when headless" rule
into a shared helper (e.g., selectAgent or isHeadlessAgent) and replace the
inline logic that sets the local variable agent with a call to that helper;
specifically, implement a helper that returns 'build' when input.noUi &&
(!input.agent || input.agent.toLowerCase() === 'chat') otherwise returns
input.agent, then update the agent assignment in the sidecar_start and
sidecar_continue code paths to use this helper so the selection logic is
centralized and reused.

In `@src/sidecar/progress.js`:
- Around line 114-200: The readProgress function is long; extract the
progress.json parsing block into a helper parseProgressFile(progressPath,
convStat) that returns an object with {stage, latestOverride, messagesOverride,
lastActivityOverride, lastActivityMsOverride} (or similar), then replace the
inline parsing (the fs.existsSync(progressPath) try/catch block) with a call to
parseProgressFile and apply its returned overrides to messages, latest,
lastActivity, lastActivityMs and stage; make sure parseProgressFile uses
computeLastActivity and convStat.mtime comparisons exactly as in the original
logic and preserves error swallowing for malformed files.

In `@src/utils/api-key-validation.js`:
- Around line 35-88: Split validateApiKey into three small helpers: (1) create a
buildValidationRequest(provider, key) that returns { url, headers } including
Google query param logic and using VALIDATION_ENDPOINTS and authHeader, (2)
implement classifyValidationResponse(provider, statusCode) that returns the
normalized result objects ({ valid: true } or { valid: false, error: '...' })
encoding Anthropic-specific rules and the general 200/401/403/other mapping, and
(3) extract the network wiring into makeRequestWithTimeout(url, headers,
timeout) which returns a Promise that resolves with the response statusCode or
rejects with errors/timeouts; then refactor validateApiKey to call
buildValidationRequest, call makeRequestWithTimeout, and pass the returned
statusCode into classifyValidationResponse, preserving current timeout (10000ms)
and error messages.

In `@tests/mcp-server.test.js`:
- Around line 37-67: Add a regression test to assert headless sidecar_continue
preserves agent "Plan": update tests/mcp-server.test.js to add a case that calls
handlers.sidecar_continue with noUi: true and agent: 'Plan' (similar style to
the existing sidecar_start cowork tests), mock child_process.spawn to capture
args, then assert that the captured args include '--agent' followed by 'Plan'
(and not coerced to 'build'); reference the handler function sidecar_continue
and the request field agent in your test to locate the code under test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 514e6f73-1e2a-4d22-8824-f482b6210257

📥 Commits

Reviewing files that changed from the base of the PR and between 251358e and 6b7ccb4.

📒 Files selected for processing (46)
  • .eslintrc.js
  • .github/workflows/publish.yml
  • .husky/pre-push
  • CLAUDE.md
  • bin/sidecar.js
  • docs/activity-monitoring-research.md
  • docs/auto-skill-invocation-proposal.md
  • docs/bmad-workflow.md
  • docs/publishing.md
  • electron/ipc-setup.js
  • hooks/hooks.json
  • hooks/post-failure.sh
  • hooks/post-tool-use.sh
  • hooks/pre-bash.sh
  • package.json
  • scripts/generate-docs-helpers.js
  • scripts/postinstall.js
  • scripts/preuninstall.js
  • site/social-card-render.html
  • skill/SKILL.md
  • skill/auto-bmad-method-check/SKILL.md
  • skill/auto-review/SKILL.md
  • skill/auto-security/SKILL.md
  • skill/auto-unblock/SKILL.md
  • src/cli-handlers.js
  • src/cli-usage.js
  • src/cli.js
  • src/mcp-server.js
  • src/mcp-tools.js
  • src/sidecar/progress.js
  • src/sidecar/setup-window.js
  • src/utils/api-key-validation.js
  • src/utils/auto-skills-config.js
  • src/utils/model-validator.js
  • src/utils/start-helpers.js
  • src/utils/thinking-validators.js
  • tests/api-key-store-validation.test.js
  • tests/auto-skills-config.test.js
  • tests/cli-handlers.test.js
  • tests/config-null-alias.test.js
  • tests/mcp-server.test.js
  • tests/model-validator-normalize.test.js
  • tests/postinstall.test.js
  • tests/setup-ui-model.test.js
  • tests/sidecar/setup-window.test.js
  • tests/start-helpers.test.js

- **`async: true`:** Hooks can run in the background without blocking, but then they cannot return decisions.
- **Environment variables:** `$CLAUDE_PROJECT_DIR` for project root, `$CLAUDE_PLUGIN_ROOT` for plugin root.
- **Transcript access:** All hooks receive `transcript_path` — the full conversation JSONL, which can be parsed for historical analysis.
- **Hook types beyond shell:** Claude Code also supports `type: "prompt"` (single-turn LLM evaluation returning yes/no) and `type: "agent"` (subagent with tool access) hooks. These could theoretically replace custom Node.js analysis — e.g., a Stop prompt hook asking "Is Claude stuck in a loop?" However, they add per-turn LLM API costs and latency, making them better suited as a future refinement than as the primary mechanism. See [Future Extensions](#future-extensions) in the design spec.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Broken in-doc fragment link.

Line 109 links to #future-extensions, but that heading is not present in this document, so the anchor resolves incorrectly.

Possible fix
- - Hook types beyond shell: Claude Code also supports `type: "prompt"` (single-turn LLM evaluation returning yes/no) and `type: "agent"` (subagent with tool access) hooks. These could theoretically replace custom Node.js analysis — e.g., a Stop prompt hook asking "Is Claude stuck in a loop?" However, they add per-turn LLM API costs and latency, making them better suited as a future refinement than as the primary mechanism. See [Future Extensions](`#future-extensions`) in the design spec.
+ - Hook types beyond shell: Claude Code also supports `type: "prompt"` (single-turn LLM evaluation returning yes/no) and `type: "agent"` (subagent with tool access) hooks. These could theoretically replace custom Node.js analysis — e.g., a Stop prompt hook asking "Is Claude stuck in a loop?" However, they add per-turn LLM API costs and latency, making them better suited as a future refinement than as the primary mechanism. See the "Future Extensions" section in the design spec.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- **Hook types beyond shell:** Claude Code also supports `type: "prompt"` (single-turn LLM evaluation returning yes/no) and `type: "agent"` (subagent with tool access) hooks. These could theoretically replace custom Node.js analysis — e.g., a Stop prompt hook asking "Is Claude stuck in a loop?" However, they add per-turn LLM API costs and latency, making them better suited as a future refinement than as the primary mechanism. See [Future Extensions](#future-extensions) in the design spec.
- **Hook types beyond shell:** Claude Code also supports `type: "prompt"` (single-turn LLM evaluation returning yes/no) and `type: "agent"` (subagent with tool access) hooks. These could theoretically replace custom Node.js analysis — e.g., a Stop prompt hook asking "Is Claude stuck in a loop?" However, they add per-turn LLM API costs and latency, making them better suited as a future refinement than as the primary mechanism. See the "Future Extensions" section in the design spec.
🧰 Tools
🪛 LanguageTool

[style] ~109-~109: Consider using a shorter alternative to avoid wordiness.
Context: ...add per-turn LLM API costs and latency, making them better suited as a future refinement than as t...

(MADE_IT_JJR)

🪛 markdownlint-cli2 (0.21.0)

[warning] 109-109: Link fragments should be valid

(MD051, link-fragments)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/activity-monitoring-research.md` at line 109, The in-doc anchor [Future
Extensions](`#future-extensions`) referenced in the sentence about hook types
("See [Future Extensions](`#future-extensions`) in the design spec.") is broken
because that heading doesn't exist; fix it by either adding a matching heading
"## Future Extensions" (or equivalent) to docs/activity-monitoring-research.md
or by updating the link text/anchor to point to an existing section (e.g.,
change the link target to an existing heading name); ensure the link text and
the actual heading slug match so the anchor resolves correctly.

Comment on lines +120 to +125
```
PostToolUse → post-tool-use.sh → append event to JSONL + inline checks
PostToolUseFailure → post-failure.sh → append event to JSONL + inline checks
Stop → stop-hook.sh → read accumulated JSONL + pattern analysis via jq/awk
PreToolUse(Bash) → pre-bash.sh → check for git commit/push patterns
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Opening fenced blocks are missing language tags.

Add a language (e.g., text) to opening fences so linting and rendering stay consistent.

Suggested fix
-```
+```text
 PostToolUse → post-tool-use.sh → append event to JSONL + inline checks
 PostToolUseFailure → post-failure.sh → append event to JSONL + inline checks
 Stop → stop-hook.sh → read accumulated JSONL + pattern analysis via jq/awk
 PreToolUse(Bash) → pre-bash.sh → check for git commit/push patterns

```diff
-```
+```text
 PostToolUse → post-tool-use.sh → fast event append + quick BMAD check
 PostToolUseFailure → post-failure.sh → fast event append
 Stop → stop-hook.sh → node hooks/analyze-patterns.js → block/allow
 PreToolUse(Bash) → pre-bash.sh → check for git commit/push/pr

```diff
-```
+```text
 SessionStart → start-watcher.sh → spawn background Node.js process
 PostToolUse → post-tool-use.sh → append event (watcher also reads transcript)
 Stop → stop-hook.sh → read watcher's recommendation file
 SessionEnd → cleanup-watcher.sh → kill the daemon
</details>


Also applies to: 175-180, 240-245

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.21.0)</summary>

[warning] 120-120: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/activity-monitoring-research.md around lines 120 - 125, The fenced code
blocks in docs/activity-monitoring-research.md are missing language tags; update
each opening triple-backtick fence for the blocks that contain the PostToolUse /
PostToolUseFailure / Stop / PreToolUse(Bash) and SessionStart/SessionEnd
summaries (around the blocks containing the lines with "PostToolUse →
post-tool-use.sh …", the alternate block with "fast event append …", and the
SessionStart/PostToolUse/Stop/SessionEnd block) to include a language tag such
as text (i.e., change totext) so the markdown linter and renderer treat
them consistently.


</details>

<!-- fingerprinting:phantom:poseidon:hawk -->

<!-- This is an auto-generated comment by CodeRabbit -->

Comment on lines +296 to +300
| Hook | Pattern | Signal | Detection Method |
|------|---------|--------|-----------------|
| PreToolUse | Pre-commit gate | Bash command matches `git commit`, `git push`, `gh pr create` | Regex on `tool_input.command`: `/^\s*(git\s+(commit|push)|gh\s+pr\s+create)/`. Fast shell check. |
| PostToolUse | Staged files | `git add` detected | Regex on Bash `tool_input.command` for `git add`. Set a flag in the event file so the Stop hook knows files were staged. |

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Regex example breaks markdown table parsing.

Line 298 contains unescaped | in a table cell, which creates extra columns. Escape pipes inside the regex/code span.

Suggested fix
-| PreToolUse | Pre-commit gate | Bash command matches `git commit`, `git push`, `gh pr create` | Regex on `tool_input.command`: `/^\s*(git\s+(commit|push)|gh\s+pr\s+create)/`. Fast shell check. |
+| PreToolUse | Pre-commit gate | Bash command matches `git commit`, `git push`, `gh pr create` | Regex on `tool_input.command`: `/^\s*(git\s+(commit\|push)\|gh\s+pr\s+create)/`. Fast shell check. |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| Hook | Pattern | Signal | Detection Method |
|------|---------|--------|-----------------|
| PreToolUse | Pre-commit gate | Bash command matches `git commit`, `git push`, `gh pr create` | Regex on `tool_input.command`: `/^\s*(git\s+(commit|push)|gh\s+pr\s+create)/`. Fast shell check. |
| PostToolUse | Staged files | `git add` detected | Regex on Bash `tool_input.command` for `git add`. Set a flag in the event file so the Stop hook knows files were staged. |
| Hook | Pattern | Signal | Detection Method |
|------|---------|--------|-----------------|
| PreToolUse | Pre-commit gate | Bash command matches `git commit`, `git push`, `gh pr create` | Regex on `tool_input.command`: `/^\s*(git\s+(commit\|push)\|gh\s+pr\s+create)/`. Fast shell check. |
| PostToolUse | Staged files | `git add` detected | Regex on Bash `tool_input.command` for `git add`. Set a flag in the event file so the Stop hook knows files were staged. |
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 298-298: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/activity-monitoring-research.md` around lines 296 - 300, The markdown
table row containing the regex `/^\s*(git\s+(commit|push)|gh\s+pr\s+create)/`
(under the Detection Method column) is breaking table parsing due to unescaped
pipe characters; fix by escaping pipe characters inside the inline code spans
(e.g., replace `|` with `\|` or use the HTML entity &#124;) or move the regex
into a fenced code block so the `|` does not split table cells; apply the same
treatment for the `git add` regex in the PostToolUse row so both regex/code
spans no longer create extra table columns.

Comment on lines +17 to 20
- **Trusted Publisher**: Configured on npm for `jrenaldi79/sidecar` + `publish.yml` (OIDC-based provenance attestation)
- **NPM_TOKEN**: Granular automation token stored as GitHub secret (scoped to `claude-sidecar`, required for `npm publish`)
- **OIDC provenance**: `--provenance` flag adds Sigstore attestation (requires `id-token: write` permission)
- **Trusted publisher config**: https://www.npmjs.com/package/claude-sidecar/access (Settings tab)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Publishing setup section is internally inconsistent.

This section currently describes Trusted Publishing (OIDC) while also stating NPM_TOKEN is required for publish. Please separate primary vs fallback publish paths (or remove the conflicting requirement) so release steps are unambiguous.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/publishing.md` around lines 17 - 20, The Publishing setup mixes two
different flows (OIDC/Trusted Publisher and token-based publish), causing
confusion; update the docs to clearly separate the primary OIDC Trusted
Publisher flow (mentioning publish.yml, id-token: write, and --provenance for
sigstore attestation) from a fallback token-based flow (describe when NPM_TOKEN
is required, its scope and that it is stored as a GitHub secret), and remove any
statements that imply both are simultaneously required (e.g., the line
mentioning NPM_TOKEN alongside Trusted Publisher and the claude-sidecar access
URL) so the release steps become unambiguous about which credentials are used in
each path.

Comment on lines +61 to +63
if (inlineValue !== undefined) {
logger.warn(`Flag --${key} is boolean; ignoring value '${inlineValue}'`);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

--on= / --off= currently hit the master switch.

The new --flag=value parsing doesn't extend to these flags because the inline value is discarded as if they were plain booleans. sidecar auto-skills --off=security will therefore behave like bare --off and disable everything instead of just the security skill. Either turn the inline value into a positional skill name or reject this syntax explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli.js` around lines 61 - 63, The code currently treats any inline value
for boolean flags as ignored (logger.warn with inlineValue), which causes
"--off=security" to act like a global --off; update the handling in the flag
parsing branch that uses inlineValue and key so that when key === "on" or key
=== "off" you either (a) convert the inlineValue into a positional skill name
and inject it back into the parsed args/skills list (e.g., push inlineValue into
the same array used for skill names), or (b) explicitly reject the syntax by
throwing/logging an error and aborting; implement the chosen behavior inside the
same block that references inlineValue and logger.warn so the check for key ===
"on"/"off" is applied before warning and the inlineValue is not discarded.

Comment on lines +32 to +41
const debugPort = process.env.SIDECAR_DEBUG_PORT;
if (debugPort) {
logger.info('Launching setup window with remote debugging', { debugPort });
} else {
logger.info('Launching setup window');
}

const args = debugPort
? [`--remote-debugging-port=${debugPort}`, mainPath]
: [mainPath];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Validate SIDECAR_DEBUG_PORT before constructing Electron args.

Right now any env value is passed through. Guard for numeric range (1..65535) to avoid invalid launch args.

Suggested patch
-    const debugPort = process.env.SIDECAR_DEBUG_PORT;
+    const rawDebugPort = process.env.SIDECAR_DEBUG_PORT;
+    const parsedDebugPort = Number(rawDebugPort);
+    const debugPort = Number.isInteger(parsedDebugPort) && parsedDebugPort > 0 && parsedDebugPort <= 65535
+      ? String(parsedDebugPort)
+      : undefined;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sidecar/setup-window.js` around lines 32 - 41, The code passes
process.env.SIDECAR_DEBUG_PORT directly into Electron args (const debugPort and
args) without validation; validate SIDECAR_DEBUG_PORT is a positive integer in
range 1..65535 before using it: parse it (e.g., Number or parseInt), check
isFinite/integer and 1 <= value <= 65535, log a warning via logger if invalid
and fall back to not including the remote-debugging arg, and only construct args
with `--remote-debugging-port=${debugPort}` when the validated numericPort is
valid (keep existing logging behavior but include whether the env was ignored).

Comment on lines +145 to +155
// Make config dir read-only so writeFileSync fails
const configPath = path.join(tempDir, 'config.json');
fs.writeFileSync(configPath, '{}');
fs.chmodSync(configPath, 0o444);
fs.chmodSync(tempDir, 0o555);
const { setAutoSkillsEnabled } = loadModule();
const result = setAutoSkillsEnabled(true);
expect(result).toBe(false);
// Restore permissions for cleanup
fs.chmodSync(tempDir, 0o755);
fs.chmodSync(configPath, 0o644);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Write-failure simulation is environment-dependent and can be flaky.

This permission-based approach may behave differently across platforms/runners. Prefer mocking fs.writeFileSync to deterministically throw.

Deterministic test alternative
-      // Make config dir read-only so writeFileSync fails
-      const configPath = path.join(tempDir, 'config.json');
-      fs.writeFileSync(configPath, '{}');
-      fs.chmodSync(configPath, 0o444);
-      fs.chmodSync(tempDir, 0o555);
+      const writeSpy = jest.spyOn(fs, 'writeFileSync')
+        .mockImplementation(() => { throw new Error('EACCES'); });
       const { setAutoSkillsEnabled } = loadModule();
       const result = setAutoSkillsEnabled(true);
       expect(result).toBe(false);
-      // Restore permissions for cleanup
-      fs.chmodSync(tempDir, 0o755);
-      fs.chmodSync(configPath, 0o644);
+      writeSpy.mockRestore();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/auto-skills-config.test.js` around lines 145 - 155, The test currently
simulates a write failure by changing filesystem permissions (in
tests/auto-skills-config.test.js) which is flaky across environments; instead,
mock fs.writeFileSync to deterministically throw when setAutoSkillsEnabled(true)
runs. Replace the chmod/writeFileSync permission block with a mock (e.g.,
jest.spyOn(fs, 'writeFileSync').mockImplementation(() => { throw new Error('mock
write failure') })) before calling loadModule().setAutoSkillsEnabled(true),
assert the function returns false, and then restore the original implementation
(fs.writeFileSync.mockRestore or equivalent) for cleanup to avoid side effects.

- hooks/pre-bash.sh: Don't echo raw bash commands into additionalContext
  (may contain credentials). Extract git operation type instead. Omit
  parentSession when session_id is unknown rather than passing empty string.
- scripts/postinstall.js: Don't overwrite ~/.claude/settings.json when
  existing file is malformed (non-ENOENT errors skip hook registration).
- skill/auto-*/SKILL.md: Remove fallback heuristic that guessed
  parentSession from newest .jsonl file — omit entirely if unknown.
- skill/auto-unblock/SKILL.md: Update stale sidecar_continue explanation
  (no longer forces Build mode in headless).
- src/utils/model-validator.js: Filter out models with missing/empty id
  before matching to prevent downstream errors in normalizeModelId.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
skill/auto-bmad-method-check/SKILL.md (1)

256-261: Optional wording cleanup for readability.

Line 256 repeats sentence openings with “If…”. This is clear but slightly repetitive; a tiny reword would read smoother.

✏️ Suggested wording tweak
-If you spot issues, fix them and note what you adjusted. If the artifact needs deeper review after heavy edits, offer a sidecar consolidation round:
+If you spot issues, fix them and note what you adjusted. For deeper validation after substantial edits, offer a sidecar consolidation round:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skill/auto-bmad-method-check/SKILL.md` around lines 256 - 261, The two
consecutive sentences beginning with "If you spot issues, fix them and note what
you adjusted." and "If the artifact needs deeper review after heavy edits, offer
a sidecar consolidation round:" are repetitive; rephrase to vary the sentence
openings and improve flow—e.g., keep the first imperative about fixing and
noting adjustments, then change the second to "When heavy edits warrant deeper
review, offer a sidecar consolidation round:" (refer to the paragraph that
starts with "If you spot issues..." and the following sentence introducing the
sidecar consolidation text). Ensure the meaning is unchanged and the suggested
sidecar template block remains intact.
skill/auto-unblock/SKILL.md (1)

44-44: Clarify the sidecar_guide invocation phrasing.

The phrase "with the question" suggests passing a parameter to sidecar_guide, but per the tool schema (Context snippet 2), sidecar_guide has an empty inputSchema and accepts no parameters. The intent is to call sidecar_guide and then interpret its response text to determine if the skill is enabled.

✏️ Suggested rewording
-Call `mcp__sidecar__sidecar_guide` with the question "Is auto-skill 'unblock' enabled?" If the guide response indicates the skill is disabled (either master switch off or per-skill disabled), **silently skip this entire skill** — produce no output, no prompt, no side effects. Just stop here.
+Call `mcp__sidecar__sidecar_guide` and check the response to see if auto-skill 'unblock' is enabled. If the guide response indicates the skill is disabled (either master switch off or per-skill disabled), **silently skip this entire skill** — produce no output, no prompt, no side effects. Just stop here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skill/auto-unblock/SKILL.md` at line 44, The doc incorrectly suggests passing
a parameter to mcp__sidecar__sidecar_guide; update the wording to state that you
should call mcp__sidecar__sidecar_guide (no input parameters, per the tool
schema) and then inspect the returned response text for phrases indicating the
skill is disabled (master switch off or per-skill disabled); if the response
indicates disabled, silently stop the skill (produce no output, prompt, or side
effects). Include the exact function name mcp__sidecar__sidecar_guide and
mention that its empty inputSchema means no arguments are passed and the
decision must be driven solely by its response text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skill/auto-security/SKILL.md`:
- Line 144: Replace the unhyphenated phrase "medium severity" with the
hyphenated compound adjective "medium-severity" in the SKILL.md sentence that
currently reads: **If medium severity issues found:** Surface them as warnings
and let the user decide...; update that exact sentence to use "medium-severity"
so it reads "**If medium-severity issues found:** Surface them as warnings..."
to conform to technical prose styling.

---

Nitpick comments:
In `@skill/auto-bmad-method-check/SKILL.md`:
- Around line 256-261: The two consecutive sentences beginning with "If you spot
issues, fix them and note what you adjusted." and "If the artifact needs deeper
review after heavy edits, offer a sidecar consolidation round:" are repetitive;
rephrase to vary the sentence openings and improve flow—e.g., keep the first
imperative about fixing and noting adjustments, then change the second to "When
heavy edits warrant deeper review, offer a sidecar consolidation round:" (refer
to the paragraph that starts with "If you spot issues..." and the following
sentence introducing the sidecar consolidation text). Ensure the meaning is
unchanged and the suggested sidecar template block remains intact.

In `@skill/auto-unblock/SKILL.md`:
- Line 44: The doc incorrectly suggests passing a parameter to
mcp__sidecar__sidecar_guide; update the wording to state that you should call
mcp__sidecar__sidecar_guide (no input parameters, per the tool schema) and then
inspect the returned response text for phrases indicating the skill is disabled
(master switch off or per-skill disabled); if the response indicates disabled,
silently stop the skill (produce no output, prompt, or side effects). Include
the exact function name mcp__sidecar__sidecar_guide and mention that its empty
inputSchema means no arguments are passed and the decision must be driven solely
by its response text.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19aa9de2-8e32-44c1-b1ae-5ec1b830a68c

📥 Commits

Reviewing files that changed from the base of the PR and between 6b7ccb4 and 75d0632.

📒 Files selected for processing (7)
  • hooks/pre-bash.sh
  • scripts/postinstall.js
  • skill/auto-bmad-method-check/SKILL.md
  • skill/auto-review/SKILL.md
  • skill/auto-security/SKILL.md
  • skill/auto-unblock/SKILL.md
  • src/utils/model-validator.js
🚧 Files skipped from review as they are similar to previous changes (3)
  • skill/auto-review/SKILL.md
  • src/utils/model-validator.js
  • hooks/pre-bash.sh

After spawning, poll each sidecar's status using `mcp__sidecar__sidecar_status` with the saved task ID, following the polling cadence indicated by `sidecar_status` responses. Continue polling while status is `running`; treat `complete`, `timeout`, `crashed`, `error`, and `aborted` as terminal. Only then read the output using `mcp__sidecar__sidecar_read`. If multiple models were used, label each result (e.g., "Gemini audit:", "GPT audit:"). Then:

- **If critical/high severity issues found:** **Do not proceed with the commit.** Surface the findings: "Security scan found issues that should be addressed before committing:" — then list them and offer to fix.
- **If medium severity issues found:** Surface them as warnings and let the user decide: "Security scan flagged some medium-severity items. Want to address these before committing, or proceed?"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use hyphenated compound adjective.

At Line 144, use “medium-severity” for cleaner technical prose.

✏️ Proposed micro-fix
-- **If medium severity issues found:** Surface them as warnings and let the user decide: "Security scan flagged some medium-severity items. Want to address these before committing, or proceed?"
+- **If medium-severity issues found:** Surface them as warnings and let the user decide: "Security scan flagged some medium-severity items. Want to address these before committing, or proceed?"
🧰 Tools
🪛 LanguageTool

[grammar] ~144-~144: Use a hyphen to join words.
Context: ...ist them and offer to fix. - If medium severity issues found: Surface them as...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skill/auto-security/SKILL.md` at line 144, Replace the unhyphenated phrase
"medium severity" with the hyphenated compound adjective "medium-severity" in
the SKILL.md sentence that currently reads: **If medium severity issues found:**
Surface them as warnings and let the user decide...; update that exact sentence
to use "medium-severity" so it reads "**If medium-severity issues found:**
Surface them as warnings..." to conform to technical prose styling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant