Add test-codex-changes Claude Code skill #865
Shared skill that tells an agent how to check out a branch in a worktree, build and launch the extension inside Codex.app, test the affected UI surfaces, and post findings to the PR. Instructions cover only the non-discoverable mechanics so the agent drives the UI with fresh eyes. Also carves out .claude/skills/ from the .claude/ gitignore so the skill is shareable, while keeping settings.local.json etc. still ignored. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
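The gitignore carve-out described in this commit could look roughly like the fragment below. This is a sketch, not the repository's actual `.gitignore`, whose exact patterns are not shown here. It assumes the parent rule is written as `.claude/*` rather than `.claude/`, because git does not descend into a directory that is itself excluded, so a negation beneath it would have no effect.

```gitignore
# Assumed layout, for illustration only.
# Ignore everything directly under .claude/ (per-user state such as
# settings.local.json). The trailing /* matters: ignoring ".claude/" itself
# would stop git from looking inside it, and the rule below would do nothing.
.claude/*

# Re-include the shared skills directory and everything beneath it.
!.claude/skills/
```

With rules of this shape, `.claude/settings.local.json` stays ignored while `.claude/skills/test-codex-changes/SKILL.md` is tracked.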
Test run: used the skill to test this PR itself
What I did
Findings from the sparkle flow
🐌 No latency surprises. LLM generation took ~8s with a toast spinner, which is good feedback.
Self-critique of the skill (from using it)
Nits
Verdict
Skill works as intended for a code-change branch; this docs-only branch is a weak test of it. Worth a human review of the philosophy section to confirm the "no click-by-click walkthrough" stance matches team preference.
Update: after this first pass, the skill was trimmed further (commit |
Removed instructions an agent would know anyway (worktree/gh/diff commands, notes-file conventions, cleanup syntax). Kept the parts that aren't discoverable without this skill: build sequence with its expected mocha warning, Codex.app binary path, com.codex MCP catalog quirk, focus-steal, file→UI mapping table, smart-watch webview workflow, and screenshot attachment via gh gist. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Uploading images via gh is awkward and the workaround paths are unreliable. Remove the section entirely; if an agent wants to show the user something visually, it can open the file natively without needing instructions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Adds `.claude/skills/test-codex-changes/SKILL.md`, which teaches an agent how to check out a branch in a worktree, build the extension, launch it inside Codex.app, exercise affected UI surfaces, and post findings to the PR.
- Carves `.claude/skills/` out of the `.claude/` gitignore so the skill is shared; `settings.local.json` and other per-user state stay ignored.

The skill deliberately omits a click-by-click walkthrough: its whole value is fresh-eyes testing. If it scripted the UI, the agent would replay the script and miss UX regressions. It gives only the non-discoverable mechanics (build commands, Codex binary path, `com.codex` access quirk, the focus-steal gotcha, how to post PR comments via `gh`).

Test plan
🤖 Generated with Claude Code