
feat(platform-integrations): unify plugin code under a single canonical source#235

Merged: visahak merged 35 commits into main from feature/unify-plugin-code on May 4, 2026

illeatmyhat (Collaborator) commented Apr 29, 2026

Unifies the four hand-edited plugin copies under platform-integrations/ (bob, claude, codex, claw-code) behind a single canonical source at plugin-source/, rendered by a Python+Jinja2 build script. platform-integrations/ is treated as generated output, with a render-equality gate enforced by pre-commit and CI.

Implements the design captured in #219.

What changed

  • plugin-source/ is the source of truth. A skill's SKILL.md.j2, its scripts, and its descriptions all live in one place; per-platform output is fanned out by plugin-source/build_plugins.py with a per-platform Jinja context (forked-context flags, skill-dir paths, etc.).
  • Per-host plugin metadata generated from plugin-source/plugin.toml. Each host's plugin.json (or absence thereof) is projected from a single TOML; per-host extras live in [claude] / [claw-code] / [codex] tables.
  • Per-platform routing via _<platform>/ prefix. Anything under plugin-source/_<platform>/... ships only to that platform, with the _<platform>/ prefix stripped from the output. This is how single-platform artifacts live alongside the universal sources without a separate manifest:
    • _bob/custom_modes.yaml — bob's mandatory workflow definition
    • _claude/hooks/hooks.json: Stop / UserPromptSubmit / SessionStart hooks
    • _claw-code/hooks/retrieve_entities.sh — optional PreToolUse hook
    • _<platform>/README.md — each platform's plugin-facing README
  • Bob commands auto-generated 1:1 from skills. No static plugin-source/commands/ directory; _bob_command_targets() walks the skill folders and emits one evolve-lite-<skill>.md per skill. Frontmatter uses only description (pulled from the skill's SKILL.md frontmatter — bob's command schema only honors description / argument-hints); the body references the on-disk folder name (evolve-lite-<skill>, dash form) since that's what bob resolves against. Folders stay colon-free for Windows compatibility.
  • Wipe-before-generate. render_to() wipes each platform's plugin_root before writing, so renamed skills, deleted scripts, or obsolete commands cannot linger as orphans. Together with the routing convention above, this makes platform-integrations/ fully derivable from plugin-source/.
  • Bob skills use the colon form for their identity (name: evolve-lite:<skill> in frontmatter, referenced as evolve-lite:<skill> in prose) while their on-disk folder remains hyphenated (.bob/skills/evolve-lite-<skill>/) for Windows compatibility.
  • Drift enforcement. A plugins-rendered pre-commit hook and CI job run build_plugins.py check, which exits non-zero if committed platform-integrations/ differs from a fresh render. A redesigned 20-test suite at tests/platform_integrations/test_build_pipeline.py pins the headline invariant (render → check is silent), idempotence, orphan wipe, per-platform routing, bob command generation, and drift detection on perturbed/missing files.
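
The routing and wipe-before-generate bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the code in build_plugins.py: `route()` is a hypothetical helper, the `PLATFORMS` tuple simply lists the four hosts named in this PR, and this `render_to()` only borrows the name from the description while its signature and body are assumptions.

```python
import shutil
from pathlib import Path

PLATFORMS = ("bob", "claude", "claw-code", "codex")  # hosts named in this PR


def route(rel_path: str) -> dict[str, str]:
    """Map a path relative to plugin-source/ to {platform: output path}.

    Assumed rule, taken from the description above: a leading `_<platform>/`
    component ships the file to that platform only, with the prefix stripped;
    anything else fans out to every platform unchanged.
    """
    first, _, rest = rel_path.partition("/")
    if first.startswith("_") and first[1:] in PLATFORMS and rest:
        return {first[1:]: rest}
    return {platform: rel_path for platform in PLATFORMS}


def render_to(plugin_root: Path, files: dict[str, bytes]) -> None:
    """Wipe plugin_root, then write `files` (output path -> content) into it."""
    if plugin_root.exists():
        shutil.rmtree(plugin_root)  # wipe first so orphans cannot survive
    for rel, content in files.items():
        target = plugin_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
```

With this rule, `route("_bob/custom_modes.yaml")` yields a single bob target, while a shared path such as `lib/config.py` yields one target per platform. Because the wipe happens before any write, a file that is no longer produced by the source tree disappears from the output on the next render.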

How to validate locally

  1. Generate the plugins — write plugin-source/ out to platform-integrations/:

    just compile-plugins
    # or, equivalently:
    uv run python plugin-source/build_plugins.py render

    Expect every per-platform tree under platform-integrations/<host>/ to be regenerated.

  2. Verify no drift — confirm the committed tree matches a fresh render:

    just check-plugins-rendered
    # or:
    uv run python plugin-source/build_plugins.py check

    Silent on success; exits non-zero with a drift: / missing managed file: message otherwise. The same check runs in pre-commit and CI.

  3. Run the unit tests for the build pipeline (fast, hermetic, ~6s):

    uv run pytest tests/platform_integrations/test_build_pipeline.py -v

  4. Run the cross-platform smoke harness. It runs the install flow plus learn / recall / publish against real CLIs (claude / codex / bob); pass --no-live for an offline run that exercises everything except live model calls:

    # Three-platform smoke, no live API calls (~5 min, no API spend):
    uv run python tests/smoke_skills.py --platform all --no-live --keep
    
    # Single platform with live API (consumes API credits on claude/codex; bob is install-only):
    uv run python tests/smoke_skills.py --platform claude --keep --verbose
    
    # Available platform values: claude, codex, bob, all

    --keep leaves the temp install dir on exit so you can poke at the rendered plugin layout under <tempdir>/<platform>/.

  5. Optional sanity checks on the rendered tree — quickly inspect a few outputs:

    # Bob's slash commands now live as one-file-per-skill, dash-form body, no `name:`:
    cat platform-integrations/bob/evolve-lite/commands/evolve-lite-learn.md
    
    # Claude's hooks flow from _claude/hooks/hooks.json through the renderer unchanged:
    diff plugin-source/_claude/hooks/hooks.json platform-integrations/claude/plugins/evolve-lite/hooks/hooks.json
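
The first sanity check above inspects a generated bob command file. A rough sketch of how one command per skill could be derived is shown here; it is not the actual `_bob_command_targets()` implementation. The frontmatter parser is deliberately naive, the body wording is invented for illustration, and the assumption that each skill folder holds a `SKILL.md.j2` with a plain `description:` line comes from the file list in this PR.

```python
from pathlib import Path


def read_description(skill_md: str) -> str:
    """Pull `description:` out of a leading `---` frontmatter block (naive parse)."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip()
    return ""


def bob_command_targets(skills_root: Path) -> dict[str, str]:
    """Return {command filename: file content}, one entry per skill folder."""
    targets = {}
    for skill_dir in sorted(p for p in skills_root.iterdir() if p.is_dir()):
        folder = f"evolve-lite-{skill_dir.name}"  # dash form, colon-free
        source = (skill_dir / "SKILL.md.j2").read_text(encoding="utf-8")
        description = read_description(source)
        body = f"Use the `{folder}` skill.\n"  # hypothetical body wording
        # Frontmatter carries only `description`, as the PR description states.
        targets[f"{folder}.md"] = f"---\ndescription: {description}\n---\n\n{body}"
    return targets
```

Deriving the command list from the skill folders means adding or renaming a skill needs no second edit elsewhere, which is the point of dropping the static commands directory.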

Notes for reviewers

  • bob's smoke is install-only — bob does not currently expose a non-interactive slash-command primitive, so the harness can verify install presence but cannot exercise the skills end-to-end on bob. This short-circuit is documented inline in tests/smoke_skills.py.
  • The migration is in-place, byte-equivalent at the renderer output, and reviewable per-commit; the older commit-by-commit migration log was elided from this description for readability.

🤖 Generated with Claude Code

Captures the design that came out of the planning session for #219:
treat platform-integrations/ as generated output from a new
plugin-source/ canonical tree, rendered via Jinja2, with a CI gate
enforcing render-equality. Records the alternatives weighed
(symlinks, separate repo, gitignored output, Go/Rust tooling) and
their rejection reasons so the decision isn't relitigated later.

Establishes docs/adr/ as the project's ADR home.

Refs #219

coderabbitai (Bot) commented Apr 29, 2026

Important

Review skipped

Too many files!

This PR contains 174 files, which is 24 over the limit of 150.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5ecbc6de-feee-4b58-9c9d-1431eed48a73

📥 Commits

Reviewing files that changed from the base of the PR and between 3996a2e and 9fe5f71.

📒 Files selected for processing (174)
  • .github/workflows/check-code.yaml
  • .gitignore
  • .pre-commit-config.yaml
  • justfile
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-learn.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-publish.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-recall.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-save-trajectory.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-save.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-subscribe.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-sync.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite-unsubscribe.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:learn.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:publish.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:recall.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:save-trajectory.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:subscribe.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:sync.md
  • platform-integrations/bob/evolve-lite/commands/evolve-lite:unsubscribe.md
  • platform-integrations/bob/evolve-lite/custom_modes.yaml
  • platform-integrations/bob/evolve-lite/lib/__init__.py
  • platform-integrations/bob/evolve-lite/lib/audit.py
  • platform-integrations/bob/evolve-lite/lib/config.py
  • platform-integrations/bob/evolve-lite/lib/entity_io.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-learn/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-learn/scripts/on_stop.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-learn/scripts/on_stop.sh
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-learn/scripts/save_entities.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-publish/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-publish/scripts/publish.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-recall/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-recall/scripts/retrieve_entities.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-save-trajectory/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-save-trajectory/scripts/on_stop.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-save-trajectory/scripts/save_trajectory.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-save/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-subscribe/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-subscribe/scripts/subscribe.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-sync/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-sync/scripts/sync.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-unsubscribe/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite-unsubscribe/scripts/unsubscribe.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite:learn/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite:recall/SKILL.md
  • platform-integrations/bob/evolve-lite/skills/evolve-lite:recall/scripts/retrieve_entities.py
  • platform-integrations/bob/evolve-lite/skills/evolve-lite:save-trajectory/scripts/save_trajectory.py
  • platform-integrations/claude/plugins/evolve-lite/.claude-plugin/plugin.json
  • platform-integrations/claude/plugins/evolve-lite/hooks/hooks.json
  • platform-integrations/claude/plugins/evolve-lite/lib/config.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/learn/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/learn/scripts/on_stop.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/learn/scripts/on_stop.sh
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/learn/scripts/save_entities.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/publish/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/publish/scripts/publish.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/recall/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/recall/scripts/retrieve_entities.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/save-trajectory/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/save-trajectory/scripts/on_stop.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/save-trajectory/scripts/save_trajectory.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/save/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/subscribe/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/subscribe/scripts/subscribe.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/sync/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/sync/scripts/sync.py
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/unsubscribe/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/evolve-lite/unsubscribe/scripts/unsubscribe.py
  • platform-integrations/claude/plugins/evolve-lite/skills/learn/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/publish/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/recall/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/recall/scripts/retrieve_entities.py
  • platform-integrations/claude/plugins/evolve-lite/skills/subscribe/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/sync/SKILL.md
  • platform-integrations/claude/plugins/evolve-lite/skills/unsubscribe/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/.claude-plugin/plugin.json
  • platform-integrations/claw-code/plugins/evolve-lite/hooks/retrieve_entities.sh
  • platform-integrations/claw-code/plugins/evolve-lite/lib/config.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/learn/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/learn/scripts/on_stop.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/learn/scripts/on_stop.sh
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/learn/scripts/save_entities.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/publish/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/publish/scripts/publish.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/recall/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/recall/scripts/retrieve_entities.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/save-trajectory/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/save-trajectory/scripts/on_stop.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/save-trajectory/scripts/save_trajectory.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/save/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/subscribe/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/subscribe/scripts/subscribe.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/sync/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/sync/scripts/sync.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/unsubscribe/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/evolve-lite/unsubscribe/scripts/unsubscribe.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/learn/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/publish/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/recall/SKILL.md
  • platform-integrations/claw-code/plugins/evolve-lite/skills/recall/scripts/retrieve_entities.py
  • platform-integrations/claw-code/plugins/evolve-lite/skills/sync/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/.codex-plugin/plugin.json
  • platform-integrations/codex/plugins/evolve-lite/lib/__init__.py
  • platform-integrations/codex/plugins/evolve-lite/lib/audit.py
  • platform-integrations/codex/plugins/evolve-lite/lib/config.py
  • platform-integrations/codex/plugins/evolve-lite/lib/entity_io.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/learn/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/learn/scripts/on_stop.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/learn/scripts/on_stop.sh
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/learn/scripts/save_entities.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/publish/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/publish/scripts/publish.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/recall/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/recall/scripts/retrieve_entities.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/save-trajectory/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/save-trajectory/scripts/on_stop.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/save-trajectory/scripts/save_trajectory.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/save/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/subscribe/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/subscribe/scripts/subscribe.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/sync/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/sync/scripts/sync.py
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/unsubscribe/SKILL.md
  • platform-integrations/codex/plugins/evolve-lite/skills/evolve-lite/unsubscribe/scripts/unsubscribe.py
  • platform-integrations/install.sh
  • plugin-source/README.md
  • plugin-source/_bob/README.md
  • plugin-source/_bob/custom_modes.yaml
  • plugin-source/_claude/README.md
  • plugin-source/_claude/hooks/hooks.json
  • plugin-source/_claw-code/README.md
  • plugin-source/_claw-code/hooks/retrieve_entities.sh
  • plugin-source/_codex/README.md
  • plugin-source/_macros.j2
  • plugin-source/build_plugins.py
  • plugin-source/lib/__init__.py
  • plugin-source/lib/audit.py
  • plugin-source/lib/config.py
  • plugin-source/lib/entity_io.py
  • plugin-source/plugin.toml
  • plugin-source/skills/evolve-lite/learn/SKILL.md.j2
  • plugin-source/skills/evolve-lite/learn/scripts/on_stop.py
  • plugin-source/skills/evolve-lite/learn/scripts/on_stop.sh
  • plugin-source/skills/evolve-lite/learn/scripts/save_entities.py
  • plugin-source/skills/evolve-lite/publish/SKILL.md.j2
  • plugin-source/skills/evolve-lite/publish/scripts/publish.py
  • plugin-source/skills/evolve-lite/recall/SKILL.md.j2
  • plugin-source/skills/evolve-lite/recall/scripts/retrieve_entities.py
  • plugin-source/skills/evolve-lite/save-trajectory/SKILL.md.j2
  • plugin-source/skills/evolve-lite/save-trajectory/scripts/on_stop.py
  • plugin-source/skills/evolve-lite/save-trajectory/scripts/save_trajectory.py
  • plugin-source/skills/evolve-lite/save/SKILL.md.j2
  • plugin-source/skills/evolve-lite/subscribe/SKILL.md.j2
  • plugin-source/skills/evolve-lite/subscribe/scripts/subscribe.py
  • plugin-source/skills/evolve-lite/sync/SKILL.md.j2
  • plugin-source/skills/evolve-lite/sync/scripts/sync.py
  • plugin-source/skills/evolve-lite/unsubscribe/SKILL.md.j2
  • plugin-source/skills/evolve-lite/unsubscribe/scripts/unsubscribe.py
  • pyproject.toml
  • tests/platform_integrations/conftest.py
  • tests/platform_integrations/test_bob_sharing.py
  • tests/platform_integrations/test_build_pipeline.py
  • tests/platform_integrations/test_codex.py
  • tests/platform_integrations/test_codex_sharing.py
  • tests/platform_integrations/test_config.py
  • tests/platform_integrations/test_idempotency.py
  • tests/platform_integrations/test_plugin_structure.py
  • tests/platform_integrations/test_preservation.py
  • tests/platform_integrations/test_publish.py
  • tests/platform_integrations/test_retrieve.py
  • tests/platform_integrations/test_save_entities.py
  • tests/platform_integrations/test_skill_directory_names.py
  • tests/platform_integrations/test_subscribe.py
  • tests/platform_integrations/test_sync.py
  • tests/smoke_skills.py


illeatmyhat and others added 10 commits April 29, 2026 11:25
The PRD on #219 is the canonical record of the design decisions for
this work. A separate ADR file duplicated that content without adding
review value, so it has been removed.

Refs #219
…ipeline

Adds the canonical source tree (plugin-source/) and the build pipeline that
renders it into platform-integrations/. The first managed slice is the four
identical lib/*.py helpers shared by claude and claw-code; the byte-identical
render produces no diff vs the previously committed copies.

What's wired:
- plugin-source/MANIFEST.toml declares platforms (claude, claw-code, codex, bob)
  and the per-file render targets. Verbatim entries only for now; Jinja2 templating
  and per-platform overlays land in subsequent commits.
- scripts/build_plugins.py renders the manifest and detects drift. Stdlib only
  (tomllib, filecmp, shutil); no new project deps.
- justfile gains compile-plugins and check-plugins-rendered recipes.
- Pre-commit gains a plugins-rendered hook scoped to plugin-source/,
  platform-integrations/, and scripts/build_plugins.py.
- CI gains a check-plugins-rendered job.
- tests/platform_integrations/test_build_pipeline.py covers manifest loading,
  full-render output, and drift detection (positive and negative cases).

Codex and bob declare plugin_root entries but no managed files yet — those land
when those platforms' content is migrated in later commits. The existing
install.sh continues to do the runtime lib copy for them in the meantime.
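
For readers unfamiliar with the stdlib approach, drift detection of this kind can be sketched with filecmp alone. This is an illustration under assumptions rather than the code in scripts/build_plugins.py: `find_drift()` is a hypothetical name, it compares two directory trees instead of reading the manifest, and the `drift:` / `missing managed file:` prefixes are borrowed from the PR description.

```python
import filecmp
from pathlib import Path


def find_drift(fresh_root: Path, committed_root: Path) -> list[str]:
    """Compare a fresh render against the committed tree; return problem messages.

    Only files present in the fresh render are checked, mirroring the idea of
    "managed" files. An empty list means the committed tree is up to date.
    """
    problems = []
    for fresh in sorted(p for p in fresh_root.rglob("*") if p.is_file()):
        rel = fresh.relative_to(fresh_root)
        committed = committed_root / rel
        if not committed.is_file():
            problems.append(f"missing managed file: {rel.as_posix()}")
        # shallow=False forces a byte-by-byte comparison instead of trusting
        # size and mtime, which is what a render-equality gate needs.
        elif not filecmp.cmp(fresh, committed, shallow=False):
            problems.append(f"drift: {rel.as_posix()}")
    return problems
```

A command-line wrapper would print each message and exit with a non-zero status when the list is non-empty, so pre-commit and CI can treat any output as a failure.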

Refs #219
…in-source

Sweeps the six skill scripts that are byte-identical between claude and
claw-code today into plugin-source/skills/<name>/scripts/. The render
remains byte-identical to committed platform-integrations/, so this is
a pure source relocation — no behavior change.

Migrated:
- learn/scripts/save_entities.py
- publish/scripts/publish.py
- subscribe/scripts/subscribe.py
- unsubscribe/scripts/unsubscribe.py
- sync/scripts/sync.py
- save-trajectory/scripts/save_trajectory.py

Not yet migrated (left for the Jinja2 commit):
- recall/scripts/retrieve_entities.py — varies across all four platforms.
- learn/scripts/on_stop.py and on_stop.sh — claude-only hooks.
- save-trajectory/scripts/on_stop.py — claude-only hook.
- All SKILL.md files — diverge across platforms.
- codex and bob copies of these scripts — diverge from claude/claw-code due
  to runtime-environment differences (lib path discovery, hook contracts).

Refs #219
Adds Jinja2 rendering for source files ending in .j2. Each platform's
[platforms.<name>] table in MANIFEST.toml now accepts arbitrary keys
beyond plugin_root; everything else is forwarded to the template as a
context variable, alongside `platform = "<name>"`. Verbatim copy
remains the default for non-.j2 sources.
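
The forwarding rule above is small enough to sketch. The two helpers below are hypothetical names, and the platform table is assumed to arrive as a plain dict (which is what a parsed TOML table looks like in Python); the actual rendering step with Jinja2 is left out.

```python
def template_context(platform: str, table: dict) -> dict:
    """Build the Jinja context for one platform from its manifest table.

    Every key except plugin_root is forwarded, and `platform` is always set
    to the platform's own name.
    """
    context = {key: value for key, value in table.items() if key != "plugin_root"}
    context["platform"] = platform
    return context


def output_name(source_name: str) -> str:
    """Templated sources drop the .j2 suffix; everything else is copied verbatim."""
    return source_name.removesuffix(".j2")
```

For example, a claude table holding `plugin_root` and `forked_context = true` would produce a context with `forked_context` and `platform` only, and `SKILL.md.j2` would be written out as `SKILL.md`.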

Demonstrates the mechanism on skills/learn/SKILL.md, the first templated
file. Two real per-platform variations are now expressed in one shared
.j2 template:

- forked_context (bool) — claude wraps learn in a forked execution model
  and needs a "Step 0: Load the Conversation" section that reads the
  stop-hook transcript; claw-code does not. The bool gates a {% if %}
  block plus a small inline phrasing tweak in Step 1.
- save_entities_invocation (str) — claude invokes the save script via
  ${CLAUDE_PLUGIN_ROOT}/...; claw-code does a config-home lookup dance.
  The string is substituted in three places (Method 1/2/3 examples).

Render produces byte-identical output to the previously committed
SKILL.md files for both claude and claw-code; drift gate stays green.

Build-pipeline tests grow a TestJinjaTemplating class that asserts a
shared .j2 source produces platform-specific output; existing tests
updated for the renamed Manifest.platforms attribute (was
platform_roots) and split into "every target rendered" plus "verbatim
files match source byte-for-byte".

This is commit 3a of the migration plan; commit 3b will sweep the
remaining drifted SKILL.md files and the per-platform script variation
(retrieve_entities.py, codex/bob save_entities.py, on_stop.* hooks).

Refs #219
Sweeps the remaining seven SKILL.md files (recall, publish, subscribe,
unsubscribe, sync, save-trajectory, save) into shared .j2 templates that
render byte-identically (modulo one trivial whitespace fix, see below)
to the previously committed claude and claw-code copies.

The dominant per-platform variation across these files is the script
invocation snippet — claude expands ${CLAUDE_PLUGIN_ROOT} via its plugin
runtime; claw-code does a config-home discovery dance wrapped in
sh -lc '...'. Rather than store the long claw-code shell command as a
manifest variable for each skill, this introduces a shared Jinja2
macro (plugin-source/_macros.j2 :: invoke(skill, script, args)) that
emits the platform-appropriate form. `args` accepts None, a string, or
a list — when given a list, claude renders one arg per line with
backslash continuation (matches the existing publish/subscribe
formatting); claw-code stays single-line because the whole command is
inside sh -lc '...'.
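
The argument handling described above can be mirrored in plain Python to make the three cases concrete. This is only an analogue of the Jinja2 macro, not its source: the command prefixes are placeholders (in particular `$PLUGIN_DIR` stands in for the claw-code config-home lookup, which is not reproduced here), and the `platform` parameter is added because a Python function has no template context to read it from.

```python
def invoke(platform: str, skill: str, script: str, args=None) -> str:
    """Emit a platform-appropriate invocation snippet.

    `args` may be None, a single string, or a list of strings.
    """
    if platform == "claude":
        # Placeholder path shape; the real layout is defined by the templates.
        base = f"python3 ${{CLAUDE_PLUGIN_ROOT}}/skills/{skill}/scripts/{script}"
        if args is None:
            return base
        if isinstance(args, str):
            return f"{base} {args}"
        # A list renders one argument per line with backslash continuation.
        return " \\\n  ".join([base, *args])

    # Any other platform is treated like claw-code here: always a single line,
    # because the whole command sits inside sh -lc '...'.
    base = f'python3 "$PLUGIN_DIR/skills/{skill}/scripts/{script}"'
    if args is None:
        arg_list = []
    elif isinstance(args, str):
        arg_list = [args]
    else:
        arg_list = list(args)
    inner = " ".join([base, *arg_list])
    return f"sh -lc '{inner}'"
```

Keeping the branching in one place is the design point: each SKILL.md.j2 states which script to run and with which arguments, and the per-platform shell shape lives in a single definition. Note that this sketch does not escape single quotes inside arguments, which a real sh -lc wrapper would need to handle.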

The remaining variation is captured in the existing forked_context key,
two new per-platform manifest keys, and an inline conditional block in
recall:

- forked_context (bool) — Step 0 of learn loads a forked-context
  transcript on claude; not relevant on claw-code.
- save_example_script_root (str) — placeholder root used in save's
  example invocations (${CLAUDE_PLUGIN_ROOT}/skills vs ~/.claw/skills).
- user_skills_dir (str) — where the save skill writes the new skill
  (~/.claude/skills vs ~/.claw/skills).
- recall's "How It Works" prose differs in step 1-2 wording (claude
  fires on user prompt submit; claw-code fires on PreToolUse) and
  references "Claude" vs "the agent" in two places. Inline {% if %}.

learn/SKILL.md.j2 (introduced in the previous commit) is migrated from
its bespoke `save_entities_invocation` manifest var to the shared
invoke() macro. The save_entities_invocation key is dropped.

One incidental cleanup: save/SKILL.md had four trailing spaces on two
blank lines inside an embedded python code-block example (legacy of an
earlier editor). The .j2 template renders those lines without the
trailing whitespace; the committed claude+claw-code copies are updated
to match. No semantic change.

Codex and bob SKILL.md files are not migrated in this commit — their
prose diverges substantially (different audience LLMs, different
hook contracts) and they need either deeper conditionals or
per-platform overlay files. Those land in commit 3c alongside the
script-synthesis work.

Refs #219
Three files exist only on the claude tree (not on claw-code, codex, or
bob): the forked-context stop hooks for `learn` and `save-trajectory`.
Bringing them under build management uses the per-platform overlay
pattern — manifest entries with `platforms = ["claude"]` and a single
source path under plugin-source/. The renderer emits them only into
the claude tree; the drift gate enforces byte-identity.
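
As a rough illustration of the overlay pattern, the filter below expands one manifest entry into output paths. The entry shape (`source` plus an optional `platforms` list) and the fallback to every declared platform when `platforms` is absent are assumptions made for this sketch; the real manifest schema may differ.

```python
def targets_for(entry: dict, plugin_roots: dict[str, str]) -> list[str]:
    """Expand one manifest entry into output paths.

    If the entry carries a `platforms` list, only those platforms receive the
    file; otherwise (assumed default) every declared platform does.
    """
    platforms = entry.get("platforms", list(plugin_roots))
    return [f"{plugin_roots[name]}/{entry['source']}" for name in platforms]
```

An entry restricted to `["claude"]` therefore produces exactly one target under the claude plugin root, and the drift gate covers it like any other managed file.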

Files: learn/scripts/on_stop.py, learn/scripts/on_stop.sh,
save-trajectory/scripts/on_stop.py.

Mypy now also excludes plugin-source/ (it already excluded
platform-integrations/). The two on_stop.py files share a module name,
which the existing exclusion handled in the rendered tree but not in
the source tree.

Notes on what is NOT in this commit:

- save_entities.py for codex is *not* synthesized in this commit.
  Codex's variant ignores incoming owner/visibility values from stdin
  (see test_codex_sharing.py::test_save_ignores_incoming_owner_and_visibility),
  while claude/claw-code preserve them if set. That is a deliberate
  per-platform security stance, not drift, and collapsing it would
  either change codex behavior or introduce a new behavior-flag knob —
  worth its own PR with explicit user buy-in.

- retrieve_entities.py is also not synthesized here. Beyond the
  lib-path discovery prelude (which the shim pattern would cover), the
  bodies legitimately differ across platforms: claude logs env vars
  and argv for debugging while codex doesn't, codex calls
  find_entities_dir while claude calls find_recall_entity_dirs, and
  the output header text varies. Synthesis warrants a focused commit.

- Codex and bob SKILL.md files remain hand-edited in
  platform-integrations/. Their prose is tuned for different audience
  LLMs and would mostly require Pattern B (per-platform overlay
  files) rather than Jinja2 conditionals; deferring until the broader
  migration shape settles.

Refs #219
Bob is the only platform that used colon-prefixed names on disk
(.bob/skills/evolve-lite:<x>/, .bob/commands/evolve-lite:<x>.md).
Windows treats `:` as a drive separator and rejects it in path
components, so the existing layout couldn't be checked out or
installed on Windows. Other platforms (claude, codex, claw-code)
synthesize the colon namespace from a plugin manifest and don't
have the issue.

Renames every colon-prefixed source path to a hyphen-prefixed name
(evolve-lite-<x>) and updates every reference: bob's custom_modes.yaml
prompt, bob's command-file frontmatter, install.sh's BobInstaller
glob patterns and status output, and the affected tests in
tests/platform_integrations/.
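
The naming rule behind the rename is simple to state in code. Both helpers are hypothetical and only illustrate the convention: the colon form stays usable as a logical name, while anything written to disk uses the hyphen form, and a small guard can flag path components that would fail on Windows.

```python
from pathlib import PurePosixPath


def folder_name(identity: str) -> str:
    """Turn a colon-form name such as evolve-lite:learn into its on-disk form."""
    return identity.replace(":", "-")


def colon_components(path: str) -> list[str]:
    """Return the path components that contain a colon (invalid on Windows)."""
    return [part for part in PurePosixPath(path).parts if ":" in part]
```

A test over the rendered tree could assert that `colon_components()` returns an empty list for every generated path, which would catch a regression before a Windows checkout does.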

User-facing slash-command surface change for Bob users:
/evolve-lite:learn → /evolve-lite-learn (etc). Other platforms are
unchanged because their plugin manifests still synthesize the colon
form for the user-facing namespace.

The sole reference to evolve-lite:recall left intact is in
install.sh's CodexInstaller post-install message — codex's plugin
manifest still produces /evolve-lite:recall as the slash command, so
the hyphenated name there would be wrong.

Pre-existing test failures unrelated to this rename:

- test_bob_sharing.py, test_sync.py, and test_codex_sharing.py
  expect "invalid subscription name" in stdout, but sync.py logs
  "invalid name" to stderr. This drift exists on main (verified
  before the rename) across all three platforms; the same 5 tests fail
  before and after. Out of scope here; it needs its own commit.

The rename FIXES one pre-existing test:
test_skill_directory_names.py::test_bob_lite_skills_follow_naming_convention
(which now matches the new evolve-lite- prefix expectation).

Refs #219
Drops Step 0 of the evolve-lite mode prompt, which used to enumerate
specific .bob/skills/<skill>/SKILL.md paths the agent had to read up
front. The relationship between the mode and the skills it depends on
was largely a coincidence of the prompt — the mode's job is the
workflow contract (recall → work → save-trajectory → learn → complete);
the skill registry is whatever Bob's runtime resolves under
.bob/skills/.

Replaces the path enumeration with a generic instruction to read each
skill's SKILL.md before first invocation. Workflow steps still call
the relevant skills by name (recall, save-trajectory, learn, plus the
optional sharing skills), since the mode's contract is precisely "use
these skills in this order." Names, not paths.

This finishes the migration plan from #219:
1. ✅ Build pipeline + render-equality gate (commit 1)
2. ✅ Migrate identical claude/claw-code skill scripts (commit 2)
3a. ✅ Jinja2 templating + first per-platform .j2 (commit 3a)
3b. ✅ Sweep remaining claude/claw-code SKILL.md prose (commit 3b)
3c. ✅ Claude-only on_stop overlay files (commit 3c)
4. ✅ Bob colon-prefix rename for Windows compat (commit 4)
5. ✅ Decouple custom_modes.yaml from skill paths (this commit)

Followups outside this PR's scope:
- Synthesize codex's save_entities.py and the four-platform
  retrieve_entities.py (real semantic synthesis, deserves focused PR)
- Migrate codex/bob SKILL.md content into plugin-source as Pattern B
  per-platform overlays
- Move claw-code's installed-path convention off colons (separate
  Windows-compat issue, parallel to bob's)
- Resolve the pre-existing "invalid subscription name" stdout/stderr
  drift across claude/codex/bob (5 failing tests on main, untouched
  by this PR)

Refs #219

Resolves four pre-existing test failures across claude, codex, and bob
sync tests that asserted "invalid subscription name" appeared in stdout
when an entry in evolve.config.yaml had an unsafe name (e.g.
'../evil', '.', '..').

Root cause: every platform's sync.py used `normalize_repos(cfg)`,
which routes through `_coerce_repo` in lib/config.py. _coerce_repo
silently filtered invalid entries (after a stray stderr print with a
slightly different phrasing — "ignoring repo entry 'X' — invalid
name") and returned None. The downstream "skipped — invalid
subscription name" branch in each sync.py ran on already-filtered
entries, so it never fired. The user saw "No subscriptions
configured" and a stderr log with a different message; the tests saw
neither in stdout.

Fix:

- lib/config.py: drop the stderr prints inside _coerce_repo. A library
  function should not print on its own (callers, not the lib, should
  decide where to surface a rejection). Add `classify_repo_entry`,
  which returns (repo, rejection) for one raw entry, where exactly one
  of the two is non-None, so callers can iterate raw `cfg["repos"]`
  and report rejections per their own UX.

- claude/claw-code/codex/bob sync.py: replace `normalize_repos(cfg)`
  with manual iteration over raw entries via classify_repo_entry.
  Rejection reasons are added to the same `summaries` list that
  already collects per-repo sync results, so they appear in the
  user-visible "Synced N repo(s): …" stdout line. Dedup by name is
  preserved inline.

- test_config.py::test_invalid_scope_entries_dropped: replaced its
  capsys assertion (which depended on the now-removed stderr print)
  with a direct call to classify_repo_entry that returns the same
  rejection reason structurally.
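
A minimal sketch of the (repo, rejection) contract described above. Only the name `classify_repo_entry` and the "exactly one is non-None" rule come from this commit; the validation rules in `is_safe_name`, the message wording, and the `summarize` helper are assumptions for illustration, not the code in lib/config.py.

```python
from typing import List, Optional, Tuple


def is_safe_name(name: object) -> bool:
    # Hypothetical stand-in for the lib's real name validation.
    return (
        isinstance(name, str)
        and name not in ("", ".", "..")
        and "/" not in name
        and "\\" not in name
    )


def classify_repo_entry(raw: object) -> Tuple[Optional[dict], Optional[str]]:
    """Return (repo, rejection); exactly one of the two is non-None."""
    if not isinstance(raw, dict):
        return None, "skipped, entry is not a mapping"
    name = raw.get("name")
    if not is_safe_name(name):
        return None, f"{name!r}: skipped, invalid subscription name"
    return dict(raw), None


def summarize(cfg: dict) -> List[str]:
    """Iterate raw entries, collecting rejections next to per-repo results."""
    summaries: List[str] = []
    seen = set()
    for raw in cfg.get("repos", []):
        repo, rejection = classify_repo_entry(raw)
        if rejection is not None:
            summaries.append(rejection)  # caller decides where this surfaces
            continue
        if repo["name"] in seen:  # dedup by name, done inline
            continue
        seen.add(repo["name"])
        summaries.append(f"{repo['name']}: ok")
    return summaries
```

Because the library no longer prints, each sync.py owns the wording and the output stream for a rejection, which is what lets the message land in stdout where the tests look for it.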

Test impact:

- Fixes test_sync.py::test_skips_invalid_subscription_name
- Fixes test_bob_sharing.py::test_skips_invalid_subscription_name
- Fixes test_bob_sharing.py::test_rejects_dot_and_double_dot_names
- Fixes test_codex_sharing.py::test_sync_skips_invalid_subscription_name
- One pre-existing failure remains: test_subscribe_warns_when_audit_write_fails
  in test_codex_sharing.py. That test asserts subscribe.py warns and
  continues when the audit log can't be written; the current
  subscribe.py rolls back and exits 1 (claude and codex both). That's
  a separate design decision (fail-open UX vs fail-closed security)
  that deserves its own focused commit.

Refs #219

Replaces the SKILL.codex.md / SKILL.bob.md per-platform-overlay
approach (the dropped c6c76a0) with a single SKILL.md.j2 per skill
that renders for all four platforms. Codex's prose is the canonical
base — it is the most refined / production-tested variant — and
Jinja2 branches handle the genuinely platform-specific bits.

What this does for each cross-platform skill (learn, publish, recall,
subscribe, sync, unsubscribe):

  - Frontmatter description switches to codex's trigger-oriented
    wording across all platforms (claude/claw-code/bob previously
    carried a more passive "Analyze ..." description).
  - claude keeps `context: fork` in the frontmatter via a Jinja branch.
  - learn keeps Step 0 (forked-context transcript loading) for claude
    only via the existing `forked_context` flag.
  - recall adopts codex's "Required Action / Completion Rule /
    Required Visible Completion Note / Failure Conditions" guards on
    every platform, with a per-platform "How It Works" branch that
    describes claude's UserPromptSubmit hook, claw-code's PreToolUse
    hook, codex's optional codex_hooks integration, and bob's manual
    workflow respectively.
  - sync gains a "Notes" implementation-detail section sourced from
    bob's prose (additive, applies to all platforms).
  - unsubscribe keeps the claude/claw-code-only `--force` addendum
    inside a `{% if platform in ["claude", "claw-code"] %}` branch
    because only those platforms' unsubscribe.py refuses to remove a
    write-scope clone without it.

save-trajectory now also renders for bob (codex has no
save-trajectory skill). The Write+temp-file pattern from claude
applies to bob too — bob's prior heredoc form had the same escaping
fragility claude's note warned against.

The macro layer (_macros.j2):

  - `invoke(skill, script, args)` gains codex and bob branches:
      codex → python3 "$(git rev-parse --show-toplevel ...)/plugins/.../<script>"
      bob   → python3 .bob/skills/evolve-lite-<skill>/scripts/<script>
    Codex paths now standardise on the git-rev-parse form (codex's
    pre-existing prose mixed that with bare relative paths).
  - new `skill_ref(name)` macro expands to the platform-appropriate
    cross-reference syntax: `/evolve-lite:<name>` for claude /
    claw-code, `evolve-lite:<name>` for codex, `evolve-lite-<name>`
    for bob.

MANIFEST.toml:

  - Adds `forked_context = false` to the codex and bob platform
    tables so StrictUndefined doesn't trip on the `{% if
    forked_context %}` branch in learn.
  - For each cross-platform skill, the [[files]] entries collapse
    from "claude/claw-code .j2 + codex overlay + bob overlay" (3
    sources) into a single source with two target rows — one for
    [claude, claw-code, codex] hitting `skills/<skill>/SKILL.md`,
    one for [bob] hitting `skills/evolve-lite-<skill>/SKILL.md`
    (post-rename folder).

The codex/bob/claude/claw-code on-disk SKILL.md outputs are now all
freshly rendered from these unified sources. The drift gate
(`just check-plugins-rendered`) is green; platform_integrations
tests still pass at 307/308 (the same pre-existing
`test_subscribe_warns_when_audit_write_fails` failure tracked
elsewhere).

Refs #219

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@illeatmyhat illeatmyhat force-pushed the feature/unify-plugin-code branch 2 times, most recently from c6c76a0 to 2363dc5 Compare April 29, 2026 20:34
illeatmyhat and others added 15 commits April 29, 2026 14:36
…antics

All four platforms now render save_entities.py from a single source in
plugin-source/. The unified script adopts codex's strict-overwrite
ownership stamping verbatim:

    entity["owner"] = args.user or "unknown"
    entity["visibility"] = "private"

This replaces the older preserve-if-set form claude/claw-code carried:

    if args.user and not entity.get("owner"):
        entity["owner"] = args.user
    if not entity.get("visibility"):
        entity["visibility"] = "private"

Why strict wins, per the timeline: claude's preserve-if-set form
landed 2026-04-21 in #188 (6f79732 "feat(evolve-lite): add entity
sharing skills and CI tests"). Codex's strict form landed 2026-04-22
in #196 (cd4204c "feat(codex): add lite sharing skills and
session-start sync"), whose commit body explicitly lists "fix(codex):
tighten sharing script safeguards" and "fix(codex): harden sharing
scripts and tests". Codex was a deliberate second pass on the same
script after the spoofing risk was identified — untrusted upstream
input (a prompt-injected agent) must not be able to dictate `owner`
or `visibility` on the resulting on-disk entity.

The strict semantics are pinned by
test_codex_sharing.py::test_save_ignores_incoming_owner_and_visibility,
which still passes against the unified source. No test pinned the
preserve-if-set behavior on the claude/claw-code side, so dropping
that branch costs nothing observable and closes the spoofing vector
on those two platforms as well.

Lib-path discovery is also unified: the walk-up loop checks
`<ancestor>/lib` (claude / claw-code / codex installed layout),
`<ancestor>/evolve-lib` (bob's installed layout), and the existing
`<ancestor>/platform-integrations/claude/plugins/evolve-lite/lib`
monorepo-dev fallback that codex's variant carried. One discovery
prelude works for every platform, no Jinja branching needed.
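
The walk-up described above could look roughly like the sketch below. The three candidate locations are the ones named in this commit; the function name `find_lib_dir`, the check order within one ancestor, and the choice to return the directory (rather than mutate sys.path) are assumptions.

```python
from pathlib import Path
from typing import Optional


def find_lib_dir(start: Path) -> Optional[Path]:
    """Walk up from `start`; return the first lib directory found, else None."""
    for ancestor in [start, *start.parents]:
        candidates = (
            ancestor / "lib",         # claude / claw-code / codex installed layout
            ancestor / "evolve-lib",  # bob's installed layout
            # monorepo-dev fallback carried over from codex's variant
            ancestor / "platform-integrations" / "claude" / "plugins" / "evolve-lite" / "lib",
        )
        for candidate in candidates:
            if candidate.is_dir():
                return candidate
    return None
```

Nearer ancestors win, so an installed plugin finds its own sibling directory before the walk reaches anything higher in the tree.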

MANIFEST: save_entities.py expands from `["claude", "claw-code"]` to
two entries — one targeting `skills/learn/scripts/save_entities.py`
for [claude, claw-code, codex], one targeting
`skills/evolve-lite-learn/scripts/save_entities.py` for [bob]
(post-rename folder).

Tests: 307/308 platform_integrations pass — same baseline as before
(the one pre-existing failure
test_codex_sharing.py::test_subscribe_warns_when_audit_write_fails
predates this branch).

Refs #219

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

All four platforms now render retrieve_entities.py from a single
source in plugin-source/. The unified script adopts codex's prose
and structure verbatim where the variants diverged, with two
deliberate concessions to preserve other-platform behavior.

Synthesis decisions, by divergence point:

  Lib-path discovery — same walk-up loop as save_entities.py:
  `<ancestor>/lib` (claude / claw-code / codex), `<ancestor>/evolve-lib`
  (bob), and the existing `<ancestor>/platform-integrations/claude/...`
  monorepo-dev fallback. One discovery prelude, no Jinja branching.

  find_recall_entity_dirs vs find_entities_dir — codex's
  `find_entities_dir` wins. Both functions resolve to the same
  canonical `<evolve_dir>/entities` path today, so the multi-root
  list form (claude/claw-code) collapses to the single-dir form with
  no observable behavior change.

  Output header — codex's "## Evolve entities for this task / Review
  these stored entities and apply any that are relevant to the user's
  request:" propagates to all four platforms (claude/claw-code/bob
  previously emitted the shorter "## Entities for this task" form).
  Two test header pins updated to match: SCRIPT_VARIANTS in
  test_retrieve.py and the bob-side assertion in test_bob_sharing.py.

  Item formatting — codex's plain `Rationale:` / `When:` lines win
  over claude/claw-code/bob's italicised `_Rationale: ..._` /
  `_When: ..._` form.

  Subscribed-source detection — codex's relative-path approach
  (`md.relative_to(entities_dir).parts`) wins over the
  search-for-"entities"-in-parts logic claude carried.

  Symlink + .git filtering — preserved as additive defensive features
  even though codex didn't have the .git skip. Skipping git
  bookkeeping when a write-scope clone lives under
  entities/subscribed/{name}/.git/ is the right thing to do, and it
  doesn't conflict with codex's behavior on a clean entities tree.

  Stdin handling — codex's strict "json.load + return on
  JSONDecodeError" is preserved (the
  test_handles_invalid_json_stdin_gracefully test pins this on every
  variant). Empty stdin is treated as "no input, continue with entity
  loading" rather than an error, so bob's manual-invocation path
  (which never pipes anything upstream) keeps working without an
  `echo {}` workaround.

  Argv dump — claude carried a "=== Command-Line Arguments ===" log
  block; codex didn't. Dropped, codex wins.

  "# Made with Bob" footer — dropped.

MANIFEST: retrieve_entities.py adds two entries — one targeting
`skills/recall/scripts/retrieve_entities.py` for [claude, claw-code,
codex], one targeting `skills/evolve-lite-recall/scripts/retrieve_entities.py`
for [bob]. Same shape as save_entities.py from the previous commit.
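
For the stdin-handling decision above, a hedged sketch of the three cases (empty input, valid JSON, invalid JSON). The helper name `read_hook_input` and the use of `None` to signal "bail out" are illustrative assumptions; the unified script may structure this differently.

```python
import io
import json
from typing import Optional, TextIO


def read_hook_input(stream: TextIO) -> Optional[object]:
    """Parse optional hook input.

    Empty stdin -> {} (no input, keep going with entity loading).
    Invalid JSON -> None (the caller returns early).
    """
    text = stream.read()
    if not text.strip():
        return {}
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Usage: bob's manual invocation pipes nothing, hooks pipe a JSON payload.
manual = read_hook_input(io.StringIO(""))
hooked = read_hook_input(io.StringIO('{"prompt": "fix the build"}'))
broken = read_hook_input(io.StringIO("{not json"))
```

Treating empty input as a distinct, non-error case is what removes the need for an `echo {}` workaround on platforms without hooks.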

Tests: 307/308 platform_integrations pass — same baseline (the one
pre-existing failure
test_codex_sharing.py::test_subscribe_warns_when_audit_write_fails
predates this branch).

Refs #219

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ajectory

All five remaining sharing/recall scripts now render from a single
source under plugin-source/. The unified scripts adopt codex's
variants verbatim where codex and claude diverged, with two
mechanical changes per file:

  1. The lib-path-discovery prelude is replaced with the same
     walk-up loop introduced in the save_entities.py and
     retrieve_entities.py commits — checks `<ancestor>/lib`
     (claude/claw-code/codex), `<ancestor>/evolve-lib` (bob), then
     the existing `<ancestor>/platform-integrations/claude/.../lib`
     monorepo-dev fallback codex's variants carried.

  2. The "(Codex)" / "(Bob)" docstring annotations and the trailing
     "# Made with Bob" comment are dropped.

Per-script trade-offs codex-wins introduces on claude/claw-code:

  publish.py
    - No behavior delta vs the prior plugin-source version that
      mattered to existing tests; codex and claude were already very
      close here. Soft-warn-on-audit-failure semantics preserved.

  subscribe.py
    - codex's `project_root` derives from `evolve_dir.resolve()`
      (handles a non-".evolve"-named EVOLVE_DIR) instead of always
      using `str(evolve_dir.resolve().parent)`.
    - codex re-raises rather than printing "Error: failed to record
      subscription — clone removed:" when save_config fails. The
      rollback semantics are unchanged (clone is removed, repos
      list popped); only the user-visible error string differs.
      Updated test_rolls_back_clone_if_config_write_fails to drop
      its message-string check; the rollback behavior it actually
      cares about still passes.
    - Argument help text loses claude's longer descriptions; codex's
      terser arg help propagates.

  sync.py
    - Drops claude's `git -c safe.directory={repo_path}` flag from
      the inner `_git` helper. No test pinned this; its only effect
      is whether sync works inside a repo owned by a different uid
      than the running process (matters in shared-filesystem
      installs, doesn't matter in the test sandbox).
    - Drops claude's head_before / head_after short-circuit and
      always counts a delta after a fetch+rebase/reset; the
      subscribed-base path-traversal check codex carried in the
      main loop is added on top of the lib-level rejection list, so
      both layers of name validation now apply.
    - codex's audit_root indirection (handles a non-".evolve"-named
      EVOLVE_DIR for the audit log path) propagates to all
      platforms.

  unsubscribe.py
    - codex's combined `is_valid_repo_name` + path-traversal check
      replaces claude's two separate-step form. Same observable
      validation; the rejection error string is identical.
    - codex's `project_root` derivation matches the subscribe.py
      change above.

  save_trajectory.py
    - codex has no save-trajectory skill, so the canonical here is
      claude's existing plugin-source variant (lazy log creation,
      atomic O_EXCL claim, file-arg-or-stdin input). Bob's prior
      variant was simpler and used naive `open()`; replacing it
      with the claude version is a strict improvement (handles
      same-second collisions, supports the tmp-file-input pattern
      the SKILL.md prose now describes for all platforms).
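
The "atomic O_EXCL claim" mentioned for save_trajectory.py can be sketched as follows. This shows the general technique only; the function name, the numeric-suffix scheme, and the `.jsonl` extension are assumptions, not the script's actual naming.

```python
import os
from pathlib import Path


def claim_unique_path(directory: Path, stem: str, suffix: str = ".jsonl") -> Path:
    """Atomically create and return a file path nobody else has claimed."""
    directory.mkdir(parents=True, exist_ok=True)
    counter = 0
    while True:
        name = f"{stem}{suffix}" if counter == 0 else f"{stem}-{counter}{suffix}"
        path = directory / name
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so two
            # writers racing on the same name can never both succeed.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            counter += 1
            continue
        os.close(fd)
        return path
```

A plain `open(path, "w")` would silently truncate a file written in the same second; the exclusive create turns that collision into a retry under a new name.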

MANIFEST: each of the five scripts gains a second [[files]] entry
mirroring the save_entities.py / retrieve_entities.py pattern — one
target for [claude, claw-code, codex] under
`skills/<skill>/scripts/<script>.py`, one target for [bob] under
`skills/evolve-lite-<skill>/scripts/<script>.py`.

Tests: 307/308 platform_integrations pass — same baseline (the one
pre-existing failure
test_codex_sharing.py::test_subscribe_warns_when_audit_write_fails
remains; it pins soft-warn audit semantics on subscribe.py while
both claude and codex variants implement hard-fail, which is a
separate fail-open-vs-fail-closed design call).

After this commit every Python script under platform-integrations/<platform>/
is rendered from plugin-source/. The only files still outside build
management are infrastructure that has no unification opportunity:
README.md, .claude-plugin/ / .codex-plugin/ manifests, bob's
commands/ directory, bob's custom_modes.yaml, and the parallel
evolve-full/ plugin tree.

Refs #219

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make `target` optional in MANIFEST.toml [[files]] entries. When omitted
the renderer falls back to source minus a trailing `.j2`. Drops 14 lines
from MANIFEST.toml without changing the rendered output.
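
The fallback rule is small enough to show in full. This is a sketch of the described behavior; `default_target` is a hypothetical name, not necessarily the renderer's.

```python
def default_target(source: str) -> str:
    """Derive the output path from the source path: strip one trailing '.j2'."""
    if source.endswith(".j2"):
        return source[: -len(".j2")]
    return source
```

Under this rule a template such as `skills/learn/SKILL.md.j2` maps to `skills/learn/SKILL.md`, and a plain script keeps its path unchanged.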

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Switch the learn template to {% if forked_context | default(false) %} so
non-claude platforms no longer need to declare forked_context = false
just to satisfy StrictUndefined. Drops three lines from MANIFEST.toml
and makes the platform definitions only declare what differs from the
default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add `[[platforms.<name>.target_rewrites]]`: a list of (regex, replacement)
substitutions the renderer applies to each entry's target path under that
platform. Use it on bob to map `skills/<name>/` to `skills/evolve-lite-<name>/`
so the platform definition (not 14 duplicate manifest entries) carries the
folder-rename rule from commit 07a171c.

Collapses every `[[files]]` pair (one for claude/claw-code/codex, one for
bob's prefixed target) into a single entry that lists every receiving
platform. Drops MANIFEST.toml from 232 lines to 132 with no change to the
rendered output.
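
A rough sketch of how a per-platform rewrite list could be applied to a target path. The regex in `BOB_REWRITES` is a guess at the shape of bob's rule (the actual pattern is not shown in this PR text), and `apply_target_rewrites` is a hypothetical name.

```python
import re
from typing import Iterable, Tuple

# Assumed pattern: map skills/<name>/ to skills/evolve-lite-<name>/ for bob.
BOB_REWRITES = [(r"^skills/([^/]+)/", r"skills/evolve-lite-\1/")]


def apply_target_rewrites(target: str, rewrites: Iterable[Tuple[str, str]]) -> str:
    """Apply each (regex, replacement) pair to the target path, in order."""
    for pattern, replacement in rewrites:
        target = re.sub(pattern, replacement, target)
    return target


bob_target = apply_target_rewrites("skills/learn/SKILL.md", BOB_REWRITES)
```

Platforms with an empty rewrite list get the target back untouched, so one `[[files]]` entry can serve every platform.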

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Make `platforms` optional in [[files]] entries. When omitted the renderer
fans the entry out to every platform declared in the manifest. Drops the
`platforms = ["claude", "claw-code", "codex", "bob"]` line from the 12
fully-shared entries — the common case for skill scripts and SKILL.md
templates after the bob duplicates collapsed.

MANIFEST.toml is now 132 lines (from 232 at the start of this batch); no
change to the rendered output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add bob and codex to the lib/ entries in MANIFEST.toml. Each platform's
plugin tree now ships its own copy of lib/__init__.py, lib/audit.py,
lib/config.py, lib/entity_io.py — codex and bob no longer rely on a
walk-up to claude's monorepo lib.

Simplify the script preludes accordingly: drop the
`platform-integrations/claude/plugins/evolve-lite/lib/` fallback from
the walk-up loop; the local lib/ or evolve-lib/ sibling is always
present now.

Update install.sh — bob now sources its lib from its own plugin tree
instead of reaching into claude's; codex's redundant claude-lib copy
goes away (the plugin copytree already includes lib/).

Drop the PYTHONPATH=claude-lib injection in test_bob_sharing.py — bob's
scripts find their own lib via the walk-up. Tests pass without it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…atform

Drop platform restrictions from the four entries that previously
covered partial subsets:
- save (was claude+claw-code only) → all four
- save-trajectory script + SKILL.md (was missing codex) → all four
- on_stop.py / on_stop.sh hooks (was claude only) → all four

For platforms where these don't have full runtime support today, the
files ship as inert artifacts. Per-platform behavior tightening (e.g.
making save-trajectory work under codex, plumbing on_stop hook contracts
on non-claude platforms) is tracked as follow-up issues.

Add user_skills_dir / save_example_script_root context vars for codex
and bob so the save SKILL.md template renders. The codex/bob prose is
tilted toward project-local skill paths rather than user home — fix
later. Wrap the `context: fork` frontmatter line in the save template
with a claude-only branch (matching save-trajectory's pattern).

Add commands/evolve-lite-save.md to bob's plugin tree to satisfy the
"every skill has a command file" gate now that bob has evolve-lite-save.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Restructure plugin-source/ so every skill folder lives under a shared
`evolve-lite/` parent: plugin-source/skills/<name>/ →
plugin-source/skills/evolve-lite/<name>/. Mirror this in the rendered
output for claude/claw-code/codex; bob keeps its flat
skills/evolve-lite-<name>/ layout via the existing target_rewrite
(pattern updated to match the new source path).

Plugin metadata follows:
- claude/codex plugin.json: skills key now points at ./skills/evolve-lite/
- claw-code plugin.json: gains a `skills` key pointing at the same path
- claude hooks/hooks.json + claw-code hooks/retrieve_entities.sh: shell
  paths inserted with the evolve-lite/ segment
- _macros.j2 invoke() macro: claude and codex paths gain the same
  segment (claw-code uses runtime colon notation independent of source
  layout; bob's flat installed path also unchanged)
- install.sh: codex hook commands rewritten to the new path; status
  output reflects the nested layout

Tests updated mechanically — every hardcoded skills/<name>/ reference
in tests/platform_integrations/ now reads skills/evolve-lite/<name>/.
The bob path-rewrite pattern is exercised end-to-end: source skills
flow through the rewrite and end up at skills/evolve-lite-<name>/
under platform-integrations/bob/.

Tests: 307/308 baseline maintained (the pre-existing
test_subscribe_warns_when_audit_write_fails is unchanged).

Validation note: claude / claw-code / codex plugin loaders are assumed
to honor the `"skills": "./skills/evolve-lite/"` key. Bob's runtime is
unaffected — the rewrite produces the same flat .bob/skills/<name>/
layout as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move per-platform configuration into a PLATFORMS dict at the top of
scripts/build_plugins.py. The renderer now walks plugin-source/ and
fans every file out to every platform — no manifest entries, no
explicit `platforms = [...]` lists, no `target = "..."` overrides.
Files at plugin-source/ root that are not shipped (_macros.j2, README.md)
are listed as RESERVED_SOURCES.

The build pipeline keeps the same public surface (`load_manifest()`,
`render_to()`, `check_drift()`, `Manifest`/`PlatformConfig`/`FileEntry`
dataclasses) so tests and external callers stay working. The MANIFEST_PATH
constant is gone; the perturbation drift test no longer needs to patch it.

Bob's path rewrite stays the only structural divergence — encoded inline
in PLATFORMS as `[(pattern, replacement)]`. Adding a new skill now
requires only creating its directory under plugin-source/skills/evolve-lite/;
the build picks it up automatically.

Tests: 307/308 baseline maintained.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The script and the source tree it walks now live together. Add
build_plugins.py to RESERVED_SOURCES so the renderer skips itself,
and exclude any __pycache__/ directory the interpreter creates from
the source walk.

Update consumer paths in justfile, .pre-commit-config.yaml, the
GitHub Actions workflow, and the test harness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per-platform plugin.json files become generated artifacts of a single
source-of-truth plugin-source/plugin.toml, rendered through pydantic
input + output models. Drift gate covers them alongside the existing
tree-walk.

[plugin] holds host-agnostic metadata (only name + version required);
[claude] / [claw-code] / [codex] tables hold genuinely host-specific
fields. All models are extra="allow", so undeclared TOML keys flow
through: [plugin] extras fan out to every host's top-level, host-table
extras go to that host only, [codex] extras land in codex's interface
block. Bob has no plugin.json output.

Refs #219.
… text

Apply evolve-lite:<skill> as bob's runtime skill name across SKILL.md
frontmatter, _macros.j2 skill_ref, custom_modes.yaml workflow steps, and
the commands/*.md slash-command definitions, so bob's UX matches claude
and codex (`/evolve-lite:learn`). On-disk folder layout stays
hyphenated (.bob/skills/evolve-lite-<skill>/) so the plugin tree
installs cleanly on Windows, which rejects colons in path components.

Also folds in the in-flight learn/recall polish: recall switches to
verbatim entity quoting in forked-context renders (the parent agent
can't see intermediate Read results) and uses ${EVOLVE_DIR:-.evolve}
consistently; learn's Step 0 finds the most recent trajectory by
scanning ${EVOLVE_DIR}/trajectories/ instead of parsing on_stop's
transcript_path marker, so the skill is robust to any trajectory the
save-trajectory hook (or skill) wrote.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move the 8 hand-maintained bob slash-command definitions from
platform-integrations/bob/evolve-lite/commands/ into
plugin-source/commands/ so they're now driven by the same fan-out
build that already covers SKILL.md, scripts, and per-platform metadata.

Adds a target_excludes pattern list to PlatformConfig — claude /
claw-code / codex declare `^commands/` to opt out of the new subtree
since they have their own command surfaces (plugin.json,
$-registry); bob alone keeps it. The renderer skips excluded files
in both render_to and check_drift, so pre-commit drift detection
keeps working without seeing claude/codex as "missing" the bob-only
files.

Output content is byte-identical to the prior hand-maintained
commands directory (this is purely a source-layout move + build-time
filter), verified via `build_plugins.py check`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@illeatmyhat illeatmyhat marked this pull request as ready for review May 1, 2026 17:37
@vinodmut
Contributor

vinodmut commented May 1, 2026

I tested the Bob and Claude Code integrations end-to-end — both work as expected.

One thing I'd like to understand better: how should we shape edits that are mostly shared but have a narrow platform-specific seam? Example: #239 adds audit-log influence tracking — the audit schema, log_influence.py, the Step 4 assessment logic, and the recall-side write are all platform-agnostic and should land once in plugin-source/. The only Claude-specific parts are (a) the claude-transcript_<id>.jsonl filename format we strip to derive session_id, and (b) reading transcript_path from the Claude Code hook payload — the name/shape of that field likely differs on Bob and Codex.

Under the new model, is the intended pattern a shared source with a small Jinja-conditional (or per-platform macro) for the session_id derivation, rather than a full overlay file? The invoke() macro approach in _macros.j2 seems like the right analogue — a session_id_from_hook_input() macro per platform. Wanted to confirm the expected shape before #239 rebases onto this.

@vinodmut vinodmut requested review from vinodmut and visahak May 1, 2026 19:53
vinodmut
vinodmut previously approved these changes May 1, 2026
Contributor

@vinodmut vinodmut left a comment


LGTM. Nice job!

Let’s wait for @visahak to take a look too before merging.

@illeatmyhat
Collaborator Author

@vinodmut I think in general we should try to ship the same script to all agents even if parts of it are irrelevant. Unlike the web, we aren't driven to minimize file sizes.
The SKILL.md format also doesn't have the concept of dependencies, and even if it did we shouldn't rely on it.
So as far as agents interacting with code, we should rely on an abstract interface as much as possible.
If some templating is required from jinja then so be it, but every time we use jinja it will introduce some fragility.

As it turns out, Claude exposes environment variables to scripts that it runs, apparently like

CLAUDECODE=1 # flag that the script is running under Claude Code
CLAUDE_CODE_ENTRYPOINT=cli # how Claude Code was launched
CLAUDE_CODE_EXECPATH=~/.local/share/claude/versions/2.1.126  # path to the running Claude Code binary

The other CLIs probably do too; if they don't, maybe we should add a part to the SKILL.md that identifies the running platform.

@vinodmut
Contributor

vinodmut commented May 1, 2026

Makes sense — runtime env-var detection beats compile-time Jinja conditionals here, and keeps SKILL.md readable. A single log_influence script that checks CLAUDECODE (plus Codex/Bob/Claw-code equivalents) is cleaner than a per-platform macro.

One note for #239: the seam is really "how do I identify the current session's transcript" — Claude gives us transcript_path in the hook-input stdin and a claude-transcript_<id>.jsonl filename; other platforms may expose different keys or lack the concept entirely (in which case the audit no-ops). Worth documenting that detection contract in SKILL.md.

@illeatmyhat
Collaborator Author

illeatmyhat commented May 1, 2026

@vinodmut
Unfortunately I checked in with codex and bob and they do not have an equivalent. There's no good way to detect what platform you're running under in general, so we have to compile the platform in at build time.
We do have to use jinja somewhere, but we have to keep it small to prevent coding agents from getting confused.
So I think the best approach is to add a switch at the top of the script that gets compiled in:

{%- if platform == "claude" -%}
{%- set AGENT = "Claude" -%}
{%- elif platform == "claw-code" -%}
{%- set AGENT = "Claw" -%}
{%- elif platform == "codex" -%}
{%- set AGENT = "Codex" -%}
{%- elif platform == "bob" -%}
{%- set AGENT = "Bob" -%}
{%- endif -%}

@illeatmyhat
Collaborator Author

illeatmyhat commented May 1, 2026

It's true that Bob doesn't even support hooks so we'd have to find a different way to deal with transcripts

@vinodmut
Contributor

vinodmut commented May 1, 2026

+1 — a tiny compiled-in AGENT switch is the right amount of Jinja; keeps the scripts readable to a coding agent looking at one platform's rendered output. For #239, the Claude branch derives session_id from transcript_path; other agents return early until we plumb an equivalent.

Drops the MainThread group from the live region (it was redrawing the
entire view on every orchestrator log line, which stacked duplicate
`── MainThread ──` headers when long lines wrapped past the cursor-up
wipe). Inlines per-skill `✓/✗ name detail` lines into each platform's
section as steps complete, matching the old summary format — and
removes the post-run summary block since the same info now lives in
the sections themselves.

Bob's install-only message also corrected: --resume works upstream
again; the real reason we skip skill execution is that bob has no way
to run slash commands non-interactively from a one-shot prompt.

Removes the line truncation in LiveGroupedHandler in favor of a
wrap-aware redraw. `_last_lines` is now a physical-row count (each
buffered line contributes ceil(len / term_width) rows), so the
cursor-up wipe (\033[nF) still lands on the start of the live region
when lines wrap onto multiple rows. Terminal width is re-read on
every render so window resizes mid-run don't desync the wipe math.
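
The row arithmetic above can be sketched as follows. `physical_rows` and `wipe_sequence` are hypothetical helpers, not LiveGroupedHandler's actual methods; the erase-to-end-of-screen code (`\033[J`) after the cursor-up is an assumption about how the wipe is finished, and `len(line)` ignores wide characters and embedded escape codes.

```python
import math
import shutil
from typing import Iterable


def physical_rows(lines: Iterable[str], term_width: int) -> int:
    """Count terminal rows the lines occupy once long lines wrap."""
    width = max(1, term_width)
    # An empty line still occupies one row.
    return sum(max(1, math.ceil(len(line) / width)) for line in lines)


def wipe_sequence(rows: int) -> str:
    """Move the cursor up `rows` lines to column 1, then clear below it."""
    return f"\033[{rows}F\033[J" if rows > 0 else ""


# Re-read the width on every render so a mid-run resize stays in sync.
current_width = shutil.get_terminal_size().columns
rows_now = physical_rows(["short", "x" * 25], 10)
```

Counting buffered lines instead of physical rows is what left stale headers behind: a 25-character line on a 10-column terminal occupies three rows, not one.
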
@visahak
Collaborator

visahak commented May 4, 2026

Summary

This PR consolidates the Evolve Lite platform integrations under plugin-source/, renders per-host output into platform-integrations/, and adds build/smoke coverage around the generated trees.
The risky surface is the new render/check pipeline plus Bob’s colon-to-dash rename, so I focused on drift detection and upgrade behavior rather than the core Evolve runtime.

Findings

  1. check_drift() does not detect extra generated files, so CI can pass with stale artifacts still committed (confidence: 95/100)
    • Why it matters: renamed or deleted generated skills/commands can linger under platform-integrations/ and still be shipped/reviewed as if the render output were clean.
    • Evidence: plugin-source/build_plugins.py:596-633 only checks expected files for missing/content drift; it never enumerates each plugin root for unexpected extras. I reproduced this by rendering into a temp tree, adding platform-integrations/claude/plugins/evolve-lite/orphan.txt, and rerunning check_drift(); it returned 0.
    • Evidence: the new tests cover orphan removal in render_to() (tests/platform_integrations/test_build_pipeline.py:122-136) and clean fresh renders (tests/platform_integrations/test_build_pipeline.py:101-106), but there is no test that check_drift() fails on extra files.
  2. Bob’s colon-to-dash rename has no upgrade cleanup path, so existing installs keep duplicate or legacy artifacts (confidence: 100/100)
    • Why it matters: users upgrading from the old Bob layout can end up with both evolve-lite:learn and evolve-lite-learn skills/commands at once, and uninstall will still leave the legacy ones behind.
    • Evidence: install copies the new dash-form skills/commands into .bob without removing old names (platform-integrations/install.sh:499-510), while uninstall/status only match evolve-lite-* and ignore legacy evolve-lite:* entries (platform-integrations/install.sh:546-573).
    • Evidence: I reproduced this in a temp directory. After install --platform bob --mode lite, both old and new paths existed (.bob/skills/evolve-lite:learn and .bob/skills/evolve-lite-learn, same for commands). After uninstall --platform bob, the old colon-form skill and command still existed.

Testing

  • uv run pytest -m e2e -v: started under the repo’s .venv/bin/python, selected 185 tests, but I could not complete it cleanly in this environment. Isolating the blocker with uv run pytest -m e2e tests/e2e/test_sandbox_learn_recall.py::test_learn_then_recall_flow -v failed immediately with 401 ... Invalid bearer token from the Claude sandbox, so this looks environment/auth-related rather than PR-specific.
  • Additional validation: uv run pytest tests/platform_integrations/test_build_pipeline.py tests/platform_integrations/test_skill_directory_names.py -v passed (25 passed in 7.52s).
  • Manual validation: confirmed the check_drift() orphan-file false negative and the Bob upgrade/uninstall migration regressions with temp-dir repros.

@visahak
Collaborator

visahak commented May 4, 2026

@illeatmyhat can you address the issue and resolve the conflicts?

…ninstall

Addresses visahak's PR #235 finding: re-running install over the
pre-rename `.bob/skills/evolve-lite:<name>` layout left both the
legacy colon-form and the new dash-form on disk, and uninstall's
`evolve-lite-*` glob ignored anything not in the new namespace.

Adds `BobInstaller._purge_evolve_artifacts(bob_target)` that strips
every `evolve`-prefixed entry from `.bob/skills/`, `.bob/commands/`,
and the `.bob/` root (catches `evolve-lib`, the legacy colon-form
skills/commands, and any future `evolve-*` namespace). Called as
the first step of install (clean upgrade) and from uninstall in
place of the per-glob loops. `status` widens its glob to `evolve*`
so legacy stragglers surface instead of reading ✗ while artifacts
squat on disk. User-owned non-evolve content is untouched.

New regression tests in `TestBobLegacyMigration` mirror visahak's
exact reproduction (legacy skill + command at install time, post-
install accumulation cleared by uninstall, user content preserved
through the purge).
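
A prefix-based purge of the kind described above might look like the sketch below. It is a hypothetical stand-in for `BobInstaller._purge_evolve_artifacts`, written as a free function; it assumes `bob_target` is the `.bob/` directory itself and that "`evolve`-prefixed" means the entry name starts with the literal string `evolve`.

```python
import shutil
from pathlib import Path


def purge_evolve_artifacts(bob_target):
    """Remove every `evolve`-prefixed entry from the Bob install locations.

    Hypothetical sketch; returns the list of removed paths.
    """
    bob_target = Path(bob_target)
    removed = []
    # Subdirectories first, then the root, matching the three locations named above.
    for parent in (bob_target / "skills", bob_target / "commands", bob_target):
        if not parent.is_dir():
            continue
        for entry in sorted(parent.iterdir()):
            if not entry.name.startswith("evolve"):
                continue  # user-owned content is left alone
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry)
    return removed
```

Matching on the bare prefix rather than on `evolve-lite-*` is what lets one routine clear both the colon-form and dash-form names.
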

Addresses visahak's PR #235 finding: `check_drift()` only verified that
expected files matched their source — it never enumerated each
plugin_root for unexpected extras, so a stale file (renamed/removed
generated artifact, hand-written drift) would sail through `check`
even though the rendered tree no longer matched plugin-source/.

`check_drift()` now walks each platform's plugin_root and reports any
file outside the expected set as `orphan: <path> (not generated from
plugin-source/)`. The walk is scoped through `git ls-files
--cached --others --exclude-standard` so .gitignore'd build artifacts
(__pycache__/*.pyc, .DS_Store, …) are correctly invisible — the local
dev workflow leaves these under platform-integrations/ during testing
and they shouldn't trip the check. Falls back to `Path.rglob` when
the working tree isn't a git repo so test fixtures (tmp_path) still
detect deliberately seeded orphans.

Two new TestCheckDrift tests cover visahak's exact reproduction
(orphan.txt at plugin_root) and the descend-into-subdir case.
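
The orphan walk described above could be approximated like this. It is a sketch under assumptions, not the code in `build_plugins.py`: the helper names `list_candidate_files` and `find_orphans` are hypothetical, and `expected` is assumed to be a set of POSIX-style paths relative to the plugin root.

```python
import subprocess
from pathlib import Path


def list_candidate_files(plugin_root):
    """List files under plugin_root as relative paths, honoring .gitignore when possible."""
    root = Path(plugin_root)
    try:
        out = subprocess.run(
            ["git", "ls-files", "-z", "--cached", "--others", "--exclude-standard"],
            cwd=root, capture_output=True, text=True, check=True,
        ).stdout
        rels = [rel for rel in out.split("\0") if rel]
    except (OSError, subprocess.CalledProcessError):
        # Not a git work tree (or git is unavailable): fall back to a plain walk.
        rels = [p.relative_to(root).as_posix() for p in root.rglob("*")]
    # Keep only paths that exist as files (drops directories and deleted index entries).
    return {rel for rel in rels if (root / rel).is_file()}


def find_orphans(plugin_root, expected):
    """Return sorted report lines for files outside the expected (generated) set."""
    extras = list_candidate_files(plugin_root) - set(expected)
    return [f"orphan: {rel} (not generated from plugin-source/)" for rel in sorted(extras)]
```

A drift check would treat any non-empty result as a failure, alongside the existing missing-file and content-mismatch reports.
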
@vinodmut
Contributor

vinodmut commented May 4, 2026

Quick update: #239 merged as Claude-only on main, so the render-equality gate will fail once this PR rebases. Should we revert #239 and re-add the feature through plugin-source/ with the AGENT-switch pattern after this lands, rather than trying to absorb it inline during rebase?

@visahak
Collaborator

visahak commented May 4, 2026

Whatever is easier to get this merged.

@illeatmyhat
Collaborator Author

I guess we should revert, because I have no idea how the log influence stuff is going to work on other platforms.

@visahak
Collaborator

visahak commented May 4, 2026

Created PR #248, which reverts #239. Please review and merge.

@vinodmut
Contributor

vinodmut commented May 4, 2026

Ha ha. I also created PR #249 to revert. You’re faster, I’ll close mine. :)

Conflict-resolution notes (visahak's "address conflicts" ask on PR #235):

* Main's PR #245 ("fix(e2e): resolve test failures from issue #244")
  modified the rendered platform-integrations/<claude,claw-code>/lib/
  config.py and platform-integrations/codex/.../subscribe.py directly.
  Under this branch's unified layout these are generated artifacts —
  the fixes belong in plugin-source/ so they propagate to every
  platform on render. Two changes re-applied:

    1. plugin-source/lib/config.py — print a stderr warning when
       _coerce_repo() rejects an invalid repo name (was silent).
       Updated message wording to match #245. Re-rendered to all four
       platforms (claude/claw-code/codex/bob); codex+bob were never
       changed by main but inherit consistent behavior under unify.

    2. plugin-source/skills/evolve-lite/subscribe/scripts/subscribe.py
       — drop the rollback-on-audit-failure block; warn instead.
       Audit logging is best-effort, a failed append shouldn't undo a
       successful subscribe. Re-rendered to all four platforms;
       previously only codex had this fix.

* .gitignore — kept both new entries (`example-guidelines` from this
  branch, `.codex` from main).

* altk_evolve/backend/filesystem.py + tests/unit/test_filesystem_backend.py
  — accepted from main as-is (atomic writes + corrupt-JSON recovery
  for the filesystem backend).

* tests/e2e/, tests/platform_integrations/test_{bob,codex}_sharing.py,
  tests/platform_integrations/test_sync.py — auto-merged cleanly,
  inspected, accepted.

Verified: render-then-check is clean; 186 platform_integrations tests
pass; the 4 new filesystem-backend unit tests pass.
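
The "warn instead of roll back" behavior from the second re-applied change can be illustrated with a small sketch. Everything here is hypothetical (the `append_audit` and `subscribe` signatures, the in-memory `subscriptions` set, the JSON-lines audit format); it only shows the shape of treating the audit append as best-effort.

```python
import json
import sys
from pathlib import Path


def append_audit(audit_path, record):
    """Append one JSON line to the audit log; raises OSError on failure."""
    with Path(audit_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def subscribe(subscriptions, repo, audit_path):
    """Record a subscription, treating the audit append as best-effort."""
    subscriptions.add(repo)
    try:
        append_audit(audit_path, {"event": "subscribe", "repo": repo})
    except OSError as exc:
        # Warn on stderr and keep the subscription; a failed append does not undo it.
        print(f"warning: audit log append failed: {exc}", file=sys.stderr)
    return True
```

The design choice is that the subscription is the user-visible outcome, while the audit entry is secondary bookkeeping, so only the latter is allowed to fail quietly.
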
@visahak
Collaborator

visahak commented May 4, 2026

@vinodmut merging the PR.

@visahak
Collaborator

visahak commented May 4, 2026

Looks like we need to revert this #247

@visahak visahak merged commit 09b173d into main May 4, 2026
17 checks passed

3 participants