refactor: split the generator into a package, add a metaplugin/ source folder, close two guard gaps#155
Conversation
scripts/skills.py had grown to ~1750 lines. Move the implementation into a scripts/skillsgen/ package (common, discovery, manifest, plugins, routing, hooks, validators, cli) and keep scripts/skills.py as a thin facade that re-exports the full API and delegates the CLI to skillsgen.cli.main.
Every entry point is preserved unchanged: the scripts/skills.py {sync,generate,validate} CLI, 'import skills' from bump_version.py, and the tests that load it by path via importlib. Generation is byte-identical (generate produces no diff); validate and all 92 tests pass.
Co-authored-by: Isaac
Signed-off-by: simon <simon.faltum@databricks.com>
Close two drift gaps in the generator. (1) The hook-wiring file path was declared twice per target (targets.*.hooks in plugin.meta.json and targets.*.hooks_render.out). Drop the redundant targets.{codex,copilot,cursor}.hooks keys and derive the plugin.json hooks field from hooks_render.out, so the declaration and the generation output can never disagree. Output is byte-identical. (2) Add check_no_orphan_hook_scripts: every hooks/*.py must be wired into plugin.meta.json hooks.entries (or allow-listed); previously only the reverse was checked, so a hook script that shipped unwired went unnoticed. Wired into validate.
Also make bump_version.py regenerate routing + hook-wiring files (not just plugin manifests + manifest.json), so a release that changed those stays consistent.
Adds 4 tests; validate passes; 96 tests pass; generate is byte-identical.
Co-authored-by: Isaac
Signed-off-by: simon <simon.faltum@databricks.com>
…ignage Make the editable source obvious. Move plugin.meta.json into a metaplugin/ folder (metaplugin/plugin.meta.json) and add metaplugin/README.md, which names the one source, lists every generated artifact (4 plugin.json, 3 marketplace.json, the 4 hook-wiring files, hooks/_routing_data.json, rules/databricks-routing.mdc, manifest.json, the per-dir README markers), and calls out what stays hand-edited in place (hook scripts, commands, skills). Update every reference to the source path: META_FILE in skillsgen/common.py and bump_version.py, the validate-manifest.yml trigger glob (-> metaplugin/**), the release.yml commit list, the generated-dir README markers, the hooks/_routing_data.json header, and the docs (README, CONTRIBUTING, CLAUDE.md, hooks/README.md). Generated outputs are unchanged except the marker/header text naming the new path. validate passes; 96 tests pass. Co-authored-by: Isaac Signed-off-by: simon <simon.faltum@databricks.com>
…kage) The split moved the generator into scripts/skillsgen/. The validate workflow listed scripts/skills.py and scripts/bump_version.py by exact path, so a PR touching only a skillsgen module would not run validation. Broaden the trigger to scripts/** so every generator change is checked. Co-authored-by: Isaac Signed-off-by: simon <simon.faltum@databricks.com>
…pointer Capture the generator architecture for future contributors. New AGENTS.md is a thin pointer to CLAUDE.md + CONTRIBUTING.md (so agents that read AGENTS.md, e.g. Codex/Cursor, pick up the repo's contributor rules) plus the source/generator/hooks map. hooks/README.md gains a 'Changing or adding a hook' section: change behavior = edit the shared .py; change wiring = edit hooks_render; add a hook = edit hooks.entries + the dialect builders in skillsgen/hooks.py (per-platform by necessity). CLAUDE.md + CONTRIBUTING.md note the generator now lives in scripts/skillsgen/ with scripts/skills.py as a thin facade. Co-authored-by: Isaac Signed-off-by: simon <simon.faltum@databricks.com>
pkosiec
left a comment
There was a problem hiding this comment.
Reviewed the refactor — splitting scripts/skills.py into the scripts/skillsgen/ package + a thin re-export façade, and moving plugin.meta.json into metaplugin/. The mechanics hold up well:
python3 scripts/skills.py validate→ "Everything is up to date" (no drift).- The façade re-exports every
skills.<name>the test suite uses (checked across all 6 test files), so the path-based test loader andbump_version.py'simport skillskeep working. validate-manifest.ymlruns the unit suite and triggers ontests/**/scripts/**/metaplugin/**, so the new orphan-hook / declared-hook tests execute in CI.
A few suggestions inline — only the first (release staging) is worth acting on before merge; the rest are minor/optional. Nothing blocks the refactor.
| # leave the Codex/Copilot ones behind. | ||
| git add \ | ||
| plugin.meta.json \ | ||
| metaplugin/plugin.meta.json \ |
There was a problem hiding this comment.
bump_version.py now regenerates more than this release step stages.
scripts/bump_version.py also regenerates the routing files (hooks/_routing_data.json, rules/databricks-routing.mdc, rules/README.md) and the four hook-wiring files (hooks/hooks.json, codex-hooks.json, copilot-hooks.json, cursor-hooks.json), but this git add list still stages only the plugin/marketplace JSON + manifest.json.
Benign today (the version doesn't flow into those files, so the regen is a no-op), but it breaks the "never commit a partial set" invariant this comment asserts: if any of those generated files were ever drifted on main, a release would silently fix them on the runner, pass validate, publish the tag, and leave the fix uncommitted — and post-merge CI on main would then fail.
Fix (pick one): (a) add the routing + hook-wiring files to this git add list (keeps "regenerate full set → stage full set" symmetry), or (b) drop the routing/hook regen from bump_version.py since a version bump can't change their content.
There was a problem hiding this comment.
Went with (a) plus a guard. The git add list now also stages the routing files (hooks/_routing_data.json, rules/databricks-routing.mdc, rules/README.md) and the four hook-wiring files (hooks/hooks.json, codex-hooks.json, copilot-hooks.json, cursor-hooks.json).
On top of that, after staging I run git diff --quiet and fail the release if any tracked file is still modified. So if a new generated artifact is ever added without updating this list, the release fails loudly instead of publishing a tag with the regen left uncommitted (which is the failure mode you described). Keeps the regenerate-full-set / stage-full-set symmetry and makes it self-enforcing. (5252dec)
|
|
||
| for name in sorted(disk_skills - set(meta_skills)): | ||
| errors.append( | ||
| f"Stable skill '{name}' has no entry in plugin.meta.json \"skills\". " |
There was a problem hiding this comment.
This error message points contributors at the old path. It (and the sibling messages) hard-code the bare plugin.meta.json, but the file now lives at metaplugin/plugin.meta.json, so someone who hits this on validate is sent to the repo root. cli.py already interpolates the META_FILE constant — these per-check strings should too.
Other occurrences: plugins.py:254,260, routing.py:144, hooks.py:171, validators.py:312,402.
There was a problem hiding this comment.
Fixed in beff450. Interpolated META_FILE in every per-check message: skill-coverage (plugins.py), routing coverage + pattern (routing.py), orphan-hook (hooks.py), and the generated-file drift message in common.py. The two validators.py version-mismatch strings you flagged are gone entirely, since I dropped that cross-check per the comment below (3e3125d).
| errors.append("Missing hooks/codex-hooks.json.") | ||
| else: | ||
| try: | ||
| cfg = json.loads(codex_hooks.read_text()) |
There was a problem hiding this comment.
DRY: this duplicates _check_hook_wiring. _check_hook_wiring (L164) already parses a wiring file and verifies its hooks/*.py refs exist, returning the parsed cfg — but check_codex_plugin here (and check_copilot_plugin, ~L424) re-implement the parse + re.findall(r"hooks/…\.py") + existence loop inline. Both could route through _check_hook_wiring and layer the version / event-name checks on its return value. Pre-existing; carried over by the split — optional.
There was a problem hiding this comment.
Done in 3e3125d. check_codex_plugin and check_copilot_plugin now route through _check_hook_wiring and layer the copilot version / event-name checks on its return value, matching how check_cursor_plugin already worked.
| claude_version = json.loads(claude_path.read_text()).get("version") | ||
| except Exception: | ||
| claude_version = None | ||
| if claude_version and plugin.get("version") != claude_version: |
There was a problem hiding this comment.
Likely-redundant cross-version check (minor). Now that all four plugin.json are generated from the single meta["version"] and check_generated_plugins enforces byte-exact equality, any version mismatch is already caught by the drift check. This Claude-vs-Codex/Copilot comparison is harmless defense-in-depth but could be dropped to shrink the surface.
There was a problem hiding this comment.
Dropped both (codex + copilot) in 3e3125d. check_generated_plugins enforces byte-exact equality against the meta-generated files, and the version flows from meta["version"] into all four targets, so a mismatch is already caught by the drift check, which is also strictly stronger than the old Claude-vs-target comparison.
…mmits bump_version.py regenerates the routing files (hooks/_routing_data.json, rules/databricks-routing.mdc, rules/README.md) and the four hook-wiring dialects in addition to the plugin/marketplace manifests, but the release commit staged only the manifests + manifest.json. If any generated file ever drifted on main, a release would regenerate it on the runner, pass validate, publish the tag, and leave the fix uncommitted, so post-merge CI on main would then fail. Stage the routing + hook-wiring files too, and add a guard: after staging, fail the release if any tracked file is still modified (a new generated artifact added without updating the list) instead of committing a partial set silently. Co-authored-by: Isaac Signed-off-by: simon <simon.faltum@databricks.com>
The source moved to metaplugin/plugin.meta.json, but the per-check error messages still named the bare plugin.meta.json, sending a contributor who hit them on validate to the repo root. Interpolate the META_FILE constant (as cli.py already does) in the skill-coverage, routing-coverage, routing-pattern, orphan-hook, and generated-file drift messages. Co-authored-by: Isaac Signed-off-by: simon <simon.faltum@databricks.com>
…ook_wiring check_codex_plugin and check_copilot_plugin re-implemented the parse + hooks/*.py existence loop that _check_hook_wiring already does (and that check_cursor_plugin already routes through). Route them through the helper too, layering the copilot version / event-name checks on its return value. Also drop the Claude-vs-Codex/Copilot version cross-check: both plugin.json are generated from meta["version"] and check_generated_plugins enforces byte-exact equality against the source, so any version drift is already caught there. Co-authored-by: Isaac Signed-off-by: simon <simon.faltum@databricks.com>
Why
Follow-up to reviewer feedback on the meta-plugin generator (#152/#153/#154) that we had not addressed:
scripts/skills.pyhad grown to ~1750 lines in one flat module.hooks/*.py(present but never wired) was not flagged.Changes
Three reviewable commits:
Split
scripts/skills.pyinto ascripts/skillsgen/package (common, discovery, manifest, plugins, routing, hooks, validators, cli).scripts/skills.pystays as a thin façade that re-exports the full API and delegates the CLI, so every entry point is unchanged: theskills.py {sync,generate,validate}CLI,import skillsfrombump_version.py, and the tests that load it by path. Generation is byte-identical.Close the two guard gaps. Derive each target's
plugin.jsonhooksfield fromhooks_render.outand drop the redundanttargets.{codex,copilot,cursor}.hookskeys, so the declaration and the generation output cannot disagree (one source, not two hand-maintained paths). Addcheck_no_orphan_hook_scriptsso ahooks/*.pythat is not wired into the source fails CI (previously only the reverse was checked). Also makebump_version.pyregenerate routing + hook-wiring files, not just the plugin manifests +manifest.json. Adds 4 tests.Make the source obvious. Move
plugin.meta.jsoninto ametaplugin/folder and addmetaplugin/README.mdthat names the one source, lists every generated artifact, and calls out what stays hand-edited in place (hook scripts, commands, skills). Update theMETA_FILEpath, the CI trigger glob (metaplugin/**), the release commit list, the generated-dir README markers, and the docs.Test plan
python3 scripts/skills.py generateproduces no diff to any generated file (byte-identical at every step).python3 scripts/skills.py validatepasses (now also flags orphan hook scripts).python3 -m unittest discover -s tests -p '*_test.py'— 96 tests pass.python3 scripts/bump_version.py <v>regenerates plugins + routing + hooks + manifest, then reverted.This pull request and its description were written by Isaac.