Skip to content

feat(skills): boot-time install + prompt alignment + per-phase whitelist#8374

Merged
HydraOps-T-rav merged 18 commits intomainfrom
dynamic-skill-loading-spec
Apr 21, 2026
Merged

feat(skills): boot-time install + prompt alignment + per-phase whitelist#8374
HydraOps-T-rav merged 18 commits intomainfrom
dynamic-skill-loading-spec

Conversation

@HydraOps-T-rav
Copy link
Copy Markdown
Collaborator

Summary

Three coordinated extensions to PR #8348's plugin skill registry, landing together because they depend on each other. Full rationale in ADR-0043.

  • Boot-time installpreflight._check_plugins now runs claude plugin install name@marketplace --scope user for missing Tier-1/Tier-2 plugins before failing. Gated by new auto_install_plugins: bool = True config flag. Falls through to a rich FAIL (with make install-plugins + claude login hints) only if install itself fails.
  • Prompt alignmentformat_plugin_skills_for_prompt now carries a condensed superpowers:using-superpowers discipline preamble: "even at 1% confidence you MUST invoke", process-before-implementation priority, explicit rationalization warning. Subagents now apply the same skill-picking rigor the main Claude session has.
  • Per-phase whitelist — new phase_skills: dict[str, list[str]] config field; each of six runners (agent, planner, reviewer, triage, discover, shape) filters discovery through its whitelist via a new skills_for_phase helper. PHASE_NAMES frozenset + pydantic validator reject unknown phase keys.

Also: src/install_plugins_cli.py + make install-plugins target as the deterministic manual fallback. CLI distinguishes Tier-1 (required → exit 1) from Tier-2 (language-conditional → warning only), matching preflight semantics.

Module API changes

Three symbols promoted from module-private to public (cross-module consumers need them):

  • _parse_plugin_specparse_plugin_spec
  • _install_plugininstall_plugin
  • _plugin_existsplugin_exists
  • _DEFAULT_CACHE_ROOTDEFAULT_CACHE_ROOT
  • New exports: DEFAULT_MARKETPLACE, PHASE_NAMES, skills_for_phase

Test plan

  • `make quality` passes: 11,017 tests / 3 skipped / 231 xfailed / 0 failures, exit 0.
  • New test files (all green): `test_plugin_spec_parser.py`, `test_phase_skill_filter.py`, `test_config_phase_skills.py`, `test_preflight_install.py`, `test_install_plugins_cli.py`.
  • Updated: `test_phase_skill_injection.py` (per-runner whitelist + cross-phase exclusion), `test_plugin_skill_registry.py` (new preamble assertions), `test_preflight_plugins.py` (explicit `auto_install_plugins=False` where previously implicit).

Manual verification (for reviewer):

  • Fresh `~/.claude/plugins/cache/claude-plugins-official/` → `python -m server` installs plugins and starts.
  • `auto_install_plugins=False` + missing plugin → FAIL with `make install-plugins` hint (new header: "Required plugins missing" not "Plugin install failed").
  • `claude` binary absent from $PATH → FAIL includes "claude binary not found".
  • Dump a runner's prompt (via `HYDRAFLOW_DEBUG_PROMPTS=1` or equivalent) and confirm the phase-filtered skill list + "1% confidence" preamble.

Known follow-ups (out of scope)

  • Version pinning (`name@marketplace@version`) — deferred per ADR-0043.
  • Auto-marketplace add — deferred per ADR-0043.
  • Scenario-framework coverage proving subagents actually invoke advertised skills — belongs in `docs/scenarios/`.

🤖 Generated with Claude Code

T-rav-Hydra-Ops and others added 18 commits April 20, 2026 13:23
…e, per-phase filter

Three coordinated extensions to the PR #8348 plugin skill registry:

1. Boot-time install — preflight runs `claude plugin install name@marketplace
   --scope user` for missing Tier-1/Tier-2 plugins instead of failing; falls
   through to a rich FAIL only if install itself fails. Gated by a new
   `auto_install_plugins: bool = True` config flag.

2. Prompt alignment — the `## Available Skills` preamble carries a condensed
   form of the `superpowers:using-superpowers` discipline (1% confidence →
   MUST invoke, process skills before implementation skills, explicit
   red-flag reminder) so subagents apply the same skill-picking rigor the
   main Claude session does.

3. Per-phase whitelist — a new `phase_skills` config field filters each
   factory runner's prompt to only the skills relevant to its job.
   Triage/discover get debugging discipline; planner/shape get writing-plans;
   agent gets TDD + verification + simplify; reviewer gets the code-review
   plugin. Dialog-only and human-author skills (brainstorming, receiving-
   code-review, executing-plans) are explicitly excluded.

Implementation plan to follow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-module consumers (preflight, install_plugins_cli — added in later
tasks) need to import this symbol. Python convention reserves
underscore-prefixed names for module-internal use, so drop the underscore
on both parse_plugin_spec and DEFAULT_MARKETPLACE.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iscovered branch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extend `_check_plugins` with a boot-time install branch: when a Tier-1
  plugin is missing and `config.auto_install_plugins` is True, shell out
  to `claude plugin install name@marketplace --scope user` and re-check
  the cache before deciding PASS/FAIL.
- Tier-2 (language-conditional) plugins get a best-effort install, with
  any still-missing ones reported as WARN as before.
- On failure, the FAIL message surfaces the install errors and a manual
  fix block (make install-plugins / claude plugin install / claude login).
- `subprocess` is now imported at module scope so tests can patch
  `preflight.subprocess.run`.
- Add `tests/test_preflight_install.py` covering the install branch,
  subprocess failure, timeout, missing `claude` binary, disabled
  auto-install, and explicit-marketplace specs.
- Update `tests/test_preflight_plugins.py` to set
  `auto_install_plugins=False` on Tier-1 FAIL and Tier-2 WARN tests to
  preserve their original intent and keep them offline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ROOT, reuse conftest helper, fix FAIL header

Four findings from the Task 5 code-quality review:

1. Tier-2 install errors were silently discarded; the WARN message now
   reports each missing plugin's install-error detail when one exists.
2. Promote _DEFAULT_CACHE_ROOT to DEFAULT_CACHE_ROOT (public) for
   consistency with parse_plugin_spec's earlier promotion — cross-module
   symbols belong in the public API.
3. tests/test_preflight_install.py now uses the shared conftest
   write_plugin_skill helper instead of a local _write_fake_skill that
   duplicated layout and could drift.
4. The FAIL header when auto_install_plugins=False no longer says
   "Plugin install failed" — nothing was attempted. It now reads
   "Required plugins missing".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire skills_for_phase() into all six runner injection sites so each
phase only sees its own whitelisted skills. Add six new whitelist-
assertion tests (one per runner) that confirm whitelisted skills appear
and cross-phase skills are absent; update existing tests to use
whitelisted skills now that filtering is active.
…ling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds `src/install_plugins_cli.py` with a `run()` function that installs
all required and language plugins from HydraFlowConfig, skipping already-
installed ones and returning non-zero on any failure. Adds `make install-
plugins` target that reads the active config and invokes the CLI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 7's initial implementation duplicated the _install_plugin helper
between preflight and install_plugins_cli to sidestep a test-mocking
issue. Promote preflight._install_plugin to public (install_plugin) and
have the CLI import it, matching the parse_plugin_spec public-promotion
precedent from Task 1. Tests now patch preflight.subprocess.run
directly, which is the module that actually owns the subprocess call.

Also drops the unused argv parameter from install_plugins_cli.main().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
….PHONY entry

Three follow-ups from Task 7 code review:

1. Promote preflight._plugin_exists to plugin_exists (public), matching
   the parse_plugin_spec and install_plugin promotions. Cross-module
   symbols belong in the public API.

2. Replace logger.error(f) — which treats the failure string as a
   format template — with logger.error("%s", failure). Latent
   logging-injection bug if a failure detail ever contains %s or {}.

3. Add install-plugins to Makefile .PHONY list so `make install-plugins`
   doesn't silently no-op if a file named install-plugins is ever
   created in the project root.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r-2 (warn)

Mirror preflight's tier-1/tier-2 distinction in the CLI. A missing
required plugin remains a hard failure; a missing language-conditional
plugin logs a warning and leaves the exit code at 0.

This resolves a latent bug where `make install-plugins` reliably
returned non-zero on a fresh checkout because the stock
language_plugins default references `gopls` and `rust-analyzer` which
are not yet published to the claude-plugins-official marketplace. Those
are language-conditional and should not block required-plugin
installation.

Adds three new tests covering the tier-1/tier-2 split.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@HydraOps-T-rav HydraOps-T-rav merged commit 6b45f82 into main Apr 21, 2026
20 checks passed
@HydraOps-T-rav HydraOps-T-rav deleted the dynamic-skill-loading-spec branch April 21, 2026 19:37
HydraOps-T-rav added a commit that referenced this pull request Apr 21, 2026
* docs(agents): 5 new avoided patterns + planner self-audit step

Retrospective of PR #8374/#8375 surfaced five recurring anti-patterns that the
plan review caught repeatedly. Document each in the canonical
`docs/agents/avoided-patterns.md` (the sensor enricher and audit agents already
read this file) and add a Planning-Steps self-audit reference so the planner
phase catches them BEFORE emitting a plan.

New patterns:
- Underscore-prefixed names imported across modules
- Writing a new test helper without checking conftest
- logger.error(value) without a format string
- Hardcoded path lists that duplicate filesystem state
- _name for unused loop variables (prefer bare _)

Catching these at plan-writing time prevents the agent-review-fix loop cost
observed in PR #8374 (8 fix commits beyond the initial 8 task commits).

* test(planner): bump truncation-test threshold for expanded planning-steps block

Adding the new step 8 (self-audit against avoided-patterns.md) grew the
planner prompt baseline by ~650 chars. The truncation test's `< 10_000`
sanity bound was tight; bump to 11_000. Semantic is unchanged —
"…(truncated)" marker assertion still verifies body truncation works
and the prompt remains well under the original 20k body size.

---------

Co-authored-by: T-rav-Hydra-Ops <t@koderex.dev>
HydraOps-T-rav pushed a commit that referenced this pull request Apr 21, 2026
PR #8374 added `phase_skills: dict[str, list[str]]` to HydraFlowConfig and
each factory runner now reads `self._config.phase_skills` during prompt
construction. PR #8376's prompt-audit harness uses a fake config object
that raises AttributeError on unknown attributes; `phase_skills` wasn't
added to it, so every audit_prompts fixture that invokes a runner's
prompt builder failed with `AttributeError: phase_skills` under CI's
full test run (the failures didn't surface in `make quality` because
that target deselects some tests).

Mirror the existing `required_plugins: list[str] = []` convention — both
are allowlist fields that default to empty in the audit harness.
HydraOps-T-rav added a commit that referenced this pull request Apr 21, 2026
…8378)

* feat(audit): propagate 5 new avoided patterns to sensor rules + audit Agent 5

Follow-up to #8377. The 5 new avoided-patterns entries need matching hooks
in the two automated surfaces that consume the doc.

Sensor enricher (src/sensor_rules.py) — 3 new Rule entries:
- private-symbol-cross-module: pyright "_name is not accessed" trigger
- logger-format-typeerror: TypeError on malformed format strings
- dockerfile-python-constant-drift: Dockerfile* file-changed trigger

(Skipped sensor rules for the two patterns that don't map cleanly to
error/file triggers: "test helper duplicating conftest" and "hardcoded
path list mirroring fs state" rely on diff analysis the sensor runtime
doesn't do. Those remain covered by avoided-patterns.md + Agent 5 sweep.)

Audit Agent 5 (.claude/commands/hf.audit-code.md) Phase 2 — 5 new
detection bullets with concrete rg heuristics:
- Underscore-prefixed names imported across modules
- Test helpers duplicating conftest fixtures
- logger.error(value) without format string
- Hardcoded path lists mirroring filesystem state
- _name for unused loop variables

* fix(audit): add phase_skills to audit_prompts fake config

PR #8374 added `phase_skills: dict[str, list[str]]` to HydraFlowConfig and
each factory runner now reads `self._config.phase_skills` during prompt
construction. PR #8376's prompt-audit harness uses a fake config object
that raises AttributeError on unknown attributes; `phase_skills` wasn't
added to it, so every audit_prompts fixture that invokes a runner's
prompt builder failed with `AttributeError: phase_skills` under CI's
full test run (the failures didn't surface in `make quality` because
that target deselects some tests).

Mirror the existing `required_plugins: list[str] = []` convention — both
are allowlist fields that default to empty in the audit harness.

---------

Co-authored-by: T-rav-Hydra-Ops <t@koderex.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant