feat(skills): boot-time install + prompt alignment + per-phase whitelist by HydraOps-T-rav · Pull Request #8374 · T-rav/hydraflow

HydraOps-T-rav · 2026-04-21T04:46:02Z

Summary

Three coordinated extensions to PR #8348's plugin skill registry, landing together because they depend on each other. Full rationale in ADR-0043.

Boot-time install — preflight._check_plugins now runs claude plugin install name@marketplace --scope user for missing Tier-1/Tier-2 plugins before failing. Gated by new auto_install_plugins: bool = True config flag. Falls through to a rich FAIL (with make install-plugins + claude login hints) only if install itself fails.
Prompt alignment — format_plugin_skills_for_prompt now carries a condensed superpowers:using-superpowers discipline preamble: "even at 1% confidence you MUST invoke", process-before-implementation priority, explicit rationalization warning. Subagents now apply the same skill-picking rigor the main Claude session has.
Per-phase whitelist — new phase_skills: dict[str, list[str]] config field; each of six runners (agent, planner, reviewer, triage, discover, shape) filters discovery through its whitelist via a new skills_for_phase helper. PHASE_NAMES frozenset + pydantic validator reject unknown phase keys.

Also: src/install_plugins_cli.py + make install-plugins target as the deterministic manual fallback. CLI distinguishes Tier-1 (required → exit 1) from Tier-2 (language-conditional → warning only), matching preflight semantics.

Module API changes

Three symbols promoted from module-private to public (cross-module consumers need them):

_parse_plugin_spec → parse_plugin_spec
_install_plugin → install_plugin
_plugin_exists → plugin_exists
_DEFAULT_CACHE_ROOT → DEFAULT_CACHE_ROOT
New exports: DEFAULT_MARKETPLACE, PHASE_NAMES, skills_for_phase

Test plan

`make quality` passes: 11,017 tests / 3 skipped / 231 xfailed / 0 failures, exit 0.
New test files (all green): `test_plugin_spec_parser.py`, `test_phase_skill_filter.py`, `test_config_phase_skills.py`, `test_preflight_install.py`, `test_install_plugins_cli.py`.
Updated: `test_phase_skill_injection.py` (per-runner whitelist + cross-phase exclusion), `test_plugin_skill_registry.py` (new preamble assertions), `test_preflight_plugins.py` (explicit `auto_install_plugins=False` where previously implicit).

Manual verification (for reviewer):

Fresh `~/.claude/plugins/cache/claude-plugins-official/` → `python -m server` installs plugins and starts.
`auto_install_plugins=False` + missing plugin → FAIL with `make install-plugins` hint (new header: "Required plugins missing" not "Plugin install failed").
`claude` binary absent from $PATH → FAIL includes "claude binary not found".
Dump a runner's prompt (via `HYDRAFLOW_DEBUG_PROMPTS=1` or equivalent) and confirm the phase-filtered skill list + "1% confidence" preamble.

Known follow-ups (out of scope)

Version pinning (`name@marketplace@version`) — deferred per ADR-0043.
Auto-marketplace add — deferred per ADR-0043.
Scenario-framework coverage proving subagents actually invoke advertised skills — belongs in `docs/scenarios/`.

🤖 Generated with Claude Code

…e, per-phase filter Three coordinated extensions to the PR #8348 plugin skill registry: 1. Boot-time install — preflight runs `claude plugin install name@marketplace --scope user` for missing Tier-1/Tier-2 plugins instead of failing; falls through to a rich FAIL only if install itself fails. Gated by a new `auto_install_plugins: bool = True` config flag. 2. Prompt alignment — the `## Available Skills` preamble carries a condensed form of the `superpowers:using-superpowers` discipline (1% confidence → MUST invoke, process skills before implementation skills, explicit red-flag reminder) so subagents apply the same skill-picking rigor the main Claude session does. 3. Per-phase whitelist — a new `phase_skills` config field filters each factory runner's prompt to only the skills relevant to its job. Triage/discover get debugging discipline; planner/shape get writing-plans; agent gets TDD + verification + simplify; reviewer gets the code-review plugin. Dialog-only and human-author skills (brainstorming, receiving- code-review, executing-plans) are explicitly excluded. Implementation plan to follow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Cross-module consumers (preflight, install_plugins_cli — added in later tasks) need to import this symbol. Python convention reserves underscore-prefixed names for module-internal use, so drop the underscore on both parse_plugin_spec and DEFAULT_MARKETPLACE. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…iscovered branch Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…tion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Extend `_check_plugins` with a boot-time install branch: when a Tier-1 plugin is missing and `config.auto_install_plugins` is True, shell out to `claude plugin install name@marketplace --scope user` and re-check the cache before deciding PASS/FAIL. - Tier-2 (language-conditional) plugins get a best-effort install, with any still-missing ones reported as WARN as before. - On failure, the FAIL message surfaces the install errors and a manual fix block (make install-plugins / claude plugin install / claude login). - `subprocess` is now imported at module scope so tests can patch `preflight.subprocess.run`. - Add `tests/test_preflight_install.py` covering the install branch, subprocess failure, timeout, missing `claude` binary, disabled auto-install, and explicit-marketplace specs. - Update `tests/test_preflight_plugins.py` to set `auto_install_plugins=False` on Tier-1 FAIL and Tier-2 WARN tests to preserve their original intent and keep them offline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ROOT, reuse conftest helper, fix FAIL header Four findings from the Task 5 code-quality review: 1. Tier-2 install errors were silently discarded; the WARN message now reports each missing plugin's install-error detail when one exists. 2. Promote _DEFAULT_CACHE_ROOT to DEFAULT_CACHE_ROOT (public) for consistency with parse_plugin_spec's earlier promotion — cross-module symbols belong in the public API. 3. tests/test_preflight_install.py now uses the shared conftest write_plugin_skill helper instead of a local _write_fake_skill that duplicated layout and could drift. 4. The FAIL header when auto_install_plugins=False no longer says "Plugin install failed" — nothing was attempted. It now reads "Required plugins missing". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Wire skills_for_phase() into all six runner injection sites so each phase only sees its own whitelisted skills. Add six new whitelist- assertion tests (one per runner) that confirm whitelisted skills appear and cross-phase skills are absent; update existing tests to use whitelisted skills now that filtering is active.

…ling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds `src/install_plugins_cli.py` with a `run()` function that installs all required and language plugins from HydraFlowConfig, skipping already- installed ones and returning non-zero on any failure. Adds `make install- plugins` target that reads the active config and invokes the CLI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Task 7's initial implementation duplicated the _install_plugin helper between preflight and install_plugins_cli to sidestep a test-mocking issue. Promote preflight._install_plugin to public (install_plugin) and have the CLI import it, matching the parse_plugin_spec public-promotion precedent from Task 1. Tests now patch preflight.subprocess.run directly, which is the module that actually owns the subprocess call. Also drops the unused argv parameter from install_plugins_cli.main(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

….PHONY entry Three follow-ups from Task 7 code review: 1. Promote preflight._plugin_exists to plugin_exists (public), matching the parse_plugin_spec and install_plugin promotions. Cross-module symbols belong in the public API. 2. Replace logger.error(f) — which treats the failure string as a format template — with logger.error("%s", failure). Latent logging-injection bug if a failure detail ever contains %s or {}. 3. Add install-plugins to Makefile .PHONY list so `make install-plugins` doesn't silently no-op if a file named install-plugins is ever created in the project root. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…r-2 (warn) Mirror preflight's tier-1/tier-2 distinction in the CLI. A missing required plugin remains a hard failure; a missing language-conditional plugin logs a warning and leaves the exit code at 0. This resolves a latent bug where `make install-plugins` reliably returned non-zero on a fresh checkout because the stock language_plugins default references `gopls` and `rust-analyzer` which are not yet published to the claude-plugins-official marketplace. Those are language-conditional and should not block required-plugin installation. Adds three new tests covering the tier-1/tier-2 split. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(agents): 5 new avoided patterns + planner self-audit step Retrospective of PR #8374/#8375 surfaced five recurring anti-patterns that the plan review caught repeatedly. Document each in the canonical `docs/agents/avoided-patterns.md` (the sensor enricher and audit agents already read this file) and add a Planning-Steps self-audit reference so the planner phase catches them BEFORE emitting a plan. New patterns: - Underscore-prefixed names imported across modules - Writing a new test helper without checking conftest - logger.error(value) without a format string - Hardcoded path lists that duplicate filesystem state - _name for unused loop variables (prefer bare _) Catching these at plan-writing time prevents the agent-review-fix loop cost observed in PR #8374 (8 fix commits beyond the initial 8 task commits). * test(planner): bump truncation-test threshold for expanded planning-steps block Adding the new step 8 (self-audit against avoided-patterns.md) grew the planner prompt baseline by ~650 chars. The truncation test's `< 10_000` sanity bound was tight; bump to 11_000. Semantic is unchanged — "…(truncated)" marker assertion still verifies body truncation works and the prompt remains well under the original 20k body size. --------- Co-authored-by: T-rav-Hydra-Ops <t@koderex.dev>

PR #8374 added `phase_skills: dict[str, list[str]]` to HydraFlowConfig and each factory runner now reads `self._config.phase_skills` during prompt construction. PR #8376's prompt-audit harness uses a fake config object that raises AttributeError on unknown attributes; `phase_skills` wasn't added to it, so every audit_prompts fixture that invokes a runner's prompt builder failed with `AttributeError: phase_skills` under CI's full test run (the failures didn't surface in `make quality` because that target deselects some tests). Mirror the existing `required_plugins: list[str] = []` convention — both are allowlist fields that default to empty in the audit harness.

…8378) * feat(audit): propagate 5 new avoided patterns to sensor rules + audit Agent 5 Follow-up to #8377. The 5 new avoided-patterns entries need matching hooks in the two automated surfaces that consume the doc. Sensor enricher (src/sensor_rules.py) — 3 new Rule entries: - private-symbol-cross-module: pyright "_name is not accessed" trigger - logger-format-typeerror: TypeError on malformed format strings - dockerfile-python-constant-drift: Dockerfile* file-changed trigger (Skipped sensor rules for the two patterns that don't map cleanly to error/file triggers: "test helper duplicating conftest" and "hardcoded path list mirroring fs state" rely on diff analysis the sensor runtime doesn't do. Those remain covered by avoided-patterns.md + Agent 5 sweep.) Audit Agent 5 (.claude/commands/hf.audit-code.md) Phase 2 — 5 new detection bullets with concrete rg heuristics: - Underscore-prefixed names imported across modules - Test helpers duplicating conftest fixtures - logger.error(value) without format string - Hardcoded path lists mirroring filesystem state - _name for unused loop variables * fix(audit): add phase_skills to audit_prompts fake config PR #8374 added `phase_skills: dict[str, list[str]]` to HydraFlowConfig and each factory runner now reads `self._config.phase_skills` during prompt construction. PR #8376's prompt-audit harness uses a fake config object that raises AttributeError on unknown attributes; `phase_skills` wasn't added to it, so every audit_prompts fixture that invokes a runner's prompt builder failed with `AttributeError: phase_skills` under CI's full test run (the failures didn't surface in `make quality` because that target deselects some tests). Mirror the existing `required_plugins: list[str] = []` convention — both are allowlist fields that default to empty in the audit harness. --------- Co-authored-by: T-rav-Hydra-Ops <t@koderex.dev>

T-rav-Hydra-Ops and others added 18 commits April 20, 2026 13:23

feat(skills): add _parse_plugin_spec for name@marketplace syntax

d845e5b

fix(skills): import DEFAULT_MARKETPLACE instead of redeclaring in test

e51aab2

feat(skills): add PHASE_NAMES and skills_for_phase helper

edaf7c8

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(skills): clarify PHASE_NAMES lockdown test intent + cover empty-d…

c6c06cb

…iscovered branch Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(skills): add using-superpowers discipline preamble to prompt sec…

d00a6ec

…tion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(skills): add multi-skill bullet + description-passthrough coverage

0d345c8

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(config): add auto_install_plugins and phase_skills fields

3ee3ce3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

style: silence pyright unused-variable hints on loop and kwargs bindings

0fba081

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: link phase_skills default to ADR-0043 and flag test-helper coup…

1160228

…ling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

HydraOps-T-rav merged commit 6b45f82 into main Apr 21, 2026
20 checks passed

HydraOps-T-rav deleted the dynamic-skill-loading-spec branch April 21, 2026 19:37

This was referenced Apr 21, 2026

refactor(agent-cli): scan /opt/plugins dynamically + verify Skill tool chain #8375

Merged

docs(agents): 5 new avoided patterns + planner self-audit step #8377

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(skills): boot-time install + prompt alignment + per-phase whitelist#8374

feat(skills): boot-time install + prompt alignment + per-phase whitelist#8374
HydraOps-T-rav merged 18 commits intomainfrom
dynamic-skill-loading-spec

HydraOps-T-rav commented Apr 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

HydraOps-T-rav commented Apr 21, 2026

Summary

Module API changes

Test plan

Known follow-ups (out of scope)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant