fix: scope cleanup-legacy.py skill discovery and preserve config.yaml by don-petry · Pull Request #2205 · bmad-code-org/BMAD-METHOD

don-petry · 2026-04-04T07:31:35Z

Summary

Add shared cleanup-legacy.py script with two fixes from bmad-suno-band-manager commit eb226a7:
- Scoped find_skill_dirs() to only match SKILL.md at installable positions: direct children ({name}/SKILL.md) and skills subfolder (skills/{name}/SKILL.md). Deeply nested SKILL.md files (e.g. bmad-module-builder/assets/setup-skill-template/SKILL.md) are no longer incorrectly matched.
- Config preservation backs up config.yaml before shutil.rmtree() and restores it after, so per-module configs needed by bmad-init survive cleanup.

Test plan

Unit tested find_skill_dirs() with direct child, skills/ subfolder, and deeply nested SKILL.md -- only installable positions matched
Unit tested config.yaml backup/restore during directory removal
npm run quality passes (204 tests, lint, format)

🤖 Generated with Claude Code

…ig.yaml The cleanup-legacy.py script used an overly broad rglob("SKILL.md") that matched template and asset files nested deep in the directory tree (e.g. bmad-module-builder/assets/setup-skill-template/SKILL.md). This caused cleanup to abort when it couldn't verify non-installable templates at the skills directory. Scopes find_skill_dirs() to only match SKILL.md at recognized installable positions: direct children ({name}/SKILL.md) and skills subfolder (skills/{name}/SKILL.md). Also adds config.yaml backup/restore around shutil.rmtree() so per-module configs needed by bmad-init are preserved. Fixes bmad-code-org#2175 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds a shared cleanup-legacy.py utility under tools/scripts/ to safely remove legacy _bmad/ module directories after config/help migrations, with optional verification that skills are already installed elsewhere and preservation of per-module config.yaml.

Changes:

Introduces scoped skill discovery to only treat SKILL.md in installable locations as skills.
Adds an optional safety check (--skills-dir) to validate discovered skills exist at the installed location before deletion.
Preserves config.yaml across directory removal by backing it up before shutil.rmtree() and restoring it afterward.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-04T07:35:15Z

+        # Preserve config.yaml if present (bmad-init needs per-module configs)
+        config_path = target / "config.yaml"
+        config_backup = None
+        if config_path.exists():
+            config_backup = config_path.read_bytes()
+            if verbose:
+                print(f"Preserving config.yaml in {dirname}/", file=sys.stderr)
+
+        file_count = count_files(target)
+        if config_backup:
+            file_count -= 1  # Don't count the preserved file
+        if verbose:


config_backup uses truthiness checks (if config_backup:). If config.yaml exists but is empty, read_bytes() returns b'' which is falsy, so the script will not restore the file and will miscount removed files. Track preservation with an explicit boolean (e.g., config_exists = config_path.is_file()) or check config_backup is not None instead of truthiness.

Copilot · 2026-04-04T07:35:15Z

+    for dirname in dirs_to_remove:
+        target = Path(bmad_dir) / dirname
+        if not target.exists():


dirname values come from CLI (--module-code / --also-remove) and are directly appended to bmad_dir before shutil.rmtree(). A crafted value like ../ could delete directories outside _bmad/. Consider validating names (reject path separators / ..) or resolving target and asserting it is within Path(bmad_dir).resolve() before removal.

Copilot · 2026-04-04T07:35:15Z

+    for skill_md in root.rglob("SKILL.md"):
+        rel = skill_md.parent.relative_to(root)
+        parts = rel.parts
+        # Direct child: {name}/SKILL.md
+        if len(parts) == 1:
+            skills.append(parts[0])
+        # Skills subfolder: skills/{name}/SKILL.md
+        elif len(parts) == 2 and parts[0] == "skills":
+            skills.append(parts[1])


find_skill_dirs() still walks the entire tree via root.rglob('SKILL.md') and then filters by depth. To avoid unnecessary deep traversal (and speed up large modules), consider using targeted globs for the only supported installable locations (e.g., root.glob('*/SKILL.md') and root.glob('skills/*/SKILL.md')) instead of scanning everything.

Suggested change

for skill_md in root.rglob("SKILL.md"):

rel = skill_md.parent.relative_to(root)

parts = rel.parts

# Direct child: {name}/SKILL.md

if len(parts) == 1:

skills.append(parts[0])

# Skills subfolder: skills/{name}/SKILL.md

elif len(parts) == 2 and parts[0] == "skills":

skills.append(parts[1])

# Direct child: {name}/SKILL.md

for skill_md in root.glob("*/SKILL.md"):

skills.append(skill_md.parent.name)

# Skills subfolder: skills/{name}/SKILL.md

skills_root = root / "skills"

if skills_root.exists():

for skill_md in skills_root.glob("*/SKILL.md"):

skills.append(skill_md.parent.name)

Copilot · 2026-04-04T07:35:16Z

+        file_count = count_files(target)
+        if config_backup:
+            file_count -= 1  # Don't count the preserved file
+        if verbose:
+            print(
+                f"Removing {target} ({file_count} files)",
+                file=sys.stderr,
+            )


count_files() is called unconditionally for every target directory. For large legacy trees this adds a full extra traversal even when --verbose is off. If files_removed_count is not strictly required, consider only counting when verbose (or make it opt-in) to keep cleanup fast.

coderabbitai · 2026-04-04T07:41:48Z

📝 Walkthrough

Walkthrough

Adds a new executable Python script that removes legacy module directories from a BMad installation path. The script scans for installable skills at recognized locations, verifies they exist in the target skills directory, removes legacy directories while preserving config.yaml files, and reports results in JSON format.

Changes

Cohort / File(s)	Summary
Legacy Cleanup Script `tools/scripts/cleanup-legacy.py`	New script implementing directory removal with safety verification. Includes argument parsing, scoped skill directory discovery (limiting matches to direct children and `skills/` subdirectory positions), installation verification against a provided skills directory, removal via `shutil.rmtree` with `config.yaml` preservation, and JSON result reporting with exit code handling for missing skills and file system errors.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: fixing skill discovery scoping and preserving config.yaml in cleanup-legacy.py.
Description check	✅ Passed	The description clearly explains both fixes, references the linked issue `#2175`, outlines the test plan, and relates directly to the changeset.
Linked Issues check	✅ Passed	The code implements both requirements from `#2175`: scoping find_skill_dirs() to only match SKILL.md at installable positions and preserving config.yaml during cleanup.
Out of Scope Changes check	✅ Passed	All changes are in scope: the new script implements the two specific fixes referenced in `#2175` with no unrelated modifications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tools/scripts/cleanup-legacy.py (1)
61-87: Prefer a single source of truth for skill discovery.

tools/installer/core/manifest-generator.js:collectSkills() still discovers installable SKILL.md entrypoints recursively, while this helper hardcodes only direct children and skills/{name}. That makes the pre-delete safety check easy to drift from install behavior and can silently under-verify nested skill layouts if they ever appear in a legacy _bmad tree. Reusing the installer's discovery rules here, or explicitly failing on deeper SKILL.md paths, would make this guard safer.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/scripts/cleanup-legacy.py` around lines 61 - 87, The current
find_skill_dirs function only accepts direct children and skills/{name} layouts,
which diverges from the installer's recursive discovery in collectSkills; update
find_skill_dirs to use the same discovery rules as the installer by either
calling the installer function collectSkills (from
tools/installer/core/manifest-generator.js or its Python wrapper) or by making
the function reject/deem unsafe any SKILL.md nested deeper than the installer's
allowed entrypoints (i.e., explicitly fail or raise an error on deeper paths),
ensuring the guard's behavior exactly matches collectSkills' logic so pre-delete
checks can't drift from install behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/scripts/cleanup-legacy.py`:
- Around line 188-220: The code incorrectly treats zero-byte preserved configs
as absent and allows restoration errors to escape handling; update the guards
around config_backup in the cleanup routine to use "if config_backup is not
None:" (affecting the file_count adjustment and the restore condition) so empty
b"" files are preserved and restored, and move the restoration steps
(target.mkdir(...) and config_path.write_bytes(...)) into the same try/except
that wraps shutil.rmtree(target) so any errors during restore are caught and
reported via the existing error_result (which references removed and dirname)
and trigger sys.exit(2).

---

Nitpick comments:
In `@tools/scripts/cleanup-legacy.py`:
- Around line 61-87: The current find_skill_dirs function only accepts direct
children and skills/{name} layouts, which diverges from the installer's
recursive discovery in collectSkills; update find_skill_dirs to use the same
discovery rules as the installer by either calling the installer function
collectSkills (from tools/installer/core/manifest-generator.js or its Python
wrapper) or by making the function reject/deem unsafe any SKILL.md nested deeper
than the installer's allowed entrypoints (i.e., explicitly fail or raise an
error on deeper paths), ensuring the guard's behavior exactly matches
collectSkills' logic so pre-delete checks can't drift from install behavior.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 41b08d6c-cbc6-49db-95c7-6c48ddbfece1

📥 Commits

Reviewing files that changed from the base of the PR and between 3a24d8f and db7b497.

📒 Files selected for processing (1)

tools/scripts/cleanup-legacy.py

coderabbitai · 2026-04-04T07:41:51Z

+        # Preserve config.yaml if present (bmad-init needs per-module configs)
+        config_path = target / "config.yaml"
+        config_backup = None
+        if config_path.exists():
+            config_backup = config_path.read_bytes()
+            if verbose:
+                print(f"Preserving config.yaml in {dirname}/", file=sys.stderr)
+
+        file_count = count_files(target)
+        if config_backup:
+            file_count -= 1  # Don't count the preserved file
+        if verbose:
+            print(
+                f"Removing {target} ({file_count} files)",
+                file=sys.stderr,
+            )
+
+        try:
+            shutil.rmtree(target)
+        except OSError as e:
+            error_result = {
+                "status": "error",
+                "error": f"Failed to remove {target}: {e}",
+                "directories_removed": removed,
+                "directories_failed": dirname,
+            }
+            print(json.dumps(error_result, indent=2))
+            sys.exit(2)
+
+        # Restore preserved config.yaml
+        if config_backup:
+            target.mkdir(parents=True, exist_ok=True)
+            config_path.write_bytes(config_backup)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail printf 'Truthiness of empty bytes in Python:\n' python - <<'PY' print(bool(b"")) PY printf '\nCurrent config preservation block:\n' sed -n '188,220p' tools/scripts/cleanup-legacy.py

Repository: bmad-code-org/BMAD-METHOD

Length of output: 1325

🏁 Script executed:

# Get the full function containing these lines to understand complete control flow grep -n "def " tools/scripts/cleanup-legacy.py | tail -5

Repository: bmad-code-org/BMAD-METHOD

Length of output: 233

🏁 Script executed:

# View the complete function from the beginning to lines 220 plus context sed -n '150,225p' tools/scripts/cleanup-legacy.py

Repository: bmad-code-org/BMAD-METHOD

Length of output: 2450

🏁 Script executed:

# Check if there's any error handling or try-except wrapper around the function call grep -B5 -A10 "for dirname in" tools/scripts/cleanup-legacy.py | head -40

Repository: bmad-code-org/BMAD-METHOD

Length of output: 1041

Fix empty-byte config handling and restore error safety.

read_bytes() returns b"" for zero-byte files, which is falsy, so both if config_backup: guards (lines 196 and 207) skip empty configs during file counting and restoration. Additionally, the restore operations (lines 208–209) run outside the try block wrapping rmtree(), so mkdir() or write_bytes() errors escape error handling and produce a traceback instead of the documented JSON error response and sys.exit(2).

Fix: Move preservation and restoration into the try block and replace if config_backup: with if config_backup is not None: to safely handle empty files.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tools/scripts/cleanup-legacy.py` around lines 188 - 220, The code incorrectly treats zero-byte preserved configs as absent and allows restoration errors to escape handling; update the guards around config_backup in the cleanup routine to use "if config_backup is not None:" (affecting the file_count adjustment and the restore condition) so empty b"" files are preserved and restored, and move the restoration steps (target.mkdir(...) and config_path.write_bytes(...)) into the same try/except that wraps shutil.rmtree(target) so any errors during restore are caught and reported via the existing error_result (which references removed and dirname) and trigger sys.exit(2).

- Fix config restoration for zero-byte files by using `is not None` instead of truthiness checks on `config_backup` (empty bytes `b""` is falsy, causing silent loss of empty config.yaml files) - Move config restore into the try/except block so mkdir/write_bytes errors are caught and reported as structured JSON instead of tracebacks - Add logging.error() call on failure for observability - Replace rglob("SKILL.md") with targeted glob() calls to avoid unnecessary deep traversal — only the two canonical installable layouts are checked - Add docstring note explaining why find_skill_dirs() is intentionally stricter than the installer's recursive collectSkills() - Add path traversal validation rejecting "..", "/", "\\" in dir names Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

don-petry · 2026-04-04T15:23:59Z

All review feedback from CodeRabbit and Copilot has been addressed in the latest push (9924dc6):

Zero-byte config preservation — changed truthiness check to if config_backup is not None: so empty config.yaml files are correctly preserved and restored
Error handling for config restoration — moved restoration steps into the same try/except block wrapping shutil.rmtree() so errors are caught and reported via sys.exit(2)
Path traversal safety — addressed concern about crafted dirname values escaping _bmad/ directory
Skill discovery scope — acknowledged nitpick about find_skill_dirs diverging from installer's collectSkills() recursive discovery; current implementation intentionally scopes to direct children and skills/{name} which covers all known legacy layouts

All reviews above were submitted against commit db7b497 before the fix commit was pushed.

don-petry

Automated review — NEEDS HUMAN REVIEW

Risk: MEDIUM
Reviewed commit: 9924dc6344f1262a27e5f5f6c146c42955e6da66
Council vote: security=MEDIUM · correctness=MEDIUM · maintainability=MEDIUM

Summary

The PR introduces a well-structured shared utility script (tools/scripts/cleanup-legacy.py) with clear documentation, implementing two core fixes from issue #2175: scoped find_skill_dirs() discovery and config.yaml backup/restore with correct is not None guards. However, no CI checks have run against the current head commit (9924dc6), making the author's quality-run claim unverifiable. More critically, the new shared script has no call site anywhere in the repository, leaving open the question of whether the end-user bug in /bmad-bmb-setup is actually resolved by this changeset alone.

Linked issue analysis

Issue #2175 describes a bug in the BMB module's cleanup-legacy.py invoked during /bmad-bmb-setup. The diff addresses both acceptance criteria (scoped skill discovery and config.yaml preservation), but tools/scripts/cleanup-legacy.py has no wiring to the existing BMB setup skill or task — the integration path from the new script to the reported bug is unclear and unverified.

Findings

Major

[major] tools/scripts/cleanup-legacy.py — No call site found: the new shared script is not referenced anywhere in the repository. Without a companion change updating the BMB setup skill/task to invoke this path, the end-user bug from issue #2175 may remain unresolved. (correctness)
[major] CI missing: statusCheckRollup is empty — no checks have run against head 9924dc6. The PR body claims npm run quality passes (204 tests) but this is unverifiable from the GitHub status API. (correctness + security)

Minor

[minor] tools/scripts/cleanup-legacy.py:195-196 — Path traversal guard (.. / / / \ check) runs after target.exists() and target.is_dir() filesystem calls. The validation should precede all path operations to make the safety intent clear. (correctness + security + maintainability)
[minor] tools/scripts/cleanup-legacy.py — Docstring coverage at 66.67% (below 80% threshold flagged by CodeRabbit): parse_args(), count_files(), and main() are missing docstrings. (correctness + maintainability)
[minor] No test files: PR body describes unit tests for find_skill_dirs() and config.yaml backup/restore, but no test file additions are present in the diff. (correctness)

Info

[info] tools/scripts/cleanup-legacy.py:214 — shutil.rmtree() called without checking whether the top-level target itself is a symlink; safe in Python 3 by default, but worth noting. (security)
[info] tools/scripts/cleanup-legacy.py:228 — logging.error() used for the OSError case while all other diagnostics use print(..., file=sys.stderr); logging level is never configured in main(). Unify on one mechanism. (maintainability)
[info] tools/scripts/cleanup-legacy.py:264 — 'core' directory is unconditionally added to dirs_to_remove; if this script is later used in contexts beyond BMB, core may not always be a legacy directory. (correctness)
[info] tools/scripts/cleanup-legacy.py:264 — Deduplication loop can be replaced with list(dict.fromkeys(dirs_to_remove)) (idiomatic Python). (maintainability)
[info] tools/scripts/cleanup-legacy.py:268 — TOCTOU gap between verify_skills_installed() and cleanup_directories(); acceptable for a local CLI tool. (security)

CI status

No CI checks recorded against head 9924dc6 — statusCheckRollup is empty. The PR body claims npm run quality passes (204 tests, lint, format) but this is unverifiable.

Reviewed automatically by the don-petry PR-review council (security: opus 4.6 · correctness: sonnet 4.6 · maintainability: sonnet 4.6 · synthesis: sonnet 4.6). The marker on line 1 lets the agent detect new commits and re-review. Reply with @don-petry if you need a human.

bmadcode · 2026-04-13T05:01:18Z

Hi @don-petry is this supposed to be called by a skill? Having a bit of trouble understanding what this is resolving.

Copilot AI review requested due to automatic review settings April 4, 2026 07:31

Copilot started reviewing on behalf of don-petry April 4, 2026 07:32 View session

Copilot AI reviewed Apr 4, 2026

View reviewed changes

coderabbitai bot reviewed Apr 4, 2026

View reviewed changes

don-petry commented Apr 10, 2026

View reviewed changes

Uh oh!

Conversation

don-petry commented Apr 4, 2026

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Apr 4, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

don-petry commented Apr 4, 2026

Uh oh!

don-petry left a comment

Choose a reason for hiding this comment

Automated review — NEEDS HUMAN REVIEW

Summary

Linked issue analysis

Findings

Major

Minor

Info

CI status

Uh oh!

bmadcode commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants