feat: add --skip-subtrees-wo-markdown option to prune non-documentation subtrees #144

Open
geopanther wants to merge 4 commits into iamjackg:master from geopanther:feature/skip-subtrees-without-markdown
@geopanther

Summary

Adds a new --skip-subtrees-wo-markdown CLI option that, when uploading a directory, skips any subtree that doesn't contain markdown files. This avoids creating empty Confluence pages for non-documentation directories like images/, data/, or config/.

Motivation

When using md2cf to upload a project folder, directories containing only non-markdown assets (images, configs, data files) are traversed and produce empty folder pages in Confluence. There's currently no way to prune these automatically — --skip-empty and --collapse-empty only handle folders without direct children, not entire subtrees. This option provides a clean way to focus uploads on documentation content only.

Changes

Core logic (md2cf/document.py)

  • New _subtree_has_markdown(path, git_repo) helper that recursively checks whether a directory tree contains any .md files, respecting .gitignore rules
  • New skip_subtrees_wo_markdown parameter on get_pages_from_directory() that prunes directories[:] in-place during os.walk, preventing descent into subtrees without markdown
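
The two pieces above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper here takes only a path and omits the gitignore handling (the real helper also receives git_repo), and walk_directories is a hypothetical stand-in for the traversal inside get_pages_from_directory().

```python
import os
from pathlib import Path
from typing import List


def subtree_has_markdown(path: Path) -> bool:
    """Return True if any file at or below `path` has a .md suffix.

    Simplified assumption: no gitignore filtering is applied here.
    """
    for _root, _dirs, files in os.walk(path):
        if any(name.endswith(".md") for name in files):
            return True
    return False


def walk_directories(base: Path, skip_subtrees_wo_markdown: bool = False) -> List[Path]:
    """Return every directory os.walk visits, optionally pruning subtrees.

    Hypothetical stand-in for the directory traversal in the PR.
    """
    visited = []
    for current, directories, _files in os.walk(base):
        if skip_subtrees_wo_markdown:
            # Slice assignment mutates the list os.walk holds, so pruned
            # directories are never entered.
            directories[:] = [
                d for d in directories
                if subtree_has_markdown(Path(current) / d)
            ]
        visited.append(Path(current))
    return visited
```

With the flag enabled, a tree containing docs/guide/intro.md and images/icons/logo.png would yield only the root, docs, and docs/guide; images and everything beneath it is never visited.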

CLI wiring (md2cf/__main__.py)

  • New --skip-subtrees-wo-markdown flag in the directory arguments group
  • Passed through collect_pages_to_upload() → get_pages_from_directory()
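
A rough sketch of how such wiring might look with argparse. The flag name and the two function names come from this PR; the parser layout, the positional path argument, the help text, and both function bodies are assumptions made for illustration and do not reflect md2cf's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; md2cf's real parser defines many more options.
    parser = argparse.ArgumentParser(prog="md2cf-sketch")
    parser.add_argument("path")
    dir_group = parser.add_argument_group("directory arguments")
    dir_group.add_argument(
        "--skip-subtrees-wo-markdown",
        action="store_true",  # off by default, enabled only when passed
        help="skip subtrees that contain no markdown files",
    )
    return parser


def get_pages_from_directory(path, skip_subtrees_wo_markdown=False):
    # Stand-in body: echoes what it received instead of building pages.
    return {"path": path, "skip_subtrees_wo_markdown": skip_subtrees_wo_markdown}


def collect_pages_to_upload(args):
    # argparse converts the dashes in the flag name to underscores.
    return get_pages_from_directory(
        args.path,
        skip_subtrees_wo_markdown=args.skip_subtrees_wo_markdown,
    )
```

Because the flag uses store_true, omitting it leaves the parameter False and the traversal unchanged.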

Tests (test_package/unit/test_document.py)

Five new test cases using pyfakefs:

| Test | Scenario |
| --- | --- |
| test_..._skip_subtrees_wo_markdown | Basic pruning: sibling dirs without .md files are removed |
| test_..._nested | Deeply nested markdown is preserved; sibling non-md subtrees are pruned |
| test_..._root_has_md | Root-level .md files kept; child subtrees without markdown pruned |
| test_..._all_empty | No subtree has markdown → empty result |
| test_..._disabled | Flag off vs. on comparison: non-md subtrees only pruned when enabled |

Documentation (README.md)

  • New "Skipping subtrees without markdown" section under "Directory arguments" with usage description and collapsible example

Design decisions

  • Pruning happens at the os.walk level by modifying directories[:] in-place, which prevents os.walk from descending into pruned subtrees at all — no wasted I/O
  • Gitignore is respected in the subtree check, consistent with existing behavior
  • Off by default — no change to existing behavior unless explicitly opted in
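
To illustrate the first point, here is a small sketch contrasting slice assignment with plain rebinding. The function name and the hard-coded "images" filter are hypothetical; the behaviour shown is that of os.walk in its default top-down mode, which is the only mode in which pruning takes effect.

```python
import os


def visited_dirs(base, in_place):
    """Return sorted relative paths of every directory os.walk visits.

    Drops any directory named "images" from the walk, either by mutating
    the list in place or by rebinding the local name.
    """
    seen = []
    for current, directories, _files in os.walk(base):  # topdown=True by default
        kept = [d for d in directories if d != "images"]
        if in_place:
            # Mutates the list object that os.walk consults for descent.
            directories[:] = kept
        else:
            # Rebinds the local variable only; os.walk still descends.
            directories = kept
        seen.append(os.path.relpath(current, base))
    return sorted(seen)
```

With in_place=True the images directory and its children never appear in the result; with in_place=False they are still walked, which is why the slice form matters.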

…traversal

Add _subtree_has_markdown() helper that walks a directory tree checking
for .md files while respecting gitignore rules. Add skip_subtrees_wo_markdown
parameter to get_pages_from_directory() that prunes directories in-place
during os.walk to skip subtrees containing no markdown files.

Add the --skip-subtrees-wo-markdown flag to the directory arguments group
and wire it through collect_pages_to_upload() to get_pages_from_directory().

Add 5 tests covering: basic subtree pruning, deeply nested markdown,
root-level markdown with pruned subtrees, all-empty directory case,
and flag-disabled behavior comparison.

Add a new section under 'Directory arguments' with a description and
collapsible example showing how subtrees without markdown are pruned.