fix: root-cause CI failures inherited from sync#39
Merged
Conversation
PR #38 merged with 5 failing tests. Each had a real root cause now fixed, verified locally before push. 1. tree-sitter-language-pack 1.6.3 breaks 3 bash detector tests 1.6.3 (released 2026-04-23) is a deliberate breaking refactor by upstream (kreuzberg-dev): the public module moved from `tree_sitter_language_pack` to `_native`, removing the top-level package that desloppify's parser bootstrap imports. ImportError gets caught by PARSE_INIT_ERRORS in analysis/unused_imports.py and returns empty findings, masking the regression as test assertion failures rather than import failures. Verified: on a clean Python 3.12 venv with 1.6.3, all 3 bash tests fail with `[] == ['helpers']` etc. Pin to 1.6.2 → all 5 bash tests pass. Fix: cap at <1.6.3 in pyproject.toml (treesitter + full extras) and update test_treesitter_language_pack_is_capped_below_incompatible_release to assert the new cap. 2. test_successful_dedicated_install_rovodev: missing _read_local_docs_file mock OP's update_skill/cmd.py prefers local docs/ files over the download path (sensible: local-first when running from a checkout). Test mocked only `_download`, so _read_local_docs_file('SKILL.md') and ('ROVODEV.md') returned real local content (the actual ROVODEV.md shipping with the package), bypassing the mock. The test class already had this mock pattern on test_download_failure and test_bad_content; the three "successful_install" tests were inconsistent. Added @patch on `_read_local_docs_file, return_value=None` to all three to consistently exercise the download path the tests intend to test. Verified all 25 tests in the file pass. 3. test_show_structural_loads_medium_confidence_matches: scope.py regression The sync's per-file restoration of OP versions reverted scope.py's load_matches to a stale variant that routes everything through build_work_queue, which applies standalone_threshold filtering (medium structural < high threshold → filtered). Upstream's evolved load_matches has an explicit "show should surface persisted matching issues even when a detector has a higher standalone confidence threshold" path that calls build_issue_items directly with forced_ids, bypassing the filter. OP's only difference in scope.py was a docstring example mentioning advocacy_language — not load-bearing code. Upstream's load_matches handles advocacy scopes identically. Fix: take upstream's scope.py + revert engine/work_queue.py to upstream (no longer needs build_work_queue_for_visibility export, which only OP's old scope.py used). Verified all 42 tests in test_cmd_show.py pass. Excluded from this PR: ~30 additional latent failures visible in Python 3.12 local runs that don't appear in CI on Python 3.11. Separate triage.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR stabilizes the work-queue facade (removes visibility exports), refactors show command matching to use persisted issue state, tightens tree-sitter-language-pack bounds, and updates tests to mock local docs reads and skip Bash tests when the language pack is missing. ChangesWork-queue API stabilization and persisted-state matching refactor
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
🚥 Pre-merge checks | ✅ 7 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (7 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
tests-core runs via `make tests` → `install-ci-tools` which installs the
core package without [full] extras (no tree-sitter-language-pack). The
new test_bash_unused_imports.py module (came in via the sync from upstream)
lacked the skipif guard that every other test_treesitter module uses:
pytestmark = pytest.mark.skipif(
not is_available(),
reason="tree-sitter-language-pack not installed",
)
Without the guard, the parser-init catch in detect_unused_imports returns
empty findings on missing parser, and the test assertions fail with
"assert [] == ['helpers']" — exactly the symptom the previous commit's
1.6.3 pin fixed for tests-full but couldn't fix for tests-core (parser
not installed at all there).
Verified locally: tests still pass when tree-sitter is installed; will
skip cleanly when it isn't.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #38 merged with 5 failing tests. This PR fixes each at the root, verified locally before push.
The 5 failures, root-caused
1. Three
test_bash_unused_importsfailures → tree-sitter-language-pack 1.6.3 is brokentree-sitter-language-pack 1.6.3 (released 2026-04-23 by kreuzberg-dev) is a deliberate breaking refactor: the public module moved from
tree_sitter_language_packto_native, removing the top-level package name thatdesloppify/languages/_framework/treesitter/analysis/extractors.py:102imports. TheImportErrorgets swallowed byPARSE_INIT_ERRORSinanalysis/unused_imports.py:65, so failures surface as silently-empty findings rather thanImportError— that's why CI reported "assert [] == ['helpers']" instead of a parser-init error.Verification: clean Python 3.12 venv on 1.6.3 → 3 bash tests fail; downgrade to 1.6.2 → all bash tests pass.
Fix: pin to
>=0.3,<1.6.3inpyproject.toml(bothtreesitterandfullextras). Updatedtest_treesitter_language_pack_is_capped_below_incompatible_releaseto assert the new cap — the test exists to ensure the cap doesn't accidentally regress, and the cap value was hardcoded as<1.8.Upstream peteromallet/desloppify is also affected — their recent CI runs are failing too. This fix can be sent upstream as a follow-up.
2.
test_successful_dedicated_install_rovodev→ missing_read_local_docs_filemockOP's
update_skill/cmd.pyprefers localdocs/files over remote download (local-first when running from a checkout). The test mocked only_download, so_read_local_docs_file('ROVODEV.md')returned the real shippingdocs/ROVODEV.mdcontent (which doesn't contain the string"rovodev overlay"), bypassing the mock entirely.The test class already had this
_read_local_docs_filemock pattern ontest_download_failureandtest_bad_content(added together with the local-first feature). The threetest_successful_*tests were inconsistent — written or copied before the mock pattern was established.Fix: add
@patch("desloppify.app.commands.update_skill.cmd._read_local_docs_file", return_value=None)totest_successful_dedicated_install,test_successful_dedicated_install_rovodev, andtest_successful_shared_install. Consistently exercises the download path the tests intend to test. All 25 tests in the file now pass.3.
test_show_structural_loads_medium_confidence_matches→ scope.py regressionThe sync's per-file restoration of OP versions reverted
app/commands/show/scope.pyto a stale variant that routes everything throughbuild_work_queue. That path appliesstandalone_thresholdfiltering inranking.py:148-156(medium-confidence structural <highthreshold → filtered out).Upstream's evolved
load_matcheshas an explicit "show should surface persisted matching issues even when a detector has a higher standalone confidence threshold" branch: it callsbuild_issue_itemsdirectly withforced_ids=_matching_issue_ids_for_scope(state, scope), bypassing the confidence-threshold filter. This is the feature the test verifies.OP's only divergence in scope.py was a docstring example mentioning
show advocacy_language— not load-bearing code; upstream'sload_matcheshandles advocacy scopes identically via the samescope_matcheshelper.Fix: take upstream's
scope.pywholesale + revertengine/work_queue.pyto upstream (no longer needsbuild_work_queue_for_visibilityexport, which only OP's old scope.py used). All 42 tests intest_cmd_show.pypass.What's NOT in this PR
On Python 3.12 locally (not CI's 3.11), I see ~30 additional latent test failures across
test_review_commands, detector adapters, and treesitter imports. These don't appear in CI on Python 3.11. Separate triage — out of scope for this PR's "fix the 5 documented findings" mandate.Test plan
Summary by CodeRabbit
Refactor
Chores
Tests