rules-test-coverage-trim-2026-05: trim rill-*.md (1308→943) + wire 7 + add 3 skill tests #33
Open
tarr1124 wants to merge 12 commits into
Add test scaffolding so test/run-all.sh exercises 11 OSS skills
(briefing, clip-tweet, close, distill, eval, focus, inspect, newsletter,
page, repair, solve) instead of /distill only. Three new tests are added
(test-eval.sh, test-inspect.sh, test-repair.sh, all smoke tests modelled
on test-distill.sh).
Test-only fixes to the existing eight tests:
- Optional-ize the `cp $REPO_DIR/{taxonomy,CLAUDE}.md` overlays.
OSS rill does not ship taxonomy.md or CLAUDE.md at the repo root, so
the unconditional cp aborted every test on a clean clone. The fixture
copies under test/fixtures/ are already in place, so optional overlays
preserve the vault-overlay intent without breaking OSS runs.
- test-briefing.sh: scalarize SC-02..SC-05 grep counts via
`{ ... ; } | tr -d '[:space:]'` so assert_gt receives a single integer
(the multi-line stdout otherwise triggered an `((...))` syntax error).
- test-solve.sh: raise `--max-turns` from 50 to 200, and accept `done`
in the P4-01 task-status assertion alongside `waiting` and `open`
(the new /solve may legitimately reach `done` within a single run).
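The scalarization in the test-briefing.sh fix can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `assert_gt` is a hypothetical stand-in for the suite's helper, the fixture file is made up, and the `grep -c ... || echo 0` fallback is only one plausible way a count could end up spanning two lines.

```shell
#!/usr/bin/env bash
# Sketch only: assert_gt is a hypothetical stand-in, not the suite's helper.
assert_gt() {
  local actual="$1" threshold="$2"
  if (( actual > threshold )); then
    echo "PASS"
  else
    echo "FAIL"
  fi
}

fixture="$(mktemp)"
printf 'alpha\nbeta\nalpha\n' > "$fixture"

# Assumed failure mode: grep -c prints 0 and exits non-zero on no match,
# so the fallback echo adds a second line to the captured output.
raw="$(grep -c 'gamma' "$fixture" || echo 0)"
raw_lines="$(printf '%s\n' "$raw" | wc -l | tr -d '[:space:]')"

# Grouping the commands and stripping whitespace leaves a single token.
hits="$({ grep -c 'alpha' "$fixture"; } | tr -d '[:space:]')"
result="$(assert_gt "$hits" 1)"

echo "raw spans $raw_lines lines; hits=$hits; $result"
rm -f "$fixture"
```

A two-line value such as the one in `raw` is what breaks an arithmetic comparison; the stripped value in `hits` is a single integer that `((...))` can evaluate.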
Also add test/results/ to .gitignore (timestamped run artifacts).
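The optional overlay copy from the first fix above might look roughly like the following. This is a sketch under stated assumptions, not the code in the test scripts: `VAULT_DIR` is a hypothetical destination name, and temporary directories stand in for the real repo and vault layout.

```shell
#!/usr/bin/env bash
# Sketch only: VAULT_DIR is an assumed name; temp dirs stand in for the real layout.
REPO_DIR="$(mktemp -d)"
VAULT_DIR="$(mktemp -d)"

# Simulate a checkout that ships only one of the two overlay files.
echo "# overlay" > "$REPO_DIR/CLAUDE.md"

copied=""
for overlay in taxonomy.md CLAUDE.md; do
  # Copy only when present, so a missing overlay does not abort the test.
  if [ -f "$REPO_DIR/$overlay" ]; then
    cp "$REPO_DIR/$overlay" "$VAULT_DIR/$overlay"
    copied="$copied $overlay"
  fi
done

echo "copied:$copied"
rm -rf "$REPO_DIR" "$VAULT_DIR"
```

On a clean clone with neither file at the repo root, the loop copies nothing and the test carries on with the fixture copies.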
Compress operational notes and rationale paragraphs while preserving all technical invariants (lane definitions, 5-case branching, two-channel write invariant, Tier 3 denylist, PUBLIC repo guard). No semantic changes.
Compress prose in Substance section and Anti-patterns; shorten Good Example. Removed redundant scaffold paragraphs. No semantic changes.
Compress section descriptions and remove redundant scaffold paragraphs.
Compress notes/entity principles; merge redundant projects/ deprecation paragraphs.
Compress section descriptions while preserving status transition rules and File-First principle.
Compress reports/ and pages/ subsection descriptions while preserving recipe pair convention.
Compress subsection bullets and cross-cutting rules.
Compress structure/.processed/subdirectory bullets.
Compress bullets in Tag Management, Reference Rules, Entity References.
…178→154) Compress Critical Invariants and detailed rules index in rill-core.md; replace rill-tasks.md Good Example block with abridged prose form.
…e counter
[P1] test-eval.sh: replace 1-entry 'queries: [{id, text}]' fixture with 7-entry top-level sequence matching real eval/queries.yaml format (id/query/type/scope), spanning the 4 supported types for Phase 2 stratified sampling.
[P2] run-all.sh: replace '((TOTAL_FAIL++))' with 'TOTAL_FAIL=$((TOTAL_FAIL + 1))'. When TOTAL_FAIL=0 the post-increment expression evaluates to 0, so '((...))' returns exit status 1, which under set -e would abort run-all.sh on the first failing suite.
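The difference between the two counter forms can be reproduced in isolation. The sketch below is illustrative only and is not taken from run-all.sh: it runs each form in a child `bash` with `set -e` and records whether the script reached the line after the increment.

```shell
#!/usr/bin/env bash
# Sketch only: each form runs in a child bash with `set -e`.

# Post-increment from 0: the expression evaluates to 0, so (( )) returns
# status 1 and `set -e` stops the child before the echo.
old_form="$(bash -c 'set -e; TOTAL_FAIL=0; ((TOTAL_FAIL++)); echo "reached:$TOTAL_FAIL"' 2>/dev/null || echo "aborted")"

# A plain assignment with arithmetic expansion returns status 0.
new_form="$(bash -c 'set -e; TOTAL_FAIL=0; TOTAL_FAIL=$((TOTAL_FAIL + 1)); echo "reached:$TOTAL_FAIL"')"

echo "old: $old_form"
echo "new: $new_form"
```

Only the first increment from 0 is affected, which is why the abort would show up on the first failing suite and not on later ones.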
Summary
- `.claude/rules/rill-*.md`: 10 files, 1308→943 lines (28% reduction). No semantic changes: all technical invariants preserved (lane definitions, 5-case branching, two-channel write, Tier 3 denylist, ADR-046 D46-7 modes, PUBLIC repo guards, etc.). File-by-file commits enable bisect.
- `test/run-all.sh` now wires 11 skill tests (distill + briefing + clip-tweet + close + focus + newsletter + page + solve + new inspect + repair + eval). Existing test-briefing/test-solve test bugs fixed. New smoke tests for `/inspect`, `/repair`, `/eval`.
- `/eval` queries.yaml fixture format + set-e-safe `run-all.sh` failure counter (`((TOTAL_FAIL++))` → `TOTAL_FAIL=$((TOTAL_FAIL + 1))`).

Verification
- … `/page` transient bash quirk), 134/144 assertion PASS
- … `tasks/fix-eval-test-comprehensiveness-2026-05/_task.md`, draft): issues are about `/eval` smoke test depth (stratified-sampling + EV-02 strictness), out of trim scope

Test plan
- `bash test/run-all.sh` exit 0 in a clean clone

🤖 Generated with Claude Code