From af1673e5df3cb5dde619ec878ce637f8b0046d0a Mon Sep 17 00:00:00 2001
From: Florian DAVID <150798857+fdaviddpt@users.noreply.github.com>
Date: Fri, 26 Jun 2026 10:10:47 +0200
Subject: [PATCH] feat(validators,git-diff): exclude glob + wrong-CWD guards
 (#335, #336)

Two dispatch/op ergonomics fixes:

#335 - per-spec `exclude` glob on validators & formatters. A spec may set
`"exclude": "*tests/*"` (string or list, skip-if-any) to skip files even when
`match` hits. Motivating case: editing a PHPUnit test fired phpmd's
TooManyPublicMethods - noise, since no real gate scans tests. Per-spec on
purpose: a blanket skip would break phpunit, which must run on tests.

#336 - git-diff:PATH no longer reports a missing/untracked path as "No changes."
(a silent false-negative that masks a wrong CWD). Path mode checks `git ls-files`
first: missing -> warn + exit 1, untracked -> warn, tracked-clean -> "No changes."
Every mode stamps a `Repo: <toplevel>` header so a wrong-repo invocation is
visible. grep's path-not-found error now carries the CWD too.

Tests: 12 new (exclude dispatch x6, git-diff guard/header x5, grep cwd x1).
Docs: validators.md + SCHEMA.md (exclude key), presets/git.md (guard + header).
Co-Authored-By: Max <noreply>
---
 CHANGELOG.md             |  3 +++
 docs/presets/git.md      |  2 +-
 docs/validators.md       |  1 +
 presets/git/diff.py      | 14 ++++++++++
 supertool.py             | 20 +++++++++++++-
 tests/test_formatters.py | 18 +++++++++++++
 tests/test_git_diff.py   | 57 ++++++++++++++++++++++++++++++++++++++++
 tests/test_grep.py       |  9 +++++++
 tests/test_validators.py | 25 ++++++++++++++++++
 validators/SCHEMA.md     |  1 +
 10 files changed, 148 insertions(+), 2 deletions(-)
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5ad62e9..52d2dee 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- **Validators/formatters gained a per-spec `exclude` glob** — a validator or formatter spec may now set `"exclude": "*tests/*"` (or a list of globs, skip-if-any-matches) to skip files even when its `match` glob hits. The motivating case: editing a PHPUnit test file fired the `phpmd` validator, which reported `TooManyPublicMethods` (every `testXxx()` is public) — pure noise, since no real gate scans tests (CI mess-detection scans the source dir only; the pre-push git-hook skips any `/tests/` path). Validators were gated by a single positive `match` glob with no exclude, so there was no config-only fix. `exclude` is per-spec on purpose: a blanket "skip tests" would break `phpunit`, which *must* run on test files. Generalizes the prior `PSR_EXCLUDE` env hack into a first-class config key. Closes [#335](https://github.com/Digital-Process-Tools/claude-supertool/issues/335).
 - **`gl-mr` now lists the per-file name-status by default** — the MR dashboard prints a `## Files (N)` block with one `A`/`D`/`R`/`M` line per changed path, sourced from the paginated `merge_requests/:iid/diffs` endpoint (so the change type comes straight from GitLab's `new_file`/`deleted_file`/`renamed_file` flags, no git shellout). "What got removed?" is the high-signal question when reviewing an MR, and previously `gl-mr` gave only a file *count* — forcing a separate `git diff --name-status master...branch` round-trip, exactly the borrowed round-trip the variants exist to kill (the concrete trigger was auditing a deleted-migration concern on a real MR). The list is capped at 50 files with a `… +N more` marker so large MRs don't blow context; `gl-mr:N:full` uncaps it (paginating up to 500 files) alongside the existing comment uncap. Any API/parse failure silently omits the block. Closes [#332](https://github.com/Digital-Process-Tools/claude-supertool/issues/332).
 - **`git-status` gained a `:full` mode** — `git-status:full` (alias `:porcelain`) uncaps every list in the dashboard (staged/unstaged/untracked files, other branches, stashes), printing the complete untruncated set instead of the default `... (N more)` markers. The default view stays capped (20 staged/unstaged, 10 untracked/branches, 5 stashes) — a cheap overview that answers ahead/behind + dirty + MR. The bug wasn't the truncation (correct for the common case) but the lack of an escape hatch: driving precise staging — e.g. excluding a few pre-existing untracked items from a large commit — needs the full machine-readable list, which previously forced a drop back to raw `git status --porcelain`. Closes [#330](https://github.com/Digital-Process-Tools/claude-supertool/issues/330).
 - **`gl-job` / `gh-job` gained a `:fail` suffix** — `gl-job:ID:fail` (alias `:errors`) prints only the matched failure blocks with no tail, the tight triage view. It applies the same per-job pattern table as the default mode (rector → diff lines, phpstan → 🪪/type markers, unit → `FAILURES!`/`Failed asserting`), so a red job shows just its actionable failures instead of the default's blocks-plus-80-line-tail. This names the discoverable front door for a behavior that previously only existed as the undocumented `:errors` mode on `gl-job` (and was entirely absent on `gh-job`); `:errors` stays as a back-compat alias. The default (no suffix), `:grep:PATTERN`, and `:raw` are unchanged. Closes [#326](https://github.com/Digital-Process-Tools/claude-supertool/issues/326).
@@ -16,6 +17,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
+- **`git-diff:PATH` no longer reports a missing/untracked path as "No changes."** — a path absent or untracked in the current repo produced an empty diff that printed `No changes.`, indistinguishable from a clean tracked file. Combined with a stale CWD (e.g. a `cd` into another repo) this is a silent false-negative: the op reads as "nothing changed" when it actually looked in the wrong place. Path mode now checks `git ls-files` first — a missing path warns `not found under <cwd> — wrong CWD?` and exits 1, an untracked on-disk path warns `untracked (not in git)`, and only a genuinely clean tracked file still says `No changes.`. Every mode also stamps a `Repo: <toplevel>` header so a wrong-repo invocation is visible at a glance (the trigger: `git-diff:.supertool.json` silently returned "No changes" while anchored to the wrong clone). Closes [#336](https://github.com/Digital-Process-Tools/claude-supertool/issues/336).
+- **`grep` path-not-found error now includes the CWD** — `ERROR: path not found: <path> (cwd: <cwd>) — wrong CWD?` instead of the bare path, so the recurring "stale cwd makes a relative path silently miss" trap is diagnosable from the message alone. Part of [#336](https://github.com/Digital-Process-Tools/claude-supertool/issues/336).
 - **`git-commit` no longer aborts on an already-staged deletion** — listing a `git rm`'d path (gone from disk) alongside present files made the op's `git add` step fail with `pathspec did not match any files`, killing the whole commit. The op now drops paths that are already staged as deletions from the `git add` step (their deletion is already staged and gets committed), so a mixed changeset of modifications + new files + deletions lands in one commit. Genuinely-unknown paths still error as before. Closes [#324](https://github.com/Digital-Process-Tools/claude-supertool/issues/324).
 
 ## [0.19.0] - 2026-06-24
diff --git a/docs/presets/git.md b/docs/presets/git.md
index 1eba481..0b52b44 100644
--- a/docs/presets/git.md
+++ b/docs/presets/git.md
@@ -16,7 +16,7 @@ Git investigation and workflow ops. Replaces the 4-6 raw `git` calls you'd norma
 | `git-blame` | `git-blame:PATH:LINE[:N]` | Blame for N lines (default 5) around a specific line number |
 | `git-checkout` | `git-checkout:REF` | Switch to branch/tag/SHA — reports tracking state, dirty files, last commits after switch |
 | `git-diverge` | `git-diverge:BRANCH[:BASE]` | Branch vs base: ahead/behind counts, commit list, changed files, +/− line totals |
-| `git-diff` | `git-diff[:staged\|:branch[:BASE]\|:PATH]` | Review-aware diff (working / `staged` / `branch` merge-base / `PATH`): files grouped by kind + shortstat, red-flag scan of **added** lines (debug code, conflict markers; reported `file:line`), forbidden-path guard, missing-test pairing, next-step hints. Generic defaults built in; project policy via config (below) |
+| `git-diff` | `git-diff[:staged\|:branch[:BASE]\|:PATH]` | Review-aware diff (working / `staged` / `branch` merge-base / `PATH`): files grouped by kind + shortstat, red-flag scan of **added** lines (debug code, conflict markers; reported `file:line`), forbidden-path guard, missing-test pairing, next-step hints. Generic defaults built in; project policy via config (below). Every mode stamps a `Repo: <toplevel>` header so a wrong-CWD invocation is obvious. In `:PATH` mode a path that is **missing** under the current repo warns `not found … — wrong CWD?` and exits 1, an **untracked** on-disk path warns `untracked (not in git)` — neither is silently reported as `No changes.` (which now means only a clean *tracked* file) |
 | `git-merge` | `git-merge:REF` | Merge REF — on conflict surfaces the UU file list, conflict markers, and ours/theirs SHAs |
 | `git-conflicts` | `git-conflicts` | List all UU files + every conflict block + abort hint |
 | `git-resolve` | `git-resolve:::SIDE:::PATH[,PATH...][:::BLOCKS]` | Pick `ours`/`theirs`/`both` for one file, a comma-separated list, or `all` — stages and prints the continue command. `both` is a union: it strips the conflict markers and keeps both sides (ours then theirs), like git's `merge=union` driver — use it when both branches added different non-overlapping lines. Optional **`BLOCKS`** selector (e.g. `1,3`) resolves only those 1-indexed conflict blocks of a **single** file, numbered exactly as `git-conflicts` lists them; one side per call (mixed sides → run twice). A **partial** resolve leaves the other blocks' markers in place by design, so the file stays conflicted and **unstaged** — the receipt reads `N of M block(s) resolved, file still conflicted`; only when the selector covers every block does the file go clean and get staged. **Self-verifies before staging:** a leftover `<<<<<<<` / `>>>>>>>` is a hard fail (file left unstaged), and each resolved file's receipt carries a warn-only validator digest (`markers: clean \| validate: ok`) |
diff --git a/docs/validators.md b/docs/validators.md
index 431fe45..a4f890f 100644
--- a/docs/validators.md
+++ b/docs/validators.md
@@ -117,6 +117,7 @@ Full list of `.supertool.json` validator config fields:
 |--------------------|----------------------------------------------------------------------------------------|
 | `cmd`              | Shell command. `{file}` → target path. `{supertool_dir}` → supertool install dir.      |
 | `match`            | Glob filter on the target path (default `*`).                                          |
+| `exclude`          | Glob (or list of globs) to skip even when `match` hits — e.g. `"*tests/*"`. Per-validator on purpose: `phpunit` must still scan tests. |
 | `hooks_into`       | Op names to wrap (subset of `edit`, `replace`, `replace_lines`, `paste`, `vim`).       |
 | `rollback_on_fail` | Restore pre-edit file content if the validator's count went up or ok flipped to false. |
 | `resolve`          | Shell cmd returning an alternate target path (e.g. source-file → test-file).           |
diff --git a/presets/git/diff.py b/presets/git/diff.py
index 0c6a825..53cdd66 100644
--- a/presets/git/diff.py
+++ b/presets/git/diff.py
@@ -275,11 +275,25 @@ def main() -> int:
             return 1
         mode, scope, diff_args = "branch", f"merge-base({base})..HEAD", [f"{base}...HEAD"]
     elif arg1:
+        # Guard: a path absent or untracked in THIS repo produces an empty diff,
+        # which would print "No changes." — indistinguishable from a clean file.
+        # Surface it as an explicit miss so a wrong-CWD invocation is obvious
+        # instead of silently reading as "nothing changed".
+        if not _git(["ls-files", "--", arg1]).stdout.strip():
+            root = _git(["rev-parse", "--show-toplevel"]).stdout.strip()
+            print("# git-diff (path)")
+            print(f"Repo: {root}")
+            if not os.path.exists(arg1):
+                print(f"{_mark('⚠')} {arg1!r} not found under {os.getcwd()} — wrong CWD?")
+                return 1
+            print(f"{_mark('⚠')} {arg1!r} is untracked (not in git).")
+            return 0
         mode, scope, diff_args = "path", f"working vs HEAD — {arg1}", ["HEAD", "--", arg1]
     else:
         mode, scope, diff_args = "working", "working tree vs HEAD", ["HEAD"]
 
     print(f"# git-diff ({mode})")
+    print(f"Repo: {_git(['rev-parse', '--show-toplevel']).stdout.strip()}")
     print(f"Scope: {scope}")
 
     changed = _changed_files(diff_args)
diff --git a/supertool.py b/supertool.py
index 60dc732..6caedd9 100755
--- a/supertool.py
+++ b/supertool.py
@@ -191,6 +191,20 @@ def _match_glob(path: str, pattern: str) -> bool:
     return False
 
 
+def _matches_any_glob(path: str, patterns: Any) -> bool:
+    """True if `path` matches any glob in `patterns`.
+
+    `patterns` may be a single glob string or a list of globs (skip if any
+    matches). Falsy patterns (None, "", []) match nothing. Used by validator
+    and formatter dispatch to honor a per-spec `exclude` glob.
+    """
+    if not patterns:
+        return False
+    if isinstance(patterns, str):
+        patterns = [patterns]
+    return any(_match_glob(path, p) for p in patterns if p)
+
+
 def _expand_braces(pattern: str) -> List[str]:
     """Expand shell-style brace groups `{a,b,c}` into a list of patterns.
 
@@ -1274,7 +1288,7 @@ def op_grep(pattern: str, path: str = ".", limit: int = 0,
         # Could be a glob pattern — check if it expands to anything
         from glob import glob as _glob
         if not _glob(path, recursive=True):
-            return f"ERROR: path not found: {path}\n"
+            return f"ERROR: path not found: {path} (cwd: {os.getcwd()}) — wrong CWD?\n"
 
     excl = _get_exclude_paths("grep", no_exclude)
 
@@ -8525,6 +8539,8 @@ def _applicable_validators(op: str, path: str) -> Dict[str, Dict[str, Any]]:
         glob = spec.get("match", "*")
         if path and glob and not _match_glob(path, glob):
             continue
+        if path and _matches_any_glob(path, spec.get("exclude")):
+            continue
         out[name] = spec
     return out
 
@@ -9119,6 +9135,8 @@ def _applicable_formatters(op: str, path: str) -> Dict[str, Dict[str, Any]]:
         glob = spec.get("match", "*")
         if path and glob and not _match_glob(path, glob):
             continue
+        if path and _matches_any_glob(path, spec.get("exclude")):
+            continue
         out[name] = spec
     return out
 
diff --git a/tests/test_formatters.py b/tests/test_formatters.py
index 1d04c2e..0274610 100644
--- a/tests/test_formatters.py
+++ b/tests/test_formatters.py
@@ -53,6 +53,24 @@ def test_applicable_formatters_ignores_malformed_specs() -> None:
     assert set(supertool._applicable_formatters("edit", "a.json")) == {"good"}
 
 
+def test_applicable_formatters_exclude_mirrors_validators() -> None:
+    _set_formatters({
+        "fmt": {"cmd": "x", "hooks_into": ["edit"], "match": "*.php", "exclude": "*tests/*"},
+    })
+    assert set(supertool._applicable_formatters("edit", "tests/aTest.php")) == set()
+    assert set(supertool._applicable_formatters("edit", "src/Foo.php")) == {"fmt"}
+
+
+def test_applicable_formatters_exclude_list() -> None:
+    _set_formatters({
+        "fmt": {"cmd": "x", "hooks_into": ["edit"], "match": "*.php",
+                "exclude": ["*tests/*", "*/Generated/*"]},
+    })
+    assert set(supertool._applicable_formatters("edit", "tests/aTest.php")) == set()
+    assert set(supertool._applicable_formatters("edit", "src/Generated/Foo.php")) == set()
+    assert set(supertool._applicable_formatters("edit", "src/Foo.php")) == {"fmt"}
+
+
 # ---------------------------------------------------------------------------
 # _formatter_run_one
 # ---------------------------------------------------------------------------
diff --git a/tests/test_git_diff.py b/tests/test_git_diff.py
index b4833d5..e484fdc 100644
--- a/tests/test_git_diff.py
+++ b/tests/test_git_diff.py
@@ -96,6 +96,63 @@ def test_branch_mode_scope(tmp_path: Path) -> None:
     assert "Foo.class.php" in out
 
 
+def _run_raw(repo: Path, *args: str) -> subprocess.CompletedProcess:
+    """Like _run but without the returncode==0 assertion (guard paths return 1)."""
+    return subprocess.run(
+        [sys.executable, str(DIFF), *args],
+        capture_output=True, text=True, encoding="utf-8", cwd=repo, env=dict(os.environ),
+    )
+
+
+def test_path_mode_missing_path_warns_not_silent(tmp_path: Path) -> None:
+    """A path that doesn't exist must NOT read as 'No changes.' — it warns + exits 1.
+
+    Regression for the wrong-CWD trap: git-diff:PATH on an absent file silently
+    printed 'No changes.', indistinguishable from a clean tracked file.
+    """
+    _init_repo(tmp_path)
+    res = _run_raw(tmp_path, "does-not-exist.json")
+    assert res.returncode == 1
+    assert "No changes." not in res.stdout
+    assert "not found" in res.stdout
+    assert "wrong CWD" in res.stdout
+    assert "Repo:" in res.stdout
+
+
+def test_path_mode_untracked_path_warns(tmp_path: Path) -> None:
+    """An on-disk but untracked file warns 'untracked' rather than 'No changes.'"""
+    _init_repo(tmp_path)
+    _write(tmp_path, "scratch.txt", "hi\\n")
+    res = _run_raw(tmp_path, "scratch.txt")
+    assert res.returncode == 0
+    assert "No changes." not in res.stdout
+    assert "untracked" in res.stdout
+
+
+def test_path_mode_tracked_clean_still_says_no_changes(tmp_path: Path) -> None:
+    """Regression guard: a tracked, unmodified file is genuinely clean → 'No changes.'"""
+    _init_repo(tmp_path)
+    res = _run(tmp_path, "seed.txt")
+    assert "No changes." in res
+
+
+def test_path_mode_tracked_modified_still_diffs(tmp_path: Path) -> None:
+    """The guard must NOT swallow a legit diff: a modified tracked file scoped by
+    PATH still renders its diff (proves ls-files lets tracked paths through)."""
+    _init_repo(tmp_path)
+    (tmp_path / "seed.txt").write_text("seed\\nmore\\n")
+    res = _run(tmp_path, "seed.txt")
+    assert "No changes." not in res
+    assert "seed.txt" in res
+
+
+def test_header_shows_repo_root(tmp_path: Path) -> None:
+    """Every mode stamps the resolved repo root so a wrong-CWD run is visible."""
+    _init_repo(tmp_path)
+    out = _run(tmp_path, "staged")
+    assert "Repo:" in out
+
+
 def test_migration_path_not_flagged_but_module_is(tmp_path: Path) -> None:
     """Bug 2 regression: 'migration' substring must not exempt real source classes."""
     _init_repo(tmp_path)
diff --git a/tests/test_grep.py b/tests/test_grep.py
index f5ba652..04f3a39 100644
--- a/tests/test_grep.py
+++ b/tests/test_grep.py
@@ -30,6 +30,15 @@ def test_grep_empty_pattern_errors() -> None:
     assert "ERROR: empty pattern" in out
 
 
+def test_grep_missing_path_error_includes_cwd() -> None:
+    """A non-existent path errors with the CWD, so 'wrong path' vs 'wrong CWD' is
+    distinguishable — the recurring trap where a stale cwd makes a relative path
+    silently miss."""
+    out = supertool.op_grep("anything", "does-not-exist-xyzzy.json")
+    assert "ERROR: path not found" in out
+    assert os.getcwd() in out
+
+
 def test_grep_respects_limit(tmp_path: Path) -> None:
     f = tmp_path / "many.py"
     content = "\n".join(f"match line {i}" for i in range(1, 20)) + "\n"
diff --git a/tests/test_validators.py b/tests/test_validators.py
index 5bd9fb2..b0d3b11 100644
--- a/tests/test_validators.py
+++ b/tests/test_validators.py
@@ -77,6 +77,31 @@ def test_applicable_ignores_malformed_specs() -> None:
     assert set(supertool._applicable_validators("edit", "a.php")) == {"good"}
 
 
+def test_applicable_exclude_skips_matching_path() -> None:
+    _set_validators({
+        "phpmd": {"cmd": "x", "hooks_into": ["edit"], "match": "*.php", "exclude": "*tests/*"},
+    })
+    assert set(supertool._applicable_validators("edit", "tests/aTest.php")) == set()
+    assert set(supertool._applicable_validators("edit", "src/Foo.php")) == {"phpmd"}
+
+
+def test_applicable_no_exclude_unchanged() -> None:
+    _set_validators({
+        "phpmd": {"cmd": "x", "hooks_into": ["edit"], "match": "*.php"},
+    })
+    assert set(supertool._applicable_validators("edit", "tests/aTest.php")) == {"phpmd"}
+
+
+def test_applicable_exclude_list_skips_if_any_matches() -> None:
+    _set_validators({
+        "phpmd": {"cmd": "x", "hooks_into": ["edit"], "match": "*.php",
+                  "exclude": ["*tests/*", "*/Generated/*"]},
+    })
+    assert set(supertool._applicable_validators("edit", "tests/aTest.php")) == set()
+    assert set(supertool._applicable_validators("edit", "src/Generated/Foo.php")) == set()
+    assert set(supertool._applicable_validators("edit", "src/Foo.php")) == {"phpmd"}
+
+
 # ---------------------------------------------------------------------------
 # _validator_resolve
 # ---------------------------------------------------------------------------
diff --git a/validators/SCHEMA.md b/validators/SCHEMA.md
index dc2ae3d..13f4db9 100644
--- a/validators/SCHEMA.md
+++ b/validators/SCHEMA.md
@@ -54,6 +54,7 @@ Universal JSON. Every adapter emits this shape. Validator core never parses tool
 | `cmd`             | string           | yes      | Shell command. `{file}` and `{supertool_dir}` are substituted before execution.           |
 | `hooks_into`      | array of strings | yes      | Ops that trigger this validator automatically (`edit`, `paste`, `vim`, ...).              |
 | `match`           | string           | no       | Glob pattern to filter by filename (e.g. `*.php`). Matches all files when absent.        |
+| `exclude`         | string or list   | no       | Glob (or list of globs) to skip even when `match` matches (e.g. `*tests/*`). Skip if any matches. |
 | `timeout`         | int              | no       | Seconds before the subprocess is killed. Default 60 (validators), 30 (formatters).       |
 | `rollback_on_fail`| bool             | no       | Revert the file if the validator reports a regression. Default false.                     |
 | `opt_in`          | bool             | no       | When true, validator only runs when explicitly requested (not on every hook).             |