Skip to content

tools/fs + tools/git: land ADR 0003 (closes #89)#96

Merged
erain merged 4 commits into
mainfrom
issue/89-tools-fs-git
May 7, 2026
Merged

tools/fs + tools/git: land ADR 0003 (closes #89)#96
erain merged 4 commits into
mainfrom
issue/89-tools-fs-git

Conversation

@erain
Copy link
Copy Markdown
Owner

@erain erain commented May 7, 2026

Summary

  • New tools/fs package: SafeJoin, Truncate, Blocklist (Default/Merge/Match), and ReadFileTool factory.
  • New tools/git package: RunGit (PATH lookup, configurable timeout, stderr-in-error), BuildPathspec, DiffBranchTool, LogBranchTool.
  • Both packages live outside the core glue package per ADR 0003. They depend on glue (Tool/ToolSpec/ToolResult, NewTool[T]); the core does not depend on them.
  • ADR 0003 marked Implemented; docs/design.md package boundaries updated.

Stacked on #95 (issue #88) — the new packages use the typed NewTool[T] helper landed there.

Migrating agents/glue-review off its inline copies is filed as a follow-up so this PR stays single-purpose.

Closes #89

Test plan

  • go build ./...
  • go vet ./...
  • go test ./... — all green
  • go test ./tools/fs ./tools/git -count=1 — table tests for SafeJoin/Truncate/Blocklist/BuildPathspec; scratch-repo end-to-end for RunGit/DiffBranchTool/LogBranchTool; ReadFileTool covers truncation, blocklist rejection, traversal rejection, and missing-arg handling.

🤖 Generated with Claude Code

Every Glue agent re-implemented the same per-tool boilerplate: a
json.Unmarshal of call.Arguments into a private struct, plus local
textResult / errorResult helpers. Promote the pattern into the framework
so tool definitions drop ~15 lines each and gain consistent malformed-
argument handling.

- glue.TextResult(string) and glue.ErrorResult(error) replace the
  duplicated helpers (the agent had its own copies; future agents would
  too).
- glue.NewTool[T any](spec, fn) decodes ToolCall.Arguments into T,
  treats empty arguments as the zero value, and returns an error
  ToolResult on JSON decode failure so the loop never crashes on a
  malformed call.

Migrate agents/glue-review/tools.go as the first internal consumer:
three tools converted, the local textResult/errorResult helpers removed.

Schema generation from T is intentionally out of scope; callers continue
to pass Parameters explicitly.

Verification:
  go build ./...      # ok
  go vet ./...        # ok
  go test ./...       # ok (all packages green)

Closes #88

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b1e139732a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tools/fs/fs.go
Comment on lines +37 to +43
candidate := filepath.Clean(filepath.Join(absBase, rel))
rel2, err := filepath.Rel(absBase, candidate)
if err != nil {
return "", err
}
if strings.HasPrefix(rel2, "..") || rel2 == ".." {
return "", fmt.Errorf("path %q escapes work directory", rel)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Resolve symlinks before enforcing workdir boundary

SafeJoin only does a lexical Join/Clean/Rel check, so a path like docs/link (where docs/link is a symlink to /etc/passwd or another file outside WorkDir) is accepted and then opened by ReadFileTool. In that scenario the tool can read files outside the repo and also bypass blocklist checks when the symlink name is innocuous, which contradicts the function contract that symlink escapes are rejected.

Useful? React with 👍 / 👎.

Comment thread tools/fs/read.go
Comment on lines +47 to +49
workDir := opts.WorkDir
blocklist := opts.Blocklist

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Default read_file to the built-in sensitive blocklist

ReadFileTool copies opts.Blocklist directly and never falls back to Default(), so callers who pass zero-value options get an empty blocklist while the tool description still claims sensitive files are refused. In that common configuration, requests like .env/id_rsa are readable, which is a security regression for consumers expecting the documented default protections.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Introduces glue.NewTool[T] and glue.TextResult/ErrorResult helpers to eliminate per-tool JSON unmarshal boilerplate, then promotes the inline path-safety and git-shell-out helpers from agents/glue-review into two new extension packages tools/fs and tools/git per ADR 0003.

Issues

  • [critical] tools/fs/read.go:66 — The ReadFileTool uses the predeclared identifier cap as a variable name for the max bytes limit. cap is a Go builtin (like len), and while it can be shadowed in a local scope, this is a poor practice that can lead to confusing compiler errors if someone later modifies the function to use the real cap() builtin. Fix: In tools/fs/read.go, rename the local variable cap on lines 66-71 to limit or max to avoid shadowing the Go builtin cap.

  • [major] tools/fs/fs.go:47 — SafeJoin defends symlink traversal via filepath.Clean but does not actually call os.Readlink to resolve and re-check symlinks. An attacker can create a symlink inside the workdir that points outside it, and filepath.Clean(filepath.Join(absBase, rel)) followed by filepath.Rel will not detect this because the resolution happens before the symlink is expanded. The test TestSafeJoin_AcceptsValidNested does not exercise symlink attacks. Fix: In tools/fs/fs.go, after computing candidate, use os.Lstat and if the result is a symlink, resolve it with os.Readlink and re-apply the filepath.Rel check on the resolved target to ensure it stays within absBase.

  • [major] tools/git/git.go:63 — RunGit passes user-controlled args directly to exec.CommandContext, but there is no shell escaping applied. Since exec.Command does not invoke a shell, this is generally safe, but the error message on failure (fmt.Errorf("git %s: %w ...")) interpolates args with strings.Join(args, " ") which could be misleading for diagnostics if args contain spaces. More importantly, the args are variadic and could be constructed from unsanitized input elsewhere. Fix: In tools/git/git.go, add a doc comment on RunGit warning callers that args must not contain shell metacharacters and should be individual tokens, not pre-concatenated strings; or change the function signature to accept a single []string slice explicitly to prevent accidental string splitting issues.

  • [minor] tools/fs/blocklist.go:117 — Blocklist.Match matches base and parts case-insensitively using strings.ToLower, but the whole-path match (filepath.Match(pat, clean)) is case-sensitive on case-sensitive file systems. This creates an inconsistency: a file named .ENV at the top level will match via the basename path but not the whole-path path. While this still blocks the file, the inconsistency means the returned matching pattern is non-deterministic (could match base or clean). Fix: In tools/fs/blocklist.go, update the whole-path match to also be case-insensitive by lowercasing both pat and clean before filepath.Match, e.g. if ok, _ := filepath.Match(strings.ToLower(pat), strings.ToLower(clean)); ok {.

  • [minor] tools/fs/blocklist.go:75 — Merge duplicates the trim+dedup loop for b and extras. If an extra is already in b, it is still iterated over in the second loop where it is skipped by the seen map, but this is slightly inefficient and the code duplication makes maintenance harder. Fix: In tools/fs/blocklist.go, refactor Merge to use a single helper function or inline closure that performs trim/dedup/append, eliminating the duplicated 8-line loop. Alternatively, append extras to a single slice before deduplicating.

Suggestions

  • [minor] tools/fs/fs.go:46 — filepath.Rel(absBase, candidate) returns "." when rel is exactly the work directory, but the check strings.HasPrefix(rel2, "..") || rel2 == ".." allows "." through. While filepath.Join(absBase, ".") yields absBase, a model asking to read the workdir itself (path=".") may be an edge case worth consideration. If this is intentional, add a comment; otherwise consider rejecting it. Fix: In tools/fs/fs.go, either add a comment explaining that path="." is intentionally allowed, or add rel2 == "." to the escape check if the work directory itself should not be read as a file.

  • [minor] tools/git/diff.go:66 — The description of git_diff_branch says the default base is 'main', but the MaxBytes description says Default 204800. The constant DefaultDiffMaxBytes is 200 * 1024 = 204800, but using the raw number in the JSON schema description means they will desynchronize if the constant ever changes. Fix: In tools/git/diff.go, remove the hardcoded 204800 from the JSON schema Parameters string and either omit the default number or reference it dynamically (though JSON schema doesn't support dynamic insertion, keeping the description less specific is preferable to a stale constant).

  • [minor] tool_helpers.go:39 — The error message for malformed JSON uses fmt.Errorf("decode arguments for tool %q: %w", spec.Name, err). spec.Name may be empty if callers don't set it, making the error less useful. Consider validating spec.Name in NewTool. Fix: In tool_helpers.go, add a check at the top of NewTool that returns an error (or panics) if spec.Name == "", since a tool without a name is unusable and indicates a caller bug.

  • [minor] tools/fs/read.go:26 — ReadFileOptions.Blocklist is typed as Blocklist (a []string alias), which means it can be left as nil. The ReadFileTool function does not check for nil before calling blocklist.Match, but since Match is defined on the value receiver and works fine with a nil slice (the range loop simply doesn't execute), this is safe. However, it would be cleaner and more idiomatic to ensure blocklist is never nil. Fix: In tools/fs/read.go, in ReadFileTool, after extracting blocklist := opts.Blocklist, add if blocklist == nil { blocklist = Blocklist{} } to guarantee non-nil, or alternatively change Blocklist to be a struct wrapping a slice to avoid nil receiver ambiguity.

Looks good

  • Clean separation of tools/fs and tools/git from the core glue package, respecting the ADR 0003 boundary.
  • The NewTool[T] generic API significantly reduces boilerplate and enforces consistent argument decoding error handling.
  • Good test coverage for Blocklist (matching and non-matching cases) and ReadFileTool (truncation, blocklist, traversal, missing path).
  • End-to-end scratch-repo tests for DiffBranchTool and LogBranchTool avoid needing external git state.

Open questions

  • The SafeJoin function's comment mentions "rejects any path that escapes base via ... symlinks" but as noted in Issues, the actual code does not check for symlinks. Is there a follow-up planned for full symlink traversal defense, or is the current filepath.Clean + filepath.Rel approach considered sufficient for the threat model?
  • The migration of agents/glue-review is mentioned as a follow-up. Is there a tracking issue for removing the now-duplicated inline helpers in agents/glue-review/tools.go?

🤖 Posted by glue-review.

Comment thread tools/fs/read.go
func(_ context.Context, a readFileArgs) (glue.ToolResult, error) {
if blocked, pat := blocklist.Match(a.Path); blocked {
return glue.ErrorResult(fmt.Errorf("path %q is blocked by sensitive-file pattern %q; do not retry", a.Path, pat)), nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[critical] The ReadFileTool uses the predeclared identifier cap as a variable name for the max bytes limit. cap is a Go builtin (like len), and while it can be shadowed in a local scope, this is a poor practice that can lead to confusing compiler errors if someone later modifies the function to use the real cap() builtin

💡 AI prompt to fix
In `tools/fs/read.go`, rename the local variable `cap` on lines 66-71 to `limit` or `max` to avoid shadowing the Go builtin `cap`.

Comment thread tools/fs/fs.go
}
return candidate, nil
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[major] SafeJoin defends symlink traversal via filepath.Clean but does not actually call os.Readlink to resolve and re-check symlinks. An attacker can create a symlink inside the workdir that points outside it, and filepath.Clean(filepath.Join(absBase, rel)) followed by filepath.Rel will not detect this because the resolution happens before the symlink is expanded. The test TestSafeJoin_AcceptsValidNested does not exercise symlink attacks

💡 AI prompt to fix
In `tools/fs/fs.go`, after computing `candidate`, use `os.Lstat` and if the result is a symlink, resolve it with `os.Readlink` and re-apply the `filepath.Rel` check on the resolved target to ensure it stays within `absBase`.

Comment thread tools/git/git.go

// BuildPathspec converts include / exclude glob lists into Git pathspec
// arguments. The Git CLI accepts:
//
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[major] RunGit passes user-controlled args directly to exec.CommandContext, but there is no shell escaping applied. Since exec.Command does not invoke a shell, this is generally safe, but the error message on failure (fmt.Errorf("git %s: %w ...")) interpolates args with strings.Join(args, " ") which could be misleading for diagnostics if args contain spaces. More importantly, the args are variadic and could be constructed from unsanitized input elsewhere

💡 AI prompt to fix
In `tools/git/git.go`, add a doc comment on `RunGit` warning callers that `args` must not contain shell metacharacters and should be individual tokens, not pre-concatenated strings; or change the function signature to accept a single `[]string` slice explicitly to prevent accidental string splitting issues.

Comment thread tools/fs/blocklist.go
base := filepath.Base(clean)
parts := strings.Split(clean, "/")

for _, pat := range b {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] Blocklist.Match matches base and parts case-insensitively using strings.ToLower, but the whole-path match (filepath.Match(pat, clean)) is case-sensitive on case-sensitive file systems. This creates an inconsistency: a file named .ENV at the top level will match via the basename path but not the whole-path path. While this still blocks the file, the inconsistency means the returned matching pattern is non-deterministic (could match base or clean)

💡 AI prompt to fix
In `tools/fs/blocklist.go`, update the whole-path match to also be case-insensitive by lowercasing both `pat` and `clean` before `filepath.Match`, e.g. `if ok, _ := filepath.Match(strings.ToLower(pat), strings.ToLower(clean)); ok {`.

Comment thread tools/fs/blocklist.go
// extras yields the receiver unchanged (after dedup).
func (b Blocklist) Merge(extras ...string) Blocklist {
out := make(Blocklist, 0, len(b)+len(extras))
seen := map[string]struct{}{}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] Merge duplicates the trim+dedup loop for b and extras. If an extra is already in b, it is still iterated over in the second loop where it is skipped by the seen map, but this is slightly inefficient and the code duplication makes maintenance harder

💡 AI prompt to fix
In `tools/fs/blocklist.go`, refactor `Merge` to use a single helper function or inline closure that performs trim/dedup/append, eliminating the duplicated 8-line loop. Alternatively, append `extras` to a single slice before deduplicating.

Comment thread tools/fs/fs.go
return "", fmt.Errorf("path %q escapes work directory", rel)
}
return candidate, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] filepath.Rel(absBase, candidate) returns "." when rel is exactly the work directory, but the check strings.HasPrefix(rel2, "..") || rel2 == ".." allows "." through. While filepath.Join(absBase, ".") yields absBase, a model asking to read the workdir itself (path=".") may be an edge case worth consideration. If this is intentional, add a comment; otherwise consider rejecting it

💡 AI prompt to fix
In `tools/fs/fs.go`, either add a comment explaining that `path="."` is intentionally allowed, or add `rel2 == "."` to the escape check if the work directory itself should not be read as a file.

Comment thread tools/git/diff.go
Parameters: json.RawMessage(`{
"type": "object",
"properties": {
"base": { "type": "string", "description": "Base ref to diff against. Default 'main'." },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] The description of git_diff_branch says the default base is 'main', but the MaxBytes description says Default 204800. The constant DefaultDiffMaxBytes is 200 * 1024 = 204800, but using the raw number in the JSON schema description means they will desynchronize if the constant ever changes

💡 AI prompt to fix
In `tools/git/diff.go`, remove the hardcoded `204800` from the JSON schema `Parameters` string and either omit the default number or reference it dynamically (though JSON schema doesn't support dynamic insertion, keeping the description less specific is preferable to a stale constant).

Comment thread tool_helpers.go
return Tool{
ToolSpec: spec,
Execute: func(ctx context.Context, call ToolCall) (ToolResult, error) {
var args T
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] The error message for malformed JSON uses fmt.Errorf("decode arguments for tool %q: %w", spec.Name, err). spec.Name may be empty if callers don't set it, making the error less useful. Consider validating spec.Name in NewTool

💡 AI prompt to fix
In `tool_helpers.go`, add a check at the top of `NewTool` that returns an error (or panics) if `spec.Name == ""`, since a tool without a name is unusable and indicates a caller bug.

Comment thread tools/fs/read.go
// Blocklist refuses paths matching these glob patterns. Pass
// fs.Default().Merge(extras...) to layer extras on top of the
// shipped defaults.
Blocklist Blocklist
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] ReadFileOptions.Blocklist is typed as Blocklist (a []string alias), which means it can be left as nil. The ReadFileTool function does not check for nil before calling blocklist.Match, but since Match is defined on the value receiver and works fine with a nil slice (the range loop simply doesn't execute), this is safe. However, it would be cleaner and more idiomatic to ensure blocklist is never nil

💡 AI prompt to fix
In `tools/fs/read.go`, in `ReadFileTool`, after extracting `blocklist := opts.Blocklist`, add `if blocklist == nil { blocklist = Blocklist{} }` to guarantee non-nil, or alternatively change `Blocklist` to be a struct wrapping a slice to avoid nil receiver ambiguity.

erain and others added 3 commits May 7, 2026 10:13
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promote the path-safety / git-shell-out helpers that agents/glue-review
inlined into shared, audited extension packages. ADR 0003 was approved
months ago but no implementation landed; the agent's tools.go and
blocklist.go are empirical proof of the API shape, so port them.

tools/fs:
- SafeJoin(base, rel) — traversal + symlink defense
- Truncate(s, max) — output cap with marker
- Blocklist (Default/Merge/Match) — three-way pattern match
  (whole-path, basename, components, case-insensitive)
- ReadFileTool(opts) — ready-to-register read tool composed of the above

tools/git:
- RunGit(ctx, opts, args...) — PATH lookup, configurable timeout
  (default 15s), stderr captured into the error
- BuildPathspec(includes, excludes) — Git pathspec construction with
  the catch-all-include rule
- DiffBranchTool(opts) — git_diff_branch tool
- LogBranchTool(opts) — git_log_branch tool

Both packages live outside the core glue package per ADR 0003 so the
harness stays free of POSIX coupling. They depend on glue (for Tool /
ToolSpec / ToolResult and the new NewTool[T] helper from #88); the core
does not depend on them. ADR 0003 marked Implemented; design.md package
boundaries updated.

Migrating agents/glue-review to use these packages is filed as a
follow-up so this PR stays single-purpose and reviewable.

Verification:
  go build ./...                    # ok
  go vet ./...                      # ok
  go test ./...                     # ok
  go test ./tools/fs ./tools/git    # ok (table tests + scratch-repo end-to-end)

Closes #89

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@erain erain force-pushed the issue/89-tools-fs-git branch from 1152e8d to 3422565 Compare May 7, 2026 14:15
@erain erain merged commit e078d04 into main May 7, 2026
3 of 4 checks passed
@erain erain deleted the issue/89-tools-fs-git branch May 7, 2026 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Land ADR 0003: tools/fs + tools/git extension packages

1 participant