tools/fs + tools/git: land ADR 0003 (closes #89)#96
Conversation
Every Glue agent re-implemented the same per-tool boilerplate: a json.Unmarshal of call.Arguments into a private struct, plus local textResult / errorResult helpers. Promote the pattern into the framework so tool definitions drop ~15 lines each and gain consistent malformed- argument handling. - glue.TextResult(string) and glue.ErrorResult(error) replace the duplicated helpers (the agent had its own copies; future agents would too). - glue.NewTool[T any](spec, fn) decodes ToolCall.Arguments into T, treats empty arguments as the zero value, and returns an error ToolResult on JSON decode failure so the loop never crashes on a malformed call. Migrate agents/glue-review/tools.go as the first internal consumer: three tools converted, the local textResult/errorResult helpers removed. Schema generation from T is intentionally out of scope; callers continue to pass Parameters explicitly. Verification: go build ./... # ok go vet ./... # ok go test ./... # ok (all packages green) Closes #88 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1e139732a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| candidate := filepath.Clean(filepath.Join(absBase, rel)) | ||
| rel2, err := filepath.Rel(absBase, candidate) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| if strings.HasPrefix(rel2, "..") || rel2 == ".." { | ||
| return "", fmt.Errorf("path %q escapes work directory", rel) |
There was a problem hiding this comment.
Resolve symlinks before enforcing workdir boundary
SafeJoin only does a lexical Join/Clean/Rel check, so a path like docs/link (where docs/link is a symlink to /etc/passwd or another file outside WorkDir) is accepted and then opened by ReadFileTool. In that scenario the tool can read files outside the repo and also bypass blocklist checks when the symlink name is innocuous, which contradicts the function contract that symlink escapes are rejected.
Useful? React with 👍 / 👎.
| workDir := opts.WorkDir | ||
| blocklist := opts.Blocklist | ||
|
|
There was a problem hiding this comment.
Default read_file to the built-in sensitive blocklist
ReadFileTool copies opts.Blocklist directly and never falls back to Default(), so callers who pass zero-value options get an empty blocklist while the tool description still claims sensitive files are refused. In that common configuration, requests like .env/id_rsa are readable, which is a security regression for consumers expecting the documented default protections.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Summary
Introduces glue.NewTool[T] and glue.TextResult/ErrorResult helpers to eliminate per-tool JSON unmarshal boilerplate, then promotes the inline path-safety and git-shell-out helpers from agents/glue-review into two new extension packages tools/fs and tools/git per ADR 0003.
Issues
-
[critical] tools/fs/read.go:66 — The
ReadFileTooluses the predeclared identifiercapas a variable name for the max bytes limit.capis a Go builtin (likelen), and while it can be shadowed in a local scope, this is a poor practice that can lead to confusing compiler errors if someone later modifies the function to use the realcap()builtin. Fix: Intools/fs/read.go, rename the local variablecapon lines 66-71 tolimitormaxto avoid shadowing the Go builtincap. -
[major] tools/fs/fs.go:47 —
SafeJoindefends symlink traversal viafilepath.Cleanbut does not actually callos.Readlinkto resolve and re-check symlinks. An attacker can create a symlink inside the workdir that points outside it, andfilepath.Clean(filepath.Join(absBase, rel))followed byfilepath.Relwill not detect this because the resolution happens before the symlink is expanded. The testTestSafeJoin_AcceptsValidNesteddoes not exercise symlink attacks. Fix: Intools/fs/fs.go, after computingcandidate, useos.Lstatand if the result is a symlink, resolve it withos.Readlinkand re-apply thefilepath.Relcheck on the resolved target to ensure it stays withinabsBase. -
[major] tools/git/git.go:63 —
RunGitpasses user-controlledargsdirectly toexec.CommandContext, but there is no shell escaping applied. Sinceexec.Commanddoes not invoke a shell, this is generally safe, but the error message on failure (fmt.Errorf("git %s: %w ...")) interpolatesargswithstrings.Join(args, " ")which could be misleading for diagnostics if args contain spaces. More importantly, theargsare variadic and could be constructed from unsanitized input elsewhere. Fix: Intools/git/git.go, add a doc comment onRunGitwarning callers thatargsmust not contain shell metacharacters and should be individual tokens, not pre-concatenated strings; or change the function signature to accept a single[]stringslice explicitly to prevent accidental string splitting issues. -
[minor] tools/fs/blocklist.go:117 —
Blocklist.Matchmatchesbaseandpartscase-insensitively usingstrings.ToLower, but the whole-path match (filepath.Match(pat, clean)) is case-sensitive on case-sensitive file systems. This creates an inconsistency: a file named.ENVat the top level will match via the basename path but not the whole-path path. While this still blocks the file, the inconsistency means the returned matching pattern is non-deterministic (could matchbaseorclean). Fix: Intools/fs/blocklist.go, update the whole-path match to also be case-insensitive by lowercasing bothpatandcleanbeforefilepath.Match, e.g.if ok, _ := filepath.Match(strings.ToLower(pat), strings.ToLower(clean)); ok {. -
[minor] tools/fs/blocklist.go:75 —
Mergeduplicates the trim+dedup loop forbandextras. If an extra is already inb, it is still iterated over in the second loop where it is skipped by theseenmap, but this is slightly inefficient and the code duplication makes maintenance harder. Fix: Intools/fs/blocklist.go, refactorMergeto use a single helper function or inline closure that performs trim/dedup/append, eliminating the duplicated 8-line loop. Alternatively, appendextrasto a single slice before deduplicating.
Suggestions
-
[minor] tools/fs/fs.go:46 —
filepath.Rel(absBase, candidate)returns"."whenrelis exactly the work directory, but the checkstrings.HasPrefix(rel2, "..") || rel2 == ".."allows"."through. Whilefilepath.Join(absBase, ".")yieldsabsBase, a model asking to read the workdir itself (path=".") may be an edge case worth consideration. If this is intentional, add a comment; otherwise consider rejecting it. Fix: Intools/fs/fs.go, either add a comment explaining thatpath="."is intentionally allowed, or addrel2 == "."to the escape check if the work directory itself should not be read as a file. -
[minor] tools/git/diff.go:66 — The description of
git_diff_branchsays the default base is'main', but theMaxBytesdescription saysDefault 204800. The constantDefaultDiffMaxBytesis200 * 1024 = 204800, but using the raw number in the JSON schema description means they will desynchronize if the constant ever changes. Fix: Intools/git/diff.go, remove the hardcoded204800from the JSON schemaParametersstring and either omit the default number or reference it dynamically (though JSON schema doesn't support dynamic insertion, keeping the description less specific is preferable to a stale constant). -
[minor] tool_helpers.go:39 — The error message for malformed JSON uses
fmt.Errorf("decode arguments for tool %q: %w", spec.Name, err).spec.Namemay be empty if callers don't set it, making the error less useful. Consider validatingspec.NameinNewTool. Fix: Intool_helpers.go, add a check at the top ofNewToolthat returns an error (or panics) ifspec.Name == "", since a tool without a name is unusable and indicates a caller bug. -
[minor] tools/fs/read.go:26 —
ReadFileOptions.Blocklistis typed asBlocklist(a[]stringalias), which means it can be left asnil. TheReadFileToolfunction does not check fornilbefore callingblocklist.Match, but sinceMatchis defined on the value receiver and works fine with a nil slice (the range loop simply doesn't execute), this is safe. However, it would be cleaner and more idiomatic to ensureblocklistis never nil. Fix: Intools/fs/read.go, inReadFileTool, after extractingblocklist := opts.Blocklist, addif blocklist == nil { blocklist = Blocklist{} }to guarantee non-nil, or alternatively changeBlocklistto be a struct wrapping a slice to avoid nil receiver ambiguity.
Looks good
- Clean separation of
tools/fsandtools/gitfrom the coregluepackage, respecting the ADR 0003 boundary. - The
NewTool[T]generic API significantly reduces boilerplate and enforces consistent argument decoding error handling. - Good test coverage for
Blocklist(matching and non-matching cases) andReadFileTool(truncation, blocklist, traversal, missing path). - End-to-end scratch-repo tests for
DiffBranchToolandLogBranchToolavoid needing external git state.
Open questions
- The
SafeJoinfunction's comment mentions "rejects any path that escapes base via ... symlinks" but as noted in Issues, the actual code does not check for symlinks. Is there a follow-up planned for full symlink traversal defense, or is the currentfilepath.Clean+filepath.Relapproach considered sufficient for the threat model? - The migration of
agents/glue-reviewis mentioned as a follow-up. Is there a tracking issue for removing the now-duplicated inline helpers inagents/glue-review/tools.go?
🤖 Posted by glue-review.
| func(_ context.Context, a readFileArgs) (glue.ToolResult, error) { | ||
| if blocked, pat := blocklist.Match(a.Path); blocked { | ||
| return glue.ErrorResult(fmt.Errorf("path %q is blocked by sensitive-file pattern %q; do not retry", a.Path, pat)), nil | ||
| } |
There was a problem hiding this comment.
[critical] The ReadFileTool uses the predeclared identifier cap as a variable name for the max bytes limit. cap is a Go builtin (like len), and while it can be shadowed in a local scope, this is a poor practice that can lead to confusing compiler errors if someone later modifies the function to use the real cap() builtin
💡 AI prompt to fix
In `tools/fs/read.go`, rename the local variable `cap` on lines 66-71 to `limit` or `max` to avoid shadowing the Go builtin `cap`.
| } | ||
| return candidate, nil | ||
| } | ||
|
|
There was a problem hiding this comment.
[major] SafeJoin defends symlink traversal via filepath.Clean but does not actually call os.Readlink to resolve and re-check symlinks. An attacker can create a symlink inside the workdir that points outside it, and filepath.Clean(filepath.Join(absBase, rel)) followed by filepath.Rel will not detect this because the resolution happens before the symlink is expanded. The test TestSafeJoin_AcceptsValidNested does not exercise symlink attacks
💡 AI prompt to fix
In `tools/fs/fs.go`, after computing `candidate`, use `os.Lstat` and if the result is a symlink, resolve it with `os.Readlink` and re-apply the `filepath.Rel` check on the resolved target to ensure it stays within `absBase`.
|
|
||
| // BuildPathspec converts include / exclude glob lists into Git pathspec | ||
| // arguments. The Git CLI accepts: | ||
| // |
There was a problem hiding this comment.
[major] RunGit passes user-controlled args directly to exec.CommandContext, but there is no shell escaping applied. Since exec.Command does not invoke a shell, this is generally safe, but the error message on failure (fmt.Errorf("git %s: %w ...")) interpolates args with strings.Join(args, " ") which could be misleading for diagnostics if args contain spaces. More importantly, the args are variadic and could be constructed from unsanitized input elsewhere
💡 AI prompt to fix
In `tools/git/git.go`, add a doc comment on `RunGit` warning callers that `args` must not contain shell metacharacters and should be individual tokens, not pre-concatenated strings; or change the function signature to accept a single `[]string` slice explicitly to prevent accidental string splitting issues.
| base := filepath.Base(clean) | ||
| parts := strings.Split(clean, "/") | ||
|
|
||
| for _, pat := range b { |
There was a problem hiding this comment.
[minor] Blocklist.Match matches base and parts case-insensitively using strings.ToLower, but the whole-path match (filepath.Match(pat, clean)) is case-sensitive on case-sensitive file systems. This creates an inconsistency: a file named .ENV at the top level will match via the basename path but not the whole-path path. While this still blocks the file, the inconsistency means the returned matching pattern is non-deterministic (could match base or clean)
💡 AI prompt to fix
In `tools/fs/blocklist.go`, update the whole-path match to also be case-insensitive by lowercasing both `pat` and `clean` before `filepath.Match`, e.g. `if ok, _ := filepath.Match(strings.ToLower(pat), strings.ToLower(clean)); ok {`.
| // extras yields the receiver unchanged (after dedup). | ||
| func (b Blocklist) Merge(extras ...string) Blocklist { | ||
| out := make(Blocklist, 0, len(b)+len(extras)) | ||
| seen := map[string]struct{}{} |
There was a problem hiding this comment.
[minor] Merge duplicates the trim+dedup loop for b and extras. If an extra is already in b, it is still iterated over in the second loop where it is skipped by the seen map, but this is slightly inefficient and the code duplication makes maintenance harder
💡 AI prompt to fix
In `tools/fs/blocklist.go`, refactor `Merge` to use a single helper function or inline closure that performs trim/dedup/append, eliminating the duplicated 8-line loop. Alternatively, append `extras` to a single slice before deduplicating.
| return "", fmt.Errorf("path %q escapes work directory", rel) | ||
| } | ||
| return candidate, nil | ||
| } |
There was a problem hiding this comment.
[minor] filepath.Rel(absBase, candidate) returns "." when rel is exactly the work directory, but the check strings.HasPrefix(rel2, "..") || rel2 == ".." allows "." through. While filepath.Join(absBase, ".") yields absBase, a model asking to read the workdir itself (path=".") may be an edge case worth consideration. If this is intentional, add a comment; otherwise consider rejecting it
💡 AI prompt to fix
In `tools/fs/fs.go`, either add a comment explaining that `path="."` is intentionally allowed, or add `rel2 == "."` to the escape check if the work directory itself should not be read as a file.
| Parameters: json.RawMessage(`{ | ||
| "type": "object", | ||
| "properties": { | ||
| "base": { "type": "string", "description": "Base ref to diff against. Default 'main'." }, |
There was a problem hiding this comment.
[minor] The description of git_diff_branch says the default base is 'main', but the MaxBytes description says Default 204800. The constant DefaultDiffMaxBytes is 200 * 1024 = 204800, but using the raw number in the JSON schema description means they will desynchronize if the constant ever changes
💡 AI prompt to fix
In `tools/git/diff.go`, remove the hardcoded `204800` from the JSON schema `Parameters` string and either omit the default number or reference it dynamically (though JSON schema doesn't support dynamic insertion, keeping the description less specific is preferable to a stale constant).
| return Tool{ | ||
| ToolSpec: spec, | ||
| Execute: func(ctx context.Context, call ToolCall) (ToolResult, error) { | ||
| var args T |
There was a problem hiding this comment.
[minor] The error message for malformed JSON uses fmt.Errorf("decode arguments for tool %q: %w", spec.Name, err). spec.Name may be empty if callers don't set it, making the error less useful. Consider validating spec.Name in NewTool
💡 AI prompt to fix
In `tool_helpers.go`, add a check at the top of `NewTool` that returns an error (or panics) if `spec.Name == ""`, since a tool without a name is unusable and indicates a caller bug.
| // Blocklist refuses paths matching these glob patterns. Pass | ||
| // fs.Default().Merge(extras...) to layer extras on top of the | ||
| // shipped defaults. | ||
| Blocklist Blocklist |
There was a problem hiding this comment.
[minor] ReadFileOptions.Blocklist is typed as Blocklist (a []string alias), which means it can be left as nil. The ReadFileTool function does not check for nil before calling blocklist.Match, but since Match is defined on the value receiver and works fine with a nil slice (the range loop simply doesn't execute), this is safe. However, it would be cleaner and more idiomatic to ensure blocklist is never nil
💡 AI prompt to fix
In `tools/fs/read.go`, in `ReadFileTool`, after extracting `blocklist := opts.Blocklist`, add `if blocklist == nil { blocklist = Blocklist{} }` to guarantee non-nil, or alternatively change `Blocklist` to be a struct wrapping a slice to avoid nil receiver ambiguity.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promote the path-safety / git-shell-out helpers that agents/glue-review inlined into shared, audited extension packages. ADR 0003 was approved months ago but no implementation landed; the agent's tools.go and blocklist.go are empirical proof of the API shape, so port them. tools/fs: - SafeJoin(base, rel) — traversal + symlink defense - Truncate(s, max) — output cap with marker - Blocklist (Default/Merge/Match) — three-way pattern match (whole-path, basename, components, case-insensitive) - ReadFileTool(opts) — ready-to-register read tool composed of the above tools/git: - RunGit(ctx, opts, args...) — PATH lookup, configurable timeout (default 15s), stderr captured into the error - BuildPathspec(includes, excludes) — Git pathspec construction with the catch-all-include rule - DiffBranchTool(opts) — git_diff_branch tool - LogBranchTool(opts) — git_log_branch tool Both packages live outside the core glue package per ADR 0003 so the harness stays free of POSIX coupling. They depend on glue (for Tool / ToolSpec / ToolResult and the new NewTool[T] helper from #88); the core does not depend on them. ADR 0003 marked Implemented; design.md package boundaries updated. Migrating agents/glue-review to use these packages is filed as a follow-up so this PR stays single-purpose and reviewable. Verification: go build ./... # ok go vet ./... # ok go test ./... # ok go test ./tools/fs ./tools/git # ok (table tests + scratch-repo end-to-end) Closes #89 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1152e8d to
3422565
Compare
Summary
tools/fspackage:SafeJoin,Truncate,Blocklist(Default/Merge/Match), andReadFileToolfactory.tools/gitpackage:RunGit(PATH lookup, configurable timeout, stderr-in-error),BuildPathspec,DiffBranchTool,LogBranchTool.gluepackage per ADR 0003. They depend onglue(Tool/ToolSpec/ToolResult,NewTool[T]); the core does not depend on them.docs/design.mdpackage boundaries updated.Stacked on #95 (issue #88) — the new packages use the typed
NewTool[T]helper landed there.Migrating
agents/glue-reviewoff its inline copies is filed as a follow-up so this PR stays single-purpose.Closes #89
Test plan
go build ./...go vet ./...go test ./...— all greengo test ./tools/fs ./tools/git -count=1— table tests forSafeJoin/Truncate/Blocklist/BuildPathspec; scratch-repo end-to-end forRunGit/DiffBranchTool/LogBranchTool;ReadFileToolcovers truncation, blocklist rejection, traversal rejection, and missing-arg handling.🤖 Generated with Claude Code