Skip to content

fix(cli): render trace details by id with validation + exit-code mapping (issue #17)#20

Merged
mrgoonie merged 1 commit into
devfrom
worktree-fix-trace-details-by-id
May 28, 2026
Merged

fix(cli): render trace details by id with validation + exit-code mapping (issue #17)#20
mrgoonie merged 1 commit into
devfrom
worktree-fix-trace-details-by-id

Conversation

@mrgoonie
Copy link
Copy Markdown
Contributor

Summary

Fix goclaw traces get <id> — restore readable TTY output, surface decode failures, validate ids strictly, and map distinct exit codes per failure class. Closes #17.

What was broken

traces get <id> was unusable as documented:

  1. TTY output dumped raw JSONprinter.Print(map[string]any) falls back to JSON in table mode.
  2. unmarshalMap silently swallowed errors_ = json.Unmarshal(...) produced empty {} on malformed payload.
  3. No id validationargs[0] concatenated raw into URL; path-traversal (.., /, \x00, whitespace) flowed straight to the gateway.
  4. All failures collapsed to exit 1 — not-found / permission / server / malformed all indistinguishable from generic.

Fixes

cmd/traces.go

  • tracesGetCmd.RunE — strict id validation → HTTP → decode-with-error → render dispatch.
  • validateTraceID — regex allowlist ^[A-Za-z0-9._-]+$ + reserved-token block (., .., empty, whitespace). Returns typed *client.APIError{Code: INVALID_REQUEST} → exit 4 via central handler.
  • renderTraceTable + buildSpanTree — header card + span tree + flat events list. Insertion-ordered children, cycle-safe (cycles silently drop, no infinite recursion), orphans attach to virtual root.

internal/client/http.go

Latent retry-body bug. The retry loop closed resp.Body on every iteration including the final attempt, then post-loop io.ReadAll failed with "http: read on closed response body". This silently collapsed typed APIError → opaque wrapped error → exit 1, regardless of HTTP status. Fix: skip the per-iteration Close() on the final attempt; outer defer handles cleanup exactly once.

Exit-code contract for traces get

HTTP / condition Code
Success 0
403 permission denied 2
404 not found 3
Malformed id (pre-HTTP) 4
5xx server failure 5
429 rate-limit 6

Tests

10 new tests in cmd/traces_get_test.go:

  • Wire contract (GET /v1/traces/{id}) and JSON round-trip.
  • Table mode emits header + span tree markers ( / ) + EVENTS section.
  • 404 → ExitNotFound, 403 → ExitAuth, 5xx → ExitServer.
  • Table-driven malformed id (empty, ".", "..", path-traversal, slashes, NUL, whitespace) — asserts exit 4 and zero HTTP calls.
  • Malformed response (200 + non-JSON body) — surfaces wrapped decode error.

Fixture cmd/testdata/trace_detail_get.json is a scrubbed stub with _TODO_refresh marker. Auth-blocked smoke probe documented in plans/260528-1357-fix-trace-details-by-id/reports/repro-260528-issue17.md. Refresh against live gateway before merging this PR to main.

Verification

  • go vet ./... — clean
  • go build ./... — clean
  • go test -count=1 ./... — all packages green
  • code-reviewer subagent: DONE_WITH_CONCERNS, 0 Critical, 0 Important, 4 informational (all addressed inline or scheduled as follow-up). See plans/260528-1357-fix-trace-details-by-id/reports/code-review-260528-issue17.md.

Acceptance criteria mapping

AC (from issue #17) Where verified
1. Read trace details by id renderTraceTable + TestTracesGet_TableMode_HasHeaderAndSpanMarkers
2. JSON + human-readable output TestTracesGet_JSONMode_PreservesStructure + TestTracesGet_TableMode_*
3. Distinct errors (not-found / perm / malformed / server) TestTracesGet_NotFound_ExitCode3 / _PermissionDenied_ExitCode2 / _MalformedID_NoHTTPCall / _ServerError_ExitCode5
4. Link upstream issue if API root cause n/a — classified CLI-side (see Phase 1 repro report)
5. Regression test 10 tests in cmd/traces_get_test.go

Out of scope

  • Hardening tracesExportCmd id validation (pre-existing; same vulnerability) — filed as follow-up.
  • Fixture refresh against live gateway — gated by the Phase 1 reviewer note; auth-blocked at plan time.

Auto-close timing

This PR targets dev. Closes #17 will auto-close the issue only when dev is promoted to main. A comment will be posted on #17 explaining this.

Plan + reports

  • Plan: plans/260528-1357-fix-trace-details-by-id/plan.md
  • Repro + root-cause classification: plans/260528-1357-fix-trace-details-by-id/reports/repro-260528-issue17.md
  • Code review: plans/260528-1357-fix-trace-details-by-id/reports/code-review-260528-issue17.md

Closes #17

…ing (issue #17)

Closes #17.

Bug surface: `goclaw traces get <id>` was unreadable in TTY mode (dumped raw JSON),
silently swallowed unmarshal errors (empty `{}`), accepted any raw id including
path-traversal sequences (`..`, `/`, control chars), and collapsed every server
failure to exit code 1 regardless of HTTP status.

Fixes:
- `cmd/traces.go`
  - `tracesGetCmd.RunE`: validate id allowlist before HTTP, decode payload with
    error surfacing, render header card + span tree + events list for TTY,
    pass-through JSON for `-o json` / piped stdout.
  - `validateTraceID`: regex allowlist `^[A-Za-z0-9._-]+$` + reserved-token reject
    (`.`, `..`, empty/whitespace). Returns typed `*client.APIError{Code: INVALID_REQUEST}`
    so the central error handler maps it to exit code 4.
  - `renderTraceTable` + `buildSpanTree`: insertion-ordered span tree, orphan spans
    attach to virtual root. No relink walk, cycle-safe (cycles silently drop).
- `internal/client/http.go`
  - Retry loop bug: previously closed `resp.Body` on every iteration including the
    final attempt, then `io.ReadAll` after the loop failed with
    "read on closed response body". This collapsed typed `APIError` into an opaque
    wrapped error and lost the exit-code mapping for 5xx/429. Fixed by skipping
    the per-iteration close on the final attempt; outer `defer` handles cleanup.

Exit-code contract for `traces get`:
- 0 success / 2 permission denied / 3 not-found /
  4 malformed id (pre-HTTP) / 5 server failure / 6 rate-limit.

Tests: 10 new tests in `cmd/traces_get_test.go` lock the wire contract,
table+JSON rendering, exit-code mapping per HTTP status, and the
"malformed id makes zero HTTP calls" invariant (parameterized table).
Fixture `cmd/testdata/trace_detail_get.json` is a scrubbed stub with
`_TODO_refresh` marker; refresh against live gateway before merging to default.

Docs: CHANGELOG `[Unreleased]` Fixed entry, README `Reading a Trace by ID`
section with exit-code table, `docs/codebase-summary.md` traces bullet updated.
@mrgoonie
Copy link
Copy Markdown
Contributor Author

Review (post-CI)

Verdict: Approve — safe to merge to dev.

Findings

Severity Count
Critical 0
Important 0
Suggestion 1 (deferred — out of scope, already filed as follow-up)

Checks performed

  • CI status: all 4 green
    • build (macos-latest, 1.25)
    • build (ubuntu-latest, 1.25)
    • build (windows-latest, 1.25)
    • claude-review
  • Correctness: validated locally
    • validateTraceID rejects all 9 malformed-id table cases pre-HTTP (asserted via TestTracesGet_MalformedID_NoHTTPCall: server hit count = 0).
    • buildSpanTree cycle safety: cycles silently drop (never appear in children[""], never recursed). No infinite loop possible.
    • HTTP retry-body fix (internal/client/http.go:161-175): final-attempt Close() skipped so outer defer handles cleanup exactly once. No leak, no double-close.
    • Error mapping: 404→3, 403→2, 5xx→5, 429→6, malformed→4 — all asserted, all pass.
  • Security:
    • grep -i 'eyJ|Bearer|sk-|token=' cmd/testdata/ → 0 hits (fixture is scrubbed).
    • Id allowlist ^[A-Za-z0-9._-]+$ + reserved-token block (., ..) gates path-traversal before any HTTP call.
  • Hygiene:
    • No plan-artifact refs (Phase N, F1, Y1, audit A, §N) in changed source/comments.
    • Conventional commit format used.
    • PR targets dev (not main), per repo policy.

Suggestions (deferred)

  • tracesExportCmd (cmd/traces.go:207) has the same pre-existing un-validated-id pattern that this PR fixes for traces get. Out of scope here; filed as a follow-up task to apply the same validateTraceID + url.PathEscape treatment + regression test.

Notes for merge

  • Fixture cmd/testdata/trace_detail_get.json carries _TODO_refresh marker — refresh against live goclaw.zuey.me before promoting devmain (per Phase 1 reviewer gate).
  • Closes #17 will auto-fire only on devmain promotion. Issue Bug: cannot read trace details by trace id #17 has been commented with this timing.

No actionable findings. Loop terminates (0 fix iterations needed).

@mrgoonie mrgoonie merged commit 7e71c03 into dev May 28, 2026
4 checks passed
mrgoonie added a commit that referenced this pull request May 28, 2026
…#21)

* feat(cli): add P6 backend-unblocked CLI surfaces (issue #16) (#19)

* plan(p6): add domain coverage P6 backend-unblocked CLI plan

Mirrors issue #16 scope (7 command surfaces) as 4 grouped implementation
phases with TDD per the user's request. Backend evidence: digitopvn/goclaw
PR #37 + PR #44. Verified beta tag containing PR #44 is v3.12.0-beta.20.

Phases:
  1. Scope Lock — contract re-verification, drift sweep
  2. Traces Follow + Providers Reconnect (PR #37)
  3. Sessions Branch + Follow (PR #44 chat)
  4. Channels Writers Test (PR #44 channels)
  5. Activity + Logs Runtime Aggregate (PR #44 aggregation)
  6. Tests + Docs sweep with out-of-scope red-team
  7. Ship Readiness — single PR to dev

Out-of-scope list retained verbatim from issue #16: traces replay,
generic logs aggregate, WS/SSE chat history, watch loops.

* feat(cli): add P6 backend-unblocked CLI surfaces (issue #16)

Seven one-shot HTTP commands wired to backend PRs #37 and #44
(gateway tag v3.12.0-beta.20+). TDD: failing tests landed before each
implementation slice; all suites green under -count=1.

PR #37 surfaces:
- traces follow --session-key|--agent [--since --limit --include-spans
  --status --channel]: GET /v1/traces/follow
- providers reconnect <id>: POST /v1/providers/{id}/reconnect

PR #44 surfaces:
- sessions branch <key> --up-to-index N [--new-session-key --label
  --metadata k=v]: POST /v1/chat/sessions/{key}/branch.
  --up-to-index=0 preserved literally on the wire (bypasses buildBody
  int-zero skip).
- sessions follow <key> [--cursor N --limit N]: one-shot cursor-based
  history poll, NOT a watch loop. --cursor=0 preserved in raw query.
- channels writers test <id> --group-id G --user-id U: POST writers/test
  with body containing exactly two keys.
- activity aggregate --group-by {action|actor_type|entity_type|actor_id}
  [--from --to --limit + scope filters]: subcommand of existing activityCmd.
- logs aggregate [--group-by {level|source} --level --source --from]:
  ring-buffer summary, distinct from logs tail.

Shared formatLastSeen helper type-switches RFC3339-string vs epoch-millis
last_seen so neither aggregate command renders scientific notation in
table view.

All path params url.PathEscape'd; all flag validation runs before HTTP
call; central error handler honored (no direct error printing).

Docs: README + docs/codebase-summary + CHANGELOG updated.

Closes #16

* fix(cli): render trace details by id with validation + exit-code mapping (issue #17) (#20)

Closes #17.

Bug surface: `goclaw traces get <id>` was unreadable in TTY mode (dumped raw JSON),
silently swallowed unmarshal errors (empty `{}`), accepted any raw id including
path-traversal sequences (`..`, `/`, control chars), and collapsed every server
failure to exit code 1 regardless of HTTP status.

Fixes:
- `cmd/traces.go`
  - `tracesGetCmd.RunE`: validate id allowlist before HTTP, decode payload with
    error surfacing, render header card + span tree + events list for TTY,
    pass-through JSON for `-o json` / piped stdout.
  - `validateTraceID`: regex allowlist `^[A-Za-z0-9._-]+$` + reserved-token reject
    (`.`, `..`, empty/whitespace). Returns typed `*client.APIError{Code: INVALID_REQUEST}`
    so the central error handler maps it to exit code 4.
  - `renderTraceTable` + `buildSpanTree`: insertion-ordered span tree, orphan spans
    attach to virtual root. No relink walk, cycle-safe (cycles silently drop).
- `internal/client/http.go`
  - Retry loop bug: previously closed `resp.Body` on every iteration including the
    final attempt, then `io.ReadAll` after the loop failed with
    "read on closed response body". This collapsed typed `APIError` into an opaque
    wrapped error and lost the exit-code mapping for 5xx/429. Fixed by skipping
    the per-iteration close on the final attempt; outer `defer` handles cleanup.

Exit-code contract for `traces get`:
- 0 success / 2 permission denied / 3 not-found /
  4 malformed id (pre-HTTP) / 5 server failure / 6 rate-limit.

Tests: 10 new tests in `cmd/traces_get_test.go` lock the wire contract,
table+JSON rendering, exit-code mapping per HTTP status, and the
"malformed id makes zero HTTP calls" invariant (parameterized table).
Fixture `cmd/testdata/trace_detail_get.json` is a scrubbed stub with
`_TODO_refresh` marker; refresh against live gateway before merging to default.

Docs: CHANGELOG `[Unreleased]` Fixed entry, README `Reading a Trace by ID`
section with exit-code table, `docs/codebase-summary.md` traces bullet updated.

* docs: add planning artifacts for P6 backend-unblocked CLI and skill scaffold
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant