feat(ci): add diagnose command#506

Open
121watts wants to merge 3 commits into main from watts/dep-4264-ci-diagnose-cli

121watts commented May 8, 2026

Summary

The human-facing command stays depot ci diagnose: pass the ID of a failed run, workflow, job, or attempt, and the CLI renders the API's FailureDiagnosis response as a readable triage summary.

What was happening

Users could fetch logs and summaries, but the CLI had no first-class way to ask which failure mattered. Matrix jobs made this worse: several attempts could fail or be cancelled for the same underlying reason, and repeating every representative drill-down command in a footer made large diagnoses noisy.

What happens now

  • Adds depot ci diagnose <id> with automatic target resolution and --output json.
  • Keeps a hidden --type run|workflow|job|attempt disambiguation flag for rare ID collisions.
  • Calls the API failure diagnosis endpoint and renders grouped failures, representative attempts, relevant log lines, and contextual follow-up depot ci logs / depot ci summary commands.
  • Avoids repeating representative drill-down commands in a giant Next commands footer; text output keeps those commands under the representative attempt where they are useful.
  • Uses explicit representative wording such as "Showing 3 of 7 similar attempts for this group." so that bounded output looks intentional instead of incomplete.
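
As a rough illustration of the bounded-output wording in the last bullet, here is a minimal Go sketch. The helper name representativeNote and its parameters are hypothetical and are not taken from this PR's diagnose.go; only the message text comes from the description above.

```go
package main

import "fmt"

// representativeNote is a hypothetical helper (not from the PR). It returns
// the explicit "Showing N of M" wording only when the output was actually
// bounded, i.e. fewer attempts are shown than exist in the group.
func representativeNote(shown, total int) string {
	if shown <= 0 || shown >= total {
		// Nothing was omitted (or the counts are unusable), so say nothing.
		return ""
	}
	return fmt.Sprintf("Showing %d of %d similar attempts for this group.", shown, total)
}

func main() {
	fmt.Println(representativeNote(3, 7))
}
```

Returning an empty string when nothing was omitted keeps fully listed groups free of a redundant note.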

Sample Response

Target: workflow 2d2xsk1rtq (failed)
Failure groups: 2

Group 1: [Errno -2] Name or service not known
  2 failures
  Where: Step 24 (Set up databases (slow path - run migrations)): script exited with code 1

  Diagnosis:
    The Django `setup_test_environment` management command ... could not resolve the PostgreSQL
    database hostname...

  Possible fix:
    Ensure the PostgreSQL service container ... is defined and running...

  Attempts:
    - #1 dq1t0hmhn2  Node.js Tests (2/3) (shard=2) (failed)
      Logs: depot ci logs dq1t0hmhn2 --org cl0wyyk6k39487ebgraxasinja
      View: https://depot.dev/orgs/...
    - #1 cdtlqs7fjp  Node.js Tests (3/3) (shard=3) (failed)
      Logs: depot ci logs cdtlqs7fjp --org cl0wyyk6k39487ebgraxasinja
      View: https://depot.dev/orgs/...

  Evidence:
    - ... django.db.utils.OperationalError: [Errno -2] Name or service not known
    - ... psycopg.OperationalError: [Errno -2] Name or service not known

Validation

  • make generate
  • go test ./pkg/cmd/ci ./pkg/api
  • go test ./pkg/cmd/ci -run Diagnose
  • go test ./pkg/cmd/ci
  • go test ./...
  • git diff --check
  • go build -o bin/depot-dev ./cmd/depot
  • make bin/depot
  • Live smoke via DEPOT_API_URL=http://localhost:18080 ./bin/depot ci diagnose h5753524rz --org cl0wyyk6k39487ebgraxasinja
  • Live smoke via DEPOT_API_URL=http://127.0.0.1:18080 ./bin/depot-dev ci diagnose cvhm9dnf32: next_commands=0, logs=13, summaries=0

Depends on https://github.com/depot/api/pull/3656.
Linear: https://linear.app/depot/issue/DEP-4264/ci-diagnose-failure-triage-for-runs-workflows-jobs-and-attempts


Note

Medium Risk
Adds a new user-facing CLI command and a new CI API wrapper around GetFailureDiagnosis, with substantial output-formatting logic that could affect UX and error handling. Risk is moderate since it doesn’t touch auth flows beyond reusing existing token/org headers, but introduces new RPC usage and rendering paths.

Overview
Adds a new depot ci diagnose <id> subcommand that calls a new CIGetFailureDiagnosis wrapper (Connect RPC) to retrieve bounded failure diagnosis data for runs/workflows/jobs/attempts.

The command renders a human-readable triage summary (grouped failures, representative attempts, evidence lines, and follow-up logs/summary drill-down commands with optional --org) and also supports --output json with CLI-normalized enum strings.
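
One way to read "CLI-normalized enum strings" is that protobuf-style enum names are trimmed and lowercased before being written to JSON. The sketch below assumes that interpretation; the function normalizeEnum, the TARGET_TYPE_ prefix, and the "unknown" fallback are all hypothetical and do not come from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEnum is a hypothetical helper (not from the PR). It strips an
// assumed enum prefix and lowercases the remainder, so a name such as
// "TARGET_TYPE_WORKFLOW" becomes "workflow" in JSON output.
func normalizeEnum(name, prefix string) string {
	trimmed := strings.TrimPrefix(name, prefix)
	if trimmed == "" || trimmed == "UNSPECIFIED" {
		// Assumed fallback for the zero value of a protobuf enum.
		return "unknown"
	}
	return strings.ToLower(trimmed)
}

func main() {
	fmt.Println(normalizeEnum("TARGET_TYPE_WORKFLOW", "TARGET_TYPE_"))
	fmt.Println(normalizeEnum("TARGET_TYPE_UNSPECIFIED", "TARGET_TYPE_"))
}
```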

Includes new tests covering the API wrapper, command registration, and key rendering behaviors (empty/over-limit/focused states, omission/truncation messaging, and hiding unavailable summary commands).

Reviewed by Cursor Bugbot for commit e6c1cad. Bugbot is set up for automated code reviews on this repo.

linear-code Bot commented May 8, 2026

DEP-4264

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: TruncatedContextFields serializes as null instead of empty array
    • Changed line 637 to use make([]string, 0, len(...)) to guarantee a non-nil slice that serializes as [] instead of null, matching the nil-context guard and other array fields.

Push these changes by commenting:

@cursor push 36e7c44691
Preview (36e7c44691)
diff --git a/pkg/cmd/ci/diagnose.go b/pkg/cmd/ci/diagnose.go
--- a/pkg/cmd/ci/diagnose.go
+++ b/pkg/cmd/ci/diagnose.go
@@ -613,6 +613,8 @@
 	if context == nil {
 		return diagnoseContextJSON{TruncatedContextFields: []string{}}
 	}
+	truncatedFields := make([]string, 0, len(context.GetTruncatedContextFields()))
+	truncatedFields = append(truncatedFields, context.GetTruncatedContextFields()...)
 	return diagnoseContextJSON{
 		RunID:                  context.GetRunId(),
 		Repo:                   context.GetRepo(),
@@ -634,7 +636,7 @@
 		Attempt:                context.GetAttempt(),
 		AttemptStatus:          context.GetAttemptStatus(),
 		AttemptConclusion:      context.GetAttemptConclusion(),
-		TruncatedContextFields: append([]string(nil), context.GetTruncatedContextFields()...),
+		TruncatedContextFields: truncatedFields,
 	}
 }

Reviewed by Cursor Bugbot for commit 89f292c.

Comment thread pkg/cmd/ci/diagnose.go
Attempt: context.GetAttempt(),
AttemptStatus: context.GetAttemptStatus(),
AttemptConclusion: context.GetAttemptConclusion(),
TruncatedContextFields: append([]string(nil), context.GetTruncatedContextFields()...),

TruncatedContextFields serializes as null instead of empty array

Low Severity

When context is non-nil but has no truncated fields, append([]string(nil), context.GetTruncatedContextFields()...) returns nil, because appending zero elements to a nil slice leaves it nil, and a nil slice JSON-encodes as null. This contradicts the nil-context path at line 614, which explicitly returns []string{} (JSON []), and is inconsistent with every other array field in the JSON document (e.g. FailureGroups, RepresentativeAttempts, NextCommands), each of which uses make([]…, 0, …) to guarantee a non-nil slice. JSON consumers expecting a consistent [] for "no truncated fields" will see null in the common case.
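
The difference can be reproduced in isolation with the standard library. The sketch below is illustrative only: the struct, its JSON field name truncated_context_fields, and the helper names are assumptions rather than the types in pkg/cmd/ci/diagnose.go, but the two copy expressions match the before and after forms shown in the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// contextJSON is a stand-in for the real diagnoseContextJSON type; the JSON
// field name is an assumption made for this example.
type contextJSON struct {
	TruncatedContextFields []string `json:"truncated_context_fields"`
}

// copyNilable mirrors the original expression: with an empty source, append
// adds nothing to the nil slice, so the result is still nil.
func copyNilable(src []string) []string {
	return append([]string(nil), src...)
}

// copyNonNil mirrors the proposed fix: make always returns a non-nil slice,
// even when its length and capacity are zero.
func copyNonNil(src []string) []string {
	out := make([]string, 0, len(src))
	return append(out, src...)
}

// encode marshals the fields inside the stand-in struct.
func encode(fields []string) string {
	b, err := json.Marshal(contextJSON{TruncatedContextFields: fields})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(copyNilable(nil))) // {"truncated_context_fields":null}
	fmt.Println(encode(copyNonNil(nil)))  // {"truncated_context_fields":[]}
}
```

Both helpers behave identically for non-empty input; they only diverge when there is nothing to copy.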

121watts force-pushed the watts/dep-4264-ci-diagnose-cli branch from 89f292c to 3551bb9 (May 8, 2026 11:24)
121watts changed the title from "Add ci diagnose command" to "feat(ci): add diagnose command" (May 8, 2026)
121watts force-pushed the watts/dep-4264-ci-diagnose-cli branch from 3551bb9 to 180881c (May 8, 2026 11:45)
121watts changed the title from "feat(ci): add diagnose command" to "feat(ci): add command" (May 8, 2026)
121watts changed the title from "feat(ci): add command" to "feat(ci): add diagnose command" (May 8, 2026)
121watts force-pushed the watts/dep-4264-ci-diagnose-cli branch from b1fd5bb to 8624f9c (May 8, 2026 13:55)
121watts force-pushed the watts/dep-4264-ci-diagnose-cli branch from 8624f9c to 8ebec37 (May 8, 2026 15:05)