Speed up guest-agent exec readiness retries #242
Firetiger deploy monitoring skipped
This PR didn't match the auto-monitor filter configured on your GitHub connection:
Reason: PR modifies guest-agent execution logic and tracing in packages/api/lib/guest and packages/api/lib/hypervisor, not the kernel API endpoints or Temporal workflows specified in the filter. To monitor this PR anyway, reply with |
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 8b5f21d. Configure here.

Summary
Testing
Note
Medium Risk
Touches core guest vsock/gRPC exec and connection pooling on boot/restore hot paths; behavior change is intentional but affects all WaitForAgent callers.
Overview
Guest exec readiness is faster and easier to observe when `WaitForAgent` is set (restore/fork/API paths that wait for the agent).

Retry backoff replaces a fixed 500ms sleep with 25ms for the first 2s, then 250ms until the existing deadline. On retryable vsock/gRPC-unavailable errors, the pooled connection is removed and closed so the next attempt dials fresh instead of sitting in gRPC backoff.
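
The retry schedule and the drop-and-redial behavior described above can be sketched roughly as follows. This is not the PR's code: every identifier (`retryInterval`, `pool`, `execWithRetry`, `errUnavailable`, `noSleep`) is hypothetical, the connection is a stand-in rather than a real gRPC client, and the sketch assumes the 2s boundary is exclusive. It also tracks elapsed time as the sum of sleeps to stay deterministic, where a real loop would presumably consult a clock or context deadline.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Intervals quoted in the PR description; the constant names are hypothetical.
const (
	fastInterval = 25 * time.Millisecond
	fastWindow   = 2 * time.Second
	slowInterval = 250 * time.Millisecond
)

// retryInterval picks the sleep before the next attempt from the time waited so far.
func retryInterval(elapsed time.Duration) time.Duration {
	if elapsed < fastWindow {
		return fastInterval
	}
	return slowInterval
}

// errUnavailable stands in for a retryable vsock/gRPC-unavailable error.
var errUnavailable = errors.New("agent unavailable")

// conn is a stand-in for a pooled guest connection.
type conn struct{ closed bool }

func (c *conn) Close() { c.closed = true }

// pool caches one connection per guest key and counts how often it dials.
type pool struct {
	mu    sync.Mutex
	conns map[string]*conn
	dials int
}

func (p *pool) get(key string) *conn {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.conns[key]; ok {
		return c
	}
	p.dials++
	c := &conn{}
	p.conns[key] = c
	return c
}

// drop removes and closes the cached connection so the next get dials fresh.
func (p *pool) drop(key string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.conns[key]; ok {
		delete(p.conns, key)
		c.Close()
	}
}

// noSleep is a sleep function that returns immediately (useful in tests).
func noSleep(time.Duration) {}

// execWithRetry retries exec until it succeeds, fails with a non-retryable
// error, or the next sleep would exceed the wait budget.
func execWithRetry(p *pool, key string, wait time.Duration,
	sleep func(time.Duration), exec func(*conn) error) (attempts int, err error) {
	var elapsed time.Duration
	for {
		attempts++
		err = exec(p.get(key))
		if err == nil || !errors.Is(err, errUnavailable) {
			return attempts, err
		}
		p.drop(key) // avoid reusing a connection that may be stuck in backoff
		d := retryInterval(elapsed)
		if elapsed+d > wait {
			return attempts, err
		}
		sleep(d)
		elapsed += d
	}
}

func main() {
	p := &pool{conns: map[string]*conn{}}
	calls := 0
	attempts, err := execWithRetry(p, "vm-1", 5*time.Second, noSleep, func(*conn) error {
		calls++
		if calls < 4 {
			return errUnavailable // agent not ready for the first three attempts
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "dials:", p.dials, "err:", err)
}
```

Because each retryable failure evicts the cached connection, the number of dials tracks the number of attempts in this sketch. That is the trade-off the description implies: a little extra dial cost in exchange for not waiting on a stale connection.
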
OpenTelemetry adds a `guest.exec` span only for `WaitForAgent > 0` (single-attempt exec stays untraced to avoid duplicating API `exec.session` spans). The span records wait time, attempt counts, retry intervals, first/last retryable error types, and command basename only (no argv/env).

Tests: new unit tests for retry timing, sanitized command names, fresh-connection retries, and `CloseConn` behavior; integration helpers stop nesting `WaitForAgent` inside `waitForExecAgent`, and the VZ standby test waits for Running before standby.

Reviewed by Cursor Bugbot for commit d53eb85. Bugbot is set up for automated code reviews on this repo. Configure here.
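
For the "command basename only" rule mentioned in the overview, a minimal sketch of the sanitization might look like the following. The helper name `commandName` and the `"unknown"` fallback are assumptions for illustration, not names taken from the PR. It uses the standard `path` package on the assumption that guest paths are slash-separated.

```go
package main

import (
	"fmt"
	"path"
)

// commandName returns only the basename of argv[0]. Directories, arguments,
// and environment are never included, so they cannot leak into a span attribute.
func commandName(argv []string) string {
	if len(argv) == 0 || argv[0] == "" {
		return "unknown" // hypothetical fallback for an empty command
	}
	return path.Base(argv[0])
}

func main() {
	argv := []string{"/usr/bin/curl", "-H", "Authorization: Bearer secret"}
	fmt.Println(commandName(argv))
}
```

Recording only the basename keeps the span useful for grouping by command while leaving tokens and paths passed as arguments out of telemetry.
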