Skip to content

fix(redis): stale sandboxes in killing state#2388

Merged
jakubno merged 4 commits intomainfrom
fix/redis-stale-sandboxes-in-killing-state
Apr 14, 2026
Merged

fix(redis): stale sandboxes in killing state#2388
jakubno merged 4 commits intomainfrom
fix/redis-stale-sandboxes-in-killing-state

Conversation

@jakubno
Copy link
Copy Markdown
Member

@jakubno jakubno commented Apr 14, 2026

If the sandbox enters killing state and the operation is canceled or the API crashes, it can cause the sandbox to remain stuck indefinitely.

This prevents getting stuck due to context cancellation + adds a reconcile mechanism

@cursor
Copy link
Copy Markdown

cursor bot commented Apr 14, 2026

PR Summary

Medium Risk
Changes sandbox removal and Redis expiration pruning logic; mistakes could cause premature deletion or lingering sandboxes, but scope is limited to cleanup paths.

Overview
Prevents sandboxes from getting stuck in terminal transition states by making store cleanup and analytics emission resilient to context cancellation (using context.WithoutCancel) and by adding a time-based reconcile that force-removes entries that have been in a terminal removal state beyond sandbox.StaleCutoff, which is now a shared constant used across the orchestrator and Redis eviction/pruning logic.

Reviewed by Cursor Bugbot for commit 55150ca. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread packages/api/internal/sandbox/storage/redis/operations.go
@jakubno jakubno marked this pull request as ready for review April 14, 2026 11:18
@jakubno jakubno requested review from sitole and removed request for ValentaTomas and dobrac April 14, 2026 11:18
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1032a5ebf0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/api/internal/sandbox/storage/redis/operations.go
Comment thread packages/api/internal/orchestrator/delete_instance.go
Comment on lines +84 to +94
if time.Since(sbx.EndTime) > sandbox.StaleCutoff && opts.Action.Effect == sandbox.TransitionExpires {
o.sandboxStore.Remove(context.WithoutCancel(ctx), teamID, sandboxID)
go o.analyticsRemove(context.WithoutCancel(ctx), sbx, opts.Action)
}

return nil
}

defer func() { go o.analyticsRemove(context.WithoutCancel(ctx), sbx, opts.Action) }()
defer o.sandboxStore.Remove(ctx, teamID, sandboxID)
// Once we start the removal process, we want to make sure it gets removed from the store
defer o.sandboxStore.Remove(context.WithoutCancel(ctx), teamID, sandboxID)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it worth creating an anonymous cleanup function calling sandbox store, remove, and analytics remove without ctx?

@jakubno jakubno merged commit 8eb5549 into main Apr 14, 2026
45 checks passed
@jakubno jakubno deleted the fix/redis-stale-sandboxes-in-killing-state branch April 14, 2026 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants