Skip to content

fix(workspace): batch cloud-workspace reconcile into one mutation#2256

Closed
andrewm4894 wants to merge 0 commit into
mainfrom
fix/batch-cloud-workspace-reconcile
Closed

fix(workspace): batch cloud-workspace reconcile into one mutation#2256
andrewm4894 wants to merge 0 commit into
mainfrom
fix/batch-cloud-workspace-reconcile

Conversation

@andrewm4894
Copy link
Copy Markdown
Member

Summary

Root-cause fix for the cloud-workspace storm part of #2251.

The reconcile effect in MainLayout.tsx fires one workspace.create tRPC mutation per cloud task that has no local workspace row. With ~100 cloud tasks at boot, that's 100 simultaneous IPC roundtrips, each paying for tRPC validation + a sqlite write + a workspace.getAll subscription invalidation. Logs show the rate-limit cap saturating:

[ipc-rate] mutation workspace.create called 101 times in 2000ms
[workspace] Creating workspace for task <id> in  (mode: cloud, useExistingBranch: undefined)
[workspace] Creating workspace for task <id> in  (mode: cloud, useExistingBranch: undefined)
...

This is the lurking systemic bug: every user who has accumulated many cloud tasks hits this on first boot (and again any time the local workspace rows get cleared — fresh install, DB reset, etc.).

Change

Add a new mutation workspace.reconcileCloudWorkspaces({ taskIds }) that:

  • Filters out task IDs that already have a workspace row (idempotent)
  • Inserts the remaining ones in a single batched db.insert().values([...]) call
  • Returns the list of newly-created task IDs

Update MainLayout.tsx to call this once with all missing IDs instead of Promise.allSettled over N parallel workspaceApi.create calls.

One IPC. One sqlite roundtrip. One subscription invalidation.

Diff shape

  • db/repositories/workspace-repository.ts — new createCloudMany(taskIds[])
  • db/repositories/workspace-repository.mock.ts — same on the mock
  • services/workspace/schemas.ts — new input/output schemas + types
  • services/workspace/service.ts — new reconcileCloudWorkspaces(taskIds[]) method
  • trpc/routers/workspace.ts — wire up the mutation
  • renderer/features/workspace/hooks/useWorkspace.ts — add workspaceApi.reconcileCloudWorkspaces
  • renderer/components/MainLayout.tsx — single batched call replaces the N-parallel pattern

Test plan

  • pnpm --filter code typecheck clean
  • pnpm --filter code test — 1348/1348 passing
  • pnpm lint clean
  • Manual: with a dev profile that has many cloud tasks but no local workspace rows, restart and confirm [ipc-rate] mutation workspace.create called 100 times in 2000ms no longer appears in main.log. Workspaces should still be reconciled (visible in the sidebar).
  • Manual: open a few tasks afterwards and confirm normal workspace creation still works via the existing create path.

Why this is separate from #2252

#2252 fixes the getTaskPrStatus IPC storm (read side). This fixes the workspace.create storm (write side, on reconcile). Same systemic shape — N IPC calls per task on boot — different code path. They don't overlap.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99700e59eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

createdAt: timestamp,
updatedAt: timestamp,
}));
this.db.insert(workspaces).values(rows).run();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle unique conflicts in batched cloud workspace insert

workspaces.taskId is unique, but this batched insert has no conflict handling. Because reconcileCloudWorkspaces computes toCreate in a separate read step, a concurrent createWorkspace for any one task can insert first, causing this single values(rows).run() to fail and abort creation for all remaining tasks in the batch. The previous per-task Promise.allSettled path allowed other tasks to succeed even if one collided, so this introduces a regression under normal concurrent usage.

Useful? React with 👍 / 👎.

Comment on lines +108 to 112
if (result.created.length > 0) {
void queryClient.invalidateQueries(
trpcReact.workspace.getAll.pathFilter(),
);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Refresh workspace query even when reconcile creates zero rows

This only invalidates workspace.getAll when result.created.length > 0. If the DB already has those rows (for example from a concurrent creator or stale client cache), reconcile returns created: [] and the UI cache is never refreshed, so tasks can remain treated as missing and reconciliation can be retried unnecessarily. The previous flow invalidated after any successful create call, including no-op returns of existing workspaces.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 20, 2026

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/code/src/main/services/workspace/service.ts:442-458
**No tests for the new reconcile method**

There are no test files for the workspace service (`services/workspace/service.test.ts` does not exist), and no new tests are added here. The deduplication logic — `findAll()` → build `existingTaskIds` set → `uniqueRequested` dedup → filter → batch insert — has several meaningful branches to cover: empty input, all-already-exist (idempotent), partial overlap, and full batch creation. These are exactly the cases that should be verified with parameterised tests, in line with the team's stated convention.

### Issue 2 of 2
apps/code/src/main/db/repositories/workspace-repository.ts:113
Defensive: use `onConflictDoNothing()` on the batch insert. `taskId` carries a `UNIQUE` constraint (schema.ts), so if `createCloudMany` is ever called with an ID that already exists — e.g., from a future caller that skips the service-layer dedup — the entire batch throws a SQLite constraint error and no rows are created. Adding the conflict clause makes the repository method safe to call in isolation and keeps the UNIQUE constraint as the source of truth.

```suggestion
    this.db.insert(workspaces).values(rows).onConflictDoNothing().run();
```

Reviews (1): Last reviewed commit: "fix(workspace): batch cloud-workspace re..." | Re-trigger Greptile

Comment on lines +442 to +458
async reconcileCloudWorkspaces(
taskIds: string[],
): Promise<ReconcileCloudWorkspacesOutput> {
if (taskIds.length === 0) return { created: [] };

const existingTaskIds = new Set(
this.workspaceRepo.findAll().map((w) => w.taskId),
);
const uniqueRequested = Array.from(new Set(taskIds));
const toCreate = uniqueRequested.filter((id) => !existingTaskIds.has(id));
if (toCreate.length === 0) return { created: [] };

log.info(
`Reconciling ${toCreate.length} cloud workspaces (requested ${taskIds.length})`,
);
this.workspaceRepo.createCloudMany(toCreate);
return { created: toCreate };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No tests for the new reconcile method

There are no test files for the workspace service (services/workspace/service.test.ts does not exist), and no new tests are added here. The deduplication logic — findAll() → build existingTaskIds set → uniqueRequested dedup → filter → batch insert — has several meaningful branches to cover: empty input, all-already-exist (idempotent), partial overlap, and full batch creation. These are exactly the cases that should be verified with parameterised tests, in line with the team's stated convention.

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/main/services/workspace/service.ts
Line: 442-458

Comment:
**No tests for the new reconcile method**

There are no test files for the workspace service (`services/workspace/service.test.ts` does not exist), and no new tests are added here. The deduplication logic — `findAll()` → build `existingTaskIds` set → `uniqueRequested` dedup → filter → batch insert — has several meaningful branches to cover: empty input, all-already-exist (idempotent), partial overlap, and full batch creation. These are exactly the cases that should be verified with parameterised tests, in line with the team's stated convention.

How can I resolve this? If you propose a fix, please make it concise.

createdAt: timestamp,
updatedAt: timestamp,
}));
this.db.insert(workspaces).values(rows).run();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Defensive: use onConflictDoNothing() on the batch insert. taskId carries a UNIQUE constraint (schema.ts), so if createCloudMany is ever called with an ID that already exists — e.g., from a future caller that skips the service-layer dedup — the entire batch throws a SQLite constraint error and no rows are created. Adding the conflict clause makes the repository method safe to call in isolation and keeps the UNIQUE constraint as the source of truth.

Suggested change
this.db.insert(workspaces).values(rows).run();
this.db.insert(workspaces).values(rows).onConflictDoNothing().run();
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/main/db/repositories/workspace-repository.ts
Line: 113

Comment:
Defensive: use `onConflictDoNothing()` on the batch insert. `taskId` carries a `UNIQUE` constraint (schema.ts), so if `createCloudMany` is ever called with an ID that already exists — e.g., from a future caller that skips the service-layer dedup — the entire batch throws a SQLite constraint error and no rows are created. Adding the conflict clause makes the repository method safe to call in isolation and keeps the UNIQUE constraint as the source of truth.

```suggestion
    this.db.insert(workspaces).values(rows).onConflictDoNothing().run();
```

How can I resolve this? If you propose a fix, please make it concise.

@charlesvien charlesvien force-pushed the fix/batch-cloud-workspace-reconcile branch from 99700e5 to 08d0511 Compare May 20, 2026 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants