Skip to content

Harden backend pipeline and verification coverage#3

Open
Erics1337 wants to merge 36 commits into
masterfrom
development
Open

Harden backend pipeline and verification coverage#3
Erics1337 wants to merge 36 commits into
masterfrom
development

Conversation

@Erics1337

@Erics1337 Erics1337 commented May 29, 2026

Copy link
Copy Markdown
Owner

Summary

  • harden snapshot RPC access and durable pipeline cancellation paths
  • expand backend e2e coverage for ingest, cancel, stale-worker cancellation, status, structure, backups, auth mismatches, manual bookmark queueing, duplicate URLs, search validation, and SSRF-sensitive paths
  • add real manual ingest/enrichment e2e coverage proving localhost SSRF targets are not fetched before embedding enqueue
  • harden route error handling for device registration, search, status response schemas, and web manual-add backend proxy timeouts
  • add focused extension architecture tests plus backend smoke command
  • pass backend e2e env through Turbo so full e2e runs can target fresh local ports
  • add queue_job_failures state plus Lambda retry-exhaustion handling for SQS worker failures

Final verification

  • pnpm --filter backend test
  • pnpm --filter extension test
  • pnpm build
  • env PORT=3346 BACKEND_E2E_BASE_URL=http://127.0.0.1:3346 pnpm test:e2e
  • env BACKEND_SMOKE_URL=http://127.0.0.1:3347 BACKEND_SMOKE_USER_ID=status-user BACKEND_SMOKE_ACCESS_TOKEN=status-user BACKEND_SMOKE_RUN_EMPTY_INGEST=true pnpm smoke:backend
  • terraform -chdir=infra/terraform fmt -check
  • git diff --check

Summary by CodeRabbit

  • New Features

    • Cloud Snapshots: create, list, restore, and delete cloud-backed structure snapshots
    • Apply recovery: resume or rollback interrupted bookmark reorganizations
    • Pipeline job-generation tracking and recovery for more reliable background processing
  • Improvements

    • Stronger rate limiting, request timeouts and clearer error responses
    • Better cancellation, retry and queueing behavior for ingest/enrichment workflows
    • More robust duplicate/manual-ingest handling and stability during sync
  • Security

    • Tightened CORS rules (extension-aware), DB row-level security, and per-user device limits

@coderabbitai

coderabbitai Bot commented May 29, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Backend pipeline-run generation and cancellation were persisted and threaded into routes, workers, and queue metadata. Extension clients/hooks were refactored to shared Structure/CloudSnapshot clients and specialized hooks; web APIs gained origin/rate-limit/error guards; cloud snapshots replaced backups in UI; comprehensive tests and DB migrations were added.

Changes

Pipeline generation, API hardening, and UI refactor

Layer / File(s) Summary
Pipeline SQL, RPCs, and tests
supabase/migrations/*, supabase/tests/database/*
Adds pipeline_run tracking, job_generation, pipeline RPCs (begin/claim/record/complete/fail), queue_job_failures table, RLS/policies, and SQL tests validating snapshot and pipeline behaviors.
Backend core: routes, worker, queues, coordinator
apps/backend/src/routes/*, apps/backend/src/lambda/*, apps/backend/src/lib/*, apps/backend/src/index.ts
Routes updated for pipeline-aware ingest/status/backups; worker records failures and honors attempts/backoff; queues include retry metadata and test driver helpers; pipelineCoordinator coordinates RPCs and clustering enqueue logic; rate-limiter and E2E test endpoints added.
Clustering pipeline and helpers
apps/backend/src/queues/clustering/*
New clustering algorithm, cluster naming (OpenAI + heuristic), persistence, vector parsing, recovery routine, and tests for cluster persistence and pipeline recovery.
Extension: clients, hooks, apply/journal, persistence, UI
apps/extension/src/lib/*, apps/extension/src/hooks/*, apps/extension/src/screens/*
Adds StructureClient/CloudSnapshotClient, apply plan/journal, processingSession, preview builder, backup client, many hooks (useWeaveRun, useStructureResults, useBookmarkTools, useBookmarkPersistence, useChromeApply), UI components/screens updated to cloud snapshots, and broad test coverage.
Web API security, dashboard, and legal pages
apps/web/utils/api/*, apps/web/app/api/*, apps/web/app/dashboard/*
Introduces origin/enforceSameOrigin, rateLimit, sanitizeApiError, requireApiUser; updates API routes (bookmarks/add, checkout, waitlist, webhooks) and refactors dashboard/legal pages to shared UI components.
E2E, smoke, scripts, tooling, infra
apps/backend-e2e/*, apps/backend/src/scripts/smoke.ts, turbo.json, infra/terraform/*, .gitignore, package.json
Rewrote backend e2e harness to seed and run backend with test endpoints; added smoke script; updated turbo globalEnv, Terraform CORS/rate-limit variables, Supabase config/migrations, and gitignore.

Sequence Diagram(s)

sequenceDiagram
  participant IngestRoute
  participant PipelineCoordinator
  participant Supabase
  participant QueueWorker
  IngestRoute->>PipelineCoordinator: beginUserPipelineRun(userId)
  PipelineCoordinator->>Supabase: begin_user_pipeline_run RPC
  Supabase-->>PipelineCoordinator: generation + run id
  IngestRoute->>PipelineCoordinator: recordPipelineRunStarted(...)
  IngestRoute->>QueueWorker: enqueue ingest / clustering jobs with jobGeneration
  QueueWorker->>PipelineCoordinator: recordPipelineIngestCompleted / recordPipelineClusteringCompleted
  PipelineCoordinator->>Supabase: pipeline run RPC updates
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

"I hopped through code and burrowed deep,
Seeds of runs and queues I keep.
Clouds now snap before I apply,
Workers hum and tests comply.
A rabbit cheers — the pipeline's neat! 🐇"

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch development

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/backend-e2e/tests/backend.test.ts (1)

40-42: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Always boot a controlled E2E backend here.

This early return lets the suite attach to any process already answering /health on the target port. That bypasses the intended BACKEND_E2E_FAKE_SUPABASE=true and QUEUE_DRIVER=test setup, so these queue/backup/cancel tests can become flaky or mutate non-test state.

Suggested hardening
 before(async () => {
   const existingHealth = await readHealth();
-  if (existingHealth?.status === 'ok') return;
+  if (existingHealth?.status === 'ok') {
+    const e2eProbe = await request('/__e2e/queues').catch(() => null);
+    if (e2eProbe?.ok) return;
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend-e2e/tests/backend.test.ts` around lines 40 - 42, The test
setup's before hook returns early when readHealth() reports 'ok', which lets the
suite attach to arbitrary existing services and bypass the intended controlled
E2E boot (so environment flags like BACKEND_E2E_FAKE_SUPABASE and
QUEUE_DRIVER=test are not applied); remove that early return and always start
the controlled test backend here: in the before(async () => { ... }) block
replace the conditional return with logic that always invokes the test bootstrap
(e.g., call your startup helper such as startTestBackend/startE2EBackend or the
existing boot function used elsewhere) after readHealth(), ensuring the
controlled environment variables (BACKEND_E2E_FAKE_SUPABASE, QUEUE_DRIVER) are
applied and the suite waits for a successful health check before continuing.
apps/backend/src/routes/ingest.ts (1)

77-101: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make the destructive reset + requeue flow atomic.

If bookmark deletion succeeds and cluster deletion or queues.ingest.add(...) fails, this route returns an error after the user's existing import has already been torn down. That is a real data-loss window; this needs a transaction/RPC/outbox-style flow rather than independent mutations on the request path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/routes/ingest.ts` around lines 77 - 101, The current flow in
ingest.ts performs independent destructive deletes
(supabase.from('bookmarks').delete and supabase.from('clusters').delete) and
then calls queues.ingest.add, which can leave the system in a torn-down state if
enqueueing fails; change this to an atomic pattern: move the deletes and a
queueing intent (an outbox row) into a single database transaction/RPC (e.g.,
create a Postgres function or use supabase.rpc) that deletes bookmarks and
clusters for the given user_id and inserts a single outbox/job record (with
userId, bookmarks payload, clusteringSettings, jobGeneration and desired jobId)
in one transaction; then have the worker read the outbox and call
queues.ingest.add (or have a separate process that drains outbox -> Bull queue)
so the destructive reset and the enqueue intent are committed together and no
data-loss window exists.
🧹 Nitpick comments (5)
apps/backend/src/lib/cancellation.ts (1)

70-87: Read amplification after removing the in-memory cache.

isUserCancelled now issues a Supabase read on every invocation; the previous in-memory set is gone. If workers call this frequently (per job / per batch iteration), this adds a DB round-trip to each check and increases load on user_pipeline_controls. Consider a short-lived per-invocation memoization or a single check per job rather than per item if any call site loops.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lib/cancellation.ts` around lines 70 - 87, isUserCancelled
currently does a Supabase read via readPipelineControl on every call, causing
read amplification now that the in-memory set was removed; change it to use a
short-lived memoization or allow a per-invocation cache so repeated checks for
the same user in one job don't repeat DB reads. Modify isUserCancelled (and
callers) to accept an optional cache Map<string, boolean | PipelineControl> or a
job-scoped context object, check that cache first, populate it with the
readPipelineControl result on a miss, and return the cached value on subsequent
calls; alternatively, add a helper getOrLoadPipelineControl(userId, cache) that
wraps readPipelineControl to centralize this behavior. Ensure the unique symbols
referenced are isUserCancelled and readPipelineControl so callers can pass the
cache and avoid per-item DB round-trips.
supabase/migrations/20260529015619_add_pipeline_job_generation.sql (1)

4-5: Confirm the composite index earns its keep.

Reads in cancellation.ts filter only by user_id (.eq('user_id', ...)), and the upsert(..., { onConflict: 'user_id' }) requires a pre-existing unique index/constraint on user_id. This composite index (user_id, is_cancelled, job_generation) is therefore mostly redundant for the access pattern (at best a covering index on a table that holds one row per user). Unless a query actually filters on is_cancelled, consider dropping it to avoid write overhead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/migrations/20260529015619_add_pipeline_job_generation.sql` around
lines 4 - 5, The composite index idx_user_pipeline_controls_cancelled on
(user_id, is_cancelled, job_generation) appears unnecessary given
cancellation.ts only filters by user_id and upsert(..., { onConflict: 'user_id'
}) requires a unique/index on user_id; verify whether queries actually filter by
is_cancelled and if not, replace the composite index with a single-column unique
index/constraint or plain index on user_id (or drop the composite and ensure a
unique constraint exists for user_id used by upsert), removing is_cancelled and
job_generation from the index to reduce write overhead.
apps/backend/src/lib/__tests__/manualBookmark.test.ts (1)

31-45: ⚡ Quick win

Assert rejected URLs never start a new pipeline generation.

This case only checks that no ingest job was queued. Adding expect(beginUserPipelineRun).not.toHaveBeenCalled() would catch a regression where invalid input still invalidates the user's active generation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lib/__tests__/manualBookmark.test.ts` around lines 31 - 45,
The test should also assert that pipeline generation is never started for
invalid URLs: in the 'rejects unsupported URL protocols before queueing work'
spec (which calls queueManualBookmark), add an expectation that
beginUserPipelineRun was not invoked (e.g.,
expect(beginUserPipelineRun).not.toHaveBeenCalled()) alongside the existing
expect for queues.ingest.add to ensure invalid input doesn't trigger a pipeline
run.
apps/backend/src/queues/__tests__/embedding.test.ts (1)

91-121: ⚡ Quick win

Assert both cancellation checkpoints explicitly.

toHaveBeenCalledWith(...) still passes if only one of the two generation-aware guards survives. Add toHaveBeenCalledTimes(2) and nth-call assertions so the pre-start and pre-status-update checks are both covered.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/queues/__tests__/embedding.test.ts` around lines 91 - 121,
Add explicit assertions that isUserCancelled was called twice and verify each
call: after the embeddingProcessor(job) invocation add
expect(isUserCancelled).toHaveBeenCalledTimes(2) and then assert the individual
calls with expect(isUserCancelled).toHaveBeenNthCalledWith(1, 'user-2', 7) and
expect(isUserCancelled).toHaveBeenNthCalledWith(2, 'user-2', 7) so both the
pre-start and pre-status-update cancellation checkpoints in the test are
covered.
apps/backend/src/queues/__tests__/enrichment.test.ts (1)

91-117: ⚡ Quick win

Cover all three generation-aware cancellation checks.

These assertions only prove that one matching call happened. The worker now checks cancellation before start, after fetch, and before enqueue, so the test should pin the expected call count/order as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/queues/__tests__/enrichment.test.ts` around lines 91 - 117,
The test only asserts a single cancellation check but enrichmentProcessor calls
isUserCancelled three times (before start, after fetch, before enqueue); update
the test to assert isUserCancelled was called exactly three times and verify the
exact sequence/arguments using toHaveBeenNthCalledWith for each call (e.g.,
isUserCancelled called 1st with 'user-2', 5; 2nd with 'user-2', 5; 3rd with
'user-2', 5) and also assert queues.embedding.add was called once with the
expected payload and jobId to ensure enqueue still happened.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/backend/src/db/fakeSupabase.ts`:
- Around line 52-108: The module-level mutable state object (state) in
fakeSupabase is shared across tests and needs a reset helper and stable snapshot
IDs: add and export a reset function (e.g., resetFakeSupabase or
clearFakeSupabaseState) that reinitializes state.users, state.bookmarks,
state.clusters, state.assignments, state.snapshots and state.controls to their
original fixtures (clearing Maps/arrays in-place or replacing with fresh
instances) and also reset any new monotonic counter; replace the current
create_structure_snapshot ID generation (which uses
snapshot-${state.snapshots.length + 1}) with a monotonic counter or UUID (e.g.,
snapshotCounter++ or a UUID generator) stored at module scope so IDs never
collide after deletions, and ensure resetFakeSupabase resets that counter too so
tests get a clean slate.

In `@apps/backend/src/db/index.ts`:
- Around line 11-24: The code currently enables the fakeSupabase adapter
whenever BACKEND_E2E_FAKE_SUPABASE === 'true', which can silently swap out the
real DB in production; update the guard around useFakeSupabase to only allow the
fake adapter in non-production test contexts (e.g. require process.env.NODE_ENV
=== 'test' or !== 'production') and fail fast otherwise: change the logic that
sets useFakeSupabase to combine BACKEND_E2E_FAKE_SUPABASE with an explicit
environment check, and if BACKEND_E2E_FAKE_SUPABASE is set in production, log an
error and exit (or throw) instead of creating supabase from fakeSupabase; adjust
the initialization/console messages around supabase, fakeSupabase, createClient
and supabaseServiceRoleKey accordingly so production cannot accidentally use the
in-memory adapter.

In `@apps/backend/src/index.ts`:
- Around line 70-75: The __e2e routes are currently enabled solely by
BACKEND_E2E_FAKE_SUPABASE and are unauthenticated; change the guard to require a
test-only environment and an optional shared secret: only register the routes
when process.env.BACKEND_E2E_FAKE_SUPABASE === 'true' AND process.env.NODE_ENV
=== 'test', and inside the handlers validate a secret header (e.g. check
request.headers['x-e2e-secret'] against process.env.E2E_SECRET, returning 401 if
missing/invalid) before calling getTestQueuedJobs or clearTestQueuedJobs; this
ensures the '/__e2e/queues' GET and DELETE handlers are only exposed in test
runs and require the secret when present.

In `@apps/backend/src/lib/cancellation.ts`:
- Around line 54-59: beginUserPipelineRun currently does a read-modify-write via
readPipelineControl → persistPipelineControl which allows races; replace that
with a single DB-side atomic increment (e.g., create an RPC like
begin_user_pipeline_run that INSERT ... ON CONFLICT ... SET job_generation =
job_generation + 1, is_cancelled = false, updated_at = now() RETURNING
job_generation) and change beginUserPipelineRun to call that RPC instead of
readPipelineControl/persistPipelineControl; likewise change markUserCancelled to
stop re-writing job_generation (use an UPDATE that only sets is_cancelled = true
and updated_at = now() for the given user or expose an RPC like
mark_user_cancelled) so isUserCancelled semantics remain correct and no older
generation can overwrite a newer one (references: beginUserPipelineRun,
readPipelineControl, persistPipelineControl, markUserCancelled,
isUserCancelled).

In `@apps/backend/src/routes/ingest.ts`:
- Line 51: Call to beginUserPipelineRun currently occurs before the handler's
early-return checks (free-tier rejection and cleanup query paths) so a 402/500
can cancel an in-flight pipeline; move the await beginUserPipelineRun(userId)
invocation so it runs after all early-return validations and after running the
cleanup/query logic (i.e., only after free-tier eligibility is confirmed and any
cleanup queries have completed), ensuring the pipeline is only advanced when the
request will proceed and no replacement job is needed.

In `@apps/backend/src/scripts/smoke.ts`:
- Around line 12-20: The requestJson function lacks a timeout so fetch can hang;
add a timeout using an AbortController: create a controller (const controller =
new AbortController()), if init.signal exists forward aborts by adding
init.signal.addEventListener('abort', () => controller.abort()), start a timer
(setTimeout) that calls controller.abort() after a default timeout (e.g. 10_000
ms), pass controller.signal into fetch instead of init.signal, clear the timer
after fetch completes, and when catching an AbortError throw a clear timeout
error (e.g. "request timeout after X ms") so callers of requestJson know the
request timed out; reference symbols: requestJson, fetch, backendUrl,
accessToken, RequestInit.

In `@apps/extension/src/hooks/useBookmarkWeaver.ts`:
- Around line 813-818: restoreBookmarkBackup currently calls fetchResults after
restoring Chrome bookmarks, but fetchResults will reuse originalTreeRef,
bookmarkRootMapRef, and availableRootsRef if they are non-empty, causing the
preview to be rebuilt against stale pre-restore state; before calling
fetchResults in restoreBookmarkBackup, refresh or clear the in-memory bookmark
snapshot (e.g. reset originalTreeRef, bookmarkRootMapRef, availableRootsRef or
call the existing snapshot-refresh helper) so fetchResults reads the updated
Chrome bookmark tree and rebuilds the preview from the restored data.

In `@apps/extension/src/lib/backupClient.ts`:
- Around line 121-126: The returned snapshot currently uses a constant id
'just_created', which causes collisions; update the code in backupClient.ts
where the function returns the BookmarkBackupSnapshot to instead use the real
backend ID when available (e.g., response.id) or generate a unique temporary ID
client-side (e.g., UUID or timestamp + random) and assign that to the id field
instead of 'just_created', keeping the rest of the object (name, createdAt,
summary) the same so optimistic UI operations remain consistent.

In `@apps/extension/src/lib/chromeApplyPlan.ts`:
- Around line 121-127: Wrap the calls to clearRootChildrenExcept inside a
try/catch so cleanup exceptions do not throw and abort the whole apply; on any
caught error set the existing shouldWarnAboutPartialApply flag (or create it if
absent) and record the error (e.g., increment a cleanup-failure counter or
attach a message to result) so the function returns a partial-apply result
instead of rejecting; ensure you still await each clearRootChildrenExcept call
for all rootNodes and continue to the next root after catching an error so
previously applied changes are preserved and the caller can warn about the
partial apply.

In `@apps/extension/src/lib/processingSession.ts`:
- Around line 141-152: savePreOrganizeBackup currently stores and returns the
live tree reference so later mutations change the backup; before building the
backup payload in savePreOrganizeBackup make a deep clone of the incoming tree
(e.g., using structuredClone(tree) or JSON.parse(JSON.stringify(tree)) if
structuredClone isn't available) and use that clonedTree in the backup object
and when calling chrome.storage.local.set({ [PRE_ORGANIZE_BACKUP_KEY]: backup
}); so the returned backup is an immutable snapshot independent of future
in-memory changes.

---

Outside diff comments:
In `@apps/backend-e2e/tests/backend.test.ts`:
- Around line 40-42: The test setup's before hook returns early when
readHealth() reports 'ok', which lets the suite attach to arbitrary existing
services and bypass the intended controlled E2E boot (so environment flags like
BACKEND_E2E_FAKE_SUPABASE and QUEUE_DRIVER=test are not applied); remove that
early return and always start the controlled test backend here: in the
before(async () => { ... }) block replace the conditional return with logic that
always invokes the test bootstrap (e.g., call your startup helper such as
startTestBackend/startE2EBackend or the existing boot function used elsewhere)
after readHealth(), ensuring the controlled environment variables
(BACKEND_E2E_FAKE_SUPABASE, QUEUE_DRIVER) are applied and the suite waits for a
successful health check before continuing.

In `@apps/backend/src/routes/ingest.ts`:
- Around line 77-101: The current flow in ingest.ts performs independent
destructive deletes (supabase.from('bookmarks').delete and
supabase.from('clusters').delete) and then calls queues.ingest.add, which can
leave the system in a torn-down state if enqueueing fails; change this to an
atomic pattern: move the deletes and a queueing intent (an outbox row) into a
single database transaction/RPC (e.g., create a Postgres function or use
supabase.rpc) that deletes bookmarks and clusters for the given user_id and
inserts a single outbox/job record (with userId, bookmarks payload,
clusteringSettings, jobGeneration and desired jobId) in one transaction; then
have the worker read the outbox and call queues.ingest.add (or have a separate
process that drains outbox -> Bull queue) so the destructive reset and the
enqueue intent are committed together and no data-loss window exists.

---

Nitpick comments:
In `@apps/backend/src/lib/__tests__/manualBookmark.test.ts`:
- Around line 31-45: The test should also assert that pipeline generation is
never started for invalid URLs: in the 'rejects unsupported URL protocols before
queueing work' spec (which calls queueManualBookmark), add an expectation that
beginUserPipelineRun was not invoked (e.g.,
expect(beginUserPipelineRun).not.toHaveBeenCalled()) alongside the existing
expect for queues.ingest.add to ensure invalid input doesn't trigger a pipeline
run.

In `@apps/backend/src/lib/cancellation.ts`:
- Around line 70-87: isUserCancelled currently does a Supabase read via
readPipelineControl on every call, causing read amplification now that the
in-memory set was removed; change it to use a short-lived memoization or allow a
per-invocation cache so repeated checks for the same user in one job don't
repeat DB reads. Modify isUserCancelled (and callers) to accept an optional
cache Map<string, boolean | PipelineControl> or a job-scoped context object,
check that cache first, populate it with the readPipelineControl result on a
miss, and return the cached value on subsequent calls; alternatively, add a
helper getOrLoadPipelineControl(userId, cache) that wraps readPipelineControl to
centralize this behavior. Ensure the unique symbols referenced are
isUserCancelled and readPipelineControl so callers can pass the cache and avoid
per-item DB round-trips.

In `@apps/backend/src/queues/__tests__/embedding.test.ts`:
- Around line 91-121: Add explicit assertions that isUserCancelled was called
twice and verify each call: after the embeddingProcessor(job) invocation add
expect(isUserCancelled).toHaveBeenCalledTimes(2) and then assert the individual
calls with expect(isUserCancelled).toHaveBeenNthCalledWith(1, 'user-2', 7) and
expect(isUserCancelled).toHaveBeenNthCalledWith(2, 'user-2', 7) so both the
pre-start and pre-status-update cancellation checkpoints in the test are
covered.

In `@apps/backend/src/queues/__tests__/enrichment.test.ts`:
- Around line 91-117: The test only asserts a single cancellation check but
enrichmentProcessor calls isUserCancelled three times (before start, after
fetch, before enqueue); update the test to assert isUserCancelled was called
exactly three times and verify the exact sequence/arguments using
toHaveBeenNthCalledWith for each call (e.g., isUserCancelled called 1st with
'user-2', 5; 2nd with 'user-2', 5; 3rd with 'user-2', 5) and also assert
queues.embedding.add was called once with the expected payload and jobId to
ensure enqueue still happened.

In `@supabase/migrations/20260529015619_add_pipeline_job_generation.sql`:
- Around line 4-5: The composite index idx_user_pipeline_controls_cancelled on
(user_id, is_cancelled, job_generation) appears unnecessary given
cancellation.ts only filters by user_id and upsert(..., { onConflict: 'user_id'
}) requires a unique/index on user_id; verify whether queries actually filter by
is_cancelled and if not, replace the composite index with a single-column unique
index/constraint or plain index on user_id (or drop the composite and ensure a
unique constraint exists for user_id used by upsert), removing is_cancelled and
job_generation from the index to reduce write overhead.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae766943-fac6-40fa-a050-35048b938603

📥 Commits

Reviewing files that changed from the base of the PR and between 2175aff and eb99fec.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (38)
  • apps/backend-e2e/tests/backend.test.ts
  • apps/backend/package.json
  • apps/backend/src/db/fakeSupabase.ts
  • apps/backend/src/db/index.ts
  • apps/backend/src/index.ts
  • apps/backend/src/lib/__tests__/cancellation.test.ts
  • apps/backend/src/lib/__tests__/manualBookmark.test.ts
  • apps/backend/src/lib/__tests__/queue.test.ts
  • apps/backend/src/lib/__tests__/snapshotRpcSql.test.ts
  • apps/backend/src/lib/cancellation.ts
  • apps/backend/src/lib/manualBookmark.ts
  • apps/backend/src/lib/queue.ts
  • apps/backend/src/queues/__tests__/clustering.test.ts
  • apps/backend/src/queues/__tests__/embedding.test.ts
  • apps/backend/src/queues/__tests__/enrichment.test.ts
  • apps/backend/src/queues/__tests__/ingest.test.ts
  • apps/backend/src/queues/clustering.ts
  • apps/backend/src/queues/embedding.ts
  • apps/backend/src/queues/enrichment.ts
  • apps/backend/src/queues/ingest.ts
  • apps/backend/src/routes/backups.ts
  • apps/backend/src/routes/ingest.ts
  • apps/backend/src/scripts/smoke.ts
  • apps/extension/package.json
  • apps/extension/src/hooks/useBookmarkWeaver.ts
  • apps/extension/src/lib/__tests__/chromeApplyPlan.test.ts
  • apps/extension/src/lib/__tests__/processingSession.test.ts
  • apps/extension/src/lib/__tests__/structureClient.test.ts
  • apps/extension/src/lib/__tests__/structurePreviewBuilder.test.ts
  • apps/extension/src/lib/backupClient.ts
  • apps/extension/src/lib/chromeApplyPlan.ts
  • apps/extension/src/lib/processingSession.ts
  • apps/extension/src/lib/structureClient.ts
  • apps/extension/src/lib/structurePreviewBuilder.ts
  • package.json
  • supabase/migrations/20260529015351_harden_snapshot_rpc_search_path.sql
  • supabase/migrations/20260529015619_add_pipeline_job_generation.sql
  • turbo.json

Comment thread apps/backend/src/db/fakeSupabase.ts Outdated
Comment on lines +52 to +108
const state = {
users: new Map<string, { id: string; is_premium: boolean }>([
['status-user', { id: 'status-user', is_premium: false }],
['premium-user', { id: 'premium-user', is_premium: true }],
]),
bookmarks: [
{ id: 'pending-1', user_id: 'status-user', status: 'pending' },
{ id: 'enriched-1', user_id: 'status-user', status: 'enriched' },
{
id: 'embedded-1',
user_id: 'status-user',
status: 'embedded',
title: 'Embedded One',
ai_title: 'AI Embedded One',
description: 'Fixture bookmark',
url: 'https://example.com/embedded-one',
chrome_id: 'chrome-embedded-1',
},
{ id: 'embedded-2', user_id: 'status-user', status: 'embedded' },
{ id: 'error-1', user_id: 'status-user', status: 'error' },
] as BookmarkRow[],
clusters: [
{ id: 'cluster-1', user_id: 'status-user', name: 'Fixture Cluster', parent_id: null },
] as ClusterRow[],
assignments: [
{
cluster_id: 'cluster-1',
bookmark_id: 'embedded-1',
user_id: 'status-user',
clusters: { user_id: 'status-user' },
bookmarks: {
title: 'Embedded One',
ai_title: 'AI Embedded One',
description: 'Fixture bookmark',
url: 'https://example.com/embedded-one',
chrome_id: 'chrome-embedded-1',
},
},
] as AssignmentRow[],
snapshots: [
{
id: 'snapshot-1',
user_id: 'status-user',
name: 'Fixture Snapshot',
created_at: '2026-05-29T00:00:00.000Z',
snapshot_clusters: [
{ id: 'snapshot-cluster-1', snapshot_assignments: [{ count: 1 }] },
],
},
] as SnapshotRow[],
controls: new Map<string, {
user_id: string;
is_cancelled: boolean;
job_generation: number;
updated_at: string;
}>(),
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
fd -i backend.test.ts apps/backend-e2e
rg -nP -C3 'reset|clearTestQueuedJobs|state\b' --type=ts -g '**/backend-e2e/**'

Repository: Erics1337/link-loom

Length of output: 102


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Confirm fakeSupabase exports (look for any reset helper)
rg -n "fakeSupabase|resetFakeSupabaseState|reset" apps/backend/src/db/fakeSupabase.ts || true
sed -n '1,220p' apps/backend/src/db/fakeSupabase.ts | nl -ba | head -n 220

# 2) Find where fakeSupabase is imported/used in the backend-e2e app
fd -a "fakeSupabase.ts" . || true
rg -n "fakeSupabase" apps/backend-e2e -S || true
rg -n "fakeSupabase" apps/backend -S || true

# 3) Inspect e2e test file(s) for beforeEach/afterEach and any state resets
sed -n '1,240p' apps/backend-e2e/tests/backend.test.ts | nl -ba | head -n 240

# 4) Search e2e suite for hooks that might reset modules/state
rg -n "beforeEach|afterEach|beforeAll|afterAll|jest\.resetModules|vi\.resetModules|clearAllMocks|resetAllMocks" apps/backend-e2e -S --type=ts || true

# 5) Search e2e suite for job-queue clearing (from original comment)
rg -n "clearTestQueuedJobs|queuedJobs|TestQueuedJobs" apps/backend-e2e -S --type=ts || true

Repository: Erics1337/link-loom

Length of output: 193


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Re-run with cat/sed (no nl dependency)

echo "== fakeSupabase.ts exports/search =="
rg -n "fakeSupabase|resetFakeSupabaseState|reset|beforeEach|afterEach" apps/backend/src/db/fakeSupabase.ts || true
echo
echo "== state definition + mutation methods (rough ranges) =="
sed -n '1,220p' apps/backend/src/db/fakeSupabase.ts | cat -n | head -n 220

echo
echo "== executeDelete/executeUpdate/upsert/snapshot RPCs (broader ranges) =="
sed -n '220,520p' apps/backend/src/db/fakeSupabase.ts | cat -n | head -n 300

echo
echo "== snapshot id generation (find snapshot-${ pattern) =="
rg -n "snapshot-\\$\\{|snapshot-" apps/backend/src/db/fakeSupabase.ts || true

echo
echo "== where fakeSupabase is imported in e2e =="
rg -n "fakeSupabase" apps/backend-e2e -S --type=ts || true

echo
echo "== e2e test setup hooks =="
fd -a -e "backend.test.ts" apps/backend-e2e/tests || true
sed -n '1,260p' apps/backend-e2e/tests/backend.test.ts | cat -n | head -n 260

echo
echo "== e2e suite for resetModules/clearing module cache =="
rg -n "jest\\.resetModules|vi\\.resetModules|resetModules|clearModule|resetAllMocks|clearAllMocks" apps/backend-e2e -S --type=ts || true

Repository: Erics1337/link-loom

Length of output: 13970


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== backend-e2e test files =="
fd -a --extension ts "backend.test.ts" apps/backend-e2e || true
fd -a --type f --extension ts apps/backend-e2e/tests | head -n 50 || true

echo
echo "== where fakeSupabase is imported/used in backend-e2e =="
rg -n "from .*fakeSupabase|fakeSupabase|buildFakeSupabase" apps/backend-e2e -S --type=ts || true

echo
echo "== e2e hooks that could reset module/state =="
rg -n "beforeEach|afterEach|beforeAll|afterAll|jest\.resetModules|vi\.resetModules|resetModules|jest\.reset|vi\.reset|clearAllMocks|clearAllTimers|setupFilesAfterEnv" apps/backend-e2e -S --type=ts || true

echo
echo "== inspect backend.test.ts (imports + hooks) =="
if [ -f apps/backend-e2e/tests/backend.test.ts ]; then
  sed -n '1,220p' apps/backend-e2e/tests/backend.test.ts | cat -n
fi

echo
echo "== inspect fakeSupabase.ts for any reset helper export =="
rg -n "reset|clear.*state|init.*state|exports\.(reset|clear)" apps/backend/src/db/fakeSupabase.ts || true

echo
echo "== inspect remainder of fakeSupabase.ts around snapshot RPC and update/delete mutations =="
sed -n '180,420p' apps/backend/src/db/fakeSupabase.ts | cat -n

Repository: Erics1337/link-loom

Length of output: 16957


Add a reset hook for fakeSupabase’s module-level state (e2e order dependence)

  • apps/backend/src/db/fakeSupabase.ts has mutable module-level state and mutates it in-place (delete/update and RPC create_structure_snapshot) but exports no reset/clear helper; apps/backend-e2e/tests/backend.test.ts starts the backend once in before() and runs multiple it() blocks against that same long-lived process, so state can accumulate across tests.
  • create_structure_snapshot generates snapshot-${state.snapshots.length + 1}, which can collide after structure_snapshots deletions; switch to a monotonic counter/UUID.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/db/fakeSupabase.ts` around lines 52 - 108, The module-level
mutable state object (state) in fakeSupabase is shared across tests and needs a
reset helper and stable snapshot IDs: add and export a reset function (e.g.,
resetFakeSupabase or clearFakeSupabaseState) that reinitializes state.users,
state.bookmarks, state.clusters, state.assignments, state.snapshots and
state.controls to their original fixtures (clearing Maps/arrays in-place or
replacing with fresh instances) and also reset any new monotonic counter;
replace the current create_structure_snapshot ID generation (which uses
snapshot-${state.snapshots.length + 1}) with a monotonic counter or UUID (e.g.,
snapshotCounter++ or a UUID generator) stored at module scope so IDs never
collide after deletions, and ensure resetFakeSupabase resets that counter too so
tests get a clean slate.

Comment thread apps/backend/src/db/index.ts Outdated
Comment thread apps/backend/src/index.ts Outdated
Comment thread apps/backend/src/lib/cancellation.ts Outdated
Comment thread apps/backend/src/routes/ingest.ts Outdated
console.error('[INGEST] Failed to ensure user exists:', userError);
return reply.code(500).send({ error: 'Failed to initialize user' });
}
const jobGeneration = await beginUserPipelineRun(userId);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Move beginUserPipelineRun below the remaining early-return paths.

Line 51 advances the user's generation before the free-tier rejection and before either cleanup query. A 402 or 500 from this handler now cancels the currently running pipeline even though no replacement job was queued.

Suggested direction
-        const jobGeneration = await beginUserPipelineRun(userId);
-
         const isPremium = await getUserPremiumStatus(userId);
@@
         if (deleteError) {
             console.error(`[INGEST] Failed to clear old clusters for user ${userId}`, deleteError);
             return reply.code(500).send({ error: 'Failed to clear existing clusters' });
         }
 
+        const jobGeneration = await beginUserPipelineRun(userId);
         console.log(`[INGEST] Cleared old clusters for user ${userId}`);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/routes/ingest.ts` at line 51, Call to beginUserPipelineRun
currently occurs before the handler's early-return checks (free-tier rejection
and cleanup query paths) so a 402/500 can cancel an in-flight pipeline; move the
await beginUserPipelineRun(userId) invocation so it runs after all early-return
validations and after running the cleanup/query logic (i.e., only after
free-tier eligibility is confirmed and any cleanup queries have completed),
ensuring the pipeline is only advanced when the request will proceed and no
replacement job is needed.

Comment thread apps/backend/src/scripts/smoke.ts Outdated
Comment on lines +813 to +818
const restoreBookmarkBackup = useCallback(async (backupId: string) => {
if (!accountUserId || !canSaveAccountBackups) throw new Error('Create a free account to restore backups.');

const res = await fetch(`${BACKEND_URL}/backups/${accountUserId}/${backupId}/restore`, {
method: 'POST',
headers: getAuthHeaders()
});

if (!res.ok) throw new Error('Failed to restore structure snapshot');

// Automatically fetch the restored structured results
await fetchResults(accountUserId);
}, [accountUserId, canSaveAccountBackups, fetchResults, getAuthHeaders]);
await backupClient.restoreBookmarkBackup(backupId);
if (accountUserId) {
await fetchResults(accountUserId);
}
}, [accountUserId, backupClient, fetchResults]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Refresh the bookmark snapshot before rebuilding the preview after restore.

After Line 814, fetchResults() still reuses originalTreeRef, bookmarkRootMapRef, and availableRootsRef unless they are empty. Since restoreBookmarkBackup() mutates Chrome bookmarks, the restored preview can be rebuilt against the pre-restore tree and show stale root/overflow placement until the next full weave.

💡 Proposed fix
 const restoreBookmarkBackup = useCallback(async (backupId: string) => {
     await backupClient.restoreBookmarkBackup(backupId);
+    await loadCurrentBookmarkTreeSnapshot();
     if (accountUserId) {
         await fetchResults(accountUserId);
     }
-}, [accountUserId, backupClient, fetchResults]);
+}, [accountUserId, backupClient, fetchResults, loadCurrentBookmarkTreeSnapshot]);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/extension/src/hooks/useBookmarkWeaver.ts` around lines 813 - 818,
restoreBookmarkBackup currently calls fetchResults after restoring Chrome
bookmarks, but fetchResults will reuse originalTreeRef, bookmarkRootMapRef, and
availableRootsRef if they are non-empty, causing the preview to be rebuilt
against stale pre-restore state; before calling fetchResults in
restoreBookmarkBackup, refresh or clear the in-memory bookmark snapshot (e.g.
reset originalTreeRef, bookmarkRootMapRef, availableRootsRef or call the
existing snapshot-refresh helper) so fetchResults reads the updated Chrome
bookmark tree and rebuilds the preview from the restored data.

Comment thread apps/extension/src/lib/backupClient.ts Outdated
Comment thread apps/extension/src/lib/chromeApplyPlan.ts Outdated
Comment thread apps/extension/src/lib/processingSession.ts
Erics1337 and others added 13 commits May 29, 2026 01:09
- Removed unused dependencies from pnpm-lock.yaml and package.json.
- Introduced new utility functions for title casing and domain extraction in textLabels.ts.
- Updated bookmarkRename.ts to utilize the new utility functions and improve code readability.
- Refactored SQL test utilities for better maintainability.
- Enhanced error handling and logging in various routes and processors.
Only increment movedCount after a successful chrome.bookmarks.move so
bookmarks already in the target parent are not counted as moved.

Co-authored-by: Cursor <cursoragent@cursor.com>
Erics1337 and others added 7 commits June 3, 2026 01:42
Bookmark scan nodes without a title should still produce valid entries
instead of leaving title undefined.

Co-authored-by: Cursor <cursoragent@cursor.com>
Use setStructureAssignments updater form so stats stay in sync when
dead-link cleanup runs without a stale assignments closure.

Co-authored-by: Cursor <cursoragent@cursor.com>
When the backend rejects an oversized weave, save the emptied overflow
queue so a retried run does not resurrect stale overflow state.

Co-authored-by: Cursor <cursoragent@cursor.com>
Upstream non-JSON responses now surface a readable error payload
instead of an empty object.

Co-authored-by: Cursor <cursoragent@cursor.com>
Handle invalid and future timestamps gracefully and coalesce a missing
bookmark count to zero in the links panel header.

Co-authored-by: Cursor <cursoragent@cursor.com>
Move the hardcoded effective date into each legal page so future policy
updates can set dates independently.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Added rate limiting functionality to various API endpoints, including checkout and waitlist routes, to prevent abuse.
- Introduced origin enforcement to ensure requests come from expected sources, enhancing security.
- Updated error handling to provide clearer responses for rate limit and origin violations.
- Refactored existing error messages for consistency and clarity across the API.
@gitguardian

gitguardian Bot commented Jun 7, 2026

Copy link
Copy Markdown

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
33778095 Triggered JSON Web Token 7384b7f apps/backend/src/lib/tests/sanitizeQueueError.test.ts View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@socket-security

socket-security Bot commented Jun 7, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedpsl@​1.15.010010010080100
Addedcross-env@​7.0.310010010082100

View full report

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/backend/src/lib/queue.ts (1)

105-109: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against negative queue delays before sending to SQS.

At Line 108, any non-zero options.delay is accepted, so a negative delay becomes a negative DelaySeconds and causes a runtime SQS request failure.

Suggested fix
+        if (options.delay !== undefined && options.delay < 0) {
+            throw new Error('Queue delay must be >= 0');
+        }
+
         if (queueDriver === 'sqs') {
@@
             await sqs.send(new SendMessageCommand({
                 QueueUrl: queueUrl,
                 MessageBody: JSON.stringify(body),
-                DelaySeconds: options.delay ? Math.min(Math.ceil(options.delay / 1000), 900) : undefined,
+                DelaySeconds:
+                    options.delay && options.delay > 0
+                        ? Math.min(Math.ceil(options.delay / 1000), 900)
+                        : undefined,
             }));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lib/queue.ts` around lines 105 - 109, The SendMessageCommand
call can send a negative DelaySeconds when options.delay is negative; update the
calculation for DelaySeconds (used in the SendMessageCommand) to clamp
options.delay to the valid range by first taking Math.max(0, options.delay) and
then Math.min(900, ceil(.../1000)) so DelaySeconds is never negative and still
capped at 900; apply this change where sqs.send(new SendMessageCommand(...)) is
constructed to ensure DelaySeconds is always an integer between 0 and 900.
🧹 Nitpick comments (6)
infra/terraform/variables.tf (1)

109-113: ⚡ Quick win

Add validation for rate_limit_store accepted values.

Constrain this variable to known values (supabase, memory) to prevent silent misconfiguration.

Suggested Terraform validation
 variable "rate_limit_store" {
   description = "Rate limit store. Use supabase for the shared app limiter, or memory only for local fallback."
   type        = string
   default     = "supabase"
+  validation {
+    condition     = contains(["supabase", "memory"], var.rate_limit_store)
+    error_message = "rate_limit_store must be one of: supabase, memory."
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@infra/terraform/variables.tf` around lines 109 - 113, The variable
rate_limit_store currently allows any string which can lead to silent
misconfiguration; add a Terraform validation block to variable
"rate_limit_store" that restricts allowed values to "supabase" and "memory" (use
a condition like contains(["supabase","memory"], var.rate_limit_store)) and
provide a clear error_message explaining the accepted values so invalid inputs
fail fast.
apps/backend/src/routes/backups.ts (1)

35-53: 💤 Low value

Handler uses any types, losing type safety.

snapshotMutationHandler parameters req: any and reply: any bypass Fastify's type system. Consider using FastifyRequest and FastifyReply for better type inference and IDE support.

♻️ Suggested type improvement
+import type { FastifyRequest } from 'fastify';
+
 const snapshotMutationHandler =
     (
         status: string,
         logLabel: string,
         failureMessage: string,
         action: (userId: string, snapshotId: string) => Promise<void>
     ) =>
-    async (req: any, reply: any) => {
+    async (req: FastifyRequest<{ Params: { userId: string; snapshotId: string } }>, reply: FastifyReply) => {
         const userId = await requireRequestUserId(req, reply);
         if (!userId) return reply;
-        const { snapshotId } = req.params as { snapshotId: string };
+        const { snapshotId } = req.params;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/routes/backups.ts` around lines 35 - 53, The handler
currently types req and reply as any which loses Fastify typing; update
snapshotMutationHandler to accept FastifyRequest and FastifyReply types (import
them) and use a request generic for params (e.g., FastifyRequest<{ Params: {
snapshotId: string } }>) so snapshotId is strongly typed, and tighten reply to
FastifyReply; ensure this change compiles with requireRequestUserId(req, reply)
and runSnapshotMutation(reply, ...) by keeping their call signatures unchanged.
apps/backend/src/lib/cancellation.ts (1)

62-73: 💤 Low value

Fallback path returns empty id which may cause silent issues downstream.

When the RPC returns an unexpected shape (not an object with id/generation), the fallback returns { id: '', generation: ... }. Callers using pipelineRun.id in job IDs (e.g., manualBookmark.ts line 93) would produce malformed identifiers. Consider throwing an error or logging a warning when the response shape is unexpected.

🛠️ Suggested defensive improvement
     if (data && typeof data === 'object' && !Array.isArray(data)) {
         const candidate = data as { id?: unknown; generation?: unknown };
         return {
             id: String(candidate.id ?? ''),
             generation: Number(candidate.generation ?? 0),
         };
     }

-    return {
-        id: '',
-        generation: Number(data ?? 0),
-    };
+    console.warn(`[CANCEL] Unexpected RPC response shape for user ${userId}:`, data);
+    return {
+        id: '',
+        generation: Number(data ?? 0),
+    };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lib/cancellation.ts` around lines 62 - 73, The fallback
branch in apps/backend/src/lib/cancellation.ts currently returns an empty id
when data isn't an object (the block that creates candidate and returns { id:
'', generation: ... }), which can cause malformed pipelineRun.id downstream;
change this branch to fail loudly instead: validate the shape (presence of
id/generation), and if invalid, throw a descriptive Error (or at minimum
processLogger.warn with the unexpected data and throw) rather than returning id:
''. Update the fallback that uses candidate/id/generation to either throw an
Error indicating "unexpected cancellation RPC shape" (including the raw data) so
callers like manualBookmark.ts that read pipelineRun.id don't get empty strings,
or return a typed null/undefined and update callers to handle it. Ensure the
thrown message includes the offending data for debugging.
apps/web/app/api/checkout/session/route.ts (1)

20-22: Review: Stripe sessionId regex /^cs_(test|live)_[A-Za-z0-9]+$/

  • The “underscore in the suffix” concern doesn’t match Stripe-generated Checkout session IDs (they’re typically treated as opaque IDs with an alphanumeric suffix after cs_test_ / cs_live_); underscores are mainly relevant to custom IDs.
  • Still, since Stripe doesn’t publish a formal regex for all generated formats, consider loosening the validation to at least require the cs_(test|live)_ prefix (optionally with a length check) to avoid brittleness.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/api/checkout/session/route.ts` around lines 20 - 22, The current
validation in route.ts rejects valid Stripe session IDs by enforcing an overly
strict regex; update the check around sessionId (the conditional using
/^cs_(test|live)_[A-Za-z0-9]+$/ and the subsequent NextResponse.json error) to
instead verify that sessionId is a non-empty string that starts with the prefix
"cs_test_" or "cs_live_" (optionally add a reasonable max/min length check) and
return the same 400 response only when that prefix/length requirement fails.
apps/backend/src/queues/clustering/pipelineRecovery.ts (1)

98-117: ⚡ Quick win

Consider parallel job queuing for better recovery throughput.

Sequential await in for-loops limits recovery speed when many jobs need re-queuing. Using Promise.all (with optional batching) would improve throughput.

♻️ Suggested parallel queuing
-    for (const enrichmentJob of toQueueEnrichment) {
-        await queues.enrichment.add('enrich', {
+    await Promise.all(
+        toQueueEnrichment.map((enrichmentJob) =>
+            queues.enrichment.add('enrich', {
             userId,
             pipelineRunId,
             jobGeneration,
             bookmarkId: enrichmentJob.bookmarkId,
-            url: enrichmentJob.url
-        });
-    }
+            url: enrichmentJob.url,
+            })
+        )
+    );

-    for (const embeddingJob of toQueueEmbedding) {
-        await queues.embedding.add('embed', {
+    await Promise.all(
+        toQueueEmbedding.map((embeddingJob) =>
+            queues.embedding.add('embed', {
             userId,
             pipelineRunId,
             jobGeneration,
             bookmarkId: embeddingJob.bookmarkId,
             url: embeddingJob.url,
-            text: embeddingJob.text
-        });
-    }
+            text: embeddingJob.text,
+            })
+        )
+    );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/queues/clustering/pipelineRecovery.ts` around lines 98 -
117, The current sequential awaits in the two for-loops (iterating over
toQueueEnrichment and toQueueEmbedding and calling queues.enrichment.add /
queues.embedding.add) slow recovery; change to enqueue jobs in parallel by
mapping each array to promises and awaiting Promise.all (or implement batching
for large arrays) so all enrichment jobs and all embedding jobs are dispatched
concurrently, preserving the same payload fields (userId, pipelineRunId,
jobGeneration, bookmarkId, url, text) and error handling around the Promise.all
to surface failures from pipelineRecovery's re-queue logic.
apps/backend/src/queues/clustering.ts (1)

29-35: ⚡ Quick win

Consider async logging to avoid blocking the event loop.

fs.appendFileSync blocks the Node.js event loop on every log call. Under high clustering load, this can degrade throughput. Consider using an async logger, buffered writes, or a dedicated logging library.

♻️ Suggested async alternative
+import { appendFile } from 'fs/promises';
+
 const logFile = path.resolve(process.cwd(), 'clustering-debug.log');
+const logQueue: string[] = [];
+let flushScheduled = false;
+
+async function flushLogs() {
+    if (logQueue.length === 0) return;
+    const batch = logQueue.splice(0, logQueue.length).join('');
+    await appendFile(logFile, batch);
+    flushScheduled = false;
+    if (logQueue.length > 0) scheduleFlush();
+}
+
+function scheduleFlush() {
+    if (!flushScheduled) {
+        flushScheduled = true;
+        setImmediate(flushLogs);
+    }
+}

 function log(msg: string) {
     const timestamp = new Date().toISOString();
-    fs.appendFileSync(logFile, `[${timestamp}] ${msg}\n`);
+    logQueue.push(`[${timestamp}] ${msg}\n`);
+    scheduleFlush();
     console.log(msg);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/queues/clustering.ts` around lines 29 - 35, The current log
function uses fs.appendFileSync which blocks the event loop; change it to an
async, non-blocking implementation by replacing fs.appendFileSync(logFile, ...)
with an async writer (e.g., use fs.promises.appendFile or better, create a
persistent write stream via fs.createWriteStream(logFile, { flags: 'a' }) and
call stream.write with backpressure handling), or swap to a non-blocking logger
library (e.g., pino/winston) that writes to logFile; update the log function
(and any callers if needed) to use the async writer/stream or return a Promise
so logging no longer blocks and handle errors from the async write.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/backend-e2e/tests/backend.test.ts`:
- Around line 120-130: resolveSupabaseEnv currently skips calling
parseSupabaseStatusEnv when SUPABASE_SERVICE_ROLE_KEY and SUPABASE_ANON_KEY
exist, so if SUPABASE_URL is not set it falls back to the hardcoded
"http://127.0.0.1:56421"; change the logic so parseSupabaseStatusEnv() is
consulted for SUPABASE_URL (and API_URL) even when keys are preconfigured: e.g.,
always populate statusEnv from parseSupabaseStatusEnv() unless SUPABASE_URL is
explicitly set, then compute url using process.env.SUPABASE_URL ??
statusEnv.SUPABASE_URL ?? statusEnv.API_URL ?? fallback; update references to
statusEnv, parseSupabaseStatusEnv, SUPABASE_SERVICE_ROLE_KEY, SUPABASE_ANON_KEY,
SUPABASE_URL and API_URL accordingly.

In `@apps/backend/src/index.ts`:
- Around line 74-77: The current origin check in the conditional that uses
allowedOrigins and origin.startsWith("chrome-extension://") lets all Chrome
extensions bypass the CORS allowlist; update the logic in
apps/backend/src/index.ts so that extension origins are validated against the
configured allowedOrigins (i.e., only accept origin if
allowedOrigins.includes(origin) even for chrome-extension:// origins) or, if you
need a wildcard for development, gate the unrestricted
origin.startsWith("chrome-extension://") branch behind a non-production
environment check (use NODE_ENV or the existing env flag). Locate the if that
references allowedOrigins and origin.startsWith("chrome-extension://") and
replace the unconditional startsWith branch with a check that either matches
allowedOrigins or is allowed only in non-production.

In `@apps/backend/src/lambda/worker.ts`:
- Around line 99-101: The awaited call to notifyPipelineBookmarkTerminal(userId,
jobGeneration, pipelineRunId, bookmarkId, clusteringSettings) can throw and
abort processing the whole SQS batch; wrap that call in a local try/catch inside
the function that currently awaits it (the caller in
processEvent/recordQueueJobFailure flow), catch and log the error, and ensure
the function still returns normally so that recordQueueJobFailure can add the
per-record batchItemFailure for this message rather than letting the exception
bubble and cause Lambda to replay the entire batch.

In `@apps/backend/src/lib/pipelineCoordinator.ts`:
- Around line 169-175: The clustering job payload currently sends { userId,
pipelineRunId, clusteringSettings } which omits jobGeneration when pipelineRunId
is undefined; update the queues.clustering.add call so the payload includes
jobGeneration (e.g., send { userId, pipelineRunId, jobGeneration,
clusteringSettings }) so workers always receive a run identifier; modify the
payload in the function that calls queues.clustering.add (the block that builds
jobId and calls queues.clustering.add) to include jobGeneration.

In `@apps/backend/src/lib/rateLimit.ts`:
- Around line 21-25: The parsed environment values for WINDOW_MS and
CLEANUP_INTERVAL_MS (and the other parses around those same constants) may
produce NaN or non-positive numbers; validate and clamp them after
Number.parseInt by checking Number.isFinite and >0, falling back to safe
defaults (e.g., 60000 for WINDOW_MS) and ensuring CLEANUP_INTERVAL_MS is at
least WINDOW_MS or another sane minimum. Update the code that sets WINDOW_MS,
CLEANUP_INTERVAL_MS (and the similar parses at the other occurrence) to perform
this validation/clamping and log or silently default when the env value is
invalid.
- Around line 54-60: The getClientKey implementation currently trusts raw
x-forwarded-for and should instead use Fastify's req.ip (so update getClientKey
to return req.ip || "unknown" and ensure Fastify trustProxy is enabled in server
config rather than parsing headers here); also harden all env parsing for
RATE_LIMIT_WINDOW_MS, RATE_LIMIT_CLEANUP_INTERVAL_MS and the auth/write/default
limit variables by using parseInt(value, 10) (or Number(...)) plus validation
(Number.isFinite and > 0) and fall back to safe default constants when invalid,
to avoid NaN/zero/negative values breaking the rate limiter or cleanup logic
(update the variables where they are declared/initialized in this file).

In `@apps/backend/src/queues/clustering/clusterAlgorithm.ts`:
- Around line 39-51: computeCentroid returns a direct reference to vecs[0] when
there is a single vector, which can lead to accidental mutation of the original
cluster vector (e.g., in rebalanceSmallGroups). Change the single-vector early
return in computeCentroid to return a copy of the array (e.g., clone via spread
or slice) instead of vecs[0], ensuring callers receive an independent centroid
array and cannot mutate the source vector.

In `@apps/backend/src/queues/clustering/clusterPersistence.ts`:
- Around line 107-111: The current batch insert error handling in
clusterPersistence.ts merely logs failures (the log call referencing clusterId
and batchBookmarkIds) and continues, which can leave partial
cluster_assignments; change the handler to fail fast by propagating the error
instead of swallowing it—either throw the received error (or return a rejected
Promise) immediately after logging so the job retries and state remains
consistent; locate the block where log(`Batch insert error for cluster
${clusterId}... ${JSON.stringify(error)}`) is called and replace the
silent-continue behavior with an explicit throw/return of the error.

In `@apps/backend/src/routes/auth.ts`:
- Around line 47-95: The current check-then-insert on user_devices (the
count/select then insert flow using supabase.select(..., {count:'exact',
head:true}), maybeSingle(), update, and insert) is race-prone; replace it with a
single atomic DB operation: implement a Postgres RPC (stored procedure) or
transaction that does SELECT FOR UPDATE on the user's device rows or enforces a
per-user device limit via a CHECK/trigger, and then performs the insert/update
inside that function so the limit is enforced atomically; call that RPC from the
auth route (instead of doing count + insert in JS), return clear error codes
from the RPC for “limit reached” vs other DB errors, and update the route to
handle those returned error states instead of relying on the separate
count/insert logic.

In `@apps/backend/src/routes/ingest.ts`:
- Around line 172-178: The /trigger-clustering/:userId handler currently calls
beginUserPipelineRun without error handling, so RPC failures can throw; wrap the
beginUserPipelineRun call in a try/catch (same pattern as the /ingest handler)
and on error log the exception and return an HTTP 500 response. Ensure you still
call recordPipelineRunStarted only when beginUserPipelineRun succeeds and
reference the existing symbols beginUserPipelineRun and recordPipelineRunStarted
so the catch protects the pipelineRun usage and avoids unhandled exceptions.

In `@apps/backend/src/routes/status.ts`:
- Around line 201-207: The current code logs Supabase read errors (totalError,
pendingRawError, enrichedError, embeddedError, erroredError, clusterError,
assignmentError) but continues to build and return a 200 status payload, which
masks failures; change the logic in the status endpoint so that if any of these
error variables are non-null you short-circuit and return an error response
(e.g., throw or return a 500 with an error message) instead of sending the usual
successful body, ensuring the handler that constructs the status payload (where
these variables are used) surfaces the database failure to clients rather than
returning stale/zeroed counts.

In `@apps/extension/src/hooks/useBookmarkWeaver.ts`:
- Around line 202-221: The restoreStructureVersion path currently restores
clusters, stats and status but leaves destructive-tool source state
(structureAssignments and deadLinkChromeIdsRef) from the previous run, causing
stale Chrome IDs to be used; update the implementation of
restoreStructureVersion so that after loading the saved version it clears/resets
structureAssignments (e.g., replace with an empty map/object) and clears
deadLinkChromeIdsRef (e.g., reset to an empty Set/array), ensuring any refs used
by the destructive-tool are reinitialized when restoring a saved structure
version.

In `@apps/extension/src/hooks/useWeaveRun.ts`:
- Around line 116-123: The ensureProcessingIdentity function currently prefers a
local userId over an authenticated accountUserId, which can cause mis-keyed
polling/cancel; reorder the checks so you first test for accountUserId &&
authAccessToken, call setUserId(accountUserId) and return { userId:
accountUserId, accessToken: authAccessToken } if present, and only then fall
back to the existing userId && authAccessToken branch; reference
ensureProcessingIdentity, userId, accountUserId, authAccessToken, and setUserId
when making the change.
- Around line 223-283: Polling currently uses setInterval which starts a new
getStatus (structureClient.getStatus) every 2s even if the prior call is still
running, causing overlapping requests and out-of-order state updates; fix it by
serializing polls — either replace the setInterval loop with an async recursive
timer (await getStatus then setTimeout to schedule next run) or add an in-flight
guard (e.g., pollingInFlight boolean) checked at the start of the interval
callback to skip if true; ensure the guard/replacement preserves existing
terminalAction handling (getTerminalWeavingAction, clearInterval behavior or
stop recursion), still calls fetchResults, and still triggers clustering
recovery via shouldTriggerClusteringRecovery and
structureClient.triggerClustering without allowing concurrent getStatus calls.

In `@apps/extension/src/screens/LoginScreen.tsx`:
- Around line 86-88: The password input currently always has minLength={8} which
enforces sign-up rules during sign-in; update the input in LoginScreen (the
<input ... value={password} onChange={(event) =>
setPassword(event.target.value)} className="field" ... />) to only apply
minLength={8} when the component is in sign-up mode (e.g., check your isSignup /
mode flag) and remove the minLength prop when in sign-in mode so existing users
with shorter passwords aren’t blocked.

In `@apps/extension/src/screens/screenUtils.ts`:
- Around line 13-33: runWithWorkingId can run concurrently if called twice
quickly; serialize calls by introducing a module-scoped or hook-scoped promise
chain (e.g., pendingPromiseRef = React.useRef(Promise.resolve())) and chain each
invocation onto it. Change runWithWorkingId to take the current
pendingPromiseRef.current, replace it with pendingPromiseRef.current =
pendingPromiseRef.current.then(async () => { setWorkingId(itemId);
setMessage(null); try { await action(); if (messages.success)
setMessage(messages.success); } catch (error) { setMessage(error instanceof
Error ? error.message : messages.failure); } finally { setWorkingId(null); } });
and await/return pendingPromiseRef.current so subsequent calls run sequentially;
reference runWithWorkingId, setWorkingId, setMessage, pendingPromiseRef in your
changes.

In `@apps/web/app/login/page.tsx`:
- Line 234: The unconditional minLength={8} on the password input prevents
legacy users from signing in; change it so the minLength is only applied in
registration/sign-up mode (e.g., only when mode === 'signup' or when
isRegistering is true). Locate the password input in apps/web/app/login/page.tsx
(the element with minLength={8}) and replace the unconditional prop with a
conditional application of minLength (or only include the prop in the sign-up
form branch) so sign-in mode accepts shorter legacy passwords.

In `@infra/terraform/main.tf`:
- Around line 24-25: The API Gateway CORS configuration currently uses a
wildcard which overrides the stricter backend allowlist; update the gateway CORS
to wire its allow_origins to the same Terraform input used for the Lambda env
var (var.cors_allowed_origins) instead of ["*"]. Find the API Gateway CORS block
that sets allow_origins and replace the literal ["*"] with
var.cors_allowed_origins so gateway-level CORS matches the CORS_ALLOWED_ORIGINS
env var injected into the Lambdas.

In `@supabase/migrations/20260529015619_add_pipeline_job_generation.sql`:
- Around line 527-531: The current cluster_count in
public.get_pipeline_run_status_counts is computed as COUNT(*) from
public.clusters filtered only by p_user_id, which yields a user-wide count;
either scope it to the pipeline run by joining public.cluster_assignments ->
public.bookmarks and filtering bookmarks.pipeline_run_id = p_pipeline_run_id (so
cluster_count reflects clusters produced in this pipeline run) or explicitly
rename the column to something like user_cluster_count/total_clusters_for_user
to document it is not run-scoped; update the SELECT alias cluster_count and the
JOIN logic accordingly and ensure references to cluster_count in the function
body are adjusted to the new semantics or name.

---

Outside diff comments:
In `@apps/backend/src/lib/queue.ts`:
- Around line 105-109: The SendMessageCommand call can send a negative
DelaySeconds when options.delay is negative; update the calculation for
DelaySeconds (used in the SendMessageCommand) to clamp options.delay to the
valid range by first taking Math.max(0, options.delay) and then Math.min(900,
ceil(.../1000)) so DelaySeconds is never negative and still capped at 900; apply
this change where sqs.send(new SendMessageCommand(...)) is constructed to ensure
DelaySeconds is always an integer between 0 and 900.

---

Nitpick comments:
In `@apps/backend/src/lib/cancellation.ts`:
- Around line 62-73: The fallback branch in apps/backend/src/lib/cancellation.ts
currently returns an empty id when data isn't an object (the block that creates
candidate and returns { id: '', generation: ... }), which can cause malformed
pipelineRun.id downstream; change this branch to fail loudly instead: validate
the shape (presence of id/generation), and if invalid, throw a descriptive Error
(or at minimum processLogger.warn with the unexpected data and throw) rather
than returning id: ''. Update the fallback that uses candidate/id/generation to
either throw an Error indicating "unexpected cancellation RPC shape" (including
the raw data) so callers like manualBookmark.ts that read pipelineRun.id don't
get empty strings, or return a typed null/undefined and update callers to handle
it. Ensure the thrown message includes the offending data for debugging.

In `@apps/backend/src/queues/clustering.ts`:
- Around line 29-35: The current log function uses fs.appendFileSync which
blocks the event loop; change it to an async, non-blocking implementation by
replacing fs.appendFileSync(logFile, ...) with an async writer (e.g., use
fs.promises.appendFile or better, create a persistent write stream via
fs.createWriteStream(logFile, { flags: 'a' }) and call stream.write with
backpressure handling), or swap to a non-blocking logger library (e.g.,
pino/winston) that writes to logFile; update the log function (and any callers
if needed) to use the async writer/stream or return a Promise so logging no
longer blocks and handle errors from the async write.

In `@apps/backend/src/queues/clustering/pipelineRecovery.ts`:
- Around line 98-117: The current sequential awaits in the two for-loops
(iterating over toQueueEnrichment and toQueueEmbedding and calling
queues.enrichment.add / queues.embedding.add) slow recovery; change to enqueue
jobs in parallel by mapping each array to promises and awaiting Promise.all (or
implement batching for large arrays) so all enrichment jobs and all embedding
jobs are dispatched concurrently, preserving the same payload fields (userId,
pipelineRunId, jobGeneration, bookmarkId, url, text) and error handling around
the Promise.all to surface failures from pipelineRecovery's re-queue logic.

In `@apps/backend/src/routes/backups.ts`:
- Around line 35-53: The handler currently types req and reply as any which
loses Fastify typing; update snapshotMutationHandler to accept FastifyRequest
and FastifyReply types (import them) and use a request generic for params (e.g.,
FastifyRequest<{ Params: { snapshotId: string } }>) so snapshotId is strongly
typed, and tighten reply to FastifyReply; ensure this change compiles with
requireRequestUserId(req, reply) and runSnapshotMutation(reply, ...) by keeping
their call signatures unchanged.

In `@apps/web/app/api/checkout/session/route.ts`:
- Around line 20-22: The current validation in route.ts rejects valid Stripe
session IDs by enforcing an overly strict regex; update the check around
sessionId (the conditional using /^cs_(test|live)_[A-Za-z0-9]+$/ and the
subsequent NextResponse.json error) to instead verify that sessionId is a
non-empty string that starts with the prefix "cs_test_" or "cs_live_"
(optionally add a reasonable max/min length check) and return the same 400
response only when that prefix/length requirement fails.

In `@infra/terraform/variables.tf`:
- Around line 109-113: The variable rate_limit_store currently allows any string
which can lead to silent misconfiguration; add a Terraform validation block to
variable "rate_limit_store" that restricts allowed values to "supabase" and
"memory" (use a condition like contains(["supabase","memory"],
var.rate_limit_store)) and provide a clear error_message explaining the accepted
values so invalid inputs fail fast.
🪄 Autofix (Beta)

✅ Autofix completed


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7413df5b-aeaa-4408-8b46-1202e0dea8f2

📥 Commits

Reviewing files that changed from the base of the PR and between eb99fec and 3f6adac.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (119)
  • .gitignore
  • apps/backend-e2e/package.json
  • apps/backend-e2e/tests/backend.test.ts
  • apps/backend/ARCHITECTURE.md
  • apps/backend/package.json
  • apps/backend/src/db/index.ts
  • apps/backend/src/index.ts
  • apps/backend/src/lambda/__tests__/worker.test.ts
  • apps/backend/src/lambda/worker.ts
  • apps/backend/src/lib/__tests__/cancellation.test.ts
  • apps/backend/src/lib/__tests__/manualBookmark.test.ts
  • apps/backend/src/lib/__tests__/pipelineCoordinator.test.ts
  • apps/backend/src/lib/__tests__/pipelineRunSql.test.ts
  • apps/backend/src/lib/__tests__/queue.test.ts
  • apps/backend/src/lib/__tests__/sanitizeQueueError.test.ts
  • apps/backend/src/lib/__tests__/snapshotRpcSql.test.ts
  • apps/backend/src/lib/__tests__/sqlTestUtils.ts
  • apps/backend/src/lib/__tests__/textLabels.test.ts
  • apps/backend/src/lib/bookmarkRename.ts
  • apps/backend/src/lib/cancellation.ts
  • apps/backend/src/lib/manualBookmark.ts
  • apps/backend/src/lib/pipelineCoordinator.ts
  • apps/backend/src/lib/queue.ts
  • apps/backend/src/lib/rateLimit.ts
  • apps/backend/src/lib/sanitizeQueueError.ts
  • apps/backend/src/lib/textLabels.ts
  • apps/backend/src/lib/userContext.ts
  • apps/backend/src/queues/__tests__/clusterPersistence.test.ts
  • apps/backend/src/queues/__tests__/clustering.test.ts
  • apps/backend/src/queues/__tests__/embedding.test.ts
  • apps/backend/src/queues/__tests__/enrichment.test.ts
  • apps/backend/src/queues/__tests__/ingest.test.ts
  • apps/backend/src/queues/__tests__/pipelineRecovery.test.ts
  • apps/backend/src/queues/clustering.ts
  • apps/backend/src/queues/clustering/clusterAlgorithm.ts
  • apps/backend/src/queues/clustering/clusterNaming.ts
  • apps/backend/src/queues/clustering/clusterPersistence.ts
  • apps/backend/src/queues/clustering/pipelineRecovery.ts
  • apps/backend/src/queues/clustering/vectorParsing.ts
  • apps/backend/src/queues/embedding.ts
  • apps/backend/src/queues/enrichment.ts
  • apps/backend/src/queues/ingest.ts
  • apps/backend/src/routes/__tests__/ingest.test.ts
  • apps/backend/src/routes/__tests__/status.test.ts
  • apps/backend/src/routes/auth.ts
  • apps/backend/src/routes/backups.ts
  • apps/backend/src/routes/ingest.ts
  • apps/backend/src/routes/schemas.ts
  • apps/backend/src/routes/search.ts
  • apps/backend/src/routes/status.ts
  • apps/backend/src/routes/structure.ts
  • apps/backend/src/scripts/smoke.ts
  • apps/extension/package.json
  • apps/extension/src/App.tsx
  • apps/extension/src/hooks/__tests__/useWeaveRun.test.ts
  • apps/extension/src/hooks/useBookmarkBackups.ts
  • apps/extension/src/hooks/useBookmarkScanSession.ts
  • apps/extension/src/hooks/useBookmarkTools.ts
  • apps/extension/src/hooks/useBookmarkWeaver.ts
  • apps/extension/src/hooks/useBookmarkWeaverTypes.ts
  • apps/extension/src/hooks/useChromeApply.ts
  • apps/extension/src/hooks/useExtensionAuth.ts
  • apps/extension/src/hooks/useStructureResults.ts
  • apps/extension/src/hooks/useWeaveRun.ts
  • apps/extension/src/lib/__tests__/chromeApplyPlan.test.ts
  • apps/extension/src/lib/__tests__/processingSession.test.ts
  • apps/extension/src/lib/__tests__/structureClient.test.ts
  • apps/extension/src/lib/backupClient.ts
  • apps/extension/src/lib/chromeApplyPlan.ts
  • apps/extension/src/lib/processingSession.ts
  • apps/extension/src/lib/structureClient.ts
  • apps/extension/src/screens/ApplyRecoveryCard.tsx
  • apps/extension/src/screens/BackupsScreen.tsx
  • apps/extension/src/screens/ImportStructureScreen.tsx
  • apps/extension/src/screens/LoginScreen.tsx
  • apps/extension/src/screens/ResultsScreen.tsx
  • apps/extension/src/screens/ScreenHeader.tsx
  • apps/extension/src/screens/StartScreen.tsx
  • apps/extension/src/screens/VersionsScreen.tsx
  • apps/extension/src/screens/WeavingScreen.tsx
  • apps/extension/src/screens/screenUtils.ts
  • apps/extension/src/styles/global.css
  • apps/web-e2e/package.json
  • apps/web/app/api/bookmarks/add/route.ts
  • apps/web/app/api/checkout/route.ts
  • apps/web/app/api/checkout/session/route.ts
  • apps/web/app/api/create-checkout-session/route.ts
  • apps/web/app/api/waitlist/route.ts
  • apps/web/app/api/webhooks/stripe/route.ts
  • apps/web/app/auth/signout/route.ts
  • apps/web/app/dashboard/backups/page.tsx
  • apps/web/app/dashboard/dashboard-ui.tsx
  • apps/web/app/dashboard/layout.tsx
  • apps/web/app/dashboard/links/page.tsx
  • apps/web/app/dashboard/page.tsx
  • apps/web/app/legal-page.tsx
  • apps/web/app/login/page.tsx
  • apps/web/app/privacy/page.tsx
  • apps/web/app/refund-policy/page.tsx
  • apps/web/app/terms/page.tsx
  • apps/web/components/BackupActions.tsx
  • apps/web/package.json
  • apps/web/tsconfig.tsbuildinfo
  • apps/web/utils/api/auth.ts
  • apps/web/utils/api/security.ts
  • apps/web/utils/supabase/middleware.ts
  • docs/link-loom-architecture.excalidraw
  • infra/terraform/README.md
  • infra/terraform/main.tf
  • infra/terraform/terraform.tfvars.example
  • infra/terraform/variables.tf
  • package.json
  • supabase/config.toml
  • supabase/migrations/20260529015351_harden_snapshot_rpc_search_path.sql
  • supabase/migrations/20260529015619_add_pipeline_job_generation.sql
  • supabase/migrations/20260529024420_add_queue_job_failures.sql
  • supabase/migrations/20260606231942_security_hardening.sql
  • supabase/tests/database/snapshot_rpc_test.sql
  • turbo.json
💤 Files with no reviewable changes (3)
  • apps/extension/package.json
  • apps/web-e2e/package.json
  • apps/web/package.json
✅ Files skipped from review due to trivial changes (11)
  • .gitignore
  • apps/web/app/dashboard/layout.tsx
  • apps/backend-e2e/package.json
  • infra/terraform/terraform.tfvars.example
  • apps/backend/src/lib/tests/textLabels.test.ts
  • apps/backend/src/lib/userContext.ts
  • apps/extension/src/styles/global.css
  • docs/link-loom-architecture.excalidraw
  • apps/backend/src/lib/tests/queue.test.ts
  • apps/backend/ARCHITECTURE.md
  • apps/backend/src/routes/schemas.ts
🚧 Files skipped from review as they are similar to previous changes (6)
  • apps/extension/src/lib/tests/structureClient.test.ts
  • apps/backend/src/scripts/smoke.ts
  • apps/extension/src/lib/tests/processingSession.test.ts
  • supabase/migrations/20260529015351_harden_snapshot_rpc_search_path.sql
  • apps/extension/src/lib/structureClient.ts
  • apps/backend/src/lib/tests/snapshotRpcSql.test.ts

Comment thread apps/backend-e2e/tests/backend.test.ts
Comment thread apps/backend/src/index.ts Outdated
Comment thread apps/backend/src/lambda/worker.ts
Comment thread apps/backend/src/lib/pipelineCoordinator.ts
Comment thread apps/backend/src/lib/rateLimit.ts Outdated
Comment thread apps/extension/src/screens/LoginScreen.tsx
Comment thread apps/extension/src/screens/screenUtils.ts
Comment thread apps/web/app/login/page.tsx Outdated
Comment thread infra/terraform/main.tf
Comment thread supabase/migrations/20260529015619_add_pipeline_job_generation.sql
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 19 file(s) based on 19 unresolved review comments.

Files modified:

  • apps/backend-e2e/tests/backend.test.ts
  • apps/backend/src/index.ts
  • apps/backend/src/lambda/worker.ts
  • apps/backend/src/lib/pipelineCoordinator.ts
  • apps/backend/src/lib/rateLimit.ts
  • apps/backend/src/queues/clustering/clusterAlgorithm.ts
  • apps/backend/src/queues/clustering/clusterPersistence.ts
  • apps/backend/src/routes/auth.ts
  • apps/backend/src/routes/ingest.ts
  • apps/backend/src/routes/status.ts
  • apps/extension/src/hooks/useBookmarkBackups.ts
  • apps/extension/src/hooks/useBookmarkWeaver.ts
  • apps/extension/src/hooks/useWeaveRun.ts
  • apps/extension/src/screens/LoginScreen.tsx
  • apps/extension/src/screens/screenUtils.ts
  • apps/web/app/login/page.tsx
  • apps/web/utils/stripe/checkout.ts
  • infra/terraform/main.tf
  • supabase/migrations/20260529015619_add_pipeline_job_generation.sql

Commit: bcf28ca0f36cc9e0bc716fe704b97ae106aee995

The changes have been pushed to the development branch.

Time taken: 9m 33s

Fixed 19 file(s) based on 19 unresolved review comments.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/backend/src/routes/auth.ts (1)

19-22: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Declare the 400 response this handler already returns.

This route still sends a 400 for a blank deviceId at Line 33, but authenticatedStatusResponseSchema only covers 200/401/403/500. That leaves the Fastify response contract out of sync with the real API surface.

Suggested fix
 import {
     authenticatedStatusResponseSchema,
+    errorResponseSchema,
     looseObjectBodySchema
 } from './schemas';
@@
             schema: {
                 body: looseObjectBodySchema,
-                response: authenticatedStatusResponseSchema
+                response: {
+                    ...authenticatedStatusResponseSchema,
+                    400: errorResponseSchema
+                }
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/routes/auth.ts` around lines 19 - 22, The route schema in
apps/backend/src/routes/auth.ts is missing the 400 response that the handler
returns for a blank deviceId; update the route's schema object (the same place
where body: looseObjectBodySchema and response:
authenticatedStatusResponseSchema are set) to include a 400 response schema
describing the error shape returned by the handler so the Fastify response
contract matches the actual behavior (keep authenticatedStatusResponseSchema for
200/401/403/500 and add a matching 400 entry).
apps/backend/src/lambda/worker.ts (1)

157-163: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Isolate recordQueueJobFailure(...) from the batch failure path.

If recordQueueJobFailure(...) throws here, failures.push(...) never runs and processEvent(...) rejects the whole Promise.all, so Lambda replays the entire batch instead of reporting a per-record failure. Wrap the bookkeeping call in its own try/catch and always append the itemIdentifier.

Suggested fix
         } catch (error) {
             console.error(`[LAMBDA:${queueName}] Failed to process message ${record.messageId}`, error);
             const receiveCount = getReceiveCount(record);
             if (message && receiveCount >= message.attempts) {
-                await recordQueueJobFailure(queueName, message, record.messageId, receiveCount, error);
+                try {
+                    await recordQueueJobFailure(queueName, message, record.messageId, receiveCount, error);
+                } catch (recordFailureError) {
+                    console.error(
+                        `[LAMBDA:${queueName}] Failed to persist exhausted-job failure ${record.messageId}`,
+                        recordFailureError,
+                    );
+                }
             }
             failures.push({ itemIdentifier: record.messageId });
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lambda/worker.ts` around lines 157 - 163, The catch block
currently calls recordQueueJobFailure(...) directly which can throw and prevent
failures.push({ itemIdentifier: record.messageId }) from running, causing
processEvent(...) to reject the whole Promise.all; wrap the call to
recordQueueJobFailure(queueName, message, record.messageId, receiveCount, error)
in its own try/catch so any error from that bookkeeping is logged/handled but
does not block execution, and ensure failures.push({ itemIdentifier:
record.messageId }) always runs after getReceiveCount(record) regardless of
recordQueueJobFailure success; reference the catch around recordQueueJobFailure,
getReceiveCount, failures.push and processEvent to locate where to add the inner
try/catch and guaranteed push.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/web/utils/stripe/checkout.ts`:
- Around line 5-13: The exported stripe singleton currently falls back to an
empty string for the API key which defers failures; change the initialization in
the exported stripe IIFE to validate process.env.STRIPE_SECRET_KEY and throw a
clear Error if it's missing before calling new Stripe(...). Locate the stripe
constant and the new Stripe(...) call and replace the use of
process.env.STRIPE_SECRET_KEY || '' with a requiredApiKey variable that is
checked and thrown if falsy, then pass that variable into new Stripe so
construction never succeeds with an empty key.

---

Outside diff comments:
In `@apps/backend/src/lambda/worker.ts`:
- Around line 157-163: The catch block currently calls
recordQueueJobFailure(...) directly which can throw and prevent failures.push({
itemIdentifier: record.messageId }) from running, causing processEvent(...) to
reject the whole Promise.all; wrap the call to recordQueueJobFailure(queueName,
message, record.messageId, receiveCount, error) in its own try/catch so any
error from that bookkeeping is logged/handled but does not block execution, and
ensure failures.push({ itemIdentifier: record.messageId }) always runs after
getReceiveCount(record) regardless of recordQueueJobFailure success; reference
the catch around recordQueueJobFailure, getReceiveCount, failures.push and
processEvent to locate where to add the inner try/catch and guaranteed push.

In `@apps/backend/src/routes/auth.ts`:
- Around line 19-22: The route schema in apps/backend/src/routes/auth.ts is
missing the 400 response that the handler returns for a blank deviceId; update
the route's schema object (the same place where body: looseObjectBodySchema and
response: authenticatedStatusResponseSchema are set) to include a 400 response
schema describing the error shape returned by the handler so the Fastify
response contract matches the actual behavior (keep
authenticatedStatusResponseSchema for 200/401/403/500 and add a matching 400
entry).
🪄 Autofix (Beta)

❌ Autofix failed (check again to retry)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 75f23e29-b6fc-4db9-8399-bd9c0f1d828f

📥 Commits

Reviewing files that changed from the base of the PR and between 3f6adac and bcf28ca.

📒 Files selected for processing (19)
  • apps/backend-e2e/tests/backend.test.ts
  • apps/backend/src/index.ts
  • apps/backend/src/lambda/worker.ts
  • apps/backend/src/lib/pipelineCoordinator.ts
  • apps/backend/src/lib/rateLimit.ts
  • apps/backend/src/queues/clustering/clusterAlgorithm.ts
  • apps/backend/src/queues/clustering/clusterPersistence.ts
  • apps/backend/src/routes/auth.ts
  • apps/backend/src/routes/ingest.ts
  • apps/backend/src/routes/status.ts
  • apps/extension/src/hooks/useBookmarkBackups.ts
  • apps/extension/src/hooks/useBookmarkWeaver.ts
  • apps/extension/src/hooks/useWeaveRun.ts
  • apps/extension/src/screens/LoginScreen.tsx
  • apps/extension/src/screens/screenUtils.ts
  • apps/web/app/login/page.tsx
  • apps/web/utils/stripe/checkout.ts
  • infra/terraform/main.tf
  • supabase/migrations/20260529015619_add_pipeline_job_generation.sql
🚧 Files skipped from review as they are similar to previous changes (12)
  • apps/extension/src/screens/LoginScreen.tsx
  • apps/extension/src/screens/screenUtils.ts
  • apps/backend/src/queues/clustering/clusterPersistence.ts
  • apps/backend/src/routes/status.ts
  • apps/backend/src/lib/rateLimit.ts
  • apps/backend/src/queues/clustering/clusterAlgorithm.ts
  • apps/extension/src/hooks/useWeaveRun.ts
  • apps/backend/src/index.ts
  • apps/backend/src/lib/pipelineCoordinator.ts
  • apps/backend-e2e/tests/backend.test.ts
  • apps/backend/src/routes/ingest.ts
  • supabase/migrations/20260529015619_add_pipeline_job_generation.sql

Comment thread apps/web/utils/stripe/checkout.ts
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant