fix: recover stale sessions before eviction#387
Conversation
Constraint: preserve recoverable session insights during stale cleanup Confidence: high Scope-risk: narrow Tested: npm test -- test/evict.test.ts --reporter=dot; npx tsdown; git diff --check Tested: npm test -- --reporter=dot (fails on existing Windows path-sensitive compress-file and obsidian-export tests) Signed-off-by: laplace young <yangqk12@whu.edu.cn>
|
Someone is attempting to deploy a commit to the rohitg00's projects Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughThis PR adds a recovery mechanism to the eviction sweep that attempts to preserve graph data from stale sessions. When a stale session without a summary is encountered during eviction, the function checks for compressed observations, triggers ChangesStale Session Recovery Flow
Sequence DiagramsequenceDiagram
participant evict as mem::evict
participant kv as KV
participant recovery as recoverStaleSession
participant session_stopped as event::session::stopped
participant consolidate as mem::consolidate-pipeline
evict->>evict: Find stale session without SessionSummary
evict->>kv: List observations / compressed observations
kv-->>evict: Observations response
evict->>recovery: If compressed observations -> attempt recovery(sessionId)
recovery->>session_stopped: trigger(event::session::stopped, sessionId)
alt Recovery Success
session_stopped-->>recovery: { success: true }
recovery-->>evict: true
evict->>evict: Increment recoveredStaleSessions
evict->>kv: Delete session and write audit reason
else Recovery Failure
session_stopped-->>recovery: { success: false } or error
recovery-->>evict: false
evict->>evict: Skip deletion (log warning)
end
alt After loop and recoveredStaleSessions>0 (non-dry-run)
evict->>consolidate: trigger(mem::consolidate-pipeline, { tier: "all" })
consolidate-->>evict: Complete / warn on failure
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/functions/evict.ts (2)
131-153:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAudit reason is misleading after a successful recovery.
If recovery succeeded,
event::session::stoppedhas already produced a summary/endedAtfor the session, but the subsequent delete is still audited withreason: "stale_session_without_summary". The audit trail will misrepresent the post-recovery deletion. Track whether recovery ran and record a distinct reason (e.g."stale_session_recovered_then_evicted").🪵 Proposed audit-reason fix
- const hasCompressedObservations = observations.some((o) => o.title); - if (hasCompressedObservations) { - const recovered = await recoverStaleSession(sdk, session.id); - if (!recovered) continue; - recoveredStaleSessions++; - } + let recovered = false; + const hasCompressedObservations = observations.some((o) => o.title); + if (hasCompressedObservations) { + recovered = await recoverStaleSession(sdk, session.id); + if (!recovered) continue; + recoveredStaleSessions++; + } try { await kv.delete(KV.sessions, session.id); stats.staleSessions++; } catch (err) { logger.warn("Eviction delete failed", { resource: "session", id: session.id, error: err instanceof Error ? err.message : String(err), }); continue; } await recordAudit(kv, "delete", "mem::evict", [session.id], { resource: "session", - reason: "stale_session_without_summary", + reason: recovered + ? "stale_session_recovered_then_evicted" + : "stale_session_without_summary", dryRun, });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/functions/evict.ts` around lines 131 - 153, The audit reason is incorrect after a successful recovery: when hasCompressedObservations triggers recoverStaleSession(session.id) and it returns true (recovered variable), change the subsequent recordAudit call to use a distinct reason (e.g. "stale_session_recovered_then_evicted") instead of "stale_session_without_summary"; locate the logic around hasCompressedObservations, the call to recoverStaleSession and the recordAudit(...) invocation and branch the reason based on the recovered boolean so recoveredStaleSessions++ and the delete still happen but the audit reflects recovery.
138-160:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCompressed observations under deleted stale sessions become orphaned.
When
kv.delete(KV.sessions, session.id)executes for a stale session (line 140), the observations underKV.observations(session.id)remain in KV and are never deleted in this branch. The downstream calls toevent::session::stopped(which triggersmem::summarize) andmem::consolidate-pipelineextract and consolidate observations but do not clean them up. Observations survive only if they later satisfy the low-importance threshold (line 175) or project cap condition (line 125), meaning observations above the importance threshold or under age limits persist indefinitely with no parent session row, becoming unreachable through any session listing.Check whether
mem::summarizeor a downstream consolidation step is intended to delete these observations; otherwise, this branch needs to explicitly delete all observations under the session before removing the session row—consistent with howremember.tsandexport-import.tshandle this cleanup pattern.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/functions/evict.ts` around lines 138 - 160, The stale-session branch deletes the session row via kv.delete(KV.sessions, session.id) but never removes the child's observations under KV.observations(session.id), leaving orphaned observations; before deleting the session (or as part of the same branch) enumerate and delete all keys under KV.observations(session.id) (or call the same helper used in remember.ts/export-import.ts that clears observations) and only then call kv.delete(KV.sessions, session.id) and recordAudit, or alternatively confirm and call the consolidation/summarize path that explicitly deletes observations (mem::summarize / mem::consolidate-pipeline) — update the code around the kv.delete(KV.sessions, session.id) block to remove observations first (use the project’s observation deletion helper or list-and-delete loop) to prevent orphaned data.
🧹 Nitpick comments (3)
src/functions/evict.ts (1)
39-43: 💤 Low valueRename
triggerRecoveredSessionfor clarity.This predicate validates a trigger result; the verb
triggerreads as if it performs an action. The stack summary even refers to it asisValidRecoveryResult, which is more descriptive of the actual behavior.♻️ Proposed rename
-function triggerRecoveredSession(result: unknown): boolean { +function isValidRecoveryResult(result: unknown): boolean { if (!result || typeof result !== "object") return false; if (!("success" in result)) return true; return (result as { success?: unknown }).success !== false; }And update the call site in
recoverStaleSession:- if (!triggerRecoveredSession(result)) { + if (!isValidRecoveryResult(result)) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/functions/evict.ts` around lines 39 - 43, Rename the predicate function triggerRecoveredSession to a clearer name like isValidRecoveryResult and update all call sites (notably the call inside recoverStaleSession) to use the new name; keep the implementation identical but change the function identifier and any references/imports to avoid mismatches so the predicate semantics remain the same while the name better reflects that it validates a recovery result.test/evict.test.ts (2)
6-79: 💤 Low valueMock SDK via
vi.mock("iii-sdk")to match repo test convention.The repo convention is to mock the
iii-sdkmodule itself; this file uses dependency injection (passing the fakesdkdirectly intoregisterEvictFunction) instead. Functionally equivalent here, but new tests that diverge from the established mock pattern make it harder to find and reuse shared shims, andvi.mock("iii-sdk")is whattest/crystallize.test.tsand friends use.As per coding guidelines: "Mock pattern for tests: use
vi.mock("iii-sdk")with mocksdk.trigger,kv.get/set/list".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/evict.test.ts` around lines 6 - 79, The test should follow the repo mock convention by mocking the iii-sdk module instead of injecting a fake sdk: remove the custom mockSdk injection usage and add vi.mock("iii-sdk") that exports an object with registerFunction and trigger (and a calls array) so tests use the module mock; reuse the existing mock handler logic from mockSdk to implement registerFunction and trigger, and keep mockKV as-is so registerEvictFunction (or the code that calls into iii-sdk.registerFunction) gets its handler from the module-level mock rather than a passed-in sdk.
93-155: ⚡ Quick winAdd coverage for the two other stale-session branches.
The new branches in
evict.tsthat aren't exercised here:
- Stale session whose observations list scan throws — should
continuewithout callingevent::session::stoppedand without deleting the session.- Stale session with no compressed observations (
observations.some(o => o.title) === false) — should skip recovery entirely and still delete the session, leavingstaleSessions === 1and noevent::session::stoppedcall.Adding two short cases mirroring the existing pattern would lock down the recovery gate and the error path so future refactors don't regress them.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/evict.test.ts` around lines 93 - 155, Add two tests in test/evict.test.ts using the same pattern as the existing cases: (1) "skips stale session when observations scan throws" — create a storeForObservedSession session, have mockKV or the observations iterator throw when scanning, registerEvictFunction(sdk, kv), ensure sdk.trigger({function_id: "mem::evict"}) leaves the session in KV, does NOT call "event::session::stopped", and does NOT delete the session; (2) "deletes stale session with no compressed observations and skips recovery" — create a store where observations exist but all have no title (so observations.some(o => o.title) === false), registerEvictFunction(sdk, kv), trigger "mem::evict" and assert staleSessions === 1, the session is deleted from KV, and "event::session::stopped" is NOT called while "mem::consolidate-pipeline" may be called as in other tests; reuse mockSdk, registerFunction, KV, registerEvictFunction, mockKV, and sdk.trigger to mirror existing tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/functions/evict.ts`:
- Around line 129-136: The recovery gate is inverted: currently you call
recoverStaleSession only when observations.some((o) => o.title) is true, but
that detects compressed observations; instead detect sessions missing compressed
observations (i.e., only RawObservation) and call recoverStaleSession for those.
Change the logic around observations and hasCompressedObservations so you
compute whether there are no compressed observations (e.g., !observations.some(o
=> o.title)) and only then invoke recoverStaleSession(sdk, session.id) and
increment recoveredStaleSessions when it returns true.
---
Outside diff comments:
In `@src/functions/evict.ts`:
- Around line 131-153: The audit reason is incorrect after a successful
recovery: when hasCompressedObservations triggers
recoverStaleSession(session.id) and it returns true (recovered variable), change
the subsequent recordAudit call to use a distinct reason (e.g.
"stale_session_recovered_then_evicted") instead of
"stale_session_without_summary"; locate the logic around
hasCompressedObservations, the call to recoverStaleSession and the
recordAudit(...) invocation and branch the reason based on the recovered boolean
so recoveredStaleSessions++ and the delete still happen but the audit reflects
recovery.
- Around line 138-160: The stale-session branch deletes the session row via
kv.delete(KV.sessions, session.id) but never removes the child's observations
under KV.observations(session.id), leaving orphaned observations; before
deleting the session (or as part of the same branch) enumerate and delete all
keys under KV.observations(session.id) (or call the same helper used in
remember.ts/export-import.ts that clears observations) and only then call
kv.delete(KV.sessions, session.id) and recordAudit, or alternatively confirm and
call the consolidation/summarize path that explicitly deletes observations
(mem::summarize / mem::consolidate-pipeline) — update the code around the
kv.delete(KV.sessions, session.id) block to remove observations first (use the
project’s observation deletion helper or list-and-delete loop) to prevent
orphaned data.
---
Nitpick comments:
In `@src/functions/evict.ts`:
- Around line 39-43: Rename the predicate function triggerRecoveredSession to a
clearer name like isValidRecoveryResult and update all call sites (notably the
call inside recoverStaleSession) to use the new name; keep the implementation
identical but change the function identifier and any references/imports to avoid
mismatches so the predicate semantics remain the same while the name better
reflects that it validates a recovery result.
In `@test/evict.test.ts`:
- Around line 6-79: The test should follow the repo mock convention by mocking
the iii-sdk module instead of injecting a fake sdk: remove the custom mockSdk
injection usage and add vi.mock("iii-sdk") that exports an object with
registerFunction and trigger (and a calls array) so tests use the module mock;
reuse the existing mock handler logic from mockSdk to implement registerFunction
and trigger, and keep mockKV as-is so registerEvictFunction (or the code that
calls into iii-sdk.registerFunction) gets its handler from the module-level mock
rather than a passed-in sdk.
- Around line 93-155: Add two tests in test/evict.test.ts using the same pattern
as the existing cases: (1) "skips stale session when observations scan throws" —
create a storeForObservedSession session, have mockKV or the observations
iterator throw when scanning, registerEvictFunction(sdk, kv), ensure
sdk.trigger({function_id: "mem::evict"}) leaves the session in KV, does NOT call
"event::session::stopped", and does NOT delete the session; (2) "deletes stale
session with no compressed observations and skips recovery" — create a store
where observations exist but all have no title (so observations.some(o =>
o.title) === false), registerEvictFunction(sdk, kv), trigger "mem::evict" and
assert staleSessions === 1, the session is deleted from KV, and
"event::session::stopped" is NOT called while "mem::consolidate-pipeline" may be
called as in other tests; reuse mockSdk, registerFunction, KV,
registerEvictFunction, mockKV, and sdk.trigger to mirror existing tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c7c34bea-25aa-4426-8151-38379285d65e
📒 Files selected for processing (2)
src/functions/evict.tstest/evict.test.ts
| if (!observations) continue; | ||
|
|
||
| const hasCompressedObservations = observations.some((o) => o.title); | ||
| if (hasCompressedObservations) { | ||
| const recovered = await recoverStaleSession(sdk, session.id); | ||
| if (!recovered) continue; | ||
| recoveredStaleSessions++; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Inspect CompressedObservation/observation shapes and any place that constructs them without a title.
rg -nP --type=ts -C3 '\btitle\s*[:?]\s*' src/types.ts src/state || true
ast-grep --pattern $'interface CompressedObservation {
$$$
}'
rg -nP --type=ts -C2 'title\s*:' -g 'src/functions/**' -g '!**/*.test.ts'
# Also check how observations are produced before vs after session::stopped.
rg -nP --type=ts -C3 'observations?\s*\(' -g 'src/functions/**'Repository: rohitg00/agentmemory
Length of output: 38755
🏁 Script executed:
#!/bin/bash
# Find RawObservation interface definition
rg -nP --type=ts 'interface RawObservation' src/
# Also check for type RawObservation
rg -nP --type=ts 'type RawObservation' src/
# Look for places where RawObservation is created
rg -nP --type=ts 'RawObservation' src/types.ts -A 15 -B 2Repository: rohitg00/agentmemory
Length of output: 527
Reverse the compressed-observation recovery gate—it deletes exactly the sessions that need recovery most.
The KV store can contain both RawObservation and CompressedObservation objects (confirmed by observe.ts storing both, and replay.ts typing the same location as RawObservation | CompressedObservation). RawObservation has no title field, while CompressedObservation requires it.
The current check observations.some((o) => o.title) therefore:
- Returns
falsefor sessions containing only raw observations (wheresession::stoppednever fired—exactly the crash/disconnect cases that need recovery per issue#308) - Returns
truefor sessions that already have compressed observations (which meansession::stoppedlikely already ran)
This is backwards. Sessions with only raw observations are silently deleted without any recovery attempt. Invert the condition to recover sessions without compressed observations:
const hasOnlyRawObservations = !observations.some((o) => o.title);
if (hasOnlyRawObservations) {
const recovered = await recoverStaleSession(sdk, session.id);
if (!recovered) continue;
recoveredStaleSessions++;
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/functions/evict.ts` around lines 129 - 136, The recovery gate is
inverted: currently you call recoverStaleSession only when observations.some((o)
=> o.title) is true, but that detects compressed observations; instead detect
sessions missing compressed observations (i.e., only RawObservation) and call
recoverStaleSession for those. Change the logic around observations and
hasCompressedObservations so you compute whether there are no compressed
observations (e.g., !observations.some(o => o.title)) and only then invoke
recoverStaleSession(sdk, session.id) and increment recoveredStaleSessions when
it returns true.
Constraint: preserve raw-only stale sessions until a compression recovery path exists Confidence: high Scope-risk: narrow Tested: npm test -- test/evict.test.ts --reporter=dot; npx tsdown; git diff --check Signed-off-by: laplace young <yangqk12@whu.edu.cn>
|
Follow-up pushed in d2ff913. The recovery gate now preserves raw-only stale sessions because the existing stop path summarizes compressed observations. I also added coverage for observation scan failures and raw-only sessions, and changed the audit reason for successful recovery-backed evictions. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/evict.test.ts (2)
152-157: ⚡ Quick winHarden success-path assertions to enforce single-pass and ordering.
Current checks only verify presence. Please assert exactly one
event::session::stopped, exactly onemem::consolidate-pipeline, and that recovery occurs before consolidation.Proposed assertion update
- expect(calls.map((call) => call.function_id)).toContain( - "event::session::stopped", - ); - expect(calls.map((call) => call.function_id)).toContain( - "mem::consolidate-pipeline", - ); + const functionIds = calls.map((call) => call.function_id); + expect( + functionIds.filter((id) => id === "event::session::stopped"), + ).toHaveLength(1); + expect( + functionIds.filter((id) => id === "mem::consolidate-pipeline"), + ).toHaveLength(1); + expect(functionIds.indexOf("event::session::stopped")).toBeLessThan( + functionIds.indexOf("mem::consolidate-pipeline"), + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/evict.test.ts` around lines 152 - 157, Tests currently only check presence of events; update assertions to enforce exactly one occurrence of "event::session::stopped" and exactly one of "mem::consolidate-pipeline" and to assert the stopped event appears before the consolidate event. Locate the test using the calls array and function_id (e.g., references to calls.map(call => call.function_id)), replace the loose toContain checks with exact-count assertions (filter calls by call.function_id === "event::session::stopped" and by "mem::consolidate-pipeline" and assert each filtered.length === 1) and add an ordering check by comparing indices (indexOf the stopped id < indexOf the consolidate id) to ensure recovery occurs before consolidation.
58-97: ⚡ Quick winAlign this suite with the required
iii-sdkmock pattern.This test uses custom
mockSdk/mockKVhelpers instead of the repository-standardvi.mock("iii-sdk")setup, which breaks the shared testing convention.As per coding guidelines,
test/**/*.test.ts: usevi.mock("iii-sdk")with mocksdk.trigger,kv.get/set/list.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/evict.test.ts` around lines 58 - 97, The test defines custom mock helpers (mockKV, mockSdk, handlers, sdk.registerFunction, sdk.trigger, calls) instead of using the repository-standard vi.mock("iii-sdk") pattern; replace these custom helpers by calling vi.mock("iii-sdk") and provide a mock implementation that exposes sdk.trigger and kv.get / kv.set / kv.list (and any call tracking) consistent with other tests, then update the test to import and use the mocked iii-sdk exports rather than mockSdk/mockKV so registration and triggering uses the mocked sdk.registerFunction/sdk.trigger and key-value operations use the mocked kv.get/kv.set/kv.list.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/evict.test.ts`:
- Around line 152-157: Tests currently only check presence of events; update
assertions to enforce exactly one occurrence of "event::session::stopped" and
exactly one of "mem::consolidate-pipeline" and to assert the stopped event
appears before the consolidate event. Locate the test using the calls array and
function_id (e.g., references to calls.map(call => call.function_id)), replace
the loose toContain checks with exact-count assertions (filter calls by
call.function_id === "event::session::stopped" and by
"mem::consolidate-pipeline" and assert each filtered.length === 1) and add an
ordering check by comparing indices (indexOf the stopped id < indexOf the
consolidate id) to ensure recovery occurs before consolidation.
- Around line 58-97: The test defines custom mock helpers (mockKV, mockSdk,
handlers, sdk.registerFunction, sdk.trigger, calls) instead of using the
repository-standard vi.mock("iii-sdk") pattern; replace these custom helpers by
calling vi.mock("iii-sdk") and provide a mock implementation that exposes
sdk.trigger and kv.get / kv.set / kv.list (and any call tracking) consistent
with other tests, then update the test to import and use the mocked iii-sdk
exports rather than mockSdk/mockKV so registration and triggering uses the
mocked sdk.registerFunction/sdk.trigger and key-value operations use the mocked
kv.get/kv.set/kv.list.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1dea0470-ef64-4222-934b-a858a84a53ac
📒 Files selected for processing (2)
src/functions/evict.tstest/evict.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/functions/evict.ts
Summary
Fixes #308.
event::session::stoppedbeforemem::evictdeletes the session rowTests
npm test -- test/evict.test.ts --reporter=dotnpx tsdowngit diff --checknpm test -- --reporter=dot(895 passed; 10 existing Windows path-sensitive failures intest/compress-file.test.tsandtest/obsidian-export.test.ts)Summary by CodeRabbit
New Features
Tests