fix(consolidation): skip eager first tick at startup to avoid FalkorDB load race by flintfromthebasement · Pull Request #165 · verygoodplugins/automem

flintfromthebasement · 2026-05-01T15:37:17Z

Why

When init_consolidation_scheduler() runs a tick immediately after spawning the worker thread, FalkorDB can still be loading its RDB snapshot from disk. Every Redis command during that window returns:

LOADING Redis is loading the dataset in memory

The eager tick catches the error, logs it, and bumps last_run timestamps — silently skipping the day's decay / creative / cluster work until tomorrow. The bigger the corpus, the longer the RDB load, the more reliably this fires. On any restart-on-deploy host (Railway, Docker, systemd) with a few thousand memories, it hits every deploy.
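The failure mode described above can be modelled in a few lines. This is a simplified sketch, not the automem code: `LoadingError`, `run_task`, `tick`, and the single `decay` task with a one-day interval are hypothetical stand-ins chosen to show how catching the error and still bumping `last_run` discards the run.

```python
import logging

logger = logging.getLogger("consolidation-sketch")


class LoadingError(Exception):
    """Hypothetical stand-in for the error raised while the RDB is loading."""


def run_task(db_ready):
    # Simulates a task that issues a Redis command.
    if not db_ready:
        raise LoadingError("LOADING Redis is loading the dataset in memory")
    return "ok"


def tick(last_run, now, db_ready, interval=86400.0):
    """One scheduler tick for a single task. Returns True only if the task really ran."""
    if now - last_run["decay"] < interval:
        return False  # not due yet
    ran = False
    try:
        run_task(db_ready)
        ran = True
    except LoadingError as exc:
        logger.warning("decay failed: %s", exc)
    # The timestamp is bumped even when the task failed, so the next
    # attempt is pushed out by a full interval.
    last_run["decay"] = now
    return ran
```

In this model, an eager tick at startup with `db_ready=False` marks the task as done for the day, and a later tick with a ready database does nothing because the task is no longer due.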

What changes

One line in automem/consolidation/runtime_scheduler.py:100 — drop the eager run_consolidation_tick_fn() call after starting the worker thread, and add a comment explaining why.

     state.consolidation_thread.start()
-    run_consolidation_tick_fn()
+    # Skip eager first tick: FalkorDB may still be loading its RDB snapshot at
+    # startup and the "Redis is loading the dataset in memory" error poisons
+    # the day's decay/creative run. The worker loop will fire its first tick
+    # after consolidation_tick_seconds, which is plenty of warm-up time.
     logger.info("Consolidation scheduler initialized")

Why this is safe

- The worker loop still fires within CONSOLIDATION_TICK_SECONDS (default 3600s = 1h). For decay/creative/cluster intervals measured in days, a one-tick startup delay is invisible.
- The scheduler is timestamp-driven (last_run per task), not edge-triggered. Missed intervals get picked up by the next loop iteration, so nothing is "lost" by deferring.
- Failure mode flips from "silent broken run" to "no run yet, will run shortly", which is strictly better.
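To illustrate the timestamp-driven point, here is a minimal sketch of a due-check based on per-task last_run values. The interval values and the `due_tasks` helper are assumptions for illustration only and are not taken from the automem source.

```python
# Hypothetical intervals in seconds; the real values come from automem's config.
INTERVALS = {"decay": 86400, "creative": 86400, "cluster": 7 * 86400}


def due_tasks(last_run, now, intervals=INTERVALS):
    """Return the tasks whose interval has elapsed since their last_run timestamp."""
    return [
        name
        for name, interval in intervals.items()
        if now - last_run.get(name, 0.0) >= interval
    ]
```

Because the check compares elapsed time rather than reacting to a one-off trigger, a tick that happens an hour later than it could have still finds every overdue task.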

Out of scope

- A more involved fix would actively probe FalkorDB readiness with retries before the first tick. That's a bigger change and arguably belongs at the FalkorDB-client layer, not here. This PR is the minimal, low-risk fix.
- The discover_creative_associations / clustering improvements live in feat(consolidation): expose cluster threshold and min size as env vars #163 and feat(scripts): safer reclassify_with_llm.py with provider flags + tighter prompt #164.
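For reference, the readiness probe mentioned as out of scope might look roughly like the sketch below. It is not part of this PR. `wait_until_ready` and its parameters are hypothetical, and `ping` is any zero-argument callable that raises while the database is still loading; a real version would catch the client's specific loading exception rather than `Exception`.

```python
import time


def wait_until_ready(ping, attempts=10, delay=1.0, sleep=time.sleep):
    """Call ping() until it succeeds or attempts run out. Returns True when ready."""
    for attempt in range(attempts):
        try:
            ping()
            return True
        except Exception:
            # Broad catch for illustration only; narrow this in real code.
            if attempt < attempts - 1:
                sleep(delay)
    return False
```

The `sleep` parameter is injected so the retry loop can be exercised in tests without waiting.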

Test plan

- Service starts cleanly with no eager tick log entry
- Worker loop fires its first tick after CONSOLIDATION_TICK_SECONDS
- Forcing a tick via POST /consolidate still works immediately
- On a restart with a large RDB, no "LOADING Redis is loading the dataset in memory" errors appear in consolidation logs

…B load race

When init_consolidation_scheduler() ran a tick immediately after spawning the worker thread, FalkorDB could still be loading its RDB snapshot from disk. Every Redis command in that window returns "LOADING Redis is loading the dataset in memory", so the eager tick fails, but the failure is caught and last_run timestamps get bumped, silently skipping the day's decay / creative / cluster work until tomorrow. The bigger the corpus, the longer the RDB load, the more reliably this fires. On any restart-on-deploy host (Railway, Docker, systemd) with a few thousand memories, it hits every deploy.

Removing the eager tick is safe:

- The worker loop still fires within CONSOLIDATION_TICK_SECONDS (default 1h). For decay/creative/cluster intervals measured in days, a one-tick delay at startup is invisible.
- The scheduler is timestamp-driven (last_run per task), not edge-triggered. Nothing is "lost" by deferring: the next loop iteration picks up any missed intervals.
- Failure mode flips from "silent broken run" to "no run yet, will run shortly", which is strictly better.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

jack-arturo merged commit 1b812cf into verygoodplugins:main May 1, 2026
7 checks passed

jack-arturo mentioned this pull request Apr 28, 2026

chore(main): release 0.15.3 #154

Open

jack-arturo merged 1 commit into verygoodplugins:main from flintfromthebasement:fix/skip-eager-consolidation-tick-on-startup
