feat(consolidation): expose cluster threshold and min size as env vars #163

Open

flintfromthebasement wants to merge 2 commits into verygoodplugins:main from flintfromthebasement:feat/configurable-cluster-thresholds

Conversation

@flintfromthebasement (Contributor) commented May 1, 2026

Summary

Adds configuration surface for cluster consolidation tuning while preserving current runtime defaults.

  • CONSOLIDATION_CLUSTER_SIMILARITY_THRESHOLD is now configurable; default remains 0.75.
  • CONSOLIDATION_MIN_CLUSTER_SIZE is now configurable; default remains 3.
  • CONSOLIDATION_CLUSTER_INTERVAL_SECONDS is unchanged, keeping its existing 2592000-second (30-day) default.
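
As a sketch, reading these knobs with behavior-preserving fallbacks could look like the following (the function and constant names here are hypothetical; the real definitions live in automem/config.py):

```python
import os

# Hypothetical constants mirroring the current hardcoded defaults.
DEFAULT_CLUSTER_SIMILARITY_THRESHOLD = 0.75
DEFAULT_MIN_CLUSTER_SIZE = 3


def read_cluster_tuning():
    """Read the two cluster-tuning env vars, falling back to the
    behavior-preserving defaults when they are unset."""
    threshold = float(
        os.getenv(
            "CONSOLIDATION_CLUSTER_SIMILARITY_THRESHOLD",
            str(DEFAULT_CLUSTER_SIMILARITY_THRESHOLD),
        )
    )
    min_size = int(
        os.getenv("CONSOLIDATION_MIN_CLUSTER_SIZE", str(DEFAULT_MIN_CLUSTER_SIZE))
    )
    return threshold, min_size
```

With neither variable set, this returns (0.75, 3), so deployments that never touch the new knobs see no change.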

Scope update

This PR is intentionally scoped down to config exposure only. It no longer changes the default threshold, minimum cluster size, or cluster cadence.

The motivation for exposing these knobs still holds: embedding geometry and corpus shape vary by deployment, so hardcoded clustering parameters are hard to operate. But retuning the defaults should happen after measurement, not as part of this mechanical config-surface change.

Local graph diagnostics on May 1, 2026 suggest that lowering the threshold may be useful for some corpora, and Flint's 50k+ memory growth rate makes the 30-day cadence feel operationally stale. Those are real signals, but they are not yet enough to bless new global defaults: exact cluster_similar_memories() is expensive at graph scale, and nearest-neighbor samples need quality review before merge behavior changes for everyone.

What changed

  • automem/config.py adds the two cluster tuning env vars with behavior-preserving defaults.
  • automem/consolidation/runtime_helpers.py applies optional overrides to MemoryConsolidator after construction.
  • automem/consolidation/runtime_bindings.py forwards the optional values through the runtime factory.
  • app.py imports and passes the new config values.
  • tests/test_consolidation_engine.py covers both default preservation and explicit override behavior.
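
The post-construction override pattern described above can be sketched roughly like this (the stub class and helper name are illustrative, not the actual AutoMem code):

```python
from typing import Optional


class MemoryConsolidator:
    """Stand-in for the real class; its constructor signature is unchanged."""

    def __init__(self):
        self.similarity_threshold = 0.75
        self.min_cluster_size = 3


def apply_cluster_overrides(
    consolidator: MemoryConsolidator,
    similarity_threshold: Optional[float] = None,
    min_cluster_size: Optional[int] = None,
) -> MemoryConsolidator:
    # Only overwrite when an override was actually supplied, so unset
    # env vars leave the constructor defaults untouched.
    if similarity_threshold is not None:
        consolidator.similarity_threshold = similarity_threshold
    if min_cluster_size is not None:
        consolidator.min_cluster_size = min_cluster_size
    return consolidator
```

Setting attributes after construction keeps the public constructor signature stable while still letting config flow through the runtime factory.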

Out of scope

Follow-up PRs should handle runtime behavior separately:

  • measured threshold retuning, if nearest-neighbor and cluster-quality evidence supports it;
  • cluster cadence changes or a fresher online/nearline enrichment path for high-ingest deployments;
  • readiness probing before consolidation ticks;
  • legacy PARALLEL_CONTEXT normalization;
  • supersession or preference discovery improvements.

Test plan

  • .venv/bin/python -m pytest tests/test_consolidation_engine.py — 9 passed
  • .venv/bin/python -m compileall automem tests/test_consolidation_engine.py
  • Pre-commit hooks on commit: black, isort, flake8, trim trailing whitespace, EOF fixer, merge conflict check, debug statement check, detect secrets, conventional commit

Adds two new env-var-backed knobs and tunes one default:

- CONSOLIDATION_CLUSTER_SIMILARITY_THRESHOLD (default 0.65)
- CONSOLIDATION_MIN_CLUSTER_SIZE (default 2)
- CONSOLIDATION_CLUSTER_INTERVAL_SECONDS default: 2592000 (30d) -> 604800 (7d)

Why
---

similarity_threshold and min_cluster_size were hardcoded in
MemoryConsolidator.__init__ at 0.75 and 3 respectively, with no
configuration path. Embedding geometry is deployment-dependent
(Voyage-4 1024d, OpenAI text-embedding-3-small Matryoshka-1024d,
Ollama nomic 768d, FastEmbed bge 768d) and a single hardcoded
threshold cannot suit all of them. The pattern of exposing similarity
thresholds via env vars is already established for enrichment
(ENRICHMENT_SIMILARITY_THRESHOLD).

Default 0.65 (down from 0.75): at 1024d, 0.75 sits in near-duplicate
territory; on a 62.7k-memory corpus with the 0.75 threshold the
cluster tick produced zero clusters. 0.65 puts the threshold in the
"topically related, not literally the same thing" band that
clustering is meant to capture. Operators on tighter cosine
distributions (e.g. text-embedding-3-large at 3072d) can raise it.

Default min_cluster_size 2 (down from 3): with min=3, a strongly-
related pair of memories never clusters. Connected-components on
sparse semantic graphs typically produces pairs first; min=2 keeps
those in scope. Operators on very large corpora can raise it.
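
A minimal illustration of why min=2 matters for connected components on a
sparse similarity graph (self-contained toy code, not the project's
implementation):

```python
from collections import defaultdict


def clusters_from_edges(edges, min_cluster_size):
    """Connected components over an undirected similarity graph,
    keeping only components with at least min_cluster_size nodes."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, clusters = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            n = stack.pop()
            if n in component:
                continue
            component.add(n)
            stack.extend(adj[n] - component)
        seen |= component
        clusters.append(component)
    return [c for c in clusters if len(c) >= min_cluster_size]
```

On edges [("a", "b"), ("c", "d"), ("d", "e")], min=3 drops the {a, b}
pair entirely, while min=2 keeps both components.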

Default interval 7d (down from 30d): monthly clustering on a live,
growing corpus means the graph is always up to a month stale. 7d
matches the existing creative-consolidation cadence. The breaking
default change is the one behavior shift in this PR; deployments
wanting the old behavior set
CONSOLIDATION_CLUSTER_INTERVAL_SECONDS=2592000.

Wiring
------

- automem/config.py: add the two new env vars; flip the interval
  default.
- automem/consolidation/runtime_helpers.py:
  build_consolidator_from_config accepts the two new params and
  sets them as attributes post-construction (the
  MemoryConsolidator constructor does not accept them; this avoids
  changing the public class signature).
- automem/consolidation/runtime_bindings.py: forward through
  create_consolidation_runtime.
- app.py: import and pass the new config values.

Everything else is purely additive: apart from the interval, no env
override is required to preserve prior behavior.

Tests
-----

- tests/test_consolidation_engine.py: 7/7 pass.
- Full unit suite: clean except two pre-existing
  tests/test_content_size.py failures (auth setup, unrelated).
@jack-arturo (Member)

Scope update pushed in a551221: this PR now preserves the existing defaults and only exposes the config surface. Defaults remain similarity_threshold=0.75, min_cluster_size=3, and CONSOLIDATION_CLUSTER_INTERVAL_SECONDS=2592000. I updated the PR body with the measurement note: local diagnostics make lower thresholds / faster cadence plausible for some deployments, especially high-ingest graphs, but default retuning should be a separate measured follow-up.

jack-arturo pushed a commit that referenced this pull request May 1, 2026
…B load race (#165)

## Why

When `init_consolidation_scheduler()` runs a tick **immediately** after
spawning the worker thread, FalkorDB can still be loading its RDB
snapshot from disk. Every Redis command during that window returns:

> `LOADING Redis is loading the dataset in memory`

The eager tick catches the error, logs it, and bumps `last_run`
timestamps — silently skipping the day's decay / creative / cluster work
until tomorrow. The bigger the corpus, the longer the RDB load, the more
reliably this fires. On any restart-on-deploy host (Railway, Docker,
systemd) with a few thousand memories, it hits every deploy.

## What changes

One line in `automem/consolidation/runtime_scheduler.py:100` — drop the
eager `run_consolidation_tick_fn()` call after starting the worker
thread, and add a comment explaining why.

```diff
     state.consolidation_thread.start()
-    run_consolidation_tick_fn()
+    # Skip eager first tick: FalkorDB may still be loading its RDB snapshot at
+    # startup and the "Redis is loading the dataset in memory" error poisons
+    # the day's decay/creative run. The worker loop will fire its first tick
+    # after consolidation_tick_seconds, which is plenty of warm-up time.
     logger.info("Consolidation scheduler initialized")
```

## Why this is safe

- The worker loop still fires within `CONSOLIDATION_TICK_SECONDS`
(default 3600s = 1h). For decay/creative/cluster intervals measured in
days, a one-tick startup delay is invisible.
- The scheduler is timestamp-driven (`last_run` per task), not
edge-triggered. Missed intervals get picked up by the next loop
iteration — nothing is "lost" by deferring.
- Failure mode flips from "silent broken run" to "no run yet, will run
shortly" — strictly better.
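
A toy sketch of the timestamp-driven scheduling these bullets describe (names are illustrative; the real loop lives in runtime_scheduler.py):

```python
import time


def tick(last_run, intervals, now=None):
    """One scheduler pass: run every task whose interval has elapsed
    since its recorded last_run. Missed passes are picked up on a later
    call because the decision is timestamp-driven, not edge-triggered."""
    now = time.time() if now is None else now
    ran = []
    for task, interval in intervals.items():
        if now - last_run.get(task, 0.0) >= interval:
            last_run[task] = now
            ran.append(task)
    return ran
```

Because each task compares `now` against its own `last_run`, deferring the first tick only delays work; it never drops an interval on the floor.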

## Out of scope

- A more involved fix would actively probe FalkorDB readiness with
retries before the first tick. That's a bigger change and arguably
belongs at the FalkorDB-client layer, not here. This PR is the minimal,
low-risk fix.
- The `discover_creative_associations` / clustering improvements live in
#163 and #164.
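
For reference, the readiness-probe follow-up mentioned above could be sketched as a simple retry loop (purely illustrative; no such helper exists in this PR):

```python
import time


def wait_until_ready(ping, attempts=10, delay=2.0, sleep=time.sleep):
    """Poll a ping callable until it stops raising, with a fixed delay
    between attempts. Returns True once the store answers, False if it
    never comes up within the attempt budget."""
    for attempt in range(attempts):
        try:
            ping()
            return True
        except Exception:
            if attempt < attempts - 1:
                sleep(delay)
    return False
```

The `ping` argument would be whatever cheap command the FalkorDB client exposes; injecting `sleep` keeps the helper testable without real waits.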

## Test plan

- [ ] Service starts cleanly with no eager tick log entry
- [ ] Worker loop fires its first tick after
`CONSOLIDATION_TICK_SECONDS`
- [ ] Forcing a tick via `POST /consolidate` still works immediately
- [ ] On a restart with a large RDB, no `LOADING Redis is loading the
dataset in memory` errors appear in consolidation logs

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
