feat(consolidation): expose cluster threshold and min size as env vars #163

Open

flintfromthebasement wants to merge 2 commits into verygoodplugins:main from flintfromthebasement:feat/configurable-cluster-thresholds

Conversation

@flintfromthebasement (Contributor) commented May 1, 2026

Summary

Adds configuration surface for cluster consolidation tuning while preserving current runtime defaults.

  • CONSOLIDATION_CLUSTER_SIMILARITY_THRESHOLD is now configurable; default remains 0.75.
  • CONSOLIDATION_MIN_CLUSTER_SIZE is now configurable; default remains 3.
  • CONSOLIDATION_CLUSTER_INTERVAL_SECONDS is unchanged, keeping its existing 2592000-second (30-day) default.
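
As a sketch, reading these knobs with behavior-preserving fallbacks could look like the following (the function and constant names here are hypothetical; the real definitions live in automem/config.py):

```python
import os

# Hypothetical constants mirroring the current hardcoded defaults.
DEFAULT_CLUSTER_SIMILARITY_THRESHOLD = 0.75
DEFAULT_MIN_CLUSTER_SIZE = 3


def read_cluster_tuning():
    """Read the two cluster-tuning env vars, falling back to the
    behavior-preserving defaults when they are unset."""
    threshold = float(
        os.getenv(
            "CONSOLIDATION_CLUSTER_SIMILARITY_THRESHOLD",
            str(DEFAULT_CLUSTER_SIMILARITY_THRESHOLD),
        )
    )
    min_size = int(
        os.getenv("CONSOLIDATION_MIN_CLUSTER_SIZE", str(DEFAULT_MIN_CLUSTER_SIZE))
    )
    return threshold, min_size
```

With neither variable set, this returns (0.75, 3), so deployments that never touch the new knobs see no change.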

Scope update

This PR is intentionally scoped down to config exposure only. It no longer changes the default threshold, minimum cluster size, or cluster cadence.

The motivation for exposing these knobs still holds: embedding geometry and corpus shape vary by deployment, so hardcoded clustering parameters are hard to operate. But retuning the defaults should happen after measurement, not as part of this mechanical config-surface change.

Local graph diagnostics on May 1, 2026 suggest that lowering the threshold may be useful for some corpora, and Flint's 50k+ memory growth rate makes the 30-day cadence feel operationally stale. Those are real signals, but they are not yet enough to bless new global defaults: exact cluster_similar_memories() is expensive at graph scale, and nearest-neighbor samples need quality review before merge behavior changes for everyone.

What changed

  • automem/config.py adds the two cluster tuning env vars with behavior-preserving defaults.
  • automem/consolidation/runtime_helpers.py applies optional overrides to MemoryConsolidator after construction.
  • automem/consolidation/runtime_bindings.py forwards the optional values through the runtime factory.
  • app.py imports and passes the new config values.
  • tests/test_consolidation_engine.py covers both default preservation and explicit override behavior.
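
The post-construction override pattern described above can be sketched roughly like this (the stub class and helper name are illustrative, not the actual AutoMem code):

```python
from typing import Optional


class MemoryConsolidator:
    """Stand-in for the real class; its constructor signature is unchanged."""

    def __init__(self):
        self.similarity_threshold = 0.75
        self.min_cluster_size = 3


def apply_cluster_overrides(
    consolidator: MemoryConsolidator,
    similarity_threshold: Optional[float] = None,
    min_cluster_size: Optional[int] = None,
) -> MemoryConsolidator:
    # Only overwrite when an override was actually supplied, so unset
    # env vars leave the constructor defaults untouched.
    if similarity_threshold is not None:
        consolidator.similarity_threshold = similarity_threshold
    if min_cluster_size is not None:
        consolidator.min_cluster_size = min_cluster_size
    return consolidator
```

Setting attributes after construction keeps the public constructor signature stable while still letting config flow through the runtime factory.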

Out of scope

Follow-up PRs should handle runtime behavior separately:

  • measured threshold retuning, if nearest-neighbor and cluster-quality evidence supports it;
  • cluster cadence changes or a fresher online/nearline enrichment path for high-ingest deployments;
  • readiness probing before consolidation ticks;
  • legacy PARALLEL_CONTEXT normalization;
  • supersession or preference discovery improvements.

Test plan

  • .venv/bin/python -m pytest tests/test_consolidation_engine.py — 9 passed
  • .venv/bin/python -m compileall automem tests/test_consolidation_engine.py
  • Pre-commit hooks on commit: black, isort, flake8, trim trailing whitespace, EOF fixer, merge conflict check, debug statement check, detect secrets, conventional commit

Adds two new env-var-backed knobs and tunes one default:

- CONSOLIDATION_CLUSTER_SIMILARITY_THRESHOLD (default 0.65)
- CONSOLIDATION_MIN_CLUSTER_SIZE (default 2)
- CONSOLIDATION_CLUSTER_INTERVAL_SECONDS default: 2592000 (30d) -> 604800 (7d)

Why
---

similarity_threshold and min_cluster_size were hardcoded in
MemoryConsolidator.__init__ at 0.75 and 3 respectively, with no
configuration path. Embedding geometry is deployment-dependent
(Voyage-4 1024d, OpenAI text-embedding-3-small Matryoshka-1024d,
Ollama nomic 768d, FastEmbed bge 768d) and a single hardcoded
threshold cannot suit all of them. The pattern of exposing similarity
thresholds via env vars is already established for enrichment
(ENRICHMENT_SIMILARITY_THRESHOLD).

Default 0.65 (down from 0.75): at 1024d, 0.75 sits in near-duplicate
territory; on a 62.7k-memory corpus with the 0.75 threshold the
cluster tick produced zero clusters. 0.65 puts the threshold in the
"topically related, not literally the same thing" band that
clustering is meant to capture. Operators on tighter cosine
distributions (e.g. text-embedding-3-large at 3072d) can raise it.

Default min_cluster_size 2 (down from 3): with min=3, a strongly-
related pair of memories never clusters. Connected-components on
sparse semantic graphs typically produces pairs first; min=2 keeps
those in scope. Operators on very large corpora can raise it.
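
A minimal illustration of why min=2 matters for connected components on a
sparse similarity graph (self-contained toy code, not the project's
implementation):

```python
from collections import defaultdict


def clusters_from_edges(edges, min_cluster_size):
    """Connected components over an undirected similarity graph,
    keeping only components with at least min_cluster_size nodes."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, clusters = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            n = stack.pop()
            if n in component:
                continue
            component.add(n)
            stack.extend(adj[n] - component)
        seen |= component
        clusters.append(component)
    return [c for c in clusters if len(c) >= min_cluster_size]
```

On edges [("a", "b"), ("c", "d"), ("d", "e")], min=3 drops the {a, b}
pair entirely, while min=2 keeps both components.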

Default interval 7d (down from 30d): monthly clustering on a live,
growing corpus means the graph is always up to a month stale. 7d
matches the existing creative-consolidation cadence. The breaking
default change is the one behavior shift in this PR; deployments
wanting the old behavior set
CONSOLIDATION_CLUSTER_INTERVAL_SECONDS=2592000.

Wiring
------

- automem/config.py: add the two new env vars; flip the interval
  default.
- automem/consolidation/runtime_helpers.py:
  build_consolidator_from_config accepts the two new params and
  sets them as attributes post-construction (the
  MemoryConsolidator constructor does not accept them; this avoids
  changing the public class signature).
- automem/consolidation/runtime_bindings.py: forward through
  create_consolidation_runtime.
- app.py: import and pass the new config values.

Everything else is purely additive: apart from the interval, no env
override is required to preserve prior behavior.

Tests
-----

- tests/test_consolidation_engine.py: 7/7 pass.
- Full unit suite: clean except two pre-existing
  tests/test_content_size.py failures (auth setup, unrelated).
@jack-arturo (Member)

Scope update pushed in a551221: this PR now preserves the existing defaults and only exposes the config surface. Defaults remain similarity_threshold=0.75, min_cluster_size=3, and CONSOLIDATION_CLUSTER_INTERVAL_SECONDS=2592000. I updated the PR body with the measurement note: local diagnostics make lower thresholds / faster cadence plausible for some deployments, especially high-ingest graphs, but default retuning should be a separate measured follow-up.

jack-arturo pushed a commit that referenced this pull request May 1, 2026
…B load race (#165)

## Why

When `init_consolidation_scheduler()` runs a tick **immediately** after
spawning the worker thread, FalkorDB can still be loading its RDB
snapshot from disk. Every Redis command during that window returns:

> `LOADING Redis is loading the dataset in memory`

The eager tick catches the error, logs it, and bumps `last_run`
timestamps — silently skipping the day's decay / creative / cluster work
until tomorrow. The bigger the corpus, the longer the RDB load, the more
reliably this fires. On any restart-on-deploy host (Railway, Docker,
systemd) with a few thousand memories, it hits every deploy.

## What changes

One line in `automem/consolidation/runtime_scheduler.py:100` — drop the
eager `run_consolidation_tick_fn()` call after starting the worker
thread, and add a comment explaining why.

```diff
     state.consolidation_thread.start()
-    run_consolidation_tick_fn()
+    # Skip eager first tick: FalkorDB may still be loading its RDB snapshot at
+    # startup and the "Redis is loading the dataset in memory" error poisons
+    # the day's decay/creative run. The worker loop will fire its first tick
+    # after consolidation_tick_seconds, which is plenty of warm-up time.
     logger.info("Consolidation scheduler initialized")
```

## Why this is safe

- The worker loop still fires within `CONSOLIDATION_TICK_SECONDS`
(default 3600s = 1h). For decay/creative/cluster intervals measured in
days, a one-tick startup delay is invisible.
- The scheduler is timestamp-driven (`last_run` per task), not
edge-triggered. Missed intervals get picked up by the next loop
iteration — nothing is "lost" by deferring.
- Failure mode flips from "silent broken run" to "no run yet, will run
shortly" — strictly better.
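
A toy sketch of the timestamp-driven scheduling these bullets describe (names are illustrative; the real loop lives in runtime_scheduler.py):

```python
import time


def tick(last_run, intervals, now=None):
    """One scheduler pass: run every task whose interval has elapsed
    since its recorded last_run. Missed passes are picked up on a later
    call because the decision is timestamp-driven, not edge-triggered."""
    now = time.time() if now is None else now
    ran = []
    for task, interval in intervals.items():
        if now - last_run.get(task, 0.0) >= interval:
            last_run[task] = now
            ran.append(task)
    return ran
```

Because each task compares `now` against its own `last_run`, deferring the first tick only delays work; it never drops an interval on the floor.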

## Out of scope

- A more involved fix would actively probe FalkorDB readiness with
retries before the first tick. That's a bigger change and arguably
belongs at the FalkorDB-client layer, not here. This PR is the minimal,
low-risk fix.
- The `discover_creative_associations` / clustering improvements live in
#163 and #164.
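
For reference, the readiness-probe follow-up mentioned above could be sketched as a simple retry loop (purely illustrative; no such helper exists in this PR):

```python
import time


def wait_until_ready(ping, attempts=10, delay=2.0, sleep=time.sleep):
    """Poll a ping callable until it stops raising, with a fixed delay
    between attempts. Returns True once the store answers, False if it
    never comes up within the attempt budget."""
    for attempt in range(attempts):
        try:
            ping()
            return True
        except Exception:
            if attempt < attempts - 1:
                sleep(delay)
    return False
```

The `ping` argument would be whatever cheap command the FalkorDB client exposes; injecting `sleep` keeps the helper testable without real waits.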

## Test plan

- [ ] Service starts cleanly with no eager tick log entry
- [ ] Worker loop fires its first tick after
`CONSOLIDATION_TICK_SECONDS`
- [ ] Forcing a tick via `POST /consolidate` still works immediately
- [ ] On a restart with a large RDB, no `LOADING Redis is loading the
dataset in memory` errors appear in consolidation logs

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
