Skip to content

Memory: janitor pass for unretrieved-fact decay #466

@dcellison

Description

@dcellison

Parent: #436 (Sub 4 of the memory-quality epic). Stub; full body to be written at spec time.

Scope

Cron-driven background sweep (weekly initial cadence) that walks the Qdrant index and applies a decay condition to each fact: (age > N months) AND (retrieval_count == 0) triggers either a demotion (provenance class downgraded one notch, confidence multiplied by a decay factor) or a cull (delete from index). Demotion is the lower-risk default for v1; cull lands behind a flag once demotion behavior is observed in production.

Per-provenance decay parameters rather than a single global N: a user_stated fact unretrieved for six months is probably still important context the operator told us once; the same age threshold on assistant_derived is much closer to garbage.

Prerequisite: Sub 4a

Per-fact retrieval-event logging does not exist today. Every retrieval needs to increment a counter on the facts it returned (or at least on the facts that ranked above the prompt-injection cutoff). The counter lives in fact metadata alongside the speaker and confidence fields from #437. This logging change is carved out as Sub 4a, landing before Sub 4 proper so the janitor has data to act on.

Sub 4a risk: the counter increment runs on every retrieval, which is a hot-path mutation. Performance impact must be measured before that sub ships; if non-trivial, batch the increment or move to sampling.

Why

The only mechanism in the epic that uses signal genuinely unavailable at write-time (actual retrieval history). Even a perfect upfront-quality floor cannot predict which facts will turn out useless three months later; the janitor catches that class.

Open design questions for the spec

  • Per-provenance decay parameters (N months per class, decay multiplier per class).
  • Demotion vs cull policy: demotion-only for v1 is the design preference; cull flag is a follow-up.
  • Logging granularity: per-retrieval per-fact vs per-retrieval per-fact-that-reached-the-prompt. The latter is cheaper and aligns with how the eval harness defines fraction_in_prompt.

Risks (preliminary)

  • Hot-path mutation overhead from Sub 4a's counter increment.
  • Decay parameters that are too aggressive on user_stated facts could lose context the operator told the bot once and would want again.

Implementation timing: lowest urgency of the four; ship after Sub 3 and Sub 2. The current index is small enough that decay is not yet load-bearing.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions