retriever: hardMinScore is applied before decay, so final returned scores can fall below the configured threshold #699

@Zundar

Description

Summary

hardMinScore is currently applied before lifecycle/time decay in src/retriever.ts, so results can pass the cutoff and then be returned with a final score below the configured threshold.

In practice this makes hardMinScore look ineffective for negative-query suppression: the runtime config can show a stricter threshold, but the returned list still contains low-scoring results after decay.

Affected code

  • third_party/memory-lancedb-pro/src/retriever.ts:436-438
  • third_party/memory-lancedb-pro/src/retriever.ts:503-506

Current flow in both vector and hybrid retrieval paths is effectively:

  1. compute semantic / lexical score
  2. apply length normalization
  3. filter with hardMinScore
  4. apply lifecycle/time decay
  5. return results whose final score can now be < hardMinScore
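The ordering above can be illustrated with a minimal sketch. It is not the code in src/retriever.ts: the `Candidate` and `Scored` types, the `retrieveCurrentOrder` function, and the multiplicative `decayFactor` are all assumptions made for illustration, and the scores are invented.

```typescript
// Hypothetical shapes; the real types in src/retriever.ts may differ.
interface Candidate {
  id: string;
  score: number;       // semantic / lexical score after length normalization
  decayFactor: number; // ASSUMED multiplicative lifecycle/time decay in (0, 1]
}

interface Scored {
  id: string;
  score: number;
}

// Mirrors the order described above: filter first, then decay.
function retrieveCurrentOrder(candidates: Candidate[], hardMinScore: number): Scored[] {
  return candidates
    .filter((c) => c.score >= hardMinScore)                      // step 3: pre-decay gate
    .map((c) => ({ id: c.id, score: c.score * c.decayFactor })); // step 4: decay afterwards
}

// Illustrative inputs only, not data from the benchmark run.
const results = retrieveCurrentOrder(
  [
    { id: "fresh", score: 0.72, decayFactor: 1.0 },
    { id: "stale", score: 0.72, decayFactor: 0.8 },
    { id: "weak", score: 0.5, decayFactor: 1.0 },
  ],
  0.69,
);

// "stale" passes the gate at 0.72 but is returned with a score below 0.69.
const belowFloor = results.filter((r) => r.score < 0.69).map((r) => r.id);
console.log(belowFloor);
```

Under these assumptions, "weak" is dropped by the gate as intended, while "stale" survives the gate and is still returned under the configured threshold.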

Why this is a problem

For operator tuning, hardMinScore reads like a post-ranking floor. But today it is only a pre-decay gate. That means:

  • negative queries can still return low-score memories even with a strict hardMinScore
  • debugging is confusing because effective runtime config looks correct
  • reported result scores do not reflect the configured floor
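One way to make the mismatch visible while debugging is a small invariant check over the returned list. This is only a sketch: `ReturnedResult` and `findFloorViolations` are hypothetical names, not part of the retriever's API, and the scores are invented.

```typescript
// Hypothetical result shape; the real returned objects may carry more fields.
interface ReturnedResult {
  id: string;
  score: number; // the score as reported to the caller
}

// Returns every result whose reported score is below the configured floor.
function findFloorViolations(results: ReturnedResult[], hardMinScore: number): ReturnedResult[] {
  return results.filter((r) => r.score < hardMinScore);
}

// Illustrative scores only.
const violations = findFloorViolations(
  [
    { id: "a", score: 0.74 },
    { id: "b", score: 0.58 },
  ],
  0.69,
);

console.log(violations.map((v) => v.id));
```

If hardMinScore were a final returned-score floor, this check would always produce an empty list.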

Reproduction context

Observed during an OpenClaw memory pipeline benchmark on a fixed corpus / query set:

  • runtime overrides included minScore = 0.6, hardMinScore = 0.69, candidatePoolSize = 40
  • those overrides were confirmed in the effective runtime config
  • negative queries still returned results
  • the final returned scores were below 0.69

Related operator notes / evidence:

  • debug runbook: runbooks/embedder-ab-pipeline-debug-20260423T022226Z.md
  • artifact bundle: logs/bench/embedder-ab-pipeline-debug-20260423T022226Z/

The key observation from that debug run:

hardMinScore is applied before lifecycle/time decay. After the filter, each surviving result is decayed, so the final returned score can fall below the configured threshold.

Expected behavior

One of these should be made explicit and consistent:

  1. hardMinScore is a final returned-score floor
    • then it should be enforced after decay / final scoring
  2. hardMinScore is only a pre-decay relevance gate
    • then each returned result should probably expose both its pre-decay and post-decay scores, and docs should say clearly that post-decay scores may fall below the configured threshold

Suggested direction

At minimum, make the contract explicit in docs / config semantics.

Preferably:

  • keep a base_score / pre_decay_score
  • apply lifecycle/time decay into a separate final_score
  • enforce the operator-facing floor on whichever score is intended by config semantics
  • return both scores for debugging if possible
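A rough sketch of that direction follows. Everything in it is hypothetical: the `floorTarget` option, the `ScoredMemory` and `FloorConfig` types, the `scoreAndFilter` function, and the multiplicative decay are assumptions for illustration rather than existing config keys or code in src/retriever.ts.

```typescript
// All names here are hypothetical; none are taken from src/retriever.ts.
type FloorTarget = "preDecay" | "final";

interface ScoredMemory {
  id: string;
  preDecayScore: number; // relevance after length normalization
  finalScore: number;    // preDecayScore with lifecycle/time decay applied
}

interface FloorConfig {
  hardMinScore: number;
  floorTarget: FloorTarget; // states which score the floor is enforced on
}

function scoreAndFilter(
  inputs: { id: string; baseScore: number; decayFactor: number }[],
  config: FloorConfig,
): ScoredMemory[] {
  return inputs
    .map((m) => ({
      id: m.id,
      preDecayScore: m.baseScore,
      finalScore: m.baseScore * m.decayFactor, // ASSUMED multiplicative decay
    }))
    .filter((m) => {
      const gated = config.floorTarget === "final" ? m.finalScore : m.preDecayScore;
      return gated >= config.hardMinScore;
    })
    .sort((a, b) => b.finalScore - a.finalScore); // rank by the score callers see
}

// Illustrative inputs only.
const sample = [
  { id: "fresh", baseScore: 0.72, decayFactor: 1.0 },
  { id: "stale", baseScore: 0.72, decayFactor: 0.8 },
];

const asFinalFloor = scoreAndFilter(sample, { hardMinScore: 0.69, floorTarget: "final" });
const asPreDecayGate = scoreAndFilter(sample, { hardMinScore: 0.69, floorTarget: "preDecay" });

console.log(asFinalFloor.map((m) => m.id));
console.log(asPreDecayGate.map((m) => m.id));
```

With the floor on the final score, only "fresh" is returned. With the floor as a pre-decay gate, both are returned, but each result carries both scores, so an operator can see why "stale" sits below 0.69.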

Metadata

Labels

bug: Something isn't working
