Description
Problem
`EdgeLearner._cursors` is an in-memory `dict[str, str]` that tracks the last processed Redis stream entry ID per stream key. On process restart, all cursors are lost and every stream is re-consumed from `0-0`.
With the stream TTL now at 30 days (#2102), a restart means re-processing up to 30 days of feedback events. The `on_retrieval` operations (EMA weight update, access count increment) are not fully idempotent: re-applying EMA decay to events that were already processed corrupts edge weights.
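To illustrate why replaying events is harmful, here is a minimal sketch of a generic EMA update. The function name `ema_update`, the smoothing factor `alpha`, and the exact formula are assumptions for illustration; the actual `on_retrieval` implementation may differ.

```python
def ema_update(weight: float, signal: float, alpha: float = 0.1) -> float:
    """Hypothetical EMA step: blend the old weight with a new feedback signal."""
    return (1 - alpha) * weight + alpha * signal


weight = 0.5

# Processing a feedback event once moves the weight toward the signal.
once = ema_update(weight, 1.0)

# Replaying the same event after a restart moves it again,
# so the stored weight no longer reflects the real event history.
twice = ema_update(once, 1.0)

print(once, twice)
```

Because each application depends on the previous weight, applying the same event twice yields a different result than applying it once, which is what makes a full replay from `0-0` unsafe.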
Discovered During
Code review of PR #2179 (issue #2102)
Fix
Persist cursors to Redis using a simple hash:
```text
HSET rag:cursors:edge_learner <stream_key> <last_entry_id>
```
Read the hash on startup to restore the cursors, and write the updated cursor after each successfully processed batch.
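A minimal sketch of what this could look like in Python. `CursorStore`, `HashClient`, and the method names are hypothetical, not existing code in `EdgeLearner`; the client only needs `hgetall` and `hset`, which redis-py provides (a client created with `decode_responses=True` is assumed so that values come back as `str`).

```python
from typing import Dict, Protocol


class HashClient(Protocol):
    """Subset of the Redis client surface this sketch relies on."""

    def hgetall(self, name: str) -> Dict[str, str]: ...
    def hset(self, name: str, key: str, value: str) -> int: ...


CURSOR_HASH = "rag:cursors:edge_learner"  # hash key proposed in this issue
START_ID = "0-0"  # used for streams that have no persisted cursor yet


class CursorStore:
    """Hypothetical helper: keeps cursors in memory and mirrors them to Redis."""

    def __init__(self, client: HashClient, hash_key: str = CURSOR_HASH) -> None:
        self._client = client
        self._hash_key = hash_key
        self._cursors: Dict[str, str] = {}

    def load(self) -> None:
        # Call once on startup to restore every persisted cursor.
        self._cursors = dict(self._client.hgetall(self._hash_key))

    def get(self, stream_key: str) -> str:
        # Unknown streams start from the beginning, matching current behavior.
        return self._cursors.get(stream_key, START_ID)

    def commit(self, stream_key: str, last_entry_id: str) -> None:
        # Call only after a batch has been fully processed.
        # Persist first, then update the in-memory copy.
        self._client.hset(self._hash_key, stream_key, last_entry_id)
        self._cursors[stream_key] = last_entry_id
```

Note that a crash between processing a batch and committing its cursor would still replay that one batch, so this narrows the replay window from the full stream to a single batch rather than eliminating it.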
Impact
Medium. Edge weights are corrupted after a restart. Restarts are infrequent, so this occurs rarely, but the impact on RAG quality is high when it does.