fix: re-index relay hops in byNode after Load() picks best observation (#692)#801
fix: re-index relay hops in byNode after Load() picks best observation (#692)#801efiten wants to merge 1 commit intoKpa-clawbot:masterfrom
Conversation
Kpa-clawbot
left a comment
There was a problem hiding this comment.
Review — REVIEW-ONLY (PR is CONFLICTING, do not merge as-is)
Verdict: Bug is real (#692 root cause well-diagnosed), test is excellent — but the fix mechanism is superseded by #806 which already landed on master. Recommend rebasing and keeping only the test, dropping the code changes.
Conflict scope
Rebasing onto origin/master conflicts in:
cmd/server/neighbor_persist.go(backfillResolvedPathsAsync block)cmd/server/store.go(auto-merges, but the change is now dead code — see below)
Conflicts with #806 (9e90548 perf(#800): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode).
Why the fix is now obsolete
After #806, Load() indexes resolved-path pubkeys inline per-observation as rows are scanned (store.go:498–521): addToByNode + addToResolvedPubkeyIndex are called right when resolved_path is read off the row. indexByNode() was deliberately scoped down to only decoded_json fields (pubKey/destPubKey/srcPubKey) — it no longer touches resolved paths at all.
So the PR's post-pickBestObservation re-call of s.indexByNode(tx) is a no-op for relay hops on master. Same story for the neighbor_persist.go hunk — master's backfill path now uses removeFromResolvedPubkeyIndex / extractResolvedPubkeys / addToResolvedPubkeyIndex instead.
What to keep
TestLoadIndexesRelayHopsFromResolvedPath is a valuable regression test and almost certainly passes against master as-is (lock in #806's fix for #692). Please rebase, drop the store.go + neighbor_persist.go hunks, and re-run the test against master.
Correctness note (for the historical fix, dijkstra)
Even on the pre-#806 codebase, the fix only indexes pubkeys from the best (longest) observation's resolved_path, not from all observations. Relay nodes that only appear on shorter paths would still be missed. #806's per-observation indexing during Load() handles this correctly.
Perf (carmack)
N/A — superseded.
Action for author
|
@efiten please clean up for merging or abandon. |
…fter Load() TestLoadIndexesRelayHopsFromResolvedPath inserts a packet whose relay node appears only in resolved_path (not decoded_json), calls Load(), and asserts byNode[relayPubkey] is populated. Locks in the fix from Kpa-clawbot#806 (per-observation inline indexing during Load()). Code changes originally in this PR are obsolete after Kpa-clawbot#806. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
0040e92 to
2643651
Compare
Problem
indexByNode()was called duringLoad()immediately when eachStoreTxwas created — before observations were appended and beforepickBestObservation()settx.ResolvedPath. The resolved_path indexing branch added in #708 was effectively dead code on every server restart.Symptom: After any restart,
byNode[relay_pubkey]was empty for relay-only nodes even whenresolved_pathwas correctly persisted in the DB. Analytics showedtotalPackets = 0for repeater nodes despite active relay traffic.Fix
Call
s.indexByNode(tx)again in the post-load loop afterpickBestObservation(), whereResolvedPathis populated. Same fix applied tobackfillResolvedPathsAsync(), which also calledpickBestObservation()without re-indexing afterward.The dedup in
nodeHashesprevents double-counting: pubkeys already indexed from decoded JSON fields are skipped; only the relay hop pubkeys fromresolved_pathare new additions.Test
TestLoadIndexesRelayHopsFromResolvedPath— inserts a packet withresolved_pathcontaining a relay pubkey that does not appear indecoded_json, callsLoad(), and verifiesbyNode[relay_pubkey]is populated.Related
Closes #692 (together with #707, #708, #711 already merged)
🤖 Generated with Claude Code