Skip to content

fix: re-index relay hops in byNode after Load() picks best observation (#692)#801

Open
efiten wants to merge 1 commit intoKpa-clawbot:masterfrom
efiten:fix/load-relay-index-692
Open

fix: re-index relay hops in byNode after Load() picks best observation (#692)#801
efiten wants to merge 1 commit intoKpa-clawbot:masterfrom
efiten:fix/load-relay-index-692

Conversation

@efiten
Copy link
Copy Markdown
Contributor

@efiten efiten commented Apr 20, 2026

Problem

indexByNode() was called during Load() immediately when each StoreTx was created — before observations were appended and before pickBestObservation() set tx.ResolvedPath. The resolved_path indexing branch added in #708 was effectively dead code on every server restart.

Symptom: After any restart, byNode[relay_pubkey] was empty for relay-only nodes even when resolved_path was correctly persisted in the DB. Analytics showed totalPackets = 0 for repeater nodes despite active relay traffic.

Fix

Call s.indexByNode(tx) again in the post-load loop after pickBestObservation(), where ResolvedPath is populated. Same fix applied to backfillResolvedPathsAsync(), which also called pickBestObservation() without re-indexing afterward.

The dedup in nodeHashes prevents double-counting: pubkeys already indexed from decoded JSON fields are skipped; only the relay hop pubkeys from resolved_path are new additions.

Test

TestLoadIndexesRelayHopsFromResolvedPath — inserts a packet with resolved_path containing a relay pubkey that does not appear in decoded_json, calls Load(), and verifies byNode[relay_pubkey] is populated.

Related

Closes #692 (together with #707, #708, #711 already merged)

🤖 Generated with Claude Code

Copy link
Copy Markdown
Owner

@Kpa-clawbot Kpa-clawbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — REVIEW-ONLY (PR is CONFLICTING, do not merge as-is)

Verdict: Bug is real (#692 root cause well-diagnosed), test is excellent — but the fix mechanism is superseded by #806 which already landed on master. Recommend rebasing and keeping only the test, dropping the code changes.

Conflict scope

Rebasing onto origin/master conflicts in:

  • cmd/server/neighbor_persist.go (backfillResolvedPathsAsync block)
  • cmd/server/store.go (auto-merges, but the change is now dead code — see below)

Conflicts with #806 (9e90548 perf(#800): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode).

Why the fix is now obsolete

After #806, Load() indexes resolved-path pubkeys inline per-observation as rows are scanned (store.go:498–521): addToByNode + addToResolvedPubkeyIndex are called right when resolved_path is read off the row. indexByNode() was deliberately scoped down to only decoded_json fields (pubKey/destPubKey/srcPubKey) — it no longer touches resolved paths at all.

So the PR's post-pickBestObservation re-call of s.indexByNode(tx) is a no-op for relay hops on master. Same story for the neighbor_persist.go hunk — master's backfill path now uses removeFromResolvedPubkeyIndex / extractResolvedPubkeys / addToResolvedPubkeyIndex instead.

What to keep

TestLoadIndexesRelayHopsFromResolvedPath is a valuable regression test and almost certainly passes against master as-is (lock in #806's fix for #692). Please rebase, drop the store.go + neighbor_persist.go hunks, and re-run the test against master.

Correctness note (for the historical fix, dijkstra)

Even on the pre-#806 codebase, the fix only indexes pubkeys from the best (longest) observation's resolved_path, not from all observations. Relay nodes that only appear on shorter paths would still be missed. #806's per-observation indexing during Load() handles this correctly.

Perf (carmack)

N/A — superseded.

Action for author

  1. git fetch origin && git rebase origin/master
  2. Resolve by dropping the store.go + neighbor_persist.go changes.
  3. Verify the test passes on master (it should — #806 fixes #692).
  4. If it does, repurpose this PR as "test: regression for #692 (relay-hop indexing on Load)".

@Kpa-clawbot
Copy link
Copy Markdown
Owner

@efiten please clean up for merging or abandon.

…fter Load()

TestLoadIndexesRelayHopsFromResolvedPath inserts a packet whose
relay node appears only in resolved_path (not decoded_json), calls
Load(), and asserts byNode[relayPubkey] is populated.

Locks in the fix from Kpa-clawbot#806 (per-observation inline indexing during
Load()). Code changes originally in this PR are obsolete after Kpa-clawbot#806.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@efiten efiten force-pushed the fix/load-relay-index-692 branch from 0040e92 to 2643651 Compare May 1, 2026 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Loss of analytics for repeater/no additional analytics being reported on day of new release

2 participants