Skip to content

split blocksync reactor into 2 modes.#3511

Open
pompon0 wants to merge 13 commits into
mainfrom
gprusak-blocksync
Open

split blocksync reactor into 2 modes.#3511
pompon0 wants to merge 13 commits into
mainfrom
gprusak-blocksync

Conversation

@pompon0
Copy link
Copy Markdown
Contributor

@pompon0 pompon0 commented May 27, 2026

Autobahn nodes will need to be able to send pre-giga blocks to pre-giga nodes during transition period (so that pre-giga nodes can blocksync to the upgrade height). However all the remaining parts of the blocksync reactor should be disabled. This pr extracts the tendermint-only blocksync logic to syncController which is optional part of the blocksync reactor. Additionally I have

  • refactored the blocksync reactor to use structured concurrency
  • fixed bug in poolRoutine which was silently terminating blocksync in case of block validation failure (now it will retry fetching a block)
  • fixed busy loop in blocksync pool.run.

@cursor
Copy link
Copy Markdown

cursor Bot commented May 27, 2026

PR Summary

High Risk
Large refactor of blocksync lifecycle, P2P message routing, and block application/handoff to consensus; incorrect gating of sync vs query-only mode could break catch-up or upgrade-era block serving.

Overview
The blocksync Reactor is split into an always-on query responder (owns the single P2P channel; serves BlockRequest and StatusRequest from local store) and an optional syncController wired via utils.Option[SyncerConfig]. Active catch-up—block pool, outbound requests, apply/validate, consensus handoff, lag metrics, and self-remediation—lives only when that option is set; RPC depends on a slim Metricer interface instead of the concrete reactor type.

BlockPool and per-height requesters drop service.BaseService in favor of scope.Run, owned request/error channels (Requests / Errors), and utils.Watch for requester state. poolRoutine no longer exits on validation/commit failure: it evicts bad peers, redoes requests, and continues syncing (new tests cover this).

Node wiring passes sync deps through SyncerConfig; SwitchToBlockSync no longer takes a context. Minor: Option.Or is a value receiver; mempool tests use utils.TestRng for tx generation.

Reviewed by Cursor Bugbot for commit 4c1559e. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 27, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedMay 28, 2026, 12:52 PM

@codecov
Copy link
Copy Markdown

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 74.41860% with 88 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.22%. Comparing base (3936ac9) to head (4c1559e).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
sei-tendermint/internal/blocksync/reactor.go 64.88% 71 Missing and 8 partials ⚠️
sei-tendermint/internal/blocksync/pool.go 92.52% 6 Missing and 2 partials ⚠️
sei-tendermint/node/node.go 90.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3511      +/-   ##
==========================================
- Coverage   59.04%   58.22%   -0.83%     
==========================================
  Files        2199     2129      -70     
  Lines      182096   173896    -8200     
==========================================
- Hits       107513   101243    -6270     
+ Misses      64933    63668    -1265     
+ Partials     9650     8985     -665     
Flag Coverage Δ
sei-chain-pr 63.02% <74.41%> (?)
sei-db 70.41% <ø> (ø)
sei-db-state-db ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-tendermint/internal/rpc/core/env.go 76.15% <ø> (ø)
sei-tendermint/internal/statesync/reactor.go 70.88% <100.00%> (+0.42%) ⬆️
sei-tendermint/libs/utils/option.go 90.90% <100.00%> (ø)
sei-tendermint/node/node.go 65.37% <90.00%> (+0.17%) ⬆️
sei-tendermint/internal/blocksync/pool.go 88.23% <92.52%> (+3.70%) ⬆️
sei-tendermint/internal/blocksync/reactor.go 65.37% <64.88%> (+1.28%) ⬆️

... and 70 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread sei-tendermint/internal/blocksync/reactor.go
Comment thread sei-tendermint/internal/blocksync/reactor.go Outdated
Comment thread sei-tendermint/internal/blocksync/pool.go Outdated
Comment thread sei-tendermint/internal/blocksync/reactor.go
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 3206044. Configure here.

Comment thread sei-tendermint/internal/blocksync/pool.go
return err
switch update.Status {
case p2p.PeerStatusUp:
s.channel.Send(wrap(&pb.StatusRequest{}), update.NodeID)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Peer-up handler sends StatusRequest instead of StatusResponse

Medium Severity

On PeerStatusUp, the old code sent a StatusResponse (advertising local height to the new peer), enabling the remote node to immediately learn this node's height. The new code sends a StatusRequest instead, which asks the peer for their height. While this node still learns about the peer, the peer no longer receives an immediate height advertisement. Peers that rely on receiving a StatusResponse on connection (e.g., pre-giga nodes during transition) won't learn this node's height until the next periodic StatusRequest broadcast (every 10 seconds).

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3206044. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be a minor temporary regression until upgrade completes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant