(Requires Audit) PLEX-2473 LogPoller Fix Reorg Handing on Replay (Part 1)#355
(Requires Audit) PLEX-2473 LogPoller Fix Reorg Handing on Replay (Part 1)#355dhaidashenko wants to merge 3 commits intodevelopfrom
Conversation
|
73f8c25 to
fd687cd
Compare
fd687cd to
abead20
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes reorg (reorganization) handling in the LogPoller when replay operations encounter blocks that have been reorged. The core issue was that when LogPoller processed blocks up to block N, then a reorg replaced block N, and a replay was initiated from a block below N, the old and new versions of block N could coexist in the database, or a false finality violation could occur.
Changes:
- Added
isReplayboolean parameter toPollAndSaveLogsmethod and related internal functions to enable special reorg detection logic during replay operations - Extracted
headerByNumberhelper method to reduce code duplication when fetching block headers from RPC - Enhanced
getCurrentBlockMaybeHandleReorgto detect reorgs during replay by checking if current and latest blocks in DB match their RPC counterparts - Refactored reorg handling logic into a new
handleReorgmethod for better code organization - Added comprehensive test case covering three scenarios: replay immediately after reorg, replay a few blocks after reorg, and replay after reorged block is finalized
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| pkg/logpoller/log_poller.go | Core implementation changes: added isReplay parameter, extracted headerByNumber and handleReorg methods, enhanced reorg detection logic for replay scenarios |
| pkg/logpoller/log_poller_test.go | Updated all test calls to include isReplay=false parameter, added comprehensive TestLogPoller_Reorg_On_Replay test with three test cases |
| pkg/logpoller/log_poller_internal_test.go | Updated all test calls to include isReplay=false parameter |
| pkg/logpoller/helper_test.go | Updated PollAndSaveLogs wrapper to pass isReplay=false |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f60d8b4
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| require.Len(t, logs, 1) | ||
| require.Equal(t, newLogData, big.NewInt(0).SetBytes(logs[0].Data).Int64(), "Log data should match the log from the new block, indicating that the old block's log was properly removed during replay") | ||
| // Ensure reorged block was replaced by a new one | ||
| dbBlock, err := th.ORM.SelectBlockByNumber(testutils.Context(t), reorgedBlock.Number().Int64()) |
There was a problem hiding this comment.
Inconsistent context usage: This line uses testutils.Context(t) while the rest of the test function uses t.Context() (lines 2248, 2251, 2254). For consistency and to ensure the context is properly tied to the test lifecycle, consider using t.Context() here as well.
| dbBlock, err := th.ORM.SelectBlockByNumber(testutils.Context(t), reorgedBlock.Number().Int64()) | |
| dbBlock, err := th.ORM.SelectBlockByNumber(t.Context(), reorgedBlock.Number().Int64()) |
| require.NoError(t, err) | ||
| require.Equal(t, int64(reorgedBlockNumber), reorgedBlock.Number().Int64()) | ||
|
|
||
| // Replace block 11 with a new block and bury it under 1 new block |
There was a problem hiding this comment.
Misleading comment: The comment says "bury it under 1 new block" but the actual number of blocks added after the reorg depends on tc.numberOfBlocksAfterReorg, which varies per test case (0, 1, or 5). Consider updating the comment to accurately reflect this, such as "Replace block 11 with a new block and optionally add additional blocks on top of it".
| // Replace block 11 with a new block and bury it under 1 new block | |
| // Replace the reorged block with a new block and add a variable number of blocks on top |
| if pkgerrors.Is(err, sql.ErrNoRows) { | ||
| lp.lggr.Criticalw("Unexpected state. Expected at least one block to be present in the db when handling reorg, but got no rows", "currentBlockNumber", currentBlock.Number, "err", err) | ||
| } | ||
| return nil, pkgerrors.Wrap(err, "failed to get latest finalized block from db") |
There was a problem hiding this comment.
Misleading error message: The error message says "failed to get latest finalized block from db" but the function actually calls orm.SelectLatestBlock (line 1063), not SelectLatestFinalizedBlock. The error message should be updated to "failed to get latest block from db" for accuracy.
| return nil, pkgerrors.Wrap(err, "failed to get latest finalized block from db") | |
| return nil, pkgerrors.Wrap(err, "failed to get latest block from db") |
|
|
||
| // PollAndSaveLogs On startup/crash current is the first block after the last processed block. | ||
| // currentBlockNumber is the block from where new logs are to be polled & saved. Under normal | ||
| // conditions this would be equal to lastProcessed.BlockNumber + 1. |
There was a problem hiding this comment.
Missing parameter documentation: The function comment should document the new isReplay parameter, explaining that it should be true when called during a replay operation and false during normal polling. This helps future maintainers understand when to use each value.
| // conditions this would be equal to lastProcessed.BlockNumber + 1. | |
| // conditions this would be equal to lastProcessed.BlockNumber + 1. The isReplay flag should be | |
| // true when PollAndSaveLogs is called as part of a replay operation and false during normal polling. |
TestCase that were previously failing:
Expected behaviour:
Supports: #356