Pull request overview
This PR focuses on increasing Raft log processing throughput by reducing wake-ups in the Raft run loop and batching inbound message handling, while also applying related optimizations across configuration, RocksDB/Raft storage, journal apply ordering, and logging/assert cleanup.
Changes:
- Reworked Raft node run loop to use `tokio::select!` with a tick interval plus batch-drain of queued messages (`raft_batch_size`).
- Optimized Raft/RocksDB storage behaviors (range delete for log compaction paths, write options usage, updated flush behavior).
- Adjusted journal apply bookkeeping ordering and reduced logging / redundant asserts in inode/fs paths.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| curvine-common/src/conf/journal_conf.rs | Removes poll interval config, adds batching config, enables batch_append, adjusts flush batch timing defaults. |
| curvine-common/src/raft/raft_node.rs | Implements select-based ticking + batched recv; reorders on_ready processing; snapshots triggered by FSM op_id. |
| curvine-common/src/raft/storage/rocks_storage_core.rs | Makes compaction bounds safer (error instead of panic) and switches entry deletion to RocksDB range deletes. |
| curvine-common/src/rocksdb/db_engine.rs | Changes flush behavior based on WAL and uses write_opt for batched writes. |
| curvine-common/src/rocksdb/write_batch.rs | Adds delete_range_cf helper used by Raft log storage. |
| curvine-server/src/master/journal/journal_loader.rs | Updates applied op_id/rpc_id earlier per batch iteration. |
| curvine-server/src/master/journal/journal_writer.rs | Removes per-entry debug log and enriches snapshot info log fields. |
| curvine-server/src/master/meta/fs_dir.rs | Removes redundant asserts in inode lookup/handling paths. |
| curvine-server/src/master/meta/inode/inode_path.rs | Removes redundant assert in clone_last_file(). |
d1ab83c to 4606c58
- Raft run loop: switch to tokio::select! (ticker + recv), batch-drain up to raft_batch_size messages per wake-up via try_recv() to improve log throughput
- Config: replace raft_poll_interval_ms with raft_batch_size; enable batch_append; reduce writer_flush_batch_ms default 100→10
- on_ready: persist entries then send_messages before apply_committed_entries so followers are notified earlier; snapshot trigger uses op_id instead of applied index
- RocksDB: delete_entry uses delete_range_cf; flush uses write_opt; add WriteBatch::delete_range_cf
- Journal: apply op_id/rpc_id at start of batch step; reduce log noise (drop send_entry debug, add inode_id to snapshot info); remove redundant asserts in fs_dir and inode_path
szbr9486 approved these changes on Mar 16, 2026
Summary
This PR improves Raft log processing throughput and applies related config, storage, and logging changes. The main change is the Raft run loop: use `tokio::select!` over a ticker and `receiver.recv()`, and when a message arrives, batch-drain up to `raft_batch_size` pending messages in one pass. Additional edits touch config, RocksDB/Raft storage, journal apply order, and log/assert cleanup.

Files changed:

| File | Changes |
|---|---|
| curvine-common/src/conf/journal_conf.rs | Remove `raft_poll_interval_ms`, add `raft_batch_size`; enable `batch_append`; `writer_flush_batch_ms` 100→10 |
| curvine-common/src/raft/raft_node.rs | |
| curvine-common/src/raft/storage/rocks_storage_core.rs | `delete_entry` (range delete) |
| curvine-common/src/rocksdb/db_engine.rs | `flush` order; `write_batch` with `write_opt` |
| curvine-common/src/rocksdb/write_batch.rs | Add `delete_range_cf` |
| curvine-server/src/master/journal/journal_loader.rs | Apply `op_id`/`rpc_id` at start of batch step |
| curvine-server/src/master/journal/journal_writer.rs | |
| curvine-server/src/master/meta/fs_dir.rs | Remove `assert!(!inode.is_file_entry())` |
| curvine-server/src/master/meta/inode/inode_path.rs | Remove `assert!(!v.is_file_entry())` in `clone_last_file` |

1. Raft run loop and config (`raft_node.rs`, `journal_conf.rs`)
- Run loop: Replaced the previous pattern (poll_interval + timeout around `recv()`, then tick) with `tokio::select! { biased; ticker.tick() => raw.tick(); recv() => handle one + batch-drain }`. When the channel has backlog, each wake-up can process up to `raft_batch_size` (default 8) messages via `try_recv()`, reducing wake-ups and improving log throughput.
- Config: Removed `raft_poll_interval_ms`; added `raft_batch_size` (default 8). Enabled `batch_append: true` in Raft config. Reduced default `writer_flush_batch_ms` from 100 to 10.
- Snapshot: Snapshot trigger now uses `op_id` from FSM state (`last_snapshot_op_id`) instead of applied index (`last_snapshot_applied`).

2. on_ready order (`raft_node.rs`)
- Order: Persist new log entries and then send `persisted_messages` before applying committed entries, so followers can be notified earlier; FSM apply of already-committed entries no longer blocks the persistence pipeline.

3. Raft storage compact and delete (`rocks_storage_core.rs`, `write_batch.rs`, `db_engine.rs`)
- If `compact_index > last_index()`, return an error instead of panicking; condition changed from `> last_index() + 1` to `> last_index()` so we do not compact past the last log entry.
- Entry deletion uses a range delete (`delete_range_cf`) instead of a per-key delete loop. Added `WriteBatch::delete_range_cf` in `write_batch.rs`.
- `flush()`: when WAL is enabled call `flush_wal(sync)`, otherwise `flush_mem(sync)`. `write_batch()` now uses `write_opt(batch, &self.write_opt)`.

4. Journal apply and logging (`journal_loader.rs`, `journal_writer.rs`)
- Set `applied.op_id` and `applied.rpc_id` at the start of each batch loop iteration (before the match on the entry), so applied position is updated per entry.
- Removed the per-entry `debug!("send_entry ...")`. Leader snapshot info log now includes `inode_id` and the existing cost/entries/dir fields.

5. Assert and inode cleanup (`fs_dir.rs`, `inode_path.rs`)
- Removed `assert!(!inode.is_file_entry())` (in the status and path resolution paths).
- Removed `assert!(!v.is_file_entry())` in `clone_last_file()`.
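
For reviewers who want to see the batch-drain step from section 1 in isolation, here is a minimal sketch. It is not the code in `raft_node.rs`: it uses `std::sync::mpsc` as a stand-in for the tokio channel so it runs without extra crates, the function name `handle_batch` and its `handle` callback are hypothetical, and it assumes the message that woke the loop counts toward the `raft_batch_size` limit.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

/// Handle `first` (the message that woke the loop), then drain further
/// queued messages without blocking until `batch_size` messages have been
/// handled or the queue is empty. Returns how many messages were handled.
/// Hypothetical helper; the real loop does this inside a select! branch.
fn handle_batch<T>(
    first: T,
    rx: &Receiver<T>,
    batch_size: usize,
    mut handle: impl FnMut(T),
) -> usize {
    handle(first);
    let mut handled = 1;
    while handled < batch_size {
        match rx.try_recv() {
            Ok(msg) => {
                handle(msg);
                handled += 1;
            }
            // Empty: no backlog left. Disconnected: every sender was dropped.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    handled
}
```

With a backlog larger than the batch size, one wake-up handles exactly `batch_size` messages and leaves the rest for the next one, so the tick branch still gets a chance to run between batches.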
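
The compaction bound check and the range delete from section 3 can be sketched in the same spirit. This is an illustration under stated assumptions, not the code in `rocks_storage_core.rs`: `LogStore` is a hypothetical in-memory stand-in (a `BTreeMap` keyed by log index) for the RocksDB column family, the error type is a plain `String`, and compaction is assumed to drop every entry below `compact_index`. The range is half-open, `[from, to)`, which matches how RocksDB range deletes treat their end key.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the Raft log column family
/// (log index -> entry bytes).
struct LogStore {
    entries: BTreeMap<u64, Vec<u8>>,
}

impl LogStore {
    /// Highest stored log index, or 0 when the log is empty.
    fn last_index(&self) -> u64 {
        self.entries.keys().next_back().copied().unwrap_or(0)
    }

    /// Remove every entry whose index is in [from, to) with one call,
    /// instead of issuing one delete per key.
    fn delete_range(&mut self, from: u64, to: u64) {
        self.entries.retain(|k, _| *k < from || *k >= to);
    }

    /// Drop entries below `compact_index` (assumed semantics). A request
    /// past the last log entry is rejected with an error, not a panic.
    fn compact(&mut self, compact_index: u64) -> Result<(), String> {
        let last = self.last_index();
        if compact_index > last {
            return Err(format!(
                "compact index {} is past last index {}",
                compact_index, last
            ));
        }
        self.delete_range(0, compact_index);
        Ok(())
    }
}
```

Because the check is `> last_index()` and not `> last_index() + 1`, compacting at `last_index()` succeeds and keeps the last entry, while `last_index() + 1` is now an error.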