WIP core perf optimizations #5236
Draft
dmkozh wants to merge 103 commits into stellar:master
Instead of the main thread waiting idle while worker threads process all clusters, have the main thread process cluster 0 directly. This improves CPU utilization by eliminating idle time on the main thread.
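A minimal sketch of the idea, not the code from this PR: clusters 1..N-1 are handed to worker threads while the calling thread processes cluster 0 itself. `Cluster` and `applyClusters` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for a cluster of transactions.
using Cluster = std::vector<int>;

// Applies `apply` to every cluster and returns one result per cluster.
// Clusters 1..N-1 run on worker threads; cluster 0 runs on the calling
// thread so that thread does useful work instead of blocking.
inline std::vector<long>
applyClusters(std::vector<Cluster> const& clusters,
              std::function<long(Cluster const&)> const& apply)
{
    std::vector<long> results(clusters.size());
    if (clusters.empty())
    {
        return results;
    }
    std::vector<std::future<long>> workers;
    for (std::size_t i = 1; i < clusters.size(); ++i)
    {
        workers.push_back(std::async(
            std::launch::async,
            [&clusters, &apply, i] { return apply(clusters[i]); }));
    }
    // The calling thread handles cluster 0 while the workers are busy.
    results[0] = apply(clusters[0]);
    for (std::size_t i = 1; i < clusters.size(); ++i)
    {
        results[i] = workers[i - 1].get();
    }
    return results;
}
```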
This reverts commit 5f43890.
Track which keys existed in the LedgerTxn before parallel apply via mOriginalLedgerTxnKeys. Use this to call createWithoutLoading() or updateWithoutLoading() instead of expensive load() calls during commit. Also clone snapshots from GlobalParallelApplyLedgerState instead of re-acquiring from the snapshot manager, ensuring consistency. # Conflicts: # src/transactions/ParallelApplyUtils.cpp
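The key-tracking idea can be sketched roughly as follows. `ToyLedgerTxn` and `commitChanges` are hypothetical; only the method names `createWithoutLoading` and `updateWithoutLoading` come from the commit message, and their real signatures are not shown here.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Toy model of a ledger transaction; not the real LedgerTxn API.
struct ToyLedgerTxn
{
    std::unordered_map<std::string, int> entries;
    int loads = 0; // counts expensive lookups

    bool
    load(std::string const& key)
    {
        ++loads;
        return entries.count(key) != 0;
    }
    void
    createWithoutLoading(std::string const& key, int value)
    {
        entries.emplace(key, value);
    }
    void
    updateWithoutLoading(std::string const& key, int value)
    {
        entries[key] = value;
    }
};

// Commits `changes` using the set of keys recorded before parallel apply
// to pick create vs. update, so load() is never called.
inline void
commitChanges(ToyLedgerTxn& ltx,
              std::unordered_set<std::string> const& originalKeys,
              std::unordered_map<std::string, int> const& changes)
{
    for (auto const& [key, value] : changes)
    {
        if (originalKeys.count(key) != 0)
        {
            ltx.updateWithoutLoading(key, value);
        }
        else
        {
            ltx.createWithoutLoading(key, value);
        }
    }
}
```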
Replace xdrSha256(success) with streaming SHA256 calculation to avoid XDR re-serialization of InvokeHostFunctionSuccessPreImage. The return value and events are already available as XDR-encoded bytes, so we can hash them directly without round-trip serialization.
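To illustrate why streaming works, here is a self-contained sketch in which FNV-1a stands in for SHA-256 (an assumption made only to avoid an external dependency). `StreamingHasher` and `hashAll` are hypothetical names. Feeding the already-encoded pieces one at a time yields the same digest as hashing their concatenation, so no intermediate buffer is needed.

```cpp
#include <cstdint>
#include <string>

// Incremental hasher. FNV-1a (64-bit) is used here purely as a stand-in
// for SHA-256; the streaming property is the same.
class StreamingHasher
{
    std::uint64_t mState = 14695981039346656037ULL; // FNV offset basis

  public:
    void
    add(std::string const& bytes)
    {
        for (unsigned char c : bytes)
        {
            mState ^= c;
            mState *= 1099511628211ULL; // FNV prime
        }
    }
    std::uint64_t
    finish() const
    {
        return mState;
    }
};

// One-shot hash over a fully materialized buffer, for comparison.
inline std::uint64_t
hashAll(std::string const& bytes)
{
    StreamingHasher h;
    h.add(bytes);
    return h.finish();
}
```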
…nConfig Allows callers with a pre-fetched SorobanNetworkConfig to pass it directly, avoiding redundant config lookups during validation. The original overload now delegates to the new one after fetching the config. # Conflicts: # src/transactions/TransactionFrame.cpp
This variable was declared but never used.
# Conflicts: # src/ledger/LedgerManagerImpl.cpp
Adds parallel processing to transaction set handling:

1. Parallel TxFrame creation: Creates TxFrames from XDR envelopes in parallel during transaction set deserialization. Uses work-stealing via std::async with even distribution across available threads.
2. Parallel transaction validation: Validates transactions in parallel in txsAreValid() when there are 2+ transactions.
3. Hash precomputation: Precomputes content and full hashes before parallel operations to avoid race conditions.
4. Test coverage: Adds StreamingShaTest for InvokeHostFunctionSuccessPreImage verification.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Conflicts: # src/herder/TxSetFrame.cpp
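A rough sketch of the parallel validation step, assuming transactions can be checked independently. It splits the input into even chunks and runs one `std::async` task per chunk; `allValidParallel` and the `int` stand-in for a transaction are hypothetical, and the sequential fallback below 2 transactions mirrors the threshold mentioned in the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Returns true if every transaction passes `isValid`. Transactions are
// modeled as ints for illustration only.
inline bool
allValidParallel(std::vector<int> const& txs,
                 std::function<bool(int)> const& isValid,
                 std::size_t numThreads)
{
    // Not worth spawning tasks for fewer than two transactions.
    if (txs.size() < 2 || numThreads < 2)
    {
        return std::all_of(txs.begin(), txs.end(), isValid);
    }
    // Even distribution: ceil(size / numThreads) items per task.
    std::size_t chunk = (txs.size() + numThreads - 1) / numThreads;
    std::vector<std::future<bool>> parts;
    for (std::size_t begin = 0; begin < txs.size(); begin += chunk)
    {
        std::size_t end = std::min(begin + chunk, txs.size());
        parts.push_back(std::async(
            std::launch::async, [&txs, &isValid, begin, end] {
                return std::all_of(txs.begin() + begin, txs.begin() + end,
                                   isValid);
            }));
    }
    bool ok = true;
    for (auto& p : parts)
    {
        // Always call get() so every task is joined before returning.
        ok = p.get() && ok;
    }
    return ok;
}
```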
Add sizeBytes field to ContractDataMapEntryT to cache the XDR serialized size of ledger entries. This avoids repeated xdr_size() calls during state updates, reducing CPU overhead in the hot path. Also adds Tracy zone to updateState() for profiling visibility. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
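The caching idea, sketched with toy types: the serialized size is computed once when an entry is stored and reused when the entry is later replaced. `ToyEntry`, `serializedSize`, and `ToyState` are hypothetical; `serializedSize` is a stand-in for an XDR size computation, not the real one.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

struct ToyEntry
{
    std::string payload;
};

// Stand-in for computing the serialized size of an entry
// (a 4-byte length prefix plus the payload, for illustration only).
inline std::size_t
serializedSize(ToyEntry const& e)
{
    return 4 + e.payload.size();
}

// Entry stored together with its cached serialized size.
struct CachedEntry
{
    ToyEntry entry;
    std::size_t sizeBytes;
};

class ToyState
{
    std::unordered_map<std::string, CachedEntry> mEntries;
    std::size_t mTotalSize = 0;

  public:
    void
    upsert(std::string const& key, ToyEntry e)
    {
        std::size_t newSize = serializedSize(e); // computed once per write
        auto it = mEntries.find(key);
        if (it != mEntries.end())
        {
            // Reuse the cached size of the old entry; no re-serialization.
            mTotalSize -= it->second.sizeBytes;
            it->second = CachedEntry{std::move(e), newSize};
        }
        else
        {
            mEntries.emplace(key, CachedEntry{std::move(e), newSize});
        }
        mTotalSize += newSize;
    }
    std::size_t
    totalSize() const
    {
        return mTotalSize;
    }
};
```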
During ledger close, three independent operations are now parallelized:

- addHotArchiveBatch (modifies mHotArchiveBucketList)
- addLiveBatch (modifies mLiveBucketList) - runs on main thread
- updateInMemorySorobanState (modifies mInMemorySorobanState)

These operations modify completely independent data structures and can safely run concurrently. Added getInMemorySorobanStateForUpdate() to allow direct access to mInMemorySorobanState during COMMITTING phase. This reduces ledger close latency by overlapping CPU-bound operations.

# Conflicts: # src/ledger/LedgerManagerImpl.cpp
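A simplified sketch of this overlap, assuming each task touches exactly one structure. `ToyLedgerState` and `commitBatches` are hypothetical, and plain vectors stand in for the bucket lists and the in-memory Soroban state.

```cpp
#include <future>
#include <vector>

// Three independent pieces of state; each is written by exactly one task.
struct ToyLedgerState
{
    std::vector<int> hotArchive;
    std::vector<int> live;
    std::vector<int> inMemory;
};

inline void
commitBatches(ToyLedgerState& state, std::vector<int> const& hotBatch,
              std::vector<int> const& liveBatch,
              std::vector<int> const& memBatch)
{
    auto hotDone = std::async(std::launch::async, [&state, &hotBatch] {
        state.hotArchive.insert(state.hotArchive.end(), hotBatch.begin(),
                                hotBatch.end());
    });
    auto memDone = std::async(std::launch::async, [&state, &memBatch] {
        state.inMemory.insert(state.inMemory.end(), memBatch.begin(),
                              memBatch.end());
    });
    // The live batch is applied on the calling (main) thread.
    state.live.insert(state.live.end(), liveBatch.begin(), liveBatch.end());
    // Join both background tasks before the commit is considered done.
    hotDone.get();
    memDone.get();
}
```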
That's because it doesn't properly commit changes and we can't share a snapshot across threads. There must be a better way around this, though preferably we should just fix the tests to not use in-memory mode at all.
…_map" This reverts commit 97a431a.
…nordered_map"" This reverts commit 225f583.
This reverts commit d08c4a6.
…t rid of virtual dispatch""" This reverts commit 5f9634b.
…unordered_map"" This reverts commit a0cfe2a.
…et with unordered_map""" This reverts commit eb661ec.
…ion + get rid of virtual dispatch"""" This reverts commit 338e585.
…Xs"" This reverts commit 73489b4.
resolveBackgroundEvictionScan previously received an UnorderedSet<LedgerKey> built by getAllKeysWithoutSealing() containing ~128K entries (~20ms to build), but only performed ~10-100 lookups. Added isModifiedKey() to LedgerTxn for direct O(1) lookups in the existing EntryMap, eliminating the set construction. resolveEviction zone: 20ms -> 0.116ms per ledger (99.4% reduction). TPS: 18,944 -> 19,328 avg (+2.0%). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
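The change can be pictured with toy types: rather than copying every key into a temporary set and probing it a few times, the caller asks the existing map directly. `ToyLedgerTxn` and `keepUnmodified` are hypothetical, and this sketch assumes that eviction candidates modified in the current ledger are skipped.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

class ToyLedgerTxn
{
    std::unordered_map<std::string, int> mEntries;

  public:
    void
    put(std::string const& key, int value)
    {
        mEntries[key] = value;
    }
    // Old approach: O(n) copy of every key into a fresh set.
    std::unordered_set<std::string>
    getAllKeys() const
    {
        std::unordered_set<std::string> keys;
        for (auto const& kv : mEntries)
        {
            keys.insert(kv.first);
        }
        return keys;
    }
    // New approach: average O(1) lookup in the map that already exists.
    bool
    isModifiedKey(std::string const& key) const
    {
        return mEntries.find(key) != mEntries.end();
    }
};

// Returns the candidates that were not modified, without building a set.
inline std::vector<std::string>
keepUnmodified(ToyLedgerTxn const& ltx,
               std::vector<std::string> const& candidates)
{
    std::vector<std::string> out;
    for (auto const& key : candidates)
    {
        if (!ltx.isModifiedKey(key))
        {
            out.push_back(key);
        }
    }
    return out;
}
```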
Replace single global mutex + RandomEvictionCache with 16 sharded caches, each with its own mutex. This eliminates contention when 4 parallel threads verify signatures simultaneously. Also use maybeGet() instead of exists()+get() double-lookup, fix ZoneText string heap allocations, make counters atomic, and remove unused liveSnapshot copy in applySorobanStageClustersInParallel.
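A minimal sketch of the sharding scheme, assuming keys are spread across shards by hash. `ShardedCache` is a hypothetical class, and eviction is omitted entirely, so this is not a replacement for `RandomEvictionCache`. The `maybeGet` method shows the single-lookup pattern that replaces an exists()+get() pair.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

class ShardedCache
{
    static constexpr std::size_t kShards = 16;

    // Each shard has its own lock, so threads hitting different shards
    // never contend with each other.
    struct Shard
    {
        std::mutex mutex;
        std::unordered_map<std::string, bool> map;
    };
    std::array<Shard, kShards> mShards;

    Shard&
    shardFor(std::string const& key)
    {
        return mShards[std::hash<std::string>{}(key) % kShards];
    }

  public:
    void
    put(std::string const& key, bool value)
    {
        Shard& shard = shardFor(key);
        std::lock_guard<std::mutex> guard(shard.mutex);
        shard.map[key] = value;
    }
    // One lookup under one lock, instead of exists() followed by get().
    std::optional<bool>
    maybeGet(std::string const& key)
    {
        Shard& shard = shardFor(key);
        std::lock_guard<std::mutex> guard(shard.mutex);
        auto it = shard.map.find(key);
        if (it == shard.map.end())
        {
            return std::nullopt;
        }
        return it->second;
    }
};
```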
Sort lightweight 24-byte EntryRef structs (type tag + pointer) instead of full BucketEntry objects (200-500 bytes) in convertToBucketEntry. Reduces sort swap cost by ~12x and materializes final vector in one cache-friendly sequential pass. Cuts convertToBucketEntry from 31.9ms to 25.4ms per ledger. Benchmark: 13,760 -> 14,144 TPS (+384 TPS, +2.8%)
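The technique, sketched with toy types: sort small reference structs, then copy the large objects once in sorted order. `BigEntry`, `EntryRef`, and `sortedCopy` here are hypothetical; the real `EntryRef` holds a type tag and pointer, whereas this sketch uses an integer sort key.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Stand-in for a large entry that is expensive to swap.
struct BigEntry
{
    int key;
    std::array<char, 256> payload;
};

// Lightweight handle: sort key plus a pointer to the full entry.
struct EntryRef
{
    int key;
    BigEntry const* entry;
};

inline std::vector<BigEntry>
sortedCopy(std::vector<BigEntry> const& input)
{
    std::vector<EntryRef> refs;
    refs.reserve(input.size());
    for (auto const& e : input)
    {
        refs.push_back(EntryRef{e.key, &e});
    }
    // Swaps during the sort move small EntryRef values, not BigEntry.
    std::sort(refs.begin(), refs.end(),
              [](EntryRef const& a, EntryRef const& b) {
                  return a.key < b.key;
              });
    // Materialize the final vector in one sequential pass.
    std::vector<BigEntry> out;
    out.reserve(input.size());
    for (auto const& r : refs)
    {
        out.push_back(*r.entry);
    }
    return out;
}
```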
Skip building LedgerTxnDelta in setEffectsDeltaFromSuccessfulTx when INVARIANT_CHECKS is empty. The delta is consumed exclusively by checkOnOperationApply which iterates an empty list when no invariants are configured. This eliminates ~285ms of shared_ptr allocations and entry copies across 4 worker threads per ledger. Benchmark: 12,736 -> 13,760 TPS (+1,024 TPS, +8.0%)
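In sketch form, the guard looks like the following. `ToyDelta`, `Invariant`, and `maybeBuildDelta` are hypothetical names; the point is only that the delta is not constructed when nothing will consume it.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Stand-in for a delta holding copies of changed entries.
struct ToyDelta
{
    std::vector<std::string> entries;
};

using Invariant = std::function<void(ToyDelta const&)>;

// Builds the delta only when at least one invariant is configured;
// otherwise the copies and allocations are skipped entirely.
inline std::optional<ToyDelta>
maybeBuildDelta(std::vector<Invariant> const& invariants,
                std::vector<std::string> const& changedEntries)
{
    if (invariants.empty())
    {
        return std::nullopt;
    }
    return ToyDelta{changedEntries};
}
```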
This reverts commit e3225f4. The budget optimization now seems slightly positive, but that wasn't reproduced on an AWS instance; in any case, the impact is pretty low.
…ads to subtle bugs and it's not clear how to fix these cleanly. Probably some redesign is necessary.
No description provided.