Parallel input fetcher + no seek compaction + Flush and compact after IBD#180
Draft
l0rinc wants to merge 17 commits into
Draft
Parallel input fetcher + no seek compaction + Flush and compact after IBD#180l0rinc wants to merge 17 commits into
l0rinc wants to merge 17 commits into
Conversation
Introduce CoinsViewOverlay::StartFetching, which maps all input prevouts of a block to a new m_inputs vector of InputToFetch elements. Returns a ResetGuard which is lifetime bound to the block, while the InputToFetch elements are lifetime bound to the block as well. Introduce StopFetching to clear the m_inputs vector. CCoinsViewCache::Reset is made virtual and is overridden in CoinsViewOverlay. StopFetching is called on Reset, so the InputToFetch objects will not exceed the lifetime of the block. Introduce ProcessInput to fetch the utxo of an individual input in m_inputs. Each caller fetches the input at m_input_head and increments it, so each call will fetch the next input in the queue. Fetch coins from the m_inputs vector in FetchCoinFromBase by scanning all inputs until we discover the input with the correct outpoint. This is designed deliberately so multiple threads can call ProcessInput independently. Co-authored-by: l0rinc <pap.lorinc@gmail.com> Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Inputs spending outputs of an earlier transaction in the same block won't be in the cache or the db. They also won't be requested by FetchCoinFromBase, so we can filter them out to not waste time trying to fetch them. Build an unordered set of seen txids while flattening m_inputs and skip any prevout whose hash is already in the set. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Provides a worst-case upper bound on the number of inputs that can fit in a block, so callers (e.g. parallel input prefetching) can pre-allocate stable storage and rule out reallocation of per-input state. Cherry-picked from PR bitcoin#9938 (Lock-Free CheckQueue), with MAX_TXINS_PER_BLOCK renamed to MAX_INPUTS_PER_BLOCK to match the call site. Co-authored-by: Jeremy Rubin <jeremy.l.rubin@gmail.com>
Prepares for ProcessInput to be called from multiple threads. This flag acts as a memory fence around InputToFetch::coin. There is no lock guarding reads and writes of the coin field. Instead we use the flag's release/acquire semantics to ensure that when the main thread reads the coin it will have happened after a worker thread has finished writing it. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Prepares for ProcessInput to be called from multiple threads. ProcessInput reads from base. For ProcessInput to be safe to call in parallel on separate threads, it must not be mutated. Flush, Sync, and SetBackend can modify base, so we override these and StopFetching before calling the base class. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Add a configuration option for the number of worker threads used for parallel UTXO input fetching during block connection. Default is 4 threads, max is 16, 0 disables parallel fetching.
Prepares for ProcessInput to be called from multiple threads. Introduce a ThreadPool shared pointer to CoinsViewOverlay. A pool managed externally can be passed in the constructor. A global thread pool is used in fuzz harnesses since iterations can happen faster than the OS can create and tear down thread pools. This can cause a memory leak when fuzzing. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Leverages the thread pool to fetch inputs on multiple threads, while the overlay serves inputs on the main thread. This is a performance improvement over blocking the main thread to fetch inputs. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Co-authored-by: l0rinc <pap.lorinc@gmail.com> Co-authored-by: sedited <seb.kung@gmail.com>
Log the chainstate LevelDB per-level file counts and stats table immediately before closing the database. This leaves a final shutdown snapshot in debug.log for reindex-chainstate benchmark comparisons.
Seek compaction is causing a cascade effect in the chainstate DB, causing large parts of the database to be rewritten every ~hour. Every periodic flush writes around 2 MiB. Since this is roughly the `write_buffer_size`, these writes regularly cause the memtable to rotate into a small L0 file. This file has a small seek budget, and with the random UTXO reads done during validation, it can get scheduled for seek compaction quickly. That seek compaction pushes the small file down to L1. Since most UTXOs are already lower down in L4/L5, many reads that consult this file do not find the key there and continue downward. The bloom filter makes those misses cheap, but LevelDB still decrements the file's seek budget. The file then gets scheduled for another seek compaction, and the same pattern pushes it down through L2 and L3. The expensive part happens around L3/L4. L4 has many ~32 MiB files holding the bulk of the UTXO set. When LevelDB compacts into L3, it may split the output into many smaller L3 files to limit how much L4 "grandparent" data any one output overlaps. Each of these small L3 files then gets its own small seek budget. Because chainstate keys are hash-random, each small L3 file can still have a broad key range, so many random reads consult it and quickly drain its budget. Once seek-compacted into L4, each tiny L3 file can overlap many L4 files, so compacting a few hundred KiB from L3 can require rewriting hundreds of MiB from L4. Repeating that across many small L3 files can rewrite most of the chainstate. This is a poor fit for chainstate because UTXO keys are hash-random, the DB is large enough to have many levels, writes are relatively small and periodic, and reads are frequent. The result is that read misses trigger compactions much earlier than size pressure would, and those compactions have very high write amplification. Disabling seek compaction may leave more files in upper levels for longer, so reads could theoretically consult more files. But Bitcoin Core enables bloom filters for all its LevelDB instances, so these misses are usually cheap in-memory filter checks rather than disk reads. For the other DBs, the risk is much smaller. They also use bloom filters, and most are smaller and less read-heavy. With fewer levels and less random read pressure, disabling seek compaction should have little effect there.
`SaltedOutpointHasher` has been `noexcept` since bitcoin#16957, which makes libstdc++ omit cached hash codes from `std::unordered_map` nodes. That saves one `size_t` per node, but it also makes `CCoinsMap` recompute the outpoint hash in table operations that could otherwise reuse the cached code. Declare the operator `noexcept(false)` to restore libstdc++ hash-code caching for this fast hash functor. The implementation still does not throw; only the exception specification used by the container policy changes. This restores a value that bitcoin#16957 deliberately removed, but the pool-backed `CCoinsMap` allocation budget still accounts for implementations storing hash values. The existing `PoolAllocator` comment says that "in some cases the hash value is stored as well" and that "sizeof(void*)*4" overhead "should thus be sufficient so that all implementations can allocate the nodes from the PoolAllocator." Add a unit test for the exception-specification contract and the libstdc++ fast-hash classification. Revert "make SaltedOutpointHasher noexcept" This reverts commit 67d9990.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.