Detached552#182
Draft
l0rinc wants to merge 24 commits into
Draft
Conversation
Seed commit: 4f15ea1 submitblock: Check transaction count early in submitblock The submitblock RPC deserializes caller-provided block data and, when the previous block is known, calls ChainstateManager::UpdateUncommittedBlockStructures before ProcessNewBlock runs normal block validation. That helper is only meant to fill in the coinbase witness reserved value for otherwise valid block structures, but it accepted any first transaction with a witness commitment output. A malformed block can be serialized with a known prev hash, a first transaction containing a witness commitment output, and no inputs. On a SegWit-active chain, UpdateUncommittedBlockStructures then sees a commitment, observes that the first transaction has no witness, copies it to CMutableTransaction, and writes tx.vin[0] before CheckBlock can reject the block as missing a coinbase. That is C++ UB and can crash from authenticated RPC input. Guard the helper to return unless the block has a real coinbase first transaction. This is minimal for the supported behavior and IsCoinBase guarantees the input that will be updated exists. The added validation_block_tests case constructs the malformed pre-validation block shape and checks that the helper leaves it for CheckBlock to reject with bad-cb-missing. Validation: cmake --build build_audit_wallet --target test_bitcoin -j2; build_audit_wallet/bin/test_bitcoin --run_test=validation_block_tests/update_uncommitted_block_structures_rejects_malformed_coinbase; build_audit_wallet/bin/test_bitcoin --run_test=validation_block_tests; git diff --check. Limitations: no functional submitblock RPC test was added; the unit test covers the exact chainstate helper reached by submitblock after block deserialization and before validation.
ChainstateLoadOptions defaults mempool to nullptr. The bitcoinkernel chainstate manager options keep that default and pass the copied options to node::LoadChainstate(). A caller can also set wipe_chainstate_db through btck_chainstate_manager_options_set_wipe_dbs(). If a datadir already contains chainstate_snapshot/base_blockhash, LoadChainstate() initializes the validated chainstate with options.mempool, loads the snapshot chainstate, and, when wipe_chainstate_db is true, deletes the snapshot chainstate. DeleteChainstate() removed the snapshot and then asserted prev_chainstate->m_mempool->size() == 0. In the no-mempool bitcoinkernel configuration prev_chainstate->m_mempool is nullptr, so this is an assertion crash reachable before the API can return an error. Bitcoin Core builds keep assertions enabled; src/util/check.h rejects NDEBUG. This mirrors the existing null-mempool guard in AddChainstate(), fixed by 2bc3265. Keep DeleteChainstate()'s transfer invariant, but allow a null mempool before checking that any present mempool is empty. Add a regression test that creates a snapshot chainstate directory, initializes the manager with a null mempool, loads the snapshot chainstate, and deletes it after clearing the validated chainstate target as LoadChainstate() does for reindex. Verified: git diff --check -- src/validation.cpp src/test/validation_chainstatemanager_tests.cpp Verified: cmake --build build_audit --target test_bitcoin -j 4 Verified: build_audit/bin/test_bitcoin --run_test=validation_chainstatemanager_tests/delete_chainstate_handles_null_mempool --catch_system_error=no --log_level=test_suite -- -printtoconsole=1
ChainstateManager::ProcessNewBlock rejected blocks while holding cs_main, but the reject path also dispatched the synchronous ValidationSignals::BlockChecked callback. A BlockChecked subscriber can take arbitrary callback-owned locks, and code holding the same callback-owned lock can later enter validation and lock cs_main, creating cs_main -> callback_mutex -> cs_main. Keep the CheckBlock and AcceptBlock calls inside the cs_main scope because CheckBlock mutates CBlock::fChecked and AcceptBlock requires cs_main, but release cs_main before notifying BlockChecked for rejected blocks. Reproducer before the fix, with the focused DEBUG_LOCKORDER test added: build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_reject_blockchecked_does_not_hold_cs_main Running 1 test case... fatal error: potential deadlock detected: ::cs_main -> callback_mutex -> ::cs_main Post-fix verifiers: cmake --build build_lockorder_wallet --target test_bitcoin -j2 [100%] Built target test_bitcoin build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_reject_blockchecked_does_not_hold_cs_main Running 1 test case... *** No errors detected build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_signals_ordering Running 1 test case... *** No errors detected TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_signals_ordering Running 1 test case... *** No errors detected cmake --build build_thread_annotations_wallet_clang --target test_bitcoin -j2 [100%] Built target test_bitcoin Limitations: this only moves the ProcessNewBlock rejection notification. Other synchronous validation callbacks, including the ConnectTip BlockChecked path and NewPoWValidBlock, still need separate proof before changing their lock scope.
Chainstate::ActivateBestChain dispatched the synchronous ValidationSignals::ActiveTipChange callback before releasing cs_main. A subscriber that takes a callback-owned mutex can establish cs_main -> callback_mutex, while code holding that same callback mutex and entering validation can take callback_mutex -> cs_main. Capture the active tip pointer, IBD state, and validation-signals pointer while cs_main protects the chainstate, then send ActiveTipChange immediately after the cs_main scope exits. The queued UpdatedBlockTip notification remains under cs_main because its enqueue ordering is documented as part of the callback contract. Reproducer before the fix, with the focused DEBUG_LOCKORDER test added: build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/active_tip_change_does_not_hold_cs_main Running 1 test case... fatal error: potential deadlock detected: ::cs_main -> callback_mutex -> ::cs_main Post-fix verifiers: cmake --build build_lockorder_wallet --target test_bitcoin -j2 [100%] Built target test_bitcoin build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/active_tip_change_does_not_hold_cs_main Running 1 test case... *** No errors detected build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_reject_blockchecked_does_not_hold_cs_main Running 1 test case... *** No errors detected build_lockorder_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_signals_ordering Running 1 test case... *** No errors detected TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=validation_block_tests/processnewblock_signals_ordering Running 1 test case... *** No errors detected cmake --build build_thread_annotations_wallet_clang --target test_bitcoin -j2 [100%] Built target test_bitcoin Limitations: this commit only changes the synchronous ActiveTipChange notification. UpdatedBlockTip and BlockConnected remain queued under their existing ordering contract; the ConnectTip BlockChecked callback still needs a separate design if it is changed.
Seed commit: 17994-era flush-state behavior, rechecked against current BlockManager dirty-file accounting. Current path: FlushStateToDisk() persists block index metadata after BlockManager::FlushChainstateBlockFile(). That helper flushed only the blockfile cursor for the active chainstate tip. Blocks may be written out of height order, while undo data is written in validation order, so a dirty undo file from an older blockfile can have BLOCK_HAVE_UNDO metadata persisted without the corresponding rev file being flushed. Reproducer: write blocks that rotate to file 1, then write undo data for a block in file 0. Replace rev00000.dat with a directory and force a state flush. Before the fix the flush path did not report the dirty undo-file failure for file 0; metadata could be persisted past unflushed undo data. Fix: after flushing the chainstate tip blockfile, iterate dirty file-info entries and flush every dirty undo file that has undo data, skipping the cursor file already handled by FlushBlockFile(). Invalid/stale dirty indexes are ignored defensively. Tests: blockmanager_flush_chainstate_block_file_flushes_dirty_undo_file covers the exact BlockManager failure path. feature_undo_flush.py covers the node-level state-flush path that turns the flush failure into shutdown before metadata is treated as durable. Validation: cmake --build build_audit --target test_bitcoin -j2; build_audit/bin/test_bitcoin --run_test=validation_block_tests,transaction_tests; build_audit/bin/test_bitcoin --run_test=amount_tests,transaction_tests,txvalidation_tests,validation_block_tests; git diff --check. The functional test was added as a minimal end-to-end reproducer but was not rerun in the final build tree because bitcoind/test config were not available there.
Seed commit: none; found while auditing current orphanage peer-resource accounting. Current path: TxOrphanageImpl::LimitOrphans() computes each peer's latency share as MaxPeerLatencyScore(). When there are more peers with orphans than global latency slots, integer division can make that per-peer share zero even though the global latency limit is exceeded. Reproducer: create an orphanage with two global latency slots and add one orphan each from three peers. Before the fix the per-peer latency divisor could be zero or no peer could be selected as over-limit, so LimitOrphans() could miss the necessary trim despite total latency exceeding the global cap. Fix: floor both per-peer latency and memory shares at 1, and include peers with DoS score exactly 1 because that is the over-limit floor-share case. Test: orphanage_tests/peer_dos_limits adds the three-peer/two-slot reproducer and checks total latency and announcement count are brought back under the cap. Validation: cmake --build build_audit --target test_bitcoin -j2; build_audit/bin/test_bitcoin --run_test=orphanage_tests/peer_dos_limits; git diff --check.
Seed commit: 549b08e net_processing: avoid IBD stall after empty HEADERS. A peer selected for initial headers sync can respond to GETHEADERS with an empty HEADERS message while the node is still far behind. The empty-response path cleared the per-peer low-work headers-sync object, but left CNodeState::fSyncStarted set and nSyncStarted incremented. Since SendMessages() only selects another initial headers-sync peer when nSyncStarted is zero while the best header is older than 24 hours, that stale state can keep a node from immediately syncing from another available peer. The timeout is intentionally long, so a stale or malicious peer can delay IBD progress after the node connects to a useful peer. Reset fSyncStarted, decrement nSyncStarted, and clear the headers timeout when an empty HEADERS response arrives from the current initial-sync peer and the best header is still old. Preserve the last-getheaders timestamp in this reset case so the same peer is not immediately selected again; ordinary empty responses still clear the timestamp as before. The fix is limited to the empty HEADERS path and keeps the existing low-work presync cleanup. It does not change behavior after the node is close to the network tip, where multiple peers may already be queried. Validation: cmake --build build_audit_wallet --target bitcoind -j2; test/functional/p2p_addnode_ibd_stall.py --configfile build_audit_wallet/test/config.ini; test/functional/p2p_initial_headers_sync.py --configfile build_audit_wallet/test/config.ini; git diff --check.
Seed commit: none. This was found in a current-tree scan of the experimental libbitcoinkernel C API. Several exported C entry points asserted on caller-provided scalar arguments: out-of-range transaction, block, and undo indexes; mismatched spent-output counts; invalid script verification flags; invalid logging category or level values; and invalid chain parameter enum values. Bitcoin Core builds remove NDEBUG, so these asserts are active in release-style builds. A kernel API user that forwards untrusted or externally-derived indexes/flags can therefore abort the embedding process instead of getting the documented null/error result. Handle those inputs at the C boundary. Range accessors now return null for out-of-range indexes, precomputed transaction data creation returns null for impossible spent-output arrays, script verification reports invalid flag bits and out-of-range input indexes through btck_ScriptVerifyStatus, invalid logging enums are ignored by the void logging mutators, and invalid chain parameter enum values return null. Pointer arguments annotated non-null remain hard API preconditions and are not changed here. The fix is minimal because it only replaces public-entry assertions with existing C API error channels, plus one new script status for the otherwise ambiguous input-index case. Existing wrapper behavior for valid callers is unchanged. Validation: cmake -S . -B /tmp/bitcoin_kernel_audit_build -G Ninja -DBUILD_KERNEL_LIB=ON -DBUILD_KERNEL_TEST=ON -DBUILD_TESTS=OFF -DBUILD_BENCH=OFF -DBUILD_DAEMON=OFF -DBUILD_CLI=OFF -DBUILD_TX=OFF -DBUILD_UTIL=OFF -DENABLE_WALLET=OFF -DBUILD_GUI=OFF completed; cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 completed; /tmp/bitcoin_kernel_audit_build/bin/test_kernel passed all 18 test cases; git diff --check passed. Limitations: this does not make null handles safe, because the header already marks those pointer arguments non-null. Full test_bitcoin was not rerun for this kernel-only API change.
Seed commit: none. This follow-up came from the same current-tree scan of remaining libbitcoinkernel public-entry assertions. btck_block_tree_entry_get_ancestor() accepted a caller-provided height but asserted that CBlockIndex::GetAncestor() returned a block. Negative heights and heights above the entry are normal scalar inputs at the C boundary, and GetAncestor() reports them with nullptr. Because Bitcoin Core builds keep assert() enabled, an API user forwarding an external height could abort the embedding process. Return null when no ancestor exists and document the out-of-range result. This matches btck_chain_get_by_height(), which already documents null for out-of-bounds heights, and leaves non-null handle arguments as preconditions. A kernel test now checks negative and too-high ancestor heights through the C API, and verifies the C++ wrapper turns the null result into its existing runtime_error behavior. Validation: cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 completed; /tmp/bitcoin_kernel_audit_build/bin/test_kernel passed all 18 test cases; git diff --check passed. Limitations: this is scoped to the ancestor height argument only. Full test_bitcoin was not rerun for this kernel-only API change.
Seed commit: 3cbf7cb Squashed 'src/secp256k1/' changes from b9313c6e1a..d543c0d917 That subtree update switched MuSig secret buffers to explicit clearing, including the secret key extracted inside secp256k1_musig_nonce_gen_counter(). The current function still returned before the explicit clear when secp256k1_musig_nonce_gen_internal() failed. The failure path is reachable after secp256k1_keypair_sec() has copied the 32-byte secret key into the local seckey buffer. For example, callers can provide a malformed key aggregation cache and a non-aborting illegal-argument callback; nonce_gen_internal() returns 0 and the function previously returned immediately, leaving the copied secret key in the stack frame. Other MuSig signing code already routes failures through cleanup helpers before returning. Keep the return value from nonce_gen_internal(), always clear seckey with secp256k1_memclear_explicit(), and then return that status. This preserves API behavior while closing the missing-zeroization path. Validation: cmake --build build_audit_wallet --target src/secp256k1/src/test -j2 (runs 195 secp256k1 CTest cases, all passed); git diff --check. Limitations: no new test was added because stack zeroization is not directly observable without instrumentation; the proof is the single failure edge that previously bypassed the existing explicit clear.
Seed commit: none; found while checking current MuSig2 signing secret lifetimes. Current paths: CKey::CreateMuSig2Nonce() created nonce randomness in a stack uint256, and CKey::CreateMuSig2PartialSig() created a secp256k1_keypair containing private key material on the stack. Proof/reproducer: these helpers are reached by wallet MuSig2 PSBT processing. The stack objects contain secret nonce randomness or a copied signing key before returning to normal wallet code. There is no protocol-output reproducer because the bug is residual secret lifetime after successful or failed signing, not a changed signature result. Fix: allocate the nonce randomness with secure_allocator and the keypair with make_secure_unique so their storage is cleansed on destruction. API behavior and signing results are unchanged. Test/proof coverage: no new unit test was added because clearing stack memory is not directly observable in the existing test framework without instrumentation. Existing wallet_musig.py and secp256k1 tests cover that signing still works; the review proof is the replacement of the two secret-bearing automatic objects with existing secure-clearing storage primitives. Validation: cmake --build build_audit_wallet --target src/secp256k1/src/test -j2; test/functional/wallet_musig.py --configfile=build_audit_wallet/test/config.ini; git diff --check.
btck_precomputed_transaction_data_create() checked that the spent-output array pointer was present and matched the transaction input count, but did not check each array element before converting it through the opaque handle helper. A null element therefore dereferenced address 0 instead of returning nullptr like the surrounding invalid-argument paths.\n\nReject null spent-output entries before copying them into the CTxOut vector. This preserves the existing behavior for missing arrays, count mismatches, and valid output arrays.\n\nReproducer before this change, with only the regression test added:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_precomputed_txdata\n\nfailed with:\n\n fatal error: in "btck_precomputed_txdata": memory access violation at address: 0x0\n test_kernel.cpp(543): last checkpoint\n\nVerifier after this change:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_precomputed_txdata\n\npassed with "*** No errors detected".
btck_chainstate_manager_import_blocks() accepted a nonzero path count with null path entries, and also accepted a null lengths array when every path entry was null. Both cases produced an empty import list and returned success, even though a nonzero path count describes arrays of block-file paths and lengths. A non-null path with a null lengths array would also dereference the missing lengths array.\n\nReject nonzero counts unless both arrays are present, and reject null path entries before constructing fs::path values. While touching the conversion, use PathFromString on the supplied pointer/length pair instead of routing through a temporary C string. Empty imports with count zero are unchanged.\n\nReproducer before this change, with only the regression test added:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_chainman_tests\n\nfailed with:\n\n test_kernel.cpp(842): error: check btck_chainstate_manager_import_blocks(raw_chainman.get(), paths, path_lengths, 1) == -1 has failed [0 != -1]\n test_kernel.cpp(843): error: check btck_chainstate_manager_import_blocks(raw_chainman.get(), paths, nullptr, 1) == -1 has failed [0 != -1]\n\nVerifier after this change:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_chainman_tests\n\npassed with "*** No errors detected".
The kernel C API constructors for transactions, blocks, and block headers decoded from a SpanReader but did not require the reader to be exhausted. A caller could append arbitrary bytes to an otherwise valid serialized object and still receive a btck object. This contradicts the public API boundary, especially btck_block_header_create(), whose length argument is documented as an 80-byte serialized header.\n\nRequire all three constructors to consume the entire supplied buffer before returning an object. Truncated and malformed inputs were already rejected by deserialization, so this only tightens trailing-data acceptance.\n\nReproducer before this change, with only the regression tests added:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_transaction_tests,btck_block_header_tests,btck_block\n\nfailed with:\n\n test_kernel.cpp(422): error: in "btck_transaction_tests": exception std::runtime_error expected but not raised\n test_kernel.cpp(745): error: in "btck_block_header_tests": exception std::runtime_error expected but not raised\n test_kernel.cpp(776): error: in "btck_block": exception std::runtime_error expected but not raised\n\nVerifier after this change:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_transaction_tests,btck_block_header_tests,btck_block\n\npassed with "*** No errors detected".
btck_block_check() exposes a bitmask with only POW and MERKLE optional bits, but accepted any unknown bits by ignoring them. That differs from the script-verification C API, which rejects bits outside its public ALL mask before validating. It also means callers can pass a mistyped or future flag and receive a successful validation result that did not actually honor that flag.\n\nReject bits outside btck_BlockCheckFlags_ALL and leave the supplied validation state in error mode, which the C API reports as INTERNAL_ERROR. Valid BASE, POW, MERKLE, and ALL calls keep the existing behavior.\n\nReproducer before this change, with only the regression test added:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_check_block_context_free\n\nfailed with:\n\n test_kernel.cpp(1007): error: check !block.Check(... btck_BlockCheckFlags_ALL | (1U << 2) ...) has failed\n test_kernel.cpp(1008): error: check state.GetValidationMode() == ValidationMode::INTERNAL_ERROR has failed\n\nVerifier after this change:\n\n cmake --build /tmp/bitcoin_kernel_audit_build --target test_kernel -j2 && /tmp/bitcoin_kernel_audit_build/bin/test_kernel --run_test=btck_check_block_context_free\n\npassed with "*** No errors detected".
PrivateBroadcast::PickTxForSend stores a SendStatus entry for every peer selected for a private broadcast attempt. If the private broadcast peer disconnects before confirming receipt with PONG, PeerManagerImpl::FinalizeNode schedules another private broadcast connection but previously left the unconfirmed SendStatus in the transaction state. That makes failed private broadcast attempts accumulate one retained peer record per attempt for as long as the transaction remains pending. The records are exposed through getprivatebroadcastinfo and are scanned when deriving priority and stale state. A hostile or persistently unresponsive private-broadcast path can therefore grow per-transaction memory and RPC response work without bound. Fix this by pruning the disconnected node from PrivateBroadcast when a private-broadcast connection ends without confirmation. Confirmed records are retained because Remove() reports the number of successful acknowledgements. To preserve retry ordering without retaining per-peer failed-attempt records, keep an aggregate count and last-pick timestamp for disconnected unconfirmed attempts. Reproducer before fix, using a temporary mutation that made RemoveUnconfirmedNode() a no-op: cmake --build build_audit --target test_bitcoin -j2 && build_audit/bin/test_bitcoin --run_test=private_broadcast_tests/disconnected_unconfirmed_nodes_are_pruned Failed with retained peer records growing across retries, including checks such as peers.size() 1 != 0, total peers 2 != 1, and later total peers 6 != 1. Verifier after fix: cmake --build build_audit --target test_bitcoin -j2 && build_audit/bin/test_bitcoin --run_test=private_broadcast_tests Running 4 test cases... *** No errors detected Limitations: this is a unit-level resource proof for the PrivateBroadcast queue state and disconnect cleanup path; it does not add an end-to-end P2P functional test or RSS measurement.
Seed commit: none; found during current MuSig2/PSBT signing audit. Current path: PSBTInput::FillSignatureData copies PSBT_IN_TAP_BIP32_DERIVATION and PSBT_IN_MUSIG2_PARTICIPANT_PUBKEYS into SignatureData. SignPSBTInput then calls ProduceSignature, SignTaproot, and SignMuSig2 on RPC- or wallet-provided PSBTs. When the script key differs from the aggregate pubkey, SignMuSig2 builds a BIP328 synthetic xpub from the aggregate pubkey and previously trusted the PSBT-provided key origin path. Reachability and impact: descriptor parsing rejects hardened MuSig2 aggregate derivation, but PSBT fields are independent serialized input and can contain a hardened path or a non-hardened path that does not derive to the claimed Taproot key. The hardened case reached CPubKey::Derive's non-hardened-child assertion, and the mismatch case reached Assert(XOnlyPubKey(extpub.pubkey) == script_pubkey). A malformed PSBT passed to analysis/signing RPCs or wallet signing could therefore abort instead of being treated as incomplete/invalid signing material. Fix: reject hardened aggregate derivation steps before calling public derivation, and only use derived tweaks if the resulting synthetic xpub pubkey actually matches the script pubkey. This keeps valid non-hardened aggregate derivation unchanged while ignoring malformed PSBT metadata. Tests: added psbt_tests/musig2_invalid_aggregate_derivation covering hardened and mismatched aggregate derivation metadata. Validation: cmake --build build_audit_wallet --target test_bitcoin -j 4; build_audit_wallet/bin/test_bitcoin --run_test=psbt_tests/musig2_invalid_aggregate_derivation --catch_system_error=no --log_level=test_suite; build_audit_wallet/bin/test_bitcoin --run_test=psbt_tests --catch_system_error=no --log_level=test_suite; build_audit_wallet/bin/test_bitcoin --run_test=bip328_tests --catch_system_error=no --log_level=test_suite; test/functional/wallet_musig.py --configfile=build_audit_wallet/test/config.ini; git diff --check -- src/script/sign.cpp src/test/psbt_tests.cpp. Limitations: this does not attempt to make malformed MuSig2 PSBT metadata signable; it only prevents invalid metadata from crossing synthetic-xpub derivation preconditions or assertions.
Seed commit: none; found while checking current wallet amount accumulation paths. Current path: CreateTransaction() validated each recipient amount for negativity but did not bound the sum before coin selection and transaction construction. The recipient list is RPC/GUI caller-controlled, and CAmount is signed int64. Reproducer: call CreateTransaction() with enough recipients set to MAX_MONEY that the aggregate recipient target exceeds int64 range. Before the fix the wallet could carry an overflowing signed total into later fee/selection accounting instead of rejecting the request up front. Fix: accumulate recipient totals with an explicit MAX_MONEY - recipients_sum check while preserving the existing negative-amount error. Test: wallet_rejects_oversized_recipient_total constructs the smallest oversized vector for CAmount/MAX_MONEY and checks CreateTransaction() fails with Transaction amount too large. Validation: cmake --build build_audit --target test_bitcoin -j2; build_audit/bin/test_bitcoin --run_test=wallet_tests/spend_tests/wallet_rejects_oversized_recipient_total; git diff --check.
Seed commit: none; found during current MuSig2 wallet signing audit. Current path: MutableTransactionSignatureCreator::CreateMuSig2Nonce() derives a session id from the script pubkey, participant pubkey, and sighash, then asks CKey to generate a fresh MuSig2 secret nonce and stores it in the SigningProvider. Reproducer: process the same MuSig2 PSBT twice before the counterparty nonce is available. Before the fix, the second pass generated and stored a new secret nonce for the same signing session, while the first public nonce may already have been returned to the caller. That can desynchronize nonce state and risks unusable or unsafe MuSig2 signing flow if external PSBT coordination keeps the old public nonce. Fix: compute the session id before nonce generation and return no new nonce if a secret nonce is already stored for that session. This preserves the one-secret-nonce-per-session invariant. Test: wallet_musig.py test_repeated_nonce_request processes the original PSBT twice, checks only one public nonce is returned, combines with the other participant's nonce, and verifies the stored nonce can still produce a partial signature. Validation: cmake --build build_audit_wallet --target bitcoind -j2; test/functional/wallet_musig.py --configfile=build_audit_wallet/test/config.ini; git diff --check.
Seed commit: multiple historical descriptor-range hardening commits; current audit found remaining int32 boundary assumptions in RPC and descriptor wallet range handling. Current paths: EvalDescriptorStringOrObject() expands RPC scan ranges through int positions; ProcessDescriptorImport() stores inclusive RPC ranges as exclusive wallet range_end; DescriptorScriptPubKeyMan::TopUpWithDB() adds next_index and target keypool size into int32 range_end; UpdateWalletDescriptor() reset m_max_cached_index to -1 even for descriptors whose range_start was already positive. Reproducer: request a descriptor range ending at INT32_MAX through scantxoutset and importdescriptors, and use active ranged descriptors near the int32 boundary. Before the fix, inclusive-to-exclusive conversion and keypool top-up arithmetic could overflow int32 or cause cache bookkeeping to believe indexes below range_start were cached. Fix: iterate RPC scan expansion with int64 while casting only after ParseDescriptorRange bounds, reject import ranges that cannot be represented as exclusive int32 range_end, check keypool top-up addition before narrowing, and initialize/reset m_max_cached_index to range_start - 1 for nonzero starts. Tests: rpc_scantxoutset.py covers the allowed single INT32_MAX scan expansion; wallet_importdescriptors.py checks importdescriptors rejects INT32_MAX as an inclusive range end because wallet storage needs end + 1. Validation: cmake --build build_audit_wallet --target bitcoind -j2; test/functional/rpc_scantxoutset.py --configfile=build_audit_wallet/test/config.ini; test/functional/wallet_importdescriptors.py --configfile=build_audit_wallet/test/config.ini; git diff --check.
The wallet interface could read ScriptPubKeyMan state without holding cs_wallet while descriptor import/top-up updated the same state under cs_wallet. This was visible through WalletImpl::getPubKey(), which calls CWallet::GetSolvingProvider() and reads m_cached_spks concurrently with CWallet::CacheNewScriptPubKeys(). The same interface surface also exposes taprootEnabled(), and CWallet::SignMessage()/IsHDEnabled() read SPK manager maps directly, so those reads now take cs_wallet before touching the protected state.
Reproducer before fix:
TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_interface_spk_managers_threadsafe
ThreadSanitizer: data race
write: CWallet::CacheNewScriptPubKeys wallet.cpp:4533, via AddWalletDescriptor
read: CWallet::GetSolvingProvider wallet.cpp:3509, via WalletImpl::getPubKey interfaces.cpp:162
Verifiers after fix:
TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_interface_spk_managers_threadsafe
Running 1 test case... *** No errors detected
build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_interface_spk_managers_threadsafe
Running 1 test case... *** No errors detected
Limitations: this covers the wallet interface SPK manager read race found in this audit; other plausible audit candidates without hard proof are intentionally not changed.
CWallet::HasEncryptionKeys() read mapMasterKeys.empty() without holding cs_wallet. The wallet interface exposes this through isCrypted(), while normal wallet loading/encryption paths mutate mapMasterKeys under cs_wallet. Keep an atomic encryption-state flag in sync with the master-key map writers so encryption-state queries do not read the map without the wallet lock or add a new cs_wallet edge under descriptor/key-store locks.
Reproducer before fix:
TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_interface_encryption_keys_threadsafe
ThreadSanitizer: data race
write: wallet_tests.cpp:176 mutating mapMasterKeys while holding cs_wallet
read: CWallet::HasEncryptionKeys wallet.cpp:3585 via WalletImpl::isCrypted interfaces.cpp:143
Verifiers after fix:
TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_interface_encryption_keys_threadsafe
Running 1 test case... *** No errors detected
build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_interface_encryption_keys_threadsafe
Running 1 test case... *** No errors detected
Limitations: the test uses a focused state writer to exercise the protected map because the normal EncryptWallet() path only mutates it once; this commit fixes encryption-state queries, not broader public access to mapMasterKeys.
AddWallet protected WalletContext::wallets with wallets_mutex, but it also fired CWallet::NotifyCanGetAddressesChanged while still holding that mutex. A can-get-addresses callback could acquire an upper-layer mutex, and code holding that upper-layer mutex could later query wallet list state through GetWallets, creating wallets_mutex -> callback_mutex -> wallets_mutex. Keep the wallet list duplicate check and insertion under wallets_mutex, but release the mutex before sending the address-availability notification. This keeps the protected state scoped to the list mutation and avoids turning wallet UI/interface callbacks into wallet-list lock-order edges. Reproducer before the fix, with the focused test added: build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/add_wallet_does_not_notify_under_wallets_mutex Running 1 test case... fatal error: potential deadlock detected: context.wallets_mutex -> callback_mutex -> context.wallets_mutex Post-fix verifiers: build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/add_wallet_does_not_notify_under_wallets_mutex Running 1 test case... *** No errors detected TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=wallet_tests/add_wallet_does_not_notify_under_wallets_mutex Running 1 test case... *** No errors detected cmake --build build_thread_annotations_wallet_clang --target test_bitcoin -j2 [100%] Built target test_bitcoin Limitations: the DEBUG_LOCKORDER failure uses a focused callback mutex to model upper-layer callback state. TSan is run as a race-regression check for this path; it is not expected to detect the pre-fix lock-order cycle.
NotifyWalletLoaded held WalletContext::wallets_mutex while invoking load-wallet callbacks registered through HandleLoadWallet. A callback that acquired an upper-layer mutex could therefore establish wallets_mutex -> callback_mutex, while code holding that same upper-layer mutex and registering a load callback acquired callback_mutex -> wallets_mutex. This made wallet load notification callbacks an arbitrary lock-order edge. Use the existing btcsignals dispatch machinery for load-wallet notifications. It caches enabled callbacks under its own signal mutex and invokes them after releasing the mutex, while preserving connection liveness checks through MakeSignalHandler. This removes wallet_load_fns from the wallets_mutex-protected state instead of adding another lock scope around callbacks. Reproducer before the fix, with the focused test added: build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_load_callbacks_do_not_hold_wallets_mutex Running 1 test case... fatal error: potential deadlock detected: context.wallets_mutex -> callback_mutex -> context.wallets_mutex Post-fix verifiers: build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/wallet_load_callbacks_do_not_hold_wallets_mutex Running 1 test case... *** No errors detected build_lockorder_wallet/bin/test_bitcoin --run_test=wallet_tests/CreateWallet Running 1 test case... *** No errors detected TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1" build_tsan_wallet/bin/test_bitcoin --run_test=wallet_tests/CreateWallet Running 1 test case... *** No errors detected cmake --build build_thread_annotations_wallet_clang --target test_bitcoin -j2 [100%] Built target test_bitcoin Limitations: the new regression test is DEBUG_LOCKORDER-only because the bug is a lock-order edge, not a data race; the TSan run covers the existing wallet-load callback path for race regressions.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.