fix: align bloom filter size/hash calculation with Dash Core #529

Open

xdustinface wants to merge 1 commit into v0.42-dev from fix/bloom-calculation

Conversation

Collaborator

xdustinface commented Mar 13, 2026

Fix the BloomFilter size calculation to compute the filter size in bytes first (integer division), then convert to bits (bytes * 8), matching how the Dash Core implementation sizes the filter.

The previous code used ceil on the bit count directly, leading to wrong filter sizes and hash moduli that caused bloom filter mismatches.
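A minimal sketch of the byte-first sizing described above; the helper name and structure are illustrative rather than the exact code in this PR, with `MAX_BLOOM_FILTER_SIZE` assumed to be the BIP 37 cap of 36,000 bytes:

```rust
// Illustrative byte-first sizing: truncate to bits, cap, then integer-divide
// to bytes. The effective bit length (and hash modulus) is bytes * 8.
const MAX_BLOOM_FILTER_SIZE: usize = 36_000; // bytes, per BIP 37

fn filter_size_bytes(elements: u32, false_positive_rate: f64) -> usize {
    let ln2_squared = std::f64::consts::LN_2 * std::f64::consts::LN_2;
    // Optimal length in bits: -n * ln(p) / ln(2)^2, truncated by the cast
    // rather than rounded up with `ceil`.
    let filter_bits =
        (-1.0 / ln2_squared * elements as f64 * false_positive_rate.ln()) as usize;
    let filter_bits = filter_bits.min(MAX_BLOOM_FILTER_SIZE * 8);
    // Integer division to bytes first, as in Dash Core.
    filter_bits / 8
}

fn main() {
    // Parameters from the Dash Core bloom_tests vectors: 3 elements, 1% FPR.
    let bytes = filter_size_bytes(3, 0.01);
    println!("{} bytes => {} aligned bits", bytes, bytes * 8);
}
```

With these parameters the raw bit count is about 28.76, which truncates to 28 bits and integer-divides to a 3-byte (24-bit) filter, matching the 3-byte filters in the Dash Core test vectors.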

Summary by CodeRabbit

  • Bug Fixes

    • Bloom filter sizing and hashing behavior updated to match Dash Core, resolving edge cases (including very small/1-byte filters) so filters interoperate correctly and reliably.
  • Tests

    • Added extensive tests that validate Bloom filter serialization and operations against Dash Core reference vectors, including scenarios with key insertions to ensure compatibility and correctness.

Contributor

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Rewrote Bloom filter sizing and hashing to align with Dash Core: compute filter_bits from elements and false_positive_rate, clamp to ≥8 bits, derive filter_bytes/aligned_bits, recalculate n_hash_funcs from aligned_bits, update allocation/BitVec init, comments, and tests (including 1‑byte and Dash Core vectors).
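The hash-count recomputation the walkthrough mentions can be sketched as follows; `n_hash_funcs` here is illustrative, assuming the BIP 37 limit of 50 hash functions:

```rust
// Illustrative recomputation of the hash-function count from the aligned
// (bytes * 8) bit length, clamped to the BIP 37 limit.
const MAX_HASH_FUNCS: u32 = 50;

fn n_hash_funcs(aligned_bits: usize, elements: u32) -> u32 {
    // Optimal count: bits / n * ln(2), truncated by the cast.
    let k = (aligned_bits as f64 / elements as f64 * std::f64::consts::LN_2) as u32;
    k.clamp(1, MAX_HASH_FUNCS)
}

fn main() {
    // A 3-byte filter has 24 aligned bits; for 3 elements this yields the
    // hash-function count used in the Dash Core test vectors.
    println!("k = {}", n_hash_funcs(24, 3));
}
```

Computing the count from the aligned bit length rather than the raw bit estimate keeps the hash modulus and the hash-function count consistent with each other.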

Changes

Cohort / File(s): BloomFilter Core & Tests (dash/src/bloom/filter.rs)
Summary: Replaced previous filter_size logic with Dash Core–aligned computation: compute filter_bits → clamp to ≥8 → derive filter_bytes and aligned_bits; compute n_hash_funcs from aligned_bits (clamped to [1, MAX_HASH_FUNCS]); allocate using aligned_bits and adjust BitVec init. Added comments referencing Dash Core and expanded tests (serialization against Dash Core vectors, 1-byte filter edge cases) plus imports for PrivateKey/secp256k1 used in tests.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I dug through bits and bytes anew,
Four and eight and hashes two,
Aligned my bytes to Dash's song,
Tiny filters hopping along,
Seeds and vectors hum—I've leapt so true. 🎉

🚥 Pre-merge checks | ✅ 3 passed
  • Description Check — ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title directly and accurately describes the main change, aligning the bloom filter size/hash calculation with the Dash Core implementation.
  • Docstring Coverage — ✅ Passed: docstring coverage is 100.00%, above the required threshold of 80.00%.


@xdustinface xdustinface force-pushed the fix/bloom-calculation branch from b9c5d69 to a07f95c Compare March 13, 2026 01:59

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dash/src/bloom/filter.rs`:
- Around line 355-363: Replace the hardcoded private key bytes, SecretKey
creation, and the fixed Network::Mainnet use in the test: instead of building a
PrivateKey from privkey_bytes and calling PrivateKey::new_uncompressed(...
Network::Mainnet) (symbols: privkey_bytes, secret_key,
PrivateKey::new_uncompressed, pubkey, secp), use a non-sensitive test
fixture—either construct a PublicKey directly from a constant pubkey hex (e.g.,
PublicKey::from_slice(...)) so no private material is present, or derive a
deterministic SecretKey from a constant seed for test purposes and use a
test/regtest network constant (not Mainnet) pulled from test config; update
places that expect privkey/pubkey to accept the new fixture.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31db95a8-722e-44f8-b6f0-20a2d1a1d441

📥 Commits

Reviewing files that changed from the base of the PR and between 33eb7b8 and b9c5d69.

📒 Files selected for processing (1)
  • dash/src/bloom/filter.rs


codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 97.14286% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.90%. Comparing base (6749114) to head (73ea5ed).
⚠️ Report is 1 commit behind head on v0.42-dev.

Files with missing lines | Patch % | Lines
dash/src/bloom/filter.rs | 97.14% | 2 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           v0.42-dev     #529      +/-   ##
=============================================
+ Coverage      66.87%   66.90%   +0.03%     
=============================================
  Files            313      313              
  Lines          64753    64816      +63     
=============================================
+ Hits           43305    43368      +63     
  Misses         21448    21448              
Flag | Coverage Δ | Carryforward flag
core 75.13% <97.14%> (+0.10%) ⬆️
dash-network 74.99% <ø> (-0.01%) ⬇️ Carriedforward from 6749114
dash-network-ffi 34.76% <ø> (ø) Carriedforward from 6749114
dash-spv 68.26% <ø> (ø) Carriedforward from 6749114
dash-spv-ffi 34.76% <ø> (ø) Carriedforward from 6749114
dashcore 74.99% <ø> (-0.01%) ⬇️ Carriedforward from 6749114
dashcore-private 74.99% <ø> (-0.01%) ⬇️ Carriedforward from 6749114
dashcore-rpc 19.92% <ø> (ø) Carriedforward from 6749114
dashcore-rpc-json 19.92% <ø> (ø) Carriedforward from 6749114
dashcore_hashes 74.99% <ø> (-0.01%) ⬇️ Carriedforward from 6749114
ffi 37.14% <ø> (+0.01%) ⬆️
key-wallet 65.64% <ø> (ø) Carriedforward from 6749114
key-wallet-ffi 34.76% <ø> (ø) Carriedforward from 6749114
key-wallet-manager 65.64% <ø> (ø) Carriedforward from 6749114
rpc 19.92% <ø> (ø)
spv 81.02% <ø> (-0.09%) ⬇️
wallet 65.67% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

Files with missing lines | Coverage Δ
dash/src/bloom/filter.rs | 89.08% <97.14%> (+4.74%) ⬆️

... and 5 files with indirect coverage changes

@xdustinface xdustinface requested a review from ZocoLini March 13, 2026 02:29
/// https://github.com/dashpay/dash/blob/b66b56c3019fe7ab0c9f35dd9894c0ded4d9d420/src/test/bloom_tests.cpp#L51-L74
#[test]
fn test_bloom_filter_dash_core_compatibility_with_tweak() {
let mut filter = BloomFilter::new(3, 0.01, 2147483649, BloomFlags::All).unwrap();
Collaborator

I don't see a difference between this test and test_bloom_filter_dash_core_compatibility other than the filter constructor. Maybe a for loop over a collection containing both bloom filters? Or extract the logic into a function encapsulated in the test function and call it with both filters?

Collaborator Author

It's just mirroring the C++ test cases.

let aligned_bits = filter_bytes * 8;

let n_hash_funcs = (aligned_bits as f64 / elements as f64 * ln2) as u32;
let n_hash_funcs = n_hash_funcs.clamp(1, MAX_HASH_FUNCS);
Collaborator

Using clamp is kind of unnecessary; a u32 cannot be less than 1, and the C++ code is not checking it either.

Collaborator Author

> u32 cannot be less than 1

What about 0? Apparently unlikely to happen, but then it's perhaps just an edge case issue in Dash Core.
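The 0 case raised above is reachable when there are far more elements than aligned bits; this small illustration uses a hypothetical helper, not the code under review:

```rust
// When elements greatly exceed the aligned bit length, the f64-to-u32 cast
// truncates the raw hash-function count to 0; clamp(1, ..) restores it to 1.
const MAX_HASH_FUNCS: u32 = 50; // BIP 37 limit

fn raw_hash_func_count(aligned_bits: usize, elements: u32) -> u32 {
    (aligned_bits as f64 / elements as f64 * std::f64::consts::LN_2) as u32
}

fn main() {
    // Minimal 1-byte filter (8 aligned bits) asked to hold 100 elements.
    let raw = raw_hash_func_count(8, 100);
    println!("raw = {}, clamped = {}", raw, raw.clamp(1, MAX_HASH_FUNCS));
}
```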

let filter_size = filter_size.clamp(1, MAX_BLOOM_FILTER_SIZE * 8);
let filter_bits =
(-1.0 / ln2_squared * elements as f64 * false_positive_rate.ln()) as usize;
let filter_bits = filter_bits.clamp(8, MAX_BLOOM_FILTER_SIZE * 8);
Collaborator

I don't see the C++ code checking whether the value is less than 8, and I don't know enough about the BloomFilter implementation right now, so I have to trust you.

Collaborator Author

Hm right, going to drop it.

/// Expected serialized bytes: "038fc16b080000000000000001"
#[test]
fn test_bloom_create_insert_key() {
let secp = secp256k1::Secp256k1::new();
Collaborator

This unit test is testing the same paths (in BloomFilter) that the test_bloom_filter_dash_core_compatibility function is testing, right?

Collaborator Author

It's just mirroring the C++ test cases.

@xdustinface xdustinface force-pushed the fix/bloom-calculation branch from a07f95c to 73ea5ed Compare March 15, 2026 10:04
@xdustinface xdustinface requested a review from ZocoLini March 15, 2026 10:05

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
dash/src/bloom/filter.rs (1)

352-361: ⚠️ Potential issue | 🟠 Major

Replace hardcoded private key material and fixed mainnet parameter in test fixture.

Line 353 and Line 360 still embed explicit key material and a fixed Network::Mainnet. Please switch this test to a non-sensitive public fixture path (e.g., fixed pubkey bytes / hash vector) without private key construction.

As per coding guidelines, **/*.rs: Never hardcode network parameters, addresses, or keys in Rust code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dash/src/bloom/filter.rs` around lines 352 - 361, The test currently embeds
secret material via privkey_bytes/secret_key and forces
PrivateKey::new_uncompressed(... Network::Mainnet) — remove construction of any
private key and the hardcoded Network::Mainnet: replace the block so it uses a
fixed public-key fixture (e.g., a compressed/uncompressed pubkey byte array or
hex string) and create the public key directly (use
secp256k1::PublicKey::from_slice or PublicKey::from_str) instead of deriving it
from a PrivateKey; if any network value is required, use a test-safe constant or
derive it from the fixture rather than Network::Mainnet and remove references to
privkey_bytes, secret_key, privkey and PrivateKey::new_uncompressed to ensure no
secret or hardcoded network parameters remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dash/src/bloom/filter.rs`:
- Around line 58-62: Insert can panic when the computed filter length is zero
because insert performs hash % self.filter.len(); update the BloomFilter::insert
implementation to guard against an empty filter by returning early when
self.filter.is_empty() (or equivalently checking self.filter.len() == 0) before
performing any modulo/indexing, and add a brief comment referencing the
zero-length-filter case; alternatively, ensure creation code (where
filter_bits/filter_bytes are computed) enforces a minimum of 1 byte by clamping
filter_bytes = filter_bytes.max(1) so that insert never sees len == 0—pick one
approach and apply consistently to avoid the modulo-by-zero in insert (refer to
variables filter_bits, filter_bytes and the insert method/self.filter.len()).

---

Duplicate comments:
In `@dash/src/bloom/filter.rs`:
- Around line 352-361: The test currently embeds secret material via
privkey_bytes/secret_key and forces PrivateKey::new_uncompressed(...
Network::Mainnet) — remove construction of any private key and the hardcoded
Network::Mainnet: replace the block so it uses a fixed public-key fixture (e.g.,
a compressed/uncompressed pubkey byte array or hex string) and create the public
key directly (use secp256k1::PublicKey::from_slice or PublicKey::from_str)
instead of deriving it from a PrivateKey; if any network value is required, use
a test-safe constant or derive it from the fixture rather than Network::Mainnet
and remove references to privkey_bytes, secret_key, privkey and
PrivateKey::new_uncompressed to ensure no secret or hardcoded network parameters
remain.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c0db6322-94d9-4fe5-9861-e03bd49956bc

📥 Commits

Reviewing files that changed from the base of the PR and between a07f95c and 73ea5ed.

📒 Files selected for processing (1)
  • dash/src/bloom/filter.rs

Comment on lines +58 to 62
let filter_bits =
(-1.0 / ln2_squared * elements as f64 * false_positive_rate.ln()) as usize;
let filter_bits = filter_bits.min(MAX_BLOOM_FILTER_SIZE * 8);
let filter_bytes = filter_bits / 8;

Contributor

⚠️ Potential issue | 🔴 Critical

Zero-byte filters make insert panic via modulo-by-zero.

With the new byte-first sizing, zero-length filters are now valid (also asserted in Line 287). That makes insert hit hash % self.filter.len() with len == 0, causing a panic on first insertion.

🐛 Proposed fix
 pub fn insert(&mut self, data: &[u8]) {
+    if self.filter.is_empty() {
+        return; // Dash Core-compatible no-op for zero-byte filters
+    }
     for i in 0..self.n_hash_funcs {
         let seed = i.wrapping_mul(0xfba4c795).wrapping_add(self.n_tweak);
         let hash = murmur3(data, seed);
         let index = (hash as usize) % self.filter.len();
         self.filter.set(index, true);
     }
 }
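The hazard can be reproduced in isolation; this sketch is independent of the actual BloomFilter type and only mirrors the guard idea in the proposed fix:

```rust
// `hash % len` panics in Rust when len == 0, so a zero-byte filter must be
// guarded before computing a bit index.
fn bit_index(filter_len: usize, hash: u32) -> Option<usize> {
    if filter_len == 0 {
        None // guard: zero-length filter, skip indexing entirely
    } else {
        Some(hash as usize % filter_len) // without the guard this would panic at len == 0
    }
}

fn main() {
    assert_eq!(bit_index(0, 0x1234_5678), None); // zero-byte filter: no-op
    assert!(bit_index(24, 0x1234_5678).is_some()); // normal 24-bit filter
    println!("ok");
}
```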
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dash/src/bloom/filter.rs` around lines 58 - 62, Insert can panic when the
computed filter length is zero because insert performs hash % self.filter.len();
update the BloomFilter::insert implementation to guard against an empty filter
by returning early when self.filter.is_empty() (or equivalently checking
self.filter.len() == 0) before performing any modulo/indexing, and add a brief
comment referencing the zero-length-filter case; alternatively, ensure creation
code (where filter_bits/filter_bytes are computed) enforces a minimum of 1 byte
by clamping filter_bytes = filter_bytes.max(1) so that insert never sees len ==
0—pick one approach and apply consistently to avoid the modulo-by-zero in insert
(refer to variables filter_bits, filter_bytes and the insert
method/self.filter.len()).
