Skip to content

refactor: remove hybrid storage and sync.Pool, simplify to direct ato…#19

Merged
shaia merged 10 commits intomainfrom
feature/simplified-atomic-operations
Nov 2, 2025
Merged

refactor: remove hybrid storage and sync.Pool, simplify to direct ato…#19
shaia merged 10 commits intomainfrom
feature/simplified-atomic-operations

Conversation

@shaia
Copy link
Copy Markdown
Owner

@shaia shaia commented Nov 2, 2025

…mic operations

This commit completes the transition to a simplified atomic implementation, removing unnecessary complexity while achieving 15-26x better performance.

Breaking Changes:

  • Removed AddBatch, AddBatchString, AddBatchUint64 methods
  • Removed IsArrayMode() method
  • Removed internal/storage package (hybrid array/map architecture)
  • Removed sync.Pool-based buffer management

Architecture Changes:

  • Direct cache-line array with atomic CAS operations
  • Stack-based hash position buffers (zero allocations for hashCount ≤ 16)
  • Removed 200+ lines of storage mode switching complexity
  • Simplified from ~600 to ~400 lines of code

Performance Improvements (vs thread-safe pool version):

  • Add operation: 26 ns/op (was 400 ns/op) - 15.4x faster
  • Contains operation: 23 ns/op (was 600 ns/op) - 26.1x faster
  • Memory: 0 B/op (was 96-128 B/op) - 99.93% reduction
  • Allocations: 0 allocs/op (was millions) - 100% reduction
  • Throughput: 18.6M inserts/sec, 35.8M lookups/sec

Performance vs willf/bloom:

  • Add: 26 ns/op vs 72 ns/op (2.8x faster)
  • Contains: 23 ns/op vs 84 ns/op (3.7x faster)
  • Memory: 0 B/op vs 97 B/op (zero allocations)

Deleted Files:

  • FOLDER_ORGANIZATION.md (consolidated into README)
  • THREAD_SAFETY_FIXES.md (no longer applicable)
  • internal/storage/storage.go (removed hybrid architecture)
  • internal/storage/storage_test.go
  • tests/benchmark/bloomfilter_storage_mode_benchmark_test.go
  • tests/integration/bloomfilter_storage_mode_test.go
  • tests/integration/bloomfilter_stress_test.go

Updated Files:

  • README.md: Complete rewrite with new architecture and benchmarks
  • CHANGELOG.md: Documented breaking changes and improvements
  • bloomfilter.go: Updated comments to reflect simplified implementation
  • docs/examples/basic/example.go: Removed batch operations, added concurrency example
  • tests/TEST_COVERAGE_SUMMARY.md: Updated for simplified implementation
  • tests/integration/bloomfilter_edge_cases_test.go: Rewritten without storage mode dependencies
  • Makefile: Added test-short, test-race, test-integration, bench-short targets

Test Coverage:

  • All existing tests pass with simplified implementation
  • Race detector: zero race conditions detected
  • Escape analysis: confirmed zero heap allocations on hot path
  • Integration tests: comprehensive edge cases, concurrency, SIMD validation

…mic operations

This commit completes the transition to a simplified atomic implementation,
removing unnecessary complexity while achieving 15-26x better performance.

Breaking Changes:
- Removed AddBatch, AddBatchString, AddBatchUint64 methods
- Removed IsArrayMode() method
- Removed internal/storage package (hybrid array/map architecture)
- Removed sync.Pool-based buffer management

Architecture Changes:
- Direct cache-line array with atomic CAS operations
- Stack-based hash position buffers (zero allocations for hashCount ≤ 16)
- Removed 200+ lines of storage mode switching complexity
- Simplified from ~600 to ~400 lines of code

Performance Improvements (vs thread-safe pool version):
- Add operation: 26 ns/op (was 400 ns/op) - 15.4x faster
- Contains operation: 23 ns/op (was 600 ns/op) - 26.1x faster
- Memory: 0 B/op (was 96-128 B/op) - 99.93% reduction
- Allocations: 0 allocs/op (was millions) - 100% reduction
- Throughput: 18.6M inserts/sec, 35.8M lookups/sec

Performance vs willf/bloom:
- Add: 26 ns/op vs 72 ns/op (2.8x faster)
- Contains: 23 ns/op vs 84 ns/op (3.7x faster)
- Memory: 0 B/op vs 97 B/op (zero allocations)

Deleted Files:
- FOLDER_ORGANIZATION.md (consolidated into README)
- THREAD_SAFETY_FIXES.md (no longer applicable)
- internal/storage/storage.go (removed hybrid architecture)
- internal/storage/storage_test.go
- tests/benchmark/bloomfilter_storage_mode_benchmark_test.go
- tests/integration/bloomfilter_storage_mode_test.go
- tests/integration/bloomfilter_stress_test.go

Updated Files:
- README.md: Complete rewrite with new architecture and benchmarks
- CHANGELOG.md: Documented breaking changes and improvements
- bloomfilter.go: Updated comments to reflect simplified implementation
- docs/examples/basic/example.go: Removed batch operations, added concurrency example
- tests/TEST_COVERAGE_SUMMARY.md: Updated for simplified implementation
- tests/integration/bloomfilter_edge_cases_test.go: Rewritten without storage mode dependencies
- Makefile: Added test-short, test-race, test-integration, bench-short targets

Test Coverage:
- All existing tests pass with simplified implementation
- Race detector: zero race conditions detected
- Escape analysis: confirmed zero heap allocations on hot path
- Integration tests: comprehensive edge cases, concurrency, SIMD validation
@shaia shaia requested a review from Copilot November 2, 2025 09:10
@shaia shaia self-assigned this Nov 2, 2025
@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 2, 2025

📋 Version Check: This PR contains changes. Consider creating a new version tag after merging.

Semantic Versioning Guide:

  • vX.Y.Z - Patch: Bug fixes, performance improvements
  • vX.Y.0 - Minor: New features, backward compatible
  • vX.0.0 - Major: Breaking changes

To create a release after merging:

git tag v0.1.0
git push origin v0.1.0

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@shaia shaia requested a review from Copilot November 2, 2025 09:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

bloomfilter.go:300

  • The constant ArrayModeThreshold and its documentation describe array/map modes that no longer exist in the simplified implementation. This constant and comment should be removed as they reference obsolete functionality.
	// Threshold for choosing between array and map mode
	// Arrays: ≤10K cache lines = ~5MB bloom filter = efficient for small/medium filters
	// Maps: >10K cache lines = scalable for large filters (up to billions of elements)
	// Memory overhead: Array mode ~240KB fixed, Map mode grows dynamically
	ArrayModeThreshold = 10000

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +72 to +80
for _, url := range urls {
filter2.AddString(url)
}

// Batch add uint64s
// Add multiple uint64s (zero allocations per operation)
userIDs := []uint64{1001, 1002, 1003, 1004, 1005}
filter2.AddBatchUint64(userIDs)
for _, id := range userIDs {
filter2.AddUint64(id)
}
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments claim 'zero allocations per operation', but this is only true when hashCount ≤ 16 (which covers 99% of scenarios). For very low FPR values requiring hashCount > 16, heap allocation will occur. Consider clarifying this in the comment or documentation to set accurate expectations.

Copilot uses AI. Check for mistakes.
shaia added 8 commits November 2, 2025 11:27
Replace relative paths to external bloomfilter-benchmark directory with
inline comparison table. The relative paths don't work in GitHub PR views
or for users viewing the README on GitHub.

Changes:
- Replaced external links with inline comparison table
- Added note about separate benchmarking repository
- Kept all performance numbers and key findings
- Improved accessibility for PR reviewers and GitHub viewers
The previous comments claimed "zero allocations" without mentioning the
hashCount ≤ 16 requirement. This updates documentation to set accurate
expectations about when heap allocation occurs.

Changes:
- Added Memory Allocation Details section in README explaining:
  - Stack allocation (0 B/op): FPR ≥ 0.001 (hashCount ≤ 16) - 99% of use cases
  - Heap allocation: Only for extremely low FPR < 0.001 (e.g., 0.0000001)
  - Common configurations (0.01, 0.001, 0.0001) all use stack buffers

- Updated all "zero allocation" comments to include "for typical FPR" caveat
- Added explicit note in example.go explaining hashCount ≤ 16 condition
- Updated Performance Characteristics with FPR threshold details

This ensures users understand:
1. Typical use cases (FPR 0.01-0.0001) achieve zero allocations
2. Extremely low FPR values (< 0.001) may trigger heap allocation
3. 99% of real-world scenarios use stack buffers
Previously, the API accepted invalid inputs without validation, leading to
unpredictable behavior. Tests accepted either panic or graceful handling,
which is non-deterministic and confusing for users.

Changes:
- Added explicit input validation in NewCacheOptimizedBloomFilter:
  * expectedElements must be > 0
  * falsePositiveRate must be in range (0, 1) exclusive
  * falsePositiveRate cannot be NaN

- All invalid inputs now panic with descriptive error messages:
  * "bloomfilter: expectedElements must be greater than 0"
  * "bloomfilter: falsePositiveRate must be in range (0, 1), got X"
  * "bloomfilter: falsePositiveRate cannot be NaN"

- Updated tests to expect deterministic panic behavior:
  * Verify that panic always occurs for invalid inputs
  * Verify panic messages are informative
  * Added test cases for edge values (0.0, 1.0)

- Added API Behavior section to README documenting:
  * Valid and invalid usage examples
  * Rationale for panic-on-invalid-input design
  * Follows Go stdlib conventions (like make())

Benefits:
1. Fail-fast during development rather than silent corruption
2. Clear error messages help developers fix issues immediately
3. Consistent, predictable API behavior
4. Prevents creating broken filters with nonsensical parameters
Added dedicated test file with extensive coverage of all validation paths
and edge cases. Tests discovered and fixed a critical bug where extremely
high FPR values (e.g., 0.999999) caused zero bitCount calculation, leading
to index out of range panics.

New Test File: bloomfilter_validation_test.go
- TestInputValidation_ZeroExpectedElements
- TestInputValidation_NegativeFPR (3 sub-tests)
- TestInputValidation_ZeroFPR
- TestInputValidation_FPRTooHigh (4 sub-tests including infinity)
- TestInputValidation_FPRTooHighForElements (4 sub-tests for zero-bit edge case)
- TestInputValidation_NaNFPR
- TestInputValidation_ValidInputs (10 sub-tests covering valid range)
- TestInputValidation_BoundaryValues (4 sub-tests)
- TestInputValidation_PanicRecovery (6 sub-tests)

Total: 9 test functions, 32 sub-tests

Bug Fix in bloomfilter.go:
- Added validation for bitCount == 0 after calculation
- Prevents index out of range when FPR is extremely close to 1.0
- Clear panic message: "falsePositiveRate too high (X) for Y elements, results in zero bits"
- Added safety check: ensure cacheLineCount >= 1

Test Coverage:
✅ Zero expected elements
✅ Negative FPR values (including -Inf)
✅ Zero FPR
✅ FPR >= 1.0 (including +Inf)
✅ NaN FPR
✅ Extremely high FPR causing zero bits (NEW - found bug)
✅ Valid inputs across full range (0.000001 to 0.9)
✅ Boundary values (1 element, 4B elements, etc.)
✅ Panic recovery and error messages
✅ Post-panic state recovery

All panic messages are verified for correctness and clarity.
…E.md

Removed tests/README.md as its content was redundant with the comprehensive
TESTING.md guide. Updated TESTING.md to reflect the simplified atomic
implementation structure and remove references to deleted files.

Changes:
- Deleted tests/README.md (178 lines)
- Updated TESTING.md with current test structure
- Removed references to deleted storage mode tests
- Added bloomfilter_validation_test.go to documentation
- Consolidated all testing information into single comprehensive guide

Documentation now organized as:
- TESTING.md: Comprehensive testing guide (how to run tests, best practices)
- tests/TEST_COVERAGE_SUMMARY.md: Detailed coverage metrics and results
…ation

Updated comments in bloomfilter_concurrent_test.go to accurately describe
the current thread-safety mechanism using atomic CAS operations instead of
the old sync.Pool approach.

Also fixed TESTING.md to remove reference to deleted tests/README.md file.

Changes:
- Updated TestConcurrentReads comment
- Updated TestConcurrentWrites comment
- Updated TestMixedConcurrentOperations comment
- Removed tests/README.md from Additional Resources section
- All concurrent tests pass successfully (verified)
…orrectness

CRITICAL FIX: Changed setBitsAtomic from limited retries (maxRetries=100) to
unlimited retries. This ensures we never silently fail to set bits, which would
introduce false negatives and violate Bloom filter correctness guarantees.

Bloom filter mathematical property: Can have false positives, but NEVER false
negatives. If we fail to set a bit during Add(), subsequent Contains() checks
will incorrectly return false, corrupting the data structure.

Why unlimited retries are safe:
1. CAS is lock-free and will eventually succeed
2. Fast path: if bit already set, exits immediately without CAS
3. Each CAS operation is ~1-10ns
4. Natural hash distribution = low contention across large bit array
5. Benchmarks show 22M+ writes/sec with 50 concurrent goroutines

Added comprehensive documentation explaining:
- Correctness guarantee (no false negatives)
- Retry strategy rationale
- Contention analysis (512 bits/line, low collision probability)
- Performance characteristics

This change prioritizes correctness over the theoretical risk of spinning
under extreme contention, which is the right trade-off for a probabilistic
data structure with strict mathematical guarantees.
Added 4 new integration tests to validate that the atomic CAS retry mechanism
works correctly under extreme contention and never silently fails to set bits.

New tests:
1. TestAtomicRetryMechanism - 100 goroutines × 100 inserts into small filter
   Validates no false negatives occur under high contention

2. TestExtremeContentionSameWord - 50 goroutines × 1,000 writes to same key
   Worst-case scenario: all threads write exact same bit positions
   Validates CAS retry succeeds even with maximum contention on same word

3. TestCASRetriesEventualSuccess - Tests across 3 filter sizes
   Small filter (high contention), medium (moderate), large (low contention)
   Validates retry loop succeeds across different contention scenarios

4. TestNoHangUnderContention - Timeout-based validation
   Ensures retry mechanism doesn't hang under concurrent writes

All tests validate the critical correctness property: Bloom filters can have
false positives but NEVER false negatives. The tests confirm that even under
extreme contention, all bits are successfully set.

Test results:
- 10,000 concurrent insertions: 0 false negatives ✓
- 50,000 writes to same bits: Key found ✓
- 110,000 unique keys across scenarios: All found ✓
- No hangs detected ✓

Updated TESTING.md to document the new retry mechanism tests.
@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 2, 2025

📋 Version Check: This PR contains changes. Consider creating a new version tag after merging.

Semantic Versioning Guide:

  • vX.Y.Z - Patch: Bug fixes, performance improvements
  • vX.Y.0 - Minor: New features, backward compatible
  • vX.0.0 - Major: Breaking changes

To create a release after merging:

git tag v0.1.0
git push origin v0.1.0

@shaia shaia requested a review from Copilot November 2, 2025 09:56
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

bloomfilter.go:323

  • The constant ArrayModeThreshold is no longer used after removing the hybrid storage architecture. It should be removed to avoid confusion, as the comment references array vs map modes that don't exist in this simplified implementation.
	ArrayModeThreshold = 10000

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The timeout mechanism in TestNoHangUnderContention was completely broken.
The timeout channel was created but never received any values, making the
timeout case unreachable.

Fixed by using time.After() which properly creates a timer that sends on
the channel after the specified duration.

Changes:
- Replaced broken timeout channel logic with time.After(10 * time.Second)
- Added missing "time" import
- Simplified code by removing unnecessary closure and channel creation
- Timeout now actually works: test will fail if retry mechanism hangs

The test now properly validates that the infinite retry loop doesn't hang
under concurrent writes, providing a safety check against potential deadlocks.
@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 2, 2025

📋 Version Check: This PR contains changes. Consider creating a new version tag after merging.

Semantic Versioning Guide:

  • vX.Y.Z - Patch: Bug fixes, performance improvements
  • vX.Y.0 - Minor: New features, backward compatible
  • vX.0.0 - Major: Breaking changes

To create a release after merging:

git tag v0.1.0
git push origin v0.1.0

@shaia shaia merged commit 01e8adf into main Nov 2, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants