Skip to content

Performance Optimization

shaia edited this page Dec 10, 2025 · 1 revision

BloomFilter Performance & Concurrency

This document details the performance optimizations and concurrency features introduced in the BloomFilter library.

Overview

The BloomFilter implementation has been improved to support:

  1. Concurrent Batch Operations: New methods AddBatch and ContainsBatch automatically utilize multi-core processing for large inputs.
  2. Parallel Bulk Operations: Existing methods Union, Intersection, and PopCount now automatically switch to a parallel execution strategy when the filter size exceeds a threshold (256KB).
  3. SIMD-Optimized Hashing: The hash functions use AVX2-optimized chunking for high throughput.

Implementation Details

Parallel Execution Strategy

The library checks the size of the data or input batch. If it exceeds ParallelThreshold, the work is partitioned into chunks and processed by goroutines.

  • Worker Count: Defaults to runtime.NumCPU().
  • Synchronization: Uses sync.WaitGroup to coordinate completion.
  • Atomic Updates: AddBatch uses atomic.CompareAndSwapUint64 to safely set bits concurrently without locking, ensuring thread-safety with minimal contention.

Performance Benchmarks

Benchmarks were conducted on a 13th Gen Intel Core i9-13980HX.

AddBatch (50,000 items)

Method Latency
Sequential Loop ~253,675 ns/op
AddBatch (Parallel) ~114,362 ns/op
Speedup ~2.2x

PopCount (1M Cache Lines / 64MB)

Method Latency
Sequential Scan ~200,000+ ns/op (estimated)
Parallel PopCount ~32,586 ns/op
Speedup ~6x

Usage Guide

Batch Adding

// Create a filter
bf := bloomfilter.NewCacheOptimizedBloomFilter(1000000, 0.01)

// Prepare a large batch of data
data := make([][]byte, 50000)
// ... fill data ...

// Add all items in parallel
bf.AddBatch(data)

Implicit Parallelism

Standard set operations automatically benefit from parallelism when dealing with large filters:

bf1 := bloomfilter.NewCacheOptimizedBloomFilter(1000000, 0.01)
bf2 := bloomfilter.NewCacheOptimizedBloomFilter(1000000, 0.01)

// ... fill filters ...

// This will automatically run in parallel
bf1.Union(bf2)