Skip to content

rnorlund/adaptive-gravity-sort

Repository files navigation

Adaptive Gravity Sort

A physics-inspired parallel sorting algorithm for 32-bit unsigned integers that adapts its bucket spacing to the input distribution in a single O(n) pass.

Paper: adaptive_gravity_sort_paper.tex (compile with pdflatex)


The Idea

Integer values are treated as particles dropped into a viscous fluid. Under Stokes drag, heavier particles (larger values) settle lower. Bucket boundaries placed at equal time-slices of fall time correspond to a power-law spacing of the value range, parameterised by a single exponent λ estimated from the arithmetic mean of the input:

λ̂ = 2(μ − x_min) / (x_max − x_min)
  • λ̂ ≈ 1 → linear spacing (recovers Flash Sort for uniform data)
  • λ̂ ≈ 0 → logarithmic spacing (optimal for power-law data)
  • λ̂ ≈ 0.5 → quadratic spacing (Stokes drag model, optimal for clustered/bimodal)

No prior knowledge of the distribution is needed — the mean encodes it.


Performance (n = 500,000, Apple M-series, 12 threads)

Distribution std::sort (ms) AGS speedup
Pipe-organ 23.6 9.16×
Bimodal 29.9 9.07×
Clustered (32 groups) 22.9 7.74×
Small range [0, 65535] 28.6 5.69×
Uniform random 29.8 5.05×
Power-law 30.1 4.77×

Geometric mean across all 7 disordered distributions: 6.07× over std::sort.

AGS is slower than std::sort on pre-sorted and reverse-sorted inputs (structural limitation of all distribution sorts — see paper for details).

GPU Results (NVIDIA L40S, CUDA 13.1)

GPU-AGS achieves 75–343× over CPU std::sort at n = 500K–50M. However, CUB DeviceRadixSort is 4–10× faster than GPU-AGS on GPU. The cache-pressure advantage of AGS over radix sort on CPU does not transfer to GPU, where large L2 cache eliminates the bottleneck. See Section 8 of the paper for the full analysis.


Build and Run

CPU benchmark:

g++ -std=c++17 -O3 -march=native -o gravity_bench gravity_sort_benchmark.cpp
./gravity_bench 500000 --reps 5 --warmup 1
./gravity_bench --fuzz    # correctness: 713/713 expected

GPU benchmark (requires CUDA 13+ and NVIDIA GPU):

nvcc -O3 -arch=sm_89 -std=c++17 gpu_sort_bench.cu -o gpu_bench
./gpu_bench    # outputs JSON results

Streamlit visualisation:

pip install streamlit
streamlit run gravity_sort_app.py

Files

File Description
gravity_sort_benchmark.cpp All CPU algorithms + benchmark harness (C++17, no dependencies)
gpu_sort_bench.cu GPU benchmark: thrust::sort, CUB radix, GPU-AGS
gpu_bench_results.json Raw results from L40S benchmark run
adaptive_gravity_sort_paper.tex Full paper (LaTeX)
gravity_sort_app.py Streamlit visualisation app
ideas.md Development log: physics analogies tried, rejected, and integrated

Citation

If you use this code, algorithm, or paper in any work — including research, software, products, or presentations — you must cite:

@techreport{newmannorlund2025gravitysort,
  author      = {Newman-Norlund, Roger},
  title       = {Adaptive Gravity Sort: A Physics-Inspired Distribution Sort
                 with Automatic Bucket-Spacing from the Arithmetic Mean},
  institution = {Independent Research},
  year        = {2025},
  url         = {https://github.com/rnorlund/adaptive-gravity-sort}
}

See CITATION.cff for additional citation formats. Use of this work without attribution violates the license terms.


License

Academic Attribution License — free to use, modify, and distribute for any purpose, with mandatory attribution and citation. See LICENSE for full terms.


Development History

This algorithm was developed over approximately one week, iterating through several physics analogies before finding the viscous fluid model:

  • Particle collisions → pairwise comparisons = bubble sort (rejected)
  • Brownian motion → randomised pivot = known quicksort trick (rejected)
  • Centrifuge → bit rotation breaks sort order, reduces to radix sort (rejected)
  • Magnetic deflection → Lorentz force is 2D, not mappable to 1D ordering (rejected)
  • Osmosis → O(n) per recursion level = O(n log n), no improvement (rejected)
  • Shock waves → histogram overhead, bimodal regression 9.5→2.8× (rejected)
  • Isotope separation → entropy-optimal bit ≠ arithmetic sort order, produces wrong output (rejected)
  • Viscous fluid settling → monotone, closed-form, distribution-aware ✓

The full exploration is documented in ideas.md.

About

Adaptive Gravity Sort: A physics-inspired parallel sort for uint32 that adapts bucket spacing from the arithmetic mean. 4.77–9.16× over std::sort on CPU, 75–343× on GPU.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors