
imap with heuristic chunksize to speed up bead spread calculation#41

Merged: omgol411 merged 1 commit into isblab:main from omgol411:bead_spread_fix on Apr 13, 2026

Conversation

@omgol411 (Contributor) commented Apr 13, 2026

Addressing #39

  • Density and bead spread calculations are combined in a single function get_bead_spread, which is called in parallel
  • chunksize is determined heuristically, similar to map (see the sketch after this list)
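
For reference, here is a minimal sketch of that heuristic: the same arithmetic CPython's Pool.map uses internally (it aims for roughly four chunks per worker). The helper name heuristic_chunksize is invented for illustration; the PR inlines this arithmetic rather than wrapping it in a function.

def heuristic_chunksize(n_tasks, n_workers):
    # Pool.map targets roughly four chunks per worker; imap defaults to
    # chunksize=1, so the caller must pass an explicit value to get the
    # same batching behaviour.
    chunksize, extra = divmod(n_tasks, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize

# e.g. 10,000 beads on 7 workers: divmod(10000, 28) = (357, 4) -> 358
print(heuristic_chunksize(10_000, 7))  # 358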

Summary by CodeRabbit

  • Performance Improvements
    • Optimized bead spread computation through an improved parallelization strategy.
    • Enhanced CPU core utilization with dynamic core selection for more efficient execution.

@omgol411 omgol411 requested a review from shruthivis April 13, 2026 06:35
@omgol411 omgol411 self-assigned this Apr 13, 2026
@coderabbitai (Bot) commented Apr 13, 2026

📝 Walkthrough

The PR optimizes parallel bead spread computation by introducing a helper function that combines density and spread calculations, replacing a two-phase parallelization approach with direct parallel iteration using dynamic core selection and explicit chunking.
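
A runnable approximation of the resulting single-phase loop, reconstructed from the review excerpts quoted below. The call shape (partial over get_bead_spread, p.imap over bead indices, explicit chunksize) matches the diff shown later in this thread; the data and the get_bead_spread body here are placeholder stubs, not the real PrISM code, which computes main_density_calc followed by calc_bead_spread.

from functools import partial
from multiprocessing import Pool
import os

import numpy as np

def get_bead_spread(i, coords, mass, radius, grid, voxel_size, n_breaks):
    # Stub standing in for the real helper: density + spread for bead i.
    return float(coords[:, i].sum())

if __name__ == "__main__":
    coords = np.random.rand(3, 100)             # 3 x n_beads
    mass = radius = np.ones(100)                # placeholder per-bead data
    grid, voxel_size, n_breaks = None, 4.0, 50  # placeholder parameters

    cores_ = min(max(os.cpu_count() - 1, 1), 4)
    chunksize, extra = divmod(coords.shape[1], cores_ * 4)
    if extra:
        chunksize += 1

    worker = partial(get_bead_spread, coords=coords, mass=mass, radius=radius,
                     grid=grid, voxel_size=voxel_size, n_breaks=n_breaks)
    with Pool(cores_) as p:
        # Single parallel pass: one result per bead, no intermediate
        # density collection phase.
        bead_spread = list(p.imap(worker, range(coords.shape[1]), chunksize=chunksize))
    print(len(bead_spread))  # 100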

Changes

Cohort / File(s): Bead Spread Parallelization Refactor (src/main.py)
Summary: Introduced a get_bead_spread() helper function that merges density computation and spread derivation. Reworked run_prism() to replace the two-phase parallelization (collect densities, then compute spreads) with single-phase direct iteration using p.imap(), dynamic core selection, and a calculated chunksize.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hops through parallelized beads,
No two-phase dance, just one stream's speed,
Chunksize and cores align with grace,
Density spreads at quickened pace!

🚥 Pre-merge checks: ✅ 3 passed

  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.
  • Title check: ✅ Passed. The title clearly summarizes the main change, introducing imap with a heuristic chunksize to speed up the bead spread calculation, which directly corresponds to the primary modifications in the PR.
  • Description check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.



@omgol411 changed the title from "imap with heuristic chunksize to speed up bead spread calculation addressing #39" to "imap with heuristic chunksize to speed up bead spread calculation" on Apr 13, 2026
@coderabbitai (Bot) left a comment

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b748e6a2-786b-4150-8997-4bf997ae2445

📥 Commits

Reviewing files that changed from the base of the PR and between 37f1bce and 112ea5a.

📒 Files selected for processing (1)
  • src/main.py

Comment thread: src/main.py
Comment on lines +86 to +90

def get_bead_spread(i, coords, mass, radius, grid, voxel_size, n_breaks):
    # dummy function to call main_density_calc for parallelization
    density = main_density_calc(i, coords, mass, radius, grid, voxel_size, n_breaks)
    spread = calc_bead_spread(density, grid)
    return spread

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
from functools import partial
import inspect
import multiprocessing.pool as mp_pool
import pickle

def worker(i, coords, mass, radius, grid, voxel_size, n_breaks):
    return i

coords = b"x" * 5_000_000
mass = [0] * 5000
radius = [0] * 5000
grid = {"cells": list(range(5000))}
bound = partial(worker, coords=coords, mass=mass, radius=radius, grid=grid, voxel_size=4, n_breaks=50)

print("serialized_partial_bytes =", len(pickle.dumps(bound)))
print("--- multiprocessing.pool.Pool._get_tasks ---")
print(inspect.getsource(mp_pool.Pool._get_tasks))
PY

Repository: isblab/prism

Length of output: 352


🏁 Script executed:

# Read the file to check if it exists and get size
wc -l src/main.py && head -1 src/main.py

Repository: isblab/prism

Length of output: 90


🏁 Script executed:

# Read the relevant sections of the file
sed -n '80,120p' src/main.py | cat -n

Repository: isblab/prism

Length of output: 2045


🏁 Script executed:

# Check for functools import and verify partial is used
grep -n "import\|from" src/main.py | head -20

Repository: isblab/prism

Length of output: 573


🏁 Script executed:

# Verify the exact line numbers of Pool usage and get_bead_spread definition
grep -n "def get_bead_spread\|with Pool\|partial(" src/main.py

Repository: isblab/prism

Length of output: 381


Use initializer= instead of binding large objects via partial().

The helper function at line 86 forces line 113 to close over coords, mass, radius, and grid using functools.partial. With multiprocessing.Pool, the bound partial is serialized into every task batch instead of just the bead index i. On large inputs, this ~5 MB serialization overhead per batch can dominate runtime and memory, negating the parallelization speedup.

Load the shared data once per worker via initializer= and keep only the index in the queued argument:

Suggested pattern
+_bead_spread_ctx = {}
+
+def _init_bead_spread_worker(coords, mass, radius, grid, voxel_size, n_breaks):
+	_bead_spread_ctx.update({
+		"coords": coords,
+		"mass": mass,
+		"radius": radius,
+		"grid": grid,
+		"voxel_size": voxel_size,
+		"n_breaks": n_breaks,
+	})
+
-def get_bead_spread(i, coords, mass, radius, grid, voxel_size, n_breaks):
-	# dummy function to call main_density_calc for parallelization
-	density = main_density_calc(i, coords, mass, radius, grid, voxel_size, n_breaks)
-	spread = calc_bead_spread(density, grid)
+def get_bead_spread(i):
+	density = main_density_calc(
+		i,
+		_bead_spread_ctx["coords"],
+		_bead_spread_ctx["mass"],
+		_bead_spread_ctx["radius"],
+		_bead_spread_ctx["grid"],
+		_bead_spread_ctx["voxel_size"],
+		_bead_spread_ctx["n_breaks"],
+	)
+	spread = calc_bead_spread(density, _bead_spread_ctx["grid"])
 	return spread
-	with Pool(cores_) as p:
+	with Pool(
+		cores_,
+		initializer=_init_bead_spread_worker,
+		initargs=(coords, mass, radius, grid, args.voxel_size, args.n_breaks),
+	) as p:
 		bead_spread = []
-		for spread in tqdm.tqdm( p.imap( partial(get_bead_spread, coords=coords, mass=mass, radius=radius, grid=grid, voxel_size=args.voxel_size, n_breaks=args.n_breaks), range(coords.shape[1]), chunksize=chunksize ) ):
+		for spread in tqdm.tqdm( p.imap( get_bead_spread, range(coords.shape[1]), chunksize=chunksize ) ):
 			bead_spread.append( spread )
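
For readers unfamiliar with the mechanism the diff above applies, here is a self-contained runnable sketch of the initializer pattern. All names here are illustrative, not taken from src/main.py. Note that the Pool keyword is initargs (not initializer_args): the initializer runs once per worker process, so the large objects are pickled once per worker rather than once per task batch.

from multiprocessing import Pool

_ctx = {}  # per-worker module-level storage, filled in once per worker

def _init_worker(big_list, scale):
    # Runs once in each worker process at startup; subsequent tasks read
    # the shared data from _ctx instead of receiving it in every batch.
    _ctx["big_list"] = big_list
    _ctx["scale"] = scale

def _task(i):
    # Each queued task now carries only the integer index i.
    return _ctx["big_list"][i] * _ctx["scale"]

if __name__ == "__main__":
    data = list(range(1_000_000))
    with Pool(4, initializer=_init_worker, initargs=(data, 10)) as p:
        print(list(p.imap(_task, range(8), chunksize=2)))
    # -> [0, 10, 20, 30, 40, 50, 60, 70]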
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.py` around lines 86 - 90: get_bead_spread currently forces binding of large objects via functools.partial, which causes huge per-task serialization. Instead, add an initializer function (e.g., init_worker) that accepts and stores module-level globals (coords, mass, radius, grid, voxel_size, n_breaks) for each worker; change get_bead_spread(i) to accept only the bead index and call main_density_calc(i, coords, mass, radius, grid, voxel_size, n_breaks) and calc_bead_spread(density, grid) using those module-level variables; and update the Pool construction to use initializer=init_worker with initargs=(coords, mass, radius, grid, voxel_size, n_breaks), mapping only the indices (removing the functools.partial usage).

Comment thread: src/main.py
Comment on lines +106 to +109

cores_ = min(max(os.cpu_count() - 1, 1), args.cores)
chunksize, extra = divmod(coords.shape[1], cores_ * 4)
if extra:
    chunksize += 1

⚠️ Potential issue | 🟡 Minor

Clamp the derived worker count before chunk sizing.

args.cores is unvalidated here, so --cores 0 or a negative value makes cores_ non-positive and breaks the arithmetic in this block. Clamp the CLI value before taking the minimum.

🩹 Minimal fix
-	cores_ = min(max(os.cpu_count() - 1, 1), args.cores)
+	cores_ = min(max(os.cpu_count() - 1, 1), max(args.cores, 1))
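
Alternatively, the bad value can be rejected at the CLI boundary. A hedged sketch using an argparse type callback; positive_int and the parser wiring here are invented for illustration, and PrISM's actual argument parsing may differ:

import argparse
import os

def positive_int(value):
    # argparse 'type' callback: rejects 0 and negatives at parse time,
    # so the chunk-sizing code never sees a non-positive core count.
    ivalue = int(value)
    if ivalue < 1:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {value!r}")
    return ivalue

parser = argparse.ArgumentParser()
parser.add_argument("--cores", type=positive_int, default=1)
args = parser.parse_args(["--cores", "4"])

# The derived worker count is then guaranteed to be >= 1:
cores_ = min(max(os.cpu_count() - 1, 1), args.cores)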
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.py` around lines 106 - 109: the current computation uses args.cores directly, so a zero or negative CLI value makes cores_ non-positive and breaks chunk sizing. Clamp/validate args.cores first (e.g., convert it to int and ensure it is at least 1, and optionally at most os.cpu_count() - 1), then compute cores_ from that validated value; update the logic around cores_, args.cores, coords.shape[1], chunksize, and extra so cores_ is always >= 1 before performing divmod on coords.shape[1].

@omgol411 (Contributor, Author) commented Apr 13, 2026

Regarding the patch suggested by CodeRabbit: we don't need to use functools.partial. But we also don't need to pass the coordinates, radii, and masses of all beads to main_density_calc; that function only needs them for one bead at a time. However, this problem is not directly relevant to this PR; I'm addressing it in a separate PR for #42.

@omgol411 omgol411 merged commit 94fa67c into isblab:main Apr 13, 2026
2 checks passed
@omgol411 omgol411 deleted the bead_spread_fix branch April 13, 2026 10:58
