Fix addressing #38 and #39 by omgol411 · Pull Request #40 · isblab/prism

omgol411 · 2026-04-12T13:23:41Z

~~Addressing #38 , output_dir argument from run_prism was removed~~

For addressing #39 , the following changes were made

imap with chunksize > 1 to speed up both density and bead spread calculation

Summary by CodeRabbit

Refactor
- Simplified parallel processing to a single streamlined workflow with improved chunking and CPU utilization
- Introduced a helper to consolidate per-item computation and conversion steps
- Standardized output directory handling to always create and use a single output path
Style
- Cleaned up file-read indentation for consistent formatting

coderabbitai · 2026-04-12T13:23:54Z

📝 Walkthrough

Walkthrough

A new helper get_bead_spread(...) was added and the parallel workflow in src/main.py was simplified to a single Pool.imap that computes bead spread per index with an explicit chunksize derived from cores_ = min(os.cpu_count() - 1, args.cores). Output directory creation now always calls os.makedirs(output_dir, exist_ok=True) after defaulting output_dir = args.output. The top-level ihm_parser import was removed in favor of inline parse_ihm_models import. Minor reindentation in src/ihm_parser.py only.

Changes

Cohort / File(s)	Summary
Bead spread & parallel flow `src/main.py`	Added `get_bead_spread(i, coords, mass, radius, grid, voxel_size, n_breaks)`. Replaced two-stage parallel map (density then spread) with a single `Pool(...).imap(get_bead_spread, ...)` using explicit `chunksize` computed from `cores_ = min(os.cpu_count() - 1, args.cores)` and `divmod(..., cores_*4)`.
Output directory handling `src/main.py`	Defaulted `output_dir = args.output` and unconditionally call `os.makedirs(output_dir, exist_ok=True)`; all subsequent writes use `output_dir`. Removed prior conditional/existing-directory reassignment logic.
Import adjustments `src/main.py`	Removed unconditional top-level `ihm_parser` import; `parse_ihm_models` is imported inline where needed.
IHM parser formatting `src/ihm_parser.py`	Reindented the `with open(..., encoding='utf8')` / `ihm.reader.read(...)` block to align with surrounding `try/except`; no functional changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰
I hopped through code with careful care,
Merged spread and density into one fair share,
Cores trimmed tidy, chunks sliced right,
Folders made ready in the night,
A rabbit’s joy — computations light!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Fix addressing `#38` and `#39`' references the issue numbers but is vague and does not clearly summarize the main technical changes (refactored multiprocessing with imap/chunksize, simplified density/bead spread calculation, output directory handling).	Revise the title to be more descriptive of the actual changes, such as 'Refactor multiprocessing with imap chunksize and combine density/bead spread calculations' or similar.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main.py`:
- Line 85: The run_prism signature was changed and no longer accepts a per-call
output path, breaking IHM callers that pass f"output_{ist}_{img}" and expect
artifacts in that folder; restore per-call output handling by adding an output
parameter to run_prism (e.g., def run_prism(coords, mass, radius, ps_names,
args, output=None)) and, at the start of run_prism, set args.output = output if
output is provided (falling back to the existing args.output when None) so
existing behavior remains backward compatible; ensure callers in ihm_parser.py
that currently pass a per-model folder continue to pass it and any other
invocations without an output argument still work.
- Around line 98-101: Replace the broken NumPy usage and len() calls by
computing an integer worker count safely: get system_cpus = os.cpu_count() or 1,
then compute worker_count = max(1, min(system_cpus - 1, args.cores)) (ensure
system_cpus-1 doesn't go below 1), use that integer in the chunksize calculation
(chunksize, extra = divmod(coords.shape[1], worker_count * 4)) and adjust the
extra-handling branch; refer to the existing variables cores_, chunksize,
coords.shape, and args.cores when making this change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f675bc40-8d53-4e7c-8780-d491f77d818d

📥 Commits

Reviewing files that changed from the base of the PR and between 37f1bce and ccddf3b.

📒 Files selected for processing (1)

src/main.py

coderabbitai · 2026-04-12T13:27:52Z

+	cores_ = np.min(os.cpu_count() - 1, args.cores)
+	chunksize, extra = divmod(coords.shape[1], len(cores_) * 4)
+	if extra:
+        chunksize += 1


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

What is the signature of numpy.min, and does np.min(x, y)compare two scalar values or treatyas theaxisparameter? Also, what happens in Python when callinglen() on an integer?

💡 Result:

The signature of numpy.min (np.min) is np.min(a, axis=None, out=None, keepdims=, initial=, where=True). It computes the minimum of the array elements. np.min(x, y) where x and y are scalars treats y as the axis parameter (likely resulting in an error if y is not a valid axis index, e.g., TypeError or IndexError), not as a second value to compare. To compare two scalars, use np.minimum(x, y) instead. Calling len on an integer in Python raises TypeError: object of type 'int' has no len. Integers are not iterable or sequence types that support len.

Citations:

1: https://numpy.org/doc/stable/reference/generated/numpy.min.html

2: https://numpy.org/doc/2.2/reference/generated/numpy.min.html

3: https://numpy.org/doc/1.25/reference/generated/numpy.min.html

4: https://www.datacamp.com/doc/numpy/min

5: https://numpy.org/doc/1.19/reference/generated/numpy.amin.html

6: https://numpy.org/doc/1.21/reference/generated/numpy.minimum.html

7: https://medium.com/@amit25173/understanding-numpy-minimum-with-examples-ac660d10296c

8: https://stackoverflow.com/questions/35204529/typeerror-object-of-type-int-has-no-len-error-assistance-needed

9: https://stackoverflow.com/questions/73825057/python-len-function-not-working-as-expected-and-giving-me-the-error-typeerro

10: https://medium.com/@abuolubunmi21/understanding-typeerror-using-len-with-integers-in-python-77546c4a6cc4

11: https://techgeekbuzz.com/blog/python-typeerror-object-of-type-int-has-no-len-solution

12: https://careerkarma.com/blog/python-typeerror-object-of-type-int-has-no-len/

🏁 Script executed:

cd /tmp && find . -name "main.py" -type f 2>/dev/null | head -5

Repository: isblab/prism

Length of output: 38

🏁 Script executed:

git ls-files | grep -E "main\.py|src/" | head -20

Repository: isblab/prism

Length of output: 342

🏁 Script executed:

wc -l src/main.py

Repository: isblab/prism

Length of output: 71

🏁 Script executed:

sed -n '95,105p' src/main.py

Repository: isblab/prism

Length of output: 361

🏁 Script executed:

grep -n "args.cores\|argparse\|add_argument.*cores" src/main.py | head -20

Repository: isblab/prism

Length of output: 272

🏁 Script executed:

head -50 src/main.py

Repository: isblab/prism

Length of output: 1990

Fix the worker-count/chunk-size calculation before opening the pool.

np.min(os.cpu_count() - 1, args.cores) misuses the NumPy API—the second argument is treated as the axis parameter, not a value to compare. Line 99 then calls len() on the resulting scalar, which will raise TypeError: object of type 'int' has no len(). This code will fail before either Pool(...) starts, and it can also produce 0 workers on single-core or unknown-CPU hosts.

Suggested fix

- cores_ = np.min(os.cpu_count() - 1, args.cores) - chunksize, extra = divmod(coords.shape[1], len(cores_) * 4) - if extra: - chunksize += 1 + cpu_count = os.cpu_count() or 1 + usable_cores = cpu_count - 1 if cpu_count > 1 else 1 + cores_ = max(1, min(usable_cores, args.cores)) + chunksize, extra = divmod(coords.shape[1], cores_ * 4) + if extra: + chunksize += 1 + chunksize = max(1, chunksize)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

cores_ = np.min(os.cpu_count() - 1, args.cores)

chunksize, extra = divmod(coords.shape[1], len(cores_) * 4)

if extra:

chunksize += 1

cpu_count = os.cpu_count() or 1

usable_cores = cpu_count - 1 if cpu_count > 1 else 1

cores_ = max(1, min(usable_cores, args.cores))

chunksize, extra = divmod(coords.shape[1], cores_ * 4)

if extra:

chunksize += 1

chunksize = max(1, chunksize)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/main.py` around lines 98 - 101, Replace the broken NumPy usage and len() calls by computing an integer worker count safely: get system_cpus = os.cpu_count() or 1, then compute worker_count = max(1, min(system_cpus - 1, args.cores)) (ensure system_cpus-1 doesn't go below 1), use that integer in the chunksize calculation (chunksize, extra = divmod(coords.shape[1], worker_count * 4)) and adjust the extra-handling branch; refer to the existing variables cores_, chunksize, coords.shape, and args.cores when making this change.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/main.py (1)

65-85: ⚠️ Potential issue | 🔴 Critical

Harden worker/chunksize calculation and use the computed worker count.

Line 65 can still fail when os.cpu_count() is None, and on single-core hosts it can yield 0, which makes Line 66 divide by zero. Also, Line 80 should use the computed worker count, not args.cores.

Proposed fix

-	cores_ = min(os.cpu_count() - 1, args.cores)
-	chunksize, extra = divmod(coords.shape[1], cores_ * 4)
+	cpu_count = os.cpu_count() or 1
+	usable_cores = max(1, cpu_count - 1)
+	cores_ = max(1, min(usable_cores, args.cores))
+	chunksize, extra = divmod(coords.shape[1], cores_ * 4)
 	if extra:
 		chunksize += 1
+	chunksize = max(1, chunksize)
@@
-	with Pool(args.cores) as p:
+	with Pool(cores_) as p:

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/main.py` around lines 65 - 85, Protect against os.cpu_count() returning
None and single-core hosts by computing workers = max(1, (os.cpu_count() or 1) -
1, 1) (or equivalent) and use that variable (e.g., cores_ or workers)
everywhere; ensure the divmod denominator uses workers * 4 and never zero,
compute chunksize with a safe fallback (e.g., if coords.shape[1] == 0 then
chunksize = 1) and only increment chunksize when extra > 0, and finally pass the
computed workers to Pool(...) instead of args.cores; update references to
_part_func, chunksize, extra and the Pool(...) call to use the new workers
variable consistently.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main.py`:
- Line 145: The --return_spread argument is defined with action='store_true' but
default=True so it is always True; update the parser.add_argument call for
"--return_spread" to make it togglable by either setting default=False when
keeping action='store_true' or switching to argparse.BooleanOptionalAction to
provide paired --return-spread / --no-return-spread flags (referencing the
parser.add_argument invocation for "--return_spread" and the action parameter).

---

Duplicate comments:
In `@src/main.py`:
- Around line 65-85: Protect against os.cpu_count() returning None and
single-core hosts by computing workers = max(1, (os.cpu_count() or 1) - 1, 1)
(or equivalent) and use that variable (e.g., cores_ or workers) everywhere;
ensure the divmod denominator uses workers * 4 and never zero, compute chunksize
with a safe fallback (e.g., if coords.shape[1] == 0 then chunksize = 1) and only
increment chunksize when extra > 0, and finally pass the computed workers to
Pool(...) instead of args.cores; update references to _part_func, chunksize,
extra and the Pool(...) call to use the new workers variable consistently.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e10a549-ec36-447b-b4ed-b62334f48355

📥 Commits

Reviewing files that changed from the base of the PR and between ccddf3b and 2f2371f.

📒 Files selected for processing (2)

src/ihm_parser.py
src/main.py

✅ Files skipped from review due to trivial changes (1)

src/ihm_parser.py

coderabbitai · 2026-04-12T15:51:13Z

+		help="Type of input: npz/pdb/cif/rmf/dcd.",
+		required=True, type=str)
+	parser.add_argument("--voxel_size", "-v", help="Voxel size for density calculations", default=4, type=int)
+	parser.add_argument("--return_spread", "-rs", help="Return the spread bead_spread", action='store_true', default = True)


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's check if src/main.py exists and examine the area around line 145 if [ -f src/main.py ]; then echo "=== Checking src/main.py around line 145 ===" wc -l src/main.py sed -n '140,150p' src/main.py else echo "src/main.py not found at expected location" # Try to find main.py find . -name "main.py" -type f fi

Repository: isblab/prism

Length of output: 1113

🏁 Script executed:

# Search for any related --return-spread or --no-return-spread arguments rg "return.spread|return_spread" --max-count 20

Repository: isblab/prism

Length of output: 2391

🏁 Script executed:

# Look for other argparse action='store_true' patterns to understand the codebase's conventions rg "action=.*store" -A 1 src/main.py

Repository: isblab/prism

Length of output: 269

--return_spread flag is non-toggleable and always evaluates to True.

With action='store_true' and default=True, the argument will be True whether the flag is provided or absent, making it impossible for users to disable this feature. Either change to default=False if toggling is desired, or use argparse.BooleanOptionalAction (Python 3.9+) with paired flags like --return-spread and --no-return-spread.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/main.py` at line 145, The --return_spread argument is defined with action='store_true' but default=True so it is always True; update the parser.add_argument call for "--return_spread" to make it togglable by either setting default=False when keeping action='store_true' or switching to argparse.BooleanOptionalAction to provide paired --return-spread / --no-return-spread flags (referencing the parser.add_argument invocation for "--return_spread" and the action parameter).

shruthivis

The diff has too many unrelated changes in too many unrelated lines.
Stick to 1 change in 1 PR. Submit a PR addressing only the imap issue with minimal code changes to make it work as expected.
Don't move around lines unless required (e.g. bead_spread=[] initialization).

coderabbitai Bot reviewed Apr 12, 2026

View reviewed changes

omgol411 requested a review from shruthivis April 12, 2026 16:13

shruthivis requested changes Apr 13, 2026

View reviewed changes

omgol411 closed this Apr 13, 2026

omgol411 force-pushed the main branch from 2f2371f to 37f1bce Compare April 13, 2026 05:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix addressing #38 and #39#40

Fix addressing #38 and #39#40
omgol411 wants to merge 0 commit intoisblab:mainfrom
omgol411:main

omgol411 commented Apr 12, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Apr 12, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 12, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 12, 2026

Uh oh!

shruthivis left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

omgol411 commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

shruthivis left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

omgol411 commented Apr 12, 2026 •

edited

Loading

coderabbitai Bot commented Apr 12, 2026 •

edited

Loading