
cNMF_program_analysis.py pre-loop correlation step balloons memory past 700 GB on moderately-sized datasets #7

@adamklie

Description

Problem

Stage3_Interpretation/A_Plotting/Slurm_Version/cNMF_program_analysis.py:109–115 runs two correlation precomputes before the per-program plot loop:

# compute correlations
waterfall_correlation = {}

for samp in args.sample:
    precomputed = f"{args.corr_matrix_path}/corr_program_matrix_{samp}.txt" if args.corr_matrix_path else None
    save = ...
    df = compute_program_waterfall_cor(f"{args.perturb_path_base}_{samp}.txt", ...)
    waterfall_correlation[samp] = df

program_correlation = compute_program_correlation_matrix(mdata)
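
One way to localize the growth (ask 1 below) is to bracket this block with tracemalloc snapshots; recent NumPy versions register array buffers with tracemalloc, so large intermediate allocations show up too. A generic sketch, not something the script currently does:

import tracemalloc

tracemalloc.start()
# ... run the correlation precompute shown above ...
snapshot = tracemalloc.take_snapshot()
# print the ten biggest allocation sites by cumulative size
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)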

What these are doing:

  • compute_program_waterfall_cor reads the per-K perturbation results, pivots them to a programs × targets matrix of log2FC values, and computes the program × program correlation across targets. The output is a K×K matrix (e.g. 200×200).
  • compute_program_correlation_matrix densifies mdata['cNMF'].X (cells × K program usages), wraps it in a DataFrame, and computes the program × program correlation across cells. (A sketch of what both steps plausibly look like follows this list.)
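
For orientation, a minimal sketch of the two steps; the column names (program, target, log2FC) and function bodies are inferred from the descriptions above, not copied from the repo:

import pandas as pd
import scipy.sparse as sp

def waterfall_cor_sketch(perturb_path):
    # assumed long format: one row per (program, target) with a log2FC column
    df = pd.read_csv(perturb_path, sep="\t")
    wide = df.pivot_table(index="program", columns="target", values="log2FC")
    # correlate programs pairwise across targets -> K x K
    return wide.T.corr()

def program_cor_sketch(mdata):
    X = mdata["cNMF"].X  # cells x K usage matrix
    if sp.issparse(X):
        X = X.toarray()  # densifies the full cells x K array at once
    # correlate program columns pairwise across cells -> K x K
    return pd.DataFrame(X).corr()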

Observation

Running on a single-cell dataset with ~270,000 cells × ~36,000 genes, K=200 programs, ~2,000 perturbation targets, and 1 sample, the job never gets past the precompute step to the per-program plotting loop:

SLURM --mem   Outcome
256 GB        OOM at ~54 min
480 GB        OOM at ~1 h 38 min; MaxRSS 494 GB
700 GB        at 1 h 02 min: MaxRSS 676 GB, no PDFs written; would have OOM'd shortly after

A consistent symptom: only the SLURM-startup/config files appear in the output directory; no per-program PDF is ever produced, because the loop is never reached.

The two correlation functions themselves should be cheap: the outputs are 200×200 float matrices, a few hundred kilobytes each. Memory ramping by tens of GB per minute suggests something else is accumulating in the precompute, possibly intermediate dense materializations in pivot_table/corr, or references accumulating in the surrounding code path. The arithmetic below bears this out.
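
Back-of-the-envelope sizes for the arrays in play (float64, dataset dimensions from the observation above):

GB = 1024 ** 3
cells, genes, K, targets = 270_000, 36_000, 200, 2_000

print(cells * K * 8 / GB)      # usage matrix, cells x K:          ~0.40 GB
print(K * targets * 8 / GB)    # pivoted log2FC, K x targets:      ~0.003 GB
print(K * K * 8 / GB)          # correlation output, K x K:        ~0.0003 GB
print(cells * genes * 8 / GB)  # dense expression, cells x genes:  ~72 GB

Even an accidental densification of the full cells × genes expression matrix costs only ~72 GB per copy; reaching ~700 GB would take on the order of ten live copies, which again points at accumulating references rather than a single oversized allocation.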

Asks

  1. Profile the precompute step to identify the actual memory hog (the functions as documented don't account for the observed growth).
  2. If the precompute is intrinsically heavy, support chunked/streaming computation (see the sketch after this list), or let callers skip it entirely by pointing --corr_matrix_path at a pre-computed file (the argument already exists, but the precompute still runs and recomputes the matrix to cache it).
  3. Document expected memory scaling vs (cells, K, n_targets) so users can size SLURM allocations appropriately.
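
For ask 2, one possible shape for the chunked path: Pearson correlation can be assembled from streaming sums over row chunks, so only one chunk is ever dense at a time. A minimal sketch, assuming mdata['cNMF'].X supports row slicing (function and parameter names here are illustrative, not existing code):

import numpy as np
import scipy.sparse as sp

def chunked_program_correlation(X, chunk_size=50_000):
    # K x K Pearson correlation of the columns of X (cells x K)
    n, k = X.shape
    col_sum = np.zeros(k)
    cross = np.zeros((k, k))  # accumulates X^T X
    for start in range(0, n, chunk_size):
        chunk = X[start:start + chunk_size]
        if sp.issparse(chunk):
            chunk = chunk.toarray()  # only chunk_size x K is dense at once
        chunk = np.asarray(chunk, dtype=np.float64)
        col_sum += chunk.sum(axis=0)
        cross += chunk.T @ chunk
    mean = col_sum / n
    cov = cross / n - np.outer(mean, mean)  # E[xy] - E[x]E[y]
    std = np.sqrt(np.clip(np.diag(cov), 1e-12, None))
    return cov / np.outer(std, std)

With K=200 the accumulators stay under 1 MB, and peak memory is a single 50,000 × 200 float64 chunk (~80 MB).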
