
compute_fake_perturbation_tests crashes with AttributeError: 'Namespace' object has no attribute 'reference_targets' #2

@adamklie

Description


Summary

Running the U-test calibration script with --compute_fake_perturbation_tests causes the script to crash at line 160 because it references args.reference_targets, which is never defined on the argparse namespace.

Reproduction

python src/Stage2_Evaluation/B_Calibration/Slurm_version/U-test_perturbation_calibration/U-test_perturbation_calibration.py \
    --out_dir <out> \
    --run_name <run> \
    --mdata_guide_path <h5mu> \
    --guide_annotation_path <tsv> \
    --guide_annotation_key "non-targeting" \
    ... \
    --components 30 50 60 80 100 200 250 300 \
    --sel_thresh 2.0 \
    --compute_fake_perturbation_tests

The fake-test loop runs the first iteration successfully (loads the h5mu, picks 6 fake-targeting guides, subsets to NT guides), then crashes when it tries to call compute_perturbation_association:

Processing K=30, sel_thresh=2.0
  Running iteration 1/50
  Found 600 valid non-targeting guides out of 14151 total
Traceback (most recent call last):
  File ".../U-test_perturbation_calibration.py", line 385, in main
    test_stats_fake_df = compute_fake_perturbation_tests()
  File ".../U-test_perturbation_calibration.py", line 160, in compute_fake_perturbation_tests
    reference_targets=args.reference_targets,
AttributeError: 'Namespace' object has no attribute 'reference_targets'. Did you mean: 'reference_gtf_path'?

Root cause

src/Stage2_Evaluation/B_Calibration/Slurm_version/U-test_perturbation_calibration/U-test_perturbation_calibration.py:160:

test_stats_df = compute_perturbation_association(
    mdata_samp,
    prog_key=args.prog_key,
    collapse_targets=True,
    pseudobulk=False,
    reference_targets=args.reference_targets,  # <-- args.reference_targets is undefined
    FDR_method=args.FDR_method,
    n_jobs=-1,
    inplace=False
)

The argparser defines --guide_annotation_key (default ['non-targeting']) but never --reference_targets. The real-test path at lines 44–49 uses a local variable reference_targets, computed from either the annotation TSV or args.guide_annotation_key. The fake-test path was likely copy-pasted from the real-test path, but reference_targets was left pointing at the non-existent namespace attribute instead of the local variable.
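For context, the real-test fallback described above amounts to something like the following sketch. This is a hypothetical reconstruction (the function name `resolve_reference_targets` and the TSV column name `target` are assumptions; the actual code at lines 44–49 may differ):

```python
# Hypothetical sketch of the real-test fallback logic (cf. lines 44-49).
# Assumption: the annotation TSV has a 'target' column; exact names in
# the script may differ.
import argparse

import pandas as pd


def resolve_reference_targets(args: argparse.Namespace) -> list:
    """Derive reference_targets the way the real-test path does."""
    if getattr(args, "guide_annotation_path", None):
        # Derive the reference-target list from the annotation TSV.
        annot = pd.read_csv(args.guide_annotation_path, sep="\t")
        mask = annot["target"].isin(args.guide_annotation_key)
        return sorted(annot.loc[mask, "target"].unique().tolist())
    # Otherwise fall back to the CLI value, e.g. ['non-targeting'].
    return list(args.guide_annotation_key)
```

Reusing a helper like this in both code paths would also prevent the two paths from drifting apart again.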

git blame shows this line has been present since the initial commit (5ca68dc, 2026-04-23), so it predates the "Fix 7 bugs in U-test_perturbation_calibration.py" commit (a37a60b, 2026-05-01). It's an 8th typo in the same file that the fix-7 commit didn't catch.

Why this wasn't caught earlier

This bug has existed since day 1 of the repo and only surfaces when running the new PerturbNMF script end-to-end on a real dataset with --compute_fake_perturbation_tests.

Pre-existing fake-test outputs from the Engreitz lab (e.g., 5_fake_perturbation_association_results.txt-style files) appear to come from the older internal cNMF_benchmarking tool (referenced in commented-out paths inside cNMF_evaluation_pipeline.py, e.g. /oak/.../cNMF_benchmarking/cNMF_benchmarking_pipeline/Evaluation/...). PerturbNMF is a publishable rewrite of that internal tool, and the rewrite introduced typo-class bugs (this one + the 7 fixed in a37a60b) that wouldn't have been triggered by running the original tool. We appear to be the first external users to drive the new PerturbNMF U-test code end-to-end on a fresh dataset, which is why this and adjacent issues are surfacing now rather than being caught during the rewrite.

Proposed fix

Mirror the fallback logic from the real-test path (lines 44–49). If a TSV is provided, derive the reference-target list from it; otherwise fall back to args.guide_annotation_key.

@@ -157,7 +157,7 @@ def compute_fake_perturbation_tests():
                         prog_key=args.prog_key,
                         collapse_targets=True,
                         pseudobulk=False,
-                        reference_targets=args.reference_targets,
+                        reference_targets=args.guide_annotation_key,
                         FDR_method=args.FDR_method,
                         n_jobs=-1,
                         inplace=False

In the fake-test code path, the relabeled NT subset has target ∈ {'non-targeting', 'targeting'} (see line 149), so reference_targets=['non-targeting'] is the right choice. Using args.guide_annotation_key matches the real-test fallback and respects user override.
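To illustrate why ['non-targeting'] is the right reference group, here is a hypothetical mock-up of the relabeling step (cf. line 149). The guide names and the 6/6 split are illustrative only:

```python
# Hypothetical illustration of the fake-test relabeling: a random subset
# of NT guides is relabeled 'targeting'; the rest stay 'non-targeting'.
# reference_targets=['non-targeting'] then designates the control group
# for the U-test.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
nt_guides = pd.DataFrame({"guide": [f"NT_{i}" for i in range(12)]})
nt_guides["target"] = "non-targeting"

# Pick 6 fake-targeting guides, as in the log output above.
fake_idx = rng.choice(nt_guides.index, size=6, replace=False)
nt_guides.loc[fake_idx, "target"] = "targeting"
```

After this step the only labels present are 'non-targeting' and 'targeting', so any reference_targets value other than ['non-targeting'] would silently match nothing.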

Latent issues in the same function (out of scope for this PR)

While debugging, two adjacent issues surfaced. Mentioning here so they can be tracked separately:

  1. Line 391 in main(): when --visualizations is set without --compute_real_perturbation_tests, test_stats_real_df is undefined and pd.concat([test_stats_real_df, test_stats_fake_df], ...) raises NameError. Either guard the concat on args.compute_real_perturbation_tests, or auto-load via load_real_perturbation_tests() if real-test results already exist on disk.

  2. Line 199 in load_real_perturbation_tests(): hardcoded for samp in ['D0', 'D4', 'D7']. This function is currently unreachable from main() (so harmless today), but if (1) is fixed by calling load_real_perturbation_tests(), this hardcode would break for any non-D0/D4/D7 dataset.
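A possible shape for the guard in issue (1) is sketched below. The helper name `collect_test_stats` and the `test_type` column are assumptions for illustration; the real main() may structure this differently:

```python
# Hypothetical guard for latent issue (1): only concatenate the frames
# that were actually computed, instead of referencing an undefined
# test_stats_real_df.
import pandas as pd


def collect_test_stats(args, test_stats_real_df=None, test_stats_fake_df=None):
    frames = []
    if getattr(args, "compute_real_perturbation_tests", False) and test_stats_real_df is not None:
        frames.append(test_stats_real_df.assign(test_type="real"))
    if getattr(args, "compute_fake_perturbation_tests", False) and test_stats_fake_df is not None:
        frames.append(test_stats_fake_df.assign(test_type="fake"))
    if not frames:
        # Nothing was computed; fail loudly instead of raising NameError.
        raise ValueError("No test statistics computed; nothing to visualize.")
    return pd.concat(frames, ignore_index=True)
```

The alternative mentioned above (auto-loading via load_real_perturbation_tests()) would slot into the first branch, but only after the hardcoded sample list in issue (2) is addressed.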

Environment

  • PerturbNMF main @ 8f7c9dd (also reproduces on 4bf662a)
  • Python 3.10, pandas, scipy, multipy, etc.
  • Running on Carter HPC (UCSD) — first end-to-end run of the new pipeline on Huangfu HUES8 datasets.
