fix: unpin ribodetector GPU from pytorch-gpu=1.11.0 (__cuda workaround no longer needed)#11258

Merged
pinin4fjords merged 7 commits into master from fix/ribodetector-containers-only
Apr 22, 2026
Conversation


@pinin4fjords pinin4fjords commented Apr 22, 2026

Summary

Unpin ribodetector's GPU container from pytorch-gpu=1.11.0. That version was the last release whose conda dependencies didn't require the __cuda virtual package, which is absent on Wave's GPU-less build servers. seqeralabs/wave#1027 (merged) removes that constraint by retrying failed solves with CONDA_OVERRIDE_CUDA set, so any post-1.11 pytorch-gpu can now be built.
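The Wave-side retry described above can be sketched locally. This is an illustrative shell fragment only: the real retry logic lives in Wave's builder, and the micromamba invocation is commented out because assuming its exact form in the conda/micromamba:v2 template would be a guess.

```shell
# On a GPU-less host the first solve fails because the __cuda virtual package
# is absent; CONDA_OVERRIDE_CUDA tells the solver to pretend a CUDA driver
# of that version is present.
export CONDA_OVERRIDE_CUDA=11.2
# micromamba create -y -n ribodetector-gpu -f environment.gpu.yml  # (assumed invocation)
echo "retrying solve with __cuda=${CONDA_OVERRIDE_CUDA}"
```

The same override is what makes any post-1.11 pytorch-gpu solvable on Wave's build servers.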

Given ribodetector is an inference-only CNN from 2022, the goal here is not to chase newer PyTorch; it's to unfreeze from the __cuda-avoidance pin while keeping host compatibility as wide as possible.

Choices

  • conda-forge::pytorch-gpu=2.1.0 - oldest post-__cuda-era pytorch-gpu on conda-forge that has a py<=3.10 build matrix overlap with bioconda::ribodetector=0.3.3 (which forces py<=3.10).
  • conda-forge::cuda-version=11.2 - lowest CUDA minor that 2.1.0's py<=3.10 builds target. NVIDIA driver floor ~450 (2020), which covers essentially every current HPC GPU host. For comparison, pinning at >=12,<13 silently resolves to cuda-version=12.9 (driver floor ~575, early 2025).
  • Both pins are exact (no ranges) per nf-core policy.
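Put together, the pinned environment.gpu.yml would look roughly like this; a sketch reconstructed from the choices above, not a verbatim copy of the merged file:

```yaml
channels:
  - conda-forge
  - bioconda
dependencies:
  - bioconda::ribodetector=0.3.3    # forces py<=3.10
  - conda-forge::pytorch-gpu=2.1.0  # oldest post-__cuda-era build with py<=3.10 overlap
  - conda-forge::cuda-version=11.2  # driver floor ~450; exact pin per nf-core policy
```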

Address @mashehu's review

  • Pin versions exactly, no ranges (cuda-version=11.2).
  • Capture the CUDA runtime version in the versions topic (emits cpu on the non-GPU path, actual CUDA version on the GPU path).
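The CUDA-version emit might look roughly like the following in the module's main.nf, using the nf-core topic-channel convention. The tuple shape and the `cudatoolkit` label are assumptions for illustration, not the merged code:

```nextflow
output:
// existing tool-version emit, plus the new CUDA runtime emit;
// torch.version.cuda is None on CPU-only builds, so the `or` fallback
// produces 'cpu' on the non-GPU path
tuple val("${task.process}"), val('cudatoolkit'),
      eval('python3 -c "import torch; print(torch.version.cuda or \'cpu\')"'),
      topic: versions
```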

Validation

Wave build of the new environment.gpu.yml with the conda/micromamba:v2 template succeeded and exercises the __cuda retry path (proves #1027 works for a real-world env):

community.wave.seqera.io/library/ribodetector_pytorch-gpu_cuda-version:fa9183da731515ea
oras://community.wave.seqera.io/library/ribodetector_pytorch-gpu_cuda-version:840843c8b08b83a5

nf-core modules lint ribodetector is clean apart from the known Wave-tag-version-heuristic warning shared with all GPU modules.

Note for Wave team

For this env, the default Wave CLI --await PT15M fires before the build completes (it takes ~20-25 minutes end-to-end, likely due to the __cuda retry plus a large solver matrix for 2.1-era builds). Tag ribodetector_pytorch-gpu_cuda-version:fa9183da731515ea should be enough to locate the build on the Wave backend.


Test plan

  • CI module tests (CPU path) pass
  • GPU container runs ribodetector on a sample dataset

Update GPU container from PyTorch 1.11.0 (CUDA 11.1, March 2022) to
PyTorch 2.10.0 (CUDA 12.9) and pin cuda-version>=12,<13 in
environment.gpu.yml to keep the solver within supported CUDA versions.

The old GPU container used PyTorch 1.11.0 because it was the last
version whose conda dependencies did not require the __cuda virtual
package, which is absent on Wave's GPU-less build servers. Wave now
handles this automatically via a two-pass solve (seqeralabs/wave#1027),
so we can build containers with current PyTorch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread modules/nf-core/ribodetector/environment.gpu.yml Outdated
…st host compat

Reframes the GPU-container refresh. The motivation here is not to chase
a newer PyTorch version; it's to unpin from 1.11.0, which was the last
release whose conda dependencies avoided the __cuda virtual package.
Wave #1027 (merged) removes that constraint, so any post-1.11 pytorch-gpu
can now be built.

Given ribodetector is an inference-only CNN from 2022 and has no use for
newer PyTorch features, the lowest post-__cuda pytorch-gpu on conda-forge
that has a py<=3.10 + low-CUDA build is pytorch-gpu=2.1.0 with
cuda-version=11.2. This maps to an NVIDIA driver floor of ~450 (2020),
covering essentially every current HPC GPU host - far wider than the
bleeding-edge 2.10.0 + cuda-version=12.9 combination (driver floor 575,
early 2025).

Address mashehu's review:

- Exact pins, no ranges (nf-core policy).
- cuda runtime version captured as a versions topic emit so the
  container's CUDA minor is visible in downstream provenance reports.
  Reports `cpu` on the non-GPU path.

Container hashes regenerated from the new environment.gpu.yml; the CPU
container is unchanged (environment.yml not touched).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/s and removed size/xs labels Apr 22, 2026
@pinin4fjords changed the title from "fix: update ribodetector GPU container to modern PyTorch/CUDA" to "fix: unpin ribodetector GPU from pytorch-gpu=1.11.0 (__cuda workaround no longer needed)" Apr 22, 2026
The new versions_cuda topic emit adds an output channel to the process,
which breaks snapshot equality for both the real and stub CPU tests.
Patch the snap file to include the new entry (`cpu` on the non-GPU path,
populated by eval at runtime).

GPU snapshot will be regenerated via the nf-core-bot workflow after this
lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread modules/nf-core/ribodetector/tests/main.nf.test.snap Outdated
pinin4fjords and others added 2 commits April 22, 2026 13:38
nf-test serialises object keys alphabetically; `versions_cuda` comes
before `versions_ribodetector` (c < r) in the actual snapshot output.
My previous edit had the reverse order which didn't match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
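The key-ordering point above is plain alphabetical comparison; a minimal Python illustration (Python stands in for nf-test's serialiser here, an assumption based on the commit message's description of its behaviour):

```python
# nf-test serialises snapshot keys alphabetically, so the new channel
# sorts before the existing one ('c' < 'r').
keys = ["versions_ribodetector", "versions_cuda"]
print(sorted(keys))  # ['versions_cuda', 'versions_ribodetector']
```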
@pinin4fjords
Member Author

@nf-core-bot update gpu snapshot path:modules/nf-core/ribodetector

Per mashehu: 'cpu' is not a version string, making it misleading inside
versions topic channels. Switch the eval fallback to 'no CUDA available',
which is unambiguous about what the task's pytorch build actually
supports. GPU path is unaffected (the eval's `or` only fires when
torch.version.cuda is None).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
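The eval's fallback reduces to Python's `or` applied to torch's CPU-build sentinel; a minimal sketch (the function name is hypothetical, but torch.version.cuda genuinely is None on CPU-only builds):

```python
def cuda_version_string(torch_cuda):
    """Mimic the eval's `or` fallback: torch.version.cuda is None on
    CPU-only PyTorch builds, a version string like '11.2' otherwise."""
    return torch_cuda or "no CUDA available"

print(cuda_version_string(None))    # non-GPU path -> no CUDA available
print(cuda_version_string("11.2"))  # GPU path -> 11.2
```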
@pinin4fjords pinin4fjords enabled auto-merge April 22, 2026 13:46
@pinin4fjords pinin4fjords added this pull request to the merge queue Apr 22, 2026
Merged via the queue into master with commit fbfb844 Apr 22, 2026
43 checks passed
@pinin4fjords pinin4fjords deleted the fix/ribodetector-containers-only branch April 22, 2026 13:51
pinin4fjords added a commit to nf-core/rnaseq that referenced this pull request Apr 22, 2026
…ntegration

Bump to the latest upstream of:

- fq/lint (nf-core/modules#11227): constrain reads arity to 1..2
- ribodetector (nf-core/modules#11258): unpin GPU container from pytorch-gpu=1.11.0; emit cuda version on the topic
- tximeta/tximport (nf-core/modules#11141): fix gene-level crash on mismatched transcript FASTA/GTF
- fastq_fastqc_umitools_trimgalore (nf-core/modules#11228): handle null trim_log in the read-count map
- custom/catadditionalfasta (nf-core/modules#11256): topic-based versions, explicit out/\${prefix}.{fasta,gtf} paths, task.ext.prefix ?: meta.id prefix handling

The custom/catadditionalfasta interface change needs pipeline-side follow-up in conf/modules/prepare_genome.config:

- Fix the stale CAT_ADDITIONAL_FASTA selector (now CUSTOM_CATADDITIONALFASTA) and split PREPROCESS_TRANSCRIPTS_FASTA_GENCODE into its own block.
- Set ext.prefix = "\${params.genome ?: fasta.baseName}_\${add_fasta.baseName}" so output filenames follow the previous {genome}_{add_name} pattern; the new module default (meta.id) would otherwise rename outputs to genome_transcriptome.{fasta,gtf}.

Behaviour note: fixing the withName selector also exposes a pre-existing intent that was masked. CUSTOM_CATADDITIONALFASTA outputs now only publish when --save_reference is set; the stale selector previously let them fall through to the default publishDir and land in <outdir>/custom/out/ regardless of --save_reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

4 participants