Problem
Currently, if a cellranger count job fails partway through (e.g., due to memory issues, time limits, or transient errors), the entire job must restart from scratch, because Snakemake deletes the output files of a failed job.
Proposed solution
Improve the Snakemake workflow to handle Cell Ranger failures more gracefully:
- Detect incomplete pipestances: Check if the output directory exists but lacks completion markers
- Remove stale lock files: If a lock file exists but no Cell Ranger process is running, automatically remove {output_dir}/_lock
- Verify invocation parameters: Ensure the Snakemake rule parameters match the original run (Cell Ranger checks _invocation and will error if parameters differ)
- Parse error logs: Check for _errors files to log which stage failed and why
- Re-run with same command: Simply re-execute the same cellranger count command; Cell Ranger resumes automatically
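The detection steps above could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the function names are hypothetical, and the completion marker (outs/web_summary.html) is an assumption that should be checked against the Cell Ranger version in use. The _lock and _errors file names come from the list above.

```python
from pathlib import Path

# Assumed file names; confirm them against the Cell Ranger version in use.
LOCK_FILE = "_lock"
ERRORS_FILE = "_errors"
COMPLETION_MARKER = "outs/web_summary.html"  # assumption, not stated in this issue


def pipestance_state(output_dir, completion_marker=COMPLETION_MARKER):
    """Classify an output directory as fresh, complete, locked, or incomplete."""
    out = Path(output_dir)
    if not out.is_dir():
        return "fresh"
    if (out / completion_marker).exists():
        return "complete"
    if (out / LOCK_FILE).exists():
        return "locked"
    return "incomplete"


def find_error_files(output_dir):
    """Return every _errors file below the output directory, sorted by path."""
    return sorted(Path(output_dir).rglob(ERRORS_FILE))
```

A "locked" result only says that a lock file is present. Whether the lock is stale still has to be decided separately, for example by asking the scheduler whether the original job is still running.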
Cell Ranger resume behavior
- Cell Ranger has no --resume flag; it automatically detects incomplete pipestances
- When the output directory (specified by --id) exists and is incomplete, Cell Ranger resumes from the failed stage
- The same command-line arguments must be used (checked via the _invocation file)
- The lock file (_lock) prevents multiple instances from running simultaneously
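Because a resume only works with identical arguments, the workflow could keep its own record of the command it last ran and compare against it before retrying. The sketch below assumes a sidecar JSON file written by the workflow itself; the file and the function names are hypothetical. It deliberately does not parse Cell Ranger's _invocation file, whose format is not described here.

```python
import json
from pathlib import Path


def command_matches_previous(command, record_path):
    """Return True if `command` equals the recorded one, or if none is recorded."""
    record = Path(record_path)
    if not record.exists():
        return True
    return json.loads(record.read_text()) == list(command)


def record_command(command, record_path):
    """Save the argument list so a later retry can be compared against it."""
    Path(record_path).write_text(json.dumps(list(command)))
```

The record would need to live outside the Cell Ranger output directory so that it survives any cleanup of that directory. A mismatch can then be reported before Cell Ranger is launched, rather than surfacing as a Cell Ranger error.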
Implementation considerations
- Check for lock files and remove if stale (no running process)
- Log clearly whether resuming from a previous failure or starting fresh
- Handle the case where parameters have changed (Cell Ranger will refuse to resume)
- Consider adding a Snakemake parameter to force restart from scratch if needed
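One way to combine these considerations is a single decision function that the rule calls before launching Cell Ranger. The sketch below is illustrative only. The state names, the action names, and the force_restart parameter are assumptions made for this example, and the caller is expected to supply process_running (for instance from a scheduler query) and params_match.

```python
import logging

logger = logging.getLogger("cellranger_resume")


def plan_action(state, process_running=False, params_match=True, force_restart=False):
    """Map the observed pipestance state to one action for the rule to take.

    state is one of "fresh", "complete", "locked", "incomplete".
    Returns "start", "wait", "restart", "skip", "abort",
    "unlock_and_resume", or "resume".
    """
    if state == "fresh":
        action = "start"
    elif state == "locked" and process_running:
        # Never touch a directory that a live Cell Ranger process still owns.
        action = "wait"
    elif force_restart:
        action = "restart"
    elif state == "complete":
        action = "skip"
    elif not params_match:
        # Cell Ranger would refuse to resume, so fail early with a clear message.
        action = "abort"
    elif state == "locked":
        action = "unlock_and_resume"
    else:
        action = "resume"
    logger.info("pipestance state=%s -> action=%s", state, action)
    return action
```

Checking for a live process before honoring force_restart is a deliberate choice here, so that a forced restart cannot delete a directory another job is still writing to.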
Acceptance criteria
Resources