Conversation
6 documentation files (zero publishDir references remain):
- docs/en/docs/side_quests/essential_scripting_patterns.md - Removed publishDir from GENERATE_REPORT code block
- docs/en/docs/side_quests/metadata.md - Removed publishDir from COWPY code block
- docs/en/docs/side_quests/dev_environment.md - Removed publishDir from FASTQC code block, changed search example from
publishDir to container
- docs/en/docs/side_quests/debugging.md - Removed publishDir from all 3 processes in corrected workflow, added
publish:/output {} blocks
- docs/en/docs/side_quests/nf-test.md - Removed publishDir from both processes, added publish:/output {} blocks
- docs/en/docs/side_quests/working_with_files.md - Rewrote section 6 to use output {} block pattern; updated process output
to emit full meta; updated sections 6.1, 6.2.2, 6.4, Takeaway, and Summary
~25 script files (zero publishDir references remain):
- workflows_of_workflows: 6 module files (removed publishDir), 1 main.nf (added publish/output)
- essential_scripting_patterns: 2 module files (removed publishDir), 1 main.nf (added publish/output)
- metadata: 3 module files (removed publishDir), 2 main.nf files (added publish/output)
- ide_features: 5 script files (removed publishDir from all, added publish/output to both workflow files)
- debugging: 2 workflow files (removed publishDir, added publish/output)
- nf-test: 1 main.nf (removed publishDir, added publish/output)
- working_with_files: 2 module files (removed publishDir, changed output to emit full meta), 1 main.nf (added publish/output
with dynamic path closure)
✅ Deploy Preview for nextflow-training ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Nextflow linting complete! ❌ 1 files had 1 errors 💡 Tip: Click filename locations to go directly to that code. View all 1 issues
View formatting changes
|
| directory params.output | ||
| mode 'copy' |
There was a problem hiding this comment.
This syntax is from an old preview version. These settings should go in the config:
outputDir = params.output
workflow.output.mode = 'copy'
docs/en/docs/side_quests/nf-test.md
Outdated
| directory 'results' | ||
| mode 'copy' |
There was a problem hiding this comment.
Also, the outputs here need to be declared even if they don't require any custom settings:
output {
greetings {
// path '.'
}
upper_greetings {
// path '.'
}
}I commented the path '.' because that's what it will do by default, but you could add it if you want to be explicit / not have empty output declarations
| The `tag` directive uses closure syntax (`{ ... }`) because it references input variables (`meta`) that aren't available until the process runs. | ||
| The closure defers evaluation until runtime. |
There was a problem hiding this comment.
Slightly off-topic -- with the v2 parser you don't need the closure here anymore, it's optional. If this guide is assuming the v2 parser, then it should be safe to remove the closure
But you might still want to have a note about the fact that the closure used to be needed
| files | ||
| ] | ||
| } | ||
| .set { ch_samples } |
There was a problem hiding this comment.
Slightly off-topic -- try to prefer assignments over set and tuple() over [] (when making a tuple):
ch_samples = ch_files.map { id, files ->
def (sample, replicate, type, readNum) = id.tokenize('_')
tuple(
[
id: sample,
replicate: replicate.replace('rep', ''),
type: type
],
files
)
}This code will work better with the type checker in the future
| directory 'results' | ||
| mode 'copy' |
There was a problem hiding this comment.
Same as above, move to config
| directory 'results' | ||
| mode 'copy' |
There was a problem hiding this comment.
Same as above, move to config
| directory params.output | ||
| mode 'copy' |
There was a problem hiding this comment.
Same as above, move to config
| processed = processFiles.out | ||
| heavy = heavyProcess.out | ||
| files = handleFiles.out |
There was a problem hiding this comment.
Slightly off-topic -- I think we should try to use regular assignments instead of .out as much as possible, since .out is a relatively unusual syntax pattern whereas assignments are a widely understood concept
I think .out is mainly useful for when a process / workflow has multiple outputs, but in this case, each process has only one output and you're already assigning the first two (processed_ch and heavy_ch), so I think it's reasonable to avoid .out entirely here.
| fastqc_html = FASTQC.out.html | ||
| fastqc_zip = FASTQC.out.zip |
There was a problem hiding this comment.
(continuing discussion of .out vs assignments)
Here I think it's fair to use .out. Hopefully with records, this will collapse to a single channel of records, at which point we could remove the use of .out here as well.
There was a problem hiding this comment.
Actually, what I would go ahead and do here is to join the two channels into a single channel:
ch_fastqc = FASTQC.out.html.join(FASTQC.out.zip)
publish:
fastqc = ch_fastqcThen you only have one output declaration:
output {
fastqc {
path 'fastqc'
}
}My intuition is that it will be better to combine related outputs into a single workflow output as much as possible
You could imagine taking this further and combining all "per-sample" outputs into a single workflow output called samples, while using the dynamic path closure to keep organizing output files by tool. That is more a matter of personal taste, but at the very least I think we should at least collapse to one workflow output per process output, as a best practice.
There was a problem hiding this comment.
You could also convert the tuples to maps, in anticipation of records:
ch_fastqc = FASTQC.out.html.join(FASTQC.out.zip)
publish:
fastqc = ch_fastqc.map { sample_id, html, zip -> [sample_id: sample_id, html: html, zip: zip] }But I wouldn't say this is necessary at this point
| directory params.output_dir | ||
| mode 'copy' |
There was a problem hiding this comment.
Same as above, move to config
| } | ||
| counts { | ||
| path 'counts' | ||
| } |
There was a problem hiding this comment.
(continuing discussion of collapsing outputs)
Just to illustrate, this is what it would like to collapse outputs to the extreme:
samples { s ->
s.fastqc_html >> 'fastqc'
s.fastqc_zip >> 'fastqc'
s.trimmed_fastq >> 'trimmed'
s.alignments >> 'alignments'
s.counts >> 'counts'
}This assumes you join the individual output channels into a single channel of maps / records:
samples = RNASEQ_PIPELINE.out.fastqc
.join( RNASEQ_PIPELINE.out.trimmed )
.join( RNASEQ_PIPELINE.out.alignments )
.join( RNASEQ_PIPELINE.out.counts )
.map { sample_id, fastqc_html, fastqc_zip, trimmed_fastq, alignments, counts -> [sample_id: sample_id, fastqc_html: fastqc_html, fastqc_zip: fastqc_zip, trimmed_fastq: trimmed_fastq, alignments: alignments, counts: counts] }As you can see, it gets pretty ugly when you're still using tuples... so I'm not suggesting that we do this now. Just something to consider for when we eventually introduce records
|
I think that concludes my review. Main two points are:
Aside from that, I left some extra thoughts about best practices. Feel free to take them or leave them Hopefully you can just throw my general suggestions into a spec and let the AI update all of the code examples for you 😄 |
Thanks @bentsherman, this is super helpful! |
Move directory/mode from output {} blocks to nextflow.config files,
declare all published outputs in output {} blocks, use variable
assignments instead of .out in publish blocks, and use tuple()/
assignment over []/.set {} for type-checker compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
||
| docker.enabled = true | ||
|
|
||
| outputDir = 'results' |
There was a problem hiding this comment.
For what it's worth, the output dir defaults to results, so you could omit this setting. But I think it's totally fair if you want to be explicit about it
1. Essential Scripting Patterns
2. Metadata
Scripts: Removed publishDir from modules/cowpy.nf in starter, solution/3.2, and solution/3.3. Added publish:/output {} to both solution main.nf files, publishing to results/.
Docs: Removed publishDir line from the COWPY process code block. No narrative changes needed.
3. Dev Environment + IDE Features
Scripts: Removed publishDir from basic_workflow.nf, modules/fastqc.nf, modules/star.nf, modules/utils.nf (MULTIQC and CUSTOM_DUMPSOFTWAREVERSIONS), and complex_workflow.nf (TRIM_GALORE and FEATURECOUNTS). Added publish:/output {} to both basic_workflow.nf and complex_workflow.nf. Added fastqc, trimmed, and alignments to the RNASEQ_PIPELINE named workflow's emit: section so the entry workflow can publish them.
Docs: Removed publishDir from the "Before" code block showing the inline FASTQC process. Changed the project-wide search example from searching for publishDir to searching for container.
4. Debugging
Scripts: Removed publishDir from all three processes in both buggy_workflow.nf (starter with bugs intact) and solutions/buggy_workflow.nf. Added publish:/output {} to both, publishing to processed/, heavy/, and files/ subdirectories under params.output.
Docs: Updated the "Complete Corrected Workflow" solution code block to match: removed publishDir from all three processes, added publish:/output {} blocks.
5. nf-test
Scripts: Removed publishDir from both sayHello and convertToUpper processes. Added publish:/output {} publishing to results/. Test assertions checking $launchDir/results/ are unchanged since the output directory is still results/.
Docs: Updated the expandable workflow code block to match. No changes to test assertion code or prose about "published to a directory called results/".
6. Workflows of Workflows
7. Working with Files
Scripts: Removed publishDir from modules/analyze_reads.nf (starter and solution). Changed process output from val(meta.id) to val(meta) so the output block can use metadata for dynamic paths. Added publish:/output {} to solution main.nf with a path closure: { meta, file -> "${meta.type}/${meta.id}/${meta.replicate}" }.
Docs (most extensive changes):