59 commits
8a75d43
feat add STAT k-mer extraction and deacon compatible index build
wkgardner Apr 17, 2026
91a2c39
add in rust dependecies for rust-script
wkgardner Apr 17, 2026
abc8b77
add deinterleave process for deacon paired read input
wkgardner Apr 17, 2026
3c6079b
add processes for both deacon index building and deacon filtering
wkgardner Apr 17, 2026
61cc8d8
fix: rewrite deacon STAT processes for correct pipeline integration
wkgardner Apr 22, 2026
4fbe00a
remove: drop SPLIT_INTERLEAVED_READS process
wkgardner Apr 22, 2026
f2c08ed
feat: replace STAT aligns_to with deacon for human virus read extraction
wkgardner Apr 22, 2026
3e9381b
making rust-script specific for supported platforms
wkgardner May 5, 2026
c7d98dd
feat: add interleaved/single-end branching to DEACON_FILTER_HUMAN_VIR…
wkgardner May 5, 2026
a51f619
refactor: simplify PREPROCESS_CONTIGS to assembly + filtering only
wkgardner May 5, 2026
71a3bac
feat: frontload deacon virus extraction in STAT_BLAST_WORKFLOW
wkgardner May 5, 2026
27c3b46
feat: restructure main.nf for frontloaded virus extraction pipeline
wkgardner May 5, 2026
d32eade
fix: downgrade RUN_SPADES resource label from ludicrous to high
wkgardner May 5, 2026
e98b7fb
feat: accept pre-interleave R1/R2 in DEACON_FILTER_HUMAN_VIRUS_READS
wkgardner May 7, 2026
576b362
feat: emit pre-interleave channel and conditionally skip INTERLEAVE_P…
wkgardner May 7, 2026
fac7530
feat: update COUNT_READS for pre-interleave R1/R2 input
wkgardner May 7, 2026
307ff7f
feat: wire STAT_BLAST_WORKFLOW to pre-interleave channel
wkgardner May 7, 2026
1babd9e
feat: pass pre-interleave channel to STAT_BLAST, interleaved to GOTTCHA2
wkgardner May 7, 2026
da5b503
refactor: rewrite main.nf for single-workflow pipeline
wkgardner May 9, 2026
bb81775
refactor: simplify GATHER_READS to file resolution only
wkgardner May 9, 2026
bab9505
feat: inline preprocessing into STAT_BLAST_WORKFLOW, remove tool sele…
wkgardner May 9, 2026
8165a4d
refactor: remove tool selection and GOTTCHA2 from NvdUtils
wkgardner May 9, 2026
67e46b9
config: remove GOTTCHA2, clumpify, and tools params from nextflow.config
wkgardner May 9, 2026
618a55a
config: remove GOTTCHA2 and clumpify from results.config
wkgardner May 9, 2026
50d9fbc
refactor: remove GOTTCHA2 and tools from Python models, CLI, and state
wkgardner May 9, 2026
523bacc
schema: remove GOTTCHA2/clumpify properties, bump to v3.0.0
wkgardner May 9, 2026
530dd49
remove(gottcha2): delete all GOTTCHA2 workflow files
wkgardner May 9, 2026
4c94e05
remove(clumpify): delete clumpify workflow and dead STAT processes
wkgardner May 9, 2026
195ef34
remove: delete old schema, preprocess_reads workflow, and host_deplet…
wkgardner May 9, 2026
0b491c8
fix: remove validate_tools validator referencing deleted tools field
wkgardner May 9, 2026
2b737fb
remove unused deacon process
wkgardner May 9, 2026
be18076
remove super high resource labels
wkgardner May 9, 2026
7b15390
small profiling changes and update fingerprint
wkgardner May 9, 2026
3ddfc3f
remove projectDir from rust-script process path
wkgardner May 9, 2026
41752d8
chore: update resources requirements for deacon
wkgardner May 12, 2026
9d2a82a
feat: add --summary JSON output to DEACON_FILTER_HUMAN_VIRUS_READS
wkgardner May 13, 2026
9f957fa
refactor: replace COUNT_READS with deacon summary read counts
wkgardner May 13, 2026
0c85179
feat: add ADD_READ_COUNTS_TO_BLAST process
wkgardner May 13, 2026
3acd86a
feat: wire ADD_READ_COUNTS_TO_BLAST into STAT_BLAST_WORKFLOW
wkgardner May 13, 2026
78fc9a3
fix: rename ADD_READ_COUNTS_TO_BLAST output to avoid clobbering input
wkgardner May 14, 2026
4a7e01d
remove slow and now-unused read counting modules as well as now-unuse…
nrminor May 14, 2026
0ab9345
feat!: establish v3 pipeline parameter surface
nrminor May 14, 2026
441081b
refactor: remove pipeline fingerprint verification
nrminor May 15, 2026
c46ff22
build!: stop building STAT in the container image
nrminor May 15, 2026
ef153bd
feat!: replace STAT database params with virus index params
nrminor May 15, 2026
e932acc
test: align Slack notification tests with current API
nrminor May 15, 2026
4e6c22c
refactor!: replace STAT contig classification chain with deacon filter
wkgardner May 15, 2026
be9170b
test: add human-depleted illumina test data
wkgardner May 15, 2026
cb7cdc7
chore update gitignore to track test data
wkgardner May 15, 2026
dfa396a
refactor!: migrate state management to BLAST-only v3 schema
nrminor May 15, 2026
26a9f2b
refactor!: rename main workflow to NVD_MAIN
nrminor May 15, 2026
f866ece
updating the version of the py_nvd library __init__.py to 3.0.0
nrminor May 15, 2026
0221c25
ci: align v3 validation checks with current surface
nrminor May 15, 2026
616bbb6
test: remove retired STAT-chain coverage
nrminor May 15, 2026
fc44f7c
refactor!: isolate v3 hit parquet store
nrminor May 15, 2026
1bbf1e6
fix: defer pipeline root lookup until command execution
nrminor May 15, 2026
fb667f6
refactor: restructure results directory layout for v3 pipeline
wkgardner May 18, 2026
d60d891
feat: add CONCATENATE_EXPERIMENT_BLAST process
wkgardner May 18, 2026
3b2893c
feat: wire CONCATENATE_EXPERIMENT_BLAST into main workflow
wkgardner May 18, 2026
10 changes: 5 additions & 5 deletions .github/scripts/validate_schema_completeness.py
@@ -83,11 +83,11 @@
"state_dir",
# Exposed via negated CLI flag --no-slack
"slack_enabled",
- # Deacon tuning (set via params-file or preset)
- "deacon_kmer_size",
- "deacon_window_size",
- "deacon_abs_threshold",
- "deacon_rel_threshold",
+ # Host index/depletion tuning (set via params-file or preset)
+ "host_kmer_size",
+ "host_window_size",
+ "host_abs_threshold",
+ "host_rel_threshold",
}


12 changes: 7 additions & 5 deletions .github/workflows/ci.yml
@@ -113,16 +113,18 @@ jobs:

echo "✅ Config syntax valid for ${{ matrix.profile }} profile"

- - name: Verify required parameters exist
+ - name: Verify expected top-level parameters exist
run: |
- # Check that critical parameters are defined
- for param in samplesheet experiment_id tools results; do
+ # Check that stable top-level parameters are defined in the rendered config.
+ # The v3 pipeline has a single main workflow, so there is no public
+ # `tools` selector parameter to validate here.
+ for param in samplesheet experiment_id results; do
if ! grep -q "^\s*${param}\s*=" config_output.txt; then
- echo "::error::Required parameter '${param}' not found in config"
+ echo "::error::Expected parameter '${param}' not found in config"
exit 1
fi
done
- echo "✅ All required parameters present"
+ echo "✅ All expected top-level parameters present"

validate-config-includes:
name: Validate Config File Includes
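The shell loop in this hunk checks each expected parameter against the rendered config with a line-anchored regular expression. A minimal Python sketch of the same check follows, assuming the rendered config is plain `name = value` text. The helper name `missing_params` and the sample config are hypothetical and not part of this PR.

```python
import re


def missing_params(config_text, expected):
    """Return the expected names that have no `name = value` line in config_text."""
    missing = []
    for name in expected:
        # Mirrors grep "^\s*${param}\s*=": optional indent, the exact name,
        # optional whitespace, then '='.
        pattern = re.compile(rf"^\s*{re.escape(name)}\s*=", re.MULTILINE)
        if not pattern.search(config_text):
            missing.append(name)
    return missing


# Hypothetical rendered config, for illustration only.
sample_config = """
params {
   samplesheet = 'assets/test_samplesheet.csv'
   experiment_id = 1
   results = 'results'
}
"""

print(missing_params(sample_config, ["samplesheet", "experiment_id", "results"]))
print(missing_params(sample_config, ["samplesheet", "tools"]))
```

Because the pattern requires `=` right after the name (allowing only whitespace between), a parameter such as `results_dir` would not satisfy a check for `results`.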
5 changes: 5 additions & 0 deletions .gitignore
@@ -59,6 +59,7 @@
# assets
!/assets
!/assets/example_samplesheet.csv
+ !/assets/test_samplesheet.csv

# docs
!/docs
@@ -90,3 +91,7 @@
!/.vscode/extensions.json
!/.vscode/settings.json

+ # testing data
+ !/tests
+ !/tests/data
+ !/tests/data/*
14 changes: 0 additions & 14 deletions .pre-commit-config.yaml

This file was deleted.

41 changes: 2 additions & 39 deletions Containerfile
@@ -15,7 +15,7 @@ ENV ~=/opt

# run a few apt installs
RUN apt-get update && \
- apt-get install -y curl wget git gcc g++ cmake util-linux && \
+ apt-get install -y curl util-linux && \
rm -rf /var/lib/apt/lists/* && \
mkdir /dependencies && \
dpkg -l > /dependencies/apt-get.lock
@@ -39,8 +39,7 @@ ENV PATH=$PATH:$HOME/.pixi/bin
# 4) install everything else with pixi
RUN cd $HOME && \
script -q -c "pixi install --frozen" && \
- script -q -c "pixi clean cache --assume-yes" && \
- script -q -c "pixi add cxx-compiler cmake make"
+ script -q -c "pixi clean cache --assume-yes"

# 5) Add pixi environment to PATH (works in Docker, Podman, AND Apptainer)
ENV PATH=$PATH:/opt/.pixi/envs/default/bin
@@ -49,41 +48,5 @@ ENV PATH=$PATH:/opt/.pixi/envs/default/bin
ENV NXF_CACHE_DIR=/scratch
ENV NXF_HOME=/scratch

# ----------------------------------------------------------------------------------- #

# Copy necessary ncbi files
COPY conf/user-settings.mkfg /.ncbi/user-settings.mkfg

# Install NCBI tools and configure environment
RUN mkdir -p /.ncbi && \
chmod 777 /.ncbi/user-settings.mkfg && \
mkdir /build && cd /build && \
git config --global http.sslVerify false && \
git clone -b 3.2.0 https://github.com/ncbi/ngs-tools.git && \
git clone -b 3.2.1 https://github.com/ncbi/ncbi-vdb.git && \
git clone -b 3.2.1 https://github.com/ncbi/sra-tools.git && \
sed -i 's/cmake_minimum_required[[:space:]]*([[:space:]]*VERSION[[:space:]]*2\.8\.12[[:space:]]*)/cmake_minimum_required(VERSION 3.5)/' /build/ngs-tools/CMakeLists.txt && \
echo "Verifying CMake version change:" && \
grep "cmake_minimum_required" /build/ngs-tools/CMakeLists.txt

# Build ncbi-vdb, sra-tools and ngs-tools
RUN cd /build/ncbi-vdb && \
./configure --relative-build-out-dir && \
make -j$(nproc) && \
cd /build/sra-tools && \
./configure --relative-build-out-dir && \
make -j$(nproc) && \
cd /build/ngs-tools && \
./configure --relative-build-out-dir && \
make -j$(nproc) && \
# Copy built binaries and clean up in the same layer
find /build/OUTDIR -type f -executable -exec cp {} /usr/local/bin/ \; && \
cd / && rm -rf /build

# remove now unnecessary compilers and clean the PyPI and conda caches
RUN cd $HOME && pixi remove rust cxx-compiler cmake make && pixi clean cache --yes

# Fix snakemake permission issue
RUN mkdir /.cache; chmod a+rwX /.cache


4 changes: 4 additions & 0 deletions assets/test_samplesheet.csv
@@ -0,0 +1,4 @@
+ sample_id,srr,platform,fastq1,fastq2
+ water,,illumina,tests/data/water_R1.fastq.gz,tests/data/water_R2.fastq.gz
+ hits_only,,illumina,tests/data/hits_only_R1.fastq.gz,tests/data/hits_only_R2.fastq.gz
+ water_plus_hits,,illumina,tests/data/water_plus_hits_R1.fastq.gz,tests/data/water_plus_hits_R2.fastq.gz
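The new samplesheet has five columns, with `srr` left empty and both FASTQ paths filled in for every row. A minimal sketch of reading it with Python's standard `csv` module is below. The `read_samples` helper and the "paired when both FASTQ columns are set" rule are assumptions made for illustration, not the pipeline's own parsing logic.

```python
import csv
import io

# Contents of assets/test_samplesheet.csv as added in this PR.
SAMPLESHEET = """sample_id,srr,platform,fastq1,fastq2
water,,illumina,tests/data/water_R1.fastq.gz,tests/data/water_R2.fastq.gz
hits_only,,illumina,tests/data/hits_only_R1.fastq.gz,tests/data/hits_only_R2.fastq.gz
water_plus_hits,,illumina,tests/data/water_plus_hits_R1.fastq.gz,tests/data/water_plus_hits_R2.fastq.gz
"""


def read_samples(text):
    """Parse samplesheet text into a list of row dicts keyed by column name."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        # Assumption: a row is paired-end when both FASTQ columns are non-empty.
        row["paired"] = bool(row["fastq1"] and row["fastq2"])
    return rows


samples = read_samples(SAMPLESHEET)
for sample in samples:
    print(sample["sample_id"], sample["platform"], sample["paired"])
```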
6 changes: 1 addition & 5 deletions bin/check_run_state.py
@@ -466,10 +466,6 @@ def parse_args() -> argparse.Namespace:
"--blast-db-version",
help="BLAST database version (ignored - now passed to register_hits.py)",
)
- parser.add_argument(
- "--stat-db-version",
- help="STAT database version (ignored - now passed to register_hits.py)",
- )
parser.add_argument(
"--lock-ttl",
type=int,
@@ -479,7 +475,7 @@ def parse_args() -> argparse.Namespace:
parser.add_argument(
"--upload-types",
help="Comma-separated upload types to check for duplicate detection "
- "(e.g., 'blast,blast_fasta' or 'gottcha2,gottcha2_fasta'). "
+ "(e.g., 'blast,blast_fasta'). "
"When set, only uploads matching these types count as 'already uploaded'.",
)
parser.add_argument(
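The `--upload-types` help text describes a filter on which prior uploads count as duplicates. A minimal sketch of that behavior follows, assuming each upload record is a dict with an `upload_type` key. The names `parse_upload_types` and `already_uploaded`, the record layout, and the no-filter behavior are hypothetical illustrations, not the code in `bin/check_run_state.py`.

```python
def parse_upload_types(value):
    """Split a comma-separated --upload-types value into a set, or None if unset."""
    if not value:
        return None
    return {part.strip() for part in value.split(",") if part.strip()}


def already_uploaded(uploads, upload_types=None):
    """Return True if any prior upload counts as a duplicate.

    Hypothetical record layout: each upload is a dict with an "upload_type" key.
    Assumption: with no filter, every prior upload counts; with a filter,
    only uploads whose type is in the filter count.
    """
    if upload_types is None:
        return len(uploads) > 0
    return any(upload["upload_type"] in upload_types for upload in uploads)


prior = [{"upload_type": "blast"}]
print(already_uploaded(prior, parse_upload_types("blast,blast_fasta")))
print(already_uploaded(prior, parse_upload_types("blast_fasta")))
```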