This workflow's subsampling is implemented by a single augur filter call. It works fine, but is limited in capabilities and not portable across pathogen repos.
Description of work
- Update the configuration from:

      filter:
        min_length: 8000
        group_by: country year month MuV_genotype division
        exclude: "{build}/exclude.txt"
        include: "{build}/include.txt"
        specific:
          north-america: --subsample-max-sequences 4000 --min-date 2006 --query "region=='North America' & (MuV_genotype=='G')"
          global: --subsample-max-sequences 4000 --min-date 1950

  to something more generic and customizable.
- Replace the filter rule with a rule that calls augur subsample. This rule should:
  - Allow concurrent sample runs with threads
  - Pass a dump of Snakemake's config as a YAML file to --config
  - Extract the relevant configuration using --config-section
  - Allow Snakemake to intelligently handle conditional runs of the subsample rule in the case of config changes
  - Be compatible with external analysis directories (nextstrain run)
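
One possible way to approach the "conditional runs on config changes" requirement is to rewrite the dumped config file only when its content actually changes, so that its modification time stays stable between otherwise identical runs. The sketch below is an illustration under stated assumptions, not the workflow's actual implementation: the helper name `dump_config_if_changed` and the output path are hypothetical, and JSON stands in for YAML only to keep the example within the standard library (JSON is, for practical purposes, a subset of YAML 1.2; a real rule would more likely use a YAML dumper).

```python
import json
from pathlib import Path


def dump_config_if_changed(config: dict, path: Path) -> bool:
    """Write `config` to `path` only when the serialized content differs.

    Hypothetical helper: returns True if the file was (re)written and
    False if the existing file already holds identical content.
    """
    # Sort keys so that the same config always serializes to the same text.
    text = json.dumps(config, indent=2, sort_keys=True) + "\n"

    # Leave the file untouched when nothing changed, preserving its mtime
    # so that timestamp-based rerun logic does not see a spurious update.
    if path.exists() and path.read_text() == text:
        return False

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return True
```

Calling the helper twice with the same dictionary writes the file once and then leaves it alone; calling it with a modified dictionary rewrites it. Whether this is preferable to relying on Snakemake's own rerun triggers is a design choice left to the implementation.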
The current configuration quoted above is from mumps/phylogenetic/defaults/config.yaml, lines 13 to 20 in 53c0a6e.