Snakemake rules for running anvi'o
This Snakefile will run a basic anvi'o analysis of a set of assemblies and a set of samples, using the snakemake workflow manager.
Files
Snakefile
The file containing the snakemake rules
config.yaml.template
An example file containing the specific sample and assembly names that will be used in the workflow, as well as additional variable specifications.
Must be renamed 'config.yaml' for use.
cluster.json
A file specifying compute cluster job resource parameters for particular rules in the Snakefile.
requirements.txt
A file listing the conda packages required by the environment that runs the Snakefile.
launch.sh
Example command for launching the workflow on a Torque (qsub) job manager.
anvio_install.sh
The commands I used to set up my anvi'o conda environment on CentOS 6.
dag.svg
Visualization of the workflow steps run when executing the example config.yaml
data [directory]
Placeholder example data layout. For your own data, replace the files in data/assemblies with the assembled contigs and the files in data/samples with the per-sample paired-end gzipped fastqs. Assemblies must be named [assembly].fa, and reads must be named [sample].R1.fq.gz and [sample].R2.fq.gz.
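The naming convention above can be checked before launching the workflow. The following is a minimal sketch, not part of the repository: the function names (find_assemblies, find_paired_samples, scan_data_dir) are hypothetical, and it assumes only the data/assemblies and data/samples layout and the file name patterns described above.

```python
import re
from pathlib import Path

# Patterns for the documented names: [assembly].fa and [sample].R[1,2].fq.gz
ASSEMBLY_RE = re.compile(r"^(?P<assembly>.+)\.fa$")
READ_RE = re.compile(r"^(?P<sample>.+)\.R(?P<mate>[12])\.fq\.gz$")


def find_assemblies(filenames):
    """Return the sorted assembly names found among the given file names."""
    names = set()
    for name in filenames:
        match = ASSEMBLY_RE.match(name)
        if match:
            names.add(match.group("assembly"))
    return sorted(names)


def find_paired_samples(filenames):
    """Return the sorted sample names that have both an R1 and an R2 file."""
    mates = {}
    for name in filenames:
        match = READ_RE.match(name)
        if match:
            mates.setdefault(match.group("sample"), set()).add(match.group("mate"))
    return sorted(s for s, found in mates.items() if found == {"1", "2"})


def scan_data_dir(data_dir="data"):
    """Hypothetical helper: list assemblies and complete read pairs on disk."""
    data = Path(data_dir)
    assemblies = find_assemblies(p.name for p in (data / "assemblies").iterdir())
    samples = find_paired_samples(p.name for p in (data / "samples").iterdir())
    return assemblies, samples
```

Calling scan_data_dir() from the repository root and comparing the result with the names listed in config.yaml is one way to catch a misnamed file or a sample that is missing one of its two read files.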
Code Snippets
shell:
    "tar -czf {output} {input}"
shell:
    "bowtie2-build {input} data/assemblies/{wildcards.assembly}"
shell:
    "bowtie2 -x {params.idx_base} -p {params.threads} --no-unal " \
    "-q -1 {input.R1} -2 {input.R2} | " \
    "samtools view -bS - > {output}"
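The two bowtie2 snippets are linked by an index base name: bowtie2-build writes its index under data/assemblies/{assembly}, and the mapping step reads an index through -x. The sketch below lists the files a standard (small) Bowtie 2 index consists of, so that a base name can be checked for completeness. It assumes that params.idx_base equals the base passed to bowtie2-build, which the snippets suggest but do not confirm. The helper names are hypothetical, and large genomes produce .bt2l files instead, which this sketch does not cover.

```python
from pathlib import Path

# Suffixes written by bowtie2-build for a small index (large indexes use .bt2l)
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def index_base(assembly, assemblies_dir="data/assemblies"):
    """Base name handed to bowtie2-build and, by assumption, to bowtie2 -x."""
    return f"{assemblies_dir}/{assembly}"


def expected_index_files(assembly, assemblies_dir="data/assemblies"):
    """Paths of the six index files expected for one assembly."""
    base = index_base(assembly, assemblies_dir)
    return [base + suffix for suffix in BT2_SUFFIXES]


def index_is_complete(assembly, assemblies_dir="data/assemblies"):
    """True only if every expected index file exists on disk."""
    return all(Path(p).exists() for p in expected_index_files(assembly, assemblies_dir))
```

A list like this can also be used to declare the index files as explicit rule outputs, so that Snakemake knows when the index has to be rebuilt.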
shell:
    "samtools sort -O bam -o {output} {input}"
shell:
    "samtools index {input}"
shell:
    "anvi-gen-contigs-database -f {input} -o {output}"
shell:
    """
    anvi-run-hmms -c {input} --num-threads {params.threads}
    """
shell:
    "anvi-get-dna-sequences-for-gene-calls -c {input} -o {output}"
shell:
    """
    export CENTRIFUGE_BASE={params.centrifuge_base}
    centrifuge -f --threads {params.threads} \
        -x {params.centrifuge_models} \
        {input.fa} \
        -S {output.hits} \
        --report-file {output.report}
    ln -s {output.hits} data/anvio/{wildcards.assembly}/centrifuge_hits.tsv
    ln -s {output.report} data/anvio/{wildcards.assembly}/centrifuge_report.tsv
    cd data/anvio/{wildcards.assembly}
    anvi-import-taxonomy -c {wildcards.assembly}.db \
        -i centrifuge_report.tsv centrifuge_hits.tsv \
        -p centrifuge
    rm centrifuge_hits.tsv
    rm centrifuge_report.tsv
    cd ../../../
    """
shell:
    """
    anvi-profile -i {input.sorted} \
        -c {input.db} \
        --overwrite-output-destinations \
        -o data/sorted_reads/{wildcards.assembly}.{wildcards.sample}.bam-ANVIO_PROFILE
    """
shell:
    """
    anvi-merge {input.profiles} \
        -o data/anvio/{wildcards.assembly}/SAMPLES_MERGED \
        -c {input.db} \
        -W
    """
shell:
    """
    anvi-summarize -p {input.prof} \
        -c {input.db} \
        -o data/anvio/{wildcards.assembly}/{wildcards.assembly}_SAMPLES-SUMMARY \
        -C CONCOCT
    """
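The anvi-profile and anvi-merge snippets above are tied together by path conventions: each sample is profiled into data/sorted_reads/{assembly}.{sample}.bam-ANVIO_PROFILE, and the merge step collects one profile per sample for a given assembly. The sketch below shows how those paths could be assembled. It is an illustration under stated assumptions, not the Snakefile's actual input function. It assumes that {input.profiles} expands to the PROFILE.db file inside each profile directory and that the contigs database lives at data/anvio/{assembly}/{assembly}.db, as the centrifuge snippet suggests. All function names are hypothetical.

```python
def profile_dir(assembly, sample, reads_dir="data/sorted_reads"):
    """Output directory that the anvi-profile snippet writes for one sample."""
    return f"{reads_dir}/{assembly}.{sample}.bam-ANVIO_PROFILE"


def merge_inputs(assembly, samples, reads_dir="data/sorted_reads"):
    """Assumed merge inputs: one PROFILE.db per sample for this assembly."""
    return [f"{profile_dir(assembly, s, reads_dir)}/PROFILE.db" for s in samples]


def merge_command(assembly, samples, anvio_dir="data/anvio"):
    """Argument list mirroring the anvi-merge snippet for one assembly."""
    contigs_db = f"{anvio_dir}/{assembly}/{assembly}.db"  # assumed location
    return (
        ["anvi-merge"]
        + merge_inputs(assembly, samples)
        + ["-o", f"{anvio_dir}/{assembly}/SAMPLES_MERGED", "-c", contigs_db, "-W"]
    )
```

Printing " ".join(merge_command("asm1", ["s1", "s2"])) shows the full command for a hypothetical assembly asm1 with samples s1 and s2, which can help when debugging a merge outside of Snakemake.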





