Snakemake pipeline for detection limit test from laser-microdissected samples
Help improve this workflow!
This workflow has been published but could be further improved with some additional meta data:- Keyword(s) in categories input, output, operation
You can help improve this workflow by suggesting the addition or removal of keywords, suggest changes and report issues, or request to become a maintainer of the Workflow .
A Snakemake workflow for assessing detection limit from laser-microdissected samples.
Usage
-
Requirements
-
Clone the repository Clone the repository, and set it as the working directory.
git clone --recursive https://github.com/3d-omics/detection_limit_test.git
cd detection_limit_test
- Run the pipeline with the test data (takes 5 minutes)
snakemake \ --use-conda \ --conda-frontend mamba \ -j 8
-
Edit the following files:
-
config/samples.tsv
: the control file with the sequencing libraries and their location. -
config/features.tsv
: the references against which to map the libraries: human, chicken / pig, MAG catalogue. -
config/params.tsv
: parameters for every program. The defaults are reasonable.
-
-
Run the pipeline on a cluster with slurm and go for a walk:
./run_slurm
Brief description
-
Trim reads and remove adaptors with
fastp
-
Map to human, chicken / pig, mag catalogue:
-
Map to the reference with
bowtie2
-
Extract the reads that have one of both ends unmapped with
samtools
-
Map those unmapped reads to the next reference
-
-
Generate MAG statistics with
-
coverm
-
singlem
-
nonpareil
-
-
Generate lots of reports in the
reports/
folder
Rulegraph
Code Snippets
21 22 23 24 25 26 27 28 29 | shell: """ bowtie2-build \ --threads {threads} \ {params.extra} \ {input.reference} \ {output.mock} \ 2> {log} 1>&2 """ |
59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 | shell: """ if [[ {params.is_paired} = "True" ]] ; then (bowtie2 \ -x {input.mock} \ -1 {input.forward_} \ -2 {input.reverse_} \ --threads {threads} \ --rg-id '{params.rg_id}' \ --rg '{params.rg_extra}' \ {params.extra} \ | samtools sort \ --threads {threads} \ -m {params.samtools_mem} \ | samtools rmdup - - \ | samtools view \ --reference {input.reference} \ --output {output.cram} \ --output-fmt cram,level=9,nthreads={threads} \ --write-index \ ) 2> {log} 1>&2 else (bowtie2 \ -x {input.mock} \ -U {input.forward_} \ --threads {threads} \ --rg-id '{params.rg_id}' \ --rg '{params.rg_extra}' \ {params.extra} \ | samtools sort \ --threads {threads} \ -m {params.samtools_mem} \ | samtools rmdup -s - - \ | samtools view \ --reference {input.reference} \ --output {output.cram} \ --output-fmt cram,level=9,nthreads={threads} \ --write-index \ ) 2> {log} 1>&2 fi """ |
123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 | shell: """ if [[ {params.is_paired} = "True" ]] ; then (samtools view \ --reference {input.reference} \ --threads {threads} \ -u \ -o /dev/stdout \ -f 12 \ {input.cram} \ | samtools fastq \ -1 {output.forward_} \ -2 {output.reverse_} \ -0 /dev/null \ -c 9 \ --threads {threads} \ ) 2> {log} 1>&2 else (samtools view \ --reference {input.reference} \ --threads {threads} \ -u \ -o /dev/stdout \ -f 4 \ {input.cram} \ | samtools fastq \ -0 {output.forward_} \ -c 9 \ --threads {threads} \ ) 2> {log} 1>&2 fi """ |
21 22 23 24 25 26 27 28 29 | shell: """ bowtie2-build \ --threads {threads} \ {params.extra} \ {input.reference} \ {output.mock} \ 2> {log} 1>&2 """ |
59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 | shell: """ if [[ {params.is_paired} == "True" ]] ; then (bowtie2 \ -x {input.mock} \ -1 {input.forward_} \ -2 {input.reverse_} \ --threads {threads} \ --rg-id '{params.rg_id}' \ --rg '{params.rg_extra}' \ {params.extra} \ | samtools sort \ -l 9 \ -M \ -m {params.samtools_mem} \ -o {output.cram} \ --reference {input.reference} \ --threads {threads} \ --write-index \ ) 2> {log} 1>&2 else (bowtie2 \ -x {input.mock} \ -U {input.forward_} \ --threads {threads} \ --rg-id '{params.rg_id}' \ --rg '{params.rg_extra}' \ {params.extra} \ | samtools sort \ -l 9 \ -M \ -m {params.samtools_mem} \ -o {output.cram} \ --reference {input.reference} \ --threads {threads} \ --write-index \ ) 2> {log} 1>&2 fi """ |
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | shell: """ if [[ {params.is_paired} = True ]] ; then fastp \ --in1 {input.forward_} \ --in2 {input.reverse_} \ --out1 >(bgzip -l 1 -@ {threads} > {output.forward_}) \ --out2 >(bgzip -l 1 -@ {threads} > {output.reverse_}) \ --unpaired1 >(bgzip -l 1 -@ {threads} > {output.unpaired1}) \ --unpaired2 >(bgzip -l 1 -@ {threads} > {output.unpaired2}) \ --html {output.html} \ --json {output.json} \ --compression 1 \ --verbose \ --adapter_sequence {params.adapter_forward} \ --adapter_sequence_r2 {params.adapter_reverse} \ --length_required {params.length_required} \ --thread {threads} \ {params.extra} \ 2> {log} 1>&2 else fastp \ --in1 {input.forward_} \ --out1 >(bgzip -l 1 -@ {threads} > {output.forward_}) \ --html {output.html} \ --json {output.json} \ --compression 1 \ --verbose \ --adapter_sequence {params.adapter_forward} \ --length_required {params.length_required} \ --thread {threads} \ {params.extra} \ 2> {log} 1>&2 fi """ |
12 13 | shell: "fastqc {input} 2> {log} 1>&2" |
20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | shell: """ if [[ {params.is_paired} = "True" ]] ; then kraken2 \ --db {input.database} \ --threads {threads} \ --paired \ --gzip-compressed \ --output >(pigz > {output.out_gz}) \ --report {output.report} \ {input.forward_} \ {input.reverse_} \ > {log} 2>&1 else kraken2 \ --db {input.database} \ --threads {threads} \ --gzip-compressed \ --output >(pigz > {output.out_gz}) \ --report {output.report} \ {input.forward_} \ > {log} 2>&1 fi """ |
13 14 15 16 17 | shell: """ ln --symbolic $(readlink --canonicalize {input.forward_}) {output.forward_} 2> {log} ln --symbolic $(readlink --canonicalize {input.reverse_}) {output.reverse_} 2> {log} """ |
12 13 14 15 16 17 18 19 20 21 22 23 24 | shell: """ (gzip \ --decompres \ --stdout {input.fa_gz} \ | bgzip \ --compress-level 9 \ --threads {threads} \ --stdout \ /dev/stdin \ > {output.fa_gz} \ ) 2> {log} """ |
43 44 45 46 47 48 49 50 51 52 53 54 55 | shell: """ (gzip \ --decompres \ --stdout {input.fa_gz} \ | bgzip \ --compress-level 9 \ --threads {threads} \ --stdout \ /dev/stdin \ > {output.fa_gz} \ ) 2> {log} """ |
18 19 20 21 22 23 24 25 26 27 28 29 | shell: """ multiqc \ --title {params.library} \ --force \ --filename {params.library} \ --outdir {params.out_dir} \ --dirs \ --dirs-depth 1 \ {input} \ 2> {log} 1>&2 """ |
13 14 15 16 17 18 19 20 21 22 | shell: """ multiqc \ --filename reads \ --title reads \ --force \ --outdir {params.dir} \ {input} \ 2> {log} 1>&2 """ |
37 38 39 40 41 42 43 44 45 46 | shell: """ multiqc \ --title fastp \ --force \ --filename fastp \ --outdir {params.dir} \ {input} \ 2> {log} 1>&2 """ |
61 62 63 64 65 66 67 68 69 70 71 | shell: """ multiqc \ --title {params.title} \ --force \ --filename {params.title} \ --outdir {params.dir} \ --module kraken \ {input} \ 2> {log} 1>&2 """ |
97 98 99 100 101 102 103 104 105 106 107 108 | shell: """ multiqc \ --title {params.title} \ --force \ --filename {params.title} \ --outdir {params.dir} \ --dirs \ --dirs-depth 1 \ {input.reports} \ 2> {log} 1>&2 """ |
135 136 137 138 139 140 141 142 143 144 145 146 | shell: """ multiqc \ --title {params.title} \ --force \ --filename {params.title} \ --outdir {params.dir} \ --dirs \ --dirs-depth 1 \ {input.reports} \ 2> {log} 1>&2 """ |
12 13 | shell: "samtools flagstats {input.cram} > {output.txt} 2> {log}" |
27 28 | shell: "samtools idxstats {input.cram} > {output.tsv} 2> {log}" |
43 44 45 46 47 48 49 50 | shell: """ samtools stats \ --reference {input.reference} \ {input.cram} \ > {output.tsv} \ 2> {log} """ |
64 65 66 67 68 69 70 71 | shell: """ samtools stats \ --reference {input.reference} \ {input.cram} \ > {output.tsv} \ 2> {log} """ |
28 29 30 31 32 33 34 35 36 37 38 39 | shell: """ gzip -dc {input.forward_} > {output.forward_fq} 2> {log} nonpareil \ -s {output.forward_fq} \ -T kmer \ -b {params.prefix} \ -f fastq \ -t {threads} \ 2>> {log} 1>&2 || true """ |
64 65 66 67 68 69 70 | shell: """ Rscript --no-init-file workflow/scripts/aggregate_nonpareil.R \ --input-folder {params.input_dir} \ --output-file {output} \ 2> {log} 1>&2 """ |
99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 | shell: """ if [[ {params.is_paired} = "True" ]] ; then singlem pipe \ --forward {input.forward_} \ --reverse {input.reverse_} \ --otu-table {output.otu_table} \ --archive-otu-table {output.archive_otu_table} \ --taxonomic-profile {output.condense} \ --metapackage {input.data} \ --threads {threads} \ --assignment-threads {threads} \ 2> {log} 1>&2 || true else singlem pipe \ --forward {input.forward_} \ --otu-table {output.otu_table} \ --archive-otu-table {output.archive_otu_table} \ --taxonomic-profile {output.condense} \ --metapackage {input.data} \ --threads {threads} \ --assignment-threads {threads} \ 2> {log} 1>&2 || true fi """ |
151 152 153 154 155 156 157 158 | shell: """ singlem condense \ --input-archive-otu-tables {input.archive_otu_tables} \ --taxonomic-profile {output.condense} \ --metapackage {input.data} \ 2> {log} 1>&2 """ |
190 191 192 193 194 195 196 197 198 199 200 | shell: """ samtools view \ -F 4 \ --threads {threads} \ --reference {input.reference} \ --output {output.bam} \ --fast \ {input.cram} \ 2> {log} """ |
219 220 221 222 223 224 225 226 227 | shell: """ coverm genome \ --bam-files {input.bam} \ --methods {params.method} \ --separator {params.separator} \ --min-covered-fraction {params.min_covered_fraction} \ > {output} 2> {log} || true \ """ |
245 246 247 248 249 250 251 | shell: """ Rscript --no-init-file workflow/scripts/aggregate_coverm.R \ --input-folder {params.input_dir} \ --output-file {output} \ 2> {log} 1>&2 """ |
276 277 278 279 280 281 282 283 | shell: """ coverm contig \ --bam-files {input.bam} \ --methods {params.method} \ --proper-pairs-only \ > {output} 2> {log} || true """ |
301 302 303 304 305 306 307 | shell: """ Rscript --no-init-file workflow/scripts/aggregate_coverm.R \ --input-folder {params.input_dir} \ --output-file {output} \ 2> {log} 1>&2 """ |
Support
- Future updates
Related Workflows





