Viralrecon - Bioinformatics analysis pipeline for low-frequency variant calling.
nf-core/viralrecon is a bioinformatics analysis pipeline used to perform assembly and intrahost/low-frequency variant calling for viral samples. The pipeline currently supports metagenomics and amplicon sequencing data derived from the Illumina sequencing platform.
This pipeline is a re-implementation of the SARS_Cov2_consensus-nf and SARS_Cov2_assembly-nf pipelines initially developed by Sarai Varona and Sara Monzon from BU-ISCIII. Porting both of these pipelines to nf-core was an international collaboration between numerous contributors and developers, led by Harshil Patel from The Bioinformatics & Biostatistics Group at The Francis Crick Institute, London. We appreciated the need to have a portable, reproducible and scalable pipeline for the analysis of COVID-19 sequencing samples, and so the Avengers Assembled! Please come and join us and add yourself to the contributor list :)
We have integrated a number of options in the pipeline to allow you to run specific aspects of the workflow if you so wish. For example, you can skip all of the assembly steps with the --skip_assembly parameter. See the usage docs for all of the available options when running the pipeline.
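As a rough illustration of how such an option is passed, the sketch below assembles a launch command and prints it rather than running it, so it works without Nextflow installed. Only --skip_assembly comes from the text above; the function name, the samplesheet and output directory names, and the -profile, --input and --outdir options are assumptions that should be checked against the usage docs for the release you run.

```shell
#!/usr/bin/env bash
# Sketch only: build a viralrecon launch command and print it.
# Everything except --skip_assembly is an assumption for illustration.
build_viralrecon_cmd() {
    local samplesheet="$1"   # hypothetical samplesheet path
    local outdir="$2"        # hypothetical results directory
    echo nextflow run nf-core/viralrecon \
        -profile docker \
        --input "$samplesheet" \
        --outdir "$outdir" \
        --skip_assembly
}

build_viralrecon_cmd samplesheet.csv results
```

Printing the command first makes it easy to review which steps are switched off before committing compute time to a run.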
Pipeline summary
- Download samples via SRA, ENA or GEO ids (ENA FTP, parallel-fastq-dump; if required)
- Merge re-sequenced FastQ files (cat; if required)
- Read QC (FastQC)
- Adapter trimming (fastp)
- Variant calling
  i. Read alignment (Bowtie 2)
  ii. Sort and index alignments (SAMtools)
  iii. Primer sequence removal (iVar; amplicon data only)
  iv. Duplicate read marking (picard; removal optional)
  v. Alignment-level QC (picard, SAMtools)
  vi. Choice of multiple variant calling and consensus sequence generation routes (VarScan 2, BCFTools, BEDTools || iVar variants and consensus || BCFTools, BEDTools)
    - Variant annotation (SnpEff, SnpSift)
    - Consensus assessment report (QUAST)
- De novo assembly
  i. Primer trimming (Cutadapt; amplicon data only)
  ii. Removal of host reads (Kraken 2)
  iii. Choice of multiple assembly tools (SPAdes || metaSPAdes || Unicycler || minia)
    - Blast to reference genome (blastn)
    - Contiguate assembly (ABACAS)
    - Assembly report (PlasmidID)
    - Assembly assessment report (QUAST)
    - Call variants relative to reference (Minimap2, seqwish, vg, Bandage)
    - Variant annotation (SnpEff, SnpSift)
- Present QC and visualisation for raw read, alignment, assembly and variant calling results (MultiQC)
Code Snippets
"""
zcat $vcf \\
    | grep -v '#' \\
    | awk -v FS='\t' -v OFS='\t' '{print \$1, (\$2-1), (\$2)}' \\
    > variants.bed

bedtools \\
    slop \\
    -i variants.bed \\
    -g $sizes \\
    -b $window \\
    > variants.slop.bed

ASCIIGenome \\
    -ni \\
    -x "trackHeight 0 bam#1 && trackHeight $track_height bam@2 $paired_end && filterVariantReads && save ${prefix}.%r.pdf" \\
    --batchFile variants.slop.bed \\
    --fasta $fasta \\
    $bam \\
    $vcf \\
    $bed_track \\
    $gff_track \\
    > /dev/null

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    asciigenome: \$(echo \$(ASCIIGenome -ni --version 2>&1) | sed -e "s/ASCIIGenome //g")
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
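
The first step of the snippet above turns each VCF record into a one-base BED interval before bedtools slop widens it. The following is a minimal standalone sketch of that conversion in plain shell, without the Nextflow escaping. The function name and the two sample records are made up for illustration, and it filters on lines starting with '#' rather than on any line containing '#', which is a slight tightening of the original grep.

```shell
#!/usr/bin/env bash
# Convert VCF records on stdin to 3-column BED intervals on stdout.
vcf_to_bed() {
    # Skip header lines, then print chrom, POS-1 (0-based start), POS (end).
    grep -v '^#' | awk -v FS='\t' -v OFS='\t' '{print $1, ($2 - 1), $2}'
}

# Made-up two-record VCF body for illustration.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nMN908947.3\t241\t.\tC\tT\nMN908947.3\t3037\t.\tC\tT\n' | vcf_to_bed
```

The subtraction is needed because VCF positions are 1-based while BED start coordinates are 0-based and half-open.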
"""
collapse_primer_bed.py \\
    --left_primer_suffix $left_suffix \\
    --right_primer_suffix $right_suffix \\
    $bed \\
    ${bed.baseName}.collapsed.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | sed 's/Python //g')
END_VERSIONS
"""

"""
sed -r '/^[ACTGactg]+\$/ s/\$/X/g' $adapters > adapters.sub.fa

cutadapt \\
    --cores $task.cpus \\
    $args \\
    $paired \\
    $trimmed \\
    $reads \\
    > ${prefix}.cutadapt.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cutadapt: \$(cutadapt --version)
END_VERSIONS
"""

"""
awk 'BEGIN{OFS=\"\\t\";FS=\"\\t\"}{print \$0,\$5/\$15,\$5/\$14}' $hits | awk 'BEGIN{OFS=\"\\t\";FS=\"\\t\"} \$15 > 200 && \$17 > 0.7 && \$1 !~ /phage/ {print \$0}' > tmp.out
cat $header tmp.out > ${prefix}.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    sed: \$(echo \$(sed --version 2>&1) | sed 's/^.*GNU sed) //; s/ .*\$//')
END_VERSIONS
"""
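
The last snippet above appends two ratio columns to each hit and then filters on them. The sketch below reproduces that two-stage awk filter on made-up rows. The source does not state the column layout, so the meaning given to columns 1, 5, 14 and 15 (subject title, alignment length, subject length, query length) is an assumption, as are the helper names and every sample value.

```shell
#!/usr/bin/env bash
# Append two alignment-fraction columns, then keep well-covered, non-phage hits.
# Assumed 16-column layout: 1=subject title, 5=alignment length,
# 14=subject length, 15=query length (not confirmed by the source).
filter_blast_hits() {
    awk 'BEGIN{OFS="\t";FS="\t"}{print $0, $5/$15, $5/$14}' \
        | awk 'BEGIN{OFS="\t";FS="\t"} $15 > 200 && $17 > 0.7 && $1 !~ /phage/ {print $0}'
}

# Build a made-up hit line from the fields that matter; the rest are placeholders.
make_hit() {  # args: title aln_len subject_len query_len
    printf '%s\tq\ts\t99\t%s\t0\t0\t1\t1\t1\t1\t0\t100\t%s\t%s\t100\n' "$1" "$2" "$3" "$4"
}

{
    make_hit 'virus genome' 900 29903 1000   # kept: 900/1000 = 0.9
    make_hit 'short contig' 150 29903 150    # dropped: column 15 is not above 200
    make_hit 'some phage'   900 5000  1000   # dropped: title matches /phage/
    make_hit 'partial hit'  300 29903 1000   # dropped: 300/1000 = 0.3
} | filter_blast_hits
```

Only the first row survives. Column 17 is the first appended ratio, which is why the second awk stage can filter on it even though the input rows have 16 columns.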
"""
ivar_variants_to_vcf.py \\
    $tsv \\
    ${prefix}.vcf \\
    --fasta $fasta \\
    $args \\
    > ${prefix}.variant_counts.log

cat $header ${prefix}.variant_counts.log > ${prefix}.variant_counts_mqc.tsv

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | sed 's/Python //g')
END_VERSIONS
"""

"""
kraken2-build --db kraken2_db --threads $task.cpus $args --download-taxonomy
kraken2-build --db kraken2_db --threads $task.cpus $args2 --download-library $library
kraken2-build --db kraken2_db --threads $task.cpus $args3 --build

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    kraken2: \$(echo \$(kraken2 --version 2>&1) | sed 's/^.*Kraken version //; s/ .*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""

"""
samtools \\
    mpileup \\
    $args \\
    --reference $fasta \\
    $bam \\
    $mpileup \\
    | awk -v OFS='\\t' '{print \$1, \$2-1, \$2, \$4}' | awk '\$4 < $args2' > lowcov_positions.txt

make_bed_mask.py \\
    $vcf \\
    lowcov_positions.txt \\
    ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    python: \$(python --version | sed 's/Python //g')
END_VERSIONS
"""
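
The awk stages in the last snippet above reduce pileup text to BED-like rows and keep only positions whose depth falls below a cutoff. Here is a small sketch of that step on made-up input. The function name, the sample rows and the cutoff of 10 are illustrative assumptions, and the threshold is passed with awk -v instead of being interpolated into the awk program as the snippet does with $args2.

```shell
#!/usr/bin/env bash
# Print BED-like rows (chrom, start, end, depth) for positions below a depth cutoff.
# Input is pileup-style text on stdin: chrom, 1-based position, ref base, depth, ...
low_coverage_positions() {
    local min_depth="$1"
    awk -v OFS='\t' '{print $1, $2 - 1, $2, $4}' \
        | awk -v min_depth="$min_depth" '$4 + 0 < min_depth + 0'
}

# Made-up rows; the last two columns are placeholders for bases and qualities.
printf 'ref\t1\tA\t5\tx\tx\nref\t2\tC\t30\tx\tx\nref\t3\tG\t9\tx\tx\n' | low_coverage_positions 10
```

With a cutoff of 10, the rows at depth 5 and 9 are printed and the row at depth 30 is dropped. Adding 0 to both sides forces a numeric comparison regardless of how the awk implementation types the values.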
"""
make_variants_long_table.py \\
    --bcftools_query_dir ./bcftools_query \\
    --snpsift_dir ./snpsift \\
    --pangolin_dir ./pangolin \\
    $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | sed 's/Python //g')
END_VERSIONS
"""

"""
## Run MultiQC once to parse tool logs
multiqc -f $args $custom_config .

## Parse YAML files dumped by MultiQC to obtain metrics
multiqc_to_custom_csv.py --platform illumina

## Manually remove files that we don't want in the report
if grep -q ">skip_assembly<" workflow_summary_mqc.yaml; then
    rm -f *assembly_metrics_mqc.csv
fi

if grep -q ">skip_variants<" workflow_summary_mqc.yaml; then
    rm -f *variants_metrics_mqc.csv
fi

rm -f variants/report.tsv

## Run MultiQC a second time
multiqc -f $args -e general_stats --ignore nextclade_clade_mqc.tsv $custom_config .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS
"""

"""
## Run MultiQC once to parse tool logs
multiqc -f $args $custom_config .

## Parse YAML files dumped by MultiQC to obtain metrics
multiqc_to_custom_csv.py --platform nanopore

## Manually remove files that we don't want in the report
rm -rf quast

## Run MultiQC a second time
multiqc -f $args -e general_stats --ignore *nextclade_clade_mqc.tsv $custom_config .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS
"""

"""
plot_base_density.r \\
    --fasta_files $fasta \\
    --prefixes $prefix \\
    --output_dir ./

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//')
END_VERSIONS
"""

"""
plot_mosdepth_regions.r \\
    --input_files ${beds.join(',')} \\
    --output_dir ./ \\
    --output_suffix $prefix \\
    $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//')
END_VERSIONS
"""

"""
sed "s/>/>${meta.id} /g" $fasta > ${prefix}.fa

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    sed: \$(echo \$(sed --version 2>&1) | sed 's/^.*GNU sed) //; s/ .*\$//')
END_VERSIONS
"""

"""
check_samplesheet.py \\
    $samplesheet \\
    samplesheet.valid.csv \\
    --platform $platform

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | sed 's/Python //g')
END_VERSIONS
"""
"""
snpEff \\
    -Xmx${avail_mem}g \\
    ${fasta.baseName} \\
    -config $config \\
    -dataDir $db \\
    $args \\
    $vcf \\
    -csvStats ${prefix}.snpeff.csv \\
    > ${prefix}.snpeff.vcf
mv snpEff_summary.html ${prefix}.snpeff.summary.html

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    snpeff: \$(echo \$(snpEff -version 2>&1) | cut -f 2 -d ' ')
END_VERSIONS
"""

"""
mkdir -p snpeff_db/genomes/
cd snpeff_db/genomes/
ln -s ../../$fasta ${basename}.fa

cd ../../
mkdir -p snpeff_db/${basename}/
cd snpeff_db/${basename}/
ln -s ../../$gff genes.gff

cd ../../
echo "${basename}.genome : ${basename}" > snpeff.config

snpEff \\
    -Xmx${avail_mem}g \\
    build \\
    -config snpeff.config \\
    -dataDir ./snpeff_db \\
    -gff3 \\
    -v \\
    ${basename}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    snpeff: \$(echo \$(snpEff -version 2>&1) | cut -f 2 -d ' ')
END_VERSIONS
"""

"""
SnpSift \\
    -Xmx${avail_mem}g \\
    extractFields \\
    -s "," \\
    -e "." \\
    $args \\
    $vcf \\
    CHROM POS REF ALT \\
    "ANN[*].GENE" "ANN[*].GENEID" \\
    "ANN[*].IMPACT" "ANN[*].EFFECT" \\
    "ANN[*].FEATURE" "ANN[*].FEATUREID" \\
    "ANN[*].BIOTYPE" "ANN[*].RANK" "ANN[*].HGVS_C" \\
    "ANN[*].HGVS_P" "ANN[*].CDNA_POS" "ANN[*].CDNA_LEN" \\
    "ANN[*].CDS_POS" "ANN[*].CDS_LEN" "ANN[*].AA_POS" \\
    "ANN[*].AA_LEN" "ANN[*].DISTANCE" "EFF[*].EFFECT" \\
    "EFF[*].FUNCLASS" "EFF[*].CODON" "EFF[*].AA" "EFF[*].AA_LEN" \\
    > ${prefix}.snpsift.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    snpsift: \$( echo \$(SnpSift split -h 2>&1) | sed 's/^.*version //' | sed 's/(.*//' | sed 's/t//g' )
END_VERSIONS
"""

"""
abacas.pl \\
    -r $fasta \\
    -q $scaffold \\
    $args \\
    -o ${prefix}.abacas

mv nucmer.delta ${prefix}.abacas.nucmer.delta
mv nucmer.filtered.delta ${prefix}.abacas.nucmer.filtered.delta
mv nucmer.tiling ${prefix}.abacas.nucmer.tiling
mv unused_contigs.out ${prefix}.abacas.unused.contigs.out

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    abacas: \$(echo \$(abacas.pl -v 2>&1) | sed 's/^.*ABACAS.//; s/ .*\$//')
END_VERSIONS
"""

"""
artic \\
    guppyplex \\
    $args \\
    --directory $fastq_dir \\
    --output ${prefix}.fastq

pigz -p $task.cpus *.fastq

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    artic: $VERSION
END_VERSIONS
"""

"""
$hd5_plugin_path

artic \\
    minion \\
    $args \\
    --threads $task.cpus \\
    --read-file $fastq \\
    --scheme-directory ./primer-schemes \\
    --scheme-version $version \\
    $model \\
    $fast5 \\
    $summary \\
    $scheme \\
    $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    artic: $VERSION
END_VERSIONS
"""
"""
Bandage image $gfa ${prefix}.png $args
Bandage image $gfa ${prefix}.svg $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bandage: \$(echo \$(Bandage --version 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""

"""
cat $fasta \\
    | bcftools \\
        consensus \\
        $vcf \\
        $args \\
        > ${prefix}.fa

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
bcftools filter \\
    --output ${prefix}.vcf.gz \\
    $args \\
    $vcf

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.vcf.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
echo "${meta.id}" > sample_name.list

bcftools \\
    mpileup \\
    --fasta-ref $fasta \\
    $args \\
    $bam \\
    $intervals \\
    $mpileup \\
    | bcftools call --output-type v $args2 \\
    | bcftools reheader --samples sample_name.list \\
    | bcftools view --output-file ${prefix}.vcf.gz --output-type z $args3

$bgzip_mpileup

tabix -p vcf -f ${prefix}.vcf.gz

bcftools stats ${prefix}.vcf.gz > ${prefix}.bcftools_stats.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
bcftools norm \\
    --fasta-ref ${fasta} \\
    --output ${prefix}.${extension} \\
    $args \\
    --threads $task.cpus \\
    ${vcf}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
bcftools query \\
    --output ${prefix}.txt \\
    $regions_file \\
    $targets_file \\
    $samples_file \\
    $args \\
    $vcf

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
bcftools \\
    sort \\
    --output ${prefix}.${extension} \\
    $args \\
    $vcf

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
bcftools stats \\
    $args \\
    $regions_file \\
    $targets_file \\
    $samples_file \\
    $vcf > ${prefix}.bcftools_stats.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.bcftools_stats.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//')
END_VERSIONS
"""
"""
bedtools \\
    getfasta \\
    $args \\
    -fi $fasta \\
    -bed $bed \\
    -fo ${prefix}.fa

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""

"""
bedtools \\
    maskfasta \\
    $args \\
    -fi $fasta \\
    -bed $bed \\
    -fo ${prefix}.fa

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""

"""
bedtools \\
    merge \\
    -i $bed \\
    $args \\
    > ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""

"""
DB=`find -L ./ -name "*.ndb" | sed 's/\\.ndb\$//'`
blastn \\
    -num_threads $task.cpus \\
    -db \$DB \\
    -query $fasta \\
    $args \\
    -out ${prefix}.blastn.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    blast: \$(blastn -version 2>&1 | sed 's/^.*blastn: //; s/ .*\$//')
END_VERSIONS
"""

"""
makeblastdb \\
    -in $fasta \\
    $args
mkdir blast_db
mv ${fasta}* blast_db

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    blast: \$(blastn -version 2>&1 | sed 's/^.*blastn: //; s/ .*\$//')
END_VERSIONS
"""

"""
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\\.rev.1.bt2\$//"`
[ -z "\$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\\.rev.1.bt2l\$//"`
[ -z "\$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1

bowtie2 \\
    -x \$INDEX \\
    $reads_args \\
    --threads $task.cpus \\
    $unaligned \\
    $args \\
    2> ${prefix}.bowtie2.log \\
    | samtools $samtools_command $args2 --threads $task.cpus -o ${prefix}.bam -

if [ -f ${prefix}.unmapped.fastq.1.gz ]; then
    mv ${prefix}.unmapped.fastq.1.gz ${prefix}.unmapped_1.fastq.gz
fi

if [ -f ${prefix}.unmapped.fastq.2.gz ]; then
    mv ${prefix}.unmapped.fastq.2.gz ${prefix}.unmapped_2.fastq.gz
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
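
The alignment snippet above begins by working out the index prefix from whichever index file it finds, falling back to the large-index suffix and failing loudly if neither exists. A hedged standalone sketch of that lookup follows. The function name, the temporary directory and the placeholder file name are assumptions for illustration, and if-statements replace the chained && tests for readability.

```shell
#!/usr/bin/env bash
# Locate a Bowtie 2 index prefix under a directory by finding the
# "*.rev.1.bt2" file (or "*.rev.1.bt2l") and stripping that suffix.
find_bowtie2_index() {
    local dir="$1"
    local index
    index="$(find -L "$dir" -name '*.rev.1.bt2' | sed 's/\.rev\.1\.bt2$//')"
    if [ -z "$index" ]; then
        index="$(find -L "$dir" -name '*.rev.1.bt2l' | sed 's/\.rev\.1\.bt2l$//')"
    fi
    if [ -z "$index" ]; then
        echo "Bowtie2 index files not found" 1>&2
        return 1
    fi
    echo "$index"
}

# Demonstrate on a throwaway directory holding an empty placeholder file.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/bowtie2"
touch "$demo_dir/bowtie2/genome.rev.1.bt2"
find_bowtie2_index "$demo_dir"
rm -rf "$demo_dir"
```

The printed path ends in bowtie2/genome, which is the form a Bowtie 2 -x argument expects: the shared prefix of the index files rather than any single file.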
"""
mkdir bowtie2
bowtie2-build $args --threads $task.cpus $fasta bowtie2/${fasta.baseName}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
END_VERSIONS
"""

"""
cat ${readList.join(' ')} > ${prefix}.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""

"""
cat ${read1.join(' ')} > ${prefix}_1.merged.fastq.gz
cat ${read2.join(' ')} > ${prefix}_2.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}_1.merged.fastq.gz
touch ${prefix}_2.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""

"""
samtools faidx $fasta
cut -f 1,2 ${fasta}.fai > ${fasta}.sizes

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    getchromsizes: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
touch ${fasta}.fai
touch ${fasta}.sizes

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    getchromsizes: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
[ ! -f ${prefix}.fastq.gz ] && ln -sf $reads ${prefix}.fastq.gz

fastp \\
    --stdout \\
    --in1 ${prefix}.fastq.gz \\
    --thread $task.cpus \\
    --json ${prefix}.fastp.json \\
    --html ${prefix}.fastp.html \\
    $adapter_list \\
    $fail_fastq \\
    $args \\
    2> ${prefix}.fastp.log \\
    | gzip -c > ${prefix}.fastp.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
END_VERSIONS
"""

"""
[ ! -f ${prefix}.fastq.gz ] && ln -sf $reads ${prefix}.fastq.gz

fastp \\
    --in1 ${prefix}.fastq.gz \\
    --out1 ${prefix}.fastp.fastq.gz \\
    --thread $task.cpus \\
    --json ${prefix}.fastp.json \\
    --html ${prefix}.fastp.html \\
    $adapter_list \\
    $fail_fastq \\
    $args \\
    2> ${prefix}.fastp.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
END_VERSIONS
"""

"""
[ ! -f ${prefix}_1.fastq.gz ] && ln -sf ${reads[0]} ${prefix}_1.fastq.gz
[ ! -f ${prefix}_2.fastq.gz ] && ln -sf ${reads[1]} ${prefix}_2.fastq.gz
fastp \\
    --in1 ${prefix}_1.fastq.gz \\
    --in2 ${prefix}_2.fastq.gz \\
    --out1 ${prefix}_1.fastp.fastq.gz \\
    --out2 ${prefix}_2.fastp.fastq.gz \\
    --json ${prefix}.fastp.json \\
    --html ${prefix}.fastp.html \\
    $adapter_list \\
    $fail_fastq \\
    $merge_fastq \\
    --thread $task.cpus \\
    --detect_adapter_for_pe \\
    $args \\
    2> ${prefix}.fastp.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
END_VERSIONS
"""

"""
printf "%s %s\\n" $rename_to | while read old_name new_name; do
    [ -f "\${new_name}" ] || ln -s \$old_name \$new_name
done
fastqc $args --threads $task.cpus $renamed_files

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS
"""

"""
touch ${prefix}.html
touch ${prefix}.zip

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS
"""

"""
gunzip \\
    -f \\
    $args \\
    $archive

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//')
END_VERSIONS
"""

"""
touch $gunzip
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//')
END_VERSIONS
"""
"""
samtools \\
    mpileup \\
    --reference $fasta \\
    $args2 \\
    $bam \\
    $mpileup \\
    | ivar \\
        consensus \\
        $args \\
        -p $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ivar: \$(echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//')
END_VERSIONS
"""

"""
ivar trim \\
    $args \\
    -i $bam \\
    -b $bed \\
    -p $prefix \\
    > ${prefix}.ivar.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ivar: \$(echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//')
END_VERSIONS
"""

"""
samtools \\
    mpileup \\
    $args2 \\
    --reference $fasta \\
    $bam \\
    $mpileup \\
    | ivar \\
        variants \\
        $args \\
        $features \\
        -r $fasta \\
        -p $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ivar: \$(echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//')
END_VERSIONS
"""

"""
kraken2 \\
    --db $db \\
    --threads $task.cpus \\
    --report ${prefix}.kraken2.report.txt \\
    --gzip-compressed \\
    $unclassified_option \\
    $classified_option \\
    $readclassification_option \\
    $paired \\
    $args \\
    $reads

$compress_reads_command

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    kraken2: \$(echo \$(kraken2 --version 2>&1) | sed 's/^.*Kraken version //; s/ .*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""

"""
echo "${read_list}" | sed 's/,/\\n/g' > input_files.txt
minia \\
    $args \\
    -nb-cores $task.cpus \\
    -in input_files.txt \\
    -out $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    minia: \$(echo \$(minia --version 2>&1 | grep Minia) | sed 's/^.*Minia version //;')
END_VERSIONS
"""

"""
mosdepth \\
    --threads $task.cpus \\
    $interval \\
    $reference \\
    $args \\
    $prefix \\
    $bam

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    mosdepth: \$(mosdepth --version 2>&1 | sed 's/^.*mosdepth //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.global.dist.txt
touch ${prefix}.region.dist.txt
touch ${prefix}.summary.txt
touch ${prefix}.per-base.d4
touch ${prefix}.per-base.bed.gz
touch ${prefix}.per-base.bed.gz.csi
touch ${prefix}.regions.bed.gz
touch ${prefix}.regions.bed.gz.csi
touch ${prefix}.quantized.bed.gz
touch ${prefix}.quantized.bed.gz.csi
touch ${prefix}.thresholds.bed.gz
touch ${prefix}.thresholds.bed.gz.csi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    mosdepth: \$(mosdepth --version 2>&1 | sed 's/^.*mosdepth //; s/ .*\$//')
END_VERSIONS
"""
"""
NanoPlot \\
    $args \\
    -t $task.cpus \\
    $input_file
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    nanoplot: \$(echo \$(NanoPlot --version 2>&1) | sed 's/^.*NanoPlot //; s/ .*\$//')
END_VERSIONS
"""

"""
nextclade \\
    dataset \\
    get \\
    $args \\
    --name $dataset \\
    $fasta \\
    $version \\
    --output-dir $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    nextclade: \$(echo \$(nextclade --version 2>&1) | sed 's/^.*nextclade //; s/ .*\$//')
END_VERSIONS
"""

"""
nextclade \\
    run \\
    $args \\
    --jobs $task.cpus \\
    --input-dataset $dataset \\
    --output-all ./ \\
    --output-basename ${prefix} \\
    $fasta

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    nextclade: \$(echo \$(nextclade --version 2>&1) | sed 's/^.*nextclade //; s/ .*\$//')
END_VERSIONS
"""

"""
pangolin \\
    $fasta \\
    --outfile ${prefix}.pangolin.csv \\
    --threads $task.cpus \\
    $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    pangolin: \$(pangolin --version | sed "s/pangolin //g")
END_VERSIONS
"""

"""
picard \\
    -Xmx${avail_mem}M \\
    CollectMultipleMetrics \\
    $args \\
    --INPUT $bam \\
    --OUTPUT ${prefix}.CollectMultipleMetrics \\
    $reference

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    picard: \$(picard CollectMultipleMetrics --version 2>&1 | grep -o 'Version.*' | cut -f2- -d:)
END_VERSIONS
"""

"""
touch ${prefix}.CollectMultipleMetrics.alignment_summary_metrics
touch ${prefix}.CollectMultipleMetrics.insert_size_metrics
touch ${prefix}.CollectMultipleMetrics.quality_distribution.pdf
touch ${prefix}.CollectMultipleMetrics.base_distribution_by_cycle_metrics
touch ${prefix}.CollectMultipleMetrics.quality_by_cycle_metrics
touch ${prefix}.CollectMultipleMetrics.read_length_histogram.pdf
touch ${prefix}.CollectMultipleMetrics.base_distribution_by_cycle.pdf
touch ${prefix}.CollectMultipleMetrics.quality_by_cycle.pdf
touch ${prefix}.CollectMultipleMetrics.insert_size_histogram.pdf
touch ${prefix}.CollectMultipleMetrics.quality_distribution_metrics

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    picard: \$(echo \$(picard CollectMultipleMetrics --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
END_VERSIONS
"""

"""
picard \\
    -Xmx${avail_mem}M \\
    MarkDuplicates \\
    $args \\
    --INPUT $bam \\
    --OUTPUT ${prefix}.bam \\
    --REFERENCE_SEQUENCE $fasta \\
    --METRICS_FILE ${prefix}.MarkDuplicates.metrics.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    picard: \$(echo \$(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
END_VERSIONS
"""

"""
touch ${prefix}.bam
touch ${prefix}.bam.bai
touch ${prefix}.MarkDuplicates.metrics.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    picard: \$(echo \$(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
END_VERSIONS
"""
"""
plasmidID \\
    -d $fasta \\
    -s $prefix \\
    -c $scaffold \\
    $args \\
    -o .
mv NO_GROUP/$prefix ./$prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    plasmidid: \$(echo \$(plasmidID --version 2>&1))
END_VERSIONS
"""

"""
pycoQC \\
    $args \\
    -f $summary \\
    -o ${prefix}.html \\
    -j ${prefix}.json

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    pycoqc: \$(pycoQC --version 2>&1 | sed 's/^.*pycoQC v//; s/ .*\$//')
END_VERSIONS
"""

"""
quast.py \\
    --output-dir $prefix \\
    $reference \\
    $features \\
    --threads $task.cpus \\
    $args \\
    ${consensus.join(' ')}

ln -s ${prefix}/report.tsv

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    quast: \$(quast.py --version 2>&1 | sed 's/^.*QUAST v//; s/ .*\$//')
END_VERSIONS
"""

"""
samtools \\
    flagstat \\
    --threads ${task.cpus} \\
    $bam \\
    > ${prefix}.flagstat

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
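
Every snippet ends by writing a versions.yml entry, usually by flattening a tool's version banner onto one line and trimming it with sed. The sketch below applies the samtools pattern from the snippet above to a made-up banner so it can run without samtools installed. The function name and the banner text are assumptions, not real tool output.

```shell
#!/usr/bin/env bash
# Extract a bare version number from multi-line "--version" style text on stdin.
parse_samtools_version() {
    # Unquoted $(cat) lets the shell join all lines with single spaces,
    # mirroring the echo \$( ... ) idiom used in the snippets.
    echo $(cat) | sed 's/^.*samtools //; s/Using.*$//'
}

# Made-up banner standing in for real tool output.
printf 'samtools 1.15.1\nUsing htslib 1.15.1\n' | parse_samtools_version
```

The first substitution removes everything up to and including the tool name, and the second removes the trailing library details. Note that this pattern leaves a trailing space after the version number.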
"""
samtools \\
    idxstats \\
    --threads ${task.cpus-1} \\
    $bam \\
    > ${prefix}.idxstats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
samtools \\
    index \\
    -@ ${task.cpus-1} \\
    $args \\
    $input

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
touch ${input}.bai
touch ${input}.crai
touch ${input}.csi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
samtools sort $args -@ $task.cpus -o ${prefix}.bam -T $prefix $bam
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.bam

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
samtools \\
    stats \\
    --threads ${task.cpus} \\
    ${reference} \\
    ${input} \\
    > ${prefix}.stats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.stats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
samtools \\
    view \\
    --threads ${task.cpus-1} \\
    ${reference} \\
    ${readnames} \\
    $args \\
    -o ${prefix}.${file_type} \\
    $input \\
    $args2

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""

"""
touch ${prefix}.bam
touch ${prefix}.cram

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
spades.py \\
    $args \\
    --threads $task.cpus \\
    --memory $maxmem \\
    $custom_hmms \\
    $reads \\
    -o ./
mv spades.log ${prefix}.spades.log

if [ -f scaffolds.fasta ]; then
    mv scaffolds.fasta ${prefix}.scaffolds.fa
    gzip -n ${prefix}.scaffolds.fa
fi
if [ -f contigs.fasta ]; then
    mv contigs.fasta ${prefix}.contigs.fa
    gzip -n ${prefix}.contigs.fa
fi
if [ -f transcripts.fasta ]; then
    mv transcripts.fasta ${prefix}.transcripts.fa
    gzip -n ${prefix}.transcripts.fa
fi
if [ -f assembly_graph_with_scaffolds.gfa ]; then
    mv assembly_graph_with_scaffolds.gfa ${prefix}.assembly.gfa
    gzip -n ${prefix}.assembly.gfa
fi

if [ -f gene_clusters.fasta ]; then
    mv gene_clusters.fasta ${prefix}.gene_clusters.fa
    gzip -n ${prefix}.gene_clusters.fa
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    spades: \$(spades.py --version 2>&1 | sed 's/^.*SPAdes genome assembler v//; s/ .*\$//')
END_VERSIONS
"""

"""
bgzip $command -c $args -@${task.cpus} $input > ${output}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${output}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""

"""
tabix $args $tab

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""

"""
touch ${tab}.tbi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""

"""
unicycler \\
    --threads $task.cpus \\
    $args \\
    $short_reads \\
    $long_reads \\
    --out ./

mv assembly.fasta ${prefix}.scaffolds.fa
gzip -n ${prefix}.scaffolds.fa
mv assembly.gfa ${prefix}.assembly.gfa
gzip -n ${prefix}.assembly.gfa
mv unicycler.log ${prefix}.unicycler.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    unicycler: \$(echo \$(unicycler --version 2>&1) | sed 's/^.*Unicycler v//; s/ .*\$//')
END_VERSIONS
"""
"""
mkdir $prefix

## Ensures --strip-components only applied when top level of tar contents is a directory
## If just files or multiple directories, place all in prefix
if [[ \$(tar -taf ${archive} | grep -o -P "^.*?\\/" | uniq | wc -l) -eq 1 ]]; then
    tar \\
        -C $prefix --strip-components 1 \\
        -xavf \\
        $args \\
        $archive \\
        $args2
else
    tar \\
        -C $prefix \\
        -xavf \\
        $args \\
        $archive \\
        $args2
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//')
END_VERSIONS
"""

"""
mkdir $prefix
touch ${prefix}/file.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//')
END_VERSIONS
"""

"""
vcfuniq \\
    $vcf \\
    | bgzip -c $args > ${prefix}.vcf.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    vcflib: $VERSION
END_VERSIONS
"""