DiVA WGS: Next-Generation Sequencing Whole Genome Data Analysis Pipeline
DiVA WGS is a pipeline for Next-Generation Sequencing Whole Genome data analysis.
All solida-core workflows follow the GATK Best Practices for Germline Variant Discovery, with further improvements and refinements incorporated after testing with real data in various CRS4 Next Generation Sequencing Core Facility research sequencing projects.
Pipelines are based on Snakemake, a workflow management system that provides all the features needed to create reproducible and scalable data analyses.
Software dependencies are specified in the environment.yaml file and managed directly by Snakemake using Conda, ensuring the reproducibility of the workflow across a wide range of computing environments such as workstations, clusters and clouds.
Pipeline Overview
The pipeline workflow is composed of two major analysis sections:
- Mapping: single- and/or paired-end reads in FASTQ format are aligned against a reference genome to produce a deduplicated and recalibrated BAM file. This section is executed by the DiMA pipeline.
- Variant Calling: a joint call is performed from all of the project's BAM files.
In parallel, statistics collected during these steps are used to generate Quality Control reports.
A complete view of the analysis workflow is provided by the pipeline's graph.
Pipeline Handbook
DiVA WGS pipeline documentation can be found in the docs/ directory:
- Required Files
- Running the pipeline
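As a rough sketch of how such a Snakemake workflow is usually launched (the core count is an arbitrary example; consult docs/ for the project's actual invocation and required configuration):

```
# Dry-run: preview the jobs Snakemake would execute, without running them
snakemake --use-conda -n

# Real run: Snakemake creates the Conda environments from environment.yaml
# and schedules jobs on up to 8 cores
snakemake --use-conda --cores 8
```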
Contact us
Code Snippets
```
shell:
    "gatk SplitIntervals --java-options {params.custom} "
    "-R {params.genome} "
    "-L {params.intervals} "
    "-mode {params.mode} "
    "--scatter-count {params.scattercount} "
    "-O split "
    ">& {log} "
```
```
shell:
    "gatk HaplotypeCaller --java-options {params.custom} "
    "-R {params.genome} "
    "-I {input.cram} "
    "-O {output.gvcf} "
    "-ERC GVCF "
    "-G StandardAnnotation "
    "-L split/{wildcards.interval}-scattered.interval_list "
    ">& {log} "
```
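For orientation, the {wildcards.interval} values used above correspond to the files SplitIntervals writes into split/. Assuming GATK's default output naming (zero-padded four-digit indices; the scatter count of 4 here is an arbitrary example), the expected paths can be sketched as:

```shell
# Assumption: GATK SplitIntervals default naming, e.g. 0000-scattered.interval_list
scatter_count=4
for i in $(seq 0 $((scatter_count - 1))); do
  printf 'split/%04d-scattered.interval_list\n' "$i"
done
```

Each HaplotypeCaller job then restricts calling to one of these interval lists, so the scattered jobs can run in parallel before joint genotyping.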
```
shell:
    "cp {input.bam} {output.bam} && "
    "cp {input.bai} {output.bai} "
```
```
shell:
    "mkdir -p db; "
    "gatk GenomicsDBImport --java-options {params.custom} "
    "{params.gvcfs} "
    "--genomicsdb-workspace-path db/{wildcards.interval} "
    "-L split/{wildcards.interval}-scattered.interval_list "
    ">& {log} "
```
```
shell:
    "gatk GenotypeGVCFs --java-options {params.custom} "
    "-R {params.genome} "
    "-V gendb://db/{wildcards.interval} "
    "-G StandardAnnotation "
    "-O {output} "
    ">& {log} "
```
```
shell:
    "picard {params.custom} CollectInsertSizeMetrics "
    "I={input.bam} "
    "O={output.metrics} "
    "H={output.histogram} "
    "&> {log} "
```
```
shell:
    "picard {params.custom} CollectWgsMetrics "
    "{params.arguments} "
    "I={input.bam} "
    "O={output.metrics} "
    "R={params.genome} "
    "&> {log} "
```
```
shell:
    "multiqc "
    "{input} "
    "{params.fastqc} "
    "{params.trimming} "
    "{params.params} "
    "-o {params.outdir} "
    "-n {params.outname} "
    "--sample-names {params.reheader} "
    ">& {log}"
```
```
shell:
    "bcftools concat -a {input.vcfs} | bgzip -cf > {output};"
    "tabix -p vcf {output}"
```
```
shell:
    "gatk VariantRecalibrator --java-options {params.custom} "
    "-R {params.genome} "
    "-V {input.vcf} "
    "{params.recal} "
    "-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 "
    "--output {output.recal} "
    "--tranches-file {output.tranches} "
    "--rscript-file {output.plotting} "
    ">& {log}"
```
```
shell:
    "gatk ApplyVQSR --java-options {params.custom} "
    "-R {params.genome} "
    "-V {input.vcf} -mode {params.mode} "
    "--recal-file {input.recal} -ts-filter-level 99.0 "
    "--tranches-file {input.tranches} -O {output} "
    ">& {log}"
```
Support
- Future updates