SVONT-Pipeline: Structural Variant Detection and Annotation from Oxford Nanopore Data using Snakemake
SVONT-pipeline is a pipeline for structural variant detection and annotation from Oxford Nanopore data. Oxford Nanopore Technologies provides technology for long-read sequencing. The pipeline is implemented in Snakemake and defined in a Snakefile. Snakemake is a workflow manager based on Python.
Files in the following formats are used as input:
- sequences in FAST5 format
- sequences in FASTQ format, compressed as .gz files
- reference sequence in FASTA format
Annotated structural variants are summarized in a TSV file, which is the output of SVONT-pipeline.
Features
SVONT-pipeline performs the following steps:
- decompression of the gzipped FASTQ files
- concatenation of all extracted FASTQ files into a single FASTQ file
- conversion of FASTQ files to a file in FASTA format
- computation of read statistics and visualization
- mapping of reads to the reference sequence
- detection of structural variants
- annotation of the detected variants
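The FASTQ to FASTA conversion step can be illustrated with a minimal Python sketch. This is not the repository's fastq_to_fasta.py, whose contents are not shown here; the function name fastq_to_fasta and the assumption of strict four-line FASTQ records (header, sequence, separator, quality) are illustrative only.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy FASTQ records to FASTA, dropping qualities.

    Assumes four-line records; returns the number of records written.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        sequence = fastq_handle.readline().rstrip("\r\n")
        fastq_handle.readline()  # '+' separator line, ignored
        fastq_handle.readline()  # quality line, ignored
        fasta_handle.write(f">{header[1:]}\n{sequence}\n")
        count += 1
    return count


# Small in-memory example with two made-up reads.
example_fastq = io.StringIO("@read1\nACGT\n+\n!!!!\n@read2\nGGCC\n+\n####\n")
example_fasta = io.StringIO()
n_records = fastq_to_fasta(example_fastq, example_fasta)
```

With real files, the same function could be called with handles from open("reads_fastq_all.fastq") and open("reads.fasta", "w").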
Dependencies
The following packages are required to run SVONT-pipeline: minimap2, nanopolish, pysam, samtools, snakemake, sniffles2, NanoPlot, NanoStat, AnnotSV, gzip, and the script fastq_to_fasta.py.
The packages nanopolish, minimap2, samtools, pysam, sniffles=2.0, snakemake, gzip, NanoPlot and NanoStat can be installed using Conda:
conda install -c bioconda nanopolish minimap2 samtools pysam sniffles=2.0 snakemake NanoPlot NanoStat
conda install -c conda-forge gzip
AnnotSV is not installed using Conda; it needs to be cloned from its GitHub repository and installed with make:
git clone https://github.com/lgmgeo/AnnotSV.git
cd AnnotSV
make install
The Python script fastq_to_fasta.py can be downloaded from this repository.
How to run SVONT-pipeline
Configuration file
SVONT-pipeline uses a configuration file in YAML format, which has to contain the following variables:
run: name_of_the_run
fast5Dir: path_to_fast5_directory
ref: path_to_reference_fasta_file (index file should also be present in the same folder)
fastqDir: path_to_fastq_directory
AnnotSV: path_to_AnnotSV_directory_which_was_installed
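Before starting a run, it can help to check that all five variables are present. The sketch below is a hypothetical pre-flight check, not part of the pipeline: parse_flat_config and missing_keys are illustrative names, and the parser only understands flat "key: value" lines (it is not a full YAML parser, and it treats any "#" as the start of a comment). Snakemake itself reads the real file.

```python
# Keys listed in the configuration section above.
REQUIRED_KEYS = ("run", "fast5Dir", "ref", "fastqDir", "AnnotSV")


def parse_flat_config(text):
    """Parse flat 'key: value' lines into a dict (simplified, not full YAML)."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip().strip("'\"")
    return config


def missing_keys(config):
    """Return required keys that are absent or empty, in a fixed order."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Hypothetical config text; the paths are placeholders, not real locations.
example_text = """\
run: example_run
fast5Dir: ../data/example_input/fast5
ref: ../data/ref/reference.fasta
fastqDir: ../data/example_input/fastq
"""
example_missing = missing_keys(parse_flat_config(example_text))
```

Here example_missing contains only "AnnotSV", because that line was left out of the example text on purpose.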
Folder structure
To run SVONT-pipeline, the user needs to have the following folder structure:
|
├── data/
|   ├── example_input/
|   └── ref/
└── src/
    ├── Snakefile
    ├── scripts/
    |   └── fastq_to_fasta.py
    └── config/
        └── example_config.yaml
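The layout can be verified with a short Python check. This is an illustrative helper rather than part of the pipeline; the name missing_paths is an assumption, and the expected paths are taken from the tree above.

```python
from pathlib import Path

# Paths from the folder structure, relative to the project root.
EXPECTED_PATHS = (
    "data/example_input",
    "data/ref",
    "src/Snakefile",
    "src/scripts/fastq_to_fasta.py",
    "src/config/example_config.yaml",
)


def missing_paths(project_root):
    """Return the expected relative paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling missing_paths(".") from the project root returns an empty list when every folder and file is in place.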
Pipeline execution
Run the following command in the src folder:
$ snakemake --configfile config/example_config.yaml -c1
A successful run will create a run directory in the data folder.
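If the pipeline is launched from another script, the same command can be issued through Python's subprocess module. The sketch below is an optional wrapper under assumptions: build_snakemake_command and run_pipeline are hypothetical names, and --cores 1 is the long form of -c1.

```python
import subprocess


def build_snakemake_command(configfile="config/example_config.yaml", cores=1):
    """Build the snakemake argument list for the given config file and core count."""
    return ["snakemake", "--configfile", configfile, "--cores", str(cores)]


def run_pipeline(src_dir="src", **kwargs):
    """Run snakemake inside src_dir; raises CalledProcessError on failure."""
    return subprocess.run(build_snakemake_command(**kwargs), cwd=src_dir, check=True)
```

run_pipeline("src") would then behave like running the command above from inside the src folder, provided snakemake is on the PATH.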
Output
The output comprises the following files and directories in the data folder:
- AnnotSV.log
- reads.fasta
- reads.fasta.index
- reads.fasta.index.fai
- reads.fasta.index.gzi
- reads.fasta.index.readdb
- reads.vcf
- reads_fastq_all.fastq
- reads-ref.sorted.bam
- reads-ref.sorted.bam.bai
- stats
- directory annotation
- directory fastq
- directory graphs
- directory log
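As a quick look at the detected variants, the reads.vcf file can be summarized by variant type. The sketch below is an illustrative helper, not part of the pipeline; it assumes each record carries an SVTYPE entry in the INFO column, which is the usual VCF convention for structural variants. The example records are invented for the demonstration.

```python
from collections import Counter


def count_sv_types(vcf_lines):
    """Count records per SVTYPE value found in the INFO column (8th field)."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        for entry in fields[7].split(";"):
            if entry.startswith("SVTYPE="):
                counts[entry[len("SVTYPE="):]] += 1
                break
    return counts


# Invented records, used only to show the expected input shape.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t1000\tsv1\tN\t<DEL>\t60\tPASS\tSVTYPE=DEL;SVLEN=-500\n",
    "chr1\t5000\tsv2\tN\t<INS>\t60\tPASS\tPRECISE;SVTYPE=INS;SVLEN=300\n",
    "chr2\t2000\tsv3\tN\t<DEL>\t60\tPASS\tSVTYPE=DEL;SVLEN=-120\n",
]
example_counts = count_sv_types(example_vcf)
```

For a real run, an open file handle can be passed instead of the list, for example count_sv_types(open("reads.vcf")).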
Code Snippets
shell: "echo {input}"

shell:
    """
    export ANNOTSV={input.AnnotSV}
    {input.AnnotSV}/bin/AnnotSV -SVinputFile {input.vcf} \
        -annotationMode split -genomeBuild GRCh37 -outputDir {output} >& {log} &
    """

shell: "sniffles --input {input.sorted} --vcf {output}"

shell: "minimap2 -ax map-ont -t 8 {input.ref} {input.fasta} | samtools sort -o {output} -T reads.tmp; samtools index {output}"

shell: "nanopolish index -d {input.fast5Dir} {input.fasta} > {log}"

shell: "NanoPlot -o {output} --color green --format jpg --title {wildcards.run} --fasta {input.fasta}"

shell: "NanoStat -n {output} --fastq {input}"

shell: "cat {input.fastqDir}/* > {output}"

script: "scripts/fastq_to_fasta.py"

shell:
    """
    cp -r {input.fastqDir} {output}
    gzip -d {output}/*
    """





