ATAC-Seq Data Analysis Pipeline for Identifying Nuclear Sites in Ikaros Translocation Study
This is an ATAC-Seq Snakemake pipeline written for a High Performance Computing course. We first ran the ATAC-Seq data analysis with Slurm (https://github.com/dijashis/Projet_HPC). We use ATAC-Seq data from Gomez-Cabrero et al. (2019), obtained from a murine B3 cell line. One of the goals of the study is to identify new nuclear sites following translocation of the transcription factor Ikaros after exposure to the drug tamoxifen. The original data set has 50,000 cells collected per sample, 3 replicates per sample, and 2 cell stages: 0 h and 24 h (harvest time after drug treatment). For reproducibility, follow the Snakemake deployment documentation (https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html).
To launch the pipeline, we need:
- Files that should be in data/mydatalocal/atacseq/ (subsets and bowtie2 index)
- One config file detailing all the options and data needed to launch the pipeline: config/config.yaml, in the config directory
- Snakefile (entry point of the workflow; contains the rules and scripts)
- env.yaml (we need conda with the bioconda and conda-forge channels and the following dependencies for all rules in the Snakefile)
dependencies:
- FastQC==0.11.9
- Cutadapt==3.5
- Bowtie2
- samtools==1.14
- r-base==4.1.1
- openjdk==10.0.2
- picard==2.26.5
- deepTools
- MACS2
- bedtools
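Before launching, it can help to confirm that the files listed above are in place. The following is a minimal sketch of such a check, not part of the pipeline itself; the function name `missing_inputs` is hypothetical, and placing `env.yaml` and the `Snakefile` at the repository root is an assumption.

```python
from pathlib import Path

# Paths named in the list above. Placing env.yaml and the Snakefile at the
# repository root is an assumption; adjust to the actual layout.
REQUIRED_PATHS = [
    "data/mydatalocal/atacseq",  # read subsets and bowtie2 index
    "config/config.yaml",        # pipeline options
    "Snakefile",                 # workflow entry point (assumed location)
    "env.yaml",                  # conda environment (assumed location)
]


def missing_inputs(root=".", required=REQUIRED_PATHS):
    """Return the required paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in required if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing before launch:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Running this from the repository root lists whatever still needs to be copied or created before the workflow is started.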
Code Snippets
shell:
    """
    mkdir -p tmp
    gunzip -c {input} > {output}
    """
shell:
    """
    mkdir -p results/fastqc_init
    fastqc {input} -o "results/fastqc_init" -t {threads}
    """
shell:
    """
    mkdir -p results/trimming
    cutadapt -j {threads} -a CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -A CTGTCTCTTATACACATCTGACGCTGCCGACGA -o {output.clean_R1} -p {output.clean_R2} {input.read} {input.read2}
    """
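To illustrate what 3' adapter trimming does to a read, here is a deliberately simplified sketch. It uses exact string matching only, whereas cutadapt also tolerates mismatches and indels, so it is not cutadapt's algorithm. The function name `trim_3prime_adapter` is hypothetical; the adapter is the read 1 sequence passed with `-a` in the rule above, and the minimum overlap of 3 mirrors cutadapt's default.

```python
# Adapter passed to cutadapt with -a in the trimming rule (read 1).
ADAPTER_R1 = "CTGTCTCTTATACACATCTCCGAGCCCACGAGAC"


def trim_3prime_adapter(read, adapter=ADAPTER_R1, min_overlap=3):
    """Remove a 3' adapter from `read` using exact matching only.

    Simplification: no mismatches or indels are allowed here.
    """
    # Case 1: the whole adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter
    # (at least `min_overlap` bases long). Longest overlap is tried first.
    first = max(0, len(read) - len(adapter))
    for start in range(first, len(read) - min_overlap + 1):
        if adapter.startswith(read[start:]):
            return read[:start]
    # No adapter found: return the read unchanged.
    return read


print(trim_3prime_adapter("ACGTACGT" + ADAPTER_R1 + "GGG"))  # full adapter inside the read
print(trim_3prime_adapter("ACGTACGTCTGTC"))                  # partial adapter at the 3' end
```

Both calls return the insert `ACGTACGT`: in the first the full adapter and everything after it are removed, in the second only the five adapter bases at the end of the read.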
shell:
    """
    mkdir -p results/fastqc_post_trim
    fastqc {input} -o "results/fastqc_post_trim" -t {threads}
    """
shell:
    """
    mkdir -p results/bowtie2
    bowtie2 --very-sensitive -p {threads} -x data/mydatalocal/atacseq/bowtie2/all -1 {input.R1} -2 {input.R2} | samtools view -q 2 -bS - | samtools sort - -o {output.bam}
    """
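The `samtools view -q 2` step in the mapping rule skips alignments whose mapping quality (MAPQ, column 5 of a SAM record) is below 2. The sketch below reproduces that filter on plain SAM text to make the threshold explicit; the function names and the three example records are made up for illustration.

```python
def mapq(sam_record):
    """Return the MAPQ value (column 5) of a tab-separated SAM alignment record."""
    return int(sam_record.split("\t")[4])


def filter_by_mapq(sam_records, min_mapq=2):
    """Keep records whose MAPQ is at least `min_mapq`, as `samtools view -q` does."""
    return [rec for rec in sam_records if mapq(rec) >= min_mapq]


# Three made-up alignment records (names, positions and sequences are illustrative).
records = [
    "read1\t99\tchr1\t100\t42\t8M\t=\t200\t108\tACGTACGT\tIIIIIIII",
    "read2\t99\tchr1\t300\t1\t8M\t=\t400\t108\tACGTACGT\tIIIIIIII",
    "read3\t99\tchr2\t500\t0\t8M\t=\t600\t108\tACGTACGT\tIIIIIIII",
]

kept = filter_by_mapq(records)
print([rec.split("\t")[0] for rec in kept])  # prints ['read1']
```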
shell:
    """
    mkdir -p results/picard
    picard MarkDuplicates -I {input.map} -O {output.bamnet} -M {output.net_txt} -REMOVE_DUPLICATES true
    samtools index -b {output.bamnet}
    """
shell:
    """
    mkdir -p results/deeptools
    plotCoverage --bamfiles {params.bam} --plotFile {output.plot_cov} --smartLabels --plotFileFormat pdf -p 6
    multiBamSummary bins --bamfiles {params.bam} -o {output.summary} -p 6
    plotCorrelation -in {output.summary} --corMethod spearman --skipZeros --whatToPlot heatmap --colorMap RdYlBu --plotNumbers -o {output.plot_corr}
    """
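The correlation heatmap in the deepTools rule compares samples by the Spearman correlation of their per-bin read counts, with `--skipZeros` excluding bins that are zero in every sample. A small standard-library sketch of that calculation for two samples follows; it is an illustration of the statistic, not deepTools' implementation, and the function names and counts are made up.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y, skip_zeros=True):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    pairs = list(zip(x, y))
    if skip_zeros:
        # Drop bins where both samples have zero coverage.
        pairs = [(a, b) for a, b in pairs if not (a == 0 and b == 0)]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up per-bin counts for two samples.
sample_a = [0, 5, 10, 20, 0, 7]
sample_b = [0, 50, 90, 300, 0, 60]
print(round(spearman(sample_a, sample_b), 6))
```

Because the statistic depends only on rank order, the two samples above correlate perfectly even though their count scales differ by roughly a factor of ten.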
shell:
    """
    mkdir -p results/macs
    macs2 callpeak -t {input.pic} -f BAM -n {params.nom} --outdir {params.rep}
    """
shell:
    """
    mkdir -p results/bedtools
    bedtools intersect -a {input.zero} -b {input.V} > {output.commun}
    bedtools intersect -v -a {input.zero} -b {input.V} > {output.uniquezero}
    bedtools intersect -v -a {input.V} -b {input.zero} > {output.onlyV}
    """
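The three `bedtools intersect` calls in the last rule split the peaks into those shared by both conditions and those found in only one. The sketch below mimics that logic on coordinates alone, assuming that `{input.zero}` and `{input.V}` hold the 0 h and 24 h peak sets; the function names and the peaks are made up, and real bedtools output also carries the extra BED columns.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open BED intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(a_peaks, b_peaks):
    """Overlapping portion of each A interval with each B interval it overlaps
    (default behaviour of `bedtools intersect -a A -b B`)."""
    return [
        (a[0], max(a[1], b[1]), min(a[2], b[2]))
        for a in a_peaks
        for b in b_peaks
        if overlaps(a, b)
    ]


def only_in_a(a_peaks, b_peaks):
    """A intervals with no overlap in B (`bedtools intersect -v -a A -b B`)."""
    return [a for a in a_peaks if not any(overlaps(a, b) for b in b_peaks)]


# Made-up peaks for illustration only.
peaks_0h = [("chr1", 100, 200), ("chr1", 500, 600)]
peaks_24h = [("chr1", 150, 250), ("chr2", 500, 600)]

common = intersect(peaks_0h, peaks_24h)       # [('chr1', 150, 200)]
unique_0h = only_in_a(peaks_0h, peaks_24h)    # [('chr1', 500, 600)]
unique_24h = only_in_a(peaks_24h, peaks_0h)   # [('chr2', 500, 600)]
print(common, unique_0h, unique_24h)
```

The nested loops are quadratic in the number of peaks, which is fine for a demonstration; bedtools itself is the right tool for full peak files.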