Snakemake for demultiplexing 16S sequencing files using QIIME1 on the cluster.

Log in to HTCF.
Move to your /scratch/ directory.
Clone the repo:

git clone --recurse-submodules git@github.com:RachelRodgers/demux-htcf-snake.git

Make a directory to hold the snakemake profile:

mkdir -p ~/.config/snakemake/slurm_demux

In the cloned repository, edit line 2 of config/config.yaml to include your email address:

--mail-user=<yourEmailAddress>
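The same edit can be scripted with sed. This is a minimal sketch that runs against a stand-in file, because the full contents of config/config.yaml are not shown here. It assumes line 2 contains the literal placeholder `<yourEmailAddress>`; the address `you@example.org` is hypothetical.

```shell
# Stand-in for config/config.yaml. Only the --mail-user placeholder comes from
# the instructions above; the first line is an illustrative assumption.
demo_cfg="$(mktemp)"
printf '%s\n' 'first line (placeholder)' '--mail-user=<yourEmailAddress>' > "$demo_cfg"

email="you@example.org"   # hypothetical address, replace with your own

# Substitute the placeholder on line 2 only.
# -i.bak edits in place and keeps a backup; this form works with GNU and BSD sed.
sed -i.bak "2s|<yourEmailAddress>|${email}|" "$demo_cfg"

# Show the edited line.
sed -n '2p' "$demo_cfg"
```

To apply it to the real file, point the sed command at config/config.yaml instead of the stand-in, and check the result before copying the file into the profile directory.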

Copy the cluster submit and profile files to the appropriate locations:

cd demux-htcf-snake
cp config/config.yaml ~/.config/snakemake/slurm_demux
cp slurm-submit/*.py ~/.config/snakemake
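A quick check that the copied files landed where `--profile slurm_demux` will look can save a failed submission. The sketch below is an assumption-laden helper, not part of the repository: the function name `check_profile` and the `PROFILE_ROOT` variable are hypothetical, and it only tests for the files named in the copy commands above.

```shell
# Default to the location used in the copy commands above.
PROFILE_ROOT="${PROFILE_ROOT:-$HOME/.config/snakemake}"

check_profile() {
    # Return 0 if the profile config exists and at least one submit script was copied.
    root="$1"
    if [ ! -f "$root/slurm_demux/config.yaml" ]; then
        echo "missing: $root/slurm_demux/config.yaml"
        return 1
    fi
    if ! ls "$root"/*.py >/dev/null 2>&1; then
        echo "no *.py submit scripts found in $root"
        return 1
    fi
    echo "profile looks complete in $root"
}

# Report the state of the profile without aborting the shell on failure.
check_profile "$PROFILE_ROOT" || true
```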

Create a directory to hold your sequences to be demultiplexed and move your data to that directory.
Create a directory to hold your mapping file and copy your mapping file to that directory.
Edit the ./config/demux_config.yaml file appropriately.
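The two directory steps above might look like the following sketch. All directory names are assumptions chosen for illustration, and `WORK` stands in for your /scratch/ directory (it falls back to a temporary directory so the sketch runs anywhere). The keys expected in ./config/demux_config.yaml are not listed here, so the sketch only prints the paths for you to enter.

```shell
# Stand-in for your /scratch/ directory; set WORK yourself for real use.
WORK="${WORK:-$(mktemp -d)}"
SEQ_DIR="$WORK/demux_sequences"   # assumed name for the sequence directory
MAP_DIR="$WORK/demux_mapping"     # assumed name for the mapping-file directory

mkdir -p "$SEQ_DIR" "$MAP_DIR"

# Then move or copy your data in, for example (placeholder paths):
#   mv /path/to/your/sequences/* "$SEQ_DIR"/
#   cp /path/to/your/mapping_file "$MAP_DIR"/

# Print the paths to enter in ./config/demux_config.yaml.
echo "sequence directory: $SEQ_DIR"
echo "mapping directory:  $MAP_DIR"
```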

Submit in one of two ways:

a. With the sbatch script:

sbatch submit_demux_snake.sbatch

b. Interactively:

# start an interactive session
interactive
# load snakemake 5.10
ml snakemake/5.10.0-python-3.6.5
# dry run (prints steps and stops)
snakemake --profile slurm_demux -np
# production run:
snakemake --profile slurm_demux

Slurm output files are written to the logs_slurm/ directory, which is created inside the repository directory.
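To inspect the most recent Slurm log, something like the sketch below may help. The helper name `latest_log` is hypothetical, and because the log file naming scheme is not described here, the sketch relies only on modification time.

```shell
latest_log() {
    # Print the path of the most recently modified file in the given directory.
    dir="${1:-logs_slurm}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    newest="$(ls -t "$dir" | head -n 1)"
    if [ -z "$newest" ]; then
        echo "no log files in $dir yet" >&2
        return 1
    fi
    echo "$dir/$newest"
}

# Show the last lines of the newest log, if one exists.
if log="$(latest_log logs_slurm 2>/dev/null)"; then
    tail -n 20 "$log"
fi
```

Run it from the repository directory once at least one job has started.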

Code Snippets

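shell:
	"""
	# Concatenate the input file(s) for each read type (R1, R2) and
	# index type (I1, I2) into a single file per type.
	cat {input.r1} > {output.r1}
	cat {input.r2} > {output.r2}
	cat {input.i1} > {output.i1}
	cat {input.i2} > {output.i2}
	"""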

SnakeMake From line 91 of workflow/Snakefile

shell:
	"""
	ml qiime
	bash ./workflow/scripts/extractCommand.sh
	"""

SnakeMake QIIME1 From line 107 of workflow/Snakefile

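shell:
	"""
	ml qiime
	# Demultiplex the forward reads (R1) with QIIME 1.
	# The settings -r 999 -n 999 -q 0 -p 0.001 are permissive enough to
	# effectively disable quality filtering, so reads are only assigned to samples.
	split_libraries_fastq.py \
		-i {input.r1} \
		-b {input.bc} \
		-o {params.outdir} \
		-m {MAP} \
		--barcode_type {bc_type} \
		--store_demultiplexed_fastq \
		-r 999 -n 999 -q 0 -p 0.001
	"""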

SnakeMake QIIME1 From line 121 of workflow/Snakefile

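shell:
	"""
	ml qiime
	# Demultiplex the reverse reads (R2) with the same permissive settings.
	# --rev_comp reverse complements each sequence before it is written out.
	split_libraries_fastq.py \
		-i {input.r2} \
		-b {input.bc} \
		-o {params.outdir} \
		-m {MAP} \
		--barcode_type {bc_type} \
		--store_demultiplexed_fastq \
		--rev_comp \
		-r 999 -n 999 -q 0 -p 0.001
	"""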

SnakeMake QIIME1 From line 142 of workflow/Snakefile

shell:
	"""
	ml qiime
	# Split the demultiplexed FASTQ into one file per sample ID.
	split_sequence_file_on_sample_ids.py \
		-i {input} \
		-o {params} \
		--file_type fastq
	"""

SnakeMake QIIME1 From line 164 of workflow/Snakefile

shell:
	"""
	ml qiime
	# Split the demultiplexed FASTQ into one file per sample ID.
	split_sequence_file_on_sample_ids.py \
		-i {input} \
		-o {params} \
		--file_type fastq
	"""

SnakeMake QIIME1 From line 181 of workflow/Snakefile

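shell:
	"""
	# Append the read direction to each per-sample file name,
	# e.g. sample.fastq becomes sample_R1.fastq.
	# The 's/.../.../' syntax requires the Perl version of rename.
	rename 's/\.fastq$/_R1.fastq/' ./results/split_samples_R1/*fastq
	rename 's/\.fastq$/_R2.fastq/' ./results/split_samples_R2/*fastq
	"""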

SnakeMake From line 197 of workflow/Snakefile
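The rename rule above uses the Perl-style `rename` command, which is not the default on every system. If only the util-linux `rename` is available, a plain shell loop can do the same job. The sketch below is an alternative, not the workflow's own method; the function name `add_read_suffix` is hypothetical, and unlike the original command it skips files that already carry the suffix.

```shell
add_read_suffix() {
    # Rename every *.fastq in a directory so that sample.fastq
    # becomes sample_<suffix>.fastq.
    dir="$1"
    suffix="$2"
    for f in "$dir"/*.fastq; do
        # Skip the unexpanded pattern when the glob matches nothing.
        [ -e "$f" ] || continue
        # Skip files that already end in _<suffix>.fastq.
        case "$f" in
            *_"$suffix".fastq) continue ;;
        esac
        mv -- "$f" "${f%.fastq}_${suffix}.fastq"
    done
}

# Usage mirroring the rule above (directories taken from the snippet):
#   add_read_suffix ./results/split_samples_R1 R1
#   add_read_suffix ./results/split_samples_R2 R2
```

Skipping already-renamed files makes the step safe to rerun, at the cost of behaving slightly differently from the Perl expression, which would append the suffix a second time.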

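shell:
	"""
	# Gather the per-sample R1 and R2 files in one directory and gzip them.
	# R1 files are copied (originals kept); R2 files are moved.
	cp ./results/split_samples_R1/*.fastq ./results/compressed/
	mv ./results/split_samples_R2/*.fastq ./results/compressed/
	gzip ./results/compressed/*
	"""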