Alignment and Annotation Pipeline for Klebsiella pneumoniae
MBB659Termproject
MBB659 Final Project: Alignment and annotation pipeline for Klebsiella pneumoniae
Project background
Klebsiella pneumoniae is listed by the World Health Organization as a critical priority pathogen for which new antibiotic treatments are urgently needed: it is highly resistant to most antibiotics and encodes a diverse set of antimicrobial resistance (AMR) genes that are easily transmitted between bacteria [1,2]. Of great concern is K. pneumoniae's role in trafficking AMR genes on a global scale [3]. In addition to this reservoir of AMR genes, carbapenemase genes carried on many large plasmids have recently further reduced the effectiveness of the last-line-of-defence antibiotics used in treatment [4]. K. pneumoniae is also a natural inhabitant of the gastrointestinal microbiome and an important pathogen in nosocomial infections [2]. Hypervirulent strains of K. pneumoniae (hvkp) cause community-linked outbreaks [6], although they are often observed to be less resistant than AMR strains. These hvkp strains have thicker capsules, which help them evade host immune mechanisms and proliferate within the host. However, detection of carbapenemase genes carried in hvkp has been rising [5,6]. I hypothesize that hvkp strains carrying plasmid-encoded carbapenemase genes have more mutations and genetic variants affecting capsule genes, such as rmpA and rmpA2, and that the resulting less effective capsule facilitates the uptake of carbapenemase-encoding plasmids.
The steps for this pipeline are:
- Download the reference genome and reads (SRA_toolkit)
- Align reads to the reference genome (bwa mem)
- Call and manipulate variants (bcftools)
- Annotate genes (snpEff)
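The steps above can be sketched as the shell commands each stage would run. This is a minimal illustration, not the repository's Snakefile: the download, alignment and annotation commands follow the code snippets in this README, while the `bcftools mpileup | bcftools call` line, the `--ploidy 1` setting and the intermediate file names `aligned.bam` and `called.vcf` are assumptions made for the example.

```python
import shlex

# URL, accession and snpEff database name are taken from the pipeline's own rules.
REFERENCE_URL = (
    "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/009/885/"
    "GCA_000009885.1_ASM988v1/GCA_000009885.1_ASM988v1_genomic.fna.gz"
)
READS_ACCESSION = "SRR10160941"
SNPEFF_DB = "Klebsiella_pneumoniae_subsp_pneumoniae_ntuh_k2044"


def build_pipeline_commands(threads=4, outdir="results"):
    """Return the shell commands for the pipeline stages, in execution order."""
    q = shlex.quote
    ref = f"{outdir}/GCA_000009885.1_ASM988v1_genomic.fna"
    read1 = f"{outdir}/{READS_ACCESSION}_1.fastq"
    read2 = f"{outdir}/{READS_ACCESSION}_2.fastq"
    bam = f"{outdir}/aligned.bam"      # hypothetical intermediate name
    called = f"{outdir}/called.vcf"    # hypothetical intermediate name
    annotated = f"{outdir}/annotatedVcf.vcf"
    return [
        # 1. Download the reference genome and the paired-end reads.
        f"wget -nc {REFERENCE_URL} -P {q(outdir)}",
        f"gzip -d {q(ref + '.gz')}",
        f"fastq-dump {READS_ACCESSION} --split-files -O {q(outdir)}",
        # 2. Index the reference, align, keep mapped reads with MAPQ >= 30, sort.
        f"bwa index {q(ref)}",
        f"bwa mem -t {threads} {q(ref)} {q(read1)} {q(read2)}"
        f" | samtools view -u -F 4 -q 30 -@ {threads}"
        f" | samtools sort -O BAM -o {q(bam)} -@ {threads}",
        # 3. Call variants (assumed invocation; haploid bacterial genome).
        f"bcftools mpileup -f {q(ref)} {q(bam)}"
        f" | bcftools call -mv --ploidy 1 -o {q(called)}",
        # 4. Annotate the called variants.
        f"snpEff ann {SNPEFF_DB} {q(called)} > {q(annotated)}",
    ]


if __name__ == "__main__":
    for command in build_pipeline_commands(threads=2):
        print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without the tools installed; in the actual workflow Snakemake decides which of these steps need to run.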
Directed Acyclic Graph:
Here's how to access the git repository:
Cloning the repository:
Create folder where you want the repo
Open terminal in that folder
In terminal enter:
git clone https://github.com/kluongni/MBB659Termproject.git
Change directory to the workflow folder inside the cloned repository with:
cd MBB659Termproject/workflow
This assumes conda is installed. If it is not, see: https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html
Create and activate the conda environment with:
conda env create --file environment.yml
(press y when prompted)
conda activate termProject
Run the Snakemake file within the directory:
snakemake --cores N "results/annotatedVcf.vcf"
This runs the pipeline through to its final output. Replace N with the number of cores you would like to dedicate to the run.
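The core count in the command above can also be chosen programmatically. The sketch below only builds the argument list. The helper name `snakemake_argv` is hypothetical, and the optional `-n` flag (Snakemake's dry run, which lists the planned jobs without executing them) is an addition not used in the original instructions.

```python
import os


def snakemake_argv(target="results/annotatedVcf.vcf", cores=None, dry_run=False):
    """Build the snakemake command line as a list suitable for subprocess.run."""
    if cores is None:
        # Fall back to one core if the CPU count cannot be determined.
        cores = os.cpu_count() or 1
    argv = ["snakemake", "--cores", str(cores)]
    if dry_run:
        argv.append("-n")  # show what would run without running it
    argv.append(target)
    return argv


if __name__ == "__main__":
    print(" ".join(snakemake_argv(cores=4, dry_run=True)))
```

Passing the returned list to `subprocess.run(argv, check=True)` from inside the workflow directory would start the run.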
| Inputs | Description |
|---|---|
| GCA_000009885.1_ASM988v1 | An hvkp whole-genome assembly used as the reference genome for alignment. |
| SRR10160941 | SRA accession for an Illumina high-throughput-sequenced, carbapenemase-carrying hvkp isolate. |

| Outputs | Description |
|---|---|
| snpEff_genes.txt | Text file containing the gene annotation. |
| annotatedVcf.vcf | Annotated VCF file of all the genes found within the aligned sequence. |

The snpEff_genes.txt file provides valuable information on the genes returned from the annotation. The upstream and downstream genetic variants can be parsed from the file for further analysis. Contrary to my hypothesis, the accompanying graph shows no significant number of genetic variants with a detrimental effect on the rmpA gene, which I believed would lead to carbapenemase acquisition.
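Parsing snpEff_genes.txt for the capsule genes could look roughly like the sketch below. It assumes the file is tab-separated with a `#`-prefixed column header beginning with `GeneName` and per-impact count columns named `variants_impact_*`; check these names against the actual file. The sample rows and gene identifiers are made up for illustration and are not results from this project.

```python
import io


def read_gene_table(handle):
    """Parse a tab-separated gene summary whose column header starts with '#'."""
    header = None
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("#").split("\t")
            if len(fields) > 1:  # the column header, not a free-text comment
                header = fields
            continue
        if header is None:
            raise ValueError("no header line found before data rows")
        rows.append(dict(zip(header, line.split("\t"))))
    return rows


def impact_counts(rows, genes):
    """Return {gene: {impact: count}} for the requested gene names."""
    wanted = set(genes)
    prefix = "variants_impact_"
    counts = {}
    for row in rows:
        name = row.get("GeneName")
        if name in wanted:
            # If a gene has several transcript rows, the last one wins here.
            counts[name] = {
                key[len(prefix):]: int(value)
                for key, value in row.items()
                if key.startswith(prefix)
            }
    return counts


# Made-up sample in the assumed layout.
SAMPLE = (
    "# The following table is formatted as tab separated values.\n"
    "#GeneName\tGeneId\tTranscriptId\tBioType"
    "\tvariants_impact_HIGH\tvariants_impact_MODERATE\n"
    "rmpA\tGENE_1\tTX_1\tprotein_coding\t0\t1\n"
    "geneX\tGENE_2\tTX_2\tprotein_coding\t2\t0\n"
)

if __name__ == "__main__":
    table = read_gene_table(io.StringIO(SAMPLE))
    print(impact_counts(table, ["rmpA", "rmpA2"]))
```

For the real output, replace the `io.StringIO(SAMPLE)` handle with `open("snpEff_genes.txt")`, adjusting the path to wherever the pipeline wrote the file.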
References:
1. Navon-Venezia, S., Kondratyeva, K. & Carattoli, A. Klebsiella pneumoniae: a major worldwide source and shuttle for antibiotic resistance. FEMS Microbiol. Rev. 013, 252–275 (2017).
2. Sands, K. et al. Characterization of antimicrobial-resistant Gram-negative bacteria that cause neonatal sepsis in seven low- and middle-income countries. Nat. Microbiol. 23, 24.
3. Wyres, K. L. & Holt, K. E. Klebsiella pneumoniae as a key trafficker of drug resistance genes from environmental to clinically important bacteria. Curr. Opin. Microbiol. 45, 131–139 (2018).
4. Chiu, S. K. et al. Carbapenem Nonsusceptible Klebsiella pneumoniae in Taiwan: Dissemination and Increasing Resistance of Carbapenemase Producers During 2012-2015. Sci. Rep. 8, 1–9 (2018).
5. Lee, C. R. et al. Antimicrobial resistance of hypervirulent Klebsiella pneumoniae: Epidemiology, hypervirulence-associated determinants, and resistance mechanisms. Front. Cell. Infect. Microbiol. 7, (2017).
6. Xie, M. et al. Clinical evolution of ST11 carbapenem resistant and hypervirulent Klebsiella pneumoniae. Commun. Biol. 4, 1–9 (2021).
Code Snippets
Download the reference genome:

    shell:
        """
        wget -nc https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/009/885/GCA_000009885.1_ASM988v1/GCA_000009885.1_ASM988v1_genomic.fna.gz -P results
        gzip -d results/GCA_000009885.1_ASM988v1_genomic.fna.gz
        """

Download the reads:

    shell:
        "fastq-dump SRR10160941 --split-files -O results"

Index the reference FASTA:

    shell:
        """
        samtools faidx {input.genome}
        """

Align the reads, filter and sort:

    shell:
        """
        bwa index {input.genome}
        bwa mem -t {threads} {input.genome} {input.read1} {input.read2} |
            samtools view -u -F 4 -q 30 -@ {threads} |
            samtools sort -O BAM -o {output.alignedBAM} -@ {threads}
        """

Annotate the called variants:

    shell:
        "snpEff ann Klebsiella_pneumoniae_subsp_pneumoniae_ntuh_k2044 {input.calledVcf} > {output.annotatedVcf}"
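The `samtools view -F 4 -q 30` step in the alignment snippet drops reads whose FLAG has the unmapped bit (0x4) set and reads with a mapping quality below 30. The sketch below reproduces that rule in plain Python for illustration only; the function names and the example SAM record are hypothetical, and the pipeline itself relies on samtools.

```python
UNMAPPED = 0x4  # SAM FLAG bit: the read is unmapped


def passes_filter(flag, mapq, exclude_flags=UNMAPPED, min_mapq=30):
    """Mirror `samtools view -F <exclude_flags> -q <min_mapq>` for one record."""
    no_excluded_bits = (flag & exclude_flags) == 0
    return no_excluded_bits and mapq >= min_mapq


def sam_line_passes(line):
    """Apply the filter to one tab-separated SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: FLAG
    mapq = int(fields[4])  # column 5: MAPQ
    return passes_filter(flag, mapq)


if __name__ == "__main__":
    # Made-up record: properly paired, mapped read with MAPQ 60.
    record = "read1\t99\tcontig_1\t100\t60\t50M\t=\t300\t250\t*\t*"
    print(sam_line_passes(record))
```

A read with FLAG 77 (an unmapped first mate) fails the check regardless of its MAPQ, and a mapped read with MAPQ 29 fails on quality alone.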