Barcode Mapping and Read Extraction Workflow with bcmap
bcmap
Maps barcodes to a reference genome and returns the genomic windows from which the barcoded reads most likely originate. Each window is assigned a quality score representing the trustworthiness of the mapping. A barcode index is constructed alongside the mapping and can be used to quickly retrieve all reads belonging to a barcode. For ease of use, we provide a Snakemake workflow to extract all reads from user-defined regions of interest.
Prerequisites
- gcc version 7.2.0
Installation
git clone https://github.com/kehrlab/bcmap.git
cd bcmap
make
Data requirements
- Paired-end linked reads
- Barcodes are stored in the BX:Z: tag of the read IDs
- Sorted by barcode (use e.g. bcctools (10x Genomics linked reads only) or samtools)
To trim, correct, and sort barcodes with bcctools, run the following command in the bcctools folder:
./script/run_bcctools -f fastq first.fq.gz second.fq.gz
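Before mapping, it can be worth confirming that the reads of each barcode really sit next to each other in the input. The following is a minimal sketch, not part of bcmap: the function name check_barcode_grouping and the toy files are hypothetical, and it assumes the BX:Z: tag appears as a whitespace-separated field on every FASTQ header line. It checks only that each barcode forms one contiguous run, which any barcode-sorted file satisfies.

```shell
#!/usr/bin/env bash
# Sketch: verify that all reads of a barcode form one contiguous run in a FASTQ file.
# Assumption: each header line (lines 1, 5, 9, ...) carries a BX:Z:<barcode> field.

check_barcode_grouping() {
    awk '
        NR % 4 == 1 {
            bc = ""
            # Look for the BX:Z: tag among the header fields.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^BX:Z:/) bc = substr($i, 6)
            if (bc != prev) {
                # A barcode that starts a second run means the file is not grouped.
                if (bc in seen) {
                    print "barcode " bc " reappears at line " NR
                    bad = 1
                    exit
                }
                seen[bc] = 1
                prev = bc
            }
        }
        END { exit bad + 0 }
    ' "$1"
}

# Toy input files (made-up reads) to demonstrate both outcomes.
tmp=$(mktemp -d)
printf '%s\n' \
    '@r1 BX:Z:AAAC' 'ACGT' '+' 'IIII' \
    '@r2 BX:Z:AAAC' 'ACGT' '+' 'IIII' \
    '@r3 BX:Z:CCGT' 'ACGT' '+' 'IIII' > "$tmp/grouped.fq"
printf '%s\n' \
    '@r1 BX:Z:AAAC' 'ACGT' '+' 'IIII' \
    '@r2 BX:Z:CCGT' 'ACGT' '+' 'IIII' \
    '@r3 BX:Z:AAAC' 'ACGT' '+' 'IIII' > "$tmp/ungrouped.fq"

check_barcode_grouping "$tmp/grouped.fq" && echo "grouped.fq: ok"
check_barcode_grouping "$tmp/ungrouped.fq" || echo "ungrouped.fq: not grouped by barcode"
```

The check exits with a non-zero status on the first barcode that reappears, so it can be used as a guard in a larger script. For gzipped input, the file would need to be decompressed first.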
Commands
For detailed information on arguments and options, run:
./bcmap [command] --help
index
Builds a minimized open-addressing k-mer index of the reference genome. The index is required to run "map".
./bcmap index reference.fa [options]
map
Maps the barcodes of the provided read files to the reference and creates a barcode index of the read files, which is used to quickly retrieve all reads of a given barcode.
./bcmap map readfile1.fastq readfile2.fastq [options]
Content of output bed-file:
- chromosome startposition endposition barcode mapping_score
bcmap also writes an output.hist file that can be plotted with plot_score_histogram.py, producing a histogram of mapping scores with two peaks. To create a set of mappings with very high precision (at the cost of some recall), set the score threshold to the local minimum between the two peaks. A lower threshold yields better recall at the cost of precision; a higher threshold is not recommended.
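Once a threshold has been chosen from the histogram, it can be applied to the bed-file after the fact. The following is a minimal sketch that relies only on the five-column layout listed above; the threshold value, the toy rows, and the file names are made up for illustration. bcmap may offer its own option for this, so check ./bcmap map --help first.

```shell
#!/usr/bin/env bash
# Hypothetical threshold read off the histogram; choose the valley between
# the two peaks for your own data.
threshold=50

tmp=$(mktemp -d)
# Toy results in the layout: chromosome start end barcode mapping_score
printf '%s\t%s\t%s\t%s\t%s\n' \
    chr21 1000 51000 AACATCGCAAACAGTA 87 \
    chr21 200000 230000 AACATCGCAAACAGTA 12 \
    chr21 400000 465000 CCGTTAGCAAGGTTCA 64 > "$tmp/results.bed"

# Keep only windows whose mapping score (column 5) reaches the threshold.
# Treating the threshold as inclusive is a choice made for this sketch.
awk -v t="$threshold" '$5 + 0 >= t + 0' "$tmp/results.bed" > "$tmp/results.filtered.bed"

cat "$tmp/results.filtered.bed"
```

With the toy data, the window scored 12 is dropped and the two windows scored 87 and 64 are kept.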
get
Returns all reads of the given barcodes. Barcodes can be provided directly as an argument or in a file.
./bcmap get readfile1.fastq readfile2.fastq Barcodes [options]
Example
This small example demonstrates how to use bcmap and allows you to check if it is properly installed. Navigate to the bcmap folder and run the commands listed below.
# building the index for chr21.fa
./bcmap index example/chr21.fa -o example/Index
# mapping the reads of readfile 1 and 2 to chromosome 21
./bcmap map example/readfile.1.fq example/readfile.2.fq -i example/Index -r example/ReadIndex -o example/results.bed
# extracting the first barcode from the results
awk '{if(NR==1) print($4)}' example/results.bed > example/FirstBarcode.txt
# extracting all reads belonging to the first barcode
./bcmap get example/readfile.1.fq example/readfile.2.fq example/FirstBarcode.txt -r example/ReadIndex -o example/readsOfFirstBarcode
# extracting reads of barcode AACATCGCAAACAGTA
./bcmap get example/readfile.1.fq example/readfile.2.fq AACATCGCAAACAGTA -r example/ReadIndex -o example/readsOfAACATCGCAAACAGTA
Code Snippets
shell: "../bcmap index {input.reference} -o {params.index_name}"
shell: "../bcmap map {input.readfile1} {input.readfile2} -i {input.index} -o {output.barcode_index} -r {output.readfile_index} -t {threads}"
shell: """awk '{{if({params.conditions}) print($4)}}' {input.barcode_index} > {output}"""
shell: "../bcmap get {input.readfile1} {input.readfile2} {input.barcodes} -o {params.output_prefix} -r {input.readfile_index}"
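The awk rule in the snippets above selects barcodes from the mapping results according to a configurable condition. As an illustration of what such a condition might look like outside Snakemake, the sketch below collects the barcodes whose mapping window overlaps a region of interest. The region coordinates, the toy rows, and the file names are hypothetical, and writing one barcode per line is an assumption about the barcode file format accepted by "get".

```shell
#!/usr/bin/env bash
# Hypothetical region of interest.
chrom=chr21
start=100000
end=250000

tmp=$(mktemp -d)
# Toy results in the layout: chromosome start end barcode mapping_score
printf '%s\t%s\t%s\t%s\t%s\n' \
    chr21 1000 51000 AACATCGCAAACAGTA 87 \
    chr21 200000 230000 AACATCGCAAACAGTA 12 \
    chr21 240000 465000 CCGTTAGCAAGGTTCA 64 \
    chr21 900000 950000 GGTACCTTAGCATGCA 70 > "$tmp/results.bed"

# A window overlaps the region if it starts before the region ends
# and ends after the region starts. Print the barcode (column 4) once each.
awk -v c="$chrom" -v s="$start" -v e="$end" \
    '$1 == c && $2 < e && $3 > s { print $4 }' "$tmp/results.bed" \
    | sort -u > "$tmp/barcodes.txt"

cat "$tmp/barcodes.txt"
```

The overlap test can be combined with a score condition such as $5 >= 50 if only confidently mapped barcodes should be extracted.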





