Multimodal chromatin profiling using nanobody-based single-cell CUT&Tag

Marek Bartosovic, Gonçalo Castelo-Branco

Code repository accompanying the preprint:

Marek Bartosovic, Gonçalo Castelo-Branco. Multimodal chromatin profiling using nanobody-based single-cell CUT&Tag. bioRxiv 2022.03.08.483459; doi: https://doi.org/10.1101/2022.03.08.483459

Data availability

Processed files (Seurat objects, cellranger fragments files, per-cluster bigwig tracks and .h5 matrices) are available as supplementary files in the GEO repository: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE198467

Reproducing the analysis

Step 1: prepare environment

A conda environment is provided in env/environment.yaml. Create it with:

conda env create -f env/environment.yaml

Additional package dependencies that need to be installed:

R
install.packages(c('argparse','ggplot2','funr','Signac','scales','Seurat','rmarkdown','mclust','GGally','BiocManager','patchwork','markdown','UpSetR','pheatmap','viridis','purrr','Rmagic','devtools','raster'))
BiocManager::install(c('ensembldb','EnsDb.Mmusculus.v79','GenomeInfoDb', 'GenomicRanges', 'IRanges', 'Rsamtools','BiocGenerics','rtracklayer','limma','slingshot', 'DelayedArray', 'DelayedMatrixStats', 'S4Vectors', 'SingleCellExperiment','SummarizedExperiment', 'batchelor', 'Matrix.utils'))

Make sure seqtk is installed and available in $PATH:

https://github.com/lh3/seqtk

Install papermill, a command-line tool for running Jupyter notebooks:

python3 -m pip install papermill
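
Before moving on, it can help to confirm that the command-line tools are actually visible on $PATH. The following is a minimal sketch, not part of the repository; the tool list is collected from this README and from the shell commands in the Code Snippets section, and it does not cover tools that the workflow calls through hard-coded paths (cellranger-atac, SEACR).

```python
import shutil

# Executables named in this README and in the workflow's shell commands.
REQUIRED_TOOLS = [
    "seqtk", "papermill", "fasterq-dump", "snakemake", "Rscript",
    "samtools", "bedtools", "macs2", "bamCoverage", "bgzip", "tabix",
]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on $PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on $PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

Run it inside the activated conda environment, since that is the $PATH the pipeline will see.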

Step 2: Download the data

Use fasterq-dump (or an alternative) to download the fastq files:

https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump

GEO repository

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE198467
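
If you download the runs by hand, the sketch below assembles the same fasterq-dump call that the workflow uses (see the fasterq-dump snippet under Code Snippets). It only builds the command string and does not run anything. The function name is made up for illustration, and the accession and paths in the usage line are placeholders, not real runs from this dataset.

```python
import shlex


def fasterq_dump_command(accession, out_file, tmp_dir, threads=4):
    """Build a fasterq-dump command line for a single SRA run."""
    args = [
        "fasterq-dump",
        "-t", tmp_dir,          # scratch directory for temporary files
        "-f",                   # overwrite existing output files
        "-e", str(threads),     # number of threads
        "--split-files",        # write each read of a spot to its own file
        "--include-technical",  # keep technical reads as well
        "-o", out_file,
        accession,
    ]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Placeholder accession and paths; replace with runs listed under GSE198467.
    print(fasterq_dump_command("SRR0000000", "fastq/SRR0000000.fastq", "/tmp/sra"))
```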

Step 3: Clone the GitHub repo with analysis code

git clone https://github.com/mardzix/bcd_nano_CUTnTag

Step 4: Modify config

Edit config/config.yaml and set the following:

Absolute path to the fastq files
Path to the tmp folder (create any folder, e.g. in your home directory)
Conda environment to use
Path to the cellranger reference
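
The four settings above can be sanity-checked before starting a long run. This is only a sketch: the key names (fastq_dir, tmp_dir, conda_env, cellranger_ref) are hypothetical and almost certainly differ from the real keys in config/config.yaml, so adapt them after opening the file. The config is passed in as a plain dict to keep the example free of extra dependencies.

```python
import os

# Hypothetical key names; check config/config.yaml for the real ones.
REQUIRED_KEYS = ("fastq_dir", "tmp_dir", "conda_env", "cellranger_ref")


def check_config(config):
    """Return a list of human-readable problems found in `config`."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing value for '{key}'")

    fastq_dir = config.get("fastq_dir")
    if fastq_dir and not os.path.isabs(fastq_dir):
        problems.append("'fastq_dir' must be an absolute path")

    tmp_dir = config.get("tmp_dir")
    if tmp_dir and not os.path.isdir(tmp_dir):
        problems.append(f"'tmp_dir' does not exist: {tmp_dir}")

    return problems
```

An empty list means none of these basic checks failed; it does not guarantee that the pipeline will accept the config.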

Step 5: Run the pipeline

The pipeline is implemented in the workflow management system Snakemake.

Adapt the cluster-specific profile to your setup (e.g. slurm, condor), or run without a profile.

For some example profiles see:

https://snakemake.readthedocs.io/en/stable/executing/cli.html#profiles
https://github.com/Snakemake-Profiles/slurm
https://github.com/Snakemake-Profiles/htcondor

snakemake --snakefile code/workflow/Snakefile_single_modality.smk --cores 16 --profile htcondor -p
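
The command above can be varied in two common ways: dropping --profile to run locally, and adding -n for a dry run that only lists the jobs Snakemake would execute. The helper below is an illustrative sketch that builds these variants as strings; the function name and its defaults are assumptions, while the flags themselves (--snakefile, --cores, --profile, -n, -p) are standard Snakemake options.

```python
import shlex


def snakemake_command(snakefile, cores=16, profile=None, dry_run=False):
    """Build a snakemake command line; nothing is executed here."""
    args = ["snakemake", "--snakefile", snakefile, "--cores", str(cores)]
    if profile:
        args += ["--profile", profile]  # cluster profile, e.g. slurm or htcondor
    if dry_run:
        args.append("-n")               # dry run: show planned jobs only
    args.append("-p")                   # print the shell command of each job
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    snakefile = "code/workflow/Snakefile_single_modality.smk"
    print(snakemake_command(snakefile, dry_run=True))         # local dry run
    print(snakemake_command(snakefile, profile="htcondor"))   # cluster run
```

A dry run is a cheap way to catch wrong paths in the config before any job is submitted.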

Johannes Köster, Sven Rahmann, Snakemake—a scalable bioinformatics workflow engine, Bioinformatics, Volume 28, Issue 19, 1 October 2012, Pages 2520–2522, https://doi.org/10.1093/bioinformatics/bts480

Code Snippets

shell:
    'Rscript {input.script} -i {input.seurat} -m {wildcards.combination} -o {output}'

SnakeMake From line 25 of workflow/Snakefile_multimodal.smk

shell:
    "Rscript -e \"rmarkdown::render(input='{input.notebook}', "
    "                                output_file = '{params.report}', "
    "                                params=list(seurat = '{params.seurat_in}', "
    "                                           output = '{params.seurat_out}'))\" "

SnakeMake From line 39 of workflow/Snakefile_multimodal.smk

shell:
    "Rscript {input.script} --seurat {input.seurat} --clusters OPC MOL --reduction wnn.umap --out {output.seurat}"

SnakeMake From line 52 of workflow/Snakefile_multimodal.smk

shell:
    'Rscript {input.script} -i {input.seurat} -o {output.seurat} {params.clusters} -d {params.idents} -m {params.modalities} -a {params.assay} -t {input.seurat_pt}'

SnakeMake From line 68 of workflow/Snakefile_multimodal.smk

shell:
    "fasterq-dump -t {params.tmp} -f -e {threads} --split-files --include-technical -o {params.out} {wildcards.SRA}"

SnakeMake From line 39 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "gzip {input}"

SnakeMake From line 47 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "mv {input} {output}"

SnakeMake From line 55 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'rm -rf results/nbiotech_data/cellranger/{wildcards.sample}/; '
    'cd results/nbiotech_data/cellranger/; '
    '/data/bin/cellranger-atac count --id {wildcards.sample} --sample {params.sample} --reference {params.cellranger_ref} --fastqs {params.fastq_dir}'

SnakeMake From line 70 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'bamCoverage -b {input.cellranger_bam} -o {output.bigwig} -p {threads} --minMappingQuality 5 '
    ' --binSize 50 --centerReads --smoothLength 250 --normalizeUsing RPKM --ignoreDuplicates --extendReads'

SnakeMake DeepTools From line 82 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'macs2 callpeak -t {input.cellranger_bam} -g mm -f BAMPE -n {wildcards.sample} '
    '--outdir {params.macs_outdir} --keep-dup=1 --llocal 100000 --cutoff-analysis --min-length 1000 --max-gap 1000  2>&1 '

SnakeMake macs2 From line 93 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'macs2 callpeak -t {input.cellranger_bam} -g mm -f BAMPE -n {wildcards.sample} '
    '--outdir {params.macs_outdir} --keep-dup=1 --llocal {wildcards.llocal} 2>&1 '

SnakeMake macs2 From line 104 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'macs2 callpeak -t {input} -g mm -f BAMPE -n {wildcards.sample} '
    '--outdir {params.macs_outdir} --llocal 100000 --keep-dup=1 --broad-cutoff=0.1 ' 
    '--min-length 1000 --max-gap 1000 --broad 2>&1 '

SnakeMake macs2 From line 115 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "wget -O {output} {params.url}"

SnakeMake From line 125 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "bedtools genomecov -bg -g {input.genome} -i {input.fragments} > {output}"

SnakeMake BEDTools From line 134 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "~/bin/SEACR/SEACR_1.3.sh {input} 0.01 norm relaxed {params.out_prefix}"

SnakeMake From line 144 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'python3 {params.script} {input.fragments} {wildcards.sample} | bgzip > {output.fragments}; '
    'tabix -p bed {output.fragments}'

SnakeMake tabix From line 155 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'bedtools intersect -abam {input.bam} -b {input.peaks} -u | samtools view -f2 | '
    'awk -f {params.get_cell_barcode} | sed "s/CB:Z://g" | python3 {params.add_sample_to_list} {wildcards.sample} | '
    'sort -T {params.tmpdir} | uniq -c > {output.overlap} && [[ -s {output.overlap} ]] ; '

SnakeMake SAMtools BEDTools From line 169 of workflow/Snakefile_pre_nbiotech.smk

shell:
  ' samtools view -f2 {input.bam}| '
  'awk -f {params.get_cell_barcode} | sed "s/CB:Z://g" | python3 {params.add_sample_to_list} {wildcards.sample} | '
  'sort -T {params.tmpdir} | uniq -c > {output.all_bcd} && [[ -s {output.all_bcd} ]] ; '

SnakeMake SAMtools From line 183 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "Rscript {params.script} --metadata {input.metadata} --fragments {input.fragments} --bcd_all {input.bcd_all} --bcd_peak {input.bcd_peak} --sample {wildcards.sample} --out_prefix {params.out_prefix}"

SnakeMake From line 205 of workflow/Snakefile_pre_nbiotech.smk
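
The two preceding rules write per-barcode read counts with `sort | uniq -c`: one file for all properly paired reads (bcd_all) and one for reads overlapping peaks (bcd_peak). What the R script does with them is not shown here. One plausible use, sketched below as an assumption, is the fraction of reads in peaks per cell barcode. The barcode strings in the usage example are invented, and the exact line format produced by the add_sample_to_list script may differ.

```python
def parse_uniq_counts(lines):
    """Parse `uniq -c` output ("<count> <key>") into a {key: count} dict."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        count, key = line.split(None, 1)  # split on the first whitespace run
        counts[key] = int(count)
    return counts


def fraction_in_peaks(all_counts, peak_counts):
    """Per-barcode fraction of reads that overlap peaks (0.0 if none do)."""
    return {
        barcode: peak_counts.get(barcode, 0) / total
        for barcode, total in all_counts.items()
        if total > 0
    }


if __name__ == "__main__":
    # Invented barcodes, for illustration only.
    all_counts = parse_uniq_counts(["     10 sampleA_AAAC-1", "      4 sampleA_TTTG-1"])
    peak_counts = parse_uniq_counts(["      5 sampleA_AAAC-1"])
    print(fraction_in_peaks(all_counts, peak_counts))
```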

shell:
    "Rscript {input.script} --sample {wildcards.sample} --antibody {params.antibody} --metadata {input.metadata}  --fragments {input.fragments} --peaks {input.peaks} --out_prefix {params.out_prefix} --window {wildcards.binwidth} --genome_version {params.genome}"

SnakeMake From line 220 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'wget -O {output} {params.url}'

SnakeMake From line 228 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'cd {params.dirname}; '
    'tar -xvzf `basename {input.archive}`'

SnakeMake From line 238 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'wget -O {output.bw_tar} {params.url}'

SnakeMake From line 247 of workflow/Snakefile_pre_nbiotech.smk

shell:
    'cd {params.dirname}; '
    'tar -xvzf `basename {input.bw_tar}`'

SnakeMake From line 287 of workflow/Snakefile_pre_nbiotech.smk

shell:
    "python3 {input.script} -i {input.fastq} -o {params.out_folder} --single_cell --barcode {wildcards.barcode} 2>&1"

SnakeMake From line 64 of workflow/Snakefile_preprocess.smk

shell:
    'rm -rf results/multimodal_data/{wildcards.sample}/cellranger/{wildcards.sample}_{wildcards.antibody}_{wildcards.barcode}/; '
    'cd results/multimodal_data/{wildcards.sample}/cellranger/; '
    '/data/bin/cellranger-atac count --id {wildcards.sample}_{wildcards.antibody}_{wildcards.barcode} --reference {params.cellranger_ref} --fastqs {params.fastq_folder}'

SnakeMake From line 80 of workflow/Snakefile_preprocess.smk

shell:
    'bamCoverage -b {input.cellranger_bam} -o {output.bigwig} -p {threads} --minMappingQuality 5 '
    ' --binSize 50 --centerReads --smoothLength 250 --normalizeUsing RPKM --ignoreDuplicates --extendReads'

SnakeMake DeepTools From line 91 of workflow/Snakefile_preprocess.smk

shell:
    'macs2 callpeak -t {input.cellranger_bam} -g mm -f BAMPE -n {wildcards.antibody} '
    '--outdir {params.macs_outdir} --keep-dup=1 --llocal 100000 --cutoff-analysis --min-length 1000 --max-gap 1000  2>&1 '

SnakeMake macs2 From line 102 of workflow/Snakefile_preprocess.smk

shell:
    'macs2 callpeak -t {input.cellranger_bam} -g mm -f BAMPE -n {wildcards.antibody} '
    '--outdir {params.macs_outdir} --keep-dup=1 --llocal {wildcards.llocal} 2>&1 '

SnakeMake macs2 From line 113 of workflow/Snakefile_preprocess.smk

shell:
    'macs2 callpeak -t {input} -g mm -f BAMPE -n {wildcards.antibody} '
    '--outdir {params.macs_outdir} --llocal 100000 --keep-dup=1 --broad-cutoff=0.1 ' 
    '--min-length 1000 --max-gap 1000 --broad 2>&1 '

SnakeMake macs2 From line 124 of workflow/Snakefile_preprocess.smk

shell:
    'cut -f1,2 {params.faidx} > {output}'

SnakeMake From line 134 of workflow/Snakefile_preprocess.smk

shell:
    "bedtools genomecov -bg -g {input.genome} -i {input.fragments} > {output}"

SnakeMake BEDTools From line 143 of workflow/Snakefile_preprocess.smk

shell:
    "~/bin/SEACR/SEACR_1.3.sh {input} 0.01 norm relaxed {params.out_prefix}"

SnakeMake From line 153 of workflow/Snakefile_preprocess.smk

shell:
    'python3 {params.script} {input.fragments} {wildcards.sample} | bgzip > {output.fragments}; '
    'tabix -p bed {output.fragments}'

SnakeMake tabix From line 164 of workflow/Snakefile_preprocess.smk

shell:
    'bedtools intersect -abam {input.bam} -b {input.peaks} -u | samtools view -f2 | '
    'awk -f {params.get_cell_barcode} | sed "s/CB:Z://g" | python3 {params.add_sample_to_list} {wildcards.sample} | '
    'sort -T {params.tmpdir} | uniq -c > {output.overlap} && [[ -s {output.overlap} ]] ; '

SnakeMake SAMtools BEDTools From line 178 of workflow/Snakefile_preprocess.smk

shell:
  ' samtools view -f2 {input.bam}| '
  'awk -f {params.get_cell_barcode} | sed "s/CB:Z://g" | python3 {params.add_sample_to_list} {wildcards.sample} | '
  'sort -T {params.tmpdir} | uniq -c > {output.all_bcd} && [[ -s {output.all_bcd} ]] ; '

SnakeMake SAMtools From line 192 of workflow/Snakefile_preprocess.smk

shell:
    "Rscript {params.script} --metadata {input.metadata} --fragments {input.fragments} --bcd_all {input.bcd_all} --bcd_peak {input.bcd_peak} --antibody {wildcards.modality} --sample {wildcards.sample} --out_prefix {params.out_prefix}"

SnakeMake From line 214 of workflow/Snakefile_preprocess.smk

shell:
    "Rscript {input.script} --sample {wildcards.sample}   --antibody {wildcards.antibody} --metadata {input.metadata} --fragments {input.fragments} --out_prefix {params.out_prefix} --window {wildcards.binwidth} --genome_version {params.genome}"

SnakeMake From line 227 of workflow/Snakefile_preprocess.smk

shell:
    "Rscript {input.script} --sample {wildcards.sample} --antibody {wildcards.modality} --metadata {input.metadata} --fragments {input.fragments} "
    "--peaks {input.peaks} --out_prefix {params.out_prefix} --genome_version {params.genome}"

SnakeMake From line 242 of workflow/Snakefile_preprocess.smk

shell:
    'zcat {input.fragments} | sort -T {params.tmpdir} -k1,1 -k2,2n | bgzip > {output.fragments_merged} && tabix -p bed {output.fragments_merged}'

SnakeMake tabix From line 255 of workflow/Snakefile_preprocess.smk
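
The merge rule above sorts the concatenated fragments with `sort -k1,1 -k2,2n` because tabix needs a position-sorted, bgzip-compressed file before it can index it. The sketch below reproduces that ordering in Python on a few made-up lines to show why the start column must be compared numerically. It assumes tab-separated fragment lines with chromosome and start in the first two columns; chromosome order under the shell `sort` also depends on the locale, which this sketch ignores.

```python
def sort_fragments(lines):
    """Sort tab-separated fragment lines by chromosome, then numeric start."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    # int() matters: as strings, "1000" would sort before "200".
    rows.sort(key=lambda fields: (fields[0], int(fields[1])))
    return ["\t".join(fields) for fields in rows]


if __name__ == "__main__":
    # Made-up fragments: chrom, start, end, barcode, count.
    fragments = [
        "chr2\t100\t200\tbcdA\t1\n",
        "chr1\t1000\t1100\tbcdB\t1\n",
        "chr1\t200\t290\tbcdC\t2\n",
    ]
    for line in sort_fragments(fragments):
        print(line)
```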

shell:
    "bedToBam -i {input.fragments} -g {input.chrom_sizes} > {output.bam} && "
    "samtools sort -@ {threads} -o {output.bam_sorted} {output.bam} && "
    "samtools index {output.bam_sorted} && "
    "bamCoverage -b {output.bam_sorted} -o {output.bigwig} -p {threads} --minMappingQuality 5 --binSize 50 --smoothLength 250 --normalizeUsing RPKM --ignoreDuplicates"

SnakeMake SAMtools DeepTools From line 268 of workflow/Snakefile_preprocess.smk

shell:
    'macs2 callpeak -t {input} -g mm -f BED -n {wildcards.modality} '
    '--outdir {params.macs_outdir} --llocal 100000 --keep-dup=1 --broad-cutoff=0.1 ' 
    '--min-length 1000 --max-gap 1000 --broad --nomodel 2>&1 '

SnakeMake macs2 From line 281 of workflow/Snakefile_preprocess.smk

shell:
    'Rscript {input.script} -i {input.seurat} -o {output.seurat}'

SnakeMake From line 293 of workflow/Snakefile_preprocess.smk

shell:
    'Rscript {params.script} -i {input.seurat} -o {output.seurat} -a {wildcards.feature} -d 40 -g {params.plot_group} '

SnakeMake From line 307 of workflow/Snakefile_preprocess.smk

shell:
    "Rscript {input.script} -i {input.seurat} -r {input.rna} -o {output}"

SnakeMake From line 54 of workflow/Snakefile_single_modality.smk

shell:
    "Rscript -e \"rmarkdown::render(input='{input.notebook}', "
    "                                output_file = '{params.report}', "
    "                                params=list(out_prefix = '{params.out_prefix}', "
    "                                           modality = '{wildcards.modality}', "
    "                                           feature = '{wildcards.feature}', "
    "                                           input = '{params.out_prefix}{input.seurat}', "
    "                                           integrated = '{params.out_prefix}{input.integrated}', "                # TODO - fix absolute paths integration                         
    "                                           output = '{params.out_prefix}{output.seurat}'))\" "

SnakeMake From line 68 of workflow/Snakefile_single_modality.smk

shell:
    'Rscript {input.script} --input {input.seurat} --output {output.seurat}'

SnakeMake From line 85 of workflow/Snakefile_single_modality.smk

shell:
    "Rscript {input.script}  --input {input.seurat} --fragments {input.fragments} --output_folder {output.bigwig} --idents {wildcards.idents} "

SnakeMake From line 95 of workflow/Snakefile_single_modality.smk

shell:
    "Rscript {input.script} -i {input.seurat} -o {output.markers} --idents {wildcards.idents}"

SnakeMake From line 105 of workflow/Snakefile_single_modality.smk

shell:
    'Rscript {input.script} -i {input.seurat} -o {output}'

SnakeMake From line 114 of workflow/Snakefile_single_modality.smk

shell:
    "Rscript {input.script} -i {input.seurat} -o {output.csv} -d {wildcards.ident}"

SnakeMake From line 123 of workflow/Snakefile_single_modality.smk

shell:
    "python3 {input.script} {input.bam} {wildcards.sample} {output.bam}"

SnakeMake From line 132 of workflow/Snakefile_single_modality.smk

shell:
    "samtools merge -@ {threads} {output.bam} {input.bam}"

SnakeMake SAMtools From line 151 of workflow/Snakefile_single_modality.smk

shell:
    'samtools index {input.bam}; '
    'bamCoverage -b {input.bam} -o {output.bw} -p {threads} --normalizeUsing RPKM'

SnakeMake SAMtools DeepTools From line 160 of workflow/Snakefile_single_modality.smk

shell:
    "python3 {input.script} {input.bam} {input.table} NA {output.bam_files}"

SnakeMake From line 172 of workflow/Snakefile_single_modality.smk

shell:
    'sh {input.script} {input.bam} {output.bw} {threads}'

SnakeMake From line 183 of workflow/Snakefile_single_modality.smk

shell:
    'cat {input.csv} | grep {wildcards.sample} > {output.csv_per_sample}; '
    'python3 {input.script} {input.bam} {output.csv_per_sample} NA {output.bam_per_cluster}'

SnakeMake From line 194 of workflow/Snakefile_single_modality.smk

shell:
    'sh {input.script} {input.bam} {output.bw} {threads}'

SnakeMake From line 205 of workflow/Snakefile_single_modality.smk

shell:
    'sh {input.script}  {input.bam} {output.peaks}'

SnakeMake From line 214 of workflow/Snakefile_single_modality.smk

shell:
    'Rscript {input.script} --input {input.csv} --nmarkers {params.nmarkers} --output {output.bed}'

SnakeMake From line 225 of workflow/Snakefile_single_modality.smk

shell:
    'python3 {params.script} {input} > {output}'

SnakeMake From line 236 of workflow/Snakefile_single_modality.smk
