HRIBO: High-throughput annotation by Ribo-seq workflow for analyzing bacterial Ribo-seq data
We present HRIBO (High-throughput annotation by Ribo-seq), a workflow to enable reproducible and high-throughput analysis of bacterial Ribo-seq data. The workflow performs all required pre-processing steps and quality control. Importantly, HRIBO outputs annotation-independent ORF predictions based on two complementary prokaryotic-focused tools, and integrates them with additional computed features. This facilitates both the rapid discovery of ORFs and their prioritization for functional characterization.
For a detailed description of this workflow, the installation, usage and examples, please refer to the ReadTheDocs documentation.
HRIBO installs all dependencies via conda. Once you have conda installed, simply type:
conda create -c bioconda -c conda-forge -n snakemake snakemake
source activate snakemake
Basic usage
The following steps retrieve the input files and run the workflow, either locally or on a server cluster via a queuing system. Create a project directory and change into it:
mkdir project
cd project
Retrieve HRIBO from GitHub:
git clone git@github.com:gelhausr/HRIBO.git
The workflow requires a genome sequence (fasta), an annotation file (gtf) and the sequencing result files (fastq). We recommend retrieving both the genome and the annotation file from Ensembl Genomes. Copy the genome and the annotation file into the project folder, decompress them and name them genome.fa and annotation.gtf.
Create a folder fastq and copy your compressed fastq.gz files into the fastq folder.
Copy the templates of the sample sheet and the config file into the HRIBO folder:
cp HRIBO/templates/config.yaml HRIBO/
cp HRIBO/templates/samples.tsv HRIBO/
Customize config.yaml with the adapter sequence you used and, optionally, the path to a precomputed STAR genome index. For correct removal of reads mapping to ribosomal genes, specify the taxonomic group of your organism (Eukarya, Bacteria, Archaea). Then edit the sample sheet for your project, using one line per sequencing result and stating the method (RIBO for ribosome profiling, RNA for RNA-seq), the condition (e.g. A, B, CTRL, TREAT), the replicate (e.g. 1, 2, ...) and the filename. The following is an example:
method | condition | replicate | fastqFile |
---|---|---|---|
RIBO | A | 1 | "fastq/FP-ctrl-1-2.fastq.gz" |
RIBO | B | 1 | "fastq/FP-treat-1-2.fastq.gz" |
RNA | A | 1 | "fastq/Total-ctrl-1-2.fastq.gz" |
RNA | B | 1 | "fastq/Total-treat-1-2.fastq.gz" |
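Before starting a run, it can help to check the sample sheet for typos. The following is a minimal sketch, not part of HRIBO: it assumes the sheet is tab-separated with exactly the four header names shown above, and the function name `read_sample_sheet` is hypothetical.

```python
import csv
import io

ALLOWED_METHODS = {"RIBO", "RNA"}  # the two methods named in the text above
EXPECTED_COLUMNS = ["method", "condition", "replicate", "fastqFile"]  # assumed header


def read_sample_sheet(handle):
    """Parse a tab-separated sample sheet and return one dict per sample."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    samples = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["method"] not in ALLOWED_METHODS:
            raise ValueError(f"line {line_no}: unknown method {row['method']!r}")
        if not (row["replicate"] or "").isdigit():
            raise ValueError(f"line {line_no}: replicate must be an integer")
        samples.append(row)
    return samples


# Two rows taken from the example table above.
example = (
    "method\tcondition\treplicate\tfastqFile\n"
    "RIBO\tA\t1\tfastq/FP-ctrl-1-2.fastq.gz\n"
    "RNA\tA\t1\tfastq/Total-ctrl-1-2.fastq.gz\n"
)
samples = read_sample_sheet(io.StringIO(example))
print([(s["method"], s["condition"], s["replicate"]) for s in samples])
```

To check a real file, pass an open file handle instead of the `io.StringIO` object.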
Now you can start your workflow.
Run Snakemake locally:
snakemake --use-conda -s HRIBO/Snakefile --configfile HRIBO/config.yaml --directory ${PWD} -j 20 --latency-wait 60
Run Snakemake on the cluster:
Edit cluster.yaml according to your queuing system and cluster hardware. The following example works for Grid Engine:
snakemake --use-conda -s HRIBO/Snakefile --configfile HRIBO/config.yaml --directory ${PWD} -j 20 --cluster-config HRIBO/cluster.yaml --cluster "qsub -N {cluster.jobname} -cwd -q {cluster.qname} -pe {cluster.parallelenvironment} -l {cluster.memory} -o {cluster.logoutputdir} -e {cluster.erroroutputdir} -j {cluster.joinlogs} -M <email>" --latency-wait 60
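The `{cluster.…}` placeholders in the command above are filled in from the entries of cluster.yaml. The sketch below illustrates that substitution with plain Python string formatting. It is not Snakemake's implementation, and all values are hypothetical stand-ins for what your cluster.yaml would provide.

```python
from types import SimpleNamespace

# Hypothetical values; in a real run they come from HRIBO/cluster.yaml.
cluster = SimpleNamespace(
    jobname="hribo_job",
    qname="all.q",
    parallelenvironment="smp 4",
    memory="h_vmem=8G",
    logoutputdir="logs",
    erroroutputdir="logs",
    joinlogs="y",
)

# Same placeholders as in the qsub call above (the -M option is left out).
template = (
    "qsub -N {cluster.jobname} -cwd -q {cluster.qname} "
    "-pe {cluster.parallelenvironment} -l {cluster.memory} "
    "-o {cluster.logoutputdir} -e {cluster.erroroutputdir} -j {cluster.joinlogs}"
)

# str.format resolves attribute lookups such as {cluster.jobname}.
command = template.format(cluster=cluster)
print(command)
```

Printing the expanded command this way is a quick check that every placeholder used in the `--cluster` string has a matching key.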
Once the workflow has finished, you can request an automatically generated report.html file with the following command:
snakemake --report report.html
Code Snippets
shell:
    "mkdir -p auxiliary; HRIBO/scripts/enrich_annotation.py -a {input.annotation} -o {output}"

shell:
    """
    mkdir -p auxiliary;
    awk -F'\\t' '/^[^#]/ {{printf "%s\\t%s\\t%s\\t%s\\t%s\\t%s\\t%s\\t%s\\tID=uid%s;\\n", $1, $2, $3, $4, $5, $6, $7, $8, NR-1}}' {input} > {output}
    """
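The awk command in the snippet above keeps the first eight columns of every non-comment annotation line and replaces the attribute column with a running identifier. A minimal Python sketch of the same transformation follows; the function name `add_unique_ids` is hypothetical and the workflow itself uses awk.

```python
def add_unique_ids(lines):
    """Replace column 9 of each non-comment line with ID=uid<N>;"""
    out = []
    for index, line in enumerate(lines):  # index corresponds to awk's NR-1
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # awk's /^[^#]/ pattern skips empty and comment lines
        fields = (line.split("\t") + [""] * 8)[:8]  # first eight columns, padded
        out.append("\t".join(fields + [f"ID=uid{index};"]))
    return out


gff = [
    "##gff-version 3\n",
    "chr1\tsrc\tgene\t1\t9\t.\t+\t.\tID=g1\n",
    "chr1\tsrc\tCDS\t1\t9\t.\t+\t0\tID=c1\n",
]
for row in add_unique_ids(gff):
    print(row)
```

Because the counter also advances on comment lines, the identifiers are unique but not necessarily consecutive.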
shell:
    "mkdir -p auxiliary; HRIBO/scripts/samples_to_xlsx.py -i {input} -o {output}"

shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"

shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"

shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel_reparation.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"

shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_read_table.py -r {input.reads} -t {input.total} -o {output}"

shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_read_table.py -r {input.reads} -t {input.total} -o {output}"

shell:
    """
    mkdir -p auxiliary;
    if [ -z {params.contrasts} ]
    then
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    else
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -c {params.contrasts} -g {input.genome} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """

shell:
    """
    mkdir -p auxiliary;
    if [ -z {params.contrasts} ]
    then
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    else
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -c {params.contrasts} -g {input.genome} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """

shell:
    """
    if [ -z {params.contrasts} ]
    then
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    else
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -c {params.contrasts} -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """

shell:
    """
    if [ -z {params.contrasts} ]
    then
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    else
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -c {params.contrasts} -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """
shell:
    """
    mkdir -p tracks;
    HRIBO/scripts/concatenate_gff.py {input.reparation_orfs} {input.currentAnnotation} -o {output}
    """
run:
    shell("mkdir -p deepribo; mv {input} deepribo/DeepRibo_model_v1.pt")

shell:
    "mkdir -p coverage_deepribo; HRIBO/scripts/coverage_deepribo.py --alignment_file {input.bam} --output_file_prefix coverage_deepribo/{wildcards.condition}-{wildcards.replicate}"

shell:
    """
    mkdir -p coverage_deepribo
    bedtools genomecov -bg -ibam {input.bam} -strand + > {output.covfwd}
    bedtools genomecov -bg -ibam {input.bam} -strand - > {output.covrev}
    """

shell:
    """
    mkdir -p deepribo/{wildcards.condition}-{wildcards.replicate}/0/;
    mkdir -p deepribo/{wildcards.condition}-{wildcards.replicate}/1/;
    DataParser.py {input.covS} {input.covAS} {input.asiteS} {input.asiteAS} {input.genome} deepribo/{wildcards.condition}-{wildcards.replicate} -g {input.annotation}
    """

shell:
    "mkdir -p deepribo; Rscript HRIBO/scripts/parameter_estimation.R -f {input} -o {output}"

shell:
    """
    mkdir -p deepribo;
    DeepRibo.py predict deepribo/ --pred_data {wildcards.condition}-{wildcards.replicate}/ -r {params.rpkm} -c {params.cov} --model {input.model} --dest {output} --num_workers {threads}
    """

shell:
    "mkdir -p tracks; HRIBO/scripts/create_deepribo_gff.py -c {wildcards.condition} -r {wildcards.replicate} -i {input} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input.merged_gff} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/merge_duplicates_deepribo.py -i {input.ingff} -o {output.merged} -a {input.annotation}"

shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel_deepribo.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"

shell:
    """
    mkdir -p tracks;
    HRIBO/scripts/concatenate_gff.py {input.deepribo_orfs} {input.reparation_orfs} {input.currentAnnotation} -o {output}
    """
run:
    if not os.path.exists("contrasts"):
        os.makedirs("contrasts")
    for f in CONTRASTS:
        print(f)
        open(f"contrasts/{f}", 'a').close()

shell:
    """
    mkdir -p diffex_input/riborex/;
    python3 HRIBO/scripts/prepare_diffex_input.py -r {input.rawreads} -c {wildcards.contrast} -t riborex -o diffex_input/riborex/
    """

shell:
    """
    mkdir -p diffex_input/xtail/;
    python3 HRIBO/scripts/prepare_diffex_input.py -r {input.rawreads} -c {wildcards.contrast} -t xtail -o diffex_input/xtail/
    """
shell:
    """
    mkdir -p deltate;
    HRIBO/scripts/prepare_deltate_input.py -c {params.contrast} -r {input.rawreads} -b bam/ -o {params.out_dir}
    """

shell:
    """
    mkdir -p deltate;
    touch {output.fcribo}
    touch {output.fcrna}
    touch {output.fcte}
    touch deltate/{params.contrast}/Result_figures.pdf
    DTEG.R {input.ribo} {input.rna} {input.samples} 0 deltate/{params.contrast}/ || true
    cp deltate/{params.contrast}/Result_figures.pdf {output.fig}
    """

shell:
    """
    python3 HRIBO/scripts/generate_excel_deltate.py -a {input.annotation} -g {input.genome} -i {input.deltate_ribo} -r {input.deltate_rna} -t {input.deltate_te} -o {output.xlsx_sorted} --padj_cutoff {params.padj_cutoff} --log2fc_cutoff {params.log2fc_cutoff}
    """

shell:
    """
    python3 HRIBO/scripts/merge_differential_expression.py {input.deltate} -o {output} -t deltate
    """
shell:
    """
    mkdir -p riborex;
    HRIBO/scripts/riborex.R -r {input.ribo} -m {input.rna} -c {input.cv} -x {output.table};
    """

shell:
    """
    python3 HRIBO/scripts/generate_excel_riborex.py -a {input.annotation} -g {input.genome} -i {input.riborex_out} -o {output.xlsx_sorted} --padj_cutoff {params.padj_cutoff} --log2fc_cutoff {params.log2fc_cutoff}
    """

shell:
    """
    python3 HRIBO/scripts/merge_differential_expression.py {input.riborex} -o {output} -t riborex
    """
shell:
    """
    mkdir -p xtail;
    HRIBO/scripts/xtail.R -r {input.ribo} -m {input.rna} -c {input.cv} -x {output.table} -f {output.fcplot} -p {output.rplot};
    """

shell:
    """
    python3 HRIBO/scripts/generate_excel_xtail.py -a {input.annotation} -g {input.genome} -i {input.xtail_out} -o {output.xlsx_sorted} --padj_cutoff {params.padj_cutoff} --log2fc_cutoff {params.log2fc_cutoff}
    """

shell:
    """
    python3 HRIBO/scripts/merge_differential_expression.py {input.xtail} -o {output} -t xtail
    """
shell:
    "mkdir -p genomeSegemehlIndex; echo \"Computing Segemehl index\"; segemehl.x --threads {threads} -x {output.index} -d {input.genome} 2> {log}"

shell:
    """
    mkdir -p sammulti; segemehl.x -e -d {input.genome} -i {input.genomeSegemehlIndex} {params.fastq} --threads {threads} -o {output.sammulti} 2> {log}
    """

shell:
    """
    set +e
    mkdir -p sam
    awk '$2 == "4"' {input.sammulti} > {input.sammulti}.unmapped
    gawk -i inplace '$2 != "4"' {input.sammulti}
    samtools view -H <(cat {input.sammulti}) | grep '@HD' > {output.sam}
    samtools view -H <(cat {input.sammulti}) | grep '@SQ' | sort -t$'\t' -k1,1 -k2,2V >> {output.sam}
    samtools view -H <(cat {input.sammulti}) | grep '@RG' >> {output.sam}
    samtools view -H <(cat {input.sammulti}) | grep '@PG' >> {output.sam}
    cat {input.sammulti} |grep -v '^@' | grep -w 'NH:i:1' >> {output.sam}
    exitcode=$?
    if [ $exitcode -eq 1 ]
    then
        exit 1
    else
        exit 0
    fi
    """
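The last grep pipeline in the snippet above keeps only alignments tagged `NH:i:1`, that is, reads reported with a single hit. As a rough illustration of that filter, here is a minimal Python sketch. The function name `uniquely_mapped` is hypothetical, and unlike `grep -w` it inspects only the optional SAM fields (column 12 onward).

```python
def uniquely_mapped(sam_lines):
    """Yield alignment lines whose NH tag is exactly 1."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header lines are written separately in the rule above
        optional_fields = line.rstrip("\n").split("\t")[11:]
        if "NH:i:1" in optional_fields:  # exact match, so NH:i:10 is rejected
            yield line


sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchr\t5\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:10",
    "r3\t16\tchr\t9\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tNH:i:1",
]
kept = list(uniquely_mapped(sam))
print([line.split("\t")[0] for line in kept])
```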
shell: "if [ \"{params.method}\" == \"NOTSET\" ]; then HRIBO/scripts/sam_strand_inverter.py --sam_in_filepath={input.sam} --sam_out_filepath={output.sam}; else cp {input.sam} {output.sam}; fi"

shell:
    "mkdir -p bammulti; samtools view -@ {threads} -bh {input.sam} | samtools sort -@ {threads} -o {output} -O bam"

shell:
    "mkdir -p rRNAbam; samtools view -@ {threads} -bh {input.sam} | samtools sort -@ {threads} -o {output} -O bam"

shell:
    "mkdir -p maplink; ln -s {params.inlink} {params.outlink}"
shell:
    "mkdir -p tracks; cat {input.reparation} >> {output}.unsorted; bedtools sort -i {output}.unsorted > {output};"

shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input.mergedGff} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/merge_duplicates_reparation.py -i {input} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/reannotate_orfs.py -a {input.annotation} -c {input.reparation} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/annotation_unite.py -a {input} -o {output}"
shell:
    """
    mkdir -p metageneprofiling;
    HRIBO/scripts/read_length_statistics.py -a {input.bamfiles} -r {params.readlengths} -o metageneprofiling/ > {log}
    """
shell:
    """
    mkdir -p metageneprofiling;
    if [ {params.colorList} == nocolor ]; then
        colorList="";
    else
        colorList="--color_list {params.colorList}";
    fi;
    HRIBO/scripts/metagene_profiling.py -b {input.bam} -g {input.genome} -a {input.annotation} -o {output.meta} \
        --read_lengths {params.readlengths} \
        --normalization_methods {params.normalizationMethods} \
        --mapping_methods {params.mappingMethods} \
        --positions_in_ORF {params.positionsInORF} \
        --positions_out_ORF {params.positionsOutORF} \
        --filtering_method {params.filteringMethods} \
        --neighboring_genes_distance {params.neighboringGenesDistance} \
        --rpkm_threshold {params.rpkmThreshold} \
        --length_cutoff {params.lengthCutoff} \
        --output_formats {params.outputFormats} \
        --include_plotly_js {params.includePlotlyJS} \
        ${{colorList}} > {log}
    """
shell:
    """
    mkdir -p pca;
    sed -e '1s/-/_/g' {input.rawreads} > {output.rawreads};
    HRIBO/scripts/preparePCAinput.py -s {input.samples} -o {output.meta};
    """

shell:
    """
    mkdir -p pca;
    HRIBO/scripts/analyse_variance.R -r {input.rawreads} -m {input.meta} -o pca/;
    """

shell:
    """
    mkdir -p pca;
    HRIBO/scripts/plot_PCA.py -r {input.rld} -p {input.pvar} -c {input.cor} -o pca/;
    """
shell:
    "mkdir -p genomes; cp {input.genome} genomes/genome.fa"

shell:
    "mkdir -p annotation; cp {input.annotation} annotation/annotation.gff"

shell:
    "mkdir -p annotation; HRIBO/scripts/gtf2gff3.py -a {input} -o {output}"
shell:
    "mkdir -p qc/4unique; fastqc -o qc/4unique -t {threads} -f sam_mapped {input.sam}; mv qc/4unique/{params.prefix}_fastqc.html {output.html}; mv qc/4unique/{params.prefix}_fastqc.zip {output.zip}"

shell:
    "mkdir -p qc/3mapped; fastqc -o qc/3mapped -t {threads} -f sam_mapped {input.sam}; mv qc/3mapped/{params.prefix}_fastqc.html {output.html}; mv qc/3mapped/{params.prefix}_fastqc.zip {output.zip}"

shell:
    "mkdir -p qc/5removedrRNA; fastqc -o qc/5removedrRNA -t {threads} {input}; mv qc/5removedrRNA/{params.prefix}_fastqc.html {output.html}; mv qc/5removedrRNA/{params.prefix}_fastqc.zip {output.zip}"

shell:
    """
    mkdir -p qc/all;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "gene" ]];
    then
        featureCounts -T {threads} -t gene -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """

shell:
    """
    mkdir -p qc/trnainall;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "tRNA" ]];
    then
        featureCounts -T {threads} -t tRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """

shell:
    """
    mkdir -p qc/rrnainall;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "rRNA" ]];
    then
        featureCounts -T {threads} -t rRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """

shell:
    """
    mkdir -p qc/rrnainallaligned;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "rRNA" ]];
    then
        featureCounts -T {threads} -t rRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """

shell:
    """
    mkdir -p qc/rrnainuniquelyaligned;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "rRNA" ]];
    then
        featureCounts -T {threads} -t rRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """

shell:
    "mkdir -p coverage; bedtools genomecov -ibam {input} -bg > {output}"
shell:
    "mkdir -p qc/1raw; fastqc -o qc/1raw -t {threads} {input.fastq}; mv qc/1raw/{params.prefix}_fastqc.html {output.html}; mv qc/1raw/{params.prefix}_fastqc.zip {output.zip}"

shell:
    "mkdir -p qc/2trimmed; fastqc -o qc/2trimmed -t {threads} {input}; mv qc/2trimmed/{params.prefix}_fastqc.html {output.html}; mv qc/2trimmed/{params.prefix}_fastqc.zip {output.zip}"

shell:
    """
    mkdir -p qc/1raw
    fastqc -o qc/1raw -t {threads} {input.fastq1}; mv qc/1raw/{params.prefix1}_fastqc.html {output.html1}; mv qc/1raw/{params.prefix1}_fastqc.zip {output.zip1}
    fastqc -o qc/1raw -t {threads} {input.fastq2}; mv qc/1raw/{params.prefix2}_fastqc.html {output.html2}; mv qc/1raw/{params.prefix2}_fastqc.zip {output.zip2}
    """

shell:
    """
    mkdir -p qc/2trimmed;
    fastqc -o qc/2trimmed -t {threads} {input}; mv qc/2trimmed/{params.prefix1}_fastqc.html {output.html1}; mv qc/2trimmed/{params.prefix1}_fastqc.zip {output.zip1}
    fastqc -o qc/2trimmed -t {threads} {input}; mv qc/2trimmed/{params.prefix2}_fastqc.html {output.html2}; mv qc/2trimmed/{params.prefix2}_fastqc.zip {output.zip2}
    """

shell:
    "export LC_ALL=en_US.utf8; export LANG=en_US.utf8; multiqc -f -d --exclude picard --exclude gatk -z -o {params.dir} qc/1raw qc/2trimmed qc/3mapped qc/4unique qc/5removedrRNA qc/all qc/trnainall qc/rrnainallaligned qc/rrnainuniquelyaligned qc/rrnainall trimmed 2> {log}"
shell:
    """
    if [ "{params.features}" == None ]; then
        features="";
    else
        features="--use_features {params.features}";
    fi;
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O --for_diff_expr -o {output} -t {threads} -a {input.annotation} ${{features}}
    """

shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O -o {output} -t {threads} -a {input.annotation}
    """

shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O -o {output} -t {threads} -a {input.annotation}
    """

shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O -o {output} -t {threads} -a {input.annotation}
    """

shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O --with_M --fraction -o {output} -t {threads} -a {input.annotation}
    """

shell:
    """
    mkdir -p auxiliary
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O --fraction -o {output} -t {threads} -a {input.annotation}
    """

shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """

shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """

shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """

shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """

shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """

shell:
    "mkdir -p readcounts; HRIBO/scripts/total_mapped_reads.py -b {input.bam} -m {output.mapped} -l {output.length}"

shell:
    "mkdir -p readcounts; HRIBO/scripts/total_mapped_reads.py -b {input.bam} -m {output.mapped} -l {output.length}"

shell:
    "mkdir -p readcounts; HRIBO/scripts/total_mapped_reads.py -b {input.bam} -m {output.mapped} -l {output.length}"
run:
    outputName = os.path.basename(input[0])
    shell("mkdir -p uniprotDB; mv {input} uniprotDB/{outputName}; gunzip uniprotDB/{outputName}")
shell:
    "mkdir -p reparation; if [ ! -f uniprotDB/uniprot_sprot.fasta.bak ]; then cp -p uniprotDB/uniprot_sprot.fasta uniprotDB/uniprot_sprot.fasta.bak; fi; mkdir -p {params.prefix}/tmp; reparation.pl -bam {input.bam} -g {input.genome} -gtf {input.gtf} -db {input.db} -out {params.prefix} -threads {threads}; if [ ! -f uniprotDB/uniprot_sprot.fasta ]; then cp -p uniprotDB/uniprot_sprot.fasta.bak uniprotDB/uniprot_sprot.fasta; fi;"
shell:
    "mkdir -p tracks; HRIBO/scripts/create_reparation_gff.py -c {wildcards.condition} -r {wildcards.replicate} -i {input} -o {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input} -o {output}"
shell:
    """
    mkdir -p annotation; awk -F'\\t' '$3 == "rRNA" || $3 == "tRNA"' {input.annotation} | awk -F'\\t' '{{print $1 FS $4 FS $5 FS "." FS "." FS $7}}' > {output.annotation}
    """
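The two chained awk calls in the snippet above select rRNA and tRNA features and reduce each to six tab-separated columns (sequence, start, end, two placeholder dots, strand). The following Python sketch mirrors that logic for illustration only; the function name `rna_regions` is hypothetical.

```python
def rna_regions(gff_lines, feature_types=("rRNA", "tRNA")):
    """Return six-column rows for features whose type (column 3) matches."""
    rows = []
    for line in gff_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 7 and fields[2] in feature_types:
            # Start and end are copied unchanged, exactly as the awk command does.
            rows.append("\t".join([fields[0], fields[3], fields[4], ".", ".", fields[6]]))
    return rows


annotation = [
    "chr1\tsrc\trRNA\t10\t50\t.\t+\t.\tID=r1",
    "chr1\tsrc\tCDS\t60\t90\t.\t-\t0\tID=c1",
    "chr1\tsrc\ttRNA\t100\t170\t.\t-\t.\tID=t1",
]
for row in rna_regions(annotation):
    print(row)
```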
shell:
    "mkdir -p norRNA; mkdir -p mapuniqnorrna; bedtools intersect -v -a {input.mapuniq} -b {input.annotation} > {output.bam}"
shell:
    "mkdir -p trimlink; ln -s {params.inlink} {params.outlink};"

shell:
    "mkdir -p trimlink; ln -s {params.inlink1} {params.outlink1}; ln -s {params.inlink2} {params.outlink2};"

shell:
    "mkdir -p trimmed; cutadapt -j {threads} {params.adapter3} {params.adapter5} {params.quality} {params.filtering} -o {output.fastq} {input.fastq}"

shell:
    "mkdir -p trimmed; cutadapt -j {threads} {params.adapter3q} {params.adapter5q} {params.adapter3p} {params.adapter5p} {params.quality} {params.filtering} -o {output.fastq1} -p {output.fastq2} {input.fastq1} {input.fastq2}"
shell:
    "samtools faidx {rules.retrieveGenome.output}"

shell:
    "mkdir -p genomes; cut -f1,2 {input[0]} > genomes/sizes.genome"

shell:
    "mkdir -p genomes; HRIBO/scripts/reverse_complement.py --input_fasta_filepath genomes/genome.fa --output_fasta_filepath genomes/genome.rev.fa"
shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string ATG --output_gff3_filepath {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string GTG,TTG,CTG --output_gff3_filepath {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string TAG,TGA,TAA --output_gff3_filepath {output}"

shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string AAGG --output_gff3_filepath {output}"
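The four calls above annotate start codons, alternative start codons, stop codons and a ribosome binding site motif on both strands. How motif_to_gff.py does this internally is not shown here. The sketch below only illustrates the general idea of scanning the forward sequence and its reverse complement and reporting forward-strand coordinates; all function names are hypothetical.

```python
def find_motif(sequence, motif):
    """Return 1-based inclusive (start, end) pairs, overlapping hits included."""
    hits = []
    position = sequence.find(motif)
    while position != -1:
        hits.append((position + 1, position + len(motif)))
        position = sequence.find(motif, position + 1)
    return hits


def motif_sites(forward, motifs):
    """Return sorted (start, end, strand, motif) tuples in forward coordinates."""
    reverse = forward.translate(str.maketrans("ACGT", "TGCA"))[::-1]
    length = len(forward)
    sites = []
    for motif in motifs:
        sites += [(s, e, "+", motif) for s, e in find_motif(forward, motif)]
        # Map positions on the reverse complement back to the forward strand.
        sites += [(length - e + 1, length - s + 1, "-", motif)
                  for s, e in find_motif(reverse, motif)]
    return sorted(sites)


print(motif_sites("ATGCAT", ["ATG"]))
```

In the example, bases 4 to 6 of the forward strand read CAT, which is ATG on the reverse strand, so the motif is reported once per strand.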
shell:
    "samtools index -@ {threads} maplink/{params.prefix}"

shell:
    "samtools index -@ {threads} bammulti/{params.prefix}"

shell:
    "samtools index -@ {threads} rRNAbam/{params.prefix}"

shell:
    "mkdir -p totalmappedtracks; mkdir -p totalmappedtracks/raw; mkdir -p totalmappedtracks/mil; mkdir -p totalmappedtracks/min; HRIBO/scripts/mapping.py --mapping_style global --bam_path {input.bam} --wiggle_file_path totalmappedtracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"

shell:
    "mkdir -p uniquemappedtracks; mkdir -p uniquemappedtracks/raw; mkdir -p uniquemappedtracks/mil; mkdir -p uniquemappedtracks/min; HRIBO/scripts/mapping.py --mapping_style global --bam_path {input.bam} --wiggle_file_path uniquemappedtracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "mkdir -p globaltracks; mkdir -p globaltracks/raw; mkdir -p globaltracks/mil; mkdir -p globaltracks/min; HRIBO/scripts/mapping.py --mapping_style global --bam_path {input.bam} --wiggle_file_path globaltracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "mkdir -p centeredtracks; mkdir -p centeredtracks/raw; mkdir -p centeredtracks/mil; mkdir -p centeredtracks/min; HRIBO/scripts/mapping.py --mapping_style centered --bam_path {input.bam} --wiggle_file_path centeredtracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "mkdir -p fiveprimetracks; mkdir -p fiveprimetracks/raw; mkdir -p fiveprimetracks/mil; mkdir -p fiveprimetracks/min; HRIBO/scripts/mapping.py --mapping_style first_base_only --bam_path {input.bam} --wiggle_file_path fiveprimetracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "mkdir -p threeprimetracks; mkdir -p threeprimetracks/raw; mkdir -p threeprimetracks/mil; mkdir -p threeprimetracks/min; HRIBO/scripts/mapping.py --mapping_style last_base_only --bam_path {input.bam} --wiggle_file_path threeprimetracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"

shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"

shell:
    "mkdir -p tracks; multiBamSummary bins --smartLabels --bamfiles {input.bam} -o {output} -p {threads};"

shell:
    "mkdir -p figures; plotCorrelation -in {input.npz} --corMethod spearman --skipZeros --plotTitle \"Spearman Correlation of Read Counts\" --whatToPlot heatmap --colorMap RdYlBu --plotNumbers -o {output.correlation} --outFileCorMatrix SpearmanCorr_readCounts.tab"

shell:
    "mkdir -p tracks; cat {input[0]} | grep -v '\tgene\t' > tracks/annotation-woGenes.gtf; gtf2bed < tracks/annotation-woGenes.gtf > tracks/annotation.bed"
shell:
    "mkdir -p tracks; cut -f1-6 {input[0]} > tracks/annotationNScore.bed6; awk '{{$5=1 ; print ;}}' tracks/annotationNScore.bed6 > tracks/annotation.bed6; bedToBigBed -type=bed6 -tab tracks/annotation.bed6 {input[1]} tracks/annotation.bb"
shell:
    """
    set +e
    mkdir -p tracks/color
    bigWigToWig {input.infwd} {params.unzippedfwd}
    bigWigToWig {input.inrev} {params.unzippedrev}
    sed -i '2s/^/track type=wiggle_0 visibility=full color=0,0,128 autoscale=on\\n/' {params.unzippedfwd}
    sed -i '2s/^/track type=wiggle_0 visibility=full color=0,130,200 autoscale=on\\n/' {params.unzippedrev}
    gzip -f {params.unzippedfwd}
    gzip -f {params.unzippedrev}
    """

shell:
    """
    set +e
    mkdir -p tracks/color
    cp {input.rbs} ./tracks/color/
    cp {input.start} ./tracks/color/
    cp {input.stop} ./tracks/color/
    sed -i '1s/^/##track type=wiggle_0 visibility=full color=145,30,180 autoscale=on\\n/' {output.outrbs}
    sed -i '1s/^/##track type=wiggle_0 visibility=full color=210,245,60 autoscale=on\\n/' {output.outstart}
    sed -i '1s/^/##track type=wiggle_0 visibility=full color=230,25,75 autoscale=on\\n/' {output.outstop}
    """





