Analysis of Dual RNA-seq data - an experimental method for interrogating host-pathogen interactions through simultaneous RNA-seq.

Version: 1.0.0

Dual RNA-seq pipeline

nf-core/dualrnaseq is a bioinformatics pipeline built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.

Introduction

nf-core/dualrnaseq is specifically used for the analysis of Dual RNA-seq data, interrogating host-pathogen interactions through simultaneous RNA-seq.

This pipeline was initially tested with eukaryotic hosts including human and mouse, and pathogens including Salmonella enterica, Orientia tsutsugamushi, Streptococcus pneumoniae, Escherichia coli and Mycobacterium leprae. The workflow should work with any eukaryotic and bacterial organisms with an available reference genome and annotation.

Method

The workflow merges host and pathogen genome annotations, taking into account differences in annotation conventions, then processes raw data from FastQ inputs (FastQC, BBDuk), quantifies gene expression (STAR and HTSeq; STAR, Salmon and tximport; or Salmon in quasimapping mode and tximport), and summarises the results (MultiQC). It also generates a number of custom summary plots and separate results tables for the pathogen and host. See the output documentation for more details.

Workflow

The workflow diagram below gives a simplified visual overview of how dualrnaseq has been designed.

[Figure: nf-core/dualrnaseq workflow diagram]

Documentation

The nf-core/dualrnaseq pipeline comes with its own documentation, found in the docs/ directory.

Credits

nf-core/dualrnaseq was coded and written by Bozena Mika-Gospodorz and Regan Hayward.

We thank the following people for their extensive assistance in the development of this pipeline:

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #dualrnaseq channel (you can join with this invite).

Citations

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

Code Snippets

"""
echo $workflow.manifest.version > v_pipeline.txt
echo $workflow.nextflow.version > v_nextflow.txt
python --version > v_python.txt
R --version > v_r.txt
cutadapt --version > v_cutadapt.txt
fastqc --version > v_fastqc.txt
multiqc --version > v_multiqc.txt
STAR --version > v_star.txt
htseq-count . . --version > v_htseq.txt
samtools --version > v_samtools.txt
gffread --version > v_gffread.txt
salmon --version > v_salmon.txt
scrape_software_versions.py &> software_versions_mqc.yaml
"""

NextFlow SAMtools FastQC MultiQC STAR Cutadapt Salmon gffread htseqcount From line 721 of master/main.nf

'''
python !{workflow.projectDir}/bin/check_replicates.py -s !{sample_name} 2>&1
'''

NextFlow From line 773 of master/main.nf

'''
cp -n !{f_ext} !{base_name_file}.fasta
'''

NextFlow From line 804 of master/main.nf

	    '''
	    gunzip -f -S .zip !{f_ext}
	    cp -n !{old_base_name_file} !{base_name_file}.fasta
	    '''

NextFlow From line 810 of master/main.nf

'''
gunzip -f !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.fasta
'''

NextFlow From line 817 of master/main.nf

'''
echo "Your pathogen genome files appear to have the wrong extension. \n Currently, the pipeline only supports .fasta or .fa, or compressed files with .zip or .gz extensions."
'''

NextFlow From line 822 of master/main.nf

'''
cp -n !{f_ext} !{base_name_file}.fasta
'''

NextFlow From line 852 of master/main.nf

'''
gunzip -f -S .zip !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.fasta
'''

NextFlow From line 858 of master/main.nf

'''
gunzip -f !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.fasta
'''

NextFlow From line 865 of master/main.nf

'''
echo "Your host genome files appear to have the wrong extension. \n Currently, the pipeline only supports .fasta or .fa, or compressed files with .zip or .gz extensions."
'''

NextFlow From line 870 of master/main.nf
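The snippets from lines 804 to 870 of main.nf handle one reference genome file in four ways: copy it when it is already plain FASTA, decompress it when it ends in .zip or .gz, and print an error otherwise. The same dispatch can be sketched in Python as below. This is only an illustration under stated assumptions, not the pipeline's code: `normalise_fasta` is a hypothetical name, and the .zip branch assumes an archive with a single FASTA member.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def normalise_fasta(src: Path, dest_dir: Path) -> Path:
    """Copy or decompress src into dest_dir as <base>.fasta (hypothetical helper)."""
    name = src.name
    # Detect an optional compression suffix, then check the FASTA suffix.
    compressed = next((s for s in (".gz", ".zip") if name.endswith(s)), "")
    inner = name[: -len(compressed)] if compressed else name
    if not inner.endswith((".fasta", ".fa")):
        raise ValueError(
            f"{name}: only .fasta or .fa files, optionally compressed "
            "as .zip or .gz, are supported"
        )
    base = inner.rsplit(".", 1)[0]
    dest = dest_dir / f"{base}.fasta"
    if compressed == ".gz":
        with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    elif compressed == ".zip":
        with zipfile.ZipFile(src) as archive:
            # Assumption: the archive holds exactly one FASTA member.
            member = archive.namelist()[0]
            with archive.open(member) as fin, open(dest, "wb") as fout:
                shutil.copyfileobj(fin, fout)
    else:
        shutil.copyfile(src, dest)
    return dest
```

Unlike `cp -n` in the snippets above, this sketch overwrites an existing destination file, and it expects `dest_dir` to differ from the directory holding an uncompressed source with the same name.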

'''
cp -n !{f_ext} !{base_name_file}.gff3
'''

NextFlow From line 901 of master/main.nf

'''
gunzip -f -S .zip !{f_ext}
cp -n !{base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 905 of master/main.nf

'''
gunzip -f !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 913 of master/main.nf

'''
echo "Your pathogen GFF file appears to be in the wrong format or has the wrong extension. \n Currently, the pipeline only supports .gff or .gff3, or compressed files with .zip or .gz extensions."
'''

NextFlow From line 918 of master/main.nf

'''
cp -n !{f_ext} !{base_name_file}.gff3
'''

NextFlow From line 953 of master/main.nf

'''
gunzip -f -S .zip !{f_ext}
cp -n !{base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 957 of master/main.nf

'''
gunzip -f !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 965 of master/main.nf

'''
echo "Your host GFF file appears to be in the wrong format or has the wrong extension. \n Currently, the pipeline only supports .gff or .gff3, or compressed files with .zip or .gz extensions."
'''

NextFlow From line 970 of master/main.nf

'''
cp -n !{f_ext} !{base_name_file}.gff3
'''

NextFlow From line 1005 of master/main.nf

'''
gunzip -f -S .zip !{f_ext}
cp -n !{base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 1009 of master/main.nf

'''
gunzip -f !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 1017 of master/main.nf

'''
echo "Your host GFF file appears to be in the wrong format or has the wrong extension. \n Currently, the pipeline only supports .gff or .gff3, or compressed files with .zip or .gz extensions."
'''

NextFlow From line 1022 of master/main.nf

'''
cp -n !{f_ext} !{base_name_file}.gff3
'''

NextFlow From line 1051 of master/main.nf

'''
gunzip -f -S .zip !{f_ext}
cp -n !{base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 1055 of master/main.nf

'''
gunzip -f !{f_ext}
cp -n !{old_base_name_file} !{base_name_file}.gff3
'''

NextFlow From line 1063 of master/main.nf

'''
echo "Your host GFF tRNA file appears to be in the wrong format or has the wrong extension. \n Currently, the pipeline only supports .gff or .gff3, or compressed files with .zip or .gz extensions."
'''

NextFlow From line 1068 of master/main.nf

"""
cat $pathogen_fa $host_fa > host_pathogen.fasta
"""

NextFlow From line 1100 of master/main.nf

"""
cat $host_gff_genome $host_gff_tRNA > ${outfile_name}
"""

NextFlow From line 1130 of master/main.nf

"""
$workflow.projectDir/bin/replace_feature_gff.sh $gff ${outfile_name} $features
"""

NextFlow From line 1158 of master/main.nf

"""
$workflow.projectDir/bin/replace_feature_gff.sh $gff ${outfile_name} $features
"""

NextFlow From line 1184 of master/main.nf

"""
$workflow.projectDir/bin/replace_attribute_gff.sh $gff ${outfile_name} $host_attribute $pathogen_attribute
"""

NextFlow From line 1212 of master/main.nf
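The helper scripts replace_feature_gff.sh and replace_attribute_gff.sh are not reproduced on this page. Judging only from how they are called (for example `parent Parent` as the last two arguments, and `quant` used as the feature type by the STAR and htseq-count commands further down), they appear to rename the feature type in column 3 and an attribute key in column 9 of a GFF file. The Python sketch below shows what such a harmonisation step could look like. The function names, the default `quant` feature and the (new key, old key) argument order are assumptions made for illustration.

```python
def replace_feature(line, features, new_feature="quant"):
    """Rename column 3 when it matches one of `features` (assumed behaviour)."""
    if line.startswith("#"):
        return line  # leave GFF directives and comments untouched
    cols = line.rstrip("\n").split("\t")
    if len(cols) == 9 and cols[2] in features:
        cols[2] = new_feature
    return "\t".join(cols) + "\n"


def replace_attribute(line, new_key, old_key):
    """Rename the attribute key `old_key` to `new_key` in column 9 (assumed behaviour)."""
    if line.startswith("#"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) == 9:
        fields = []
        for field in cols[8].split(";"):
            key, sep, value = field.partition("=")
            fields.append(f"{new_key}{sep}{value}" if key == old_key else field)
        cols[8] = ";".join(fields)
    return "\t".join(cols) + "\n"
```

Renaming both organisms' features and attributes to a shared vocabulary is what lets a single concatenated host and pathogen GFF be quantified with one feature type and one attribute name.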

"""
cat $pathogen_gff_genome $host_gff > host_pathogen_htseq.gff
"""

NextFlow From line 1237 of master/main.nf

"""
python $workflow.projectDir/bin/extract_annotations_from_gff.py -gff $gff -f $features -a $pathogen_attribute -org pathogen -q_tool htseq -o ${outfile_name}
"""

NextFlow HTSeq From line 1266 of master/main.nf

"""
python $workflow.projectDir/bin/extract_annotations_from_gff.py -gff $gff -f $features -a $host_attribute -org host -q_tool htseq -o ${outfile_name}
"""

NextFlow HTSeq From line 1295 of master/main.nf

"""
$workflow.projectDir/bin/extract_reference_names_from_fasta_files.sh reference_host_names.txt $host_fa
"""

NextFlow From line 1327 of master/main.nf

"""
$workflow.projectDir/bin/extract_reference_names_from_fasta_files.sh reference_pathogen_names.txt $pathogen_fa
"""

NextFlow From line 1354 of master/main.nf

"""
$workflow.projectDir/bin/replace_attribute_gff.sh $gff ${outfile_name} parent Parent
"""

NextFlow From line 1386 of master/main.nf

"""
$workflow.projectDir/bin/replace_attribute_gff.sh $gff ${outfile_name} parent $host_attribute
"""

NextFlow From line 1416 of master/main.nf

"""
cat $host_gff_genome $host_gff_tRNA > ${outfile_name}
"""

NextFlow From line 1441 of master/main.nf

"""
$workflow.projectDir/bin/replace_feature_gff.sh $gff ${outfile_name} $features
"""

NextFlow From line 1471 of master/main.nf

"""
$workflow.projectDir/bin/replace_attribute_gff.sh $gff ${outfile_name} parent $pathogen_attribute
"""

NextFlow From line 1499 of master/main.nf

"""
python $workflow.projectDir/bin/extract_annotations_from_gff.py -gff $gff -f $features -a parent -org pathogen -q_tool salmon -o ${outfile_name}
"""

NextFlow Salmon From line 1528 of master/main.nf

"""
python $workflow.projectDir/bin/extract_annotations_from_gff.py -gff $gff -f quant -a parent -org host -q_tool salmon -o ${outfile_name}
"""

NextFlow Quant Salmon From line 1562 of master/main.nf

"""
gffread -w $outfile_name -g $host_fa $gff
"""

NextFlow gffread From line 1597 of master/main.nf

"""
python $workflow.projectDir/bin/gff_to_fasta_transcriptome.py -fasta $host_fa -gff $gff  -f $features -a $attribute -o $outfile_name
"""

NextFlow From line 1627 of master/main.nf

"""
cat $host_tr_fa $host_tRNA_tr_fa > host_transcriptome.fasta
"""

NextFlow From line 1656 of master/main.nf

"""
python $workflow.projectDir/bin/gff_to_fasta_transcriptome.py -fasta $pathogen_fa -gff $gff -f $features -a $attribute  -o $outfile_name
"""

NextFlow From line 1702 of master/main.nf

"""
cat $pathogen_tr_fa $host_tr_fa > host_pathogen_transcriptome.fasta
"""

NextFlow From line 1730 of master/main.nf

"""
$workflow.projectDir/bin/replace_feature_gff.sh $gff ${outfile_name} $features
"""

NextFlow From line 1760 of master/main.nf

"""
cat $pathogen_gff_genome $host_gff > host_pathogen_star_alignment_mode.gff
"""

NextFlow From line 1786 of master/main.nf

"""
fastqc --quiet --threads $task.cpus --noextract $reads $fastqc_params
"""

NextFlow FastQC From line 1814 of master/main.nf

"""
cutadapt -j ${task.cpus} -q $q_value -a $adapter_seq_3 -m 1 -o ${name_out} $reads $cutadapt_params
"""

NextFlow Cutadapt From line 1859 of master/main.nf

"""
cutadapt -j ${task.cpus} -q $q_value -a ${adapter_seq_3[0]} -A ${adapter_seq_3[1]} -o ${name_1} -p ${name_2} -m 1 ${reads[0]} ${reads[1]} $cutadapt_params
"""

NextFlow Cutadapt From line 1872 of master/main.nf

"""
bbduk.sh -Xmx1g in=$reads out=${name_out} ref=$adapters minlen=$minlen qtrim=$qtrim trimq=$trimq ktrim=$ktrim k=$k mink=$mink hdist=$hdist &> $fileoutput $bbduk_params
"""

NextFlow BBMap From line 1915 of master/main.nf

"""
bbduk.sh -Xmx1g in1=${reads[0]} in2=${reads[1]} out1=${name_1} out2=${name_2} ref=$adapters minlen=$minlen qtrim=$qtrim trimq=$trimq ktrim=$ktrim k=$k mink=$mink hdist=$hdist $bbduk_params tpe tbo &> $fileoutput
"""

NextFlow BBMap From line 1929 of master/main.nf

"""
fastqc --threads ${task.cpus} --quiet --noextract $reads $fastqc_params
"""

NextFlow FastQC From line 1967 of master/main.nf

"""
$workflow.projectDir/bin/count_total_reads.sh $fastq >> total_raw_reads_fastq.tsv
"""

NextFlow From line 1999 of master/main.nf

"""
$workflow.projectDir/bin/collect_total_raw_read_pairs.py -i $tsv
"""

NextFlow From line 2025 of master/main.nf
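The two snippets above record the number of raw reads per FastQ file. The scripts count_total_reads.sh and collect_total_raw_read_pairs.py are not shown here, so the following Python sketch only illustrates the usual way of counting reads, under the assumption of well-formed FastQ with four lines per record. `count_fastq_reads` is a hypothetical name.

```python
import gzip


def count_fastq_reads(path):
    """Return the number of records in a plain or gzip-compressed FastQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as fin:
        n_lines = sum(1 for _ in fin)
    # Assumption: each record spans exactly four lines (no wrapped sequences).
    return n_lines // 4
```

For paired-end data, the number of read pairs equals the read count of either mate file, provided both files are complete.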

'''
grep ">" !{host_fa} | cut -d " " -f 1 > decoys.txt
sed -i -e 's/>//g' decoys.txt
cat !{host_pathogen_transcriptome_fasta} !{host_fa} > gentrome.fasta
'''

NextFlow From line 2069 of master/main.nf
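The snippet above prepares the inputs for a decoy-aware Salmon index: the host genome sequence names become the decoy list, and the genome is appended after the combined transcriptome to form the "gentrome". A rough Python equivalent is sketched below. `build_decoys_and_gentrome` and its parameter names are hypothetical, and the sketch only inspects lines that start with `>`, whereas the `grep` in the original matches `>` anywhere in a line.

```python
from pathlib import Path


def build_decoys_and_gentrome(transcriptome, host_genome, decoys_out, gentrome_out):
    """Write decoy names and the transcriptome+genome FASTA; return the decoy names."""
    decoys = []
    with open(host_genome) as fin:
        for line in fin:
            if line.startswith(">"):
                # Keep the header text up to the first space, without '>'.
                decoys.append(line[1:].rstrip("\n").split(" ")[0])
    Path(decoys_out).write_text("".join(f"{name}\n" for name in decoys))
    # Transcripts come first; the decoy (genome) sequences follow them.
    with open(gentrome_out, "w") as fout:
        for src in (transcriptome, host_genome):
            fout.write(Path(src).read_text())
    return decoys
```

Only the host genome is used for decoys in the snippet above, so the sketch follows the same choice.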

"""
salmon index -t $gentrome -i transcripts_index --decoys $decoys -k $kmer_length -p ${task.cpus} $keepDuplicates $salmon_sa_params_index
"""

NextFlow Salmon From line 2098 of master/main.nf

"""
salmon quant -p ${task.cpus} -i $index -l $libtype -r $reads $softclip --incompatPrior $incompatPrior $UnmappedNames --validateMappings $dumpEq $writeMappings -o $sample_name $salmon_sa_params_mapping
"""

NextFlow Quant Salmon From line 2136 of master/main.nf

		"""
		salmon quant -p ${task.cpus} -i $index -l $libtype -1 ${reads[0]} -2 ${reads[1]} $softclip --incompatPrior $incompatPrior $UnmappedNames --validateMappings $dumpEq $writeMappings -o $sample_name $salmon_sa_params_mapping
 		"""

NextFlow Quant Salmon From line 2142 of master/main.nf
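The two `salmon quant` snippets above differ only in how the reads are passed: `-r` for single-end data and `-1`/`-2` for paired-end data. A minimal Python sketch of that branching is given below. `salmon_quant_command` is a hypothetical helper that only assembles an argument list; it uses a subset of the flags shown above and does not run Salmon.

```python
def salmon_quant_command(index, libtype, reads, out_dir, threads=1):
    """Build a salmon quant argument list for one or two FastQ files."""
    cmd = ["salmon", "quant", "-p", str(threads), "-i", index, "-l", libtype]
    if len(reads) == 1:
        cmd += ["-r", reads[0]]  # single-end
    elif len(reads) == 2:
        cmd += ["-1", reads[0], "-2", reads[1]]  # paired-end
    else:
        raise ValueError("expected one or two FastQ files")
    cmd += ["--validateMappings", "-o", out_dir]
    return cmd
```

The resulting list could be handed to `subprocess.run` in an environment where Salmon is installed.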

"""
$workflow.projectDir/bin/split_quant_tables_salmon.sh $transcriptome_pathogen $transcriptome_host  salmon/*/quant.sf "quant.sf"
"""

NextFlow From line 2171 of master/main.nf

"""
$workflow.projectDir/bin/salmon_extract_ambig_uniq_transcripts_genes.R salmon/*/quant.sf salmon/*/aux_info/ambig_info.tsv $sample_name $annotations
"""

NextFlow From line 2203 of master/main.nf

"""
$workflow.projectDir/bin/salmon_host_comb_ambig_uniq.R salmon/*/aux_info/*_host_quant_ambig_uniq.sf
"""

NextFlow From line 2225 of master/main.nf

"""
$workflow.projectDir/bin/salmon_pathogen_comb_ambig_uniq.R salmon/*/aux_info/*_pathogen_quant_ambig_uniq.sf
"""

NextFlow From line 2247 of master/main.nf

"""
python $workflow.projectDir/bin/collect_quantification_data.py -i $input_quantification -q salmon -a $gene_attribute -org both
"""

NextFlow Salmon From line 2274 of master/main.nf

"""
$workflow.projectDir/bin/split_quant_tables_salmon.sh $transcriptome_pathogen $transcriptome_host $quant_table "quant_salmon.tsv"
pathonen_tab=\$(if [ \$(cat pathogen_quant_salmon.tsv | wc -l) -gt 1  ]; then echo "true"; else echo "false"; fi)
host_tab=\$(if [ \$(cat host_quant_salmon.tsv | wc -l) -gt 1  ]; then echo "true"; else echo "false"; fi)
"""

NextFlow From line 2310 of master/main.nf
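The snippet above splits the combined quantification table into host and pathogen tables, then uses `wc -l` to flag whether each table contains anything beyond its header line. The Python sketch below illustrates both ideas. It assumes rows keyed by the `Name` column of Salmon's quant.sf and sets of transcript identifiers per organism; the function names are hypothetical, and split_quant_tables_salmon.sh itself is not shown on this page.

```python
def split_quant_table(rows, host_ids, pathogen_ids):
    """Split quantification rows (dicts with a 'Name' key) by organism."""
    host = [row for row in rows if row["Name"] in host_ids]
    pathogen = [row for row in rows if row["Name"] in pathogen_ids]
    return host, pathogen


def has_data_rows(path):
    """True when the file holds more than one line, i.e. a header plus data."""
    with open(path) as fin:
        # Note: `wc -l` counts newline characters, so a final line without a
        # trailing newline is counted here but not by the shell version.
        return sum(1 for _ in fin) > 1
```

A flag of this kind lets downstream steps skip an organism for which no transcripts were quantified.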

"""
$workflow.projectDir/bin/combine_quant_annotations.py -q $quantification_table -annotations $annotation_table -a $attribute -org pathogen
"""

NextFlow From line 2338 of master/main.nf

"""
$workflow.projectDir/bin/combine_quant_annotations.py -q $quantification_table -annotations $annotation_table -a $attribute -org host
"""

NextFlow From line 2364 of master/main.nf

"""
$workflow.projectDir/bin/tximport.R salmon $annotations $sample_name
"""

NextFlow Salmon From line 2390 of master/main.nf

"""
python $workflow.projectDir/bin/collect_quantification_data.py -i $input_quantification -q salmon -a gene_id -org host_gene_level
"""

NextFlow Salmon From line 2414 of master/main.nf

"""
$workflow.projectDir/bin/combine_annotations_salmon_gene_level.py -q $quantification_table -annotations $annotation_table -a gene_id -org host
"""

NextFlow From line 2439 of master/main.nf

"""
python $workflow.projectDir/bin/scatter_plots.py -q $quant_table -a $attribute -org pathogen 
"""

NextFlow From line 2473 of master/main.nf

"""
python $workflow.projectDir/bin/scatter_plots.py -q $quant_table -a $attribute -org host
"""

NextFlow From line 2504 of master/main.nf

"""
$workflow.projectDir/bin/extract_processed_reads.sh salmon/*/aux_info/meta_info.json $sample_name salmon
"""

NextFlow Salmon From line 2528 of master/main.nf

"""
cat $process_reads > processed_reads_salmon.tsv
"""

NextFlow From line 2552 of master/main.nf

"""
python $workflow.projectDir/bin/mapping_stats.py -q_p $quant_table_pathogen -q_h $quant_table_host -total_processed $total_processed_reads -total_raw $total_raw_reads -a $attribute -t salmon -o salmon_host_pathogen_total_reads.tsv
"""

NextFlow Salmon From line 2580 of master/main.nf

"""
python $workflow.projectDir/bin/plot_mapping_statistics_salmon.py -i $stats
"""

NextFlow From line 2604 of master/main.nf

'''
python !{workflow.projectDir}/bin/RNA_class_content.py -q !{quant_table} -a !{attribute} -annotations !{gene_annotations} -q_tool salmon -org pathogen 2>&1
'''

NextFlow Salmon From line 2635 of master/main.nf

'''
python !{workflow.projectDir}/bin/RNA_class_content.py -q !{quant_table} -a !{attribute} -annotations !{gene_annotations} -rna !{rna_classes_to_replace} -q_tool salmon -org host 2>&1
'''

NextFlow Salmon From line 2666 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_each.py -i $stats_table
"""

NextFlow From line 2694 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_combined.py -i $stats_table -org pathogen
"""

NextFlow From line 2722 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_each.py -i $stats_table
"""

NextFlow From line 2749 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_combined.py -i $stats_table -org host
"""

NextFlow From line 2776 of master/main.nf

"""
mkdir index
STAR --runThreadN ${task.cpus} --runMode genomeGenerate --genomeDir index/ --genomeFastaFiles $fasta --sjdbGTFfile $gff --sjdbGTFfeatureExon exon --sjdbGTFtagExonParentTranscript Parent --sjdbOverhang $sjdbOverhang $star_salmon_index_params
"""

NextFlow STAR From line 2815 of master/main.nf

"""
mkdir $sample_name
STAR --runThreadN ${task.cpus} --genomeDir . --sjdbGTFfile $gff $readFilesCommand --readFilesIn $reads --outSAMtype BAM Unsorted --outSAMunmapped $outSAMunmapped --outSAMattributes $outSAMattributes --outFileNamePrefix $sample_name/$sample_name --sjdbGTFfeatureExon quant --sjdbGTFtagExonParentTranscript parent --quantMode TranscriptomeSAM --quantTranscriptomeBan $quantTranscriptomeBan --outFilterMultimapNmax $outFilterMultimapNmax --outFilterType $outFilterType --limitBAMsortRAM $limitBAMsortRAM --alignSJoverhangMin $alignSJoverhangMin --alignSJDBoverhangMin $alignSJDBoverhangMin --outFilterMismatchNmax $outFilterMismatchNmax --outFilterMismatchNoverReadLmax $outFilterMismatchNoverReadLmax --alignIntronMin $alignIntronMin --alignIntronMax $alignIntronMax --alignMatesGapMax $alignMatesGapMax --winAnchorMultimapNmax $winAnchorMultimapNmax $star_salmon_alignment_params
"""

NextFlow STAR Quant From line 2863 of master/main.nf

"""
mkdir $sample_name
STAR --runThreadN ${task.cpus} --genomeDir . --sjdbGTFfile $gff $readFilesCommand --readFilesIn ${reads[0]} ${reads[1]} --outSAMtype BAM Unsorted --outSAMunmapped $outSAMunmapped --outSAMattributes $outSAMattributes --outFileNamePrefix $sample_name/$sample_name --sjdbGTFfeatureExon quant --sjdbGTFtagExonParentTranscript parent --quantMode TranscriptomeSAM --quantTranscriptomeBan $quantTranscriptomeBan --outFilterMultimapNmax $outFilterMultimapNmax --outFilterType $outFilterType --limitBAMsortRAM $limitBAMsortRAM --alignSJoverhangMin $alignSJoverhangMin --alignSJDBoverhangMin $alignSJDBoverhangMin --outFilterMismatchNmax $outFilterMismatchNmax --outFilterMismatchNoverReadLmax $outFilterMismatchNoverReadLmax --alignIntronMin $alignIntronMin --alignIntronMax $alignIntronMax --alignMatesGapMax $alignMatesGapMax --winAnchorMultimapNmax $winAnchorMultimapNmax $star_salmon_alignment_params
"""

NextFlow STAR Quant From line 2868 of master/main.nf

"""
salmon quant -p ${task.cpus} -t $transcriptome -l $libtype -a $bam_file --incompatPrior $incompatPrior -o $sample_name $salmon_alignment_based_params
"""

NextFlow Quant Salmon From line 2902 of master/main.nf

"""
$workflow.projectDir/bin/split_quant_tables_salmon.sh $transcriptome_pathogen $transcriptome_host salmon/*/quant.sf "quant.sf"
"""

NextFlow From line 2930 of master/main.nf

"""
$workflow.projectDir/bin/salmon_extract_ambig_uniq_transcripts_genes.R salmon/*/quant.sf salmon/*/aux_info/ambig_info.tsv $sample_name $annotations
"""

NextFlow From line 2963 of master/main.nf

"""
$workflow.projectDir/bin/salmon_host_comb_ambig_uniq.R salmon/*/aux_info/*_host_quant_ambig_uniq.sf
"""

NextFlow From line 2985 of master/main.nf

"""
$workflow.projectDir/bin/salmon_pathogen_comb_ambig_uniq.R salmon/*/aux_info/*_pathogen_quant_ambig_uniq.sf
"""

NextFlow From line 3007 of master/main.nf

"""
$workflow.projectDir/bin/tximport.R salmon $annotations $sample_name
"""

NextFlow Salmon From line 3033 of master/main.nf

"""
python $workflow.projectDir/bin/collect_quantification_data.py -i $input_quantification -q salmon -a gene_id -org host_gene_level
"""

NextFlow Salmon From line 3057 of master/main.nf

"""
python $workflow.projectDir/bin/collect_quantification_data.py -i $input_quantification -q salmon -a $gene_attribute -org both
"""

NextFlow Salmon From line 3082 of master/main.nf

"""
$workflow.projectDir/bin/split_quant_tables_salmon.sh $transcriptome_pathogen $transcriptome_host $quant_table "quant_salmon.tsv"
pathonen_tab=\$(if [ \$(cat pathogen_quant_salmon.tsv | wc -l) -gt 1  ]; then echo "true"; else echo "false"; fi)
host_tab=\$(if [ \$(cat host_quant_salmon.tsv | wc -l) -gt 1  ]; then echo "true"; else echo "false"; fi)
"""

NextFlow From line 3117 of master/main.nf

"""
$workflow.projectDir/bin/combine_quant_annotations.py -q $quantification_table -annotations $annotation_table -a $attribute -org pathogen
"""

NextFlow From line 3145 of master/main.nf

"""
$workflow.projectDir/bin/combine_quant_annotations.py -q $quantification_table -annotations $annotation_table -a $attribute -org host
"""

NextFlow From line 3170 of master/main.nf

"""
$workflow.projectDir/bin/combine_annotations_salmon_gene_level.py -q $quantification_table -annotations $annotation_table -a gene_id -org host
"""

NextFlow From line 3194 of master/main.nf

"""
$workflow.projectDir/bin/extract_processed_reads.sh $Log_final_out $sample_name star
"""

NextFlow From line 3220 of master/main.nf

"""
cat $process_reads > processed_reads_star.tsv
"""

NextFlow From line 3243 of master/main.nf

"""
python $workflow.projectDir/bin/scatter_plots.py -q $quant_table -a $attribute -org pathogen
"""

NextFlow From line 3274 of master/main.nf

"""
python $workflow.projectDir/bin/scatter_plots.py -q $quant_table -a $attribute -org host 
"""

NextFlow From line 3305 of master/main.nf

"""
$workflow.projectDir/bin/extract_processed_reads.sh salmon_alignment_mode/*/aux_info/meta_info.json $sample_name salmon_alignment
"""

NextFlow From line 3329 of master/main.nf

"""
cat $process_reads > processed_reads_salmon_alignment.tsv
"""

NextFlow From line 3352 of master/main.nf

"""
python $workflow.projectDir/bin/mapping_stats.py -q_p $quant_table_pathogen -q_h $quant_table_host -total_processed $total_processed_reads -total_raw $total_raw_reads -a $attribute --star_processed $total_processed_reads_star -t salmon_alignment -o salmon_alignment_host_pathogen_total_reads.tsv
"""

NextFlow From line 3381 of master/main.nf

"""
python $workflow.projectDir/bin/plot_mapping_statistics_salmon_alignment.py -i $stats
"""

NextFlow From line 3407 of master/main.nf

'''
python !{workflow.projectDir}/bin/RNA_class_content.py -q !{quant_table} -a !{attribute} -annotations !{gene_annotations} -q_tool salmon -org pathogen 2>&1
'''

NextFlow Salmon From line 3436 of master/main.nf

'''
python !{workflow.projectDir}/bin/RNA_class_content.py -q !{quant_table} -a !{attribute} -annotations !{gene_annotations} -rna !{rna_classes_to_replace} -q_tool salmon -org host 2>&1
'''

NextFlow Salmon From line 3466 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_each.py -i $stats_table
"""

NextFlow From line 3494 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_each.py -i $stats_table
"""

NextFlow From line 3521 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_combined.py -i $stats_table -org pathogen
"""

NextFlow From line 3549 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_combined.py -i $stats_table -org host
"""

NextFlow From line 3576 of master/main.nf

"""
mkdir index
STAR --runThreadN ${task.cpus} --runMode genomeGenerate --genomeDir index/ --genomeFastaFiles $fasta --sjdbGTFfile $gff --sjdbGTFfeatureExon exon --sjdbGTFtagExonParentTranscript Parent --sjdbOverhang $sjdbOverhang $star_index_params
"""

NextFlow STAR From line 3617 of master/main.nf

"""
mkdir $sample_name
STAR --runThreadN ${task.cpus} --genomeDir . --sjdbGTFfile $gff $readFilesCommand --readFilesIn $reads --outSAMtype BAM SortedByCoordinate --outSAMunmapped $outSAMunmapped --outSAMattributes $outSAMattributes --outWigType $outWigType --outWigStrand $outWigStrand --outFileNamePrefix $sample_name/$sample_name --sjdbGTFfeatureExon exon --sjdbGTFtagExonParentTranscript Parent --outFilterMultimapNmax $outFilterMultimapNmax --outFilterType $outFilterType --limitBAMsortRAM $limitBAMsortRAM --alignSJoverhangMin $alignSJoverhangMin --alignSJDBoverhangMin $alignSJDBoverhangMin --outFilterMismatchNmax $outFilterMismatchNmax --outFilterMismatchNoverReadLmax $outFilterMismatchNoverReadLmax --alignIntronMin $alignIntronMin --alignIntronMax $alignIntronMax --alignMatesGapMax $alignMatesGapMax --winAnchorMultimapNmax $winAnchorMultimapNmax $star_alignment_params
"""

NextFlow STAR From line 3666 of master/main.nf

"""
mkdir $sample_name
STAR --runThreadN ${task.cpus} --genomeDir . --sjdbGTFfile $gff $readFilesCommand --readFilesIn ${reads[0]} ${reads[1]} --outSAMtype BAM SortedByCoordinate --outSAMunmapped $outSAMunmapped --outSAMattributes $outSAMattributes --outWigType $outWigType --outWigStrand $outWigStrand --outFileNamePrefix $sample_name/$sample_name --sjdbGTFfeatureExon exon --sjdbGTFtagExonParentTranscript Parent --outFilterMultimapNmax $outFilterMultimapNmax --outFilterType $outFilterType --limitBAMsortRAM $limitBAMsortRAM --alignSJoverhangMin $alignSJoverhangMin --alignSJDBoverhangMin $alignSJDBoverhangMin --outFilterMismatchNmax $outFilterMismatchNmax --outFilterMismatchNoverReadLmax $outFilterMismatchNoverReadLmax --alignIntronMin $alignIntronMin --alignIntronMax $alignIntronMax --alignMatesGapMax $alignMatesGapMax --winAnchorMultimapNmax $winAnchorMultimapNmax $star_alignment_params
"""

NextFlow STAR From line 3671 of master/main.nf

"""
$workflow.projectDir/bin/remove_crossmapped_reads_BAM.sh $alignment $workflow.projectDir/bin $host_reference $pathogen_reference $cross_mapped_reads $bam_file_without_crossmapped
"""

NextFlow From line 3708 of master/main.nf

"""
$workflow.projectDir/bin/remove_crossmapped_read_pairs_BAM.sh $alignment $workflow.projectDir/bin $host_reference $pathogen_reference $cross_mapped_reads $bam_file_without_crossmapped
"""

NextFlow From line 3712 of master/main.nf
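The two snippets above remove cross-mapped reads (or read pairs) from the STAR alignments, using lists of host and pathogen reference names. The shell scripts are not reproduced here. The Python sketch below works from the assumption that a read is cross-mapped when it has alignments to both a host and a pathogen reference sequence. It operates on plain (read name, reference name) tuples rather than BAM records, and `find_cross_mapped` is a hypothetical name.

```python
from collections import defaultdict


def find_cross_mapped(alignments, host_refs, pathogen_refs):
    """Return names of reads aligned to both host and pathogen references."""
    organisms = defaultdict(set)
    for read_name, ref_name in alignments:
        if ref_name in host_refs:
            organisms[read_name].add("host")
        elif ref_name in pathogen_refs:
            organisms[read_name].add("pathogen")
    return {name for name, seen in organisms.items() if seen == {"host", "pathogen"}}
```

Reads returned by such a function would be dropped before counting, since they cannot be assigned to one organism.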

"""
$workflow.projectDir/bin/extract_processed_reads.sh $Log_final_out $sample_name star
"""

NextFlow From line 3737 of master/main.nf

"""
cat $process_reads > processed_reads_star.tsv
"""

NextFlow From line 3762 of master/main.nf

'''
!{workflow.projectDir}/bin/count_uniquely_mapped_reads.sh !{alignment} !{host_reference_names} !{pathogen_reference_names} !{sample_name} !{name}
'''

NextFlow From line 3790 of master/main.nf

'''
!{workflow.projectDir}/bin/count_uniquely_mapped_read_pairs.sh !{alignment} !{host_reference_names} !{pathogen_reference_names} !{sample_name} !{name}
'''

NextFlow From line 3794 of master/main.nf

"""
python $workflow.projectDir/bin/combine_tables.py -i $stats -o uniquely_mapped_reads_star.tsv -s uniquely_mapped_reads
"""

NextFlow From line 3818 of master/main.nf

"""
$workflow.projectDir/bin/count_cross_mapped_reads.sh $cross_mapped_reads
"""

NextFlow From line 3842 of master/main.nf

'''
!{workflow.projectDir}/bin/count_multi_mapped_reads.sh !{alignment} !{host_reference_names} !{pathogen_reference_names} !{sample_name} !{name}
'''

NextFlow From line 3870 of master/main.nf

'''
!{workflow.projectDir}/bin/count_multi_mapped_read_pairs.sh !{alignment} !{host_reference_names} !{pathogen_reference_names} !{sample_name} !{name}
'''

NextFlow From line 3874 of master/main.nf

"""
python $workflow.projectDir/bin/combine_tables.py -i $stats -o multi_mapped_reads_star.tsv -s multi_mapped_reads
"""

NextFlow From line 3899 of master/main.nf

"""
python $workflow.projectDir/bin/mapping_stats.py -total_raw $total_raw_reads -total_processed $total_processed_reads -m_u $uniquely_mapped_reads -m_m $multi_mapped_reads -c_m $cross_mapped_reads -t star -o star_mapping_stats.tsv
"""

NextFlow From line 3928 of master/main.nf

"""
python $workflow.projectDir/bin/plot_mapping_stats_star.py -i $stats
"""

NextFlow From line 3952 of master/main.nf

"""
htseq-count -n $task.cpus -t quant -f bam -r pos $st $gff -i $host_attr -s $stranded --max-reads-in-buffer=$max_reads_in_buffer -a $minaqual $htseq_params > $name_file2
sed -i '1{h;s/.*/'"$sample_name"'/;G}' "$name_file2"
"""

NextFlow Quant htseqcount From line 3999 of master/main.nf
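The `sed` expression in the snippet above (`1{h;s/.*/NAME/;G}`) saves the first line of the htseq-count output, replaces it with the sample name, and then appends the saved line again. The net effect is that the sample name becomes a new first line of the file. A Python sketch of the same effect follows; `prepend_header` is a hypothetical name, and unlike `sed` it also writes the header when the file is empty.

```python
def prepend_header(path, sample_name):
    """Insert sample_name as a new first line of the file at path."""
    with open(path) as fin:
        body = fin.read()
    with open(path, "w") as fout:
        fout.write(f"{sample_name}\n{body}")
```

Labelling each per-sample count file this way allows the later collection step to name columns when the files are combined.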

"""
python $workflow.projectDir/bin/collect_quantification_data.py -i $input_quantification -q htseq -a $host_attribute 
"""

NextFlow HTSeq From line 4029 of master/main.nf

"""
$workflow.projectDir/bin/calculate_TPM_HTSeq.R $input_quantification $host_attribute $gff_pathogen $gff_host
"""

NextFlow From line 4057 of master/main.nf
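The script calculate_TPM_HTSeq.R is not shown on this page. For orientation, the standard definition of transcripts per million (TPM) divides each count by the feature length and rescales the resulting rates so that they sum to one million. The Python sketch below implements that textbook formula. How the pipeline derives gene lengths from the two GFF files is not covered, so `lengths` is a hypothetical input here.

```python
def tpm(counts, lengths):
    """Compute TPM from raw counts and feature lengths (dicts keyed by gene)."""
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}
```

For example, two genes with counts 10 and 20 and lengths 1000 and 2000 have equal per-base rates, so each receives half of the million.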

	    """
	    $workflow.projectDir/bin/split_quant_tables.sh $quant_table $host_annotations $pathogen_annotations quantification_uniquely_mapped_htseq.tsv
	    pathonen_tab=\$(if [ \$(cat pathogen_quantification_uniquely_mapped_htseq.tsv | wc -l) -gt 1  ]; then echo "true"; else echo "false"; fi)
	    host_tab=\$(if [ \$(cat host_quantification_uniquely_mapped_htseq.tsv | wc -l) -gt 1  ]; then echo "true"; else echo "false"; fi)
	    """

NextFlow From line 4094 of master/main.nf

"""
$workflow.projectDir/bin/combine_quant_annotations.py -q $quantification_table -annotations $annotation_table -a $attribute -org pathogen
"""

NextFlow From line 4122 of master/main.nf

"""
$workflow.projectDir/bin/combine_quant_annotations.py -q $quantification_table -annotations $annotation_table -a $attribute -org host
"""

NextFlow From line 4148 of master/main.nf

"""
python $workflow.projectDir/bin/scatter_plots.py -q $quant_table -a $attribute -org pathogen
"""

NextFlow From line 4181 of master/main.nf

"""
python $workflow.projectDir/bin/scatter_plots.py -q $quant_table -a $attribute -org host 
"""

NextFlow From line 4213 of master/main.nf

"""
python $workflow.projectDir/bin/mapping_stats.py -q_p $quant_table_pathogen -q_h $quant_table_host -a $attribute  -star $star_stats -t htseq -o htseq_uniquely_mapped_reads_stats.tsv
"""

NextFlow HTSeq From line 4241 of master/main.nf

"""
python $workflow.projectDir/bin/plot_mapping_stats_htseq.py -i $stats
"""

NextFlow From line 4266 of master/main.nf

'''
python !{workflow.projectDir}/bin/RNA_class_content.py -q !{quant_table} -a !{attribute} -annotations !{gene_annotations} -q_tool htseq -org pathogen 2>&1
'''

NextFlow HTSeq From line 4296 of master/main.nf

'''
python !{workflow.projectDir}/bin/RNA_class_content.py -q !{quant_table} -a !{attribute} -annotations !{gene_annotations} -rna !{rna_classes_to_replace} -q_tool htseq -org host 2>&1
'''

NextFlow HTSeq From line 4327 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_each.py -i $stats_table
"""

NextFlow From line 4355 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_each.py -i $stats_table
"""

NextFlow From line 4383 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_combined.py -i $stats_table -org pathogen
"""

NextFlow From line 4411 of master/main.nf

"""
python $workflow.projectDir/bin/plot_RNA_class_stats_combined.py -i $stats_table -org host
"""

NextFlow From line 4438 of master/main.nf

"""
multiqc -d --export -f $rtitle $rfilename $custom_config_file . 
"""

NextFlow MultiQC From line 4477 of master/main.nf

"""
markdown_to_html.py $output_docs -o results_description.html
"""

NextFlow From line 4499 of master/main.nf


Created: 1yr ago

Updated: 1yr ago

Maintainers: public

URL: https://nf-co.re/dualrnaseq

Name: dualrnaseq

Version: 1.0.0


License: None

Keywords:

FASTQ-illumina, raw sequence reads, RNA-Seq quantification, Quality control report, BBMap, Cutadapt, FastQC, gffread, HTSeq, htseqcount, MultiQC, Quant, Salmon, SAMtools, STAR, RNA-Seq
