Genetic Code Prediction and Annotation in Uncultivated Ciliates from Marine Environments
Genetic code prediction from karyorelict and heterotrich ciliates
Ciliates use an unusually diverse set of genetic codes compared with other groups of eukaryotes. The karyorelicts and heterotrichs have some of the most unusual codes, in which stop codons are potentially ambiguous and can also code for amino acids.
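To make the idea of an ambiguous stop codon concrete, here is a minimal Python sketch. It translates a sequence under the standard genetic code and under a hypothetical reassignment in which in-frame stop codons are read as amino acids and only the final codon terminates. The `reassign` mapping and the position-based rule are illustrative assumptions, not the codes predicted by this workflow.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, reassign=None):
    """Translate a DNA coding sequence in frame 0.

    reassign: hypothetical mapping of stop codons to amino acids. It is
    applied to every codon except the last one, so a final stop codon
    still terminates translation (an assumed, simplified rule).
    """
    reassign = reassign or {}
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    protein = []
    for index, codon in enumerate(codons):
        aa = STANDARD_CODE.get(codon, "X")  # X marks codons with N or other symbols
        is_last = index == len(codons) - 1
        if aa == "*" and not is_last and codon in reassign:
            aa = reassign[codon]
        protein.append(aa)
    return "".join(protein)


example = "ATGTAAGGATGA"
print(translate(example))
print(translate(example, reassign={"TAA": "Q", "TAG": "Q", "TGA": "W"}))
```

Under the standard code the example reads `M*G*`; with the assumed reassignment the internal `TAA` is read as `Q` and only the terminal `TGA` remains a stop, giving `MQG*`.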
Aims:
- Assemble single-cell transcriptome RNAseq data from uncultivated ciliates
- Assemble whole-genome-amplification DNAseq data from uncultivated ciliates
- Predict genetic codes from the above assemblies
- Annotate genes from the above assemblies and identify clear examples of ambiguous stop genetic codes
See our preprint (doi:10.1101/2022.04.12.488043) for more information.
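One way to think about predicting a genetic code from an assembly is to tally which amino acid a conserved reference protein has at positions where the assembled transcript has an in-frame stop codon. The sketch below assumes you already have (codon, aligned reference residue) pairs from some alignment step. The function name, input format, and `min_count` threshold are hypothetical, and this is not presented as the method used in the preprint.

```python
from collections import Counter, defaultdict

STOP_CODONS = {"TAA", "TAG", "TGA"}


def tally_stop_codon_partners(pairs, min_count=3):
    """Suggest a meaning for each stop codon from aligned residue pairs.

    pairs: iterable of (codon, reference_residue) tuples, assumed to come
    from alignments of transcripts against conserved reference proteins.
    Returns {codon: (most_common_residue, fraction)} for codons seen at
    least min_count times (an arbitrary illustrative threshold).
    """
    counts = defaultdict(Counter)
    for codon, residue in pairs:
        codon = codon.upper()
        if codon in STOP_CODONS:
            counts[codon][residue] += 1

    suggestions = {}
    for codon, residue_counts in counts.items():
        total = sum(residue_counts.values())
        if total < min_count:
            continue  # too few observations to say anything
        residue, n = residue_counts.most_common(1)[0]
        suggestions[codon] = (residue, n / total)
    return suggestions


# Made-up observations for demonstration only.
observed = [("TAA", "Q")] * 4 + [("TAA", "E")] + [("TGA", "W")] * 3 + [("TAG", "Q")]
print(tally_stop_codon_partners(observed))
```

A codon that mostly pairs with one residue is a candidate for reassignment, while a codon that appears only at transcript ends would not show up in such a tally at all. Real predictions need many more observations and a proper statistical treatment.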
Data
- Single-cell transcriptome RNAseq from uncultivated marine ciliates, collected from Roscoff, France
- Published datasets downloaded from SRA
Data deposition
- Primary read data from this study: ENA PRJEB50648
- Snakemake pipelines:
Code Snippets
shell:
    "phyloFlash.pl -lib {wildcards.lib}_rnaseq_maponly_{wildcards.readlim} -readlength 150 -readlimit {wildcards.readlim} -skip_spades -read1 {input.fwd} -read2 {input.rev} -CPUs {threads} -html -treemap -log -poscov -zip -dbhome {params.db} 2> {log};"
    "mv {wildcards.lib}_rnaseq_maponly_{wildcards.readlim}.phyloFlash* qc/phyloFlash/;"
shell:
    "phyloFlash.pl -lib {wildcards.lib}_dnaseq -readlength 150 -read1 {input.fwd} -read2 {input.rev} -CPUs {threads} -almosteverything -dbhome {params.db} 2> {log};"
    "mv {wildcards.lib}_dnaseq.phyloFlash* qc/phyloFlash/;"
shell:
    "bbduk.sh -Xmx10g threads={threads} ref=resources/adapters.fa,resources/phix174_ill.ref.fa.gz in={input.fwd} in2={input.rev} ktrim=r qtrim=rl trimq=24 minlength=25 out={output.fwd} out2={output.rev} 2> {log}"
shell:
    "bbduk.sh -Xmx10g threads={threads} ref=resources/adapters.fa,resources/phix174_ill.ref.fa.gz in={input.fwd} in2={input.rev} ktrim=r qtrim=rl trimq=24 minlength=25 out={output.fwd} out2={output.rev} 2> {log}"
shell:
    "Trinity --seqType fq --max_memory 64G --bflyHeapSpaceMax 40G --CPU {threads} --full_cleanup --left {input.fwd} --right {input.rev} --output {params.prefix} &> {log};"
shell:
    "cat {input.assem} | parallel --gnu -j {threads} --recstart '>' -N 100 --pipe blastx -query - -db {params.db_prefix} -evalue 1e-20 -max_target_seqs 1 -outfmt 6 > {output.blastx};"
    "analyze_blastPlus_topHit_coverage.pl {output.blastx} {input.assem} {input.db} &> {log};"
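The kind of coverage summary computed from the blastx output above can be approximated in a few lines of Python. This sketch parses BLAST tabular output (`-outfmt 6`, default 12 columns) and, for each database protein, reports the largest percentage of its length covered by a single hit. It is an illustration only: the `subject_lengths` dictionary is assumed to be supplied by the caller, and the logic is not a reimplementation of `analyze_blastPlus_topHit_coverage.pl`.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_subject_coverage(handle, subject_lengths):
    """Return {sseqid: percent of subject length covered by its best hit}.

    handle: open text file with -outfmt 6 lines.
    subject_lengths: {sseqid: protein length}, assumed known in advance.
    """
    coverage = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        hit = dict(zip(COLUMNS, row))
        subject = hit["sseqid"]
        if subject not in subject_lengths:
            continue
        start, end = sorted((int(hit["sstart"]), int(hit["send"])))
        percent = 100.0 * (end - start + 1) / subject_lengths[subject]
        coverage[subject] = max(coverage.get(subject, 0.0), percent)
    return coverage


# Two made-up hits against a hypothetical 200-residue protein "protA".
demo = io.StringIO(
    "contig1\tprotA\t85.0\t100\t15\t0\t1\t300\t1\t100\t1e-50\t200\n"
    "contig2\tprotA\t90.0\t150\t15\t0\t1\t450\t26\t175\t1e-80\t300\n"
)
print(best_subject_coverage(demo, {"protA": 200}))
```

Counting how many reference proteins are covered over most of their length by a single contig gives a rough indication of how many full-length transcripts an assembly contains.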
shell:
    "bbduk.sh -Xmx10g threads={threads} ref=resources/adapters.fa,resources/phix174_ill.ref.fa.gz in={input} ktrim=r qtrim=rl trimq=24 minlength=25 out={output} 2> {log}"
shell:
    "Trinity --seqType fq --max_memory 64G --bflyHeapSpaceMax 40G --CPU {threads} --full_cleanup --single {input} --output {params.prefix} &> {log};"