Snakemake workflow for running maximum likelihood (ML) phylogenetic analysis using RAxML-NG and associated tools. Slower than the IQ-TREE-based workflow, but much more accurate.
Scalable RAxML-NG-based phylogenetic analysis using Snakemake
This is a Snakemake workflow for running scalable maximum likelihood (ML) phylogenetic analysis using RAxML-NG and associated tools (Pythia, ModelTest-NG). It is considerably slower than the IQ-TREE-based workflow but much more accurate, especially on difficult-to-analyze datasets.
The workflow runs all steps sequentially: multiple sequence alignment (MSA) with MAFFT, alignment trimming with trimAl, conversion to PHYLIP format, difficulty prediction with Pythia, model selection with ModelTest-NG, tree inference with RAxML-NG, and midpoint rooting with ETE3.
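The stage order can be sketched in Python as a small pre-flight check that lists the stages and reports which external tools are not on the PATH. This is only an illustration: the stage labels and the `missing_tools` helper are names invented for this sketch, and the order is assumed to follow the order of the rules in the Snakefile. Pythia is additionally given a RAxML-NG binary at `raxml-ng/bin/raxml-ng`, which this sketch does not check.

```python
import shutil

# (label, command, is_external). Labels are descriptive names chosen for this
# sketch; the order is assumed to follow the order of the rules in the Snakefile.
STAGES = [
    ("multiple sequence alignment", "mafft", True),
    ("alignment trimming", "trimal", True),
    ("FASTA to PHYLIP conversion", "./aln2phylip.sh", False),
    ("difficulty prediction", "pythia", True),
    ("model selection", "modeltest-ng", True),
    ("tree inference", "raxml-ng-mpi", True),
    ("midpoint rooting", "midpoint.py", False),
]


def missing_tools(stages=STAGES, which=shutil.which):
    """Return the external executables that `which` cannot find, in stage order."""
    return [cmd for _, cmd, external in stages if external and which(cmd) is None]


if __name__ == "__main__":
    for step, (label, cmd, _) in enumerate(STAGES, start=1):
        print(f"{step}. {label}: {cmd}")
    absent = missing_tools()
    if absent:
        print("Not found on PATH:", ", ".join(absent))
```

Running the check before `snakemake` surfaces a missing dependency immediately instead of partway through a long analysis.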
Usage:
snakemake --cores 10 --snakefile Snakefile
Dependencies:
MAFFT https://github.com/GSLBiotech/mafft
trimAl https://github.com/inab/trimal
Pythia https://github.com/tschuelia/PyPythia
ModelTest-NG https://github.com/ddarriba/modeltest
RAxML-NG https://github.com/amkozlov/raxml-ng
ETE3 http://etetoolkit.org/
Code Snippets
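    # Convert a FASTA alignment to PHYLIP: a "<count> <length>" header line,
    # then one "<name> <sequence>" line per record.
    if [ $# != 1 ]; then
        echo "USAGE: ./script <fasta-file>"
        exit 1
    fi

    # Number of sequences = number of FASTA header lines
    numSpec=$(grep -c ">" "$1")
    # Join the records into "name sequence" entries separated by ";"
    # (name = first word of each header)
    tmp=$(cat "$1" | sed "s/>[ ]*\(\w*\).*/;\1</" | tr -d "\n" | tr -d ' ' | sed 's/^;//' | tr "<" " " )
    # Alignment length, measured on the first sequence
    length=$(($(echo $tmp | sed 's/[^ ]* \([^;]*\);.*/\1/' | wc -m ) - 1))

    echo "$numSpec $length"
    echo $tmp | tr ";" "\n"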
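For readers who find the sed/tr pipeline hard to follow, the same FASTA-to-PHYLIP conversion can be sketched in Python. This is an illustrative equivalent, not part of the workflow; the function name `fasta_to_phylip` is made up here. Unlike the shell script, it raises an error when the sequences differ in length instead of silently reporting the length of the first one.

```python
import re


def fasta_to_phylip(fasta_text):
    """Convert FASTA text to PHYLIP text with one 'name sequence' line per record."""
    records = []  # each entry is [name, list of sequence chunks]
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Like the shell script, keep only the leading word characters of the header.
            name = re.match(r"\w*", line[1:].lstrip()).group(0)
            records.append([name, []])
        elif records:
            # Sequence line: drop embedded spaces and attach to the current record.
            records[-1][1].append(line.replace(" ", ""))

    sequences = [(name, "".join(chunks)) for name, chunks in records]
    if not sequences:
        raise ValueError("no FASTA records found")

    length = len(sequences[0][1])
    # Extra safety check that the shell script does not perform.
    if any(len(seq) != length for _, seq in sequences):
        raise ValueError("sequences differ in length; input is not an alignment")

    lines = [f"{len(sequences)} {length}"]
    lines += [f"{name} {seq}" for name, seq in sequences]
    return "\n".join(lines) + "\n"
```

Calling `fasta_to_phylip(open("aln.fasta").read())` (with a hypothetical file name) returns the text that the shell script would print to standard output for a well-formed alignment.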
    import argparse
    from ete3 import Tree

    def midpoint_root(tree_path):
        """Midpoint-root a Newick tree and write it to <tree_path>.midpoint_rooted."""
        tree = Tree(tree_path, format=1)

        # Get the node that serves as the midpoint outgroup
        midpoint = tree.get_midpoint_outgroup()

        # Re-root the tree on that node
        tree.set_outgroup(midpoint)

        tree.write(format=1, outfile=tree_path + ".midpoint_rooted")

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument('tree_newick')
        args = parser.parse_args()
        midpoint_root(args.tree_newick)
    shell: "cat {input[0]} > {output[0]}"
    shell: "mafft --auto {input[0]} > {output[0]}"
    shell: "trimal -in {input} -out {output} -fasta -gt 0.50"
    shell: "./aln2phylip.sh {input} > {output}"
    shell: "pythia --msa {input} -r raxml-ng/bin/raxml-ng --removeDuplicates -o {output}"
    shell: "modeltest-ng -i {input[0]} -d aa -t ml -c -T raxml"
    shell: "MODEL=`grep -P '\sBIC' all_genes.pep.aln.trimmed.phy.log | grep -v 'Best' | sed 's/.* //g' | sed 's/ .*//g'` && raxml-ng-mpi --all --msa {input.msa} --model $MODEL --tree rand{{10}},pars{{90}} --threads 40"
    shell: "python3 midpoint.py {input[0]}"
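The tree inference rule picks the substitution model by filtering the ModelTest-NG log with grep and sed. A literal Python port of that filter makes the logic easier to read: keep lines that contain whitespace followed by `BIC`, drop lines that contain `Best`, and take whatever follows the last space on each remaining line. The function name `bic_tokens` is invented for this sketch, and the sample text in the usage note is made up to exercise the filter; it is not real ModelTest-NG output.

```python
import re


def bic_tokens(log_text):
    """Port of: grep -P (whitespace + BIC) | grep -v Best | sed 's/.* //g' | sed 's/ .*//g'."""
    tokens = []
    for line in log_text.splitlines():
        # grep -P "\sBIC": BIC preceded by a whitespace character
        # grep -v "Best": skip lines that mention "Best"
        if re.search(r"\sBIC", line) and "Best" not in line:
            # sed 's/.* //g' keeps only the text after the last space;
            # the second sed is then a no-op because no space remains.
            tokens.append(line.rsplit(" ", 1)[-1])
    return tokens
```

With an invented input such as `"Best model according to BIC\n   BIC  LG+G4\n"`, the function returns `["LG+G4"]`. If more than one line survives the filter, the shell variable `MODEL` would hold several tokens, so checking that exactly one token comes back is a reasonable safeguard before calling RAxML-NG.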