ML Sequence-structure tree calculation
1 Download and install R including the phangorn package
Download the latest R version from www.r-project.org. To install the phangorn package, type:
install.packages("phangorn")
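If the script is run repeatedly, the installation can be skipped when the package is already present. The following is a minimal sketch of that idea, not part of the original protocol; the helper name ensure_package and the default mirror URL are assumptions chosen for illustration.

```r
# Hypothetical helper: install a package only if it is not yet available.
# The default CRAN mirror is an assumption; substitute the mirror you prefer.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE if the package can be loaded afterwards, FALSE otherwise
  requireNamespace(pkg, quietly = TRUE)
}
```

Calling ensure_package("phangorn") once before library(phangorn) then has the same effect as the install command above, without reinstalling on every run.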
2 Calculate the sequence-structure tree based on maximum likelihood
In this example we use a one-letter-coded .fasta file containing sequence and structure information.

# Load the phangorn library
library(phangorn)

# Read the aligned .fasta file (one-letter-encoded sequence and structure)
file = "c:/path_to/seq_str_one_letter_code.fasta"
dat = read.phyDat(file, format="fasta", type="AA")

# Compute ML distances and build a neighbor-joining (NJ) starting tree
dm = dist.ml(dat)
tree = NJ(dm)

# Estimate a GTR+I+G ML tree using the NJ tree as starting tree
fitStart = pml(tree, dat, k=4, inv=.2)
fit = optim.pml(fitStart, optNni=TRUE, optBf=TRUE, optQ=TRUE, optInv=TRUE, optGamma=TRUE, optEdge=TRUE)

# Plot the ML tree, rooted on a species contained in the dataset
root_name = "NameOfSpeciesWithinDataset"
plot(ladderize(root(fit$tree, match(root_name, fit$tree$tip.label))), type="phylogram", show.node.label=TRUE, underscore=TRUE, cex=0.8, no.margin=TRUE)

# Plot the bootstrapped tree (100 replicates)
bs = bootstrap.pml(fit, bs=100, optNni=TRUE)
treeBS = plotBS(fit$tree, bs, type="phylogram", bs.col="red", bs.adj=NULL)

# Save the ML tree
write.tree(fit$tree, file="ML_GTR_I_G.tre")

# Save the ML bootstrap tree
write.tree(treeBS, file="BS_ML_GTR_I_G.tre")
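To illustrate what a one-letter code for sequence and structure means, the sketch below combines each nucleotide with its structure symbol in dot-bracket notation and maps every combination to a single letter. It assumes four nucleotides and three structure states, giving 12 combinations. The letters assigned here and the function name encode_seq_str are purely illustrative and will not match the alphabet written by your alignment software, so always use that software's own export for a real analysis.

```r
# Hypothetical mapping: 4 nucleotides x 3 structure symbols -> 12 letters.
# The letters A..L are arbitrary and for illustration only.
nucleotides <- c("A", "C", "G", "U")
states <- c(".", "(", ")")   # unpaired, paired (opening), paired (closing)
pairs <- expand.grid(nt = nucleotides, st = states, stringsAsFactors = FALSE)
code <- setNames(LETTERS[seq_len(nrow(pairs))], paste0(pairs$nt, pairs$st))

# Encode one sequence and its dot-bracket structure into a one-letter string.
# Gaps and ambiguous nucleotides are not handled and raise an error.
encode_seq_str <- function(sequence, structure) {
  nt <- strsplit(toupper(sequence), "")[[1]]
  st <- strsplit(structure, "")[[1]]
  stopifnot(length(nt) == length(st))
  keys <- paste0(nt, st)
  stopifnot(all(keys %in% names(code)))
  paste(code[keys], collapse = "")
}

encode_seq_str("GCAUGC", "((..))")   # "GFADKJ"
```

Each position of the encoded string therefore carries both the base and its pairing state, which is why a single alignment file is enough for the tree calculation above.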
References:
R Core Team (2014) R: A Language and Environment for Statistical Computing. http://www.R-project.org
Schliep, K.P. (2011) phangorn: phylogenetic analysis in R. Bioinformatics, 27(4):592-593
Wolf, M., Koetschan, C., Müller, T. (2014) ITS2, 18S, 16S or any other RNA — simply aligning sequences and their individual secondary structures simultaneously by an automatic approach. Gene, 546(2):145-9