R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Summary of Document

In this document, we’ll go through how we can quickliy simulate trees so we can learn how to do important phylogenetic techniques and methods, such as: tree manipulation, rerooting, signal estimation, trait mapping, etc.

First, you should load the required packages. You should install if you do not have them yet.

library(ape)

First to get going, we’ll simulate a tree in R. We’ll do a simple one. Something like 50 tips should work well.

set.seed(585)
tree<-rtree(50)
plot(tree, cex=0.8)

Finding a Smoothing Paramter

in order to date this tree, we need to identify a smoothing parameter called lambda (different than Pagel’s Lambda). To do this, we’ll want to test multiple values for lambda. We’ll set up a list of numbers which are 10 to the power of -1 through 6 as the variable ‘l’ (lowercase L). We’ll also make a blank list called cv (for crossvalidation values). This will be as long as the number of values we have in “l”

l<-10^(-1:6)
cv<-numeric(length(l))

In order to test these values, we will run a loop in which we run the cross validation. This will run a check for each tip. The bigger the tree is, the longer it’ll take. The output of the loops will be pasted into the “cv” object. I won’t print the output here, as it’s a bit messy

We will want to plot the results to see which lambda we want to use. The lower the cv value, the better. We can then also use the min() function to tell us if we are questioning the plot

plot(l,cv)

min(cv)
## [1] -1480.535
# we can see how it stands within the list
cv
## [1] 76091.5625 87281.1897 -1480.5348 -1420.0021 -1413.2562 -1281.8597  -114.7272
## [8]   568.4544
#this function lets us know which position that is
which.min(cv)
## [1] 3

Smoothing our Tree

In order to make our tree “ultrametric” in which all current taxa reach the end of the tree (nothing looks extinct), we can use the chronos() function. We can see the clear difference below

#we can use the "which.min()" function again to give us the position for the lambda list
new.tree<-chronos(tree, lambda=l[which.min(cv)], model="correlated")
## 
## Setting initial dates...
## Fitting in progress... get a first set of estimates
##          (Penalised) log-lik = -77.42488 
## Optimising rates... dates... -77.42488 
## 
## log-Lik = -76.93595 
## PHIIC = 446
plot(new.tree)

#vs old tree without smoothing
plot(tree)

We can confirm the length of the tree from root to tip using the below function. We can see the new.tree object is 1, where as that’s not the case for the other

#old tree
max(node.depth.edgelength(tree))
## [1] 6.062266
#new tree
max(node.depth.edgelength(new.tree))
## [1] 1

Correcting for Time

While the tree looks good, it’s dated from 0 to 1 (rather 1 to 0 in “million years ago”). We should correct this to represent actual time and not proportion of time.

First, we need to know how long the divergence date is for our outgroup from the rest of the tree. You can find this information usually on https://timetree.org/

Let’s say for this example the divergence date was 87 million years ago. We want to correct our tree for that length of time. We can then check the length and it should line up with the date we gave it.

root.time<-87
new.tree$edge.length<-new.tree$edge.length*root.time
plot(new.tree)

 max(node.depth.edgelength(new.tree))
## [1] 87

We can then write our tree file to our working directory using this line: write.tree(new.tree, “timed.tree.new”)