In this tutorial we'll look at how to write phylo4d_ext objects out to various branch-annotated formats.
In a previous tutorial, we worked with a simple phylogenetic tree in class RBrownie:::phylo4d_ext. I wrote that tree to a file in simmap 1.5 format (using the methods we'll be reviewing here) and have attached it to this tutorial (download). The first step then is to open your R environment and download 'phyext_test.txt' to R's working directory (getwd()).
Now, let's read that file into R:
library(RBrownie)
phyd.ex <- read.simmap.new("phyext_test.txt", specialpatt=".*" )
class(phyd.ex) # 'phylo4d_ext'
NOTE: read.simmap.new is RBrownie's function for reading in simmap 1.5 formatted files. The specialpatt is a strange option and was added to fix some weird behavior; it should be gone in future releases (I think....). For now though, you should set it to ".*" when reading simmap v1.5 files.
phyd.ex has 4 tip nodes and 1 subnode (which can be viewed using plot(phyd.ex)). Before we get started with the tutorial, we need to prune this object a bit, just to make it more manageable. Basically, we just want to get rid of a few data columns and add some internal node data.
First, let's look at which data columns are available for this object:
tdata(phyd.ex)
Looks like we have 4 columns: "standard_char", "limbsize", "numberbaldspots", "isrodent". We don't really need "limbsize" or "numberbaldspots" for this tutorial so let's dump 'em:
phyd.ex <- rmdata(phyd.ex, c("limbsize","numberbaldspots")) # could have used c(2,3)
NOTE: rmdata also removes subnode.data columns if it can find matching indices
And let's also add some internal node data for 'standard_char':
tmpdat <- data.frame("standard_char"=rep(1,3))
tdata(phyd.ex, type="internal", merge.data=TRUE, match.data=FALSE) <- tmpdat
Okay, now we are ready to look at different ways to output this tree.
SIMMAP 1.0 and 1.1
RBrownie supports writing 'phylo4d_ext' objects to older-style simmap format (1.0 and 1.1) chiefly because 1.1 is the format Brownie currently reads. It can only handle one character per node/subnode and does not label that character (in contrast to simmap 1.5). Using the write.simmap function we can see what this output looks like:
write.simmap(phyd.ex, usestate=1, vers=1.1, file=stdout())
(A:{0,0.075:1,0.025},(B:{0,0.2},(C:{0,0.3},D:{0,0.4}):{1,0.5}):{1,0.1}):{1,0};
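To make the branch annotations in that output easier to read, here is a minimal base-R sketch that splits a single annotation into its state/length segments. The helper name parse_simmap_branch is hypothetical and is not part of RBrownie; it assumes each segment has the form "state,length" and that segments are separated by ":", as in the output above.

```r
# Hypothetical helper (NOT part of RBrownie): split one simmap 1.1 branch
# annotation such as "{0,0.075:1,0.025}" into its state/length segments.
parse_simmap_branch <- function(annot) {
  inner <- substr(annot, 2, nchar(annot) - 1)       # drop the surrounding braces
  segs  <- strsplit(inner, ":", fixed = TRUE)[[1]]  # one entry per mapped segment
  parts <- strsplit(segs, ",", fixed = TRUE)        # each segment is "state,length"
  data.frame(
    state  = vapply(parts, `[`, character(1), 1),
    length = as.numeric(vapply(parts, `[`, character(1), 2)),
    stringsAsFactors = FALSE
  )
}

# The annotation on the branch leading to tip A in the output above
branch_A <- parse_simmap_branch("{0,0.075:1,0.025}")
print(branch_A)
sum(branch_A$length)  # total length of that branch across both segments
```

Read this way, the branch leading to A is split into two mapped segments (one in state 0, one in state 1), while a branch like B's has a single segment.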
Because simmap 1.1 is limited to one character per node, write.simmap by default uses the first data column, "standard_char" (usestate=1 did not need to be specified). If we wanted to output "isrodent" instead, we could have used:
write.simmap(phyd.ex, usestate=2, file = stdout())
# OR
write.simmap(phyd.ex, usestate="isrodent", file = stdout())
There are two things to notice here. First, the default version (vers=...) that write.simmap uses is 1.1. Second, RBrownie will actually write out NA values if that is what is in your object.
write.simmap.old is a wrapper for write.simmap(...,vers=1.1).
Writing this object to a nexus file (instead of a newick file) is just as easy:
write.nexus.simmap(phyd.ex, vers=1.1, usestate='isrodent', file=stdout() )
If you look at args(write.nexus.simmap), you'll see that the only arguments it accepts are a tree (obj) and a flag indicating whether or not to translate the taxa names (translate). Any other options specified (like vers and usestate) are passed to write.simmap through the ellipsis argument (...).
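If the ellipsis mechanism is unfamiliar, the following base-R sketch shows the general pattern. Both functions are hypothetical stand-ins written for illustration; they are not the RBrownie functions and only mimic the way extra arguments travel from an outer function to an inner one.

```r
# Hypothetical stand-ins, NOT the RBrownie functions: they only mimic the
# argument-forwarding pattern described above.
inner_writer <- function(obj, usestate = 1, vers = 1.1) {
  sprintf("tree=%s usestate=%s vers=%s", obj, usestate, vers)
}

outer_writer <- function(obj, translate = FALSE, ...) {
  # 'translate' is consumed here; everything in '...' goes on to inner_writer
  body <- inner_writer(obj, ...)
  paste0("#NEXUS translate=", translate, " | ", body)
}

# usestate and vers are not formal arguments of outer_writer,
# so they are passed through '...' untouched
forwarded <- outer_writer("mytree", usestate = "isrodent", vers = 1.5)
print(forwarded)
```

One consequence of this design is that a misspelled option name is caught by the inner function, not the outer one, so any error message may mention a function you did not call directly.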
SIMMAP 1.5
The simmap 1.5 format differs from its predecessors in that it supports the output of multiple variables per node and keeps them named. For example:
#write.simmap(phyd.ex, vers=1.5, file=stdout())
# OR:
write.simmap.new(phyd.ex, file=stdout())
Notice how when usestate was not specified, all variables were written out. But we can select only one or the other if we'd like:
write.simmap.new(phyd.ex, usestate="isrodent", file=stdout())
# OR
write.simmap.new(phyd.ex, usestate=2 , file=stdout())
The interface for writing out nexus files is similar to that for version 1.1 as well:
write.nexus.simmap(phyd.ex, vers=1.5, file = stdout())
Finally, let's quickly test to make sure the read and write functions actually work together:
text10 = capture.output(write.simmap(phyd.ex,vers=1.0,file=stdout()))
text11 = capture.output(write.simmap(phyd.ex,vers=1.1,file=stdout()))
text15 = capture.output(write.simmap(phyd.ex,vers=1.5,file=stdout()))
#
newphyd10 = read.simmap(text = text10, vers=1.0)
newphyd11 = read.simmap(text = text11, vers=1.1)
newphyd15 = read.simmap.new(text = text15, specialpatt=".*")
#
# compare:
all( tdata(phyd.ex) == tdata(newphyd15), na.rm=TRUE) # TRUE
all( sndata(phyd.ex) == sndata(newphyd15), na.rm=TRUE) # TRUE
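One caveat with this kind of check: with na.rm=TRUE, any cell that is NA on either side is skipped, so an NA that turned into a value (or the reverse) would go unnoticed. Below is a minimal base-R sketch of a stricter comparison. The helper same_data is hypothetical (not part of RBrownie), and the two data frames are made-up toy data, not the tutorial's actual values.

```r
# Hypothetical helper (NOT part of RBrownie): TRUE only if two data frames
# hold the same values AND have their NAs in the same cells.
same_data <- function(a, b) {
  if (!identical(dim(a), dim(b))) return(FALSE)
  a <- as.matrix(a)
  b <- as.matrix(b)
  same_na <- identical(unname(is.na(a)), unname(is.na(b)))
  same_na && all(a == b, na.rm = TRUE)
}

# Toy data, made up for illustration
orig    <- data.frame(standard_char = c(0, 1, 1), isrodent = c(1, NA, 0))
changed <- data.frame(standard_char = c(0, 1, 1), isrodent = c(1, 0, 0))

loose_check  <- all(orig == changed, na.rm = TRUE)  # NA mismatch is skipped
strict_check <- same_data(orig, changed)            # NA mismatch is caught
self_check   <- same_data(orig, orig)
print(c(loose = loose_check, strict = strict_check, self = self_check))
```

For data frames with mixed column types, as.matrix coerces everything to character, so a per-column comparison would probably be the safer choice there.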
brownie
'RBrownie:::brownie' is an extension of 'RBrownie:::phylo4d_ext', and objects of that class will work with all the functions above. However, there is also a special write function for writing 'brownie' objects to file in a modified nexus format that is directly compatible with the Brownie program.
This function is called writeBrownie. It takes a 'brownie' object and writes TAXA, TREES, CHARACTERS, CHARACTERS2, and ASSUMPTIONS blocks, where the last two blocks are optional. All the trees in the TREES block are written in simmap 1.1 format, as this is the format Brownie currently works with.
Using our tree/data phyd.ex, we first need to convert it to a 'brownie' object:
br = brownie(phyd.ex)
And now we can write it to file:
#writeBrownie(br, file="junk_brownie.nex")
# OR:
#writeBrownie(br, file="junk_brownie.nex", usestate=1)
NOTE: use readBrownie("junk_brownie.nex") to read it back in
writeBrownie writes all data except usestate to either the CHARACTERS or the CHARACTERS2 nexus block, separating the data by data type (continuous or discrete). The usestate data is written only to the TREES portion of the file, in old-style (1.1) simmap. And that's it, the function is pretty simple.
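To picture the continuous/discrete separation, here is a rough base-R sketch on made-up data. How writeBrownie actually decides a column's type is not covered here, so the rule below (factors, characters and logicals are discrete; other columns are continuous) is purely an assumption for illustration.

```r
# Toy data, made up for illustration (not the tutorial's actual values)
toy <- data.frame(
  limbsize = c(1.2, 3.4, 2.2, 0.9),
  isrodent = factor(c("1", "0", "0", "1")),
  row.names = c("A", "B", "C", "D")
)

# ASSUMPTION: factor/character/logical columns count as discrete,
# everything else as continuous. RBrownie's own rule may differ.
is_discrete <- vapply(
  toy,
  function(col) is.factor(col) || is.character(col) || is.logical(col),
  logical(1)
)

discrete_part   <- toy[, is_discrete, drop = FALSE]   # one kind of block
continuous_part <- toy[, !is_discrete, drop = FALSE]  # the other kind of block

names(discrete_part)
names(continuous_part)
```

Using drop = FALSE keeps each part a data frame even when only one column matches, which preserves the taxon row names.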
Conclusion
Again, please post any specific questions or bugs you encounter. Thanks!