The main class object users interact with in toytree is called a ToyTree. This object contains a number of useful functions for interacting with the underlying TreeNode structure (e.g., rooting, dropping tips) and for drawing trees and adding data from the tree (e.g., support values) to the plots. In toytree the tree structure is tightly linked to the data used to build tree drawings, with the goal of making it very difficult for users to accidentally plot tip or node labels in an incorrect order. This section of the tutorial is primarily about how ToyTree objects store data, and how to access it easily using their functions.
In [1]:
import toytree
import toyplot
import numpy as np
In [2]:
# load a tree for this tutorial
tre = toytree.tree("https://eaton-lab.org/data/Cyathophora.tre")
Toytree provides many functions for modifying the tree structure (e.g., rooting a tree, dropping tips), as well as methods for applying styles to specific parts of the tree (e.g., coloring edges differently). Both of these require an easy and reliable method for selecting specific parts of the tree while also minimizing the chance for user error.
In toytree we recommend using tip labels to select the location in the tree that should be manipulated. Why use tip labels instead of node names or indices? Node indices (e.g., idx labels) would be a reasonable alternative, and they are allowed as an option, but they are likely to be more error prone for users. This is because if the tree is modified (e.g., if tips are dropped or the tree is re-rooted) the node indices will change. In contrast, the relationships among tips (i.e., who shares a more recent common ancestor with whom) do not change with any of these tree modifications. Node names are another option, but in most trees internal nodes are not named.
The plot below shows how node idx labels change as the tree is modified. This is why using idx labels as selectors is more error prone.
In [3]:
# store a rooted copy of tre (more on this later...)
rtre = tre.root(['33588_przewalskii', '32082_przewalskii'])
In [4]:
rtre.draw();
In [25]:
# a multitree storing the unrooted and rooted toytrees
mtre = toytree.mtree([tre, rtre])
# plot shows that idx labels change with rerooting
mtre.draw(
node_labels='idx',
node_sizes=15,
);
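The same effect can be reproduced without toytree. The sketch below is a toy illustration, not toytree's implementation: it assumes trees written as nested tuples of tip names and a hypothetical numbering scheme (tips first, then internal nodes in post-order). Dropping one tip changes the index of the (c, d) clade even though the set of tips that defines it stays the same.

```python
# Toy illustration only: nested tuples stand in for a tree, and the
# numbering scheme (tips first, then internal nodes in post-order) is an
# assumption made for this example, not toytree's actual idx scheme.

def tips_of(node):
    """Return the frozenset of tip names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips_of(child) for child in node))


def number_nodes(tree):
    """Map each node (identified by its tip set) to an integer index."""
    leaves, internals = [], []

    def visit(node):
        if isinstance(node, str):
            leaves.append(tips_of(node))
        else:
            for child in node:
                visit(child)
            internals.append(tips_of(node))

    visit(tree)
    return {clade: idx for idx, clade in enumerate(leaves + internals)}


full = (("a", "b"), ("c", "d"))
pruned = ("b", ("c", "d"))  # the same tree after dropping tip "a"

# the clade is identified by its tips, which is stable across both trees
clade_cd = frozenset(["c", "d"])
idx_full = number_nodes(full)[clade_cd]
idx_pruned = number_nodes(pruned)[clade_cd]
print(idx_full, idx_pruned)
```

The tip set is a valid lookup key in both trees, while the integer index is only meaningful for the tree it was computed from.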
Many toytree functions accept several input methods for selecting the list of tip labels that represent a clade. To create the name list without having to type each name out by hand, you can use fuzzy name matching. The three options are: write each name into a list using the names argument; select samples based on a shared unique string sequence in their names with wildcard; or use a regex (regular expression) statement to match samples using more complex name patterns.
In the example below I use the function .get_mrca_idx_from_tip_labels(), which returns the node index of the mrca of the tips entered as arguments. The names, wildcard, and regex arguments each return the correct node idx for the clade that includes the two przewalskii samples (see the figure above) in each tree.
In [26]:
# get an idx label of przewalskii clade using names, wildcard or regex
print('tre: ', tre.get_mrca_idx_from_tip_labels(names=['33588_przewalskii', '32082_przewalskii']))
print('tre: ', tre.get_mrca_idx_from_tip_labels(wildcard="prz"))
print('tre: ', tre.get_mrca_idx_from_tip_labels(regex="[0-9]*_przewalskii"))
# run the same three queries on the rooted tree
print('rtre:', rtre.get_mrca_idx_from_tip_labels(names=['33588_przewalskii', '32082_przewalskii']))
print('rtre:', rtre.get_mrca_idx_from_tip_labels(wildcard="prz"))
print('rtre:', rtre.get_mrca_idx_from_tip_labels(regex="[0-9]*_przewalskii"))
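To make the three selectors concrete, here is a minimal sketch in plain Python. The select_tips function is hypothetical and is not part of toytree. It assumes that wildcard behaves like a substring test and that regex is matched from the start of each name; toytree's internal matching may differ in detail.

```python
import re


def select_tips(tip_labels, names=None, wildcard=None, regex=None):
    """Return the tip labels picked by one of the three selectors.

    Hypothetical helper for illustration; not a toytree function.
    """
    if names is not None:
        wanted = set(names)
        return [tip for tip in tip_labels if tip in wanted]
    if wildcard is not None:
        # assumed behavior: keep names containing the substring
        return [tip for tip in tip_labels if wildcard in tip]
    if regex is not None:
        # assumed behavior: pattern is matched from the start of the name
        pattern = re.compile(regex)
        return [tip for tip in tip_labels if pattern.match(tip)]
    raise ValueError("enter one of names, wildcard, or regex")


tips = ["33588_przewalskii", "32082_przewalskii", "29154_superba", "40578_rex"]

by_names = select_tips(tips, names=["33588_przewalskii", "32082_przewalskii"])
by_wildcard = select_tips(tips, wildcard="prz")
by_regex = select_tips(tips, regex="[0-9]*_przewalskii")

# all three selectors pick the same two samples
print(by_names == by_wildcard == by_regex)
```

A wildcard is the least typing, but it only works when the substring is unique to the samples you want.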
In [27]:
tre.idx_dict[19]
Out[27]:
If you really want to select parts of the tree using nodes, perhaps because the tip names are very hard to match, you can use the get_tip_labels() function to build a list of tip names from a node idx label. If you enter an idx argument to this function it will return a list of the names descended from that node. If no idx argument is entered then the root node idx is used, so that all tip labels are returned.
In [28]:
# get list of tips descended from a specific node in the tree
tre.get_tip_labels(idx=19)
Out[28]:
In [29]:
# get list of all tips in the tree
tre.get_tip_labels()
Out[29]:
The .get_tip_labels() function can be combined with the .get_mrca_idx_from_tip_labels() function to get a list of names that are all descended from a common ancestor. For example, in the rooted tree above, if I wanted to get a list of all tip labels in the ingroup clade I could select just one sample from each of its two subclades with .get_mrca_idx_from_tip_labels() to get the node idx of their common ancestor, and then pass this to .get_tip_labels() to return the full list of descendants. This is an efficient way to build a list of tip label names for a large clade without having to write them all out by hand.
In [30]:
# get node index (idx) of mrca
idx = rtre.get_mrca_idx_from_tip_labels(["29154_superba", "40578_rex"])
# get tip labels descended from node idx
rtre.get_tip_labels(idx=idx)
Out[30]:
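The logic behind this two-step lookup can be sketched in plain Python. This is an illustration under stated assumptions, not toytree code: the tree is a nested tuple, the tip names are made up, and clade_tips is a hypothetical helper that walks down to the smallest clade containing every selected tip and returns all of that clade's tips.

```python
def tips_of(node):
    """Return the set of tip names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips_of(child) for child in node))


def clade_tips(tree, selected):
    """Return sorted tips of the smallest clade holding all selected tips.

    Hypothetical helper; assumes every selected name is a tip of `tree`.
    """
    selected = set(selected)
    if not isinstance(tree, str):
        for child in tree:
            # descend while a single child still contains the whole selection
            if selected <= tips_of(child):
                return clade_tips(child, selected)
    return sorted(tips_of(tree))


# made-up tip names: an outgroup pair and an ingroup with two subclades
tree = (("prz_1", "prz_2"), (("superba_1", "superba_2"), ("rex_1", "rex_2")))

# one sample from each ingroup subclade is enough to recover the whole ingroup
ingroup = clade_tips(tree, ["superba_1", "rex_2"])
print(ingroup)
```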
ToyTrees provide a number of functions for modifying the tree structure. All of these methods return a modified copy of the object; they do not change your original tree by modifying it in place. This is useful because you can reliably chain together multiple tree modification functions (e.g., see Chaining many functions and arguments). As discussed above, it is generally good practice to use tip name selectors to identify clades that should be modified on the tree. In some cases, if you are both modifying a tree and applying plotting styles that rely on the tree structure, it may be easier and clearer to separate the code into multiple function calls. Chaining calls together makes for elegant code, but use whichever method is most comfortable for you. See the Cookbook gallery for more examples.
In [31]:
# three ways to do the same re-rooting
rtre = tre.root(names=["32082_przewalskii", "33588_przewalskii"])
rtre = tre.root(wildcard="prz")
rtre = tre.root(regex="[0-9]*_przewalskii")
# draw the rooted tree
rtre.draw(node_labels='idx', node_sizes=15);
There is also a function .unroot() to remove the root node from trees. This creates a polytomy at the root. Technically there still exists a point on the treenode structure that we refer to as the root, but it does not appear in drawings.
In [32]:
# an unrooted tree
rtre.unroot().draw();
Dropping tips from a tree retains the structure of the remaining nodes in the tree. Here again you can use fuzzy name matching to select the tips you wish to drop from the tree. In this case the names that are selected do not have to form a monophyletic clade. However, if you select every tip in the tree for removal, an error is raised.
In [33]:
rtre.drop_tips(wildcard="cyatho").draw();
In [34]:
# dropping tips unladderized the tree, so re-ladderize it before plotting
rtre.drop_tips(wildcard="cyatho").ladderize().draw();
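What "retains the structure of the remaining nodes" means can be shown with a small sketch. It is a toy version on nested tuples, not toytree's implementation, and the helper names are hypothetical: selected tips are removed, any node left with a single child is collapsed, and removing every tip raises an error.

```python
def _prune(node, unwanted):
    """Return the node without unwanted tips, or None if nothing is left."""
    if isinstance(node, str):
        return None if node in unwanted else node
    kept = [sub for sub in (_prune(child, unwanted) for child in node)
            if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # a node with one remaining child is collapsed into that child
        return kept[0]
    return tuple(kept)


def drop_tips(tree, unwanted):
    """Hypothetical helper: drop tips from a nested-tuple tree."""
    result = _prune(tree, set(unwanted))
    if result is None:
        raise ValueError("cannot drop every tip from the tree")
    return result


tree = (("a", "b"), ("c", "d"))

# the dropped tips need not form a clade: "a" and "c" are not sisters
print(drop_tips(tree, ["a"]))
print(drop_tips(tree, ["a", "c"]))
```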
Rotating nodes of the tree does not affect the actual tree structure (e.g., the newick structure does not change); it simply affects the order of tips when the tree is drawn. You can rotate nodes by entering tip names as in the previous examples using either names, wildcard, or regex. The names must form a monophyletic clade for one of the descendants of the node you wish to rotate. Rotating nodes for plotting is usually done for some aesthetic reason, such as aligning tips better with geography or trait values plotted on the tips of the tree.
In [35]:
rtre.rotate_node(wildcard="prz").draw();
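A small sketch shows why rotation is only cosmetic. It uses nested tuples rather than toytree objects, and the helper names are hypothetical: swapping the children of a node changes the order in which tips are visited, but the set of clades, which is what defines the topology, is identical.

```python
def tip_order(node):
    """Return tip names in the order they would be drawn."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in tip_order(child)]


def clades(node):
    """Return the set of clades (as frozensets of tips) below a node."""
    if isinstance(node, str):
        return set()
    found = {frozenset(tip_order(node))}
    for child in node:
        found |= clades(child)
    return found


tree = (("a", "b"), ("c", "d"))
rotated = tuple(reversed(tree))  # swap the two children of the root

print(tip_order(tree), tip_order(rotated))
print(clades(tree) == clades(rotated))
```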
The .resolve_polytomy() method should generally be used only when needed. The problem is that you usually don't know what to set the branch length to for the new edge when you split a polytomy. If the tree is unrooted then you should use .root() instead to root it. If you have a hard polytomy in the tree and need to resolve it, note that this method will resolve all polytomies in the tree. You can change the default .dist and .support values that are set on the new node.
In [36]:
toytree.tree("((a,b,c),d);").resolve_polytomy(dist=1.).draw();
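To see what resolving a polytomy does to the topology, here is a toy sketch on nested tuples. It is not toytree's algorithm: the choice to group the first two children is arbitrary and made only for illustration, and branch lengths are ignored. Like the method described above, it resolves every polytomy in the tree.

```python
def resolve(node):
    """Return a fully bifurcating copy of a nested-tuple tree.

    Hypothetical helper; which children are grouped is an arbitrary choice.
    """
    if isinstance(node, str):
        return node
    children = [resolve(child) for child in node]
    while len(children) > 2:
        # join the first two children under a new internal node
        children = [(children[0], children[1])] + children[2:]
    return tuple(children)


# the same topology as the newick string "((a,b,c),d);"
tree = (("a", "b", "c"), "d")
print(resolve(tree))
```

The new internal node joining a and b has no branch length supported by the data, which is why the new edge has to be given a default value.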
Because the tree modification calls in toytree always return a copy of the object, you can chain together many of these functions when building a plot. This is especially nice if you are only modifying the tree temporarily for the purpose of plotting (e.g., rotating nodes), and so you don't need to store the intermediate trees. It's kind of analogous to using pipes in bash.
When chaining many function calls and plotting styles together in toytree code it is best to use good coding practices. In the example below I split each function call and style option over a separate line. This makes the code more readable, and easier to debug, since you can comment out a line at a time to examine its effect without it breaking the rest of the command. The parentheses surrounding the main function calls make this possible.
In [37]:
# readable style for writing long draw functions
canvas, axes, mark = (
tre
.root(wildcard="prz")
.drop_tips(wildcard="superba")
.rotate_node(wildcard="30686")
.draw(
tip_labels_align=True,
edge_style={
"stroke": toytree.colors[3],
}
)
)
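Chaining is safe because each call returns a new object and leaves the original untouched. The pattern can be sketched with a small immutable class. TinyTree is a hypothetical stand-in written for this example; it only tracks tip names and an outgroup, and it is not how ToyTree is implemented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TinyTree:
    """Hypothetical stand-in for a tree; every method returns a new copy."""
    tips: tuple
    outgroup: tuple = ()

    def root(self, wildcard):
        # pick the outgroup by substring match; the original is not modified
        picked = tuple(tip for tip in self.tips if wildcard in tip)
        return replace(self, outgroup=picked)

    def drop_tips(self, wildcard):
        kept = tuple(tip for tip in self.tips if wildcard not in tip)
        if not kept:
            raise ValueError("cannot drop every tip from the tree")
        outgroup = tuple(tip for tip in self.outgroup if tip in kept)
        return replace(self, tips=kept, outgroup=outgroup)


tre = TinyTree(
    tips=("33588_przewalskii", "32082_przewalskii", "29154_superba", "40578_rex"),
)

# each step can be commented out without breaking the rest of the chain
result = (
    tre
    .root(wildcard="prz")
    .drop_tips(wildcard="superba")
)

print(result.tips)
print(tre.tips)  # the original still has all four tips
```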
In [38]:
rtre.get_tip_labels() # list of labels in node-plot order
rtre.get_tip_coordinates() # array of tip plot coordinates in idx order
rtre.get_node_values() # list in node-plot order
rtre.get_node_dict() # dict mapping idx:name for each tip
rtre.get_node_coordinates() # array of node plot coordinates in idx order
rtre.get_edge_values() # list of edge values in edge plot order
rtre.get_edge_values_mapped(); # list of edge values with mapped dict in edge plot order
In [39]:
rtre.is_bifurcating() # boolean
rtre.is_rooted(); # boolean
In [40]:
rtre.nnodes # number of nodes in the tree
rtre.ntips # number of tips in the tree
rtre.newick # the newick representation of the tree
rtre.features # list of node features that can be accessed
rtre.style; # dict of plotting style of tree
In [41]:
# if no file handle is entered then the newick string is returned
rtre.write()
Out[41]:
In [42]:
# the tree_format option selects which newick format is written
rtre.write(tree_format=9)
Out[42]:
In [43]:
# write to file
rtre.write("/tmp/mytree.tre", tree_format=0)