The .treenode attribute of ToyTrees gives users direct access to the underlying TreeNode structure. This is where you can traverse the tree and query the parent/child relationships of nodes. While TreeNodes are used extensively within the toytree code, most users will not need to interact with them to do what they most often want from toytree (e.g., drawing). For power users, however, the TreeNode structure provides a lot of additional functionality, especially for scientific computation and research on trees. The TreeNode object in toytree is a modified fork of the TreeNode in ete3, so the very detailed ete documentation is a good resource if you want a deeper understanding of the object.
In [1]:
import toytree
import toyplot
import numpy as np
# generate a random tree
tre = toytree.rtree.unittree(ntips=10, seed=12345)
TreeNode objects are always nested inside of ToyTree objects, and are accessed from ToyTrees. When you use .treenode to access a TreeNode from a ToyTree you are actually accessing the top-level node of the tree structure, the root. The root TreeNode is connected to every other TreeNode in the tree, and together they describe the tree structure.
In [2]:
# the .treenode attribute of the ToyTree returns its root TreeNode
tre.treenode
Out[2]:
In [3]:
# the .idx_dict of a toytree makes TreeNodes accessible by index
tre.idx_dict
Out[3]:
To traverse a tree means to move from node to node until every node of the tree has been visited. In this case, we move from TreeNode to TreeNode. Depending on your reason for traversing the tree, the order in which nodes are visited may be arbitrary, or it may be very important. For example, if you wish to calculate some new value on a node that depends on the values of its children, then you will want to visit the child nodes before you visit their parents. TreeNodes can be traversed in three ways: levelorder, preorder, and postorder. Below I print the order in which nodes are visited for each. The drawing shows the node index (idx) labels that toytree uses to order nodes for plotting.
In [4]:
print('levelorder:', [node.idx for node in tre.treenode.traverse("levelorder")])
print('preorder: ', [node.idx for node in tre.treenode.traverse("preorder")])
print('postorder: ', [node.idx for node in tre.treenode.traverse("postorder")])
tre.draw(node_labels=True, node_sizes=16);
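To make the three visiting orders concrete without relying on toytree itself, here is a minimal standalone sketch. The `Node` class and the three generator functions are hypothetical stand-ins written for illustration; they are not toytree's or ete3's implementation, only one common way to define these orders.

```python
from collections import deque


class Node:
    """Hypothetical minimal node; not toytree's TreeNode class."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def levelorder(root):
    """Visit nodes breadth-first: the root, then its children, and so on."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)


def preorder(root):
    """Visit each parent before its children."""
    yield root
    for child in root.children:
        yield from preorder(child)


def postorder(root):
    """Visit all children before their parent."""
    for child in root.children:
        yield from postorder(child)
    yield root


# a small example tree: ((a, b)ab, c)root
tree = Node("root", [Node("ab", [Node("a"), Node("b")]), Node("c")])

orders = {
    "levelorder": [n.name for n in levelorder(tree)],
    "preorder": [n.name for n in preorder(tree)],
    "postorder": [n.name for n in postorder(tree)],
}
for label, names in orders.items():
    print(f"{label:<11}", names)
```

In the postorder result every child appears before its parent, which is the property you need when a node's value depends on the values of its children.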
TreeNodes have a large number of attributes and functions available to them which you can explore using tab-completion in a notebook and from the ete3 tutorial. In general, only advanced users will need to access attributes of the TreeNodes directly. For example, it is easier to access node idx and name labels from ToyTrees than from TreeNodes, since ToyTrees will return the values in the order they will be plotted.
In [5]:
# traverse the tree and access node attributes
for node in tre.treenode.traverse(strategy="levelorder"):
    print("{:<5} {:<5} {:<5} {:<5}".format(
        node.idx, node.name, node.is_leaf(), node.is_root()
        )
    )
For the purposes of plotting, there are cases where accessing TreeNode attributes can be particularly powerful, for example when you want to build a list of values for plotting that are based on the tree structure itself (number of children, edge length, is_root, etc.). You can traverse the tree and calculate these attributes for each node.
When doing so, I recommend a practice that is intended to help users avoid accidentally plotting values in the wrong order: add new features to the TreeNodes by traversing the tree, but then retrieve and plot the features using the ToyTree, since ToyTrees are the objects that organize the coordinates for plotting.
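The following standalone sketch illustrates why this matters. It uses a hypothetical `Node` class (not toytree's TreeNode) and assumes, purely for illustration, that the plotting code expects values in ascending idx order; the ordering that toytree actually uses may differ. The point is only that a list collected in traversal order is generally not in the same order as a list organized by node index.

```python
class Node:
    """Hypothetical minimal node with an idx label and a feature dict."""

    def __init__(self, idx, children=()):
        self.idx = idx
        self.children = list(children)
        self.features = {}


def preorder(root):
    """Visit each parent before its children."""
    yield root
    for child in root.children:
        yield from preorder(child)


# tips are 0, 1, 2; the internal node is 3; the root is 4
tree = Node(4, [Node(3, [Node(0), Node(1)]), Node(2)])

# step 1: traverse the tree and store a new feature on every node
for node in preorder(tree):
    node.features["nchildren"] = len(node.children)

# step 2 (risky): collect values in whatever order the traversal yields
traversal_values = [n.features["nchildren"] for n in preorder(tree)]

# step 2 (safer): collect values in one fixed order keyed by idx
# (ascending idx is an assumption made for this sketch)
by_idx = {n.idx: n for n in preorder(tree)}
values = [by_idx[i].features["nchildren"] for i in sorted(by_idx)]

print("traversal order:", traversal_values)
print("idx order:      ", values)
```

The two lists hold the same numbers in different positions, so passing the first one to a function that expects the second would attach values to the wrong nodes.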
In [6]:
# see available features on a ToyTree
tre.features
Out[6]:
Let's say we want to plot a value on each node of a toytree. You can use the toytree function .set_node_values() to set a value on each node. It takes the feature name, a dictionary mapping idx labels to values, and optionally a default value that is assigned to all other nodes. You can modify existing features or set new features.
In [18]:
# set a new name on a few nodes
tre = tre.set_node_values(
    feature="name",
    values={0: 'tip-0', 1: 'tip-1', 2: 'tip-2'},
)
In [19]:
# set a feature on every node to a random integer from 1 to 4
# (the upper bound of np.random.randint is exclusive)
tre = tre.set_node_values(
    feature="randomint",
    values={idx: np.random.randint(1, 5) for idx in tre.idx_dict},
)
Other potentially useful 'features' are statistics about the tree. For example, we may want to count the extant descendants (tips) of each node on a tree. Such things can be measured directly from TreeNode objects. Below I use get_leaves() as an example. See the ete3 docs for more information on TreeNode functions and attributes.
In [20]:
# set a feature on every node storing its number of descendant tips
tre = tre.set_node_values(
    feature="ndesc",
    values={
        idx: len(node.get_leaves())
        for (idx, node) in tre.idx_dict.items()
    }
)
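As a side note on how such a statistic can be computed, the sketch below counts the leaves under every node in a single pass by visiting children before parents, so each node's count is just the sum of its children's counts. The `Node` class and `count_leaves` function are hypothetical and written for illustration; this is not how toytree or ete3 implement get_leaves(), and here a leaf is assumed to count itself as one.

```python
class Node:
    """Hypothetical minimal node; not toytree's TreeNode class."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def count_leaves(root):
    """Return {node name: number of leaves at or below that node}."""
    counts = {}

    def visit(node):
        if not node.children:
            # assumption for this sketch: a leaf counts itself
            counts[node.name] = 1
        else:
            # children are visited first, then summed for the parent
            counts[node.name] = sum(visit(child) for child in node.children)
        return counts[node.name]

    visit(root)
    return counts


# a small example tree: ((a, b)ab, c)root
tree = Node("root", [Node("ab", [Node("a"), Node("b")]), Node("c")])
leaf_counts = count_leaves(tree)
print(leaf_counts)
```

For small trees, calling a leaf-collecting function on every node is perfectly fine; the single-pass version simply avoids re-walking the same subtrees.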
The set_node_values() function of toytrees operates similarly to the loop below, which visits each TreeNode of the tree and adds a feature. The .traverse() function of TreeNodes is convenient for accessing all nodes.
In [21]:
# add a new feature to every node
for node in tre.treenode.traverse():
    node.add_feature("ndesc", len(node.get_leaves()))
Note: Use caution when modifying features of TreeNode objects, because you can easily corrupt the data that toytree needs in order to correctly plot trees and orient nodes and tips. This is why interacting with TreeNode objects directly should be considered an advanced method for toytree users. In contrast to ToyTree functions, which do not modify the tree structure in place but instead return a copy, modifications to TreeNodes do occur in place and therefore affect the current tree. Be aware that if you modify the parent/child relationships of a TreeNode, it will change the tree. Similarly, if you change the .dist or .idx values of nodes, it will affect the edge lengths and the order in which nodes are plotted.
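The difference between the two behaviors can be sketched in plain Python. The `Node` class, `walk`, and `set_dist_copy` below are hypothetical helpers written for this illustration and are not part of toytree; the sketch only contrasts a function that returns a modified copy with a direct attribute assignment that changes the original object.

```python
import copy


class Node:
    """Hypothetical minimal node with an edge length (dist)."""

    def __init__(self, name, dist=1.0, children=()):
        self.name = name
        self.dist = dist
        self.children = list(children)


def walk(root):
    """Yield every node in the tree, parents before children."""
    yield root
    for child in root.children:
        yield from walk(child)


def set_dist_copy(root, name, dist):
    """Return a modified deep copy; the tree passed in is left untouched."""
    new_root = copy.deepcopy(root)
    for node in walk(new_root):
        if node.name == name:
            node.dist = dist
    return new_root


original = Node("root", 0.0, [Node("a", 1.0), Node("b", 1.0)])

# copy-style modification: 'original' keeps its old edge length for 'a'
modified = set_dist_copy(original, "a", 5.0)

# in-place modification: this changes 'original' itself
original.children[1].dist = 2.0

print([n.dist for n in walk(original)])
print([n.dist for n in walk(modified)])
```

After running this, the copy carries only the change made through `set_dist_copy`, while the direct assignment is visible only on the original tree.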
The recommended workflow for adding features to TreeNodes and including them in toytree drawings is to use ToyTrees to retrieve the features, since ToyTrees ensure the correct order.
When you add a new feature to TreeNodes it can then be accessed by ToyTrees just like the default features: "height", "idx", "name", etc. You can use .get_node_values() to retrieve the values in the proper order, and to censor values for the root or tips if wanted. This also allows you to build color mappings based on these values, calculate further statistics, etc.
In [22]:
# ndesc is now an available feature alongside the defaults
tre.features
Out[22]:
In [23]:
# it can be accessed from the ToyTree object using .get_node_values()
tre.get_node_values('ndesc', True, True)
Out[23]:
In [24]:
# as a shortcut, the feature name can also be passed directly to 'node_labels'
tre.draw(node_labels=("ndesc", 1, 0), node_sizes=15);
Here is another example where color values are stored on TreeNodes, retrieved from the ToyTree, and then used as a draw argument to color nodes based on their TreeNode attribute. The nodes are colored based on whether .is_leaf() returned True or False for the TreeNode. We use the default color palette of toytree, accessed from toytree.colors.
In [25]:
# traverse the tree and modify nodes (add new 'color' feature)
for node in tre.treenode.traverse():
    if node.is_leaf():
        node.add_feature('color', toytree.colors[1])
    else:
        node.add_feature('color', toytree.colors[2])
# store color list with values for tips and root
colors = tre.get_node_values('color', show_root=1, show_tips=1)
# draw tree with node colors
tre.draw(node_labels=False, node_colors=colors, node_sizes=15);
Keep in mind that for many of the attributes you may wish to plot on the nodes of a tree or use for color mapping, such as support values or names, you likely will not need to add features to the tree, since these features are already available by default. In that case you can get far using just the get_node_values() function of ToyTrees.