TreeNode objects

The .treenode attribute of a ToyTree gives direct access to the underlying TreeNode structure. This is where you can traverse the tree and query the parent/child relationships of nodes. TreeNodes are used extensively within toytree's own code, but most users will not need to interact with them to do the things they typically want toytree for (e.g., drawing). For power users, however, the TreeNode structure provides a lot of additional functionality, especially for scientific computation and research on trees. The TreeNode object in toytree is a modified fork of the TreeNode in ete3, so the detailed ete3 documentation is a good place to look for an in-depth understanding of the object.


In [1]:
import toytree
import toyplot
import numpy as np

# generate a random tree
tre = toytree.rtree.unittree(ntips=10, seed=12345)

TreeNode objects are always nested inside of ToyTree objects, and accessed from ToyTrees. When you use .treenode to access a TreeNode from a ToyTree you are actually accessing the top level node of the tree structure, the root. The root TreeNode is connected to every other TreeNode in the tree, and together they describe the tree structure.


In [2]:
# the .treenode attribute of the ToyTree returns its root TreeNode
tre.treenode


Out[2]:
<toytree.TreeNode.TreeNode at 0x7f2b1f711978>

In [3]:
# the .idx_dict of a toytree makes TreeNodes accessible by index
tre.idx_dict


Out[3]:
{18: <toytree.TreeNode.TreeNode at 0x7f2b1f711978>,
 17: <toytree.TreeNode.TreeNode at 0x7f2b1f7e14e0>,
 16: <toytree.TreeNode.TreeNode at 0x7f2b2fc31198>,
 15: <toytree.TreeNode.TreeNode at 0x7f2b1f71a710>,
 14: <toytree.TreeNode.TreeNode at 0x7f2b1f71a278>,
 13: <toytree.TreeNode.TreeNode at 0x7f2b1f71ada0>,
 12: <toytree.TreeNode.TreeNode at 0x7f2b2fc26ac8>,
 11: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7828>,
 10: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7898>,
 9: <toytree.TreeNode.TreeNode at 0x7f2b2fc2dc18>,
 8: <toytree.TreeNode.TreeNode at 0x7f2b1f71ae48>,
 7: <toytree.TreeNode.TreeNode at 0x7f2b2fbc78d0>,
 6: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7908>,
 5: <toytree.TreeNode.TreeNode at 0x7f2b2fbc77f0>,
 4: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7940>,
 3: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7978>,
 2: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7860>,
 1: <toytree.TreeNode.TreeNode at 0x7f2b2fbc79b0>,
 0: <toytree.TreeNode.TreeNode at 0x7f2b2fbc79e8>}

Traversing TreeNodes

To traverse a tree means to move from node to node until every node of the tree has been visited. In this case, we move from TreeNode to TreeNode. Depending on your reason for traversing the tree, the order in which nodes are visited may be unimportant, or it may matter a great deal. For example, if you wish to calculate some new value on a node that depends on the values of its children, then you will want to visit the child nodes before you visit their parents. TreeNodes can be traversed in three ways: "levelorder", "preorder", and "postorder". Below I print the order in which nodes are visited for each. The drawing shows the node idx labels, which toytree uses to order nodes for plotting.


In [4]:
print('levelorder:', [node.idx for node in tre.treenode.traverse("levelorder")])
print('preorder:  ', [node.idx for node in tre.treenode.traverse("preorder")])
print('postorder: ', [node.idx for node in tre.treenode.traverse("postorder")])
tre.draw(node_labels=True, node_sizes=16);


levelorder: [18, 17, 16, 9, 15, 14, 13, 8, 12, 5, 11, 2, 10, 7, 6, 4, 3, 1, 0]
preorder:   [18, 17, 9, 15, 8, 12, 7, 6, 16, 14, 5, 11, 4, 3, 13, 2, 10, 1, 0]
postorder:  [9, 8, 7, 6, 12, 15, 17, 5, 4, 3, 11, 14, 2, 1, 0, 10, 13, 16, 18]
[Figure: tree drawing with node idx labels 0 to 18 on the nodes and tip names r0 to r9]
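
To make the three orders concrete, here is a minimal sketch in plain Python. The Node class and the three generator functions are hypothetical stand-ins written for illustration. They are not toytree's or ete3's implementation, only the textbook definitions of each traversal order.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for a tree node (not toytree's TreeNode)."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def levelorder(root):
    # visit nodes one level at a time, starting at the root
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)


def preorder(root):
    # visit a node before any of its descendants
    yield root
    for child in root.children:
        yield from preorder(child)


def postorder(root):
    # visit all descendants of a node before the node itself
    for child in root.children:
        yield from postorder(child)
    yield root


# a small tree: root -> (A -> (a1, a2), B)
tree = Node("root", [Node("A", [Node("a1"), Node("a2")]), Node("B")])

orders = {
    "levelorder": [n.name for n in levelorder(tree)],
    "preorder": [n.name for n in preorder(tree)],
    "postorder": [n.name for n in postorder(tree)],
}
for label, names in orders.items():
    print(label, names)
```

Note that postorder is the only one of the three that always reaches children before their parent, which is the order you want when a node's value depends on the values of its children.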

TreeNodes have a large number of attributes and functions available to them which you can explore using tab-completion in a notebook and from the ete3 tutorial. In general, only advanced users will need to access attributes of the TreeNodes directly. For example, it is easier to access node idx and name labels from ToyTrees than from TreeNodes, since ToyTrees will return the values in the order they will be plotted.


In [5]:
# traverse the tree and access node attributes
for node in tre.treenode.traverse(strategy="levelorder"):
    print("{:<5} {:<5} {:<5} {:<5}".format(
        node.idx, node.name, node.is_leaf(), node.is_root()
        )
    )


18    18    0     1    
17    17    0     0    
16    16    0     0    
9     r9    1     0    
15    15    0     0    
14    14    0     0    
13    13    0     0    
8     r8    1     0    
12    12    0     0    
5     r5    1     0    
11    11    0     0    
2     r2    1     0    
10    10    0     0    
7     r7    1     0    
6     r6    1     0    
4     r4    1     0    
3     r3    1     0    
1     r1    1     0    
0     r0    1     0    

Adding features to TreeNodes

For the purposes of plotting, there are cases where accessing TreeNode attributes can be particularly powerful, for example when you want to build a list of values for plotting that are based on the tree structure itself (number of children, edge length, is_root, etc.). You can traverse the tree and calculate these attributes for each node.

When doing so, I recommend a best practice that is, once again, intended to help users avoid accidentally plotting values in an incorrect order: add new features to the TreeNodes by traversing the tree, but then retrieve and plot the features using the ToyTree, since ToyTrees are the objects that organize the coordinates for plotting.


In [6]:
# see available features on a ToyTree 
tre.features


Out[6]:
{'dist', 'height', 'idx', 'name', 'support'}

Let's say we want to plot a value on each node of a toytree. You can use the ToyTree function .set_node_values() to set a value on each node. It takes the feature name, a dictionary mapping node idx labels to values, and optionally a default value that is assigned to all other nodes. You can modify existing features or set new ones.


In [18]:
# set a new name on a few nodes (this modifies the existing 'name' feature)
tre = tre.set_node_values(
    feature="name", 
    values={0: 'tip-0', 1: 'tip-1', 2: 'tip-2'},
)

In [19]:
# set a new feature on every node: a random integer from 1 to 4
# (the upper bound of np.random.randint is exclusive)
tre = tre.set_node_values(
    feature="randomint", 
    values={idx: np.random.randint(1, 5) for idx in tre.idx_dict},
)

Another potentially useful 'feature' to access includes statistics about the tree. For example, we may want to measure the number of extant descendants of each node on a tree. Such things can be measured directly from TreeNode objects. Below I use get_leaves() as an example. You can see the ete3 docs for more info on TreeNode functions and attributes.


In [20]:
# set a feature on every node for its number of descendant tips
tre = tre.set_node_values(
    feature="ndesc", 
    values={
        idx: len(node.get_leaves())
        for (idx, node) in tre.idx_dict.items()
    }
)

The set_node_values() function of toytrees operates similarly to the loop below which visits each TreeNode of the tree and adds a feature. The .traverse() function of treenodes is convenient for accessing all nodes.


In [21]:
# add a new feature to every node
for node in tre.treenode.traverse():
    node.add_feature("ndesc", len(node.get_leaves()))
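
When a value on a node depends on the values of its children, a postorder traversal lets you compute it in a single pass. The sketch below counts descendant tips this way on a small hypothetical Node class (a stand-in, not toytree's TreeNode). With real TreeNodes, the loop above using get_leaves() should give the same counts.

```python
class Node:
    """Hypothetical stand-in for a tree node (not toytree's TreeNode)."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.ndesc = None  # number of descendant tips, filled in below


def postorder(root):
    # yield every child before its parent
    for child in root.children:
        yield from postorder(child)
    yield root


def count_tips(root):
    # children are visited first, so their counts already exist
    # by the time the parent is reached
    for node in postorder(root):
        if not node.children:
            node.ndesc = 1  # a tip counts itself
        else:
            node.ndesc = sum(child.ndesc for child in node.children)
    return root.ndesc


# a small tree: root -> (A -> (a1, a2), B)
tree = Node("root", [Node("A", [Node("a1"), Node("a2")]), Node("B")])
total = count_tips(tree)
counts = {node.name: node.ndesc for node in postorder(tree)}
print(total, counts)
```

Each node is visited once here, whereas collecting the leaves separately for every node walks the same subtrees repeatedly. On small trees the difference is negligible.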

Modifying features of TreeNodes

Note: Use caution when modifying features of TreeNode objects, because you can easily corrupt the data that toytree needs to correctly plot trees and to orient nodes and tips. This is why interacting with TreeNode objects directly should be considered an advanced method for toytree users. In contrast to ToyTree functions, which do not modify the tree structure in place but instead return a copy, modifications to TreeNodes occur in place and therefore affect the current tree. Be aware that if you modify the parent/child relationships of a TreeNode, you change the tree. Similarly, if you change the .dist or .idx values of nodes, you will affect the edge lengths and the order in which nodes are plotted.
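
The difference between returning a copy and modifying in place can be illustrated with a small sketch. The Node class here is a hypothetical stand-in, and copy.deepcopy is used only to mimic "return a modified copy" behavior. I am not claiming this is how toytree copies trees internally.

```python
import copy


class Node:
    """Hypothetical stand-in for a tree node (not toytree's TreeNode)."""

    def __init__(self, name, dist=1.0, children=None):
        self.name = name
        self.dist = dist
        self.children = children or []


original = Node("root", 0.0, [Node("a", 1.0), Node("b", 2.0)])

# copy-style edit: change a deep copy and leave the original untouched
edited = copy.deepcopy(original)
edited.children[0].dist = 5.0

# in-place edit: anything holding a reference to 'original' sees the change
alias = original
original.children[1].dist = 10.0

print(original.children[0].dist, edited.children[0].dist)
print(alias.children[1].dist, edited.children[1].dist)
```

The in-place edit is visible through every reference to the same node objects, which is why a stray assignment on a TreeNode can silently change later drawings of the tree it belongs to.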

Accessing features from ToyTrees

The recommended workflow for adding features to TreeNodes and including them in toytree drawings is to use ToyTrees to retrieve the features, since ToyTrees ensure the correct order. When you add a new feature to TreeNodes it can then be accessed from the ToyTree just like the default features: "height", "idx", "name", etc. You can use .get_node_values() to retrieve the values in the proper order, and to censor the values for the root or tips if wanted. This also allows you to build color mappings from these values, calculate further statistics, etc.


In [22]:
# ndesc is now an available feature alongside the defaults
tre.features


Out[22]:
{'dist', 'height', 'idx', 'name', 'ndesc', 'randomint', 'support'}

In [23]:
# it can be accessed from the ToyTree object using .get_node_values()
tre.get_node_values('ndesc', show_root=True, show_tips=True)


Out[23]:
array([10,  4,  6,  3,  3,  3,  2,  2,  2,  1,  1,  1,  1,  1,  1,  1,  1,
        1,  1])
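
In the output above the values run from the highest idx (the root, 18) down to idx 0, which is a different order from any of the traversals printed earlier. The sketch below reuses the numbers shown in this notebook to make the difference explicit. It is plain Python on copied values, not a toytree call, and the idx-descending reading of Out[23] is my inference from the outputs shown.

```python
# levelorder visiting order of node idx labels, copied from the traversal output
levelorder = [18, 17, 16, 9, 15, 14, 13, 8, 12, 5, 11, 2, 10, 7, 6, 4, 3, 1, 0]

# 'ndesc' values copied from Out[23]; assumed to be ordered
# from idx 18 (the root) down to idx 0
by_idx_desc = [10, 4, 6, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

# map each idx label to its value
ndesc = dict(zip(range(18, -1, -1), by_idx_desc))

# the same values collected in levelorder traversal order instead
in_levelorder = [ndesc[idx] for idx in levelorder]

print(in_levelorder)
print(in_levelorder == by_idx_desc)
```

The two lists hold the same values in different positions, so a list built during a traversal and passed straight to a drawing would put some values on the wrong nodes. Retrieving values through the ToyTree avoids having to track this yourself.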

In [24]:
# or pass the feature name to 'node_labels' as a shortcut; here the
# root value is shown and the tip values are hidden
tre.draw(node_labels=("ndesc", 1, 0), node_sizes=15);


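[Figure: tree drawing with ndesc values (2, 2, 2, 3, 3, 3, 6, 4, 10) labeled on the internal nodes and root; tip names tip-0, tip-1, tip-2, r3 to r9]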

Here is another example in which color values are stored on TreeNodes, retrieved from the ToyTree, and then passed as a draw argument to color the nodes. The nodes are colored according to whether .is_leaf() returns True or False for each TreeNode. We use toytree's default color palette, accessed from toytree.colors.


In [25]:
# traverse the tree and modify nodes (add new 'color' feature)
for node in tre.treenode.traverse():
    if node.is_leaf():
        node.add_feature('color', toytree.colors[1])
    else:
        node.add_feature('color', toytree.colors[2])

# store color list with values for tips and root
colors = tre.get_node_values('color', show_root=1, show_tips=1)

# draw tree with node colors
tre.draw(node_labels=False, node_colors=colors, node_sizes=15);


[Figure: tree drawing with nodes colored by whether they are tips; tip names tip-0, tip-1, tip-2, r3 to r9]

Keep in mind that many of the attributes you may wish to plot on the nodes of a tree or use for color mapping, such as support values or names, are already available as default features, so you likely will not need to add features to the tree. In that case you can get far using just the get_node_values() function of ToyTrees.