Toytree is a Python tree plotting library designed for use inside Jupyter notebooks. In fact, this entire tutorial was created using notebooks, and assumes that you are following along in a notebook of your own. To begin, we will import toytree, and the plotting library it is built on, toyplot, as well as numpy for generating some numerical data.
In [1]:
import toytree # a tree plotting library
import toyplot # a general plotting library
import numpy as np # numerical library
In [2]:
print(toytree.__version__)
print(toyplot.__version__)
print(np.__version__)
The main class object in toytree is the ToyTree, which provides plotting functionality in addition to a number of useful functions and attributes for returning values and statistics about trees. As we'll see below, you can generate a ToyTree object in many ways, but generally it is done by reading in a newick-formatted string of text. The example below shows the simplest way to load a ToyTree, which is to use the toytree.tree() convenience function to parse a file, URL, or string.
In [3]:
# load a toytree from a newick string at a URL
tre = toytree.tree("https://eaton-lab.org/data/Cyathophora.tre")
In [4]:
# root and draw the tree (more details on this coming up...)
rtre = tre.root(wildcard="prz")
rtre.draw(tip_labels_align=True);
ToyTrees can be flexibly loaded from a range of text formats. Below are two newick strings in different tree_formats. The first has edge lengths and support values; the second has edge lengths and node labels. These are two different ways of writing tree data in a serialized format. Format 0 expects the internal node values to be integers or floats that represent support values, whereas format 1 expects internal node values to be strings used as node labels.
In [5]:
# newick with edge-lengths & support values
newick = "((a:1,b:1)90:3,(c:3,(d:1, e:1)100:2)100:1)100;"
tre0 = toytree.tree(newick, tree_format=0)
# newick with edge-lengths & string node-labels
newick = "((a:1,b:1)A:3,(c:3,(d:1, e:1)B:2)C:1)root;"
tre1 = toytree.tree(newick, tree_format=1)
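To make the difference between the two formats concrete, here is a minimal standard-library sketch that pulls out the labels written directly after each closing parenthesis and interprets them either as numbers or as names. It is only an illustration of the idea, not how toytree parses newick; the helper names `internal_labels` and `parse_label` are made up for this example.

```python
import re

def internal_labels(newick):
    """Return the labels that directly follow each closing parenthesis."""
    # a label runs until the next ':', ',', '(', ')' or ';'
    return re.findall(r"\)([^:,();]*)", newick)

def parse_label(label, tree_format):
    """Interpret an internal label as a support value (0) or a name (1)."""
    if tree_format == 0:
        return float(label)  # raises ValueError for non-numeric labels
    return label

newick0 = "((a:1,b:1)90:3,(c:3,(d:1, e:1)100:2)100:1)100;"
newick1 = "((a:1,b:1)A:3,(c:3,(d:1, e:1)B:2)C:1)root;"

# labels appear in the order the clades are closed in the string
supports = [parse_label(x, 0) for x in internal_labels(newick0)]
names = [parse_label(x, 1) for x in internal_labels(newick1)]
print(supports)
print(names)
```

Trying to read the second string as format 0 in this sketch would raise a ValueError, because "A" cannot be converted to a number.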
To parse either format you can tell toytree the format of the newick string, following the tree parsing formats in ete. The default option, and the most common format, is 0. If you don't enter a tree_format argument, the default format will usually parse it just fine. Toytree can also parse extended newick format (NHX) files, which store additional metadata, as well as mrbayes formatted files (tree_format=10), which are a variant of NHX. Any of these formats can be parsed from a NEXUS file automatically.
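As a rough illustration of what the NHX metadata looks like, the sketch below uses a regular expression to collect the key=value pairs from each [&&NHX:...] comment in a string. This is an assumption-laden toy (the function name `nhx_comments` is hypothetical, and it expects every item to contain an "="); toytree handles this parsing for you.

```python
import re

nhx = (
    "((a:3[&&NHX:name=a:support=100],b:2[&&NHX:name=b:support=100])"
    ":4[&&NHX:name=ab:support=60],c:5[&&NHX:name=c:support=100]);"
)

def nhx_comments(text):
    """Return one {key: value} dict per [&&NHX:...] comment, in string order."""
    records = []
    for body in re.findall(r"\[&&NHX:([^\]]*)\]", text):
        # items are separated by ':' and each item is 'key=value'
        pairs = (item.split("=", 1) for item in body.split(":"))
        records.append({key: value for key, value in pairs})
    return records

records = nhx_comments(nhx)
for record in records:
    print(record)
```

All values come back as strings here; converting "support" to a number is left to the caller.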
In [6]:
# parse an NHX format string with node supports and names
nhx = "((a:3[&&NHX:name=a:support=100],b:2[&&NHX:name=b:support=100]):4[&&NHX:name=ab:support=60],c:5[&&NHX:name=c:support=100]);"
ntre = toytree.tree(nhx)
# parse a mrbayes format file with NHX-like node and edge info
mb = "((a[&prob=100]:0.1[&length=0.1],b[&prob=100]:0.2[&length=0.2])[&prob=90]:0.4[&length=0.4],c[&prob=100]:0.6[&length=0.6]);"
mtre = toytree.tree(mb, tree_format=10)
# parse a NEXUS formatted file containing a tree of any supported format
nex = """
#NEXUS
begin trees;
translate;
1 apple,
2 blueberry,
3 cantaloupe,
4 durian,
;
tree tree0 = [&U] ((1,2),(3,4));
end;
"""
xtre = toytree.tree(nex)
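The translate block in a NEXUS file maps short tokens to full tip names. The following sketch shows that substitution on a bare newick string using only the standard library; the `apply_translate` helper is hypothetical and is not part of toytree, which performs the translation automatically when it reads the file.

```python
import re

# the same mapping as the translate block in the NEXUS example
translate = {"1": "apple", "2": "blueberry", "3": "cantaloupe", "4": "durian"}
newick = "((1,2),(3,4));"

def apply_translate(newick, table):
    """Replace each tip token found in the table with its full name."""
    # tip names follow '(' or ',' and run until the next delimiter
    pattern = re.compile(r"(?<=[(,])([^(),:;]+)")
    return pattern.sub(lambda m: table.get(m.group(1), m.group(1)), newick)

named = apply_translate(newick, translate)
print(named)
```

Tokens that are missing from the table are left unchanged, so a partially translated tree still round-trips.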
You can use tab-completion by typing the name of the tree variable (e.g., rtre below) followed by a dot and then pressing <tab> to see the many attributes of ToyTrees. Below I print a few of them as examples.
In [7]:
rtre.ntips
Out[7]:
In [8]:
rtre.nnodes
Out[8]:
In [9]:
tre.is_rooted(), rtre.is_rooted()
Out[9]:
In [10]:
rtre.get_tip_labels()
Out[10]:
In [11]:
rtre.get_edges()
Out[11]:
The main class objects in toytree exist as a nested hierarchy. The core of any tree is the TreeNode object, which stores the tree structure in memory and allows fast traversal over nodes of the tree to describe its structure. This object is wrapped inside of ToyTree objects, which provide convenient access to TreeNodes while also providing plotting and tree modification functions. And multiple ToyTrees can be grouped together into MultiTree objects, which are useful for iterating over multiple trees, or for generating plots that overlay and compare trees.
The underlying TreeNode object of ToyTrees will be familiar to users of the ete3 Python library, since it is essentially a stripped-down fork of their TreeNode class. This is useful since ete has great documentation. You can access the TreeNode of any ToyTree using its .treenode attribute, like below. Beginner toytree users are unlikely to need to access TreeNode objects directly, and instead will mostly access the tree structure through ToyTree objects.
In [12]:
# a TreeNode object is contained within every ToyTree at .treenode
tre.treenode
Out[12]:
In [13]:
# a ToyTree object
toytree.tree("((a, b), c);")
Out[13]:
In [14]:
# a MultiTree object
toytree.mtree([tre, tre, tre])
Out[14]:
When you call .draw() on a tree it returns three objects: a Canvas, a Cartesian axes object, and a Mark. This follows the design principle of the toyplot plotting library on which toytree is based. The Canvas describes the plot space, and the Cartesian coordinates define how to project points onto that space. One canvas can have multiple cartesian coordinates, and each cartesian object can have multiple Marks. This will be demonstrated more later.
As you will see below, I end many toytree drawing commands with a semicolon (;). This simply hides the printed return value showing which objects were returned. The Canvas will automatically render in the cell below the plot even if you do not save the returned Canvas as a variable. Below I do not use a semicolon, and so the three returned objects are shown as text (e.g., <toyplot.canvas.Canvas...>), and the plot is displayed.
In [15]:
rtre.draw()
Out[15]:
In [16]:
# the semicolon hides the returned text of the Canvas and Cartesian objects
rtre.draw();
In [17]:
# or, we can store them as variables (this allows more editing on them later)
canvas, axes, mark = rtre.draw()
There are innumerable ways to style ToyTree drawings. We provide a number of pre-built tree_styles (normal, dark, coalescent, multitree), but users can also create their own style dictionaries that can be easily reused. Below are some examples. You can use tab-completion within the draw function to see the docstring for more details on available arguments to toggle, or you can see which styles are available on ToyTrees by accessing their .style dictionary. See the Styling chapter for more details.
In [18]:
# drawing with pre-built tree_styles
rtre.draw(tree_style='n'); # normal-style
rtre.draw(tree_style='d'); # dark-style
# 'ts' is also a shortcut for tree_style
rtre.draw(ts='o'); # umlaut-style
In [19]:
# define a style dictionary
mystyle = {
"layout": 'd',
"edge_type": 'p',
"edge_style": {
"stroke": toytree.colors[2],
"stroke-width": 2.5,
},
"tip_labels_align": True,
"tip_labels_colors": toytree.colors[0],
"tip_labels_style": {
"font-size": "10px"
},
"node_labels": False,
"node_sizes": 8,
"node_colors": toytree.colors[2],
}
In [20]:
# use your custom style dictionary in one or more tree drawings
rtre.draw(height=400, **mystyle);
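Reusing a style dictionary relies on ordinary Python keyword unpacking. The sketch below shows the mechanics with a stand-in function (`draw_stub` is hypothetical and only echoes its arguments; it is not a toytree function), including one way to derive a variant style without modifying the original dictionary.

```python
base_style = {
    "tip_labels_align": True,
    "tip_labels_style": {"font-size": "10px"},
    "node_sizes": 8,
}

def draw_stub(**kwargs):
    """Hypothetical stand-in for a drawing call; returns what it received."""
    return kwargs

# unpack the dictionary as keyword arguments, plus one extra argument
received = draw_stub(height=400, **base_style)

# build a variant; later keys override earlier ones, the original is untouched
big_nodes = {**base_style, "node_sizes": 16}

print(received["node_sizes"], big_nodes["node_sizes"], base_style["node_sizes"])
```

Note that `{**base_style, ...}` makes a shallow copy, so nested dictionaries such as "tip_labels_style" are still shared between the two styles.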
Plotting node values on a tree is a useful way of representing additional information about trees. Toytree tries to make this process fool-proof, in the sense that the data you plot on nodes will always be the correct data associated with that node. This is done through simple shortcut methods for plotting node features, as well as a convenience function called .get_node_values() that draws the values explicitly from the same tree structure that is being plotted. This avoids making a list of values from a tree and then plotting them on that tree, only to find that the order of tips or nodes in the tree has changed. Finally, toytree also provides interactive features that allow you to explore many features of your data by simply hovering over nodes with your cursor. This is made possible by the HTML+JS framework in which toytrees are displayed in Jupyter notebooks, or in web pages.
In [21]:
# hover over nodes to see pop-up elements
rtre.draw(height=350, node_hover=True, node_sizes=10, tip_labels_align=True);
In the example above the labels on each node indicate their "idx" value, which is simply a unique identifier given to every node. We could alternatively select one of the features that you could see listed on the node when you hovered over it and toytree will display that value on the node instead. In the example below we plot the node support values. You'll notice that in this context no values were shown for the tip nodes, but instead only for internal nodes. More on this below.
In [22]:
rtre.draw(node_labels='support', node_sizes=15);
You can also create plots with the nodes shown, but without node labels. This is often most useful when combined with mapping different colors to nodes to represent different classes of data. In the example below we pass a single color and size for all nodes.
In [23]:
# You can do the same without printing the 'idx' label on nodes.
rtre.draw(
node_labels=None,
node_sizes=10,
node_colors='grey'
);
You can draw values on all the nodes, or only on non-tip nodes, or only on internal nodes (not tips or root). Use the .get_node_values() function of ToyTrees to build a list of values for plotting on the tree. Because the data are extracted from the same tree that they will be plotted on, the values will always be ordered properly.
In [24]:
tre0.get_node_values("support", show_root=1, show_tips=1)
Out[24]:
In [25]:
tre0.get_node_values("support", show_root=1, show_tips=0)
Out[25]:
In [26]:
tre0.get_node_values("support", show_root=0, show_tips=0)
Out[26]:
In [27]:
# show support values on internal nodes only (no root, no tips)
tre0.draw(
node_labels=tre0.get_node_values("support", 0, 0),
node_sizes=20,
);
In [28]:
# show support values on all nodes, including the root and tips
tre0.draw(
node_labels=tre0.get_node_values("support", 1, 1),
node_sizes=20,
);
Because .get_node_values() returns values in node plot order, it is especially useful for building lists of values for color mapping on nodes. Here we map different colors to nodes depending on whether the support value is 100 or not.
In [29]:
# build a color list in node plot order with different values based on support
colors = [
toytree.colors[0] if i==100 else toytree.colors[1]
for i in rtre.get_node_values('support', 1, 1)
]
# draw the tree with each node colored according to its support value
rtre.draw(
node_sizes=10,
node_colors=colors
);
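When building a color list like the one above, the values may not all be clean integers (for example, strings or empty entries for nodes without a value). The following is a small sketch of a more defensive mapping; the function name `support_color`, the color names, and the example values are placeholders chosen for illustration rather than anything provided by toytree.

```python
def support_color(value, threshold=100, high="teal", low="orange", missing="lightgrey"):
    """Map one support value to a color name, tolerating non-numeric entries."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # empty strings, None, or other non-numeric entries
        return missing
    return high if number >= threshold else low

# hypothetical values, standing in for a list in node plot order
values = [100, "85", "", 100.0, None]
colors = [support_color(v) for v in values]
print(colors)
```

The resulting list has one color per input value in the same order, which is the property that matters when it is passed as a per-node color argument.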
Toytree drawings can be saved to disk using the render functions of toyplot. This is where it is useful to store the Canvas object as a variable when it is returned during a toytree drawing. You can save toyplot figures in a variety of formats, including HTML (which is actually an SVG figure wrapped in HTML with additional JavaScript to provide interactivity), SVG, PDF, and PNG.
In [30]:
# draw a plot and store the Canvas object to a variable
canvas, axes, mark = rtre.draw(width=400, height=300);
HTML rendering is the default format. This will save the figure as a vector graphic (SVG) wrapped in HTML with additional optional javascript wrapping for interactive features. You can share the file with others and anyone can open it in a browser. You can embed it on your website, or even display it in emails!
In [31]:
# for sharing through web-links (or even email!) html is great!
import toyplot.html
toyplot.html.render(canvas, "/tmp/tree-plot.html")
Optional formats: If you want to do additional styling of your figures in Illustrator or Inkscape (recommended) then SVG is likely your best option. You can save figures as SVG by importing the additional toyplot.svg module.
In [32]:
# for creating scientific figures SVG is often the most useful format
import toyplot.svg
toyplot.svg.render(canvas, "/tmp/tree-plot.svg")
Despite the advantages of working with the SVG or HTML formats (e.g., vector graphics and interactive pop-ups), if you're like me you still sometimes love to have an old-fashioned PDF. Again, you can import this from toyplot.
In [33]:
import toyplot.pdf
toyplot.pdf.render(canvas, "/tmp/tree-plot.pdf")
When you call the .draw() function of a ToyTree, the first two objects it returns are the Toyplot objects used to display the figure (the third is the Mark). The first is the Canvas, which is the HTML element that holds the figure, and the second is a Cartesian axes object, which represents the coordinates for the plot. You can store these objects when they are returned by the draw() function to further manipulate the plot. Storing the Canvas is necessary in order to save the plot.
If you wish to combine multiple toytree figures into a single figure, it is easiest to first create instances of the toyplot Canvas and Axes objects and then add the toytree drawing to this plot by using the .draw(axes=axes) argument. In the example below we first define the Canvas size, then define two coordinate axes inside of this Canvas, and then we pass these coordinate axes objects to two separate toytree drawings.
In [34]:
# set dimensions of the canvas
canvas = toyplot.Canvas(width=700, height=250)
# dissect canvas into multiple cartesian areas (x1, x2, y1, y2)
ax0 = canvas.cartesian(bounds=('10%', '45%', '10%', '90%'))
ax1 = canvas.cartesian(bounds=('55%', '90%', '10%', '90%'))
# call draw with the 'axes' argument to pass it to a specific cartesian area
style = {
"tip_labels_align": True,
"tip_labels_style": {
"font-size": "9px"
},
}
rtre.draw(axes=ax0, **style);
rtre.draw(axes=ax1, tip_labels_colors='indigo', **style);
# hide the axes (e.g., ticks and spines)
ax0.show = False
ax1.show = False
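To see where the two panels end up, the percentage bounds can be converted to pixel coordinates by hand. This is a back-of-the-envelope sketch that assumes each percentage is taken relative to the canvas width (for x) or height (for y); the helper `percent_bounds_to_pixels` is hypothetical, and toyplot's own layout code supports more unit types than this.

```python
def percent_bounds_to_pixels(bounds, width, height):
    """Convert ('x1%', 'x2%', 'y1%', 'y2%') into pixel coordinates."""
    # x values scale with the canvas width, y values with the canvas height
    sizes = (width, width, height, height)
    return tuple(float(b.rstrip("%")) * size / 100 for b, size in zip(bounds, sizes))

left = percent_bounds_to_pixels(("10%", "45%", "10%", "90%"), width=700, height=250)
right = percent_bounds_to_pixels(("55%", "90%", "10%", "90%"), width=700, height=250)

print(left)   # (70.0, 315.0, 25.0, 225.0)
print(right)  # (385.0, 630.0, 25.0, 225.0)
```

Under this assumption the two panels are each 245 pixels wide with a 70 pixel gap between them, which is why the trees sit side by side without overlapping.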
Toytree drawings are designed to use a set coordinate space within the axes to make it easy to situate additional plots to align with tree drawings. Regardless of whether the tree drawing is oriented 'right' or 'down', the farthest tip of the tree (not the tip label but the tip itself) will align at the zero-axis. For right-facing trees this means at x=0; for down-facing trees this means y=0. On the other axis, tree tips will be spaced from zero to ntips - 1 with a unit of 1 between each tip. For tips on aligning additional plotting methods (barplots, scatterplots, etc.) with toytree drawings see the Cookbook gallery. Below I add a grid to overlay tree plots in both orientations to highlight the coordinate space.
In [35]:
# store the returned Canvas and Axes objects
canvas, axes, mark = rtre.draw(
width=300,
height=300,
tip_labels_align=True,
tip_labels=False,
)
# show the axes coordinates
axes.show = True
axes.x.ticks.show = True
axes.y.ticks.show = True
# overlay a grid
axes.hlines(np.arange(0, 13, 2), style={"stroke": "red", "stroke-dasharray": "2,4"})
axes.vlines(0, style={"stroke": "blue", "stroke-dasharray": "2,4"});
In [36]:
# store the returned Canvas and Axes objects
canvas, axes, mark = rtre.draw(
width=300,
height=300,
tip_labels=False,
tip_labels_align=True,
layout='d',
)
# show the axes coordinates
axes.show = True
axes.x.ticks.show = True
axes.y.ticks.show = True
# overlay a grid
axes.vlines(np.arange(0, 13, 2), style={"stroke": "red", "stroke-dasharray": "2,4"})
axes.hlines(0, style={"stroke": "blue", "stroke-dasharray": "2,4"});
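The coordinate convention described above can be written down as a small helper, which is handy when positioning extra marks next to the tips. This sketch assumes that all tips are aligned at the zero-axis (as with tip_labels_align=True or an ultrametric tree) and that tips are spaced at integer positions starting at zero; the function `tip_coordinates` is hypothetical and does not read anything from a real tree.

```python
def tip_coordinates(ntips, layout="r"):
    """Return assumed (x, y) positions of aligned tips for a given layout."""
    if layout == "r":
        # right-facing: tips sit at x=0 and are spread along the y axis
        return [(0, i) for i in range(ntips)]
    if layout == "d":
        # down-facing: tips sit at y=0 and are spread along the x axis
        return [(i, 0) for i in range(ntips)]
    raise ValueError("this sketch only covers layouts 'r' and 'd'")

right_facing = tip_coordinates(13, layout="r")
down_facing = tip_coordinates(13, layout="d")
print(right_facing[0], right_facing[-1])
print(down_facing[0], down_facing[-1])
```

With 13 tips the last position is 12, which matches the range covered by the grid lines drawn with np.arange(0, 13, 2) in the cells above.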