Welcome to HoloViews!
This tutorial explains the basics of how to use HoloViews to explore your data. If this is your first contact with HoloViews, you may want to start by looking at our showcase to get a good idea of what can be achieved with HoloViews. If this introduction does not cover the type of visualizations you need, you should check out our Elements and Containers components to see what else is available.
HoloViews allows you to collect and annotate your data in a way that reveals it naturally, with a minimum of effort needed for you to see your data as it actually is. HoloViews is not a plotting library -- it connects your data to plotting code implemented in other packages, such as matplotlib or Bokeh. HoloViews is also not primarily a mass storage or archival data format like HDF5 -- it is designed to package your data to make it maximally visualizable and viewable interactively.
If you supply just enough additional information to the data of interest, HoloViews allows you to store, index, slice, analyze, reduce, compose, display, and animate your data as naturally as possible. HoloViews makes your numerical data come alive, revealing itself easily and without extensive coding.
Here are a few of the things HoloViews allows you to associate with your data:
If your data is an Image, it will be displayed as an image with a colormap by default, a Curve will be presented as a line plot on an axis, and so on. Once your data has been encapsulated in an Element object, other Elements can easily be created from it, such as obtaining a Curve by taking a cross-section of an Image.
The key dimensions (kdims) describe how your data can be indexed. The value dimensions (vdims) describe what the resulting indexed data represents. A numerical Dimension can have a name, type, range, and unit. This information allows HoloViews to rescale and label axes and to be smart in how it processes your data.
HoloViews can display your data even if it knows only the Element type, which lets HoloViews stay out of your way when initially exploring your data, offering immediate feedback with reasonable default visualizations. As your analysis becomes more complex and your research progresses, you may offer more of the useful metadata above so that HoloViews will automatically improve your displayed figures accordingly. Throughout, all you need to supply is this metadata plus optional and separate plotting hints (such as choosing specific colors if you like), rather than having to write cumbersome code to put figures together or having to paste bits together manually in an external drawing or plotting program.
Note that the HoloViews data components have only minimal required dependencies (Numpy and Param, both with no required dependencies of their own). This data format can thus be integrated directly into your research or development code, for maximum convenience and flexibility (see e.g. the ImaGen library for an example). Plotting implementations are currently provided for matplotlib and Bokeh, and other plotting packages could be used for the same data in the future if needed. Similarly, HoloViews provides strong support for the IPython/Jupyter notebook interface, and we recommend using the notebook for building reproducible yet interactive workflows, but none of the components require IPython either. Thus HoloViews is designed to fit into your existing workflow, without adding complicated dependencies.
To enable IPython integration, you need to load the IPython extension as follows:
In [ ]:
import holoviews as hv
hv.notebook_extension()
We'll also need NumPy for some of our examples:
In [ ]:
import numpy as np
HoloViews has very well-documented and error-checked constructors for every class (provided by the Param library). There are a number of convenient ways to access this information interactively. For example, suppose you have imported holoviews and Element and have instantiated an object e of that type:
import holoviews as hv
from holoviews import Element
e = Element(None, group='Value', label='Label')
You can now access the online documentation in the following ways:
hv.help(Element) or hv.help(e): description and parameter documentation for an object or type.
hv.help(Element, pattern="group"): only show help items that match the specified regular expression (the string "group" in this example).
Element(<Shift+TAB> in IPython: repeatedly pressing <Shift+TAB> after opening an object constructor will get you more information on each press, eventually showing the full output of hv.help(Element).
hv.help(Element, visualization=True) or hv.help(e, visualization=True): description of options for visualizing an Element or the specific object e, not the options for the object itself.
%%output info=True on an IPython/Jupyter notebook cell or %output info=True on the whole notebook: show hv.help(o, visualization=True) for every HoloViews object o as it is returned in a cell.
Lastly, you can tab-complete nearly all arguments to HoloViews classes, so if you try Element(vd<TAB>, you will see the available keyword arguments (vdims in this case). All of these forms of help are described in detail in the options tutorial.
To begin, let's see how HoloViews stays out of your way when initially exploring some data. Let's view an image, selecting the appropriate RGB Element. Although we could load our image directly into an RGB object, we will first load it into a raw NumPy array (by specifying array=True):
In [ ]:
parrot = hv.RGB.load_image('../assets/macaw.png', array=True)
print("%s with shape %s" % (type(parrot),parrot.shape))
As we can see, this 400×400 image array has four channels (the fourth being an unused alpha channel). Now let us make an RGB element to wrap up this NumPy array with its associated label:
In [ ]:
rgb_parrot = hv.RGB(parrot, label='Macaw')
rgb_parrot
Here rgb_parrot is an RGB HoloViews element, which requires data with 3 or 4 channels (red, green, blue, and optionally alpha) and can store an associated label. rgb_parrot is not a plot -- it is just a data structure with some metadata. The holoviews.ipython extension, in turn, makes sure that any RGB element is displayed appropriately, i.e. as a color image with an associated optional title, plotted using matplotlib. But the RGB object itself does not have any connection to the plotting library, and stores no data about the plot, just its own data, which is sufficient for the external plotting routines to visualize the data usefully and meaningfully. The same plot can be generated outside of IPython just as easily, e.g. to save as a .png or .svg file.
Because rgb_parrot
is just our actual data, it can be composed with other objects, pickled, and analyzed as-is. For instance, we can still access the underlying Numpy array easily via the .data
attribute, and can verify that it is indeed our actual data:
In [ ]:
rgb_parrot.data is parrot
Note that this is generally true throughout HoloViews; if you pass a HoloViews element a Numpy array of the right shape, the .data
attribute will simply be a reference to the data you supplied. If you use an alternative data format when constructing an element, such as a Python list, a Numpy array of the appropriate type will be created and made available through the .data
attribute. You can always use the identity check demonstrated above if you want to make absolutely sure your raw data is being used directly.
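The reference-versus-copy behaviour described above mirrors what plain NumPy does when asked to treat existing data as an array. The following is a minimal sketch using only NumPy and made-up data; np.asarray stands in for whatever conversion an element performs internally, which is an assumption rather than a description of the HoloViews code:

```python
import numpy as np

# An existing array is passed through untouched: same object, no copy.
raw = np.zeros((4, 4))
wrapped = np.asarray(raw)
print(wrapped is raw)        # the same kind of identity check as rgb_parrot.data is parrot

# A Python list has to be converted, so a new array object is created.
raw_list = [[0.0, 1.0], [2.0, 3.0]]
converted = np.asarray(raw_list)
print(converted is raw_list)
print(type(converted).__name__, converted.shape)
```

The identity check is the reliable way to tell the two cases apart, since the converted array compares equal to the list element by element.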
As you compose these objects together, you will see that a complex visualization is not simply a visual display, but a rich data structure containing all the raw data or analyzed data ready for further manipulation and analysis.
For some image analysis purposes, working in RGB color space is too limiting. It is often more flexible to work with a single N×M array at a time and visualize the data in each channel using a colormap. To do this we need the Image Element instead of the RGB Element.
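As a plain NumPy sketch of what working with a single N×M array at a time means, the snippet below uses a synthetic random array in place of the parrot image (an assumption, so that it runs without the image file). It shows the two operations used in the following cells: summing over the channel axis and slicing out one channel.

```python
import numpy as np

# Synthetic stand-in for the parrot array: height x width x 4 channels (RGBA).
rng = np.random.default_rng(0)
image = rng.random((400, 400, 4))

red = image[:, :, 0]         # one N x M channel, a view on the original data
total = image.sum(axis=2)    # sum over the channel axis, also N x M

print(red.shape, total.shape)
print(np.shares_memory(red, image))   # slicing does not copy the pixels
```

Either result is a two-dimensional array, which is the shape of data an Image element is built from in the cells below.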
To illustrate, let's start by visualizing the total luminance across all the channels of the parrot image, choosing a specific colormap using the HoloViews %%opts
IPython cell magic. %%opts Image
allows us to pass plotting hints to the underlying visualization code for Image
objects:
In [ ]:
%%opts Image style(cmap='coolwarm')
luminance = hv.Image(parrot.sum(axis=2), label='Summed Luminance')
luminance
As described in the Options tutorial, the same options can also be specified in pure Python, though not quite as succinctly.
The resulting plot is what we would expect: dark areas are shown in blue and bright areas are shown in red. Notice how the plotting hints (your desired colormap in this case) are kept entirely separate from your actual data, so that the Image data structure contains only your actual data and the metadata that describes it, not incidental information like matplotlib or Bokeh options. We will now set the default colormap to grayscale for all subsequent cells using the %opts line magic, and look at a single color channel by building an appropriate Image element:
In [ ]:
%opts Image style(cmap='gray')
red = hv.Image(parrot[:,:,0], label='Red')
red
Here we created the red Image
directly from the NumPy array parrot
. You can also make a lower-dimensional HoloViews component by slicing a higher-dimensional one. For instance, now we will combine this manually constructed red channel with green and blue channels constructed by slicing the rgb_parrot
RGB
HoloViews object to get the appropriate Image
objects:
In [ ]:
channels = red + rgb_parrot[:,:,'G'].relabel('Green') + rgb_parrot[:,:,'B'].relabel('Blue')
channels
Here we have combined these three HoloViews objects using the compositional operator + to create a new object named channels. When channels is displayed by the IPython/Jupyter notebook, each Image is shown side by side, with appropriate labels. In this format, you can see that the parrot looks distinctly different in the red channel than in the green and blue channels.
Note that the channels
object isn't merely a display of three elements side by side, but a new composite object of type Layout
containing our three Image
objects:
In [ ]:
print(repr(channels))
Here the syntax :Layout
means that this is an object of Python type Layout
. The entries below it show that this Layout
contains three items accessible via different attributes of the channels
object, i.e. channels.Image.Red
, channels.Image.Green
, and channels.Image.Blue
. The string :Image
indicates that each object is of type Image
.
The attributes provide an easy way to get to the objects inside channels
. In this case, the attributes were set by the +
operator when it created the Layout
, with the first level from the optional group
of each object (discussed below, here inheriting from the type of the object), and the second from the label
we defined for each object above. As for other HoloViews objects, tab completion is provided, so you can type channels.I<TAB>.B<TAB>
to get the Blue channel:
In [ ]:
channels.Image.Blue
You can see that for each item in the Layout
, the repr()
shows us its Python type (Image
) and the attributes needed to access it (.Image.Blue
in this case). The remaining items in brackets and parentheses in the repr()
will be discussed below.
Since we can access the channels easily, we can recompose our data how we like, e.g. to compare the Red and Blue channels side by side:
In [ ]:
channels.Image.Red + channels.Image.Blue
Notice how the labels we have set are useful for both the titles and for the indexing, and are thus not simply plotting-specific details -- they are semantically meaningful metadata describing this data. There are also sublabels A and B generated automatically for each subfigure; these can be changed or disabled if you prefer.
In [ ]:
%%opts Layout [sublabel_format="{alpha}"]
channels.Image.Red + channels.Image.Green + channels.Image.Blue
You may wonder what the ".Image" is doing in the middle of the indexing above. This is the group name which, even though we haven't set it directly in this case, is as important a concept as the label. All HoloViews objects come with both a group and a label, which allow you to specify both what kind of thing the object is (its group) and which specific one it is (the label). These values will be used to construct subfigure titles, to allow you to access the object by name in containers, and to allow you to set options for specific objects or for groups of them.
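The two-level naming scheme can be pictured with ordinary Python dictionaries. This is only an illustrative sketch: Item and two_level_index are hypothetical names invented here, and nothing below is HoloViews code.

```python
from collections import namedtuple

# Hypothetical stand-in for a HoloViews object: just data plus two names.
Item = namedtuple("Item", ["data", "group", "label"])

def two_level_index(items):
    """Arrange items as {group: {label: item}}, like a default two-level layout."""
    index = {}
    for item in items:
        index.setdefault(item.group, {})[item.label] = item
    return index

items = [
    Item("red pixels", group="Image", label="Red"),
    Item("green pixels", group="Image", label="Green"),
    Item("blue pixels", group="Image", label="Blue"),
]
index = two_level_index(items)

print(sorted(index["Image"]))        # labels available within the group
print(index["Image"]["Blue"].data)   # analogous to channels.Image.Blue
```

The first key says what kind of thing an item is and the second says which one it is, which is the role the text assigns to group and label.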
Groups and labels can both be set to any Python string, including strings containing spaces and special characters. The label is an arbitrary name you can use for this data item. The group is meant to describe the category or the semantic type of the data. By default, the group is the same as the name of the HoloViews element type, in this case Image:
In [ ]:
channels.Image.Blue.group
The group is an extremely useful grouping mechanism that allows you to easily structure your data in ways that are semantically meaningful for you. As we noted above, the red channel is the most clearly different from the other two, and we can separate it from the other two channels if we wish by giving them two different groups:
In [ ]:
chans = (hv.Image(parrot[:,:,0], group='RedChannel', label='Macaw')
+ hv.Image(parrot[:,:,1], group='Channel', label='Green')
+ hv.Image(parrot[:,:,2], group='Channel', label='Blue'))
The red channel is given its own special group 'RedChannel'
while the other two channels are grouped under the generic Channel
. The non-Red channels can now be accessed as a group:
In [ ]:
print(repr(chans.Channel))
chans.Channel
And now we can access the interesting red channel via its own group:
In [ ]:
chans.RedChannel
Of course, you can still access the other two channels individually using chans.Channel.Green and chans.Channel.Blue respectively; with enough attributes provided you can always get down to the individual, ungrouped objects.
In any case, the reason that there are two levels of indexing here is simply because that's what the +
operator does by default, i.e. it looks up the group
and label
information that every HoloViews object has, and uses those to name the attributes that let you access the object. But the Layout
object is actually a tree, not a fixed two-level structure, and it allows you to store your objects in any tree shape that you prefer. This structure can be used to set up any grouping arrangement that reflects how you want to use your data, such as to make it easy to select certain subsets for plotting or for special operations like setting their plot options or running analyses on them. For instance, if you really wanted to, you could insert a new item arbitrarily deeply nested into your Layout
:
In [ ]:
chans.RedChannel.OriginalData.StoredForSafeKeeping = rgb_parrot
In [ ]:
print(repr(chans))
In [ ]:
chans
You can see that rgb_parrot is still titled with its internal label "Macaw", even though it's been inserted into a custom location in the Layout tree; the attributes are only used for grouping and accessing the objects from Python, not for plotting purposes, and so you can use them however you see fit to organize your own data. The two-level structure provided by the + operator is just a useful default, and should be all that many HoloViews users will need. Note that the attributes constructed using + are sanitized: spaces are converted to _, special characters are represented specially, and so on. Just use <TAB> completion (or print(repr())) to find out the correct attribute name in unusual cases. So you can use any characters you like in the label and group, and there will still be usable names for use in attribute selection and options processing.
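To make the idea of an arbitrarily deep attribute tree concrete, here is a small pure-Python sketch. The Tree class is hypothetical and written only for illustration; it is not how Layout is implemented, and it skips sanitization entirely.

```python
class Tree:
    """Hypothetical attribute tree: branches are created on first access."""

    def __init__(self):
        # Bypass our own __setattr__ so _children is a normal instance attribute.
        object.__setattr__(self, "_children", {})

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for child names.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._children:
            self._children[name] = Tree()
        return self._children[name]

    def __setattr__(self, name, value):
        # Assigning an attribute stores a leaf (or subtree) under that name.
        self._children[name] = value

    def paths(self, prefix=()):
        """Yield (path, leaf) pairs for every stored leaf, depth first."""
        for name, child in self._children.items():
            if isinstance(child, Tree):
                yield from child.paths(prefix + (name,))
            else:
                yield prefix + (name,), child


tree = Tree()
tree.RedChannel.Macaw = "red image"
tree.Channel.Green = "green image"
tree.RedChannel.OriginalData.StoredForSafeKeeping = "rgb image"

for path, leaf in tree.paths():
    print(".".join(path), "->", leaf)
```

The attribute path is purely an address for retrieving the leaf; the leaf itself is unchanged, which matches the observation that rgb_parrot keeps its own label wherever it is stored.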
Apart from the attribute access and types, the other items in the repr()
show the dimensions of the Image
objects:
In [ ]:
print(repr(chans))
The key dimensions are shown in brackets [x,y]
. Each of these objects can be indexed by the 2-dimensional (x,y) location in the image:
In [ ]:
chans.Channel.Blue[0.0,0.0]
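Indexing with floating-point values such as 0.0, 0.0 refers to a position in the image's continuous coordinate space, not to a row and column. The sketch below shows one plausible way such a position could be mapped onto an array index. The function coords_to_index is hypothetical, and the bounds of (-0.5, -0.5, 0.5, 0.5) and the convention that row 0 is the top of the image are assumptions, not a description of the HoloViews implementation.

```python
import math

def coords_to_index(x, y, bounds, shape):
    """Map a continuous (x, y) position to a (row, col) array index.

    bounds is (left, bottom, right, top); shape is (rows, cols).
    Row 0 is assumed to be the top of the image.
    """
    left, bottom, right, top = bounds
    rows, cols = shape
    col = math.floor((x - left) / (right - left) * cols)
    row = math.floor((top - y) / (top - bottom) * rows)
    # Clamp so that points on the right or bottom edge fall in the last cell.
    return min(max(row, 0), rows - 1), min(max(col, 0), cols - 1)

bounds = (-0.5, -0.5, 0.5, 0.5)
print(coords_to_index(0.0, 0.0, bounds, (400, 400)))    # centre of the image
print(coords_to_index(-0.5, 0.5, bounds, (400, 400)))   # top-left corner
```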
The value dimensions in parentheses in the repr()
tell you what data will be returned by applying the above indexing over the key dimensions, in this case the single value for the luminance of the blue pixel at that location. The generic default name for that dimension of an Image
is z
, but you can use another name for it if you prefer:
In [ ]:
chans.Channel.Blue = chans.Channel.Blue.clone(vdims=["Luminance"])
print(repr(chans.Channel.Blue))
And of course, if there is more than one value dimension, as there is for the RGB
object rgb_parrot
that we stored in there, you will get all of them back when you index, and then you can further index to select a single dimension:
In [ ]:
print(repr(rgb_parrot))
print(rgb_parrot[0,0])
print(rgb_parrot[0,0][0])
To summarize, a Layout
constructed by +
will be organized into two levels by default, i.e. group
and label
, which is sufficient in many cases. But if you have a more complicated hierarchical collection of different types of data, you can combine it all into a custom-organized Layout
tree structure that respects the semantic categories that characterize your particular set of data. Once these categories have been set up, you can then very easily select appropriate sets of your data for display or further analysis.
As you can see, a Layout
is an incredibly convenient and versatile way of collecting even a huge and complex collection of data together, ready to explore easily.
Putting two components (Elements or Containers) side by side into a Layout
using +
is one of the most common operations in HoloViews, and works with any possible component type. But there is another compositional operator *
that is also very useful for creating complex visualizations, by overlaying components on top of each other. Nearly all components can be overlaid as well, except for a Layout
; a Layout
can contain Overlays
, but never the other way around.
One type of element designed specifically for overlaying is the annotation. Here we use the Arrow Element to label our parrot using the original RGB
object with the overlay (*
) operator:
In [ ]:
extents = (-0.5, -0.5, 0.5, 0.5) # Image spatial extents
o = rgb_parrot * hv.Arrow(-0.1,0.2, 'Polly', '>', extents=extents)
o
An overlay is a compositional data structure, just like Layout
(it is in fact a subclass!). This means the same attribute-access and grouping semantics apply. To illustrate we can index our overlay to pull it apart and lay the two components side by side:
In [ ]:
o + o.RGB.Macaw + o.Arrow.I
Note that when there is no label available for an object in a Layout
or Overlay
, HoloViews will generate an appropriate Roman numeral identifier for indexing. In this case we index our arrow using Arrow.I
. Naturally, Overlays
may themselves be elements of a Layout
, as at left above.
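The Roman numeral fallback can be sketched in a few lines of plain Python. Both helper functions below are hypothetical, and the rule of counting unlabeled items separately within each group is an assumption made for the example; the tutorial only states that a Roman numeral identifier is generated.

```python
def int_to_roman(number):
    """Convert a positive integer to a Roman numeral string."""
    symbols = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
               (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
               (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    result = []
    for value, symbol in symbols:
        count, number = divmod(number, value)
        result.append(symbol * count)
    return "".join(result)

def assign_paths(items):
    """Give each (group, label) pair a path, numbering unlabeled items per group."""
    counters = {}
    paths = []
    for group, label in items:
        if not label:
            counters[group] = counters.get(group, 0) + 1
            label = int_to_roman(counters[group])
        paths.append((group, label))
    return paths

print(assign_paths([("RGB", "Macaw"), ("Arrow", ""), ("Arrow", "")]))
```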
Overlays may be simple annotations as demonstrated above, but often they can contain significant volumes of important data. To demonstrate, we will introduce the concept of operations and the Contours
Element
:
In [ ]:
from holoviews.operation import contours
This operation takes an Image
as input and generates an overlay for us, where our original input is returned with contour lines overlaid on top. Let's have a look at the 10% (darkest) and 80% (brightest) areas of the red channel:
In [ ]:
contours(chans.RedChannel.Macaw, levels=[0.10,0.80])
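A contour line at a given level is the boundary between pixels below that level and pixels at or above it. The NumPy sketch below illustrates this with a synthetic channel (a smooth bump with values between 0 and 1, standing in for the red channel as an assumption). It computes the region on the bright side of each level; it does not trace the contour lines themselves.

```python
import numpy as np

# Synthetic stand-in for a single channel with values in [0, 1].
y, x = np.mgrid[-1:1:200j, -1:1:200j]
channel = np.exp(-(x**2 + y**2) * 2)

# For each level, mark the pixels at or above it; the contour is this region's edge.
regions = {level: channel >= level for level in (0.10, 0.80)}

for level, mask in regions.items():
    print(level, "fraction of pixels at or above:", round(float(mask.mean()), 3))
```

The region for the higher level is always contained in the region for the lower level, which is why contours at several levels nest inside one another.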
The final topic for the introduction is animations. Animation relies on a powerful multidimensional data container called a HoloMap, which is described in detail in the Exploring Data tutorial. Here, as a brief illustration, we show how to construct three HoloMaps from sets of Images with Contours, constructed using a list of different threshold levels for the above image.
As you can see in the above plot, it would be very difficult to include a large number of threshold levels in a single plot. In such a case, one could lay them all out side by side, but here we show how to combine them into three HoloMap objects that support animation, whether viewed separately or as part of the same Layout:
In [ ]:
%%opts Contours.Red (color=Palette('Reds')) Contours.Green (color=Palette('Greens')) Contours.Blue (color=Palette('Blues'))
data = {lvl: (contours(chans.RedChannel.Macaw, levels=[lvl], group='Red') +
              contours(chans.Channel.Green, levels=[lvl], group='Green') +
              contours(chans.Channel.Blue, levels=[lvl], group='Blue'))
        for lvl in np.linspace(0.1, 0.9, 9)}
levels = hv.HoloMap(data, kdims=['Levels']).collate()
levels
The levels
object here is a Layout
, just as in the other examples above, but it is displayed as an animation because it happens to contain three HoloMaps
that have an additional dimension Levels
beyond what has been laid out spatially in each image. There's no other special implementation necessary to get animations; they appear automatically whenever there are these additional dimensions in a HoloMap
that haven't been sliced, sampled, or reduced down enough to fit into a single plot. Your data, as always, remains available within the object, if you later want to pull out portions of it to display without an animation:
In [ ]:
green05 = levels.Overlay.Green[0.5]
green05 + green05.Channel + green05.Channel.Green.sample(y=0.0)
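The idea of an extra key dimension can be pictured as a dictionary with one entry per threshold level, where selecting a single key gives back an ordinary two-dimensional result. This is a NumPy-only sketch with a synthetic channel and a hypothetical frames dictionary; it illustrates the concept and says nothing about how HoloMap is implemented.

```python
import numpy as np

# Synthetic stand-in for one channel with values in [0, 1].
y, x = np.mgrid[-1:1:50j, -1:1:50j]
channel = np.exp(-(x**2 + y**2) * 2)

# One "frame" per threshold level; keys are rounded to avoid float noise in lookups.
frames = {round(float(lvl), 1): channel >= lvl
          for lvl in np.linspace(0.1, 0.9, 9)}

print(sorted(frames))      # values along the extra 'Levels' dimension
single = frames[0.5]       # pick one frame instead of animating over all of them
print(single.shape)
```

Rounding the keys is a choice made for this sketch, so that a literal such as 0.5 matches the value produced by np.linspace.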
Hopefully you now understand the basic concepts of HoloViews. It's now worth checking out the full features of the HoloMap component, as well as all the other types of elements and containers. Have fun!