Widgets and Interactive Data Analysis

This tour shows you some very recent developments in the IPython ecosystem that are likely to change the way we work with data, interact with and evaluate mathematical functions, and think about physics.

Created from the mpld3 demo and the widget documentation pages.


Instructions: Create a new directory called Widgets with a notebook called WidgetsTour. Give it a Heading 1 cell with the title "Widgets and Interactive Data Analysis". Read this page, typing the code into code cells and executing them as you go.

Do not copy/paste.

Type the commands yourself to get practice. This will also slow you down so you can think about the commands and what they are doing as you type them.

Save your notebook when you are done. The only exercises are the ones embedded in the tour, and they will not be collected or graded. They are just for your amusement.



In [ ]:
%pylab inline

mpld3

The mpld3 project brings together Matplotlib, the popular Python-based graphing library, and D3js, the popular Javascript library for creating interactive data visualizations for the web. The result is a simple API for exporting your matplotlib graphics to HTML code which can be used within the browser, within standard web pages, blogs, or tools such as the IPython notebook.

(Description taken from the mpld3 webpage)

To appreciate the power of this API, have a look at the graph below. It is a static PNG image created with Matplotlib, so you cannot interact with it.


In [2]:
# Scatter points
fig, ax = plt.subplots()
np.random.seed(0)
x, y = np.random.normal(size=(2, 200))
color, size = np.random.random((2, 200))

ax.scatter(x, y, c=color, s=500 * size, alpha=0.3)
ax.grid(color='lightgray', alpha=0.7)


Zooming and panning

But if we import the mpld3 library and display the figure using that API, we can create a zoomable and pannable graph that allows us to explore the data interactively!


In [3]:
import mpld3
mpld3.display(fig)


Out[3]:

When you hover over the lower left corner of the plotting area with your mouse, you will see a set of tools: a "Home" button, which restores the graph to its defaults; a "Pan" button (crossed arrows), which lets you click and drag in the plot itself; and a "Zoom" button (magnifying glass), which lets you draw a rectangular zoom box on the graph to look at particular regions more closely.

Try it out right now in this webpage. Because this library incorporates both Python and JavaScript, an mpld3-enabled figure saved in a webpage retains its ability to interact with you! After you play with the UI on the webpage, try it in the notebook too.

Linked brushing

Another type of interaction is the ability to select a region of a subplot and have it display the corresponding data in other linked subplots. To see what that means, try highlighting a region of one of the subplots in the image below and observe how it shows you where the same data points lie in the other subplots. This provides the ability to explore correlated multi-dimensional data sets. How two variables relate to one another can be visualized in a 2-D plot, but what if you want to see how those variables also relate to a third and fourth variable? This example shows one way to explore those relationships. And of course, you can still also pan and zoom. Because the axes are linked, all of the subplots pan and zoom together.


In [22]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

import mpld3
from mpld3 import plugins, utils


data = load_iris()
X = data.data
y = data.target

# dither the data for clearer plotting
X += 0.1 * np.random.random(X.shape)

fig, ax = plt.subplots(4, 4, sharex="col", sharey="row", figsize=(8, 8))
fig.subplots_adjust(left=0.05, right=0.95, bottom=0.05, top=0.95,
                    hspace=0.1, wspace=0.1)

for i in range(4):
    for j in range(4):
        points = ax[3 - i, j].scatter(X[:, j], X[:, i], c=y, s=40, alpha=0.6)

# remove tick labels
for axi in ax.flat:
    for axis in [axi.xaxis, axi.yaxis]:
        axis.set_major_formatter(plt.NullFormatter())

# Here we connect the linked brush plugin
plugins.connect(fig, plugins.LinkedBrush(points))

mpld3.display()


Out[22]:
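
One way to think about linked brushing is as a boolean mask over the rows of the data set: the rectangle you draw selects rows in one panel, and the same rows are highlighted in every other panel. The sketch below illustrates that idea with NumPy only. The `brush_mask` helper and the random `features` array are hypothetical names and data made up for this illustration; this is not how mpld3 implements the plugin internally.

```python
import numpy as np

# Synthetic stand-in for a 4-feature data set (hypothetical data, not the iris set)
rng = np.random.RandomState(0)
features = rng.normal(size=(150, 4))

def brush_mask(data, col_x, col_y, xlim, ylim):
    """Return a boolean mask of the rows inside a rectangular brush
    drawn on the panel that plots column col_x against column col_y."""
    in_x = (data[:, col_x] >= xlim[0]) & (data[:, col_x] <= xlim[1])
    in_y = (data[:, col_y] >= ylim[0]) & (data[:, col_y] <= ylim[1])
    return in_x & in_y

# Brush a region on the panel that plots feature 0 against feature 1 ...
mask = brush_mask(features, 0, 1, xlim=(-0.5, 0.5), ylim=(-0.5, 0.5))

# ... and the same rows are the ones to highlight in every other panel,
# for example the panel that plots feature 2 against feature 3.
highlighted = features[mask][:, [2, 3]]
print("%d points selected" % mask.sum())
```

Because the mask indexes rows rather than plot coordinates, it applies to any pair of columns, which is why one brush can drive all sixteen subplots.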

Mouseover pixel identifier

This is another interactive element that can be very useful with images: the ability to display the coordinates of the point that the mouse is currently pointing at. You can use this to help identify regions of interest to explore more quantitatively. For example, think about loading one of the star images from the Counting Stars Exercises and finding the locations of the more prominent features interactively before running an algorithm on the data.


In [5]:
import matplotlib.pyplot as plt
import numpy as np

import mpld3
from mpld3 import plugins

fig, ax = plt.subplots()

x = np.linspace(-2, 2, 20)
y = x[:, None]
X = np.zeros((20, 20, 4))

X[:, :, 0] = np.exp(- (x - 1) ** 2 - (y) ** 2)
X[:, :, 1] = np.exp(- (x + 0.71) ** 2 - (y - 0.71) ** 2)
X[:, :, 2] = np.exp(- (x + 0.71) ** 2 - (y + 0.71) ** 2)
X[:, :, 3] = np.exp(-0.25 * (x ** 2 + y ** 2))

im = ax.imshow(X, extent=(10, 20, 10, 20),
               origin='lower', zorder=1, interpolation='nearest')
fig.colorbar(im, ax=ax)

ax.set_title('An Image', size=20)

plugins.connect(fig, plugins.MousePosition(fontsize=14))

mpld3.display()


Out[5]:
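
The image in the cell above is built without any explicit loops, thanks to NumPy broadcasting. The short sketch below isolates that step for the red channel only, so you can check the shapes yourself. The variable names `red`, `row` and `col` are introduced here for illustration.

```python
import numpy as np

x = np.linspace(-2, 2, 20)   # shape (20,): varies along the columns
y = x[:, None]               # shape (20, 1): varies along the rows

# Broadcasting a (20,) array against a (20, 1) array gives a (20, 20) grid,
# so element [i, j] is the function evaluated at the point (x[j], x[i]).
red = np.exp(-(x - 1) ** 2 - y ** 2)
print(red.shape)

# The red channel is brightest where x is closest to 1 and y is closest to 0.
row, col = np.unravel_index(np.argmax(red), red.shape)
print("peak near x = %.2f, y = %.2f" % (x[col], x[row]))
```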

Custom Plugins

Those were just a few examples of what you can do with built-in plugins, but you can also write custom plugins tailored to your needs. Unfortunately, because this API is so new, many useful examples are not yet part of the embedded library. This next one is a case in point. The code cell to produce the plot is pretty involved, and includes some JavaScript code to enable it to run. I chose to include it because it is such a powerful example of how you could explore a function's parameter space interactively to gain insight about how a system behaves. For different pairs of amplitude and period, you can quickly see how the shape of the sinusoid varies.

For this one, go ahead and copy/paste the code into your notebook to run it. Don't try to type it all in yourself.


In [6]:
"""
Defining a Custom Plugin
========================
Test the custom plugin demoed on the `Pythonic Perambulations
<http://jakevdp.github.io/blog/2014/01/10/d3-plugins-truly-interactive/>`_
blog.  Hover over the points to see the associated sinusoid.
Use the toolbar buttons at the bottom-right of the plot to enable zooming
and panning, and to reset the view.
"""
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import mpld3
from mpld3 import plugins, utils


class LinkedView(plugins.PluginBase):
    """A simple plugin showing how multiple axes can be linked"""

    JAVASCRIPT = """
    mpld3.register_plugin("linkedview", LinkedViewPlugin);
    LinkedViewPlugin.prototype = Object.create(mpld3.Plugin.prototype);
    LinkedViewPlugin.prototype.constructor = LinkedViewPlugin;
    LinkedViewPlugin.prototype.requiredProps = ["idpts", "idline", "data"];
    LinkedViewPlugin.prototype.defaultProps = {}
    function LinkedViewPlugin(fig, props){
        mpld3.Plugin.call(this, fig, props);
    };

    LinkedViewPlugin.prototype.draw = function(){
      var pts = mpld3.get_element(this.props.idpts);
      var line = mpld3.get_element(this.props.idline);
      var data = this.props.data;

      function mouseover(d, i){
        line.data = data[i];
        line.elements().transition()
            .attr("d", line.datafunc(line.data))
            .style("stroke", this.style.fill);
      }
      pts.elements().on("mouseover", mouseover);
    };
    """

    def __init__(self, points, line, linedata):
        if isinstance(points, matplotlib.lines.Line2D):
            suffix = "pts"
        else:
            suffix = None

        self.dict_ = {"type": "linkedview",
                      "idpts": utils.get_id(points, suffix),
                      "idline": utils.get_id(line),
                      "data": linedata}

fig, ax = plt.subplots(2)

# scatter periods and amplitudes
np.random.seed(0)
P = 0.2 + np.random.random(size=20)
A = np.random.random(size=20)
x = np.linspace(0, 10, 100)
data = np.array([[x, Ai * np.sin(x / Pi)]
                 for (Ai, Pi) in zip(A, P)])
points = ax[1].scatter(P, A, c=P + A,
                       s=200, alpha=0.5)
ax[1].set_xlabel('Period')
ax[1].set_ylabel('Amplitude')

# create the line object
lines = ax[0].plot(x, 0 * x, '-w', lw=3, alpha=0.5)
ax[0].set_ylim(-1, 1)

ax[0].set_title("Hover over points to see lines")

# transpose line data and add plugin
linedata = data.transpose(0, 2, 1).tolist()
plugins.connect(fig, LinkedView(points, lines[0], linedata))

mpld3.display()


Out[6]:
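
The `data.transpose(0, 2, 1)` step in the cell above is easy to overlook. As far as can be told from the plugin code, each hovered point hands one entry of the data to the line, so each entry should be a list of (x, y) points. The sketch below shows the shapes before and after the transpose, using two curves with made-up amplitudes and periods (hypothetical values, chosen only to keep the output small).

```python
import numpy as np

x = np.linspace(0, 10, 100)
amplitudes = np.array([0.5, 1.0])   # hypothetical values for illustration
periods = np.array([0.3, 0.8])      # hypothetical values for illustration

# One row of x values and one row of y values per curve:
# shape (n_curves, 2, n_points)
data = np.array([[x, a * np.sin(x / p)] for a, p in zip(amplitudes, periods)])
print(data.shape)

# Swap the last two axes so that each curve becomes a list of (x, y) points:
# shape (n_curves, n_points, 2)
linedata = data.transpose(0, 2, 1)
print(linedata.shape)

# linedata[i][k] is the k-th (x, y) point of curve i
print(linedata[1][0])
```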

Widgets

Another recent addition to IPython is the ability to create widgets: tools that let you interact with the notebook in useful ways. Like mpld3, widgets are very new, so the documentation is not as mature as that of other libraries, but that should change quickly as more people use them. For now, we'll explore a couple of different kinds of widgets to give you a sense of some of the possibilities and of how easy they are to create and use. For more info see this talk by Jake Vanderplas at the 2013 PyData conference.


In [ ]:
from IPython.html import widgets # Widget definitions
from IPython.display import display # Used to display widgets in the notebook

A simple slider


In [ ]:
mywidget = widgets.FloatSliderWidget()
display(mywidget)

You can slide the slider back and forth and then "get" the current value from the widget object with:


In [ ]:
print(mywidget.value)

Here's a variation on that where the current value is printed when the slider is moved. Play around with it.


In [ ]:
def on_value_change(name, value):
    print(value)

int_range = widgets.IntSliderWidget(min=0, max=10, step=2)
int_range.on_trait_change(on_value_change, 'value')
display(int_range)

Often you will want a handler function to run whenever someone interacts with a widget, so that your code can respond to the input. Try this one:


In [ ]:
def click_handler(widget):
    print("clicked")

b = widgets.ButtonWidget(description='Click Me!')
b.on_click(click_handler)
display(b)
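
If the handler idea feels mysterious, it may help to see the pattern stripped of any user interface. The sketch below is a toy stand-in, not the IPython widget API: `ToySlider` and its `on_change` method are hypothetical names invented here. It only shows the general idea that a widget keeps a list of registered functions and calls each one when its value changes.

```python
class ToySlider(object):
    """A toy stand-in for a slider widget (hypothetical, not the IPython API)."""

    def __init__(self, value=0):
        self._value = value
        self._handlers = []

    def on_change(self, handler):
        """Register a function to call whenever the value changes."""
        self._handlers.append(handler)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        if new_value == self._value:
            return  # nothing changed, so no handler is called
        self._value = new_value
        for handler in self._handlers:
            handler("value", new_value)


seen = []
slider = ToySlider()
slider.on_change(lambda name, value: seen.append((name, value)))

slider.value = 4   # stands in for the user dragging the slider
slider.value = 4   # same value again: no notification
slider.value = 6
print(seen)
```

In the real widgets, the notebook front end changes the value when you drag the slider or click the button, and your registered function runs in response.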

Interact/Interactive

Interact and Interactive are two other new tools that build on the widgets library to allow you to use Matplotlib to explore and visualize functions or data with varying parameters.


In [ ]:
from IPython.html.widgets import interact, interactive

To use them, we define a function that creates a plot which depends on the values of parameters passed to the function. Then you can create an interact object with the function and specified ranges for those parameters. The interact object displayed will give you sliders (widgets) that let you vary the input parameters and interactively see how they change the plot!

Linear optimizer

Here's a first example. We have some data with error bars that we think should follow a straight-line trend. But which straight line? Later this quarter we will learn how algorithmic fitting of data optimizes the parameters of the function describing the data. For now, we can use a human optimizer to find the best parameters for the data below with an interact object:


In [ ]:
def linear_plot(m=0.5, b=27.0):
    '''
    Create a plot of some data that should
    vary linearly along with a straight line
    function with the given slope and intercept.
    '''
    #data to optimize
    datax = np.array([1.0,2.0,3.0,5.0,7.0,9.0])
    datay = np.array([10.2, 20.5, 24.8, 30.7, 33.6, 37.3])
    erry = np.array([1.0,0.5,2.6,1.0,5.6,6.0])
    #plot the data
    plt.errorbar(datax,datay,xerr=0.0,yerr=erry,fmt='o')
    #create a function to approximate the data using the slope
    #and intercept parameters passed to the function
    steps = 100
    x = np.linspace(0,10.,steps)
    y = m*x+b
    #plot and show the result
    plt.plot(x,y)
    plt.xlim(0.,10.)
    plt.ylim(0.,50.)
    plt.show()

#Create an interactive plot with sliders for varying the slope and intercept
v = interact(linear_plot,m=(0.0,5.0), b=(0.0,50.0))

Spend a few minutes trying to find the "best fit" line to this data and record the corresponding slope and intercept values for that line in your notebook.
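
If you want a number to go with your eyeball estimate, one common choice is the weighted sum of squared residuals (often called chi-squared). The sketch below computes it for the data used in linear_plot. The choice of this particular score and the `chi_squared` function name are assumptions made for this illustration; the tour itself does not prescribe a measure of fit.

```python
import numpy as np

# Same data as in linear_plot
datax = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 9.0])
datay = np.array([10.2, 20.5, 24.8, 30.7, 33.6, 37.3])
erry = np.array([1.0, 0.5, 2.6, 1.0, 5.6, 6.0])

def chi_squared(m, b):
    """Sum of squared residuals, each divided by its error bar.
    Smaller values mean the line passes closer to the data."""
    residuals = datay - (m * datax + b)
    return np.sum((residuals / erry) ** 2)

# Score the default slider position and one alternative guess
print(chi_squared(0.5, 27.0))
print(chi_squared(3.5, 13.0))
```

Points with small error bars dominate the score, so a line that looks reasonable overall can still score poorly if it misses the most precisely measured points.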

Random scatter

Here's another interactive plot that allows you to randomly sample (x,y) pairs within a circle of radius $r$. The interact object lets you change the radius and increase or decrease the number of samples in the circle.


In [ ]:
def scatter_plot(r=0.5, n=27):
    t = np.random.uniform(0.0,2.0*np.pi,n)
    rad = r*np.sqrt(np.random.uniform(0.0,1.0,n))
    x = rad*np.cos(t)
    y = rad*np.sin(t)
    fig = plt.figure(figsize=(4,4),dpi=80)
    plt.scatter(x,y)
    plt.xlim(-1.,1.)
    plt.ylim(-1.,1.)
    plt.show()
    
v2 = interact(scatter_plot,r=(0.0,1.0), n=(1,1000))
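
You may wonder why scatter_plot takes the square root of the uniform random numbers when it draws the radii. The area inside radius $\rho$ grows like $\rho^2$, so drawing the radius uniformly would pile points up near the center. The sketch below checks this numerically by counting how many samples land in the inner disc of radius $r/2$, which covers one quarter of the area. The sample size and seed are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.RandomState(42)
n = 100000
r = 1.0

# Radii drawn with the square root, as in scatter_plot
rad_sqrt = r * np.sqrt(rng.uniform(0.0, 1.0, n))
# Radii drawn without it, for comparison
rad_plain = r * rng.uniform(0.0, 1.0, n)

# The inner disc of radius r/2 covers one quarter of the total area, so a
# sample that is uniform over the disc should put about 25% of its points there.
frac_sqrt = np.mean(rad_sqrt < r / 2)
frac_plain = np.mean(rad_plain < r / 2)
print("with sqrt: %.3f, without: %.3f" % (frac_sqrt, frac_plain))
```

The version with the square root should come out near 0.25, while the version without it comes out near 0.5, meaning twice as many points as there should be crowd into the center.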

Sinusoids

Here is an example with two sine curves - one is a pure sine wave, the other is the superposition of two waves with different frequency but the same amplitude.


In [ ]:
def sin_plot(A=5.0,f1=5.0,f2=10.):
    x = np.linspace(0,2*np.pi,1000)
    #pure sine curve
    y = A*np.sin(f1*x)
    #superposition of sine curves with different frequency
    #but same amplitude
    y2 = A*(np.sin(f1*x)+np.sin(f2*x))
    plt.plot(x,y,x,y2)
    plt.xlim(0.,2.*np.pi)
    plt.ylim(-10.,10.)
    plt.grid()
    plt.show()
    
v3 = interact(sin_plot,A=(0.,10.), f1=(1.0,10.0), f2=(1.0,10.0))
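
When the two frequencies are close together, the superposition shows "beats": a fast oscillation inside a slowly varying envelope. This follows from the identity $\sin a + \sin b = 2\sin\frac{a+b}{2}\cos\frac{a-b}{2}$. The sketch below checks numerically that the curve computed in sin_plot equals that product. The names `carrier` and `envelope` and the specific parameter values are chosen here for illustration.

```python
import numpy as np

A, f1, f2 = 5.0, 5.0, 6.0   # example values within the slider ranges
x = np.linspace(0, 2 * np.pi, 1000)

# Superposition as computed in sin_plot
y2 = A * (np.sin(f1 * x) + np.sin(f2 * x))

# The same curve written as a fast oscillation times a slow envelope
carrier = np.sin(0.5 * (f1 + f2) * x)
envelope = 2 * A * np.cos(0.5 * (f1 - f2) * x)

print(np.allclose(y2, envelope * carrier))
```

Try setting f1 and f2 close to each other with the sliders and look for the slow envelope in the plot.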

Lissajous Curves

This one is a little more complicated. Here we have a parametric plot. We have two sinusoids, one in the $x$ direction and one in the $y$ direction, that both depend on a third parameter $t$. We can set up the $t$ array and compute the values for $x$ and $y$ from their functional dependence on $t$, then plot $y$ vs. $x$. The result when the two curves are sinusoidal is known as a Lissajous curve, which forms interesting patterns when the parameters are related in particular ways.


In [ ]:
def lissajous_plot(a1=0.5,f1=1.,p1=0.,a2=0.5,f2=1.,p2=0.):
    t = np.linspace(0, 20*np.pi, 5000)
    x = a1*np.sin(f1*(t+p1))
    y = a2*np.cos(f2*(t+p2))
    plt.plot(x,y)
    plt.xlim(-1.,1.)
    plt.ylim(-1.,1.)
    plt.show()
    
v4 = interact(lissajous_plot,a1=(0.,1.), f1=(1.0,4.0), p1=(0.,2*np.pi),
                 a2=(0.,1.),f2=(1.0,4.0),p2=(0.,2*np.pi))

Try playing with the curves by adjusting the sliders to make interesting patterns.

Record three parameter combinations that lead to interesting shapes, then create new static plots in other cells to show what they look like.
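
As a hint for where to look, the cleanest patterns tend to appear when the two frequencies form a simple ratio, because the curve then retraces itself. The sketch below tests whether the curve returns to its starting point after $t = 2\pi$ for two frequency choices. The `lissajous` helper is a hypothetical function written for this illustration; it uses the same parametrization as lissajous_plot but returns the points instead of plotting them.

```python
import numpy as np

def lissajous(a1, f1, p1, a2, f2, p2, t):
    """Same parametrization as lissajous_plot, returning the points."""
    x = a1 * np.sin(f1 * (t + p1))
    y = a2 * np.cos(f2 * (t + p2))
    return x, y

# Evaluate only the start and the end of the interval [0, 2*pi]
t = np.array([0.0, 2 * np.pi])

# Integer frequencies: both sinusoids complete whole cycles in 2*pi,
# so the curve returns to its starting point.
x, y = lissajous(0.5, 1.0, 0.0, 0.5, 2.0, 0.0, t)
closed_integer = np.allclose([x[0], y[0]], [x[1], y[1]])

# With f2 = 2.5 the y sinusoid has completed 2.5 cycles at t = 2*pi,
# so the curve has not closed yet (it takes until t = 4*pi).
x, y = lissajous(0.5, 1.0, 0.0, 0.5, 2.5, 0.0, t)
closed_fractional = np.allclose([x[0], y[0]], [x[1], y[1]])

print(closed_integer)
print(closed_fractional)
```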

Image manipulation

The last example in this tour uses interactive to manipulate an image and store the result for subsequent processing. The primary difference between interact and interactive is that the latter lets you grab the result of the wrapped function (in this case a modified image) and then use it elsewhere.


In [ ]:
from IPython.html.widgets import fixed
import skimage
from skimage import data, filter, io

Read in an image from the skimage library:


In [ ]:
i = data.coffee()
io.Image(i)

Define a function that will let us interact with the image:


In [ ]:
def edit_image(image, sigma=0.1, r=1.0, g=1.0, b=1.0):
    new_image = filter.gaussian_filter(image, sigma=sigma, multichannel=True)
    new_image[:,:,0] = r*new_image[:,:,0]
    new_image[:,:,1] = g*new_image[:,:,1]
    new_image[:,:,2] = b*new_image[:,:,2]
    new_image = io.Image(new_image)
    display(new_image)
    return new_image
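
The three lines that multiply the color channels can be tried in isolation on a tiny array, which makes it easier to see what the r, g and b sliders do. The sketch below leaves out the Gaussian blur and the skimage calls entirely. The `scale_channels` helper and the 2x2 test image are hypothetical, made up for this illustration.

```python
import numpy as np

# Hypothetical 2x2 RGB image with float values in [0, 1]
image = np.full((2, 2, 3), 0.8)

def scale_channels(image, r=1.0, g=1.0, b=1.0):
    """Multiply the red, green and blue channels by r, g and b."""
    # Broadcasting the length-3 gain vector over the last axis scales
    # each channel and returns a new array, leaving the input untouched.
    return image * np.array([r, g, b])

tinted = scale_channels(image, r=1.0, g=0.5, b=0.0)
print(tinted[0, 0])
```

With gains between 0 and 1, as in the slider limits used in this tour, each channel can only be dimmed, never brightened.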

Set up the interactive element and display it:


In [ ]:
lims = (0.0,1.0,0.01)
w = interactive(edit_image, image=fixed(i), sigma=(0.0,10.0,0.1), r=lims, g=lims, b=lims)
display(w)

After manipulating the image, see what is stored in the result:


In [ ]:
w.result

Summary

Instead of doing specific exercises with these elements as a separate notebook, play around with them a bit and think about how you might use them in other contexts. We will use them later in the course as we learn how to use the computer to explore physics problems and concepts. You may also find that they will be useful visualization tools for your project demos!


All content is under a modified MIT License, and can be freely used and adapted. See the full license text here.