In this notebook, we will walk through the Bokeh plotting package, focusing on building a service that delivers interactive widgets to users on the Internet.
In the first part of the notebook, I will demonstrate the basics of Bokeh plotting. The experience is very similar to matplotlib's pyplot, but there are differences in the way plots are created that I'll note as we go along.
By J Guillochon (Harvard)
Bokeh is a Python package that generates interactive plots for notebooks and web browsers. The package is powered by JavaScript in the browser, but the user mostly writes Python to generate plots (although JavaScript is still needed for some things, such as custom callbacks).
Bokeh is great for sharing your datasets in an accessible way with your peers. You can use Bokeh in one of two ways:
Bokeh is certainly not the only option for interactive web plots (there is an ever-growing list of competitors, many of which are also free). These include (full list here: https://alternativeto.net/software/bokeh/?license=free):
At first we're going to do all our work in Jupyter. We need to tell Bokeh we want to do this with output_notebook().
In [55]:
import numpy as np
# from six.moves import zip
from bokeh.plotting import figure, show, output_notebook, output_file, reset_output
reset_output()
output_notebook()
# Disable "retina" line below if your monitor doesn't support it.
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
Let's download some sample data. You only need to run this cell once.
In [ ]:
# Import the submodule explicitly; "import bokeh" alone may not load bokeh.sampledata.
import bokeh.sampledata
bokeh.sampledata.download()
Now let's generate some random data to play with.
In [48]:
N = 400
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 3.0
Let's make a scatter plot of this data. We're going to turn on many of the interactive tools in this first example. Take a few minutes to play with each of the tools!
In [49]:
TOOLS="hover,crosshair,pan,wheel_zoom,box_zoom,reset,tap,save,box_select,poly_select,lasso_select"
colors2 = ["#%02x%02x%02x" % (int(r), int(g), 150) for r, g in zip(50+2*x, 30+2*y)]
p1 = figure(width=500, height=300, tools=TOOLS)
p1.scatter(x, y, radius=radii, fill_color=colors2, fill_alpha=0.6, line_color=None)
show(p1)
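The color strings in the cell above are built with printf-style formatting: each `%02x` renders one channel as a two-digit hexadecimal number. Here is a minimal sketch of the same idea, using a hypothetical helper `rgb_to_hex` (not part of Bokeh) that also clamps each channel to the 0-255 range:

```python
def rgb_to_hex(r, g, b):
    """Format three color channels as a '#rrggbb' string (hypothetical helper)."""
    # Clamp to 0-255 so out-of-range data cannot produce a malformed color string.
    channels = [max(0, min(255, int(c))) for c in (r, g, b)]
    return "#%02x%02x%02x" % tuple(channels)


# Same mapping as the scatter example: red from x, green from y, blue fixed at 150.
example_color = rgb_to_hex(50 + 2 * 10.0, 30 + 2 * 25.0, 150)
print(example_color)
```

With x and y drawn from [0, 100), the expressions `50+2*x` and `30+2*y` stay below 256, so the clamp is only a safeguard for other data ranges.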
Now, let's output the above to an HTML file:
In [52]:
from bokeh.io import reset_output, output_file
reset_output()
output_file('test.html')
colors2 = ["#%02x%02x%02x" % (int(r), int(g), 150) for r, g in zip(50+2*x, 30+2*y)]
p1 = figure(width=500, height=300, tools=TOOLS)
p1.scatter(x, y, radius=radii, fill_color=colors2, fill_alpha=0.6, line_color=None)
show(p1)
Open test.html produced by the above code in your favorite text editor. Talk to your neighbors about what you see.
Right off the bat you might be asking: can I export this for use in a scientific paper? The answer is sadly not all good news. You can export relatively easily to svg format with a couple of support libraries: selenium, phantomjs, and pillow.
But! LaTeX does not support svg figures easily! There is an svg package out there but it's not on Overleaf and requires Inkscape, a huge software package. The svg package also conflicts with some commonly used templates (e.g. aastex).
At the moment, the best option is to generate rasterized PNG files if you wish to include Bokeh plots in a paper. This can be done simply enough if you install the above packages:
In [ ]:
from bokeh.io import export_png
export_png(p1, 'test.png')
...but frankly, it's much simpler to just take a screenshot!
I've primarily used Bokeh for the Open Astronomy Catalogs. I will describe the following examples:
https://sne.space/sne/SN2005gj/ (photometry, spectra browser)
https://faststars.space/sky-locations/ (metadata browser on a Hammer projection)
https://sne.space/catexplorer (Bokeh server that plots any supernova's light curve on request)
http://ashleyvillar.com/dlps (created by Ashley Villar to show a Zwicky diagram of her transients)
In the cells below, I've picked some example plots made in Bokeh that are unique and demonstrate how powerful it can be as a data visualization tool. While I go through these examples, I want you to think about your own data and how a particular plot could be used in your data's context.
First up, density plots with hexagonal bins:
In [5]:
import numpy as np
from bokeh.io import output_file, show, reset_output
from bokeh.plotting import figure
from bokeh.transform import linear_cmap
from bokeh.util.hex import hexbin
reset_output()
output_notebook()
n = 50000
x = np.random.standard_normal(n)
y = np.random.standard_normal(n)
bins = hexbin(x, y, 0.1)
p = figure(tools="wheel_zoom,reset,tap", match_aspect=True, background_fill_color='#440154')
p.grid.visible = False
p.hex_tile(q="q", r="r", size=0.1, line_color=None, source=bins,
           fill_color=linear_cmap('counts', 'Viridis256', 0, max(bins.counts)))
show(p)
Images can be seamlessly displayed (with transparency):
In [12]:
from __future__ import division
import numpy as np
from bokeh.plotting import figure, output_file, show
# create an array of RGBA data
N = 500
img = np.empty((N, N), dtype=np.uint32)
view = img.view(dtype=np.uint8).reshape((N, N, 4))
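for i in range(N):
    for j in range(N):
        view[i, j, 0] = int(255 * i / N)  # red increases with row
        view[i, j, 1] = 158               # constant green
        view[i, j, 2] = int(255 * j / N)  # blue increases with column
        view[i, j, 3] = int(255 * j / N)  # alpha (opacity) increases with column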
p = figure(plot_width=400, plot_height=400, x_range=(0, 10), y_range=(0, 10))
p.image_rgba(image=[img], x=[0], y=[0], dw=[10], dh=[10])
show(p)
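The double loop above fills the image one pixel at a time, which is slow for large N. As a sketch (assuming only NumPy, and reusing the same variable names), the same RGBA array can be filled with broadcasting. The key point is that `view` is a uint8 window onto the uint32 array `img`, so writing the four channels through `view` sets the packed pixel values in `img`:

```python
import numpy as np

N = 500
img = np.empty((N, N), dtype=np.uint32)
# Reinterpret each 32-bit pixel as four 8-bit channels (R, G, B, A); shares memory with img.
view = img.view(dtype=np.uint8).reshape((N, N, 4))

# Truncating cast reproduces int(255 * i / N) for i = 0 .. N-1.
ramp = (255 * np.arange(N) / N).astype(np.uint8)

view[:, :, 0] = ramp[:, np.newaxis]  # red varies along rows (index i)
view[:, :, 1] = 158                  # constant green
view[:, :, 2] = ramp[np.newaxis, :]  # blue varies along columns (index j)
view[:, :, 3] = ramp[np.newaxis, :]  # alpha varies along columns (index j)
```

The resulting `img` can be passed to `image_rgba` exactly as in the cell above.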
Using either the data you've brought or, if you've brought none, this image (https://apod.nasa.gov/apod/image/1804/AmericanEclipseHDR_Lefaudeux_1080.jpg), display an image in a Bokeh plot.
We can link two plots to the same data, which may be of high dimension:
In [18]:
from bokeh.io import output_file, show
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
x = list(range(-20, 21))
y0 = [abs(xx) for xx in x]
y1 = [xx**2 for xx in x]
# create a column data source for the plots to share
source = ColumnDataSource(data=dict(x=x, y0=y0, y1=y1))
TOOLS = "box_select,lasso_select,help,reset"
# create a new plot and add a renderer
left = figure(tools=TOOLS, plot_width=300, plot_height=300, title=None)
left.circle('x', 'y0', source=source)
# create another new plot and add a renderer
right = figure(tools=TOOLS, plot_width=300, plot_height=300, title=None)
right.circle('x', 'y1', source=source)
p = gridplot([[left, right]])
show(p)
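The linking in the cell above works because both renderers read from one ColumnDataSource, and the selection is stored on that source rather than on either plot. Here is a minimal sketch of that relationship, assuming a reasonably recent Bokeh where `source.selected.indices` is available; the selection is set from Python here purely for illustration, whereas in the browser the select tools set it:

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# One shared data source with two y columns.
source = ColumnDataSource(data=dict(x=[1, 2, 3], y0=[1, 2, 3], y1=[1, 4, 9]))

left = figure()
right = figure()
left_renderer = left.scatter('x', 'y0', source=source)
right_renderer = right.scatter('x', 'y1', source=source)

# Illustration only: mark rows 0 and 2 as selected on the shared source.
source.selected.indices = [0, 2]

# Both renderers point at the same object, so they see the same selection.
print(left_renderer.data_source is right_renderer.data_source)
print(right_renderer.data_source.selected.indices)
```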
The example below demonstrates interactive highlighting of data when the user hovers their cursor over the plot:
In [16]:
from bokeh.plotting import figure, output_file, show
from bokeh.models import HoverTool
from bokeh.sampledata.glucose import data
subset = data.loc['2010-10-06']
x, y = subset.index.to_series(), subset['glucose']
# Basic plot setup
plot = figure(plot_width=600, plot_height=300, x_axis_type="datetime", tools="",
              toolbar_location=None, title='Hover over points')
plot.line(x, y, line_dash="4 4", line_width=1, color='gray')
cr = plot.circle(x, y, size=20,
                 fill_color="grey", hover_fill_color="firebrick",
                 fill_alpha=0.05, hover_alpha=0.3,
                 line_color=None, hover_line_color="white")
plot.add_tools(HoverTool(tooltips=None, renderers=[cr], mode='hline'))
show(plot)
In [38]:
from bokeh.plotting import figure, output_file, show, ColumnDataSource
from bokeh.models import HoverTool
source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y=[2, 5, 8, 2, 7],
    desc=['A', 'b', 'C', 'd', 'E'],
    imgs=[
        'https://bokeh.pydata.org/static/snake.jpg',
        'https://bokeh.pydata.org/static/snake2.png',
        'https://bokeh.pydata.org/static/snake3D.png',
        'https://bokeh.pydata.org/static/snake4_TheRevenge.png',
        'https://bokeh.pydata.org/static/snakebite.jpg'
    ],
    fonts=[
        '<i>italics</i>',
        '<pre>pre</pre>',
        '<b>bold</b>',
        '<small>small</small>',
        '<del>del</del>'
    ]
))
hover = HoverTool(tooltips="""
    <div>
        <div>
            <img
                src="@imgs" height="42" alt="@imgs" width="42"
                style="float: left; margin: 0px 15px 15px 0px;"
                border="2"
            ></img>
        </div>
        <div>
            <span style="font-size: 17px; font-weight: bold;">@desc</span>
            <span style="font-size: 15px; color: #966;">[$index]</span>
        </div>
        <div>
            <span>@fonts{safe}</span>
        </div>
        <div>
            <span style="font-size: 15px;">Location</span>
            <span style="font-size: 10px; color: #696;">($x, $y)</span>
        </div>
    </div>
    """
)
p = figure(plot_width=400, plot_height=400, tools=[hover],
           title="Mouse over the dots")
p.circle('x', 'y', size=20, source=source)
show(p)
In [23]:
from bokeh.io import output_file, show
from bokeh.layouts import gridplot
from bokeh.plotting import figure
x = list(range(11))
y0 = x
y1 = [10-xx for xx in x]
y2 = [abs(xx-5) for xx in x]
# create a new plot
s1 = figure(plot_width=250, plot_height=250, title=None)
s1.circle(x, y0, size=10, color="navy", alpha=0.5)
# create a new plot and share both ranges
s2 = figure(plot_width=250, plot_height=250, x_range=s1.x_range, y_range=s1.y_range, title=None)
s2.triangle(x, y1, size=10, color="firebrick", alpha=0.5)
# create a new plot and share only one range
s3 = figure(plot_width=250, plot_height=250, x_range=s1.x_range, title=None)
s3.square(x, y2, size=10, color="olive", alpha=0.5)
p = gridplot([[s1, s2, s3]], toolbar_location=None)
# show the results
show(p)
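Panning and zooming stay in sync in the cell above because the plots hold the very same range objects, not copies of them. Here is a small sketch of that sharing, with the explicit `(0, 10)` ranges chosen only for illustration:

```python
from bokeh.plotting import figure

s1 = figure(x_range=(0, 10), y_range=(0, 10))
# Share both ranges with s1.
s2 = figure(x_range=s1.x_range, y_range=s1.y_range)
# Share only the x range; s3 gets its own y range.
s3 = figure(x_range=s1.x_range)

# Changing the shared range through one plot is visible through the others.
s1.x_range.start = 2
print(s2.x_range is s1.x_range, s3.x_range.start)
print(s3.y_range is s1.y_range)
```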
In [56]:
import numpy as np
from bokeh.layouts import row, widgetbox
from bokeh.models import CustomJS, Slider
from bokeh.plotting import figure, output_file, show, ColumnDataSource
x = np.linspace(0, 10, 500)
y = np.sin(x)
source = ColumnDataSource(data=dict(x=x, y=y))
plot = figure(y_range=(-10, 10), plot_width=400, plot_height=400)
plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)
callback = CustomJS(args=dict(source=source), code="""
    var data = source.data;
    var A = amp.value;
    var k = freq.value;
    var phi = phase.value;
    var B = offset.value;
    // Declare locals explicitly; undeclared names fail in strict mode.
    var x = data['x'];
    var y = data['y'];
    for (var i = 0; i < x.length; i++) {
        y[i] = B + A*Math.sin(k*x[i]+phi);
    }
    source.change.emit();
""")
amp_slider = Slider(start=0.1, end=10, value=1, step=.1,
                    title="Amplitude", callback=callback)
callback.args["amp"] = amp_slider
freq_slider = Slider(start=0.1, end=10, value=1, step=.1,
                     title="Frequency", callback=callback)
callback.args["freq"] = freq_slider
phase_slider = Slider(start=0, end=6.4, value=0, step=.1,
                      title="Phase", callback=callback)
callback.args["phase"] = phase_slider
offset_slider = Slider(start=-5, end=5, value=0, step=.1,
                       title="Offset", callback=callback)
callback.args["offset"] = offset_slider
layout = row(
    plot,
    widgetbox(amp_slider, freq_slider, phase_slider, offset_slider),
)
# output_file("slider.html", title="slider.py example")
show(layout)
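The JavaScript callback in the cell above recomputes y = B + A sin(kx + phi) in the browser whenever a slider moves. For reference, here is the same calculation as a plain-Python sketch; `sine_curve` is a hypothetical helper, not a Bokeh function, and is handy for checking what a given set of slider values should produce:

```python
import math


def sine_curve(xs, amp=1.0, freq=1.0, phase=0.0, offset=0.0):
    """Return offset + amp * sin(freq * x + phase) for each x (hypothetical helper)."""
    return [offset + amp * math.sin(freq * x + phase) for x in xs]


# 500 evenly spaced points on [0, 10], mirroring np.linspace(0, 10, 500).
xs = [i * 10 / 499 for i in range(500)]
ys = sine_curve(xs, amp=2.0, freq=1.0, phase=math.pi / 2, offset=1.0)
```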
In [57]:
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource, HoverTool, CustomJS
# define some points and a little graph between them
x = [2, 3, 5, 6, 8, 7]
y = [6, 4, 3, 8, 7, 5]
links = {
    0: [1, 2],
    1: [0, 3, 4],
    2: [0, 5],
    3: [1, 4],
    4: [1, 3],
    5: [2, 3, 4]
}
p = figure(plot_width=400, plot_height=400, tools="", toolbar_location=None, title='Hover over points')
source = ColumnDataSource({'x0': [], 'y0': [], 'x1': [], 'y1': []})
sr = p.segment(x0='x0', y0='y0', x1='x1', y1='y1', color='olive', alpha=0.6, line_width=3, source=source, )
cr = p.circle(x, y, color='olive', size=30, alpha=0.4, hover_color='olive', hover_alpha=1.0)
# Add a hover tool, that sets the link data for a hovered circle
code = """
var links = %s;
var data = {'x0': [], 'y0': [], 'x1': [], 'y1': []};
var cdata = circle.data;
var indices = cb_data.index['1d'].indices;
// Declare loop variables explicitly; undeclared names fail in strict mode.
for (var i = 0; i < indices.length; i++) {
    var ind0 = indices[i];
    for (var j = 0; j < links[ind0].length; j++) {
        var ind1 = links[ind0][j];
        data['x0'].push(cdata.x[ind0]);
        data['y0'].push(cdata.y[ind0]);
        data['x1'].push(cdata.x[ind1]);
        data['y1'].push(cdata.y[ind1]);
    }
}
segment.data = data;
""" % links
callback = CustomJS(args={'circle': cr.data_source, 'segment': sr.data_source}, code=code)
p.add_tools(HoverTool(tooltips=None, callback=callback, renderers=[cr]))
show(p)
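The hover callback above walks the `links` dictionary and emits one segment per edge leaving each hovered point. Here is the same logic as a plain-Python sketch, using a hypothetical function `segments_for` that is not part of Bokeh, which makes it easy to see what ends up in the segment data source:

```python
def segments_for(indices, x, y, links):
    """Build segment endpoints from each hovered index to its linked points."""
    data = {'x0': [], 'y0': [], 'x1': [], 'y1': []}
    for ind0 in indices:
        for ind1 in links[ind0]:
            data['x0'].append(x[ind0])  # segment starts at the hovered point
            data['y0'].append(y[ind0])
            data['x1'].append(x[ind1])  # and ends at a linked neighbor
            data['y1'].append(y[ind1])
    return data


x = [2, 3, 5, 6, 8, 7]
y = [6, 4, 3, 8, 7, 5]
links = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5], 3: [1, 4], 4: [1, 3], 5: [2, 3, 4]}

# Hovering over point 0 should draw segments to points 1 and 2.
hovered = segments_for([0], x, y, links)
print(hovered)
```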
With your data, create an interactive plot with at least one control to manipulate the plot output.
Useful controls to consider include (see https://bokeh.pydata.org/en/latest/docs/user_guide/interaction/widgets.html):
In [27]:
import pandas as pd
from bokeh.palettes import Spectral4
from bokeh.plotting import figure, output_file, show
from bokeh.sampledata.stocks import AAPL, IBM, MSFT, GOOG
p = figure(plot_width=800, plot_height=250, x_axis_type="datetime")
p.title.text = 'Click on legend entries to hide the corresponding lines'
for data, name, color in zip([AAPL, IBM, MSFT, GOOG], ["AAPL", "IBM", "MSFT", "GOOG"], Spectral4):
    df = pd.DataFrame(data)
    df['date'] = pd.to_datetime(df['date'])
    p.line(df['date'], df['close'], line_width=2, color=color, alpha=0.8, legend=name)
p.legend.location = "top_left"
p.legend.click_policy="hide"
show(p)
Create a plot of the data you brought with you and output it to HTML using the output_file() function in Bokeh. On Slack, send this file to a person sitting near you, without telling them what the data is. Let them play with your plot for a few minutes and see if they can guess what you are trying to show with the plot. To keep things from being too easy, do not label your axes; use only the ticks/tooltips to describe the data.
Running Bokeh in server mode means you do not have to ship all the data to the user as soon as they open the page; instead, the data is delivered on demand. Since the Bokeh server is written in Python, the event handlers that return data to the user can also be written in Python (no JavaScript required).
Use the bokeh serve command to run the server example by executing:
bokeh serve sliders.py
at your command prompt. Then navigate to the URL http://localhost:5006/sliders.
The code below will not execute in Jupyter as it is intended to run in a server environment.
In [ ]:
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import row, widgetbox
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import Slider, TextInput
from bokeh.plotting import figure
# Set up data
N = 200
x = np.linspace(0, 4*np.pi, N)
y = np.sin(x)
source = ColumnDataSource(data=dict(x=x, y=y))
# Set up plot
plot = figure(plot_height=400, plot_width=400, title="my sine wave",
              tools="crosshair,pan,reset,save,wheel_zoom",
              x_range=[0, 4*np.pi], y_range=[-2.5, 2.5])
plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)
# Set up widgets
text = TextInput(title="title", value='my sine wave')
offset = Slider(title="offset", value=0.0, start=-5.0, end=5.0, step=0.1)
amplitude = Slider(title="amplitude", value=1.0, start=-5.0, end=5.0, step=0.1)
phase = Slider(title="phase", value=0.0, start=0.0, end=2*np.pi)
freq = Slider(title="frequency", value=1.0, start=0.1, end=5.1, step=0.1)
# Set up callbacks
def update_title(attrname, old, new):
    plot.title.text = text.value

text.on_change('value', update_title)

def update_data(attrname, old, new):
    # Get the current slider values
    a = amplitude.value
    b = offset.value
    w = phase.value
    k = freq.value
    # Generate the new curve
    x = np.linspace(0, 4*np.pi, N)
    y = a*np.sin(k*x + w) + b
    source.data = dict(x=x, y=y)

for w in [offset, amplitude, phase, freq]:
    w.on_change('value', update_data)
# Set up layouts and add to document
inputs = widgetbox(text, offset, amplitude, phase, freq)
curdoc().add_root(row(inputs, plot, width=800))
curdoc().title = "Sliders"
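The server callbacks above all take three arguments: the name of the property that changed, its old value, and its new value. Here is a small sketch of that signature, which should run outside the server as well, assuming (as I believe is the case) that Bokeh fires `on_change` callbacks when a property is set from Python. The `record_change` function and the `seen` list are made up for this illustration:

```python
from bokeh.models import Slider

seen = []


def record_change(attrname, old, new):
    # Same (attrname, old, new) signature as update_data and update_title.
    seen.append((attrname, old, new))


amplitude = Slider(title="amplitude", value=1.0, start=-5.0, end=5.0, step=0.1)
amplitude.on_change('value', record_change)

# Simulate a user dragging the slider by setting the value from Python.
amplitude.value = 2.5
print(seen)
```

In the server example, the same mechanism is triggered by slider movements in the browser, which the server relays to the Python callbacks.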