Bokeh: An interactive approach of graphing data

Bokeh is a Python interactive visualization library that targets modern web browsers for presentation.

We illustrate the approach of graphing with Bokeh through 2 examples:

  • Japan's aging population: We've covered this data in class with Matplotlib. We hope it can become a bridge between the two packages and make a smooth transition. Basic interactions will be introduced in this example.
  • IMBD movies: This is an advanced example from Bokeh Gallery, which requires a comprehensive understanding of the package. It is challenging, but the result is amazing.

This IPython notebook was created by Zhiqi Guo, Yiran Zheng, Jiamin Zhang as final project for the NYU Stern course Data Bootcamp.

Preliminaries

First of all, let's follow our tradition in class and import packages. The following code is from IPython notebook: Data Bootcamp: Examples created by Professor Dave Backus, Chase Coleman, and Spencer Lyon for the NYU Stern course Data Bootcamp.

Setting

To ensure your Bokeh package is up to date, in the command line, execute "conda install Bokeh"


In [1]:
# import packages 
import pandas as pd                   # data management
import matplotlib.pyplot as plt       # graphics 
import matplotlib as mpl              # graphics parameters
import numpy as np                    # numerical calculations 

# IPython command, puts plots in notebook 
%matplotlib inline

# check Python version 
import datetime as dt 
import sys
print('Today is', dt.date.today())
print('What version of Python are we running? \n', sys.version, sep='')


Today is 2016-05-13
What version of Python are we running? 
3.5.1 |Anaconda 2.4.1 (64-bit)| (default, Feb 16 2016, 09:49:46) [MSC v.1900 64 bit (AMD64)]
C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\pandas\computation\__init__.py:19: UserWarning: The installed version of numexpr 2.4.4 is not supported in pandas and will be not be used

  UserWarning)

Example 1: Japan's aging population

Data from the UN's Population Division. Remember one of Professor's favorite quotes?

Last year, for the first time, sales of adult diapers in Japan exceeded those for babies. 

Now let's take a look at the data again.


In [2]:
url1 = 'http://esa.un.org/unpd/wpp/DVD/Files/'
url2 = '1_Indicators%20(Standard)/EXCEL_FILES/1_Population/'
url3 = 'WPP2015_POP_F07_1_POPULATION_BY_AGE_BOTH_SEXES.XLS'
url = url1 + url2 + url3 

cols = [2, 4, 5] + list(range(6,28))
#est = pd.read_excel(url, sheetname=0, skiprows=16, parse_cols=cols, na_values=['…'])
prj = pd.read_excel(url, sheetname=1, skiprows=16, parse_cols=cols, na_values=['…'])

"""
for later:  change cols for the two sources, rename 80+ to 80-84, then concat 
#pop = pd.concat([est, prj], axis=0, join='outer')      
"""
pop = prj 
pop.dtypes


Out[2]:
Major area, region, country or area *     object
Country code                               int64
Reference date (as of 1 July)              int64
0-4                                      float64
5-9                                      float64
10-14                                    float64
15-19                                    float64
20-24                                    float64
25-29                                    float64
30-34                                    float64
35-39                                    float64
40-44                                    float64
45-49                                    float64
50-54                                    float64
55-59                                    float64
60-64                                    float64
65-69                                    float64
70-74                                    float64
75-79                                    float64
80-84                                    float64
85-89                                    float64
90-94                                    float64
95-99                                    float64
100+                                     float64
dtype: object

In [3]:
# rename some variables 
pop = pop.rename(columns={'Reference date (as of 1 July)': 'Year', 
                          'Major area, region, country or area *': 'Country', 
                          'Country code': 'Code'})
# select Japan and years 
countries = ['Japan']
years     = [2015, 2025, 2035, 2045, 2055, 2065]
pop = pop[pop['Country'].isin(countries) & pop['Year'].isin(years)]
pop = pop.drop(['Country', 'Code'], axis=1)
pop = pop.set_index('Year').T
pop


Out[3]:
Year 2015 2025 2035 2045 2055 2065
0-4 5269.038 4872.732 4610.562 4448.702 4271.907 4098.930
5-9 5398.973 5086.975 4720.645 4529.657 4371.016 4183.472
10-14 5603.638 5275.897 4880.967 4619.658 4458.145 4280.722
15-19 5960.784 5425.060 5114.258 4748.906 4557.117 4395.831
20-24 6111.768 5665.781 5340.457 4947.499 4685.414 4517.465
25-29 6843.421 6033.351 5502.279 5194.574 4829.637 4630.632
30-34 7455.687 6166.461 5726.044 5404.633 5013.420 4745.904
35-39 8345.753 6868.725 6067.218 5541.785 5236.482 4869.093
40-44 9689.865 7446.336 6172.067 5739.054 5422.259 5031.587
45-49 8623.094 8289.040 6837.234 6050.330 5533.987 5232.930
50-54 7863.374 9556.244 7365.012 6120.001 5701.140 5393.482
55-59 7535.334 8425.830 8129.515 6728.225 5971.177 5474.018
60-64 8530.749 7577.456 9259.518 7171.861 5985.655 5595.983
65-69 9452.518 7111.085 8021.327 7796.329 6493.459 5795.188
70-74 7770.410 7810.061 7034.014 8693.253 6798.299 5722.098
75-79 6297.532 8212.023 6319.464 7260.858 7167.066 6048.305
80-84 4940.325 6065.908 6327.227 5876.803 7456.047 5963.869
85-89 3115.732 3985.173 5517.980 4469.537 5368.363 5511.094
90-94 1338.774 2137.536 2871.710 3242.964 3233.182 4377.365
95-99 366.082 712.019 1029.126 1597.432 1437.057 1904.730
100+ 60.630 115.971 216.863 338.945 462.541 518.644

Graphing with Matplotlib

  • Returns a static image
  • No web display
  • Hard to rescale
  • No interaction

In [4]:
fig, ax = plt.subplots()
pop[2015].plot(ax=ax,kind='line',alpha=0.5, sharey=True, figsize=(6,4))
ax.set_title('2015 Japanese population by age', fontsize=14, loc='left')


Out[4]:
<matplotlib.text.Text at 0x1e2c8775860>

Graphing with Bokeh

  • Approach #1: Leverge our command of Matplotlib to create a Bokeh graph in a HTML file.

In [5]:
from bokeh import mpl
from bokeh.plotting import output_file, show,figure


fig, ax = plt.subplots()
ax = pop[2015].plot(kind='line')#,alpha=0.5)#, sharey=True, figsize=(6,4))

ax.set_title('2015 Japanese population by age', fontsize=14, loc='left')

output_file('JPN.html') #Get a plot in HTML file

show(mpl.to_bokeh(fig))

Question. What's the difference between the plots generated by two pacakages?

Question.What is the function of each button on the control panel at top of the Bokeh plot?

Comment. From the plot we can see that leveraging other library may not be the best way to graphing in Bokeh, due to the compatibility problem. The reason why we don't use bar chart here is that bar in matplotlib is fully incompatble with Bokeh.


In [ ]:


In [ ]:

  • Approach #2: Generate a plot in a separate html file with Bokeh from scratch.

Let's make a simple dataframe to test out the functions first.


In [6]:
from bokeh.charts import Bar, output_file, show

def simple_bar():
    #Here we first set up a easy data frame to use
    #Best support is with data in a format that is table-like
    data = {
        'sample': ['A','B'],
        'value': [40,30]  
    }

    df = pd.DataFrame(data)

    # set up the title, x-axis and y-axis
    bar = Bar(df, 
              'sample', 
              values='value', 
              bar_width=0.4, #we can manipulate width of bar manually 
              title="Our first test bar chart")
    
    output_file("Simpe_test_bar.html")
    print(df) #Here we print out df to see the plot of bar chart plot with a dataframe
    show(bar)
    
simple_bar()


  sample  value
0      A     40
1      B     30

In [ ]:


In [ ]:

  • Approach #3: Create Bokeh charts integrated in IPython notebook.
    Notice the difference in importing package and calling function

In [7]:
from bokeh.charts import Bar, output_file, show
from bokeh.plotting import *    #Here the line from bokeh.plotting import * implicitly pulls 
                                #the output_notebook function into the namespace.


#Here we first set up a easy data frame to use
#Best support is with data in a format that is table-like
data =  {
        'sample': ['A','B'],
        'value': [40,30]  
        }

df = pd.DataFrame(data)

# set up the title, x-axis and y-axis
bar = Bar(df, 
          'sample', 
          values='value', 
          bar_width=0.4,
          title="Our First Test Bar Chart"
         #,tools='crosshair'
         )

output_notebook()#Here,instead of calling output_file(),
                 #we call output_notebook() to directly display plot in notebook
show(bar)


Loading BokehJS ...
Out[7]:

<Bokeh Notebook handle for In[30]>

Exercise. Uncomment the "tools='crosshair'" attribute of the bar plot to see what happens. Find out more tools that we can use.


In [ ]:

  • Apply the bar chart on Japan's example

In [8]:
#The function barplot will give the population's bar plot for the year we choose
#We can choose the year in 2015,2025,2035,2045,2055,2065

from bokeh.charts import Bar, output_file, show
from bokeh.charts.attributes import CatAttr
def barplot(choose_year):
    
    population = pop[int(choose_year)].tolist()
    year = list(pop.index)
    data = {
        'year': year,
        'population':population 
    }
    df = pd.DataFrame(data)

    bar = Bar(df, 
              label=CatAttr(columns=["year"], sort=False), #Caution:we have to manually turn off the bar sorting
                                                           #or the bar plot desn't follow 
              values='population', 
              ylabel="Population(thousands)",
              title="Japan's populaiton in " + str(choose_year)+" by age", color="red")
    output_file("Japan's populaiton in " + str(choose_year))
    
    
    show(bar) 
    return bar

#Try several years and see how it works
barplot(2015)    
#barplot(2025)
#barplot(2035)


Out[8]:
<bokeh.charts.chart.Chart at 0x1e2c8784f60>

In [ ]:


In [ ]:

Adding interactions

  • Tab panes: Tab panes allow multiple plots or layouts to be show in selectable tabs.

In [9]:
from bokeh.models.widgets import Panel, Tabs
from bokeh.io import output_file, show
from bokeh.plotting import figure

output_file("tab_panes.html", mode='cdn')

p1 = figure(plot_width=300, plot_height=300)
p1 = barplot(2015)
tab1 = Panel(child=p1, title="Japan's Populaiton for 2015") #tab1 for year 2015

p2 = figure(plot_width=300, plot_height=300)
p2 = barplot(2025)
tab2 = Panel(child=p2, title="Japan's Populaiton for 2025") #tab2 for year 2025

p2 = figure(plot_width=300, plot_height=300)
p2 = barplot(2035)
tab3 = Panel(child=p2, title="Japan's Populaiton for 2035") #tab3 for year 2035


tabs = Tabs(tabs=[ tab1, tab2, tab3 ])  # create different tabs

show(tabs)


ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
Out[9]:

<Bokeh Notebook handle for In[30]>


In [ ]:


In [ ]:

Example 2: IMDB Movies

This is a challenging example which involves more widgets that we can play with. Some of the widgets requires a CustomJS callback. If you have a nice command of javascript, then go for it. If you don't, just skip it. These interactions can also be done by using Bokeh Server. We didn't cover it here, but if you are interested in it, you can get tutorials on Bokeh's website.

Import packages and load our data.

Remember to download the csv and put it in the same folder as this IPython notebook.


In [10]:
import numpy as np


from bokeh.plotting import Figure
from bokeh.models import ColumnDataSource, HoverTool, HBox, VBoxForm
from bokeh.models.widgets import Slider, Select, TextInput



movies = pd.read_csv("movies.csv")

In [11]:
movies


Out[11]:
Unnamed: 0 ID imdbID Title Year mpaaRating Runtime Genre Released Director ... Fresh Rotten userMeter userRating userReviews BoxOffice Production color alpha revenue
0 0 4972 tt0004972 The Birth of a Nation 1915 Not Rated 165.0 Drama, History, Romance 1915-03-03 D.W. Griffith ... 38 0 58.0 3.3 4034.0 0.0 Gravitas grey 0.25 0
1 1 6206 tt0006206 Les vampires 1915 Not Rated 399.0 Action, Adventure, Crime 1915-11-13 Louis Feuillade ... 13 0 85.0 3.8 2075.0 0.0 0 grey 0.25 0
2 2 6864 tt0006864 Intolerance: Love's Struggle Throughout the Ages 1916 Not Rated 197.0 Drama, History 1916-09-05 D.W. Griffith ... 27 1 78.0 3.8 4604.0 0.0 Cohen Media Group grey 0.25 0
3 3 9470 tt0009470 Over the Top 1918 0 0.0 Drama, War 1918-03-31 Wilfrid North ... 4 7 51.0 2.8 44707.0 0.0 NaN grey 0.25 0
4 4 9968 tt0009968 Broken Blossoms or The Yellow Man and the Girl 1919 Not Rated 90.0 Drama, Romance 1919-10-20 D.W. Griffith ... 19 1 72.0 3.7 3651.0 0.0 Kino on Video grey 0.25 0
5 5 10323 tt0010323 The Cabinet of Dr. Caligari 1920 Unrated 67.0 Horror 1921-03-19 Robert Wiene ... 37 0 90.0 4.1 25103.0 0.0 0 grey 0.25 0
6 6 11130 tt0011130 Dr. Jekyll and Mr. Hyde 1920 Unrated 49.0 Drama, Horror, Sci-Fi 1920-04-01 John S. Robertson ... 12 1 67.0 3.4 2878.0 0.0 0 grey 0.25 0
7 7 11841 tt0011841 Way Down East 1920 Not Rated 145.0 Drama, Romance 1920-09-03 D.W. Griffith ... 15 1 69.0 3.6 984.0 0.0 Kino Lorber grey 0.25 0
8 8 12349 tt0012349 The Kid 1921 Not Rated 68.0 Comedy, Drama, Family 1921-02-06 Charles Chaplin ... 19 0 96.0 4.2 14440.0 0.0 First National Pictures Inc. grey 0.25 0
9 9 12532 tt0012532 Orphans of the Storm 1921 Not Rated 150.0 Drama 0 D.W. Griffith ... 9 1 71.0 3.8 410.0 0.0 0 grey 0.25 0
10 10 12920 tt0012920 Barb Wire 1922 0 0.0 Western 1922-06-01 Francis J. Grandon ... 10 26 15.0 1.8 47002.0 0.0 PolyGram Video grey 0.25 0
11 11 12938 tt0012938 Beyond the Rocks 1922 TV-PG 80.0 Drama 1922-05-07 Sam Wood ... 10 0 66.0 3.6 299.0 0.0 Paramount Pictures grey 0.25 0
12 12 13086 tt0013086 Dr. Mabuse: The Gambler 1922 Not Rated 242.0 Crime, Mystery, Thriller 1922-05-26 Fritz Lang ... 12 1 89.0 4.1 2169.0 0.0 0 grey 0.25 0
13 13 13257 tt0013257 Häxan: Witchcraft Through the Ages 1922 Not Rated 91.0 Horror 1929-05-27 Benjamin Christensen ... 14 2 81.0 3.9 4080.0 0.0 International Telefilm Enterprises grey 0.25 0
14 14 13427 tt0013427 Nanook of the North 1922 Not Rated 79.0 Documentary 1922-06-11 Robert J. Flaherty ... 22 0 80.0 3.8 4684.0 0.0 0 grey 0.25 0
15 15 13442 tt0013442 Nosferatu 1922 Unrated 81.0 Horror 1929-06-03 F.W. Murnau ... 58 2 88.0 4.0 46065.0 0.0 Film Arts Guild grey 0.25 0
16 16 14142 tt0014142 The Hunchback of Notre Dame 1923 Unrated 133.0 Drama, Romance 1923-09-06 Wallace Worsley ... 11 0 75.0 3.6 2858.0 0.0 Gravitas grey 0.25 0
17 17 14341 tt0014341 Our Hospitality 1923 Not Rated 65.0 Comedy, Family 1923-11-19 John G. Blystone, Buster Keaton ... 17 0 90.0 4.1 3271.0 0.0 0 grey 0.25 0
18 18 14429 tt0014429 Safety Last! 1923 Not Rated 70.0 Comedy, Romance, Thriller 1923-04-01 Fred C. Newmeyer, Sam Taylor ... 24 1 93.0 4.2 3393.0 0.0 Criterion Collection grey 0.25 0
19 19 14624 tt0014624 A Woman of Paris: A Drama of Fate 1923 TV-PG 78.0 Drama, Romance 1924-02-25 Charles Chaplin ... 10 1 81.0 3.8 853.0 0.0 Criterion Collection grey 0.25 0
20 20 14759 tt0014759 Captain Blood 1924 0 110.0 Action, Adventure, Romance 1924-09-21 David Smith, Albert E. Smith ... 24 0 89.0 4.0 7982.0 0.0 Twentieth Century Fox Home Entertainment grey 0.25 0
21 21 15064 tt0015064 The Last Laugh 1924 Not Rated 90.0 Drama 1925-01-05 F.W. Murnau ... 25 0 88.0 4.1 3882.0 0.0 0 grey 0.25 0
22 22 15163 tt0015163 The Navigator 1924 Unrated 59.0 Action, Comedy 1924-10-13 Donald Crisp, Buster Keaton ... 12 0 89.0 4.0 3082.0 0.0 MGM grey 0.25 0
23 23 15400 tt0015400 The Thief of Bagdad 1924 Approved 155.0 Adventure, Family, Fantasy 1924-03-23 Raoul Walsh ... 19 1 81.0 3.7 2167.0 0.0 United Artists grey 0.25 0
24 24 15648 tt0015648 Battleship Potemkin 1925 Unrated 66.0 Drama, History 1925-12-24 Sergei M. Eisenstein ... 44 0 86.0 4.0 17876.0 51000.0 Kino International grey 0.25 51,000
25 25 15864 tt0015864 The Gold Rush 1925 Not Rated 95.0 Adventure, Comedy, Family 0 Charles Chaplin ... 42 0 93.0 4.1 19498.0 0.0 Janus Films grey 0.25 0
26 26 15881 tt0015881 Greed 1924 Not Rated 140.0 Drama 1925-01-26 Erich von Stroheim ... 16 0 91.0 4.3 2366.0 0.0 Warner Home Video grey 0.25 0
27 27 16220 tt0016220 The Phantom of the Opera 1925 Unrated 93.0 Horror, Thriller 1925-11-15 Rupert Julian, Lon Chaney, Ernst Laemmle, Edwa... ... 37 4 84.0 3.8 16986.0 0.0 Universal Pictures grey 0.25 0
28 28 16308 tt0016308 Sally of the Sawdust 1925 0 104.0 Comedy 1925-08-02 D.W. Griffith ... 8 2 54.0 3.4 140.0 0.0 Paramount Pictures grey 0.25 0
29 29 16332 tt0016332 Seven Chances 1925 Not Rated 56.0 Comedy, Family, Romance 1925-03-11 Buster Keaton ... 14 0 91.0 4.1 1809.0 0.0 MGM grey 0.25 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
12539 12539 2830416 tt2830416 Hawking 2013 TV-PG 94.0 Documentary 2013-09-20 Stephen Finnigan ... 22 4 68.0 3.6 149.0 0.0 IFC Films grey 0.25 0
12540 12540 2852376 tt2852376 Heli 2013 0 105.0 Crime, Drama 2013-08-09 Amat Escalante ... 7 4 0.0 0.0 246.0 0.0 0 grey 0.25 0
12541 12541 2852394 tt2852394 Grigris 2013 0 101.0 Drama 2013-07-10 Mahamat-Saleh Haroun ... 5 5 44.0 0.0 35.0 0.0 Les Films du Losange grey 0.25 0
12542 12542 2852400 tt2852400 A Touch of Sin 2013 0 133.0 Drama 2013-11-21 Zhangke Jia ... 47 4 79.0 3.8 1089.0 99700.0 Kino Lorber grey 0.25 99,700
12543 12543 2852406 tt2852406 Omar 2013 0 96.0 Drama, Thriller 2013-10-16 Hany Abu-Assad ... 58 7 84.0 4.0 1678.0 0.0 Adopt Films grey 0.25 0
12544 12544 2852458 tt2852458 Stranger by the Lake 2013 0 97.0 Drama, Thriller 2013-06-12 Alain Guiraudie ... 74 4 70.0 3.6 3076.0 300000.0 Strand Releasing grey 0.25 300,000
12545 12545 2852470 tt2852470 The Missing Picture 2013 0 92.0 Documentary 2014-03-19 Rithy Panh ... 56 1 73.0 3.7 1811.0 12800.0 Strand Releasing grey 0.25 12,800
12546 12546 2870708 tt2870708 Wish I Was Here 2014 0 120.0 Comedy, Drama 2014-07-25 Zach Braff ... 4 7 92.0 0.0 231.0 0.0 0 grey 0.25 0
12547 12547 2884206 tt2884206 I Origins 2014 0 113.0 Drama, Sci-Fi 2014-01-18 Mike Cahill ... 5 6 0.0 0.0 69.0 0.0 0 grey 0.25 0
12548 12548 2893134 tt2893134 Herblock: The Black & the White 2013 0 95.0 Documentary 2013-08-16 Michael Stevens ... 11 1 64.0 3.5 72.0 0.0 0 grey 0.25 0
12549 12549 2936174 tt2936174 Visitors 2013 Not Rated 87.0 Documentary 2014-01-24 Godfrey Reggio ... 27 15 54.0 3.3 805.0 79300.0 Cinedigm Digital Cinema grey 0.25 79,300
12550 12550 2942522 tt2942522 Moebius 2013 0 89.0 Drama 2013-09-05 Ki-duk Kim ... 9 2 0.0 0.0 88.0 0.0 Independent grey 0.25 0
12551 12551 2948266 tt2948266 Dealin' with Idiots 2013 Not Rated 0.0 Comedy 2013-07-12 Jeff Garlin ... 7 12 34.0 2.8 337.0 15300.0 IFC Films grey 0.25 15,300
12552 12552 2954776 tt2954776 First Comes Love 2013 0 105.0 Documentary 2013-05-03 Nina Davenport ... 4 6 67.0 3.5 72.0 0.0 HBO Films grey 0.25 0
12553 12553 3060670 tt3060670 The New Black 2013 0 80.0 Documentary 0 Yoruba Richen ... 8 2 75.0 4.1 117.0 0.0 0 grey 0.25 0
12554 12554 3063516 tt3063516 Jackass Presents: Bad Grandpa 2013 R 92.0 Comedy 2013-10-25 Jeff Tremaine ... 63 42 63.0 3.6 93789.0 102000000.0 Paramount Pictures grey 0.25 102,000,000
12555 12555 3074694 tt3074694 Flowers in the Attic 2014 TV-14 86.0 Drama, Mystery, Thriller 2014-01-18 Deborah Chow ... 9 10 0.0 0.0 246.0 0.0 0 grey 0.25 0
12556 12556 3089388 tt3089388 Tim's Vermeer 2013 PG-13 80.0 Documentary 2013-10-03 Teller ... 83 6 83.0 4.0 3875.0 500000.0 Sony Pictures Classics grey 0.25 500,000
12557 12557 3091552 tt3091552 At Berkeley 2013 0 244.0 Documentary 2014-02-26 Frederick Wiseman ... 35 5 74.0 4.0 283.0 5100.0 Zipporah grey 0.25 5,100
12558 12558 3101474 tt3101474 InRealLife 2013 15A 90.0 Documentary, News 2013-09-20 Beeban Kidron ... 6 6 0.0 0.0 29.0 0.0 0 grey 0.25 0
12559 12559 3118958 tt3118958 Lizzie Borden Took an Ax 2014 TV-14 91.0 Drama, Mystery, Thriller 2014-01-25 Nick Gomez ... 5 6 0.0 0.0 97.0 0.0 0 grey 0.25 0
12560 12560 3119416 tt3119416 Stray Dogs 2013 0 138.0 Drama 2014-03-12 Ming-liang Tsai ... 10 1 0.0 0.0 80.0 0.0 Homegreen Films grey 0.25 0
12561 12561 3137552 tt3137552 Hank: 5 Years from the Brink 2013 0 85.0 Documentary, Biography, News 2013-09-01 Joe Berlinger ... 7 4 59.0 3.6 59.0 0.0 Radical Media grey 0.25 0
12562 12562 3144098 tt3144098 Commitment 2013 Not Rated 113.0 Action, Drama 2013-12-06 Hong-soo Park ... 4 6 67.0 3.6 79.0 0.0 Well Go USA grey 0.25 0
12563 12563 3148890 tt3148890 Embrace of the Vampire 2013 Unrated 91.0 Horror 2013-10-15 Carl Bessai ... 1 9 35.0 2.7 10432.0 0.0 New Line Home Entertainment grey 0.25 0
12564 12564 3210686 tt3210686 Son of God 2014 PG-13 138.0 Drama 2014-02-28 Christopher Spencer ... 13 46 78.0 4.1 31337.0 55700000.0 20th Century Fox grey 0.25 55,700,000
12565 12565 3228928 tt3228928 The Prime Ministers: The Pioneers 2013 0 0.0 Documentary 2013-10-18 Richard Trank ... 3 7 50.0 3.8 83.0 0.0 Moriah Films grey 0.25 0
12566 12566 3234082 tt3234082 12-12-12 2013 R 105.0 Documentary 2013-11-01 Amir Bar-Lev ... 9 5 33.0 3.0 48.0 0.0 The Weinstein Company grey 0.25 0
12567 12567 3263614 tt3263614 Kumiko, the Treasure Hunter 2014 0 105.0 Drama 2014-01-20 David Zellner ... 13 0 0.0 0.0 57.0 0.0 0 grey 0.25 0
12568 12568 3404140 tt3404140 The Attorney 2013 0 127.0 Drama 2014-02-07 Woo-seok Yang ... 7 3 87.0 4.4 507.0 600000.0 Well Go USA grey 0.25 600,000

12569 rows × 30 columns


In [ ]:

Plot with Basic Glyphs


In [12]:
from bokeh.plotting import figure, output_notebook, show

output_notebook()   

p = figure(plot_width=400, plot_height=400)

# add a circle renderer with a size, color, and alpha
p.quad(movies["Meter"], movies["Reviews"], color="navy", alpha=0.5)

# show the results
show(p)


Loading BokehJS ...
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
Out[12]:

<Bokeh Notebook handle for In[30]>

Excersice. Try some other methods (glyphs) such as asterisk, dimond, circle_cross etc.

Comment. See more methods of plotting on Bokeh Reference Guide.

Slider

The Bokeh [slider] is a Bokeh widget which(http://bokeh.pydata.org/en/0.10.0/docs/user_guide/interaction.html#slider) can be configured with start and end values, a step size, an initial value and a title.

Before we make a basic slider, we should import some packages first.


In [13]:
from bokeh.models.widgets import Slider
from bokeh.io import vform

In [14]:
#Lets try make a slider of the reviews
reviews = Slider(title="Minimum number of reviews", 
                 value=80, #initial value when the slider is generated
                 start=10,
                 end=300,
                 step=10)

show(vform(reviews))


ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
Out[14]:

<Bokeh Notebook handle for In[30]>


In [15]:
movies['Year'].describe()


Out[15]:
count    12569.000000
mean      1996.328666
std         17.690950
min       1902.000000
25%       1991.000000
50%       2002.000000
75%       2008.000000
max       2014.000000
Name: Year, dtype: float64

In [16]:
#Try with year
reviews = Slider(title="Year of release", value=1950, start=1902, end=2014, step=1)

show(vform(reviews))


ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
Out[16]:

<Bokeh Notebook handle for In[30]>

CustomJS of Widgets

If we want to use the slider to change the data of a plot, we have to use the CustomJS of Widgets.

We encounter JavaScript Callback here. Callbacks allow us to write javascript pieces and included in our python project, in order to trigger sophisticated interactions. Therefore people can play with the graphic a little bit more.


In [17]:
#Import the CustomJS of Widgets package
from bokeh.models import CustomJS, Range1d

In [18]:
# create a column data source for the plots to share
x = movies["Meter"]
y = movies["Reviews"]
source = ColumnDataSource(data=dict(x=x, y=y))
all_data = ColumnDataSource(data=dict(x=x, y=y))

# create a new figure
p = figure(plot_width=400, plot_height=400)
p.circle(x, y, size=5, color="navy", alpha=0.5, source=source)
p.set(y_range=Range1d(0, 310), x_range=Range1d(-5, 105))

In [19]:
#callback

callback = CustomJS(args=dict(source=source, all_data=all_data), code="""
        var data = source.get('data');
        var all_data = all_data.get('data');
        var f = cb_obj.get('value');
        x = all_data['x'];
        y = all_data['y'];
        data['y'] = [];
        data['x'] = [];
        for (i=0; i < y.length; i++){
            if (y[i]>f) {
                data['y'].push(y[i]);
                data['x'].push(x[i]);
            }
             
        }     
         source.trigger('change');
    """)

In [20]:
reviews = Slider(
    title="Minimum number of reviews", 
    value=50, start=10, end=305, step=5,
    callback=callback)

In [21]:
layout = vform(reviews, p)

show(layout)


ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1004 (BOTH_CHILD_AND_ROOT): Models should not be a document root if they are in a layout box: Figure, ViewModel:Plot, ref _id: 99a0b8a3-70c9-405e-b521-3df3f5d046f2
Out[21]:

<Bokeh Notebook handle for In[30]>

Hover

The Bokeh hover is a passive inspector tool, which displays informational tooltips whenever the cursor is directly over a glyph. The data to show comes from the glyph’s data source, and what is to be displayed is configurable through a tooltips attribute that maps display names to columns in the data source, or to special known variables.


In [22]:
hover = HoverTool(tooltips = [
        ("$", "@revenue"),
        ("Title","@title"),
        ("Year", "@year")    #Start with “@” , interpreted as columns on the data source.
    ])

In [23]:
%reset p


Once deleted, variables cannot be recovered. Proceed (y/[n])? y
Don't know how to reset  p, please run `%reset?` for details

In [24]:
#add revenue to the column data source that will be used by the plot
source = ColumnDataSource(data=dict(x=x, y=y, revenue=movies["revenue"], title=movies["Title"], year=movies["Year"]))

p = figure(plot_width=400, plot_height=400,tools=[hover])
p.circle(x, y, size=5, color="navy", alpha=0.5, source=source)
p.set(y_range=Range1d(0, 310), x_range=Range1d(-5, 105))

In [25]:
show(p)


ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 6eda24e2-2530-42fb-944d-e9fa8b67dfcc
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: 3898b7fa-50ab-4815-9f1b-2dd48897dc67
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure, ViewModel:Plot, ref _id: e58aa9b4-f413-40a4-a96f-67940c9bee25
ERROR:C:\Users\Jamiezhang221\Anaconda3\lib\site-packages\bokeh\core\validation\check.py:W-1004 (BOTH_CHILD_AND_ROOT): Models should not be a document root if they are in a layout box: Figure, ViewModel:Plot, ref _id: 99a0b8a3-70c9-405e-b521-3df3f5d046f2
Out[25]:

<Bokeh Notebook handle for In[30]>


In [26]:
%reset p


Once deleted, variables cannot be recovered. Proceed (y/[n])? y
Don't know how to reset  p, please run `%reset?` for details