This is a bqplot recreation of Mike Bostock's Wealth of Nations. This was also done by Gapminder. It is originally based on a TED Talk by Hans Rosling.


In [ ]:
import pandas as pd
import numpy as np
import os

from bqplot import (
    LogScale, LinearScale, OrdinalColorScale, ColorAxis,
    Axis, Scatter, Lines, CATEGORY10, Label, Figure, Tooltip
)

from ipywidgets import HBox, VBox, IntSlider, Play, jslink

In [ ]:
initial_year = 1800

Cleaning and Formatting JSON Data


In [ ]:
data = pd.read_json(os.path.abspath('../data_files/nations.json'))

In [ ]:
def clean_data(data):
    for column in ['income', 'lifeExpectancy', 'population']:
        data = data.drop(data[data[column].apply(len) <= 4].index)
    return data

def extrap_interp(data):
    data = np.array(data)
    x_range = np.arange(1800, 2009, 1.)
    y_range = np.interp(x_range, data[:, 0], data[:, 1])
    return y_range

def extrap_data(data):
    for column in ['income', 'lifeExpectancy', 'population']:
        data[column] = data[column].apply(extrap_interp)
    return data

In [ ]:
data = clean_data(data)
data = extrap_data(data)

In [ ]:
income_min, income_max = np.min(data['income'].apply(np.min)), np.max(data['income'].apply(np.max))
life_exp_min, life_exp_max = np.min(data['lifeExpectancy'].apply(np.min)), np.max(data['lifeExpectancy'].apply(np.max))
pop_min, pop_max = np.min(data['population'].apply(np.min)), np.max(data['population'].apply(np.max))

In [ ]:
def get_data(year):
    year_index = year - 1800
    income = data['income'].apply(lambda x: x[year_index])
    life_exp = data['lifeExpectancy'].apply(lambda x: x[year_index])
    pop =  data['population'].apply(lambda x: x[year_index])
    return income, life_exp, pop

Creating the Tooltip to display the required fields

bqplot's native Tooltip allows us to simply display the data fields we require on a mouse-interaction.


In [ ]:
tt = Tooltip(fields=['name', 'x', 'y'], labels=['Country Name', 'Income per Capita', 'Life Expectancy'])

Creating the Label to display the year

Staying true to the d3 recreation of the talk, we place a Label widget in the bottom-right of the Figure (it inherits the Figure co-ordinates when no scale is passed to it). With enable_move set to True, the Label can be dragged around.


In [ ]:
year_label = Label(x=[0.75], y=[0.10], default_size=46, font_weight='bolder', colors=['orange'],
                   text=[str(initial_year)], enable_move=True)

Defining Axes and Scales

The inherent skewness of the income data favors the use of a LogScale. Also, since the color coding by regions does not follow an ordering, we use the OrdinalColorScale.


In [ ]:
x_sc = LogScale(min=min(200, income_min), max=income_max)
y_sc = LinearScale(min=life_exp_min, max=life_exp_max)
c_sc = OrdinalColorScale(domain=data['region'].unique().tolist(), colors=CATEGORY10[:6])
size_sc = LinearScale(min=pop_min, max=pop_max)

In [ ]:
ax_y = Axis(label='Life Expectancy', scale=y_sc, orientation='vertical', side='left', grid_lines='solid')

ticks = [2, 4, 6, 8, 10]
income_ticks = [t*100 for t in ticks] + [t*1000 for t in ticks] + [t*10000 for t in ticks]
ax_x = Axis(label='Income per Capita', scale=x_sc, grid_lines='solid', tick_format='~s', tick_values=income_ticks)

Creating the Scatter Mark with the appropriate size and color parameters passed

To generate the appropriate graph, we need to pass the population of the country to the size attribute and its region to the color attribute.


In [ ]:
# Start with the first year's data
cap_income, life_exp, pop = get_data(initial_year)

In [ ]:
wealth_scat = Scatter(x=cap_income, y=life_exp, color=data['region'], size=pop,
                      names=data['name'], display_names=False,
                      scales={'x': x_sc, 'y': y_sc, 'color': c_sc, 'size': size_sc},
                      default_size=4112, tooltip=tt, animate=True, stroke='Black',
                      unhovered_style={'opacity': 0.5})

In [ ]:
nation_line = Lines(x=data['income'][0], y=data['lifeExpectancy'][0], colors=['Gray'],
                       scales={'x': x_sc, 'y': y_sc}, visible=False)

Creating the Figure


In [ ]:
time_interval = 10

In [ ]:
fig = Figure(marks=[wealth_scat, year_label, nation_line], axes=[ax_x, ax_y],
             title='Health and Wealth of Nations', animation_duration=time_interval)

Using a Slider to allow the user to change the year and a button for animation

Here we see how we can seamlessly integrate bqplot into the jupyter widget infrastructure.


In [ ]:
year_slider = IntSlider(min=1800, max=2008, step=1, description='Year', value=initial_year)

When the hovered_point of the Scatter plot is changed (i.e. when the user hovers over a different element), the entire path of that country is displayed by making the Lines object visible and setting it's x and y attributes.


In [ ]:
def hover_changed(change):
    if change.new is not None:
        nation_line.x = data[data['name'] == wealth_scat.names[change.new]]['income'].values[0]
        nation_line.y = data[data['name'] == wealth_scat.names[change.new]]['lifeExpectancy'].values[0]
        nation_line.visible = True
    else:
        nation_line.visible = False
        
wealth_scat.observe(hover_changed, 'hovered_point')

On the slider value callback (a function that is triggered everytime the value of the slider is changed) we change the x, y and size co-ordinates of the Scatter. We also update the text of the Label to reflect the current year.


In [ ]:
def year_changed(change):
    wealth_scat.x, wealth_scat.y, wealth_scat.size = get_data(year_slider.value)
    year_label.text = [str(year_slider.value)]

year_slider.observe(year_changed, 'value')

Add an animation button


In [ ]:
play_button = Play(min=1800, max=2008, interval=time_interval)
jslink((play_button, 'value'), (year_slider, 'value'))

Displaying the GUI


In [ ]:
VBox([HBox([play_button, year_slider]), fig])