In fewer than 50 lines of code.
How easy would it be to re-create a bar chart race in Python using Jupyter and Matplotlib?
It turns out that in fewer than 50 lines of code you can re-create a reasonably reusable bar chart race in Python with Matplotlib.
In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.animation as animation
from IPython.display import HTML
Read the city populations dataset with pandas.
We only need 4 columns to work with: 'name', 'group', 'year', and 'value'.
Typically, a name is mapped to a group and each year has one value.
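To make that shape concrete, here is a small sketch that checks both properties on a toy frame. The rows are made-up placeholders ('City A', 'City B' and their values are not taken from the dataset); the same two checks could be run on df after loading it.

```python
import pandas as pd

# Hypothetical rows in the same long format; names and values are invented.
sample = pd.DataFrame({
    'name':  ['City A', 'City A', 'City B', 'City B'],
    'group': ['Asia', 'Asia', 'Europe', 'Europe'],
    'year':  [2017, 2018, 2017, 2018],
    'value': [100.0, 110.0, 90.0, 95.0],
})

# Each name should belong to exactly one group.
groups_per_name = sample.groupby('name')['group'].nunique()
one_group_per_name = bool((groups_per_name == 1).all())

# Each (name, year) pair should appear only once, i.e. one value per year.
one_value_per_year = not sample.duplicated(subset=['name', 'year']).any()

print(one_group_per_name, one_value_per_year)
```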
In [2]:
url = 'https://gist.githubusercontent.com/johnburnmurdoch/4199dbe55095c3e13de8d5b2e5e5307a/raw/fa018b25c24b7b5f47fd0568937ff6c04e384786/city_populations'
df = pd.read_csv(url, usecols=['name', 'group', 'year', 'value'])
df.head(3)
Out[2]:
In [3]:
colors = dict(zip(
["India", "Europe", "Asia", "Latin America", "Middle East", "North America", "Africa"],
["#adb0ff", "#ffb3ff", "#90d595", "#e48381", "#aafbff", "#f7bb5f", "#eafb50"]
))
group_lk = df.set_index('name')['group'].to_dict()
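The two dictionaries above are chained when drawing: a city name is looked up in group_lk to get its group, and that group is looked up in colors. A minimal sketch of the chain, using separate demo_ names so it does not overwrite the notebook's variables; the two-city lookup and the grey fallback color are assumptions for illustration only.

```python
# Two of the group colors from the cell above.
demo_colors = {'Asia': '#90d595', 'Europe': '#ffb3ff'}

# Hypothetical stand-in for group_lk (the real one is built from the dataset).
demo_group_lk = {'Tokyo': 'Asia', 'London': 'Europe'}

# name -> group -> color, in the order the bars would be drawn.
names = ['London', 'Tokyo']
bar_colors = [demo_colors[demo_group_lk[name]] for name in names]

# Plain indexing raises KeyError for an unknown name or group.
# A fallback color (an assumption, not used in the notebook) avoids that.
fallback = demo_colors.get(demo_group_lk.get('Atlantis'), '#cccccc')

print(bar_colors, fallback)
```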
Run the cell below: draw_barchart(2018) draws the bar chart for year=2018.
In [4]:
fig, ax = plt.subplots(figsize=(15, 8))
def draw_barchart(current_year):
    # Top 10 cities for the year, sorted ascending so the largest bar is drawn at the top
    dff = df[df['year'].eq(current_year)].sort_values(by='value', ascending=True).tail(10)
    ax.clear()
    ax.barh(dff['name'], dff['value'], color=[colors[group_lk[x]] for x in dff['name']])
    dx = dff['value'].max() / 200  # small horizontal offset for the labels
    # Label each bar with the city name, its group, and the formatted value
    for i, (value, name) in enumerate(zip(dff['value'], dff['name'])):
        ax.text(value-dx, i, name, size=14, weight=600, ha='right', va='bottom')
        ax.text(value-dx, i-.25, group_lk[name], size=10, color='#444444', ha='right', va='baseline')
        ax.text(value+dx, i, f'{value:,.0f}', size=14, ha='left', va='center')
    # Large year label and axis caption, positioned in axes coordinates
    ax.text(1, 0.4, current_year, transform=ax.transAxes, color='#777777', size=46, ha='right', weight=800)
    ax.text(0, 1.06, 'Population (thousands)', transform=ax.transAxes, size=12, color='#777777')
    # Axis styling: thousands separators, ticks on top, no y ticks, light grid behind the bars
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
    ax.xaxis.set_ticks_position('top')
    ax.tick_params(axis='x', colors='#777777', labelsize=12)
    ax.set_yticks([])
    ax.margins(0, 0.01)
    ax.grid(which='major', axis='x', linestyle='-')
    ax.set_axisbelow(True)
    # Title and credits
    ax.text(0, 1.15, 'The most populous cities in the world from 1500 to 2018',
            transform=ax.transAxes, size=24, weight=600, ha='left', va='top')
    ax.text(1, 0, 'by @pratapvardhan; credit @jburnmurdoch', transform=ax.transAxes, color='#777777', ha='right',
            bbox=dict(facecolor='white', alpha=0.8, edgecolor='white'))
    plt.box(False)
draw_barchart(2018)
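One detail in draw_barchart is worth spelling out. The rows are sorted ascending before tail(10) is taken, because barh places the first row at the bottom of the axes, so the largest value ends up as the top bar. A sketch of that selection on made-up numbers (the names and values are placeholders, and it keeps the top 3 rather than the top 10 for brevity):

```python
import pandas as pd

# Hypothetical data for a single year; values are invented.
demo = pd.DataFrame({
    'name':  ['A', 'B', 'C', 'D', 'E'],
    'year':  [2018] * 5,
    'value': [30, 10, 50, 20, 40],
})

# Same pattern as in draw_barchart: filter the year, sort ascending, keep the tail.
top = (demo[demo['year'].eq(2018)]
       .sort_values(by='value', ascending=True)
       .tail(3))

# Rows come out smallest-first, so the last row (largest value) is drawn on top.
top_names = top['name'].tolist()
print(top_names)
```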
To animate, we will use FuncAnimation from matplotlib.animation.
FuncAnimation makes an animation by repeatedly calling a function that draws on the canvas.
In our case, that function is draw_barchart.
The frames argument specifies the values that are passed to draw_barchart, one per frame;
we'll run from year 1900 to 2018.
Run the cell below.
In [ ]:
fig, ax = plt.subplots(figsize=(15, 8))
animator = animation.FuncAnimation(fig, draw_barchart, frames=range(1900, 2019))
HTML(animator.to_jshtml())
# or use animator.to_html5_video() or animator.save()
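To see the calling pattern in isolation, here is a minimal sketch in which each item of frames is passed as the first argument to the update function. The update function draw_frame, the demo_ names, and the headless Agg backend are assumptions made so the example runs outside Jupyter; they are not part of the notebook.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import matplotlib.animation as animation

demo_fig, demo_ax = plt.subplots()
seen_frames = []  # records the value received on every call

def draw_frame(frame):
    """Hypothetical update function: redraw a single bar whose length is the frame value."""
    seen_frames.append(frame)
    demo_ax.clear()
    demo_ax.barh(['demo'], [frame])

demo_anim = animation.FuncAnimation(demo_fig, draw_frame, frames=range(1, 4), repeat=False)

# Rendering the animation is what triggers the calls to draw_frame.
html = demo_anim.to_jshtml()
plt.close(demo_fig)

print(sorted(set(seen_frames)))
```

The first frame may be drawn more than once (Matplotlib can use it for an initial draw), so the sketch looks at the set of values received rather than the exact call count.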
In [5]:
with plt.xkcd():
    fig, ax = plt.subplots(figsize=(15, 8))
    draw_barchart(2018)
In [6]:
current_year = 2018
dff = df[df['year'].eq(current_year)].sort_values(by='value', ascending=False).head(10)
dff
Out[6]:
In [7]:
fig, ax = plt.subplots(figsize=(15, 8))
ax.barh(dff['name'], dff['value'])
Out[7]:
In [8]:
fig, ax = plt.subplots(figsize=(15, 8))
dff = dff[::-1]
ax.barh(dff['name'], dff['value'], color=[colors[group_lk[x]] for x in dff['name']])
for i, (value, name) in enumerate(zip(dff['value'], dff['name'])):
    ax.text(value, i, name, ha='right')
    ax.text(value, i-.25, group_lk[name], ha='right')
    ax.text(value, i, value, ha='left')
ax.text(1, 0.4, current_year, transform=ax.transAxes, size=46, ha='right')
Out[8]:
In [9]:
fig, ax = plt.subplots(figsize=(15, 8))
draw_barchart(2018)
Matplotlib is a massive library. Being able to adjust every aspect of a plot is powerful, but it can be complex and time-consuming for highly customized charts. At least for these bar chart races, it was fairly quick!