Brazil-Germany Semifinal

(Bokeh Example Notebook)

The Brazil-Germany semifinal will go down in history as one of the most stunning upsets of the tournament, but I wanted to (literally) see exactly how lopsided the result is when compared to earlier matches.

DATA



In [1]:

    
import urllib2
import json

import numpy as np
import pandas as pd
from collections import OrderedDict



In [2]:

    
# Data source is (kindly) provided by http://worldcup.sfg.io/
matches_json = urllib2.urlopen('http://worldcup.sfg.io/matches/')
matches = json.load(matches_json)



In [3]:

    
df = pd.DataFrame(columns=['winner','win_goals','loser','loss_goals'])

for match in matches:
    # for matches that have already happened
    if match['status'] == 'future':
        continue

    if match['away_team']['goals'] > match['home_team']['goals']:
        winner = match['away_team']
        loser = match['home_team']
    else:
        # ties are absorbed into this block; terminology doesn't really matter
        winner = match['home_team']
        loser = match['away_team']
    df.loc[len(df)] = [winner['country'], int(winner['goals']),
                        loser['country'], int(loser['goals'])]

Our source is now a four-column dataframe, indexed by match order.



In [4]:

    
df.head()









    Out[4]:






  
    
      
      winner
      win_goals
      loser
      loss_goals
    
  
  
    
      0
            Brazil
       3
         Croatia
       1
    
    
      1
            Mexico
       1
        Cameroon
       0
    
    
      2
       Netherlands
       5
           Spain
       1
    
    
      3
             Chile
       3
       Australia
       1
    
    
      4
          Colombia
       3
          Greece
       0

PLOTTING



In [5]:

    
import bokeh
bokeh.print_versions()









    



    Bokeh version: 0.5.0-12-ge0937de
    Python version: 2.7.8-CPython
    Platform: Darwin-13.2.0-x86_64-i386-64bit



In [6]:

    
import bokeh.plotting as bkp
import bokeh.objects as bko
bkp.output_notebook()

from seabornify import seabornify
# see https://github.com/kdodia/snippets/blob/master/seabornify.py









    




    
        
        
    
        Bokeh
        BokehJS successfully loaded.



In [7]:

    
num_matches = len(df)

win_color = ['#FEFE00' if x == 'Brazil' else
             '#D52B1E' if x == 'Germany' else
             'slateblue' for x in df.winner]

loss_color = ['#FEFE00' if x == 'Brazil' else
             '#D52B1E' if x == 'Germany' else
             'coral' for x in df.loser]

# Ties should have a neutral color representation
win_color = ['gray' if df.win_goals[i] == df.loss_goals[i] else
             win_color[i] for i in xrange(num_matches)]
loss_color = ['gray' if df.win_goals[i] == df.loss_goals[i] else
              loss_color[i] for i in xrange(num_matches)]

For plotting purposes, we'll give "zero goal" teams an 0.1 buffer so they can be represented on the plot.



In [8]:

    
df = df.replace(0, 0.1)

Set up our ColumnDataSource objects so that each index in the columns is implicitly connected. This will be leveraged for the hover tooltip below. (See hover tutorial)



In [9]:

    
win_source = bko.ColumnDataSource(data=dict(
    left = df.index,
    right = df.index+1,
    top = df.win_goals,
    bottom = [0]*num_matches,
    color = win_color,
    country = df.winner))

loss_source = bko.ColumnDataSource(data=dict(
    left = df.index,
    right = df.index+1,
    top = [0]*num_matches,
    bottom = -df.loss_goals,
    color = loss_color,
    country = df.loser))



In [10]:

    
# Initialize new figure
bkp.figure(title="World Cup Goals per Match",
           x_axis_label="Match number",
           y_axis_label="Goals scored",
           plot_height=500,
           plot_width=800,
           tools="hover")

# Plot all following glyphs on the same figure
bkp.hold()

# Winners
bkp.quad(left='left', right='right', top='top', bottom='bottom',
         color='color', source=win_source, line_color=None)

# "Losers"
bkp.quad(left='left', right='right', top='top', bottom='bottom',
         color='color', source=loss_source, line_color=None)

# Direction indication
bkp.text([-1.5],[0.2],text="Winning team", angle=np.pi/2)
bkp.text([-1.5],[-2.4],text="Losing team", angle=np.pi/2)


# Simple visual separator
bkp.line([0, num_matches-1],
         [0,0],
         color='black');

Here we'll attach the country name to the hover tooltip for each glyph.



In [11]:

    
hover = [t for t in bkp.curplot().tools if isinstance(t, bko.HoverTool)][0]

hover.tooltips = OrderedDict([
    ("Country", "@country")
])

Aaaand show()! Germany in red and Brazil in yellow; hover for country information.



In [12]:

    
seabornify(bkp.curplot())
bkp.show()

Conclusion

This game was exceptional.



In [13]:

    
from IPython.display import Image
Image(url="http://i.imgur.com/10gvO.gif")









    Out[13]:

	winner	win_goals	loser	loss_goals
0	Brazil	3	Croatia	1
1	Mexico	1	Cameroon	0
2	Netherlands	5	Spain	1
3	Chile	3	Australia	1
4	Colombia	3	Greece	0