Monthly Temperature Histories: Example D3 in Jupyter

This example uses data from the Daily Global Weather Measurements data set, originally collected by the National Climatic Data Center and available as a public data set on Amazon Web Services (AWS). Only selected weather stations are shown. See this blog post for a description of the data wrangling used to produce the smaller CSV files in this example.

This example uses D3 for an integrated multi-part visualization with interactivity and animation.

Notebook Config


In [1]:
from IPython.display import display, HTML
from string import Template
import pandas as pd
import json

In [2]:
HTML('<script src="lib/d3/d3.min.js"></script>')


Out[2]:

Data Reading and Formatting


In [3]:
# Read the world map data, closing the file handle when done.
with open('data/worldmap.json', 'r') as f:
    worldmap_data = json.load(f)

In [4]:
sites_data_stations = pd.read_csv('data/stations.csv')
sites_data_stations.head()


Out[4]:
      ID    country_name     lat     lon         station_name
0  10620        SVALBARD  76.500  25.067                HOPEN
1  13840          NORWAY  60.200  11.083      OSLO/GARDERMOEN
2  26800          SWEDEN  56.917  18.150               HOBURG
3  29740         FINLAND  60.317  24.967      HELSINKI-VANTAA
4  31350  UNITED KINGDOM  55.500  -4.583  PRESTWICK(CIV/NAVY)

In [5]:
sites_data_temps = pd.read_csv('data/monthly_temps.csv')
sites_data_temps.head()


Out[5]:
      ID       ave     max    min    month
0  10620  28.10000  9999.9   17.6  1977-10
1  10620  36.25806    45.5   29.5   1983-8
2  10620  19.66333    38.8   -9.0  1986-11
3  10620  28.89355    36.5   17.4  1986-10
4  10620   9.21290    34.7  -16.2  1986-12
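
The first row of the output above reports a max of 9999.9, which looks like a missing-value placeholder rather than a real temperature. The following is a minimal sketch of masking such values before they reach the visualization, assuming 9999.9 is the sentinel used in this data set. The `MISSING` constant and the small sample frame are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Assumption: 9999.9 marks a missing reading in this data set.
MISSING = 9999.9

# Small sample mirroring the first two rows of monthly_temps.csv.
sample = pd.DataFrame({
    'ID': [10620, 10620],
    'ave': [28.1, 36.25806],
    'max': [9999.9, 45.5],
    'min': [17.6, 29.5],
    'month': ['1977-10', '1983-8'],
})

# Replace the sentinel with NaN in the temperature columns only,
# leaving the ID and month columns untouched.
temp_cols = ['ave', 'max', 'min']
cleaned = sample.copy()
cleaned[temp_cols] = cleaned[temp_cols].replace(MISSING, np.nan)
```

Restricting the replacement to the temperature columns avoids touching an ID that happens to match the sentinel.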

In [6]:
sites_data_temps = sites_data_temps.sort_values(by='ID')

In [7]:
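# Collect each station's monthly readings into one dict keyed by month.
# This relies on sites_data_temps being sorted by ID (previous cell).
temps_by_ID = []
previous_ID = -1
collected_temps = {}
for i, row in sites_data_temps.iterrows():
    # A change of ID means the previous station is complete: store it and start over.
    if (row['ID'] != previous_ID) and (previous_ID != -1):
        temps_by_ID.append(collected_temps)
        collected_temps = {}
    collected_temps[row['month']] = {'ave': row['ave'],
                                     'max': row['max'],
                                     'min': row['min']}
    previous_ID = row['ID']
# Store the last station, which the loop itself never appends.
temps_by_ID.append(collected_temps)
# unique() keeps first-seen order, which matches temps_by_ID because of the sort.
site_data_temps_2 = pd.DataFrame({'ID': sites_data_temps['ID'].unique(),
                                  'temps': temps_by_ID})
site_data_temps_2.head()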


Out[7]:
      ID                                              temps
0  10620  {u'1977-10': {u'ave': 28.1, u'min': 17.6, u'ma...
1  13840  {u'1940-7': {u'ave': 60.85806, u'min': 46.0, u...
2  26800  {u'1986-11': {u'ave': 42.73333, u'min': 36.7, ...
3  29740  {u'1960-11': {u'ave': 30.81667, u'min': 9.0, u...
4  31350  {u'1977-10': {u'ave': 52.87097, u'min': 37.4, ...
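
The row-by-row loop in cell 7 could also be written with `groupby`, which does not depend on the frame being sorted first. The following is a sketch of that alternative, not the notebook's own method. The helper `collect_temps`, the name `temps_by_station`, and the sample frame are hypothetical, and the row for station 99999 is made up purely to show a second group.

```python
import pandas as pd

# Small sample with the same columns as monthly_temps.csv.
# The 10620 rows come from the output above; the 99999 row is invented.
temps = pd.DataFrame({
    'ID': [99999, 10620, 10620],
    'ave': [50.0, 28.1, 36.25806],
    'max': [60.0, 9999.9, 45.5],
    'min': [40.0, 17.6, 29.5],
    'month': ['2000-1', '1977-10', '1983-8'],
})

def collect_temps(group):
    # Map each month to its {'ave', 'max', 'min'} readings for one station.
    return group.set_index('month')[['ave', 'max', 'min']].to_dict(orient='index')

# groupby yields one (ID, sub-frame) pair per station, ordered by ID.
records = [{'ID': station_id, 'temps': collect_temps(group)}
           for station_id, group in temps.groupby('ID')]
temps_by_station = pd.DataFrame(records)
```

One behavioral difference: if a station has two rows for the same month, the loop silently keeps the later row, whereas `to_dict(orient='index')` raises an error for the duplicated index.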

In [8]:
sites_data = pd.merge(sites_data_stations, site_data_temps_2, on='ID')
sites_data.head()


Out[8]:
      ID    country_name     lat     lon         station_name                                              temps
0  10620        SVALBARD  76.500  25.067                HOPEN  {u'1977-10': {u'ave': 28.1, u'min': 17.6, u'ma...
1  13840          NORWAY  60.200  11.083      OSLO/GARDERMOEN  {u'1940-7': {u'ave': 60.85806, u'min': 46.0, u...
2  26800          SWEDEN  56.917  18.150               HOBURG  {u'1986-11': {u'ave': 42.73333, u'min': 36.7, ...
3  29740         FINLAND  60.317  24.967      HELSINKI-VANTAA  {u'1960-11': {u'ave': 30.81667, u'min': 9.0, u...
4  31350  UNITED KINGDOM  55.500  -4.583  PRESTWICK(CIV/NAVY)  {u'1977-10': {u'ave': 52.87097, u'min': 37.4, ...

In [9]:
sites_data_dict = sites_data.to_dict(orient='records')
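
To illustrate what `orient='records'` produces, here is a small sketch using a one-row frame shaped like `sites_data`. The sample frame and the names `records` and `payload` are illustrative only. Each row becomes one dict keyed by column name, so the resulting list serializes to a JSON array of objects, one per station.

```python
import json
import pandas as pd

# One-row sample shaped like sites_data (a subset of its columns).
sample = pd.DataFrame({
    'ID': [10620],
    'station_name': ['HOPEN'],
    'temps': [{'1977-10': {'ave': 28.1, 'max': 9999.9, 'min': 17.6}}],
})

# One dict per row, keyed by column name.
records = sample.to_dict(orient='records')

# The list of dicts serializes to a JSON array of objects.
payload = json.dumps(records)
```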

Visualization


In [10]:
html_template = Template('''
<style> $css_text </style>
<div><svg width="700" height="500px" id="graph-svg"></svg></div>
<script> $js_text </script>
''')

In [11]:
with open('css/temperature_histories.css', 'r') as f:
    css_text = f.read()

In [12]:
with open('js/temperature_histories.js', 'r') as f:
    js_text_template = Template(f.read())
# safe_substitute leaves any other "$name" text in the JavaScript untouched.
js_text = js_text_template.safe_substitute({'worldmapdata': json.dumps(worldmap_data),
                                            'sitesdata': json.dumps(sites_data_dict)})
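
The JavaScript is filled in with `safe_substitute` rather than `substitute`. A likely reason is that JavaScript source can contain `$` sequences that are not meant as placeholders. The sketch below shows the difference on a made-up snippet. `js_source` and the `$unrelated` name are hypothetical and do not come from `temperature_histories.js`.

```python
from string import Template

# Hypothetical JavaScript snippet: $sitesdata is a placeholder to fill in,
# while $unrelated stands for any other "$name" the script might contain.
js_source = 'var sites = $sitesdata; var label = "$unrelated";'
template = Template(js_source)

# safe_substitute fills the known placeholder and leaves the rest untouched.
filled = template.safe_substitute({'sitesdata': '[1, 2, 3]'})

# substitute raises KeyError for the placeholder that has no value.
try:
    template.substitute({'sitesdata': '[1, 2, 3]'})
    strict_failed = False
except KeyError:
    strict_failed = True
```

The trade-off is that `safe_substitute` also hides genuine mistakes, such as a misspelled placeholder name, since the unmatched text is passed through silently.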

In [13]:
display(HTML(html_template.substitute({'css_text': css_text, 'js_text': js_text})))