Texas Drought Statistics

Check Bokeh version.


In [1]:
import bokeh
bokeh.print_versions()


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Bokeh version: 0.4.4-267-ge9807af
Python version: 2.7.6 |Anaconda 1.9.1 (x86_64)| (default, Jan 10 2014, 11:23:15) 
[GCC 4.0.1 (Apple Inc. build 5493)]
Platform: Darwin-13.1.0-x86_64 (Darwin Kernel Version 13.1.0: Wed Apr  2 23:52:02 PDT 2014; root:xnu-2422.92.1~2/RELEASE_X86_64)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Gather imports.


In [2]:
import pandas as pd
import numpy as np
from collections import OrderedDict

import datetime as dt

from bokeh.plotting import *
from bokeh.objects import Range1d

output_notebook()


BokehJS successfully loaded.

Convenience functions to convert dates to UNIX time. Bokeh's datetime axes work in milliseconds since the epoch, hence the second helper.


In [3]:
def unix_time(date):
    epoch = dt.datetime.utcfromtimestamp(0)
    delta = date - epoch
    return delta.total_seconds()

def unix_time_millis(date):
    return unix_time(date) * 1000.0
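
As a quick sanity check, the sketch below restates the two helpers and evaluates them for the first week in the dataset (January 4, 2000). It writes the epoch as `dt.datetime(1970, 1, 1)`, which is the same naive datetime that `utcfromtimestamp(0)` returns. The names `first_week` and `first_week_ms` are introduced here purely for illustration.

```python
import datetime as dt

def unix_time(date):
    # Seconds between the UNIX epoch and a naive datetime (treated as UTC).
    epoch = dt.datetime(1970, 1, 1)
    return (date - epoch).total_seconds()

def unix_time_millis(date):
    # Same quantity expressed in milliseconds.
    return unix_time(date) * 1000.0

first_week = dt.datetime(2000, 1, 4)   # illustrative value
first_week_ms = unix_time_millis(first_week)
print(first_week_ms)                   # 946944000000.0
```

The result is a plain float, which is what the `Range1d` bounds further down receive.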

Set a nominal color scheme.

These colors correspond to the severity of the drought:

D0 - Abnormally Dry
D1 - Moderate
D2 - Severe
D3 - Extreme
D4 - Exceptional (dark red)
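Nothing (white)

The COLORS list runs from D4 (dark red) to Nothing (white), matching the order in which the categories are stacked.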


In [4]:
COLORS = ['#800000', '#942222', '#A84444', '#BC6666', '#D08888', '#FFFFFF']
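
To make the pairing explicit, the sketch below zips the colors with the category names in stacking order. The `CATEGORIES` list and the `color_for` mapping are written out by hand here as illustrative names. In the notebook itself the category order is derived from the DataFrame columns.

```python
from collections import OrderedDict

COLORS = ['#800000', '#942222', '#A84444', '#BC6666', '#D08888', '#FFFFFF']

# Assumed stacking order: most severe first, "Nothing" last.
CATEGORIES = ['D4', 'D3', 'D2', 'D1', 'D0', 'Nothing']

# Pair each category with the color at the same position.
color_for = OrderedDict(zip(CATEGORIES, COLORS))

for name, color in color_for.items():
    print(name, color)
```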

Import the data and reverse the DataFrame so that the dates are in ascending order.


In [5]:
a = pd.read_csv("data/DroughtStats.csv")
# The file lists the newest week first; reverse the rows and renumber from 0.
a = a.iloc[::-1].reset_index(drop=True)

Take a look!


In [6]:
a.head()


Out[6]:
      Week  Nothing     D0     D1     D2  D3  D4
0   1/4/00     2.56  23.71  22.18  51.55   0   0
1  1/11/00     0.17  24.98  23.29  51.56   0   0
2  1/18/00     0.00  25.15  23.29  51.56   0   0
3  1/25/00     0.00  25.15  23.29  51.56   0   0
4   2/1/00     0.00  18.39  25.48  56.13   0   0

5 rows × 7 columns


In [7]:
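# Parse the week labels into timestamps for the datetime axis.
dates = pd.to_datetime(a['Week'])
# Reverse the category columns so the most severe level (D4) is stacked first.
categories = list(reversed(a.columns[1:]))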
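
The week labels use two-digit years (`1/4/00`). `pd.to_datetime` infers this layout, but the format can also be stated explicitly. The sketch below does so on a three-row sample taken from the first rows of the table shown earlier. The `csv_text` and `sample` names and the inline CSV are stand-ins for the real file, and the rows are assumed to arrive newest first.

```python
import io
import pandas as pd

# Inline stand-in for data/DroughtStats.csv (newest week first).
csv_text = u"""Week,Nothing,D0,D1,D2,D3,D4
1/18/00,0.00,25.15,23.29,51.56,0,0
1/11/00,0.17,24.98,23.29,51.56,0,0
1/4/00,2.56,23.71,22.18,51.55,0,0
"""

sample = pd.read_csv(io.StringIO(csv_text))
sample = sample.iloc[::-1].reset_index(drop=True)   # oldest week first

# %m/%d/%y means month/day/two-digit year.
dates = pd.to_datetime(sample['Week'], format='%m/%d/%y')

# Everything after the Week column, most severe category first.
categories = list(sample.columns[1:])[::-1]

print(dates.iloc[0])
print(categories)
```

Stating the format guards against a day-first interpretation if the labels ever become ambiguous.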

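Stacked data will be much easier to plot in future versions of Bokeh, but for now the patch coordinates are simple enough to generate by hand. Each patch is a closed polygon: its x-coordinates run forward through the dates and then back again, and its y-coordinates trace the lower boundary (the running total of the categories already stacked) followed by the upper boundary in reverse.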


In [8]:
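# Every patch shares the same x outline: forward through the dates, then back.
xs = [list(dates) + list(reversed(dates))] * len(categories)
ys = []

for i, cat in enumerate(categories):
    if i == 0:
        # First layer: lower edge at zero, upper edge is the category itself.
        y = list(np.zeros(len(dates))) + list(reversed(a[cat]))
        cumulative = y
    else:
        # Later layers: add the previous category to the lower edge and this
        # category to the (reversed) upper edge of the running outline.
        prev = categories[i-1]
        y = list(a[prev]) + list(reversed(a[cat]))
        cumulative = list(np.sum([cumulative, y], axis=0))

    ys.append(cumulative)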
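
To see what the loop produces, here is a minimal sketch on two made-up weeks and three categories. It builds the same kind of closed outline, but keeps an explicit running total instead of looking up the previous column. The function name `stack_patches` and the toy numbers are illustrative and not part of the notebook.

```python
def stack_patches(x, layers):
    """Return (xs, ys) polygon outlines for layers stacked bottom to top.

    x      -- list of x positions
    layers -- list of equal-length lists, one per category
    """
    xs, ys = [], []
    lower = [0.0] * len(x)                      # baseline for the first layer
    for layer in layers:
        upper = [lo + v for lo, v in zip(lower, layer)]
        # Trace the lower edge left to right, then the upper edge right to left.
        xs.append(list(x) + list(reversed(x)))
        ys.append(lower + list(reversed(upper)))
        lower = upper                           # the next layer sits on top
    return xs, ys

# Toy data: two weeks, three categories (values are percentages).
weeks = [0, 1]
layers = [[10.0, 20.0],    # stacked first
          [30.0, 30.0],
          [60.0, 50.0]]    # stacked last

xs, ys = stack_patches(weeks, layers)
for outline in ys:
    print(outline)
# [0.0, 0.0, 20.0, 10.0]
# [10.0, 20.0, 50.0, 40.0]
# [40.0, 50.0, 100.0, 100.0]
```

In this toy data the last outline tops out at 100 in both weeks because the three categories sum to 100 percent.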

Set the plot ranges, declare some basic figure properties, and plot the patches.


In [9]:
start_date = unix_time_millis(dates.iloc[0])
end_date = unix_time_millis(dates.iloc[-1])
x_range = Range1d(start=start_date, end=end_date)

y_range = Range1d(start=0, end=100)

figure(title="Texas Drought Statistics - 2000 to Present",
       plot_width=960,
       x_axis_type='datetime',
       x_range=x_range,
       y_range=y_range,
       background_fill="lightgrey")

patches(
    xs, ys,
    color=COLORS,
    alpha=0.8,
    line_color=None
)


Out[9]:
<bokeh.objects.Plot at 0x1077de910>

Style the plot and show!


In [10]:
axis().major_label_text_font_size = "12pt"
axis().major_label_standoff = 10
axis().axis_line_color = None

xaxis().major_label_orientation = np.pi/4
xaxis().major_tick_out = 5
xaxis().major_tick_line_color = "darkgrey"

ygrid().grid_line_color = "white"
ygrid().grid_line_width = 2

yaxis().major_tick_line_color = "darkgrey"
yaxis()[0].axis_label = "Cumulative Percent in Drought"


show()