We use data from Victor and Carol Tremblay that describe the size distribution of the top 100 US beer producers from 1947 to 2004. This is background data from their book, The US Brewing Industry, MIT Press, 2004. Two interesting features are the consolidation that continues to this day and the growth of micro or craft brewers over the past two decades. The numbers are in thousands of 31-gallon barrels. US totals are in column 107, so dividing each producer's output by that column would give market shares.
This notebook was written by Dave Backus for the NYU Stern course Data Bootcamp.
Warning: This works, but the figures need some cleanup, especially the color schemes. I'd also like to add some interactivity.
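The share calculation mentioned above could look roughly like the following. This is a minimal sketch on made-up numbers, not the Tremblay data; the frame `toy` and the three-rank layout are assumptions for illustration, and the totals column name is borrowed from the comment in the code further down.

```python
import pandas as pd

# Toy numbers for illustration only (NOT the Tremblay data).
# Rows are years, integer columns are industry ranks, and the last
# column plays the role of the US total.
total_col = 'Total Output of Domestic Brewers (excludes imports)'
toy = pd.DataFrame(
    {1: [50.0, 80.0], 2: [30.0, 40.0], 3: [20.0, 30.0],
     total_col: [200.0, 300.0]},
    index=[1947, 2004],
)

ranks = [1, 2, 3]
# Divide each rank's output by that year's total (row-wise alignment).
shares = toy[ranks].div(toy[total_col], axis=0)
print(shares)
```

With the real data, `ranks` would be `list(range(1, 101))` and the result could be transposed and plotted the same way as the levels.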
In [4]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
In [5]:
url = 'http://pages.stern.nyu.edu/~dbackus/Data/beer_production_1947-2004.xlsx'
beer = pd.read_excel(url, skiprows=12, index_col=0)
print('Dimensions:', beer.shape)
beer[list(range(1,11))].head()
Out[5]:
In [6]:
# columns 1-100 hold output by industry rank; transpose so each year is a column
ranks = list(range(1, 101))  # + ['Total Output of Domestic Brewers (excludes imports)']
pdf = beer[ranks].T
In [7]:
pdf.columns
Out[7]:
In [8]:
fig, ax = plt.subplots()
pdf[1947].plot(ax=ax, color='b', logy=True)
pdf[1967].plot(ax=ax, color='m', logy=True)
pdf[1987].plot(ax=ax, color='r', logy=True)
pdf[2004].plot(ax=ax, color='g', logy=True)
ax.legend()
Out[8]:
In [18]:
# play with the colors
fig, ax = plt.subplots()
years = [1947, 1967, 1987, 2004]
for year in years:
    darkness = (year-1940)/(2010-1940)
    pdf[year].plot(ax=ax, lw=2, color='b', alpha=darkness, logy=True)
ax.set_ylabel('Sales by Industry Rank (log scale)')
ax.set_xlabel('Industry Rank')
ax.legend(years, fontsize=10, handlelength=2, labelspacing=0.15)
Out[18]:
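The `darkness` formula in the cell above maps each year linearly onto the interval between 1940 and 2010 and uses the result as the line's alpha. A small sketch of that mapping as a reusable function follows; the name `year_to_alpha` and the clamping step are additions for illustration, the latter because matplotlib expects alpha between 0 and 1.

```python
def year_to_alpha(year, start=1940, end=2010):
    """Map a year linearly onto [0, 1], clamped to stay a valid alpha."""
    frac = (year - start) / (end - start)
    return min(1.0, max(0.0, frac))

# Later years get darker (more opaque) lines.
for year in [1947, 1967, 1987, 2004]:
    print(year, round(year_to_alpha(year), 2))
```

For the four years plotted, 1947 maps to 0.1 and 2004 to about 0.91, so the earliest line is quite faint. Moving `start` earlier would make it easier to see.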
In [13]:
# IPython.html.widgets moved to the separate ipywidgets package in IPython 4
from ipywidgets import interact, fixed
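One way the interaction might be wired up is a dropdown that redraws the rank plot for a chosen year. This is only a sketch: it assumes the `ipywidgets` package is installed, the names `plot_year`, `year_widget`, and `demo` are hypothetical, and `demo` holds made-up numbers standing in for `pdf`.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up stand-in for pdf: rows are industry ranks, columns are years.
demo = pd.DataFrame(
    {1947: [100.0, 50.0, 10.0], 2004: [1000.0, 400.0, 5.0]},
    index=[1, 2, 3],
)

def plot_year(year, pdf):
    """Plot output by industry rank for a single year on a log scale."""
    fig, ax = plt.subplots()
    pdf[year].plot(ax=ax, lw=2, color='b', logy=True)
    ax.set_xlabel('Industry Rank')
    ax.set_ylabel('Sales by Industry Rank (log scale)')
    ax.set_title(str(year))
    return ax

def year_widget(pdf):
    """Build a year dropdown; assumes ipywidgets is available."""
    from ipywidgets import interact, fixed
    years = [int(y) for y in pdf.columns]
    # A list of values becomes a dropdown; fixed() passes pdf through unchanged.
    return interact(plot_year, year=years, pdf=fixed(pdf))
```

In a notebook, `year_widget(pdf)` would show the dropdown, and `plot_year(1947, pdf)` draws a single year without any widget.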