The US beer industry

We use data from Victor and Carol Tremblay describing the size distribution of the top 100 US beer producers from 1947 to 2004. The data are background material for their book, The US Brewing Industry, MIT Press, 2004. Two features stand out: consolidation, which continues to this day, and the growth of micro or craft brewers over the past two decades. Output is measured in thousands of 31-gallon barrels. US totals are in column 107, so dividing each producer's output by the total would give its market share.

This notebook was written by Dave Backus for the NYU Stern course Data Bootcamp.

Warning: This works, but the figures need some cleanup, especially the color schemes. I'd also like to add some interactive widgets.


In [4]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

In [5]:
url = 'http://pages.stern.nyu.edu/~dbackus/Data/beer_production_1947-2004.xlsx'
beer = pd.read_excel(url, skiprows=12, index_col=0)

print('Dimensions:', beer.shape)
beer[list(range(1,11))].head()


Dimensions: (58, 115)
Out[5]:
1 2 3 4 5 6 7 8 9 10
YEAR
1947 3991 3732 3726 3609 2240 2100 1902 1652 1491 1167
1948 4865 4280 4138 4042 2443 2250 2110 1638 1376 1202
1949 4843 4673 4526 4514 2474 1927 1875 1598 1438 1436
1950 5097 4889 4375 4105 2662 2652 2287 2105 1746 1618
1951 5716 5479 4530 3990 2800 2612 2600 2295 1799 1555
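The introduction notes that dividing by the US total would give market shares, but the notebook never computes them. A minimal sketch of that calculation on a toy frame follows. The numbers are made up for illustration, and the label of the total column is an assumption (it is the label that appears in a comment in the next cell; check `beer.columns` for the real one).

```python
import pandas as pd

# Assumed label for the US-total column; the real file may differ
total_col = 'Total Output of Domestic Brewers (excludes imports)'

# Toy stand-in for `beer`: rows are years, integer columns are ranks.
# All numbers are made up for illustration, not taken from the data set.
toy = pd.DataFrame(
    {1: [40, 50], 2: [30, 30], total_col: [100, 200]},
    index=pd.Index([1947, 1948], name='YEAR'),
)

ranks = [1, 2]
# Divide each year's row by that year's total (axis=0 aligns on the index)
shares = toy[ranks].div(toy[total_col], axis=0)

# Combined share of the listed producers, a simple concentration measure
top_share = shares.sum(axis=1)
print(shares)
print(top_share)
```

With the real data, the same pattern should apply to `beer[list(range(1,101))]` once the label of the total column is confirmed.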

In [6]:
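# columns 1-100 hold the top 100 producers by rank
# (named `ranks` rather than `vars`, which would shadow a Python builtin)
ranks = list(range(1,101))   #+ ['Total Output of Domestic Brewers (excludes imports)']
pdf = beer[ranks].T          # transpose: rows are ranks, columns are years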

In [7]:
pdf.columns


Out[7]:
Int64Index([1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957,
            1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968,
            1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979,
            1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990,
            1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
            2002, 2003, 2004],
           dtype='int64', name='YEAR')

In [8]:
fig, ax = plt.subplots()

pdf[1947].plot(ax=ax, color='b', logy=True)
pdf[1967].plot(ax=ax, color='m', logy=True)
pdf[1987].plot(ax=ax, color='r', logy=True)
pdf[2004].plot(ax=ax, color='g', logy=True)
ax.legend()


Out[8]:
<matplotlib.legend.Legend at 0x8cb0d68>

In [18]:
# play with the colors: one hue, with later years drawn more opaque
fig, ax = plt.subplots()

years = [1947, 1967, 1987, 2004]

for year in years:
    darkness = (year-1940)/(2010-1940)
    pdf[year].plot(ax=ax, lw=2, color='b', alpha=darkness, logy=True)
    
ax.set_ylabel('Output (thousands of barrels, log scale)')
ax.set_xlabel('Industry Rank')
ax.legend(years, fontsize=10, handlelength=2, labelspacing=0.15)


Out[18]:
<matplotlib.legend.Legend at 0xa1eaac8>
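The cell above encodes time as transparency. One alternative, offered as a suggestion rather than as the notebook's method, is to sample a sequential colormap so that later years get darker shades. In this sketch the helper `year_colors` and the toy frame are hypothetical, and the `Blues` colormap and the 1940 to 2010 range are arbitrary choices that mirror the cell above.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import Normalize, to_hex

def year_colors(years, cmap_name='Blues', lo=1940, hi=2010):
    """Map each year to a hex color; later years get darker shades."""
    cmap = plt.get_cmap(cmap_name)
    norm = Normalize(vmin=lo, vmax=hi)
    return {int(year): to_hex(cmap(norm(year))) for year in years}

# Toy stand-in for `pdf` (rows = industry rank, columns = year); values are made up
toy_pdf = pd.DataFrame(
    {1947: [400, 200, 100], 2004: [9000, 300, 10]},
    index=[1, 2, 3],
)

colors = year_colors(toy_pdf.columns)

fig, ax = plt.subplots()
for year in toy_pdf.columns:
    toy_pdf[year].plot(ax=ax, lw=2, color=colors[int(year)],
                       logy=True, label=str(year))
ax.set_xlabel('Industry Rank')
ax.legend()
```

Unlike alpha blending, the colormap keeps every line fully opaque, which may read better where lines overlap.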

In [13]:
# IPython.html.widgets is deprecated; the widgets now live in ipywidgets
from ipywidgets import interact, fixed

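The notebook imports `interact` and `fixed` but stops before using them. A minimal sketch of how they might drive the rank plot is given here, assuming `ipywidgets` is installed and the code runs in a notebook. The functions `plot_year` and `explore` and the toy frame are hypothetical names introduced for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for `pdf` (rows = industry rank, columns = year); values are made up
toy_pdf = pd.DataFrame(
    {1947: [400, 200, 100], 2004: [9000, 300, 10]},
    index=pd.Index([1, 2, 3], name='rank'),
)

def plot_year(year, data):
    """Plot output by rank for one year on a log scale; return the Axes."""
    fig, ax = plt.subplots()
    data[year].plot(ax=ax, lw=2, color='b', logy=True)
    ax.set_xlabel('Industry Rank')
    ax.set_title(str(year))
    return ax

def explore(data):
    """Attach a year dropdown to plot_year (needs ipywidgets and a notebook)."""
    from ipywidgets import interact, fixed   # imported lazily: notebook-only
    # A list argument becomes a dropdown; fixed() passes data through unchanged
    years = [int(y) for y in data.columns]
    return interact(plot_year, year=years, data=fixed(data))
```

In a notebook cell, calling `explore(pdf)` should then redraw the figure whenever a different year is selected.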