The US beer industry

We use data from Victor and Carol Tremblay that describe the size distribution of the top 100 US beer producers from 1947 to 2004. The data are background material for their book, The US Brewing Industry, MIT Press, 2004. Two interesting features are the consolidation that continues to this day and the growth of micro or craft brewers over the past two decades. Output is measured in thousands of 31-gallon barrels. US totals are in column 107, so we could divide each producer's output by the total to get market shares.

This notebook was written by Dave Backus for the NYU Stern course Data Bootcamp.

Warning: This works, but the figures need some cleanup, especially the color schemes. I'd also like to add some interactivity.



In [4]:

    
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt



In [5]:

    
url = 'http://pages.stern.nyu.edu/~dbackus/Data/beer_production_1947-2004.xlsx'
beer = pd.read_excel(url, skiprows=12, index_col=0)

print('Dimensions:', beer.shape)
beer[list(range(1,11))].head()









    



Dimensions: (58, 115)






    Out[5]:






  
    
      
         1     2     3     4     5     6     7     8     9    10
YEAR
1947  3991  3732  3726  3609  2240  2100  1902  1652  1491  1167
1948  4865  4280  4138  4042  2443  2250  2110  1638  1376  1202
1949  4843  4673  4526  4514  2474  1927  1875  1598  1438  1436
1950  5097  4889  4375  4105  2662  2652  2287  2105  1746  1618
1951  5716  5479  4530  3990  2800  2612  2600  2295  1799  1555



In [6]:

    
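# columns 1-100 hold output by industry rank (named `ranks` to avoid shadowing the builtin `vars`)
ranks = list(range(1, 101))   #+ ['Total Output of Domestic Brewers (excludes imports)']
# transpose so that rows are ranks and columns are years
pdf = beer[ranks].T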
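The introduction notes that dividing by the US total would give market shares. Here is a minimal sketch of that calculation, using a small made-up frame in place of `beer` so that it runs on its own. The toy numbers are hypothetical, and the total column label is taken from the commented-out fragment in the cell above, so it should be checked against the spreadsheet before use.

```python
import pandas as pd

# Toy stand-in for `beer`: rows are years, columns are firm ranks plus a total.
# The numbers are made up for illustration; only the layout mirrors the real data.
total = 'Total Output of Domestic Brewers (excludes imports)'
toy = pd.DataFrame(
    {1: [40, 60], 2: [30, 20], 3: [10, 10], total: [100, 120]},
    index=pd.Index([1947, 2004], name='YEAR'),
)

toy_ranks = [1, 2, 3]
# Divide each year's output by rank by that year's total to get market shares.
shares = toy[toy_ranks].div(toy[total], axis=0)

# Cumulative shares by rank: column k is the combined share of the top k firms.
cumulative = shares.cumsum(axis=1)
print(cumulative.round(2))
```

With the real data, the same two lines applied to `beer` and the list of 100 ranks would give the combined share of the top k producers in each year, which is one way to look at consolidation.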



In [7]:

    
pdf.columns









    Out[7]:





Int64Index([1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957,
            1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968,
            1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979,
            1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990,
            1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
            2002, 2003, 2004],
           dtype='int64', name='YEAR')



In [8]:

    
fig, ax = plt.subplots()

pdf[1947].plot(ax=ax, color='b', logy=True)
pdf[1967].plot(ax=ax, color='m', logy=True)
pdf[1987].plot(ax=ax, color='r', logy=True)
pdf[2004].plot(ax=ax, color='g', logy=True)
ax.legend()









    Out[8]:





<matplotlib.legend.Legend at 0x8cb0d68>




In [18]:

    
# play with the colors 
fig, ax = plt.subplots()

years = [1947, 1967, 1987, 2004]

for year in years:
    darkness = (year-1940)/(2010-1940)
    pdf[year].plot(ax=ax, lw=2, color='b', alpha=darkness, logy=True)
    
ax.set_ylabel('Output by Industry Rank (log scale)')
ax.set_xlabel('Industry Rank')
ax.legend(years, fontsize=10, handlelength=2, labelspacing=0.15)









    Out[18]:





<matplotlib.legend.Legend at 0xa1eaac8>
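The warning at the top flags the color schemes. One option, offered as a sketch and not as the notebook's settled choice, is to draw line colors from a sequential colormap so that later years are darker, instead of varying `alpha`. The frame `toy_pdf` below is a synthetic stand-in for `pdf` with made-up values, and the choice of the `Blues` colormap is an assumption.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

# Synthetic stand-in for `pdf` (rows: ranks 1-100, columns: years); values are made up.
rank = np.arange(1, 101)
years = [1947, 1967, 1987, 2004]
toy_pdf = pd.DataFrame(
    {yr: 5000.0 * (i + 1) / rank ** (1 + 0.3 * i) for i, yr in enumerate(years)},
    index=rank,
)

fig, ax = plt.subplots()

# Pick evenly spaced points in the upper part of the colormap; later years are darker.
shades = np.linspace(0.3, 1.0, len(years))
for yr, shade in zip(years, shades):
    color = to_hex(plt.cm.Blues(shade))   # convert RGBA to a hex string
    toy_pdf[yr].plot(ax=ax, lw=2, color=color, logy=True, label=str(yr))

ax.set_xlabel('Industry Rank')
ax.set_ylabel('Output (log scale)')
ax.legend(fontsize=10, handlelength=2, labelspacing=0.15)
```

Starting the shades at 0.3 instead of 0 keeps the earliest year visible against a white background.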



interaction

This would make a great interactive figure, with a slider to choose the year. See Brian Granger's notebook.
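As a possible starting point, here is a minimal sketch of a year slider built with `interact` from `ipywidgets`. It assumes `ipywidgets` is installed and falls back to a static figure if it is not. The helper `plot_year` and the frame `toy_pdf` are hypothetical names introduced for illustration, and the values in `toy_pdf` are made up.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for `pdf` (rows: ranks 1-100, columns: years); values are made up.
rank = np.arange(1, 101)
years = list(range(1947, 2005))
toy_pdf = pd.DataFrame(
    {yr: 4000.0 * (1 + 0.05 * (yr - 1947)) / rank ** (1 + 0.02 * (yr - 1947))
     for yr in years},
    index=rank,
)

def plot_year(year=1947, data=toy_pdf):
    """Plot output by industry rank for a single year on a log scale."""
    fig, ax = plt.subplots()
    data[year].plot(ax=ax, lw=2, color='b', logy=True)
    ax.set_title(str(year))
    ax.set_xlabel('Industry Rank')
    ax.set_ylabel('Output (log scale)')

try:
    # Assumes ipywidgets is available; in a notebook this draws a slider for `year`.
    from ipywidgets import interact, fixed
    interact(plot_year, year=(1947, 2004, 1), data=fixed(toy_pdf))
except ImportError:
    # Without ipywidgets, draw one static figure instead.
    plot_year(1947)
```

In the notebook itself, passing `data=fixed(pdf)` would point the slider at the real data; `fixed` tells `interact` not to build a widget for that argument.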

Also these:



In [13]:

    
from ipywidgets import interact, fixed


