San Diego Burrito Analytics: Bootcamp 2016

Scott Cole

15 Sept 2016

This notebook characterizes the ratings and measurements of burritos from Don Carlos Taco Shop that were collected during Neuro bootcamp.

Outline

1. Load data into Python
   - Use a Pandas dataframe
   - View data
   - Print some metadata
2. Hypothesis tests
   - California burritos vs. Carnitas burritos
   - Don Carlos 1 vs. Don Carlos 2
   - Bonferroni correction
3. Distributions
   - Distributions of each burrito quality
   - Tests for normal distribution
4. Correlations
   - Hunger vs. Overall rating
   - Correlation matrix
5. Assumptions discussion

0. Import libraries into Python



In [1]:

    
# These commands control inline plotting
%config InlineBackend.figure_format = 'retina'
%matplotlib inline

import numpy as np # Useful numeric package
import scipy as sp # Useful statistics package
import matplotlib.pyplot as plt # Plotting package

1. Load data into a Pandas dataframe



In [25]:

    
import pandas as pd # Dataframe package
filename = './burrito_bootcamp.csv'
df = pd.read_csv(filename)

View raw data



In [21]:

    
df









    Out[21]:

	Location	Burrito	Hunger	Length	Circum	Volume	Tortilla	Temp	Meat	Fillings	Meat:filling	Uniformity	Salsa	Synergy	Wrap	overall	Rec	Reviewer
0	Don Carlos Taco Shop	Shredded chicken	3.0	23.5	21.50	0.86	3.0	5.0	3.00	3.5	4.0	4.0	4.0	4.0	4.0	3.80	Yes	Scott
1	Don Carlos Taco Shop	Carne asada	3.5	22.5	22.00	0.87	2.0	3.5	2.50	2.5	2.0	4.0	3.5	2.5	5.0	3.00	Yes	Scott
2	Don Carlos Taco Shop	Soyrizo	1.5	22.5	22.00	0.87	3.0	2.0	2.50	3.0	4.5	4.0	3.0	3.0	5.0	3.00	Yes	Emily
3	Don Carlos Taco Shop	Soyrizo	2.0	23.0	22.50	0.93	3.0	2.0	3.50	3.0	4.0	5.0	4.0	4.0	5.0	3.75	Yes	Ricardo
4	Don Carlos Taco Shop	Soyrizo	4.0	NaN	NaN	NaN	4.0	5.0	4.00	3.5	4.5	5.0	2.5	4.5	4.0	4.20	Yes	Scott
5	Don Carlos Taco Shop	Soyrizo	4.0	21.5	20.00	0.68	3.0	4.0	5.00	3.5	2.5	2.5	2.5	4.0	1.0	3.20	Yes	Emily
6	Don Carlos Taco Shop	Soyrizo	1.5	23.0	23.00	0.97	2.0	3.0	3.00	2.0	2.5	2.5	NaN	2.0	3.0	2.60	Yes	Scott
7	Don Carlos Taco Shop	California	4.0	21.5	20.50	0.72	2.5	3.0	3.00	2.5	3.0	3.5	NaN	2.5	3.0	3.00	Yes	Emily
8	Don Carlos Taco Shop	California	3.5	23.0	21.50	0.85	2.0	4.5	4.50	3.5	1.5	3.0	3.5	4.0	2.0	3.90	Yes	Scott
9	Don Carlos Taco Shop	California	3.5	22.0	20.80	0.76	2.5	1.5	1.50	3.0	4.5	3.0	1.5	2.0	4.5	2.00	Yes	Scott
10	Don Carlos Taco Shop	California	2.0	22.5	21.50	0.83	2.5	2.5	2.75	2.5	2.5	2.0	0.5	3.0	3.5	2.50	Yes	Emily
11	Don Carlos Taco Shop	California	2.0	23.0	21.50	0.85	3.0	4.0	4.00	3.0	4.0	4.0	1.0	2.0	1.0	3.00	Yes	Marc
12	Don Carlos Taco Shop	California	3.5	21.0	22.50	0.85	3.0	3.5	3.50	4.0	2.0	3.5	1.0	4.0	4.0	3.90	Yes	Scott
13	Don Carlos Taco Shop	California	3.0	22.5	21.00	0.79	3.0	1.0	1.50	2.5	4.0	4.0	3.0	4.5	5.0	2.00	Yes	Nicole
14	Don Carlos Taco Shop	California	3.0	NaN	NaN	NaN	4.0	NaN	2.00	2.0	4.0	4.0	NaN	3.0	4.0	2.75	Yes	Cris
15	Don Carlos Taco Shop	California	4.0	22.0	20.00	0.70	3.0	2.5	4.00	4.0	3.5	2.5	3.5	5.0	4.5	4.20	Yes	Emily
16	Don Carlos Taco Shop	California	2.5	20.5	21.75	0.77	4.0	4.0	4.50	4.0	5.0	4.5	3.5	4.0	2.0	4.10	Yes	Scott
17	Don Carlos Taco Shop	California	3.0	20.5	20.00	0.65	4.0	4.0	3.00	3.5	4.0	4.5	4.0	4.0	4.5	4.00	No	Scott
18	Don Carlos Taco Shop	California	2.0	18.5	20.50	0.62	3.5	4.0	3.50	NaN	4.0	NaN	4.0	4.0	1.5	4.00	No	Emily
19	Don Carlos Taco Shop	California	4.0	21.5	20.00	0.68	3.0	4.0	2.75	3.0	4.0	2.0	2.0	NaN	5.0	3.00	No	Leo
20	Don Carlos Taco Shop	California	2.5	20.0	20.00	0.64	3.5	3.0	3.00	3.0	4.0	4.0	1.5	NaN	4.5	3.50	No	Scott
21	Don Carlos Taco Shop	Carnitas	3.5	19.0	21.00	0.67	1.5	2.0	3.00	3.5	4.0	1.0	3.5	4.5	4.0	4.00	No	Scott
22	Don Carlos Taco Shop	Carnitas	2.5	21.5	22.50	0.87	1.5	2.5	3.50	3.0	4.0	1.5	2.5	3.5	4.5	3.50	No	Emily
23	Don Carlos Taco Shop	Carnitas	2.5	18.5	21.00	0.65	4.0	3.0	4.00	4.0	4.0	4.0	4.0	4.0	3.0	4.60	No	Scott
24	Don Carlos Taco Shop	Carnitas	2.5	23.0	19.00	0.66	3.0	2.0	4.50	4.0	4.0	3.5	3.0	4.5	4.5	4.50	No	Emily
25	Don Carlos Taco Shop	Carnitas	3.5	22.5	20.50	0.75	2.5	2.5	3.00	4.0	4.0	4.0	3.0	3.5	1.5	3.80	No	Emily
26	Don Carlos Taco Shop	Carnitas	3.5	18.5	21.50	0.68	2.5	3.0	3.00	4.0	2.0	2.0	3.0	3.5	3.5	3.00	No	Scott
27	Don Carlos Taco Shop	Carnitas	3.5	16.5	23.50	0.73	3.5	5.0	4.00	4.0	3.5	4.5	3.5	3.5	4.0	4.00	No	Scott
28	Don Carlos Taco Shop	Carnitas	4.5	20.5	20.50	0.69	3.0	5.0	4.00	4.0	5.0	4.0	2.5	4.5	4.0	4.00	No	Sage

Brief metadata



In [24]:

    
print('Number of burritos: {}'.format(df.shape[0]))
print('Reviewers: ')
print(np.array(df['Reviewer']))









    



Number of burritos: 29
Reviewers: 
['Scott' 'Scott' 'Emily' 'Ricardo' 'Scott' 'Emily' 'Scott' 'Emily' 'Scott'
 'Scott' 'Emily' 'Marc' 'Scott' 'Nicole' 'Cris' 'Emily' 'Scott' 'Scott'
 'Emily' 'Leo' 'Scott' 'Scott' 'Emily' 'Scott' 'Emily' 'Emily' 'Scott'
 'Scott' 'Sage']
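
The raw array printed above is hard to scan, so a per-reviewer tally may be easier to read. A minimal sketch, assuming the reviewer names are copied from that output into a plain list (the names `reviewers` and `reviewer_counts` are introduced here for illustration; with the dataframe loaded, `df['Reviewer'].value_counts()` should give the same counts):

```python
from collections import Counter

# Reviewer names copied from the printed output above (illustrative list)
reviewers = ['Scott', 'Scott', 'Emily', 'Ricardo', 'Scott', 'Emily', 'Scott', 'Emily', 'Scott',
             'Scott', 'Emily', 'Marc', 'Scott', 'Nicole', 'Cris', 'Emily', 'Scott', 'Scott',
             'Emily', 'Leo', 'Scott', 'Scott', 'Emily', 'Scott', 'Emily', 'Emily', 'Scott',
             'Scott', 'Sage']

# Tally burritos per reviewer, most frequent reviewer first
reviewer_counts = Counter(reviewers)
for name, n in reviewer_counts.most_common():
    print('{}: {}'.format(name, n))
```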

What types of burritos have been rated?



In [10]:

    
import re

def burritotypes(x, types = {'California':'cali', 'Carnitas':'carnita', 'Carne asada':'carne asada',
                             'Soyrizo':'soyrizo', 'Shredded chicken':'chicken'}):
    """Count burritos of each type by case-insensitive keyword match.

    Each burrito name is assigned to the first type whose keyword it
    contains; names matching no keyword are counted under 'other'.
    """
    Nmatches = {}
    for b in x:
        label = 'other'
        for t, keyword in types.items():
            if re.search(keyword, b, re.IGNORECASE):
                label = t
                break
        Nmatches[label] = Nmatches.get(label, 0) + 1
    return Nmatches

typecounts = burritotypes(df.Burrito)



In [12]:

    
plt.figure(figsize=(6,6))
ax = plt.axes([0.1, 0.1, 0.65, 0.65])

# The slices will be ordered and plotted counter-clockwise.
labels = list(typecounts.keys())
fracs = list(typecounts.values())
explode=[.1]*len(typecounts)

# autopct receives each slice's percentage p; convert it back to a burrito count.
# startangle=0 starts the first slice on the positive x-axis.
patches, texts, autotexts = plt.pie(fracs, explode=explode, labels=labels,
                autopct=lambda p: '{:.0f}'.format(p * np.sum(fracs) / 100), shadow=False, startangle=0)

plt.title('Types of burritos',size=30)
for t in texts:
    t.set_size(20)
for t in autotexts:
    t.set_size(20)
autotexts[0].set_color('w')

2. Hypothesis tests



In [ ]:

    
#California burritos vs. Carnitas burritos
# TODO
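
One way this comparison could be sketched is with Welch's t-test on the `overall` ratings. The ratings below are copied from the raw data in section 1 (rows 7-20 are California, rows 21-28 are Carnitas). The variable names and the choice of Welch's test are assumptions, not the author's stated method; a rank-based test may suit ordinal ratings better.

```python
import numpy as np
from scipy import stats

# 'overall' ratings copied from the raw data (rows 7-20 and rows 21-28)
cali = np.array([3.00, 3.90, 2.00, 2.50, 3.00, 3.90, 2.00, 2.75,
                 4.20, 4.10, 4.00, 4.00, 3.00, 3.50])
carnitas = np.array([4.00, 3.50, 4.60, 4.50, 3.80, 3.00, 4.00, 4.00])

# Welch's t-test: does not assume equal variances in the two groups
t_stat, p_val = stats.ttest_ind(cali, carnitas, equal_var=False)

print('Mean overall rating, California: {:.2f}'.format(cali.mean()))
print('Mean overall rating, Carnitas: {:.2f}'.format(carnitas.mean()))
print('Welch t = {:.2f}, p = {:.3f}'.format(t_stat, p_val))
```

With the dataframe loaded, the two groups could instead be selected directly, for example with `df[df.Burrito == 'California']['overall']`.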



In [ ]:

    
# Don Carlos 1 vs. Don Carlos 2
# TODO



In [ ]:

    
# Bonferroni correction
# TODO
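
The Bonferroni correction divides the significance threshold by the number of tests performed (equivalently, it multiplies each p-value by that number, capped at 1). A minimal sketch follows; the p-values are hypothetical placeholders, not results from this dataset, and all variable names are introduced here for illustration.

```python
# Hypothetical p-values from m separate hypothesis tests (placeholders only)
p_values = [0.04, 0.20, 0.01]
alpha = 0.05  # desired family-wise error rate

m = len(p_values)
alpha_corrected = alpha / m  # per-test threshold after correction

# Equivalent view: inflate each p-value by m and cap at 1
p_adjusted = [min(p * m, 1.0) for p in p_values]
significant = [p < alpha_corrected for p in p_values]

for p, p_adj, sig in zip(p_values, p_adjusted, significant):
    print('p = {:.3f}, adjusted p = {:.3f}, significant: {}'.format(p, p_adj, sig))
```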

3. Burrito dimension distributions

Distribution of each burrito quality



In [18]:

    
import math
def metrichist(metricname):
    """Plot a histogram of one column of df, ignoring missing values."""
    if metricname == 'Volume':
        # Volume values in this dataset lie below 1, so use finer bins
        bins = np.arange(.375,1.225,.05)
        xticks = np.arange(.4,1.2,.1)
        xlim = (.4,1.2)
    else:
        # Half-point bins centered on 0, 0.5, ..., 5
        bins = np.arange(-.25,5.5,.5)
        xticks = np.arange(0,5.5,.5)
        xlim = (-.25,5.25)
        
    plt.figure(figsize=(5,5))
    n, _, _ = plt.hist(df[metricname].dropna(),bins,color='k')
    plt.xlabel(metricname + ' rating',size=20)
    plt.xticks(xticks,size=15)
    plt.xlim(xlim)
    plt.ylabel('Count',size=20)
    # Tick only 0 and the largest bin count rounded up to a multiple of 5
    plt.yticks((0,int(math.ceil(np.max(n) / 5.)) * 5),size=15)
    plt.tight_layout()



In [19]:

    
m_Hist = ['Hunger','Volume','Tortilla','Temp','Meat','Fillings',
          'Meat:filling','Uniformity','Salsa','Synergy','Wrap','overall']
for m in m_Hist:
    metrichist(m)

Test for normal distribution



In [ ]:

    
# TODO
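
A sketch of one possible normality check, the Shapiro-Wilk test, applied to the `overall` ratings copied from the raw data in section 1. The choice of test and the variable names are assumptions; with the dataframe loaded, the same call could be looped over the columns in `m_Hist` after dropping NaNs.

```python
import numpy as np
from scipy import stats

# 'overall' ratings for all 29 burritos, copied from the raw data
overall = np.array([3.80, 3.00, 3.00, 3.75, 4.20, 3.20, 2.60, 3.00, 3.90, 2.00,
                    2.50, 3.00, 3.90, 2.00, 2.75, 4.20, 4.10, 4.00, 4.00, 3.00,
                    3.50, 4.00, 3.50, 4.60, 4.50, 3.80, 3.00, 4.00, 4.00])

# Null hypothesis: the sample was drawn from a normal distribution
W, p_val = stats.shapiro(overall)
print('Shapiro-Wilk W = {:.3f}, p = {:.3f}'.format(W, p_val))
```

A small p-value would suggest that the ratings deviate from a normal distribution.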
