San Diego Burrito Analytics: Rankings

Scott Cole

21 May 2016

This notebook ranks each taco shop along each rating dimension.

Imports


In [1]:
%config InlineBackend.figure_format = 'retina'
%matplotlib inline

import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import pandas as pd

import seaborn as sns
sns.set_style("white")

Load data


In [2]:
import util
df = util.load_burritos()
N = df.shape[0]

Average each metric over each Location


In [3]:
m_Location = ['Location','N','Yelp','Google','Hunger','Cost','Volume','Tortilla','Temp','Meat','Fillings','Meat:filling',
               'Uniformity','Salsa','Synergy','Wrap','overall']

# Calculate the mean of each of the metrics above for each taco shop
tacoshops = df.Location.unique()
TS = len(tacoshops)
dfmean = pd.DataFrame(np.nan, index=range(TS), columns=m_Location)
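for ts in range(TS):
    shop = df.loc[df.Location == tacoshops[ts]]
    # Average only the numeric columns; non-numeric ones (e.g. Location) stay NaN for now
    dfmean.loc[ts] = shop.mean(numeric_only=True)
    # Number of burritos rated at this shop (set with .loc to avoid chained assignment)
    dfmean.loc[ts, 'N'] = len(shop)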
dfmean.Location = tacoshops
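
The per-shop averaging above can also be written with a groupby. The following is a minimal sketch on a made-up table, not the real burrito data: the shop names, values, and the `Notes` column are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the burrito table; all values are made up for illustration.
toy = pd.DataFrame({
    'Location': ['shop a', 'shop a', 'shop b', 'shop b', 'shop b'],
    'Cost': [6.0, 8.0, 7.0, 7.5, 6.5],
    'overall': [3.0, 4.0, 4.5, 3.5, 4.0],
    'Notes': ['x', 'y', 'z', 'w', 'v'],  # non-numeric column, excluded from the mean
})

grouped = toy.groupby('Location')
toy_mean = grouped.mean(numeric_only=True)   # one row per shop, numeric columns only
toy_mean.insert(0, 'N', grouped.size())      # number of burritos rated per shop
toy_mean = toy_mean.reset_index()            # make Location a regular column again

print(toy_mean)
```

This avoids the explicit loop and the row-by-row assignment, at the cost of returning only the columns that exist in the input rather than a fixed column list like m_Location.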

In [4]:
Ncutoff = 5
dfToRank = dfmean.loc[dfmean.N>=Ncutoff]

In [5]:
dfToRank


Out[5]:
                   Location     N      Yelp    Google    Hunger      Cost    Volume  Tortilla      Temp      Meat  Fillings  Meat:filling  Uniformity     Salsa   Synergy      Wrap   overall
 4  los primos mexican food   8.0  3.000000  3.700000  3.500000  7.200000  0.822500  3.187500  3.625000  3.125000  3.250000      2.000000    2.428571  2.875000  2.562500  3.125000  2.750000
 7               taco stand  16.0  4.500000  4.400000  3.125000  7.558750  0.787143  3.593750  2.968750  4.187500  3.875000      3.843750    3.781250  3.718750  4.125000  3.875000  4.018750
 9    rigoberto's taco shop  13.0  3.769231  4.338462  3.538462  6.761538  0.890000  3.500000  3.807692  3.557692  3.500000      4.000000    3.653846  3.225000  3.818182  3.923077  3.650000
10       lolita's taco shop  12.0  4.000000  4.400000  3.141667  7.225000  0.747778  2.983333  3.275000  3.363636  3.641667      3.354545    2.991667  2.854167  3.437500  3.916667  3.283333
17         vallarta express   9.0  3.500000  4.000000  3.500000  6.933333  0.880000  2.916667  4.357143  3.277778  3.444444      3.611111    3.166667  3.500000  3.000000  3.611111  3.500000
22      california burritos  12.0  4.500000  4.400000  3.916667  6.187500  0.843750  4.108333  3.608333  4.166667  3.833333      4.125000    3.791667  3.681818  4.166667  4.458333  4.254167

In [6]:
m_Rank = ['Location','Cost','Volume','Tortilla','Temp','Meat','Fillings','Meat:filling',
          'Uniformity','Salsa','Synergy','Wrap','overall']

# Rank the shops on each metric (1 = best): ascending for Cost, descending for all others.
# Whole columns are assigned via .values to avoid chained indexing (the source of the
# SettingWithCopyWarning) and index misalignment: dfToRank keeps the original row labels,
# while dfRanked is indexed 0..TS-1.
TS = len(dfToRank)
dfRanked = pd.DataFrame(np.nan, index=range(TS), columns=m_Rank)
dfRanked['Location'] = dfToRank['Location'].values
for m in m_Rank[1:]:
    dfRanked[m] = dfToRank[m].rank(ascending=(m == 'Cost')).values

In [7]:
dfRanked


Out[7]:
                  Location  Cost  Volume  Tortilla  Temp  Meat  Fillings  Meat:filling  Uniformity  Salsa  Synergy  Wrap  overall
0  los primos mexican food   4.0     4.0       4.0   3.0   6.0       6.0           6.0         6.0    5.0      6.0   6.0      6.0
1               taco stand   6.0     5.0       2.0   6.0   1.0       1.0           3.0         2.0    1.0      2.0   4.0      2.0
2    rigoberto's taco shop   2.0     1.0       3.0   2.0   3.0       4.0           2.0         3.0    4.0      3.0   2.0      3.0
3       lolita's taco shop   5.0     6.0       5.0   5.0   4.0       3.0           5.0         5.0    6.0      4.0   3.0      5.0
4         vallarta express   3.0     2.0       6.0   1.0   5.0       5.0           4.0         4.0    3.0      5.0   5.0      4.0
5      california burritos   1.0     3.0       1.0   4.0   2.0       2.0           1.0         1.0    2.0      1.0   1.0      1.0
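
As a small illustration of how the ranks in this table are formed, here is a sketch on made-up numbers (the shop names and values are assumptions, not taken from the burrito data). It ranks Cost in ascending order so the cheapest shop gets rank 1, ranks the other metric in descending order so the highest score gets rank 1, and shows that pandas gives tied shops the average of their ranks by default.

```python
import pandas as pd

# Toy per-shop means; all values are made up for illustration.
toy = pd.DataFrame({
    'Location': ['shop a', 'shop b', 'shop c'],
    'Cost': [7.0, 6.0, 8.0],
    'overall': [3.5, 4.0, 4.0],
}).set_index('Location')

# Rank every metric with higher = better, then redo Cost with lower = better.
toy_rank = toy.rank(ascending=False)
toy_rank['Cost'] = toy['Cost'].rank(ascending=True)

print(toy_rank)
```

Here shop b and shop c tie on overall and each receive rank 1.5. Passing method='min' to rank would give both rank 1 instead, if whole-number ranks are preferred.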

Best California burrito


In [8]:
#TODO