San Diego Burrito Analytics: Rankings

Scott Cole

21 May 2016

This notebook ranks each taco shop along each rating dimension.

Imports


In [11]:
%config InlineBackend.figure_format = 'retina'
%matplotlib inline

import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import pandas as pd

import seaborn as sns
sns.set_style("white")

Load data


In [12]:
filename="burrito_current.csv"
df = pd.read_csv(filename)
N = df.shape[0]

Average each metric over each Location


In [14]:
# Lowercase names to avoid case mismatches; in the future, should also handle articles (e.g. a leading 'the')
df.Location = df.Location.str.lower()
m_Location = ['Location','N','Yelp','Google','Hunger','Cost','Volume','Tortilla','Temp','Meat','Fillings','Meat:filling',
               'Uniformity','Salsa','Synergy','Wrap','overall']

tacoshops = df.Location.unique()
TS = len(tacoshops)
dfmean = pd.DataFrame(np.nan, index=range(TS), columns=m_Location)
for ts in range(TS):
    shop_rows = df.loc[df.Location == tacoshops[ts]]
    # numeric_only=True skips text columns (recent pandas raises on them otherwise)
    dfmean.loc[ts] = shop_rows.mean(numeric_only=True)
    # Single .loc call instead of chained indexing, which can write to a copy
    dfmean.loc[ts, 'N'] = len(shop_rows)
dfmean.Location = tacoshops
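
The per-location averaging loop above can also be written as a groupby. The following is a minimal sketch on made-up data: the `toy` frame and its values are illustrative assumptions, not rows from burrito_current.csv, and this is an alternative formulation rather than the code this notebook ran.

```python
import pandas as pd

# Toy data: illustrative values only, not taken from burrito_current.csv
toy = pd.DataFrame({
    'Location': ['Taco Stand', 'taco stand', 'Vallarta Express'],
    'Cost': [7.0, 8.0, 6.5],
    'overall': [4.0, 4.5, 3.5],
})
toy['Location'] = toy['Location'].str.lower()

grouped = toy.groupby('Location')
# Mean of every numeric column per location
toy_mean = grouped.mean(numeric_only=True)
# Number of burritos rated at each location
toy_mean.insert(0, 'N', grouped.size())
toy_mean = toy_mean.reset_index()
print(toy_mean)
```

Note that groupby sorts the locations alphabetically by default, whereas `unique()` keeps them in order of first appearance; passing `sort=False` to `groupby` keeps the original order.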

In [53]:
Ncutoff = 8
dfToRank = dfmean.loc[dfmean.N>=Ncutoff]

In [54]:
dfToRank


Out[54]:
                 Location   N  Yelp    Google    Hunger      Cost    Volume  Tortilla      Temp      Meat  Fillings  Meat:filling  Uniformity     Salsa  Synergy      Wrap   overall
7              taco stand  15   4.5  4.400000  3.100000  7.727333  0.787143  3.566667  2.933333  4.266667  3.966667      4.033333    3.766667  3.933333      4.2  3.966667  4.126667
9   rigoberto's taco shop  15   3.8  4.346667  3.533333  6.700000  0.930000  3.700000  3.866667  3.660714  3.700000      4.133333    3.766667  3.225000      4.0  3.866667  3.796667
18       vallarta express   9   3.5  4.000000  3.500000  6.933333  0.880000  2.916667  4.357143  3.277778  3.444444      3.611111    3.166667  3.500000      3.0  3.611111  3.500000

In [55]:
m_Rank = ['Location','Cost','Volume','Tortilla','Temp','Meat','Fillings','Meat:filling', 'Uniformity','Salsa','Synergy','Wrap','overall']
TS = len(dfToRank)
dfRanked = pd.DataFrame(np.nan, index=range(TS), columns=m_Rank)
# Assign whole columns via .values: this avoids chained indexing (the source of
# SettingWithCopyWarning) and index misalignment between dfToRank and dfRanked
dfRanked['Location'] = dfToRank['Location'].values
for m in m_Rank[1:]:
    # Lower cost is better (rank ascending); for every other metric, higher is better
    dfRanked[m] = dfToRank[m].rank(ascending=(m == 'Cost')).values

In [56]:
dfRanked


Out[56]:
                Location  Cost  Volume  Tortilla  Temp  Meat  Fillings  Meat:filling  Uniformity  Salsa  Synergy  Wrap  overall
0             taco stand     3       3         2     3     1         1             2         1.5      1        1     1        1
1  rigoberto's taco shop     1       1         1     2     2         2             1         1.5      3        2     2        2
2       vallarta express     2       2         3     1     3         3             3         3.0      2        3     3        3
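
The 1.5 entries in the Uniformity column come from how `Series.rank` handles ties: with the default `method='average'`, tied values share the mean of the ranks they would otherwise occupy. A small sketch using the three Uniformity means from Out[54]; the variable names are illustrative:

```python
import pandas as pd

# Uniformity means for the three ranked shops, copied from Out[54]
uniformity = pd.Series(
    [3.766667, 3.766667, 3.166667],
    index=['taco stand', "rigoberto's taco shop", 'vallarta express'],
)

# Higher is better, so rank in descending order; ties get the average rank
ranks_average = uniformity.rank(ascending=False)
# Alternative: method='min' gives every tied shop the best rank in the tie
ranks_min = uniformity.rank(ascending=False, method='min')

print(ranks_average.tolist())  # [1.5, 1.5, 3.0]
print(ranks_min.tolist())      # [1.0, 1.0, 3.0]
```

Whether a tie should read as 1.5 or as a shared first place is a presentation choice; the notebook's output uses the default averaging.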

Best California


In [ ]:
#TODO