Correlation between Starbucks and Chain Density for Census Tracts

I wanted to see whether, based on Yelp data alone, there is any correlation between the number of Starbucks locations in a Census tract and the percentage of businesses in that tract that are chains.

Methodology

I looked at the value counts for business names in the Yelp data set to develop a criterion for classifying a business as a chain or big-box store, based on the number of occurrences of its name.

I settled on a minimum name count of 5 for a business to be considered a chain, even though some clear chains (for example, Avis and Budget Car Rental, with 4 listings each) have fewer than 5 instances in the Yelp data set.
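The name-count rule can be sketched on a few made-up rows. This is a minimal illustration, not the notebook's code: `flag_chains` and the toy series are hypothetical, and only the cutoff of 5 comes from the text above.

```python
import pandas as pd

def flag_chains(names: pd.Series, min_count: int = 5) -> pd.Series:
    """Return 1 for rows whose name occurs at least min_count times, else 0."""
    # Look up, for every row, how often that row's name occurs overall.
    occurrences = names.map(names.value_counts())
    return (occurrences >= min_count).astype(int)

# Toy data (hypothetical), not the Yelp data set.
toy_names = pd.Series(["Starbucks"] * 6 + ["Subway"] * 5 + ["Avis"] * 4 + ["Bike Den"])
flags = flag_chains(toy_names)

# One flag per distinct name.
print(flags.groupby(toy_names).first())
```

In this toy input a name with 4 rows falls just below the cutoff and is flagged as a non-chain, which is the kind of miss described above.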

I then plotted the chain fraction of each tract against its Starbucks count, taken from the separate Starbucks location data set. (I excluded Starbucks itself from the businesses when computing the Yelp chain fraction.)
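Starbucks is itself a chain, so leaving it in would tie the chain fraction to the Starbucks count by construction. A small sketch of the per-tract fraction with Starbucks removed, using hypothetical toy rows (the column names `name`, `GISJOIN`, and `chain` mirror the notebook, while `chain_fraction` is an illustrative helper):

```python
import pandas as pd

# Hypothetical toy rows: one business per row, with its tract ID and chain flag.
toy = pd.DataFrame({
    "name":    ["Starbucks", "Subway", "Bike Den", "Starbucks", "Cafe Assisi", "Satara"],
    "GISJOIN": ["T1", "T1", "T1", "T2", "T2", "T2"],
    "chain":   [1, 1, 0, 1, 0, 0],
})

def chain_fraction(df: pd.DataFrame, exclude: str = "Starbucks") -> pd.Series:
    """Mean of the chain flag per tract, ignoring rows named `exclude`."""
    others = df[df["name"] != exclude]
    return others.groupby("GISJOIN")["chain"].mean()

frac = chain_fraction(toy)
print(frac)
```

Because the flag is 0 or 1, its mean per tract is the fraction of remaining businesses that are chains.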

Results

As the figure at the end of this notebook shows, there is no apparent correlation between the Starbucks count in a Census tract and the fraction of businesses in that tract that are chains.

Note that I tried a number of chain criteria, and the different cutoffs did not affect the lack of correlation.
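One way to make the cutoff comparison repeatable is to recompute a correlation coefficient for each candidate cutoff. The sketch below is an assumption about how that could be done, not the procedure actually used here: `correlation_by_cutoff` is a hypothetical helper, the data are made up, and Pearson's r (the default of `Series.corr`) is only one possible measure.

```python
import pandas as pd

def correlation_by_cutoff(bus, starbucks, cutoffs):
    """Pearson r between per-tract chain fraction and Starbucks count, per cutoff."""
    occurrences = bus["name"].map(bus["name"].value_counts())
    others = bus["name"] != "Starbucks"  # leave Starbucks out of the fraction
    results = {}
    for cutoff in cutoffs:
        is_chain = (occurrences >= cutoff).astype(int)
        frac = is_chain[others].groupby(bus.loc[others, "GISJOIN"]).mean()
        # Keep only tracts present in both series.
        joined = pd.concat(
            [frac.rename("chain"), starbucks.rename("starbucks")],
            axis=1, join="inner",
        )
        results[cutoff] = joined["chain"].corr(joined["starbucks"])
    return results

# Hypothetical toy data: business names with tract IDs, and Starbucks counts per tract.
toy_bus = pd.DataFrame({
    "name":    ["A", "A", "B", "Starbucks", "A", "C", "D", "B", "E", "F", "Starbucks"],
    "GISJOIN": ["T1", "T1", "T1", "T1", "T2", "T2", "T2", "T3", "T3", "T3", "T3"],
})
toy_starbucks = pd.Series({"T1": 2, "T2": 1, "T3": 3})

results = correlation_by_cutoff(toy_bus, toy_starbucks, cutoffs=[2, 3])
print(results)
```

With real data, a coefficient that stays near zero across cutoffs would support the conclusion above. The toy numbers here say nothing about the Yelp data.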



In [2]:

    
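%pylab inline
import json
import pandas as pd

# Load the Yelp business records (one JSON object per line).
with open('yelp/business.json') as f:
    bus = pd.DataFrame(json.loads(l) for l in f)

# Count how many times each business name occurs.
bus_counts = bus['name'].value_counts().rename('counts').to_frame()

# Inspect the names that fall below the chain cutoff of 5.
bus_counts[bus_counts.counts < 5]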









    



Populating the interactive namespace from numpy and matplotlib






    Out[2]:

	counts
Mojo Yogurt	4
SKECHERS Factory Outlet	4
Fairfield Inn by Marriott	4
Wolfman Pizza	4
Arizona Federal Credit Union	4
Rock Bottom Restaurant & Brewery	4
CenturyLink Store	4
Thrifty Car Rental	4
Tacos Los Toritos	4
Community Tire Pros & Auto Repair	4
Lumber Liquidators	4
Sanrio	4
Ghost Armor	4
The Roomstore	4
Tortas Paquime	4
Culinary Dropout	4
Bahama Buck's	4
Color Me Mine	4
Mimis Cafe	4
Kangaroo Express	4
Cactus Flower Florists	4
Garcia's Mexican Restaurant	4
Chevys Fresh Mex	4
Roy's Restaurant	4
Tory Burch	4
Papa Murphy's Take 'n' Bake Pizza	4
Avis	4
Quick Trip	4
Budget Car Rental	4
Mecklenburg ABC Liquor Store	4
...	...
Mardi Gras Costume Shop	1
TK Service Center	1
Fonda Mexicana El Paraiso	1
Airbridge Tours	1
Chuparosa Park	1
Hamra Jewelers	1
Charleston Swapmeet	1
Automall Autobody	1
Spa Uptown	1
San Portella Apartments	1
MGM Grand Race & Sports Book	1
Petite Chateau	1
European Auto Salon	1
Massage Envy Spa Dobson	1
J's Barber Shop	1
Maximum Pilates	1
Cafe Assisi	1
Hawaiian Shave Ice	1
Lane Bryant The Shoppes At Gilbert Commons	1
Kim Alterations	1
Marquee Theatre	1
Children's Museum Of Phoenix	1
Little Dumpling Thai & Chinese Cuisine	1
Friomio	1
Hans Gulyas	1
Le Meridien Versailles	1
Superior School of Real Estate	1
Bike Den	1
Eyecare Center	1
Satara	1

44960 rows × 1 columns



In [3]:

    
# Flag names with at least 5 listings as chains (1) and all others as non-chains (0).
bus_counts['chain'] = (bus_counts['counts'] >= 5).astype(int)
bus_counts









    Out[3]:

	counts	chain
Starbucks	413	1
McDonald's	293	1
Subway	274	1
Walgreens	161	1
Taco Bell	154	1
Wendy's	123	1
Pizza Hut	119	1
Burger King	113	1
Panda Express	112	1
The UPS Store	107	1
Dunkin' Donuts	101	1
Chipotle Mexican Grill	86	1
Domino's Pizza	85	1
Great Clips	84	1
Wells Fargo Bank	83	1
Bank of America	74	1
Jack in the Box	71	1
Jimmy John's	71	1
Enterprise Rent-A-Car	70	1
Dairy Queen	68	1
Walmart Supercenter	68	1
Papa John's Pizza	67	1
QuikTrip	66	1
Jiffy Lube	66	1
Cvs Pharmacy	62	1
The Home Depot	61	1
Sonic Drive-In	59	1
Supercuts	57	1
Del Taco	55	1
Albertsons	55	1
...	...	...
Mardi Gras Costume Shop	1	0
TK Service Center	1	0
Fonda Mexicana El Paraiso	1	0
Airbridge Tours	1	0
Chuparosa Park	1	0
Hamra Jewelers	1	0
Charleston Swapmeet	1	0
Automall Autobody	1	0
Spa Uptown	1	0
San Portella Apartments	1	0
MGM Grand Race & Sports Book	1	0
Petite Chateau	1	0
European Auto Salon	1	0
Massage Envy Spa Dobson	1	0
J's Barber Shop	1	0
Maximum Pilates	1	0
Cafe Assisi	1	0
Hawaiian Shave Ice	1	0
Lane Bryant The Shoppes At Gilbert Commons	1	0
Kim Alterations	1	0
Marquee Theatre	1	0
Children's Museum Of Phoenix	1	0
Little Dumpling Thai & Chinese Cuisine	1	0
Friomio	1	0
Hans Gulyas	1	0
Le Meridien Versailles	1	0
Superior School of Real Estate	1	0
Bike Den	1	0
Eyecare Center	1	0
Satara	1	0

45694 rows × 2 columns



In [4]:

    
# Attach the name count and chain flag to every business.
buswc = bus.merge(bus_counts, left_on='name', right_index=True)

# Map each Yelp business to its Census tract (GISJOIN).
with open('business_track.json') as f:
    join = pd.DataFrame(json.loads(l) for l in f)
busg = buswc.merge(join)

# Starbucks locations per tract according to Yelp.
yelp_starbucks = busg[busg['name'] == 'Starbucks'].GISJOIN.value_counts()
yelp_starbucks = yelp_starbucks.rename('yelp_starbucks').to_frame()

# Starbucks locations per tract according to the Starbucks data set.
with open('sb_track.json') as f:
    starbucks = pd.DataFrame(json.loads(l) for l in f).GISJOIN.value_counts()
starbucks = starbucks.rename('starbucks').to_frame()

# Inner join: only tracts that appear in both data sets are kept.
yelp_starbucks = yelp_starbucks.merge(starbucks, left_index=True, right_index=True)

from matplotlib import pyplot
pyplot.scatter(yelp_starbucks.starbucks, yelp_starbucks.yelp_starbucks)
pyplot.plot([0, 30], [0, 30])  # y = x reference line
ax = pyplot.gca()
ax.set_xlabel("Starbucks location count in tract")
ax.set_ylabel("Yelp Starbucks location count in tract")
ax.set_title("Starbucks location counts: Starbucks vs. Yelp data sets")









    Out[4]:





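[Figure: scatter plot of the Yelp Starbucks location count against the Starbucks location count for each tract, with a y = x reference line.]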

Clearly, not every Starbucks location appears in the Yelp data: tracts where Yelp is missing locations fall below the y = x reference line.
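The gap can also be summarized as a single coverage number. One caveat is that the merge in the cell above is an inner join, so a tract with Starbucks locations but no Yelp Starbucks listing is absent from the plot. The sketch below uses an outer join to keep such tracts. All counts are hypothetical, and `coverage` is an illustrative name rather than something computed in this notebook.

```python
import pandas as pd

# Hypothetical per-tract counts, not the real data.
official = pd.Series({"T1": 3, "T2": 2, "T3": 1}, name="starbucks")
yelp = pd.Series({"T1": 2, "T2": 2}, name="yelp_starbucks")

# pd.concat along columns does an outer join on the index by default,
# so T3 (no Yelp listing) is kept and its missing count becomes 0.
both = pd.concat([official, yelp], axis=1).fillna(0)

# Share of official locations that also appear in Yelp.
coverage = both["yelp_starbucks"].sum() / both["starbucks"].sum()
print(both)
print(round(coverage, 3))
```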



In [7]:

    
# Fraction of non-Starbucks businesses in each tract that are chains.
chain = busg[busg['name'] != 'Starbucks'].groupby('GISJOIN')[['chain']].mean()
# Inner join: tracts with no Starbucks locations are dropped.
chain = chain.merge(starbucks, left_index=True, right_index=True)
pyplot.scatter(chain.starbucks, chain['chain'])
ax = pyplot.gca()
ax.set_xlabel("Starbucks location count in tract")
ax.set_ylabel("Fraction of chain businesses in tract")









    Out[7]:





[Figure: scatter plot of the fraction of chain businesses against the Starbucks location count for each tract.]


