Yelp Dataset Challenge


Timothy Helton

Yelp is a website where patrons review businesses they have visited, such as restaurants. The company runs a recurring challenge to see whether anyone can derive additional insights from the raw user reviews. More information about the challenge may be found on the Yelp Dataset Challenge website.


For exercises 1-4, use the Yelp business JSON file. For exercises 5-6, use the Yelp review JSON file.



NOTE:
This notebook uses code found in the k2datascience.yelp module. To execute all the cells, do one of the following:

  • Install the k2datascience package to the active Python interpreter.
  • Add k2datascience/k2datascience to the PYTHONPATH environment variable.
  • Create a link to the yelp.py file in the same directory as this notebook.
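
As an alternative to editing environment variables, the module search path can be extended from inside the notebook before the imports run. This is only a sketch: the directory below is a hypothetical location that mirrors the path named in the list above, and it must be adjusted to wherever the repository actually lives.

```python
import sys
from pathlib import Path

# Hypothetical location of the cloned repository; adjust to the real path.
package_dir = Path.home() / 'k2datascience' / 'k2datascience'

# Prepend the directory so this copy is found before any other version.
if str(package_dir) not in sys.path:
    sys.path.insert(0, str(package_dir))
```

The change lasts only for the current interpreter session, so it has to run in the first cell of the notebook.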


Imports


In [1]:
from k2datascience import yelp

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
%matplotlib inline

Load Data

Create Instance of Yelp Class


In [2]:
ydc = yelp.YDC()

In [3]:
ydc.load_data()


05/01/2017 07:13:08      INFO  -> root <- (line: 97) File Loaded: yelp_academic_dataset_business.json

05/01/2017 07:13:59      INFO  -> root <- (line: 97) File Loaded: yelp_academic_dataset_review.json
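
The implementation of load_data lives in k2datascience.yelp and is not shown in this notebook. As a rough sketch of what loading involves, and assuming the dataset files store one JSON object per line, a file can be parsed line by line with the standard library. The function name and the two sample records are made up for illustration.

```python
import io
import json


def read_json_lines(handle):
    """Parse a file-like object holding one JSON object per line."""
    return [json.loads(line) for line in handle if line.strip()]


# Stand-in for an open file; two made-up records for illustration.
sample = io.StringIO(
    '{"business_id": "a", "stars": 3.5}\n'
    '{"business_id": "b", "stars": 4.0}\n'
)
records = read_json_lines(sample)
print(len(records))
```

With pandas available, pandas.read_json(path, lines=True) reads the same layout straight into a DataFrame.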


In [4]:
business = ydc.file_data['business']
business.shape
business.head()
business.tail()


Out[4]:
(85901, 15)
Out[4]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods open review_count stars state type
0 {'Take-out': True, 'Drive-Thru': False, 'Good ... 5UmKMjUEUNdYWqANhGckJw [Fast Food, Restaurants] Dravosburg 4734 Lebanon Church Rd\nDravosburg, PA 15034 {'Friday': {'close': '21:00', 'open': '11:00'}... 40.354327 -79.900706 Mr Hoagie [] True 7 3.5 PA business
1 {'Happy Hour': True, 'Accepts Credit Cards': T... UsFtqoBl7naz8AVUBZMjQQ [Nightlife] Dravosburg 202 McClure St\nDravosburg, PA 15034 {} 40.350553 -79.886814 Clancy's Pub [] True 5 3.0 PA business
2 {'Good for Kids': True} cE27W9VPgO88Qxe4ol6y_g [Active Life, Mini Golf, Golf] Bethel Park 1530 Hamilton Rd\nBethel Park, PA 15234 {} 40.354115 -80.014660 Cool Springs Golf Center [] False 5 2.5 PA business
3 {'Alcohol': 'full_bar', 'Noise Level': 'averag... mVHrayjG3uZ_RLHkLj-AMg [Bars, American (New), Nightlife, Lounges, Res... Braddock 414 Hawkins Ave\nBraddock, PA 15104 {'Tuesday': {'close': '19:00', 'open': '10:00'... 40.408830 -79.866211 Emil's Lounge [] True 26 4.5 PA business
4 {'Parking': {'garage': False, 'street': False,... mYSpR_SLPgUVymYOvTQd_Q [Active Life, Golf] Braddock 1000 Clubhouse Dr\nBraddock, PA 15104 {'Sunday': {'close': '15:00', 'open': '10:00'}... 40.403405 -79.855782 Grand View Golf Club [] True 3 5.0 PA business
Out[4]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods open review_count stars state type
85896 {'Accepts Credit Cards': True} m7-3lyY0CJEhePfJKWtD3w [Bridal, Fashion, Shopping, Formal Wear] Las Vegas 3899 East Sunset Rd\nSte 105\nLas Vegas, NV 89120 {'Tuesday': {'close': '18:00', 'open': '10:00'... 36.070535 -115.089318 Bowties Bridal [] True 61 4.0 NV business
85897 {'Take-out': True, 'Wi-Fi': 'no', 'Good For': ... g0vvhkZWZKlwF8BUeSPaTA [Mexican, Restaurants] Goodyear 525 N Estrella Pkwy\nSte 100\nGoodyear, AZ 85338 {} 33.452205 -112.392009 Senor Taco [] True 89 3.5 AZ business
85898 {} 46L_7y9QXffPpOaXNLX8hg [Car Wash, Automotive] Phoenix 9215 North 7th St\nPhoenix, AZ 85020 {'Monday': {'close': '18:00', 'open': '07:00'}... 33.570417 -112.064854 Cobblestone Auto Spa [] True 7 3.0 AZ business
85899 {'Accepts Credit Cards': True, 'Wi-Fi': 'free'... HuLzZUBkHEcHk6ETDJIVhQ [Home Services, Real Estate, Apartments] Edinburgh 16 Waterloo Place\nOld Town\nEdinburgh EH1 3EG {} 55.953447 -3.186813 Princess Street Suites [Old Town] True 5 4.0 EDH business
85900 {} DH2Ujt_hwcMBIz8VvCb0Lg [Mexican, Restaurants] Charlotte Charlotte Douglas International Airport Termin... {} 35.224223 -80.940290 Salsarita's Express [] True 57 2.5 NC business

Exercise 1: Create a new column that contains only the zipcode.


In [5]:
ydc.get_zip_codes()
business.head()


Out[5]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods open review_count stars state type zip_code
0 {'Take-out': True, 'Drive-Thru': False, 'Good ... 5UmKMjUEUNdYWqANhGckJw [Fast Food, Restaurants] Dravosburg 4734 Lebanon Church Rd\nDravosburg, PA 15034 {'Friday': {'close': '21:00', 'open': '11:00'}... 40.354327 -79.900706 Mr Hoagie [] True 7 3.5 PA business 15034
1 {'Happy Hour': True, 'Accepts Credit Cards': T... UsFtqoBl7naz8AVUBZMjQQ [Nightlife] Dravosburg 202 McClure St\nDravosburg, PA 15034 {} 40.350553 -79.886814 Clancy's Pub [] True 5 3.0 PA business 15034
2 {'Good for Kids': True} cE27W9VPgO88Qxe4ol6y_g [Active Life, Mini Golf, Golf] Bethel Park 1530 Hamilton Rd\nBethel Park, PA 15234 {} 40.354115 -80.014660 Cool Springs Golf Center [] False 5 2.5 PA business 15234
3 {'Alcohol': 'full_bar', 'Noise Level': 'averag... mVHrayjG3uZ_RLHkLj-AMg [Bars, American (New), Nightlife, Lounges, Res... Braddock 414 Hawkins Ave\nBraddock, PA 15104 {'Tuesday': {'close': '19:00', 'open': '10:00'... 40.408830 -79.866211 Emil's Lounge [] True 26 4.5 PA business 15104
4 {'Parking': {'garage': False, 'street': False,... mYSpR_SLPgUVymYOvTQd_Q [Active Life, Golf] Braddock 1000 Clubhouse Dr\nBraddock, PA 15104 {'Sunday': {'close': '15:00', 'open': '10:00'}... 40.403405 -79.855782 Grand View Golf Club [] True 3 5.0 PA business 15104
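
The get_zip_codes method is defined in k2datascience.yelp and its implementation is not shown here. A minimal sketch of one possible approach, assuming a US ZIP code is a five-digit group (optionally ZIP+4) at the very end of full_address, is given below. The function name is hypothetical, and non-US postal codes such as the 'EH1 3EG' that appears later in this notebook are deliberately not handled.

```python
import re

# Assumption: a US ZIP code is five digits (optionally ZIP+4) at the very end
# of the address string. Non-US postal codes are not handled by this sketch.
US_ZIP = re.compile(r'(?<!\d)(\d{5})(?:-\d{4})?\s*$')


def extract_us_zip(full_address):
    """Return the trailing 5-digit ZIP code, or None when there is none."""
    match = US_ZIP.search(full_address)
    return match.group(1) if match else None


print(extract_us_zip('4734 Lebanon Church Rd\nDravosburg, PA 15034'))
print(extract_us_zip('16 Waterloo Place\nOld Town\nEdinburgh EH1 3EG'))
```

On a DataFrame, the function could be applied with business['full_address'].apply(extract_us_zip).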

Exercise 2: The table contains a column called 'categories', and each entry in this column is a list. We are interested in the businesses that are restaurants. Create a new column 'Restaurant_type' that describes the restaurant based on the other elements of 'categories'.

That is, if we have '[Sushi Bars, Japanese, Restaurants]' in 'categories', the 'Restaurant_type' will be '{'Sushi Bars': 1, 'Japanese': 1, 'Mexican': 0, ...}'.


In [6]:
ydc.get_restaurant_type()
business.head()


Out[6]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods open review_count stars state type zip_code restaurant_type
0 {'Take-out': True, 'Drive-Thru': False, 'Good ... 5UmKMjUEUNdYWqANhGckJw [Fast Food, Restaurants] Dravosburg 4734 Lebanon Church Rd\nDravosburg, PA 15034 {'Friday': {'close': '21:00', 'open': '11:00'}... 40.354327 -79.900706 Mr Hoagie [] True 7 3.5 PA business 15034 {'Fast Food': 1, 'Bars': 0, 'American (New)': ...
1 {'Happy Hour': True, 'Accepts Credit Cards': T... UsFtqoBl7naz8AVUBZMjQQ [Nightlife] Dravosburg 202 McClure St\nDravosburg, PA 15034 {} 40.350553 -79.886814 Clancy's Pub [] True 5 3.0 PA business 15034 remove
2 {'Good for Kids': True} cE27W9VPgO88Qxe4ol6y_g [Active Life, Mini Golf, Golf] Bethel Park 1530 Hamilton Rd\nBethel Park, PA 15234 {} 40.354115 -80.014660 Cool Springs Golf Center [] False 5 2.5 PA business 15234 remove
3 {'Alcohol': 'full_bar', 'Noise Level': 'averag... mVHrayjG3uZ_RLHkLj-AMg [Bars, American (New), Nightlife, Lounges, Res... Braddock 414 Hawkins Ave\nBraddock, PA 15104 {'Tuesday': {'close': '19:00', 'open': '10:00'... 40.408830 -79.866211 Emil's Lounge [] True 26 4.5 PA business 15104 {'Fast Food': 0, 'Bars': 1, 'American (New)': ...
4 {'Parking': {'garage': False, 'street': False,... mYSpR_SLPgUVymYOvTQd_Q [Active Life, Golf] Braddock 1000 Clubhouse Dr\nBraddock, PA 15104 {'Sunday': {'close': '15:00', 'open': '10:00'}... 40.403405 -79.855782 Grand View Golf Club [] True 3 5.0 PA business 15104 remove

In [7]:
business.restaurant_type.loc[0]


Out[7]:
{'Active Life': 0,
 'Adult Entertainment': 0,
 'Afghan': 0,
 'African': 0,
 'Alsatian': 0,
 'Amateur Sports Teams': 0,
 'American (New)': 0,
 'American (Traditional)': 0,
 'Amusement Parks': 0,
 'Antiques': 0,
 'Appliances': 0,
 'Arabian': 0,
 'Arcades': 0,
 'Argentine': 0,
 'Art Galleries': 0,
 'Arts & Crafts': 0,
 'Arts & Entertainment': 0,
 'Asian Fusion': 0,
 'Australian': 0,
 'Austrian': 0,
 'Automotive': 0,
 'Baden': 0,
 'Bagels': 0,
 'Bakeries': 0,
 'Bangladeshi': 0,
 'Banks & Credit Unions': 0,
 'Barbeque': 0,
 'Bars': 0,
 'Bartenders': 0,
 'Basque': 0,
 'Bavarian': 0,
 'Beauty & Spas': 0,
 'Bed & Breakfast': 0,
 'Beer Bar': 0,
 'Beer Garden': 0,
 'Beer Gardens': 0,
 'Beer Hall': 0,
 'Beer, Wine & Spirits': 0,
 'Belgian': 0,
 'Bikes': 0,
 'Bistros': 0,
 'Boating': 0,
 'Books, Mags, Music & Video': 0,
 'Bookstores': 0,
 'Bowling': 0,
 'Brasseries': 0,
 'Brazilian': 0,
 'Breakfast & Brunch': 0,
 'Breweries': 0,
 'British': 0,
 'Bubble Tea': 0,
 'Buffets': 0,
 'Burgers': 0,
 'Burmese': 0,
 'Butcher': 0,
 'Cabaret': 0,
 'Cafes': 0,
 'Cafeteria': 0,
 'Cajun/Creole': 0,
 'Cambodian': 0,
 'Canadian (New)': 0,
 'Candy Stores': 0,
 'Cantonese': 0,
 'Car Wash': 0,
 'Caribbean': 0,
 'Casinos': 0,
 'Caterers': 0,
 'Champagne Bars': 0,
 'Cheese Shops': 0,
 'Cheesesteaks': 0,
 'Chicken Shop': 0,
 'Chicken Wings': 0,
 'Chinese': 0,
 'Chocolatiers & Shops': 0,
 'Cinema': 0,
 'Cocktail Bars': 0,
 'Coffee & Tea': 0,
 'Colleges & Universities': 0,
 'Colombian': 0,
 'Comfort Food': 0,
 'Convenience Stores': 0,
 'Cooking Schools': 0,
 'Cosmetics & Beauty Supply': 0,
 'Country Clubs': 0,
 'Country Dance Halls': 0,
 'Creperies': 0,
 'Cuban': 0,
 'Cupcakes': 0,
 'Curry Sausage': 0,
 'Czech': 0,
 'DJs': 0,
 'Dance Clubs': 0,
 'Day Spas': 0,
 'Delicatessen': 0,
 'Delis': 0,
 'Department Stores': 0,
 'Desserts': 0,
 'Dim Sum': 0,
 'Diners': 0,
 'Dinner Theater': 0,
 'Distilleries': 0,
 'Dive Bars': 0,
 'Do-It-Yourself Food': 0,
 'Dominican': 0,
 'Donairs': 0,
 'Donuts': 0,
 'Drive-Thru Bars': 0,
 'Dry Cleaning & Laundry': 0,
 'Eastern European': 0,
 'Education': 0,
 'Egyptian': 0,
 'Empanadas': 0,
 'Ethiopian': 0,
 'Ethnic Food': 0,
 'Ethnic Grocery': 0,
 'Event Planning & Services': 0,
 'Falafel': 0,
 'Farmers Market': 0,
 'Fashion': 0,
 'Fast Food': 1,
 'Festivals': 0,
 'Filipino': 0,
 'Financial Services': 0,
 'Fish & Chips': 0,
 'Fitness & Instruction': 0,
 'Flea Markets': 0,
 'Florists': 0,
 'Flowers & Gifts': 0,
 'Fondue': 0,
 'Food': 0,
 'Food Court': 0,
 'Food Delivery Services': 0,
 'Food Stands': 0,
 'Food Trucks': 0,
 'Framing': 0,
 'French': 0,
 'Fruits & Veggies': 0,
 'Furniture Stores': 0,
 'Gas & Service Stations': 0,
 'Gastropubs': 0,
 'Gay Bars': 0,
 'Gelato': 0,
 'German': 0,
 'Gluten-Free': 0,
 'Golf': 0,
 'Golf Equipment Shops': 0,
 'Greek': 0,
 'Grocery': 0,
 'Guest Houses': 0,
 'Haitian': 0,
 'Halal': 0,
 'Hardware Stores': 0,
 'Hawaiian': 0,
 'Health & Medical': 0,
 'Health Markets': 0,
 'Herbs & Spices': 0,
 'Hiking': 0,
 'Himalayan/Nepalese': 0,
 'Home & Garden': 0,
 'Home Decor': 0,
 'Hong Kong Style Cafe': 0,
 'Hookah Bars': 0,
 'Horseback Riding': 0,
 'Hot Dogs': 0,
 'Hot Pot': 0,
 'Hotel bar': 0,
 'Hotels': 0,
 'Hotels & Travel': 0,
 'Hungarian': 0,
 'Ice Cream & Frozen Yogurt': 0,
 'Indian': 0,
 'Indonesian': 0,
 'International': 0,
 'Internet Cafes': 0,
 'Irish': 0,
 'Irish Pub': 0,
 'Italian': 0,
 'Japanese': 0,
 'Jazz & Blues': 0,
 'Juice Bars & Smoothies': 0,
 'Karaoke': 0,
 'Kebab': 0,
 'Kids Activities': 0,
 'Kitchen & Bath': 0,
 'Korean': 0,
 'Kosher': 0,
 'Lakes': 0,
 'Landmarks & Historical Buildings': 0,
 'Laotian': 0,
 'Latin American': 0,
 'Lebanese': 0,
 'Live/Raw Food': 0,
 'Local Flavor': 0,
 'Local Services': 0,
 'Lounges': 0,
 'Macarons': 0,
 'Malaysian': 0,
 'Meat Shops': 0,
 'Mediterranean': 0,
 "Men's Clothing": 0,
 'Mexican': 0,
 'Middle Eastern': 0,
 'Modern European': 0,
 'Mongolian': 0,
 'Moroccan': 0,
 'Music & DVDs': 0,
 'Music Venues': 0,
 'Musicians': 0,
 'New Mexican Cuisine': 0,
 'Nightlife': 0,
 'Noodles': 0,
 'Nutritionists': 0,
 'Olive Oil': 0,
 'Organic Stores': 0,
 'Oriental': 0,
 'Pakistani': 0,
 'Palatine': 0,
 'Party & Event Planning': 0,
 'Pasta Shops': 0,
 'Patisserie/Cake Shop': 0,
 'Performing Arts': 0,
 'Persian/Iranian': 0,
 'Personal Chefs': 0,
 'Peruvian': 0,
 'Piano Bars': 0,
 'Pita': 0,
 'Pizza': 0,
 'Poke': 0,
 'Polish': 0,
 'Pool Halls': 0,
 'Portuguese': 0,
 'Poutineries': 0,
 'Pretzels': 0,
 'Public Services & Government': 0,
 'Pubs': 0,
 'Puerto Rican': 0,
 'Ramen': 0,
 'Recreation Centers': 0,
 'Rhinelandian': 0,
 'Russian': 0,
 'Salad': 0,
 'Salvadoran': 0,
 'Sandwiches': 0,
 'Scandinavian': 0,
 'Scottish': 0,
 'Seafood': 0,
 'Seafood Markets': 0,
 'Senegalese': 0,
 'Serbo Croatian': 0,
 'Shanghainese': 0,
 'Shaved Ice': 0,
 'Shopping': 0,
 'Shopping Centers': 0,
 'Singaporean': 0,
 'Slovakian': 0,
 'Soccer': 0,
 'Social Clubs': 0,
 'Soul Food': 0,
 'Soup': 0,
 'Southern': 0,
 'Spanish': 0,
 'Specialty Food': 0,
 'Specialty Schools': 0,
 'Sporting Goods': 0,
 'Sports Bars': 0,
 'Sports Clubs': 0,
 'Sports Wear': 0,
 'Sri Lankan': 0,
 'Steakhouses': 0,
 'Street Vendors': 0,
 'Supper Clubs': 0,
 'Sushi Bars': 0,
 'Swimming Pools': 0,
 'Swiss Food': 0,
 'Szechuan': 0,
 'Taiwanese': 0,
 'Tapas Bars': 0,
 'Tapas/Small Plates': 0,
 'Tea Rooms': 0,
 'Teppanyaki': 0,
 'Tex-Mex': 0,
 'Thai': 0,
 'Tiki Bars': 0,
 'Tours': 0,
 'Toy Stores': 0,
 'Travel Services': 0,
 'Trinidadian': 0,
 'Turkish': 0,
 'Tuscan': 0,
 'Ukrainian': 0,
 'Uzbek': 0,
 'Vegan': 0,
 'Vegetarian': 0,
 'Venezuelan': 0,
 'Venues & Event Spaces': 0,
 'Vietnamese': 0,
 'Vinyl Records': 0,
 'Vitamins & Supplements': 0,
 'Waffles': 0,
 'Whiskey Bars': 0,
 'Wine Bars': 0,
 'Wine Tasting Room': 0,
 'Wineries': 0,
 'Wok': 0,
 'Yoga': 0}
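
The get_restaurant_type method is not shown in this notebook, so the following is only a sketch of one way to produce a similar column. It assumes the vocabulary is every category that appears alongside 'Restaurants', and it reuses the 'remove' placeholder seen in the output above for businesses that are not restaurants. The function name is hypothetical.

```python
def restaurant_types(categories_column):
    """One-hot encode the categories of restaurants; mark other rows 'remove'."""
    # Vocabulary: every category that co-occurs with 'Restaurants'.
    vocabulary = sorted({
        category
        for categories in categories_column
        if 'Restaurants' in categories
        for category in categories
        if category != 'Restaurants'
    })
    encoded = []
    for categories in categories_column:
        if 'Restaurants' not in categories:
            encoded.append('remove')  # placeholder seen in the output above
        else:
            present = set(categories)
            encoded.append({name: int(name in present) for name in vocabulary})
    return encoded


sample = [
    ['Fast Food', 'Restaurants'],
    ['Nightlife'],
    ['Sushi Bars', 'Japanese', 'Restaurants'],
]
for row in restaurant_types(sample):
    print(row)
```

Storing a dictionary in every row is convenient for inspection; for modelling, expanding the vocabulary into one column per category is usually easier to work with.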

Exercise 3: Let's clean the 'attributes' column. The entries in this column are dictionaries. We need to do two things:

1) Turn every True or False in the dictionary into 1 or 0.

2) Some entries within the dictionaries are dictionaries themselves. Let's flatten each entry into a single dictionary. For example, if we have

'{'Accepts Credit Cards': True, 'Alcohol': 'none','Ambience': {'casual': False,'classy': False}}'
then turn it into
'{'Accepts Credit Cards':1, 'Alcohol_none': 1, 'Ambience_casual': 0, 'Ambience_classy': 0}'.
There might be other entries like {'Price Range': 1} where the values are numerical, so we might want to change that into {'Price_Range_1': 1}.

The reason we modify categorical variables like this is that most machine learning algorithms cannot work directly with textual values like "True" and "False"; they need numerical inputs such as 1 and 0.


In [8]:
business.attributes = yelp.convert_boolean(business.attributes)
business.head()


Out[8]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods open review_count stars state type zip_code restaurant_type
0 {'Take-out': 1, 'Drive-Thru': 0, 'Good For': {... 5UmKMjUEUNdYWqANhGckJw [Fast Food, Restaurants] Dravosburg 4734 Lebanon Church Rd\nDravosburg, PA 15034 {'Friday': {'close': '21:00', 'open': '11:00'}... 40.354327 -79.900706 Mr Hoagie [] True 7 3.5 PA business 15034 {'Fast Food': 1, 'Bars': 0, 'American (New)': ...
1 {'Happy Hour': 1, 'Accepts Credit Cards': 1, '... UsFtqoBl7naz8AVUBZMjQQ [Nightlife] Dravosburg 202 McClure St\nDravosburg, PA 15034 {} 40.350553 -79.886814 Clancy's Pub [] True 5 3.0 PA business 15034 remove
2 {'Good for Kids': 1} cE27W9VPgO88Qxe4ol6y_g [Active Life, Mini Golf, Golf] Bethel Park 1530 Hamilton Rd\nBethel Park, PA 15234 {} 40.354115 -80.014660 Cool Springs Golf Center [] False 5 2.5 PA business 15234 remove
3 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... mVHrayjG3uZ_RLHkLj-AMg [Bars, American (New), Nightlife, Lounges, Res... Braddock 414 Hawkins Ave\nBraddock, PA 15104 {'Tuesday': {'close': '19:00', 'open': '10:00'... 40.408830 -79.866211 Emil's Lounge [] True 26 4.5 PA business 15104 {'Fast Food': 0, 'Bars': 1, 'American (New)': ...
4 {'Parking': {'garage': 0, 'street': 0, 'valida... mYSpR_SLPgUVymYOvTQd_Q [Active Life, Golf] Braddock 1000 Clubhouse Dr\nBraddock, PA 15104 {'Sunday': {'close': '15:00', 'open': '10:00'}... 40.403405 -79.855782 Grand View Golf Club [] True 3 5.0 PA business 15104 remove

In [9]:
business.attributes.loc[0]


Out[9]:
{'Accepts Credit Cards': 1,
 'Alcohol_none': 1,
 'Ambience': {'casual': 0,
  'classy': 0,
  'divey': 0,
  'hipster': 0,
  'intimate': 0,
  'romantic': 0,
  'touristy': 0,
  'trendy': 0,
  'upscale': 0},
 'Attire_casual': 1,
 'Caters': 0,
 'Delivery': 0,
 'Drive-Thru': 0,
 'Good For': {'breakfast': 0,
  'brunch': 0,
  'dessert': 0,
  'dinner': 0,
  'latenight': 0,
  'lunch': 0},
 'Good For Groups': 1,
 'Good for Kids': 1,
 'Has TV': 0,
 'Noise Level_average': 1,
 'Outdoor Seating': 0,
 'Parking': {'garage': 0, 'lot': 0, 'street': 0, 'valet': 0, 'validated': 0},
 'Price Range_1': 1,
 'Take-out': 1,
 'Takes Reservations': 0,
 'Waiter Service': 0}
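
The output above still contains nested dictionaries such as 'Ambience' and 'Parking'. A minimal sketch of a recursive flattening step that also covers the second requirement is shown below. It is not the convert_boolean function from k2datascience.yelp; the function name and the underscore-joined key format are assumptions taken from the exercise text.

```python
def flatten_attributes(attributes, prefix=''):
    """Flatten a nested attribute dict into a single level of 0/1 flags."""
    flat = {}
    for key, value in attributes.items():
        name = f'{prefix}{key}'
        if isinstance(value, dict):
            # Nested dict: recurse, prefixing child keys with the parent key.
            flat.update(flatten_attributes(value, prefix=f'{name}_'))
        elif isinstance(value, bool):
            # bool must be tested before int because bool subclasses int.
            flat[name] = int(value)
        else:
            # Strings and numbers become part of the key, with the flag set.
            flat[f'{name}_{value}'] = 1
    return flat


example = {
    'Accepts Credit Cards': True,
    'Alcohol': 'none',
    'Ambience': {'casual': False, 'classy': False},
    'Price Range': 1,
}
print(flatten_attributes(example))
```

The function should be applied once to the raw data. Running it again on its own output would treat the 0/1 flags as numerical values and append them to the keys.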

Exercise 4: Create a new column for every day of the week and fill it with the number of hours the business is open that day.

Your approach should handle businesses that stay open late like bars and nightclubs.


In [10]:
ydc.calc_open_hours()
business.head(6)


Out[10]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods ... type zip_code restaurant_type Sunday Monday Tuesday Wednesday Thursday Friday Saturday
0 {'Take-out': 1, 'Drive-Thru': 0, 'Good For': {... 5UmKMjUEUNdYWqANhGckJw [Fast Food, Restaurants] Dravosburg 4734 Lebanon Church Rd\nDravosburg, PA 15034 {'Friday': {'close': '21:00', 'open': '11:00'}... 40.354327 -79.900706 Mr Hoagie [] ... business 15034 {'Fast Food': 1, 'Bars': 0, 'American (New)': ... 0.0 10.0 10.0 10.0 10.0 10.0 0.0
1 {'Happy Hour': 1, 'Accepts Credit Cards': 1, '... UsFtqoBl7naz8AVUBZMjQQ [Nightlife] Dravosburg 202 McClure St\nDravosburg, PA 15034 {} 40.350553 -79.886814 Clancy's Pub [] ... business 15034 remove 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 {'Good for Kids': 1} cE27W9VPgO88Qxe4ol6y_g [Active Life, Mini Golf, Golf] Bethel Park 1530 Hamilton Rd\nBethel Park, PA 15234 {} 40.354115 -80.014660 Cool Springs Golf Center [] ... business 15234 remove 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... mVHrayjG3uZ_RLHkLj-AMg [Bars, American (New), Nightlife, Lounges, Res... Braddock 414 Hawkins Ave\nBraddock, PA 15104 {'Tuesday': {'close': '19:00', 'open': '10:00'... 40.408830 -79.866211 Emil's Lounge [] ... business 15104 {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 0.0 0.0 9.0 9.0 9.0 10.0 6.0
4 {'Parking': {'garage': 0, 'street': 0, 'valida... mYSpR_SLPgUVymYOvTQd_Q [Active Life, Golf] Braddock 1000 Clubhouse Dr\nBraddock, PA 15104 {'Sunday': {'close': '15:00', 'open': '10:00'}... 40.403405 -79.855782 Grand View Golf Club [] ... business 15104 remove 5.0 0.0 0.0 9.0 9.0 9.0 9.0
5 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... KayYbHCt-RkbGcPdGOThNg [Bars, American (Traditional), Nightlife, Rest... Carnegie 141 Hawthorne St\nGreentree\nCarnegie, PA 15106 {'Monday': {'close': '02:00', 'open': '11:00'}... 40.415486 -80.067549 Alexion's Bar & Grill [Greentree] ... business 15106 {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 10.0 15.0 15.0 15.0 15.0 15.0 14.0

6 rows × 24 columns
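
The calc_open_hours method is not shown here. One way to handle businesses that close after midnight is sketched below, assuming that a closing time at or before the opening time belongs to the following day. The function names are hypothetical, and days missing from the 'hours' dictionary are given 0.0 to match the output above.

```python
from datetime import datetime, timedelta

DAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday',
        'Thursday', 'Friday', 'Saturday']


def hours_open(open_time, close_time):
    """Hours between two 'HH:MM' strings, rolling past midnight if needed."""
    opened = datetime.strptime(open_time, '%H:%M')
    closed = datetime.strptime(close_time, '%H:%M')
    if closed <= opened:
        # Assumption: a close at or before the open time means the next day.
        closed += timedelta(days=1)
    return (closed - opened).total_seconds() / 3600


def weekly_hours(hours):
    """Map every weekday to its open hours; days without an entry get 0.0."""
    return {
        day: hours_open(hours[day]['open'], hours[day]['close'])
        if day in hours else 0.0
        for day in DAYS
    }


print(hours_open('11:00', '21:00'))
print(hours_open('11:00', '02:00'))
```

Under this assumption, identical opening and closing times count as 24 hours. Whether that is the right reading of the data is a judgment call that should be checked against a few known businesses.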


Exercise 5: Create a table with the average review rating for each business.

You will need to pull in a new JSON file and merge DataFrames for the next two exercises.


In [11]:
review = ydc.file_data['review']
review.shape
review.head()
review.tail()


Out[11]:
(2685066, 8)
Out[11]:
business_id date review_id stars text type user_id votes
0 5UmKMjUEUNdYWqANhGckJw 2012-08-01 Ya85v4eqdd6k9Od8HbQjyA 4 Mr Hoagie is an institution. Walking in, it do... review PUFPaY9KxDAcGqfsorJp3Q {'funny': 0, 'useful': 0, 'cool': 0}
1 5UmKMjUEUNdYWqANhGckJw 2014-02-13 KPvLNJ21_4wbYNctrOwWdQ 5 Excellent food. Superb customer service. I mis... review Iu6AxdBYGR4A0wspR9BYHA {'funny': 0, 'useful': 0, 'cool': 0}
2 5UmKMjUEUNdYWqANhGckJw 2015-10-31 fFSoGV46Yxuwbr3fHNuZig 5 Yes this place is a little out dated and not o... review auESFwWvW42h6alXgFxAXQ {'funny': 1, 'useful': 1, 'cool': 0}
3 5UmKMjUEUNdYWqANhGckJw 2015-12-26 pVMIt0a_QsKtuDfWVfSk2A 3 PROS: Italian hoagie was delicious. Friendly ... review qiczib2fO_1VBG8IoCGvVg {'funny': 0, 'useful': 0, 'cool': 0}
4 5UmKMjUEUNdYWqANhGckJw 2016-04-08 AEyiQ_Y44isJmNbMTyoMKQ 2 First the only reason this place could possibl... review qEE5EvV-f-s7yHC0Z4ydJQ {'funny': 0, 'useful': 1, 'cool': 0}
Out[11]:
business_id date review_id stars text type user_id votes
2685061 DH2Ujt_hwcMBIz8VvCb0Lg 2015-11-23 5-pv7M86ZdrXjfHPkPsZug 1 Still sick. Do not eat here unless you want to... review kONznNes89LWlc1jcZtD0A {'funny': 0, 'useful': 0, 'cool': 0}
2685062 DH2Ujt_hwcMBIz8VvCb0Lg 2015-11-24 MjGrqy30haStX4Q6SWsdcg 1 This place sucks especially the white manager ... review 6jXm3mrRGAPRENujxhlRpw {'funny': 0, 'useful': 0, 'cool': 0}
2685063 DH2Ujt_hwcMBIz8VvCb0Lg 2016-02-13 7ZfVeWubWTleBJUXXMPl_w 3 Not a bad stop for airport food. I got the chi... review D8AR0UYdlHClqcjARPEr8Q {'funny': 0, 'useful': 0, 'cool': 0}
2685064 DH2Ujt_hwcMBIz8VvCb0Lg 2016-04-30 vwmqHxxmy9rEAwhbkLXmnQ 3 He stood in the face of a 2.5 star biz, and br... review nELVJlkX8T0mUAArSPSJxw {'funny': 5, 'useful': 4, 'cool': 4}
2685065 DH2Ujt_hwcMBIz8VvCb0Lg 2016-07-11 DDmiTM_jMhshjYkXk5Sshg 1 2 pm Monday afternoon. Out of sour cream (ridi... review maAimqEE4G483rtifPKlYg {'funny': 0, 'useful': 0, 'cool': 0}

In [12]:
ydc.get_avg_stars()
ydc.file_data['business'].head()


Out[12]:
attributes business_id categories city full_address hours latitude longitude name neighborhoods ... zip_code restaurant_type Sunday Monday Tuesday Wednesday Thursday Friday Saturday stars_avg
0 {'Take-out': 1, 'Drive-Thru': 0, 'Good For': {... 5UmKMjUEUNdYWqANhGckJw [Fast Food, Restaurants] Dravosburg 4734 Lebanon Church Rd\nDravosburg, PA 15034 {'Friday': {'close': '21:00', 'open': '11:00'}... 40.354327 -79.900706 Mr Hoagie [] ... 15034 {'Fast Food': 1, 'Bars': 0, 'American (New)': ... 0.0 10.0 10.0 10.0 10.0 10.0 0.0 3.428571
1 {'Happy Hour': 1, 'Accepts Credit Cards': 1, '... UsFtqoBl7naz8AVUBZMjQQ [Nightlife] Dravosburg 202 McClure St\nDravosburg, PA 15034 {} 40.350553 -79.886814 Clancy's Pub [] ... 15034 remove 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.000000
2 {'Good for Kids': 1} cE27W9VPgO88Qxe4ol6y_g [Active Life, Mini Golf, Golf] Bethel Park 1530 Hamilton Rd\nBethel Park, PA 15234 {} 40.354115 -80.014660 Cool Springs Golf Center [] ... 15234 remove 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.600000
3 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... mVHrayjG3uZ_RLHkLj-AMg [Bars, American (New), Nightlife, Lounges, Res... Braddock 414 Hawkins Ave\nBraddock, PA 15104 {'Tuesday': {'close': '19:00', 'open': '10:00'... 40.408830 -79.866211 Emil's Lounge [] ... 15104 {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 0.0 0.0 9.0 9.0 9.0 10.0 6.0 4.680000
4 {'Parking': {'garage': 0, 'street': 0, 'valida... mYSpR_SLPgUVymYOvTQd_Q [Active Life, Golf] Braddock 1000 Clubhouse Dr\nBraddock, PA 15104 {'Sunday': {'close': '15:00', 'open': '10:00'}... 40.403405 -79.855782 Grand View Golf Club [] ... 15104 remove 5.0 0.0 0.0 9.0 9.0 9.0 9.0 5.000000

5 rows × 25 columns
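
The get_avg_stars method is not shown in this notebook. The underlying computation is a group-by on business_id followed by a mean of the review stars, sketched here with the standard library only. The function name and the sample records are made up for illustration.

```python
from collections import defaultdict


def average_stars(reviews):
    """Return {business_id: mean star rating} for an iterable of review dicts."""
    totals = defaultdict(lambda: [0, 0])  # business_id -> [star sum, count]
    for review in reviews:
        entry = totals[review['business_id']]
        entry[0] += review['stars']
        entry[1] += 1
    return {business: total / count
            for business, (total, count) in totals.items()}


reviews = [
    {'business_id': 'a', 'stars': 4},
    {'business_id': 'a', 'stars': 5},
    {'business_id': 'b', 'stars': 2},
]
print(average_stars(reviews))
```

With pandas, the same result comes from review.groupby('business_id')['stars'].mean(), which can then be merged into the business table on business_id. Note that the existing 'stars' column in the business file is rounded to half stars (3.5 for Mr Hoagie), while stars_avg keeps the unrounded mean (3.428571).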


Exercise 6: Create a new table that only contains restaurants with the following schema:

Business_Name | Restaurant_type | Friday hours | Saturday hours | Attributes | Zipcode | Average Rating


In [13]:
# Columns requested by the schema, in order.
columns = ['name', 'restaurant_type', 'Friday', 'Saturday',
           'attributes', 'zip_code', 'stars_avg']
ydc.file_data['business'].loc[:, columns]


Out[13]:
name restaurant_type Friday Saturday attributes zip_code stars_avg
0 Mr Hoagie {'Fast Food': 1, 'Bars': 0, 'American (New)': ... 10.0 0.0 {'Take-out': 1, 'Drive-Thru': 0, 'Good For': {... 15034 3.428571
1 Clancy's Pub remove 0.0 0.0 {'Happy Hour': 1, 'Accepts Credit Cards': 1, '... 15034 3.000000
2 Cool Springs Golf Center remove 0.0 0.0 {'Good for Kids': 1} 15234 2.600000
3 Emil's Lounge {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 10.0 6.0 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... 15104 4.680000
4 Grand View Golf Club remove 9.0 9.0 {'Parking': {'garage': 0, 'street': 0, 'valida... 15104 5.000000
5 Alexion's Bar & Grill {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 15.0 14.0 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... 15106 3.894737
6 Flynn's Tire & Auto Service remove 10.5 8.5 {'Accepts Credit Cards': 1} 15106 2.625000
7 Forsythe Miniature Golf & Snacks remove 0.0 0.0 {'Good for Kids': 1} 15106 4.000000
8 Quaker State Construction remove 0.0 0.0 {} 15106 2.333333
9 Greentree Animal Clinic remove 0.0 0.0 {} 15220 3.400000
10 Carnegie Free Library remove 0.0 0.0 {'Wi-Fi_free_1_1_1_1': 1} 15106 4.333333
11 Advance Auto Parts remove 0.0 0.0 {} 15106 3.666667
12 Kings Family Restaurant {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 18.0 18.0 {'Take-out': 1, 'Drive-Thru': 0, 'Good For': {... 15106 3.250000
13 Shop N' Save remove 0.0 0.0 {'Parking': {'garage': 0, 'street': 0, 'valida... 15106 3.500000
14 Knorr's Sunoco Service remove 0.0 0.0 {} 15106 3.000000
15 Rossi Tailoring & Cleaners remove 0.0 0.0 {'Accepts Credit Cards': 1} 15106 3.000000
16 Heidelberg B P remove 0.0 0.0 {} 15106 2.750000
17 Rocky's Lounge {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 12.0 12.0 {'Music': {'dj': 0}, 'Ambience': {'romantic': ... 15106 3.800000
18 Gab & Eat {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 8.5 6.5 {'Has TV': 1, 'Ambience': {'romantic': 0, 'int... 15106 4.250000
19 Barb's Country Junction Cafe {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 7.0 5.0 {'Take-out': 1, 'Good For': {'dessert': 0, 'la... 15106 4.250000
20 Extended Stay America - Pittsburgh - Carnegie remove 0.0 0.0 {'Accepts Credit Cards': 1, 'Price Range_2': 1... 15106 3.500000
21 Paddy's Pour House {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 0.0 0.0 {'Coat Check': 0, 'Take-out': 1, 'Good For': {... 15106 3.142857
22 Porto Fino Pizzaria & Gyro {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 13.0 12.0 {} 15106 2.250000
23 Alteration World remove 10.5 10.0 {'Accepts Credit Cards': 0} 15106 4.444444
24 Long John Silver's {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 0.0 0.0 {'Take-out': 1, 'Caters': 0, 'Takes Reservatio... 15106 3.400000
25 Weinberg Lisa, DMD remove 0.0 0.0 {'By Appointment Only': 1} 15106 2.857143
26 Don Don Chinese Restaurant {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 0.0 0.0 {'Take-out': 1, 'Parking': {'garage': 0, 'stre... 15106 2.444444
27 Chartiers Animal Hospital remove 6.0 7.0 {} 15106 3.000000
28 Denny's {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 0.0 0.0 {'Take-out': 1, 'Good For': {'dessert': 0, 'la... 15220 4.090909
29 Amerifit remove 15.5 9.5 {'By Appointment Only': 0, 'Accepts Credit Car... 15106 3.000000
... ... ... ... ... ... ... ...
85871 Kanoa Strength Gym remove 16.0 4.0 {'By Appointment Only': 0, 'Accepts Credit Car... 89103 5.000000
85872 Marché Adonis remove 13.0 12.0 {'Take-out': 0, 'Parking': {'garage': 0, 'stre... H3C 2G6 2.666667
85873 CM2 Pizzeria & Bakeshop {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 5.0 5.0 {} 11485 5.000000
85874 The Grand Central Coffee Company remove 20.0 20.0 {'Take-out': 1, 'Outdoor Seating': 1, 'Parking... 85004 4.142857
85875 Sexy3D Extensions remove 14.0 14.0 {} 89148 5.000000
85876 Yelp Elites Celebrate The Zombie Apocalypse! remove 0.0 0.0 {} H2L 4.500000
85877 Elites at Overture: Kinky Boots remove 0.0 0.0 {} 53703 4.222222
85878 YEE: "Greed" at The Mob Museum remove 0.0 0.0 {'Good for Kids': 1} 89101 5.000000
85879 YEE: An Evening With Kid Cashew remove 0.0 0.0 {} 28203 4.958333
85880 The Chicken Scoop {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 8.0 8.0 {'Take-out': 1, 'Good For': {'dessert': 0, 'la... 16495 4.333333
85881 Pure Sushi Colony {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 12.0 12.0 {'Take-out': 1, 'Takes Reservations': 1, 'Has ... 85014 3.500000
85882 Skin Awakened remove 0.0 9.0 {'Parking': {'garage': 0, 'street': 0, 'valida... 85251 5.000000
85883 Brad Winston, CPA/Realtor-King of Condos, Inc. remove 15.0 15.0 {} 89104 5.000000
85884 Pita Pit {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 12.0 12.0 {'Take-out': 1, 'Takes Reservations': 0, 'Deli... 85044 5.000000
85885 Tokyo Sushi House II {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 10.5 10.5 {'Take-out': 1, 'Takes Reservations': 0, 'Deli... 89032 2.600000
85886 Kneaders Bakery & Cafe {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 15.0 15.0 {'Take-out': 1, 'Good For': {'dessert': 0, 'la... 89129 3.625000
85887 Frys Marketplace remove 12.0 9.0 {'Parking': {'garage': 0, 'street': 0, 'valida... 85022 2.578947
85888 Photobetty Photography remove 12.0 12.0 {'Accepts Credit Cards': 1} NaN 5.000000
85889 Chestnut {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 7.0 7.0 {'Take-out': 1, 'Good For': {'dessert': 0, 'la... 85018 3.364865
85890 Greathouse Sports Grill {'Fast Food': 0, 'Bars': 1, 'American (New)': ... 7.0 7.0 {'Take-out': 1, 'Takes Reservations': 0, 'Deli... 85339 2.000000
85891 Bar Louie remove 0.0 0.0 {} 85305 2.764706
85892 Lyric Apartments remove 9.0 9.0 {'Accepts Credit Cards': 1} 89183 2.333333
85893 Dunn's Import remove 5.0 0.0 {} 53562 4.375000
85894 Aries Unisex Salon remove 0.0 0.0 {'By Appointment Only': 0, 'Parking': {'garage... 85021 5.000000
85895 Citta Delle Luci remove 13.0 13.0 {} 89109 3.000000
85896 Bowties Bridal remove 8.0 8.0 {'Accepts Credit Cards': 1} 89120 4.157895
85897 Senor Taco {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 0.0 0.0 {'Take-out': 1, 'Good For': {'dessert': 0, 'la... 85338 3.561798
85898 Cobblestone Auto Spa remove 11.0 11.0 {} 85020 3.125000
85899 Princess Street Suites remove 0.0 0.0 {'Accepts Credit Cards': 1, 'Price Range_2': 1... EH1 3EG 4.000000
85900 Salsarita's Express {'Fast Food': 0, 'Bars': 0, 'American (New)': ... 0.0 0.0 {} 28208 2.519231

85901 rows × 7 columns
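
The table above still holds all 85901 businesses, including the rows whose restaurant_type is the placeholder 'remove'. To meet the exercise requirement of keeping only restaurants, those rows have to be filtered out. The sketch below shows the idea on plain dictionaries; the function name and the sample rows are illustrative, with values abbreviated from the output above.

```python
COLUMNS = ['name', 'restaurant_type', 'Friday', 'Saturday',
           'attributes', 'zip_code', 'stars_avg']


def restaurants_only(records):
    """Keep restaurant rows and project them onto the requested columns."""
    return [
        {column: record[column] for column in COLUMNS}
        for record in records
        # Restaurants carry a dict; other businesses carry the string 'remove'.
        if isinstance(record['restaurant_type'], dict)
    ]


rows = [
    {'name': 'Mr Hoagie', 'restaurant_type': {'Fast Food': 1},
     'Friday': 10.0, 'Saturday': 0.0, 'attributes': {'Take-out': 1},
     'zip_code': '15034', 'stars_avg': 3.428571, 'city': 'Dravosburg'},
    {'name': "Clancy's Pub", 'restaurant_type': 'remove',
     'Friday': 0.0, 'Saturday': 0.0, 'attributes': {'Happy Hour': 1},
     'zip_code': '15034', 'stars_avg': 3.0, 'city': 'Dravosburg'},
]
print(restaurants_only(rows))
```

On the DataFrame, an equivalent row filter could be built with business['restaurant_type'].apply(lambda value: isinstance(value, dict)) and combined with the column selection via .loc.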