The data analysis below explores the distribution of restaurants across New York City’s five boroughs, identifying relationships between geographic location, inspection rating, and cuisine type. The report offers insight into trends in consumer food preferences and into the relationship between cuisine and the city’s demographic distribution.
This report’s primary data source is the DOHMH New York City Restaurant Inspection Results from the NYC OpenData project. The data come from the unannounced inspections of 24,000 restaurants conducted annually by the New York City Health Department. These inspections cover food handling, food temperature, personal hygiene, and vermin control, and assign points to each restaurant based on the number of violations and their severity in terms of public health risk. The lower the score, the better the overall restaurant grade, because a lower score means fewer violations of the health code (NYC Health).
This official dataset from the NYC OpenData project includes restaurant names, borough and exact address, type of cuisine, and several fields describing each restaurant's grade and any violations it may have. We use the location, cuisine type, and grade to explore the distribution of cuisines across the city, and to ask whether there is a correlation between cuisine and grade or between location and grade.
Source URL: https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/xx67-kt59
To get the API endpoint, click Export > SODA API and copy the API endpoint URL.
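The query URLs used later in this notebook are built by appending `$select` and `$limit` options to a resource endpoint. As a minimal sketch of the same idea, the snippet below assembles such a URL with the standard library. The helper name `build_query_url` is hypothetical and not part of the original notebook; the endpoint is the 2013 extract listed in the code cells.

```python
from urllib.parse import urlencode

# Resource endpoint for the 2013 extract, as listed in this notebook.
RESOURCE_URL = "https://data.cityofnewyork.us/resource/snte-me4y.json"


def build_query_url(resource_url, columns, limit=None):
    """Return a query URL selecting `columns`, optionally capped at `limit` rows.

    Hypothetical helper for illustration; the notebook concatenates strings instead.
    """
    params = {"$select": ",".join(columns)}
    if limit is not None:
        params["$limit"] = limit
    # safe="$," leaves the "$" in option names and the commas between columns unescaped.
    return resource_url + "?" + urlencode(params, safe="$,")


url = build_query_url(RESOURCE_URL, ["boro", "cuisine_description"], limit=50000)
print(url)
```

The resulting string can be passed to `pd.read_json` in the same way as the hand-built `base_url` in the code cells.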
The top 10 cuisines in New York City from 2013-2015, measured by number of restaurants per cuisine type, are as follows: American, Chinese, Italian, Pizza, Latin, Mexican, Japanese, Cafe/Coffee/Tea, Bakery, and Spanish.
We saw the greatest increase in the number of Spanish and Chinese restaurants, both of which grew by 43% over the two-year period. Cafes also saw a significant 40% increase, followed by Pizza with a 31% increase from 2013 to 2015. There are almost 3,000 American restaurants in New York City, far more than any other cuisine type (for example, over 1,500 more than Chinese, the #2 cuisine).
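The growth figures above are percentage changes between the 2013 and 2015 restaurant counts. A minimal sketch of that calculation follows. The counts in it are made-up round numbers chosen for illustration, not values from the dataset, and `percent_change` is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical restaurant counts, for illustration only (not taken from the dataset).
counts_2013 = pd.Series({"Spanish": 100, "Cafe/Coffee/Tea": 200, "Pizza": 400})
counts_2015 = pd.Series({"Spanish": 143, "Cafe/Coffee/Tea": 280, "Pizza": 524})


def percent_change(start, end):
    """Percentage change from `start` to `end`, aligned on cuisine name."""
    return ((end - start) / start * 100).round(1)


growth = percent_change(counts_2013, counts_2015).sort_values(ascending=False)
print(growth)
```

In the notebook itself, the per-cuisine series `grouped2013` and `grouped2015` built in the code cells could presumably be passed to the same function.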
These trends align with Google’s recent study on U.S. food industry trends, which compiled data related to food category queries from January 2014 to February 2016. Top sustained risers, which are categories exhibiting steady volume growth over the past year, include ramen, bibimbap, empanada, and Italian pasta dishes. Top sustained decliners include gluten free cupcakes, wheat free bread, and bacon-flavored desserts. Consumers are using food as a cultural experience and, because global cuisines are more difficult to prepare at home, are looking to restaurants to provide it (Google).
The National Restaurant Association’s ‘What’s Hot in 2016’ report concludes that “chef-driven fast-casual concepts, hyper-local sourcing, natural ingredients/ minimally processed food, environmental sustainability, and artisan butchery” are the trending culinary themes across the country. There has also been unprecedented growth in interest in “authentic ethnic cuisine, African, Latin American, Middle Eastern, and ethnic fusion” (National Restaurant Association).
In New York City specifically, Yelp reports several cuisines with higher popularity than the national average (Huffington Post), including Kosher (414% higher), Halal (233% higher), Spanish (206% higher), Caribbean (158% higher), and Delis (117% higher). These trends are consistent with the increase in the number of restaurants serving global/ethnic cuisine in New York City, as well as with the casual, hyper-local Italian and Pizza restaurants New York City has become known for.
In [1]:
import sys # system module
import pandas as pd # data package
import matplotlib.pyplot as plt # graphics module
import datetime as dt # date and time module
import numpy as np # foundation for Pandas
import seaborn as sns # fancy matplotlib graphics (seaborn.apionly is no longer available in current seaborn releases)
from dateutil.relativedelta import relativedelta
# plotly imports (plotly allows us to make things interactive)
from plotly.offline import iplot, iplot_mpl # plotting functions
import plotly.graph_objs as go # ditto
import plotly # just to print version and init notebook
import cufflinks as cf # gives us df.iplot that feels like df.plot - gives us access to things that pop up on pandas
cf.set_config_file(offline=True, offline_show_link=False)
# these lines make our graphics show up in the notebook
%matplotlib inline
plotly.offline.init_notebook_mode()
#URLs of Sorted Data
#2013-2015: https://data.cityofnewyork.us/resource/batu-qkuq.json
#2015: https://data.cityofnewyork.us/resource/vv3s-su4i.json
#2014: https://data.cityofnewyork.us/resource/k8fu-sr5f.json
#2013: https://data.cityofnewyork.us/resource/snte-me4y.json
In [2]:
#Analysis of 2013 Data
# Set the base url which we will add options to
base_url = "https://data.cityofnewyork.us/resource/snte-me4y.json?"
#
# Add options
# Select which columns we want
base_url += "&$select=boro,cuisine_description"
#
# Retrieve data from website using the api
df2013 = pd.read_json(base_url)
# Group data by Cuisine Description and Sort Values
grouped2013 = df2013.groupby('cuisine_description').size().sort_values()
#Plot bar chart
plt.style.use("ggplot")
fig, ax = plt.subplots()
grouped2013.plot(ax=ax,kind="barh",figsize=(8, 13), color="c")
ax.set_title('Number of Restaurants by Cuisine Type 2013', loc='left', fontsize=20)
ax.set_xlabel('Number of Restaurants', fontsize=14)
ax.set_ylabel('Type of Cuisine', fontsize=14)
#
#
#
#Analysis of 2014 Data
base_url = "https://data.cityofnewyork.us/resource/k8fu-sr5f.json?"
base_url += "&$select=boro,cuisine_description"
df2014 = pd.read_json(base_url)
# Group data by Cuisine Description and Sort Values
grouped2014 = df2014.groupby('cuisine_description').size().sort_values()
#Plot bar chart
plt.style.use("ggplot")
fig, ax = plt.subplots()
grouped2014.plot(ax=ax,kind="barh",figsize=(8, 13), color="c")
ax.set_title('Number of Restaurants by Cuisine Type 2014', loc='left', fontsize=20)
ax.set_xlabel('Number of Restaurants', fontsize=14)
ax.set_ylabel('Type of Cuisine', fontsize=14)
#
#
#
#Analysis of 2015 Data
base_url = "https://data.cityofnewyork.us/resource/vv3s-su4i.json?"
base_url += "&$select=boro,cuisine_description"
df2015 = pd.read_json(base_url)
# Group data by Cuisine Description and Sort Values
grouped2015 = df2015.groupby('cuisine_description').size().sort_values()
#Plot bar chart
plt.style.use("ggplot")
fig, ax = plt.subplots()
grouped2015.plot(ax=ax,kind="barh",figsize=(8, 13), color="c")
ax.set_title('Number of Restaurants by Cuisine Type 2015', loc='left', fontsize=20)
ax.set_xlabel('Number of Restaurants', fontsize=14)
ax.set_ylabel('Type of Cuisine', fontsize=14)
Using 2015 data, we plotted types of cuisine by borough to determine whether location played a part in the type of restaurant that opened and was successful in each area. A total of 74 different cuisines were reported in 2015. According to our analysis, Manhattan is home to restaurants covering 66 of those cuisine types (89%), followed by Brooklyn with 57 cuisine types (77%) and Queens with 54 cuisine types (73%). The Bronx and Staten Island had the lowest diversity of cuisine types, with 32 types (43%). This makes sense when one considers the number of tourists and native New Yorkers who dine out in Manhattan vs. the outer boroughs. From a foot traffic and profitability perspective, it makes more sense to be located centrally, in Manhattan.
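The cuisine coverage figures above amount to counting distinct cuisine types per borough and dividing by the citywide total. A minimal sketch under stated assumptions: the rows below are toy data, not the 2015 extract, and only the column names `boro` and `cuisine_description` come from the notebook's queries.

```python
import pandas as pd

# Toy rows standing in for the 2015 extract; the values are hypothetical.
df = pd.DataFrame({
    "boro": ["MANHATTAN", "MANHATTAN", "MANHATTAN", "BROOKLYN", "BROOKLYN", "QUEENS"],
    "cuisine_description": ["American", "Italian", "Japanese", "American", "Pizza", "Korean"],
})

# Number of distinct cuisine types citywide and within each borough.
total_cuisines = df["cuisine_description"].nunique()
per_boro = df.groupby("boro")["cuisine_description"].nunique()

# Share of all cuisine types that each borough covers, as a rounded percentage.
coverage = pd.DataFrame({
    "cuisine_types": per_boro,
    "share_pct": (per_boro / total_cuisines * 100).round(0),
}).sort_values("cuisine_types", ascending=False)
print(coverage)
```

Using `nunique` rather than a row count keeps a cuisine from being counted more than once when a borough has many restaurants of the same type.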
According to data from the U.S. Census Bureau, the five boroughs of New York City differ significantly in terms of demographic diversity, which we believe to be a factor determining the types of restaurants that open and thrive in each neighborhood. The Bronx, Brooklyn, and Queens are the most diverse boroughs compared to Manhattan and Staten Island, with more than 30% of their populations identifying as Black, Asian, Multiracial, or Other (Crains).
This ethnic diversity aligns with our finding that, although Manhattan has the most cuisine types overall, certain cuisine types are more common in particular boroughs. American and Chinese restaurants were the most dominant cuisine types across all five boroughs, but we see major differences once we move beyond the top two. Manhattan's top 5 is rounded out by Italian (3), Japanese (4), and Cafes/Coffee Shops (5). Queens' top 5 includes Latin (3), Pizza (4), and Korean (5). Brooklyn and the Bronx also include Pizza in the top 5, along with Caribbean and Mexican (Brooklyn), and Latin and Spanish (Bronx). Staten Island includes Pizza (3), Italian (4), and Mexican (5). We therefore believe there is a relationship between restaurant cuisine and the ethnic makeup of the borough in which it is located.
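A per-borough ranking like the top-5 lists above can be produced by counting rows for each borough and cuisine pair, sorting within each borough, and keeping the first few rows. The sketch below uses toy rows rather than the real extract, and `top_cuisines` is a hypothetical helper name.

```python
import pandas as pd

# Toy rows standing in for the real extract; the values are hypothetical.
rows = (
    [("MANHATTAN", "American")] * 3
    + [("MANHATTAN", "Italian")] * 2
    + [("MANHATTAN", "Japanese")]
    + [("QUEENS", "Chinese")] * 2
    + [("QUEENS", "Korean")]
)
df = pd.DataFrame(rows, columns=["boro", "cuisine_description"])


def top_cuisines(frame, n=5):
    """Return the n most common cuisine types within each borough."""
    counts = (
        frame.groupby(["boro", "cuisine_description"])
        .size()
        .rename("restaurants")
        .reset_index()
    )
    # Sort boroughs alphabetically and cuisines by descending count within each borough.
    counts = counts.sort_values(["boro", "restaurants"], ascending=[True, False])
    return counts.groupby("boro").head(n)


top = top_cuisines(df, n=2)
print(top)
```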
In [3]:
#Analysis of Cuisine Types by Borough
import matplotlib.pyplot as plt
base_url = "https://data.cityofnewyork.us/resource/batu-qkuq.json?"
base_url += "&$limit=50000"
base_url += "&$select=camis,boro,cuisine_description"
dfboro = pd.read_json(base_url)
# Count rows for each borough and cuisine type (no sorting is applied here)
gbboro = dfboro.groupby(["boro", "cuisine_description"]).count()
#Plot bar chart
plt.style.use("ggplot")
fig, ax = plt.subplots()
gbboro.plot(ax=ax,kind="barh",figsize=(20, 40), color="y")
ax.set_title('Cuisine Types by Borough', loc='left', fontsize=20)
ax.set_xlabel("Number of Restaurants",fontsize=14)
ax.set_ylabel('Cuisine Type', fontsize=14)
ax.legend_.remove()
Finally, we plotted 2015 data to determine the relationship between restaurant location (borough) and restaurant grade. Across the entire sample of restaurants, 77% received an A grade, 17% received a B grade, and 4% received a C grade. Our sample excludes restaurants with interim scores that have not yet received grades. All boroughs were very close to this overall average, so we conclude that borough location has little effect on grade. To illustrate, all boroughs were in the range of 76-80% A grades, 15-18% B grades, and 1-5% C grades. Staten Island does skew slightly towards higher grades, meaning fewer violations.
We believe rental price has a significant effect on the decision making process when opening a restaurant, and one's ability to continue to run a business in a location as prices rise or fall. Despite the lack of correlation between grade and location, we do want to briefly touch on the rental market and how this may affect our results upon further investigation.
Restaurants account for just under 50% of retail unit growth currently being tracked throughout the United States. Research by Cushman and Wakefield on retail property rental rates shows that rental rates have risen continuously over the past two years. For example, in SoHo, a prime location for Manhattan restaurants, the asking rental rate was $556 as of Q1 2016, a 7.1% increase from Q1 2015. There is also a direct correlation between rental rate and availability rate: the higher the rent, the more spaces available in that area. For example, the YOY SoHo availability rate increased by 8.6% from Q1 2015 to Q1 2016. We believe that location and rental rates have a significant impact on both the number and type of restaurants (Cushman & Wakefield). Despite the added cost of being centrally located, the largest share of New York City restaurants reviewed by the Health Department is located in Manhattan (39.6%), followed by Queens and Brooklyn (24% each), the Bronx (8.7%), and Staten Island (3.1%).
Another important trend we see in the data is the influx of restaurants in Brooklyn. This aligns with the ‘urban renewal’ taking place in key Brooklyn neighbourhoods, including DUMBO and Williamsburg. With millennials flocking to the area, development has been ramped up, with rent increasing 17.7% over the past two years. This means that many long-standing mom-and-pop businesses have been pushed out to make way for more upscale restaurants and retail stores that can afford the hike in rental costs. The retail vacancy rate has been dropping for the past three years (Cushman & Wakefield).
Although these findings do not uncover a relationship between grade and location, we can infer the popularity and 'gentrification' of certain boroughs from the influx of certain types of restaurants and the rise in rental prices.
In [4]:
#Analysis of Restaurant Grades by Borough
base_url = "https://data.cityofnewyork.us/resource/batu-qkuq.json?"
#
# Add options
#
#Increase the query limit
base_url += "&$limit=50000"
#
# Select which columns we want
base_url += "&$select=camis,boro,grade"
#
# Retrieve data from website using the api
#
dfboro = pd.read_json(base_url)
# Count rows for each borough and grade (no sorting is applied here)
gbboro = dfboro.groupby(["boro", "grade"]).count()
#Plot bar chart
plt.style.use("ggplot")
fig, ax = plt.subplots()
gbboro.plot(ax=ax,kind="bar",figsize=(20, 10), color="b")
ax.set_title('Restaurant Grades by Borough', loc='left', fontsize=25)
ax.set_xlabel('Borough and Grade', fontsize=14)
ax.set_ylabel('Number of Restaurants', fontsize=14)
ax.legend_.remove()
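The grade percentages discussed above are shares within each borough rather than raw counts. One way to get them, sketched here on toy rows, is to keep only the A, B, and C grades and normalize a borough-by-grade cross-tabulation by row. The data values, including the "Z" stand-in for a non-letter grade, are assumptions made for illustration.

```python
import pandas as pd

# Toy rows standing in for the real extract; the values are hypothetical.
df = pd.DataFrame({
    "boro": ["BRONX"] * 5 + ["QUEENS"] * 5,
    # "Z" and None stand in for restaurants without a final letter grade (assumption).
    "grade": ["A", "A", "A", "B", "Z", "A", "B", "C", "A", None],
})

# Keep only restaurants that received a letter grade.
graded = df[df["grade"].isin(["A", "B", "C"])]

# Row-normalized cross-tabulation: each borough's grades sum to 100%.
shares = (
    pd.crosstab(graded["boro"], graded["grade"], normalize="index")
    .mul(100)
    .round(1)
)
print(shares)
```

Normalizing by row makes boroughs with very different restaurant counts directly comparable, which a bar chart of raw counts does not.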
This data analysis explored restaurant cuisine types in New York City and took a deeper dive into location's relationship with cuisine type and grade. We uncovered shifts in cuisine type over the past 3 years that align with national trends in food preferences, specifically the growth in popularity of ethnic/world cuisines. We found a relationship between cuisine type and borough, which we can in part connect with the demographic makeup of each borough. Finally, we looked into health grade (A, B, C) and location, finding that all five boroughs have a similar grade distribution among their restaurants.