In [1]:
!pip install matplotlib


Requirement already satisfied (use --upgrade to upgrade): matplotlib in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.6 in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): pyparsing!=2.0.0,!=2.0.4,>=1.5.6 in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): pytz in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): cycler in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): six>=1.5 in /Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages (from python-dateutil->matplotlib)

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
import dateutil.parser


/Users/mercybenzaquen/.virtualenvs/Homework12/lib/python3.5/site-packages/matplotlib/__init__.py:1035: UserWarning: Duplicate key in file "/Users/mercybenzaquen/.matplotlib/matplotlibrc", line #2
  (fname, cnt))

First, I made a mistake naming the data set! It's 2015 data, not 2014 data. But yes, still use 311-2014.csv. You can rename it.

Importing and preparing your data

Import your data, but only the first 200,000 rows. You'll also want to change the index to be a datetime based on the Created Date column - you'll want to check if it's already a datetime, and parse it if not.


In [3]:
#df = pd.read_csv("small-311-2015.csv")
df = pd.read_csv("311-2014.csv", nrows=200000)

df.head(2)


/usr/local/lib/python3.5/site-packages/IPython/core/interactiveshell.py:2723: DtypeWarning: Columns (8,17,48) have mixed types. Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)
Out[3]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Name Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location
0 31015465 07/06/2015 10:58:27 AM 07/22/2015 01:07:20 AM DCA Department of Consumer Affairs Consumer Complaint Demand for Cash NaN 11360 27-16 203 STREET ... NaN NaN NaN NaN NaN NaN NaN 40.773540 -73.788237 (40.773539552542, -73.78823697228408)
1 30997660 07/03/2015 01:26:29 PM 07/03/2015 02:08:20 PM NYPD New York City Police Department Vending In Prohibited Area Residential Building/House 10019 200 CENTRAL PARK SOUTH ... NaN NaN NaN NaN NaN NaN NaN 40.767021 -73.979448 (40.76702142171206, -73.97944780718524)

2 rows × 53 columns
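The DtypeWarning above says pandas found mixed types in columns 8, 17 and 48. One way to avoid it is to tell `read_csv` which type to use for those columns. Below is a minimal sketch, assuming it is fine to read the offending column as text. The three-row inline CSV is made up for illustration, and `Incident Zip` is used only because it appears to be column 8 when counting from zero.

```python
import io
import pandas as pd

# Made-up three-row sample standing in for 311-2014.csv (illustration only)
sample_csv = io.StringIO(
    "Unique Key,Created Date,Incident Zip\n"
    "1,07/06/2015 10:58:27 AM,11360\n"
    "2,07/03/2015 01:26:29 PM,10019\n"
    "3,11/09/2015 03:55:09 AM,UNKNOWN\n"
)

# Force the mixed column to be read as text so pandas does not have to guess
sample = pd.read_csv(sample_csv, dtype={"Incident Zip": str}, nrows=200000)
print(sample["Incident Zip"].tolist())
```

Passing `low_memory=False`, as the warning suggests, is the other option; it lets pandas look at the whole column before picking a type, at the cost of more memory.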


In [4]:
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200000 entries, 0 to 199999
Data columns (total 53 columns):
Unique Key                        200000 non-null int64
Created Date                      200000 non-null object
Closed Date                       188913 non-null object
Agency                            200000 non-null object
Agency Name                       200000 non-null object
Complaint Type                    200000 non-null object
Descriptor                        198197 non-null object
Location Type                     179328 non-null object
Incident Zip                      181049 non-null object
Incident Address                  152173 non-null object
Street Name                       152152 non-null object
Cross Street 1                    108035 non-null object
Cross Street 2                    107583 non-null object
Intersection Street 1             24790 non-null object
Intersection Street 2             24530 non-null object
Address Type                      177091 non-null object
City                              181095 non-null object
Landmark                          127 non-null object
Facility Type                     80031 non-null object
Status                            199998 non-null object
Due Date                          152018 non-null object
Resolution Description            198936 non-null object
Resolution Action Updated Date    188529 non-null object
Community Board                   200000 non-null object
Borough                           200000 non-null object
X Coordinate (State Plane)        175825 non-null float64
Y Coordinate (State Plane)        175825 non-null float64
Park Facility Name                200000 non-null object
Park Borough                      200000 non-null object
School Name                       200000 non-null object
School Number                     199907 non-null object
School Region                     197128 non-null object
School Code                       197128 non-null object
School Phone Number               200000 non-null object
School Address                    200000 non-null object
School City                       200000 non-null object
School State                      200000 non-null object
School Zip                        199999 non-null object
School Not Found                  151897 non-null object
School or Citywide Complaint      0 non-null float64
Vehicle Type                      34 non-null object
Taxi Company Borough              434 non-null object
Taxi Pick Up Location             3680 non-null object
Bridge Highway Name               1960 non-null object
Bridge Highway Direction          1959 non-null object
Road Ramp                         1946 non-null object
Bridge Highway Segment            2134 non-null object
Garage Lot Name                   143 non-null object
Ferry Direction                   86 non-null object
Ferry Terminal Name               215 non-null object
Latitude                          175825 non-null float64
Longitude                         175825 non-null float64
Location                          175825 non-null object
dtypes: float64(5), int64(1), object(47)
memory usage: 80.9+ MB

In [5]:
def parse_date (str_date):
    return dateutil.parser.parse(str_date)

df['created_dt']= df['Created Date'].apply(parse_date)

df.head(3)


Out[5]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
0 31015465 07/06/2015 10:58:27 AM 07/22/2015 01:07:20 AM DCA Department of Consumer Affairs Consumer Complaint Demand for Cash NaN 11360 27-16 203 STREET ... NaN NaN NaN NaN NaN NaN 40.773540 -73.788237 (40.773539552542, -73.78823697228408) 2015-07-06 10:58:27
1 30997660 07/03/2015 01:26:29 PM 07/03/2015 02:08:20 PM NYPD New York City Police Department Vending In Prohibited Area Residential Building/House 10019 200 CENTRAL PARK SOUTH ... NaN NaN NaN NaN NaN NaN 40.767021 -73.979448 (40.76702142171206, -73.97944780718524) 2015-07-03 13:26:29
2 31950223 11/09/2015 03:55:09 AM 11/09/2015 08:08:57 AM NYPD New York City Police Department Blocked Driveway No Access Street/Sidewalk 10453 1993 GRAND AVENUE ... NaN NaN NaN NaN NaN NaN 40.852671 -73.910608 (40.85267061877697, -73.91060771362552) 2015-11-09 03:55:09

3 rows × 54 columns
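Calling `dateutil.parser.parse` row by row works, but it can be slow on 200,000 rows. A possible alternative is `pd.to_datetime` with an explicit format, guarded by a check that the column is not already a datetime. This is only a sketch on three `Created Date` strings copied from the rows above; the format string is my reading of how those dates are written.

```python
import pandas as pd

# Three Created Date strings copied from the rows shown above
sample = pd.DataFrame({"Created Date": [
    "07/06/2015 10:58:27 AM",
    "07/03/2015 01:26:29 PM",
    "11/09/2015 03:55:09 AM",
]})

# Only parse if the column is not already a datetime
if not pd.api.types.is_datetime64_any_dtype(sample["Created Date"]):
    sample["created_dt"] = pd.to_datetime(
        sample["Created Date"], format="%m/%d/%Y %I:%M:%S %p"
    )
else:
    sample["created_dt"] = sample["Created Date"]

print(sample["created_dt"])
```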


In [6]:
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200000 entries, 0 to 199999
Data columns (total 54 columns):
Unique Key                        200000 non-null int64
Created Date                      200000 non-null object
Closed Date                       188913 non-null object
Agency                            200000 non-null object
Agency Name                       200000 non-null object
Complaint Type                    200000 non-null object
Descriptor                        198197 non-null object
Location Type                     179328 non-null object
Incident Zip                      181049 non-null object
Incident Address                  152173 non-null object
Street Name                       152152 non-null object
Cross Street 1                    108035 non-null object
Cross Street 2                    107583 non-null object
Intersection Street 1             24790 non-null object
Intersection Street 2             24530 non-null object
Address Type                      177091 non-null object
City                              181095 non-null object
Landmark                          127 non-null object
Facility Type                     80031 non-null object
Status                            199998 non-null object
Due Date                          152018 non-null object
Resolution Description            198936 non-null object
Resolution Action Updated Date    188529 non-null object
Community Board                   200000 non-null object
Borough                           200000 non-null object
X Coordinate (State Plane)        175825 non-null float64
Y Coordinate (State Plane)        175825 non-null float64
Park Facility Name                200000 non-null object
Park Borough                      200000 non-null object
School Name                       200000 non-null object
School Number                     199907 non-null object
School Region                     197128 non-null object
School Code                       197128 non-null object
School Phone Number               200000 non-null object
School Address                    200000 non-null object
School City                       200000 non-null object
School State                      200000 non-null object
School Zip                        199999 non-null object
School Not Found                  151897 non-null object
School or Citywide Complaint      0 non-null float64
Vehicle Type                      34 non-null object
Taxi Company Borough              434 non-null object
Taxi Pick Up Location             3680 non-null object
Bridge Highway Name               1960 non-null object
Bridge Highway Direction          1959 non-null object
Road Ramp                         1946 non-null object
Bridge Highway Segment            2134 non-null object
Garage Lot Name                   143 non-null object
Ferry Direction                   86 non-null object
Ferry Terminal Name               215 non-null object
Latitude                          175825 non-null float64
Longitude                         175825 non-null float64
Location                          175825 non-null object
created_dt                        200000 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(5), int64(1), object(47)
memory usage: 82.4+ MB

What was the most popular type of complaint, and how many times was it filed?


In [6]:
df["Complaint Type"].value_counts().head(1)


Out[6]:
Blocked Driveway    21779
Name: Complaint Type, dtype: int64

Make a horizontal bar graph of the top 5 most frequent complaint types.


In [63]:
df["Complaint Type"].value_counts().head(5).sort_values().plot(kind='barh')


Out[63]:
<matplotlib.axes._subplots.AxesSubplot at 0x10f099400>

Which borough has the most complaints per capita? Since it's only 5 boroughs, you can do the math manually.


In [8]:
df["Borough"].value_counts()


Out[8]:
BROOKLYN         57129
QUEENS           46824
MANHATTAN        42050
BRONX            29610
Unspecified      17000
STATEN ISLAND     7387
Name: Borough, dtype: int64

In [9]:
people_bronx= 1438159
people_queens= 2321580
people_manhattan=1636268
people_brooklyn= 2621793
people_staten_island= 473279

In [10]:
complaints_per_capita_bronx= 29610/people_bronx
complaints_per_capita_bronx


Out[10]:
0.020588822237318682

In [11]:
complaints_per_capita_queens=46824/people_queens
complaints_per_capita_queens


Out[11]:
0.020169022820665235

In [12]:
complaints_per_capita_manhattan=42050/people_manhattan
complaints_per_capita_manhattan


Out[12]:
0.025698724169879263

In [13]:
# Use Staten Island's complaint count from Out[8], not its population
complaints_per_capita_staten_island=7387/people_staten_island
round(complaints_per_capita_staten_island, 6)


Out[13]:
0.015608

In [14]:
# Use Brooklyn's complaint count from Out[8], not its population
complaints_per_capita_brooklyn=57129/people_brooklyn
round(complaints_per_capita_brooklyn, 6)


Out[14]:
0.02179
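The per capita math can also be done in one pass. Here is a small sketch in plain Python, using the complaint counts from Out[8] and the population figures from In [9]. The "Unspecified" rows are left out because they have no population to divide by, and the dictionary names are just my own choice.

```python
# Complaint counts from Out[8]
complaints = {
    "BROOKLYN": 57129,
    "QUEENS": 46824,
    "MANHATTAN": 42050,
    "BRONX": 29610,
    "STATEN ISLAND": 7387,
}

# Population figures from In [9]
population = {
    "BROOKLYN": 2621793,
    "QUEENS": 2321580,
    "MANHATTAN": 1636268,
    "BRONX": 1438159,
    "STATEN ISLAND": 473279,
}

# Complaints divided by residents, borough by borough
per_capita = {b: complaints[b] / population[b] for b in complaints}

# Print from highest to lowest rate
for borough, rate in sorted(per_capita.items(), key=lambda kv: kv[1], reverse=True):
    print(borough, round(rate, 4))

top_borough = max(per_capita, key=per_capita.get)
print("Most complaints per capita:", top_borough)
```

With these numbers Manhattan comes out on top, at roughly 0.026 complaints per resident in this 200,000-row sample.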

According to your selection of data, how many cases were filed in March? How about May?


In [15]:
df.index = df['created_dt']
#del df['Created Date']

df.head()


Out[15]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
created_dt
2015-07-06 10:58:27 31015465 07/06/2015 10:58:27 AM 07/22/2015 01:07:20 AM DCA Department of Consumer Affairs Consumer Complaint Demand for Cash NaN 11360 27-16 203 STREET ... NaN NaN NaN NaN NaN NaN 40.773540 -73.788237 (40.773539552542, -73.78823697228408) 2015-07-06 10:58:27
2015-07-03 13:26:29 30997660 07/03/2015 01:26:29 PM 07/03/2015 02:08:20 PM NYPD New York City Police Department Vending In Prohibited Area Residential Building/House 10019 200 CENTRAL PARK SOUTH ... NaN NaN NaN NaN NaN NaN 40.767021 -73.979448 (40.76702142171206, -73.97944780718524) 2015-07-03 13:26:29
2015-11-09 03:55:09 31950223 11/09/2015 03:55:09 AM 11/09/2015 08:08:57 AM NYPD New York City Police Department Blocked Driveway No Access Street/Sidewalk 10453 1993 GRAND AVENUE ... NaN NaN NaN NaN NaN NaN 40.852671 -73.910608 (40.85267061877697, -73.91060771362552) 2015-11-09 03:55:09
2015-07-03 02:18:32 31000038 07/03/2015 02:18:32 AM 07/03/2015 07:54:48 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11372 84-16 NORTHERN BOULEVARD ... NaN NaN NaN NaN NaN NaN 40.755774 -73.883262 (40.755773786469966, -73.88326243225418) 2015-07-03 02:18:32
2015-07-04 00:03:27 30995614 07/04/2015 12:03:27 AM 07/04/2015 03:33:09 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11216 1057 BERGEN STREET ... NaN NaN NaN NaN NaN NaN 40.676175 -73.951269 (40.67617516102934, -73.9512690004692) 2015-07-04 00:03:27

5 rows × 54 columns


In [16]:
print("There were", len(df.loc['2015-03']), "cases filed in March")


There were 15025 cases filed in March

In [17]:
print("There were", len(df.loc['2015-05']), "cases filed in May")


There were 49715 cases filed in May
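Instead of asking for one month at a time, the counts for every month can be pulled out at once by grouping on the month of the datetime index. A rough sketch on five timestamps copied from rows in this notebook; the tiny `sample` frame is only for illustration. Note that grouping on the month number alone ignores the year, which is fine here only because the data is all from 2015.

```python
import pandas as pd

# Five created_dt values copied from rows in this notebook
created = pd.to_datetime([
    "2015-07-06 10:58:27",
    "2015-07-03 13:26:29",
    "2015-11-09 03:55:09",
    "2015-04-01 21:37:42",
    "2015-04-01 23:12:04",
])
sample = pd.DataFrame(
    {"Unique Key": [31015465, 30997660, 31950223, 30311691, 30307701]},
    index=created,
)

# Number of rows per calendar month (1 = January ... 12 = December)
cases_per_month = sample.groupby(sample.index.month).size()
print(cases_per_month)
```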

I'd like to see all of the 311 complaints called in on April 1st.

Surprise! We couldn't do this in class, but it was just a limitation of our data set.


In [18]:
df.loc['2015-04-01']


Out[18]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
created_dt
2015-04-01 21:37:42 30311691 04/01/2015 09:37:42 PM 04/01/2015 10:49:33 PM NYPD New York City Police Department Illegal Parking Blocked Sidewalk Street/Sidewalk 11234 NaN ... NaN NaN NaN NaN NaN NaN 40.609810 -73.922498 (40.60980966645303, -73.92249759633725) 2015-04-01 21:37:42
2015-04-01 23:12:04 30307701 04/01/2015 11:12:04 PM 04/01/2015 11:32:40 PM NYPD New York City Police Department Noise - Commercial Loud Music/Party Store/Commercial 11205 700 MYRTLE AVENUE ... NaN NaN NaN NaN NaN NaN 40.694644 -73.955504 (40.694643700748486, -73.95550356170298) 2015-04-01 23:12:04
2015-04-01 13:10:35 30313389 04/01/2015 01:10:35 PM 04/07/2015 04:01:08 PM DPR Department of Parks and Recreation Root/Sewer/Sidewalk Condition Trees and Sidewalks Program Street 11422 245-16 149 AVENUE ... NaN NaN NaN NaN NaN NaN 40.653016 -73.738626 (40.653016256598534, -73.73862588133056) 2015-04-01 13:10:35
2015-04-01 17:37:38 30314393 04/01/2015 05:37:38 PM 04/03/2015 11:40:54 AM DPR Department of Parks and Recreation Maintenance or Facility Hours of Operation Park 11211 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 17:37:38
2015-04-01 12:32:40 30309207 04/01/2015 12:32:40 PM 04/17/2015 01:06:49 AM DCA Department of Consumer Affairs Consumer Complaint Installation/Work Quality NaN 11423 90-71 198 STREET ... NaN NaN NaN NaN NaN NaN 40.714299 -73.761158 (40.71429859671565, -73.76115807774032) 2015-04-01 12:32:40
2015-04-01 18:44:50 30311759 04/01/2015 06:44:50 PM 06/24/2015 11:27:00 AM DPR Department of Parks and Recreation Damaged Tree Entire Tree Has Fallen Down Street 10467 862 EAST 213 STREET ... NaN NaN NaN NaN NaN NaN 40.878028 -73.860237 (40.87802828144708, -73.86023734606933) 2015-04-01 18:44:50
2015-04-01 16:30:15 30309690 04/01/2015 04:30:15 PM 04/01/2015 11:27:22 PM NYPD New York City Police Department Animal Abuse Neglected Residential Building/House 11368 107-15 NORTHERN BOULEVARD ... NaN NaN NaN NaN NaN NaN 40.757811 -73.861677 (40.757811195752154, -73.86167714731972) 2015-04-01 16:30:15
2015-04-01 09:04:07 30307990 04/01/2015 09:04:07 AM 04/06/2015 09:17:10 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Miscellaneous Senior Address 10027 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 09:04:07
2015-04-01 07:46:58 30308253 04/01/2015 07:46:58 AM 04/01/2015 09:32:31 AM NYPD New York City Police Department Blocked Driveway No Access Street/Sidewalk 11370 32-51 80 STREET ... NaN NaN NaN NaN NaN NaN 40.756412 -73.887405 (40.75641194675221, -73.88740503059863) 2015-04-01 07:46:58
2015-04-01 17:12:17 30314214 04/01/2015 05:12:17 PM 04/09/2015 02:20:11 PM DOT Department of Transportation Highway Condition Pothole - Highway Highway NaN NaN ... West/Manhattan Bound Roadway Clearview Expwy (I-295) (Exit 27 S-N) - Utopia... NaN NaN NaN NaN NaN NaN 2015-04-01 17:12:17
2015-04-01 21:30:48 30307111 04/01/2015 09:30:48 PM NaN DOHMH Department of Health and Mental Hygiene Food Establishment Food Temperature Restaurant/Bar/Deli/Bakery 11215 709 5 AVENUE ... NaN NaN NaN NaN NaN NaN 40.660699 -73.994082 (40.660699296661825, -73.99408169463258) 2015-04-01 21:30:48
2015-04-01 15:51:04 30311571 04/01/2015 03:51:04 PM 04/14/2015 09:23:30 AM DPR Department of Parks and Recreation Maintenance or Facility Hours of Operation Park 11210 NaN ... NaN NaN NaN NaN NaN NaN 40.621474 -73.950711 (40.62147413119333, -73.95071097029123) 2015-04-01 15:51:04
2015-04-01 10:43:28 30313817 04/01/2015 10:43:28 AM NaN DPR Department of Parks and Recreation Damaged Tree Branch Cracked and Will Fall NaN 10009 620 EAST 12TH STREET ... NaN NaN NaN NaN NaN NaN 40.727725 -73.978204 (40.72772462544187, -73.97820435916094) 2015-04-01 10:43:28
2015-04-01 15:12:46 30308922 04/01/2015 03:12:46 PM 06/01/2015 06:25:48 AM DOHMH Department of Health and Mental Hygiene Food Establishment Letter Grading Restaurant/Bar/Deli/Bakery 11238 663 FRANKLIN AVENUE ... NaN NaN NaN NaN NaN NaN 40.675746 -73.956122 (40.67574618440852, -73.9561218336512) 2015-04-01 15:12:46
2015-04-01 06:15:42 30311132 04/01/2015 06:15:42 AM 04/01/2015 10:28:30 AM DOT Department of Transportation Highway Condition Pothole - Highway Highway 10304 NaN ... East/Brooklyn Bound Roadway Clove Rd/Richmond Rd (Exit 13) - Lily Pond Ave... NaN NaN NaN 40.606875 -74.085408 (40.60687536641399, -74.0854077221027) 2015-04-01 06:15:42
2015-04-01 11:28:02 30308180 04/01/2015 11:28:02 AM 04/01/2015 11:42:53 AM DOT Department of Transportation Highway Condition Pothole - Highway Highway 11432 NaN ... West/Toward Triborough Br Ramp 168th St (Exit 17) NaN NaN NaN 40.719228 -73.791963 (40.71922760413319, -73.791962929951) 2015-04-01 11:28:02
2015-04-01 17:35:18 30313207 04/01/2015 05:35:18 PM 06/01/2015 06:25:54 AM DOHMH Department of Health and Mental Hygiene Food Establishment Rodents/Insects/Garbage Restaurant/Bar/Deli/Bakery 10011 140 WEST 13 STREET ... NaN NaN NaN NaN NaN NaN 40.737182 -73.998585 (40.737182358685516, -73.99858548189518) 2015-04-01 17:35:18
2015-04-01 13:54:54 30310017 04/01/2015 01:54:54 PM 04/06/2015 10:11:11 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Miscellaneous Senior Address 11435 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 13:54:54
2015-04-01 23:49:33 30306774 04/01/2015 11:49:33 PM 04/02/2015 12:20:59 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Store/Commercial 10003 36 SAINT MARKS PLACE ... NaN NaN NaN NaN NaN NaN 40.728733 -73.988011 (40.72873338955463, -73.98801059255561) 2015-04-01 23:49:33
2015-04-01 07:50:49 30313339 04/01/2015 07:50:49 AM 07/08/2015 02:19:25 PM DOT Department of Transportation Street Condition Rough, Pitted or Cracked Roads Street 11385 NaN ... NaN NaN NaN NaN NaN NaN 40.703414 -73.862854 (40.70341423569781, -73.86285397616253) 2015-04-01 07:50:49
2015-04-01 13:50:29 30312146 04/01/2015 01:50:29 PM 06/01/2015 06:25:49 AM DOHMH Department of Health and Mental Hygiene Food Establishment Rodents/Insects/Garbage Restaurant/Bar/Deli/Bakery 10028 1291 LEXINGTON AVENUE ... NaN NaN NaN NaN NaN NaN 40.780069 -73.955158 (40.78006850471446, -73.95515761412761) 2015-04-01 13:50:29
2015-04-01 16:14:19 30313259 04/01/2015 04:14:19 PM 04/01/2015 04:21:53 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Medicaid NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 16:14:19
2015-04-01 19:27:34 30308920 04/01/2015 07:27:34 PM 04/01/2015 08:45:17 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10017 210 EAST 46 STREET ... NaN NaN NaN NaN NaN NaN 40.753104 -73.972096 (40.75310402468627, -73.97209629231209) 2015-04-01 19:27:34
2015-04-01 05:30:02 30314164 04/01/2015 05:30:02 AM 04/01/2015 02:57:31 PM DOT Department of Transportation Highway Condition Pothole - Highway Highway NaN NaN ... East/Queens Bound Roadway Williamsburg Br / Metropolitan Ave (Exit 32) -... NaN NaN NaN NaN NaN NaN 2015-04-01 05:30:02
2015-04-01 10:33:26 30311790 04/01/2015 10:33:26 AM 04/01/2015 11:19:12 AM NYPD New York City Police Department Illegal Parking Blocked Sidewalk Street/Sidewalk 10033 2284 AMSTERDAM AVENUE ... NaN NaN NaN NaN NaN NaN 40.843149 -73.934539 (40.84314882753921, -73.93453937669832) 2015-04-01 10:33:26
2015-04-01 11:47:38 30310940 04/01/2015 11:47:38 AM 04/06/2015 09:23:32 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Miscellaneous Senior Address 11355 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 11:47:38
2015-04-01 11:01:27 30310409 04/01/2015 11:01:27 AM 04/17/2015 01:06:42 AM DCA Department of Consumer Affairs Consumer Complaint Exchange/Refund/Return NaN 10455 2997 3 AVENUE ... NaN NaN NaN NaN NaN NaN 40.819111 -73.913908 (40.819110789789214, -73.91390802507868) 2015-04-01 11:01:27
2015-04-01 08:51:52 30310350 04/01/2015 08:51:52 AM 04/03/2015 04:33:46 PM DCA Department of Consumer Affairs Consumer Complaint Cars Parked on Sidewalk/Street NaN 11223 1701 WEST 8 STREET ... NaN NaN NaN NaN NaN NaN 40.605657 -73.981194 (40.60565667868274, -73.98119372058547) 2015-04-01 08:51:52
2015-04-01 14:58:55 30313106 04/01/2015 02:58:55 PM 04/06/2015 10:06:35 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Rent Discrepancy Senior Address 11201 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 14:58:55
2015-04-01 16:59:19 30309324 04/01/2015 04:59:19 PM 04/01/2015 07:48:33 PM NYPD New York City Police Department Blocked Driveway Partial Access Street/Sidewalk 11210 650 EAST 24 STREET ... NaN NaN NaN NaN NaN NaN 40.634497 -73.954167 (40.63449684441219, -73.95416735372353) 2015-04-01 16:59:19
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2015-04-01 17:12:09 30313532 04/01/2015 05:12:09 PM 04/30/2015 06:02:47 PM DOT Department of Transportation Street Condition Line/Marking - Faded Street 11207 NaN ... NaN NaN NaN NaN NaN NaN 40.679561 -73.898899 (40.67956105192572, -73.89889884573184) 2015-04-01 17:12:09
2015-04-01 17:09:29 30311473 04/01/2015 05:09:29 PM 04/30/2015 05:59:38 PM DOT Department of Transportation Street Condition Line/Marking - Faded Street 11203 NaN ... NaN NaN NaN NaN NaN NaN 40.658529 -73.939568 (40.6585289219231, -73.93956820621213) 2015-04-01 17:09:29
2015-04-01 18:30:22 30307427 04/01/2015 06:30:22 PM 05/06/2015 10:59:47 AM DOT Department of Transportation Street Condition Failed Street Repair Street 11234 J AVENUE ... NaN NaN NaN NaN NaN NaN 40.628542 -73.921838 (40.62854243316789, -73.92183818389044) 2015-04-01 18:30:22
2015-04-01 21:07:21 30314301 04/01/2015 09:07:21 PM 05/08/2015 11:30:22 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 10001 511 WEST 25 STREET ... NaN NaN NaN NaN NaN NaN 40.749380 -74.004169 (40.74937996228322, -74.00416853967121) 2015-04-01 21:07:21
2015-04-01 10:50:12 30312508 04/01/2015 10:50:12 AM 05/08/2015 10:21:38 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 10032 NaN ... NaN NaN NaN NaN NaN NaN 40.842154 -73.942278 (40.84215388602991, -73.94227827092928) 2015-04-01 10:50:12
2015-04-01 09:07:38 30310225 04/01/2015 09:07:38 AM 05/04/2015 10:43:15 AM DPR Department of Parks and Recreation Root/Sewer/Sidewalk Condition Trees and Sidewalks Program Street 10307 647 CRAIG AVENUE ... NaN NaN NaN NaN NaN NaN 40.506708 -74.252182 (40.50670803830861, -74.25218246259357) 2015-04-01 09:07:38
2015-04-01 16:18:25 30313554 04/01/2015 04:18:25 PM 05/08/2015 11:29:12 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 11369 22-19 93 STREET ... NaN NaN NaN NaN NaN NaN 40.769574 -73.877480 (40.769573850244676, -73.8774799367093) 2015-04-01 16:18:25
2015-04-01 10:23:09 30313061 04/01/2015 10:23:09 AM 05/07/2015 02:19:57 PM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint Street 10021 NaN ... NaN NaN NaN NaN NaN NaN 40.765546 -73.954702 (40.765545913197165, -73.95470170187454) 2015-04-01 10:23:09
2015-04-01 14:31:57 30312110 04/01/2015 02:31:57 PM 05/08/2015 06:05:44 PM DPR Department of Parks and Recreation Dead Tree Dead/Dying Tree Street 11229 2056 EAST 29 STREET ... NaN NaN NaN NaN NaN NaN 40.601403 -73.943106 (40.60140342407911, -73.94310580244269) 2015-04-01 14:31:57
2015-04-01 18:50:19 30307758 04/01/2015 06:50:19 PM 05/07/2015 07:46:48 AM DPR Department of Parks and Recreation Damaged Tree Branch Cracked and Will Fall Street NaN NaN ... NaN NaN NaN NaN NaN NaN 40.720103 -73.790376 (40.72010305201917, -73.79037648278602) 2015-04-01 18:50:19
2015-04-01 14:03:43 30313462 04/01/2015 02:03:43 PM 05/06/2015 12:48:52 PM DOT Department of Transportation Street Condition Blocked - Construction Street 11209 NaN ... NaN NaN NaN NaN NaN NaN 40.633428 -74.032876 (40.63342806685948, -74.03287604669814) 2015-04-01 14:03:43
2015-04-01 11:59:20 30310246 04/01/2015 11:59:20 AM 11/09/2015 03:58:34 PM DOT Department of Transportation Street Condition Rough, Pitted or Cracked Roads Street 11217 90 PROSPECT PLACE ... NaN NaN NaN NaN NaN NaN 40.679040 -73.974579 (40.67903998236064, -73.97457889877462) 2015-04-01 11:59:20
2015-04-01 09:17:40 30310085 04/01/2015 09:17:40 AM 05/07/2015 06:53:11 PM DOT Department of Transportation Highway Condition Graffiti - Highway Highway NaN NaN ... West/Staten Island Bound Roadway Crospey Ave Stillwell Ave (Exit 6N) - Crospey ... NaN NaN NaN NaN NaN NaN 2015-04-01 09:17:40
2015-04-01 21:13:08 30314474 04/01/2015 09:13:08 PM 05/08/2015 11:27:01 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 11429 214-16 110 AVENUE ... NaN NaN NaN NaN NaN NaN 40.708131 -73.743041 (40.70813050331176, -73.74304104617282) 2015-04-01 21:13:08
2015-04-01 12:59:08 30308968 04/01/2015 12:59:08 PM 04/01/2015 12:59:23 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Medicaid Address Outside of NYC NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 12:59:08
2015-04-01 13:31:23 30308389 04/01/2015 01:31:23 PM 04/01/2015 01:32:08 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Medicaid NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 13:31:23
2015-04-01 16:42:15 30309377 04/01/2015 04:42:15 PM 04/01/2015 04:43:11 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Food Stamp NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 16:42:15
2015-04-01 13:37:07 30310992 04/01/2015 01:37:07 PM 04/01/2015 01:37:28 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Food Stamp NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 13:37:07
2015-04-01 23:44:04 30310652 04/01/2015 11:44:04 PM 04/02/2015 01:25:52 AM NYPD New York City Police Department Derelict Vehicle With License Plate Street/Sidewalk 11421 85-86 87 STREET ... NaN NaN NaN NaN NaN NaN 40.694888 -73.857927 (40.69488849346232, -73.85792744070989) 2015-04-01 23:44:04
2015-04-01 16:32:12 30309028 04/01/2015 04:32:12 PM 05/20/2015 05:36:29 PM TLC Taxi and Limousine Commission For Hire Vehicle Complaint Car Service Company Complaint Street 10451 215 EAST 161 STREET ... NaN NaN NaN NaN NaN NaN 40.826235 -73.920529 (40.8262353417949, -73.92052920426786) 2015-04-01 16:32:12
2015-04-01 08:26:06 30312622 04/01/2015 08:26:06 AM 06/01/2015 06:25:41 AM DOHMH Department of Health and Mental Hygiene Food Establishment Facility Construction Restaurant/Bar/Deli/Bakery 11234 2301 FLATBUSH AVENUE ... NaN NaN NaN NaN NaN NaN 40.613889 -73.927186 (40.61388875283825, -73.92718600732812) 2015-04-01 08:26:06
2015-04-01 15:08:20 30308371 04/01/2015 03:08:20 PM 06/01/2015 06:18:02 PM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint Street 10128 NaN ... NaN NaN NaN NaN NaN NaN 40.781533 -73.958320 (40.78153263581957, -73.9583197488706) 2015-04-01 15:08:20
2015-04-01 10:19:21 30311001 04/01/2015 10:19:21 AM 06/01/2015 06:25:39 AM DOHMH Department of Health and Mental Hygiene Food Establishment Rodents/Insects/Garbage Restaurant/Bar/Deli/Bakery 11377 59-21 ROOSEVELT AVENUE ... NaN NaN NaN NaN NaN NaN 40.745586 -73.904573 (40.74558568959288, -73.90457292624892) 2015-04-01 10:19:21
2015-04-01 20:20:13 30311341 04/01/2015 08:20:13 PM 04/01/2015 10:49:32 PM NYPD New York City Police Department Blocked Driveway Partial Access Street/Sidewalk 11691 348 BEACH 40 STREET ... NaN NaN NaN NaN NaN NaN 40.595019 -73.772153 (40.5950185756628, -73.77215306630436) 2015-04-01 20:20:13
2015-04-01 02:16:44 30308863 04/01/2015 02:16:44 AM 04/01/2015 02:54:17 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10013 301 CHURCH STREET ... NaN NaN NaN NaN NaN NaN 40.719322 -74.004470 (40.71932215308254, -74.00446968948569) 2015-04-01 02:16:44
2015-04-01 13:12:58 30307673 04/01/2015 01:12:58 PM 04/01/2015 10:01:26 PM NYPD New York City Police Department Illegal Parking Posted Parking Sign Violation Street/Sidewalk 10306 200 ADELAIDE AVENUE ... NaN NaN NaN NaN NaN NaN 40.561690 -74.124622 (40.5616902523158, -74.12462211525013) 2015-04-01 13:12:58
2015-04-01 13:17:23 30307732 04/01/2015 01:17:23 PM 04/01/2015 01:31:22 PM NYPD New York City Police Department Traffic Congestion/Gridlock Street/Sidewalk 10013 NaN ... NaN NaN NaN NaN NaN NaN 40.720557 -74.003510 (40.72055732795014, -74.00351016018516) 2015-04-01 13:17:23
2015-04-01 21:39:04 30311958 04/01/2015 09:39:04 PM 04/01/2015 09:50:48 PM NYPD New York City Police Department Noise - Vehicle Car/Truck Music Street/Sidewalk 11207 184 JEROME STREET ... NaN NaN NaN NaN NaN NaN 40.677739 -73.887888 (40.677739297670584, -73.8878875660618) 2015-04-01 21:39:04
2015-04-01 12:53:45 30309365 04/01/2015 12:53:45 PM 04/02/2015 12:04:38 PM DCA Department of Consumer Affairs Consumer Complaint Overcharge NaN 11418 NaN ... NaN NaN NaN NaN NaN NaN 40.700108 -73.832667 (40.70010803283339, -73.83266746664873) 2015-04-01 12:53:45
2015-04-01 10:46:01 30312487 04/01/2015 10:46:01 AM 04/02/2015 03:34:31 PM DCA Department of Consumer Affairs Consumer Complaint Damaged/Defective Goods NaN 11232 807 42 STREET ... NaN NaN NaN NaN NaN NaN 40.645348 -73.998616 (40.64534787518196, -73.99861625677346) 2015-04-01 10:46:01

573 rows × 54 columns

What was the most popular type of complaint on April 1st?

What were the three most popular types of complaint on April 1st?


In [19]:
df['2015-04-01']['Complaint Type'].value_counts().head(3)


Out[19]:
Illegal Parking     67
Street Condition    64
Blocked Driveway    58
Name: Complaint Type, dtype: int64
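
A minimal sketch of the same day-selection idea on toy data (the rows below are made up for illustration). It assumes a DatetimeIndex, as in this notebook, and uses `.loc` for the partial-date lookup, which to my knowledge is the form that keeps working in newer pandas releases where `df['2015-04-01']` no longer selects rows.

```python
import pandas as pd

# Toy stand-in for the 311 data: a DatetimeIndex plus a 'Complaint Type' column.
toy = pd.DataFrame(
    {"Complaint Type": ["Illegal Parking", "Rodent", "Illegal Parking", "Noise - Vehicle"]},
    index=pd.to_datetime([
        "2015-04-01 08:20:13",
        "2015-04-01 13:12:58",
        "2015-04-01 21:39:04",
        "2015-04-02 09:00:00",
    ]),
)

# Partial-string indexing: the date string matches every timestamp on that day.
april_first = toy.loc["2015-04-01"]

# value_counts() sorts from most to least frequent, so the first entry is the top type.
top_types = april_first["Complaint Type"].value_counts()
most_popular = top_types.index[0]
print(top_types.head(3))
```

`top_types.head(1)` answers the first question and `top_types.head(3)` the second.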

In [20]:
df.info()


<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 200000 entries, 2015-07-06 10:58:27 to 2015-06-09 12:48:25
Data columns (total 54 columns):
Unique Key                        200000 non-null int64
Created Date                      200000 non-null object
Closed Date                       188913 non-null object
Agency                            200000 non-null object
Agency Name                       200000 non-null object
Complaint Type                    200000 non-null object
Descriptor                        198197 non-null object
Location Type                     179328 non-null object
Incident Zip                      181049 non-null object
Incident Address                  152173 non-null object
Street Name                       152152 non-null object
Cross Street 1                    108035 non-null object
Cross Street 2                    107583 non-null object
Intersection Street 1             24790 non-null object
Intersection Street 2             24530 non-null object
Address Type                      177091 non-null object
City                              181095 non-null object
Landmark                          127 non-null object
Facility Type                     80031 non-null object
Status                            199998 non-null object
Due Date                          152018 non-null object
Resolution Description            198936 non-null object
Resolution Action Updated Date    188529 non-null object
Community Board                   200000 non-null object
Borough                           200000 non-null object
X Coordinate (State Plane)        175825 non-null float64
Y Coordinate (State Plane)        175825 non-null float64
Park Facility Name                200000 non-null object
Park Borough                      200000 non-null object
School Name                       200000 non-null object
School Number                     199907 non-null object
School Region                     197128 non-null object
School Code                       197128 non-null object
School Phone Number               200000 non-null object
School Address                    200000 non-null object
School City                       200000 non-null object
School State                      200000 non-null object
School Zip                        199999 non-null object
School Not Found                  151897 non-null object
School or Citywide Complaint      0 non-null float64
Vehicle Type                      34 non-null object
Taxi Company Borough              434 non-null object
Taxi Pick Up Location             3680 non-null object
Bridge Highway Name               1960 non-null object
Bridge Highway Direction          1959 non-null object
Road Ramp                         1946 non-null object
Bridge Highway Segment            2134 non-null object
Garage Lot Name                   143 non-null object
Ferry Direction                   86 non-null object
Ferry Terminal Name               215 non-null object
Latitude                          175825 non-null float64
Longitude                         175825 non-null float64
Location                          175825 non-null object
created_dt                        200000 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(5), int64(1), object(47)
memory usage: 83.9+ MB

What month has the most reports filed? How many? Graph it.


In [21]:
df.resample('M').count()


Out[21]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
created_dt
2015-01-31 7091 7091 6583 7091 7091 7091 7051 6547 6418 5308 ... 76 75 75 7 2 8 6181 6181 6181 7091
2015-02-28 8141 8141 7631 8141 8141 8141 8100 7508 7515 6097 ... 121 121 121 18 4 17 7274 7274 7274 8141
2015-03-31 15025 15025 14305 15025 15025 15025 14931 13742 13833 10775 ... 704 702 702 20 10 22 13444 13444 13444 15025
2015-04-30 20087 20087 19131 20087 20087 20087 19921 17250 17292 13809 ... 311 307 346 15 9 18 16692 16692 16692 20087
2015-05-31 49715 49715 47090 49715 49715 49715 49287 42564 42611 36206 ... 303 301 393 33 17 45 41381 41381 41381 49715
2015-06-30 14459 14459 13416 14459 14459 14459 14341 12274 12474 10460 ... 83 81 99 16 5 18 12067 12067 12067 14459
2015-07-31 15047 15047 13908 15047 15047 15047 14789 14121 14395 11430 ... 75 74 74 13 11 26 13864 13864 13864 15047
2015-08-31 12204 12204 11408 12204 12204 12204 12022 11266 11753 9556 ... 53 52 52 12 12 18 11336 11336 11336 12204
2015-09-30 13679 13679 12911 13679 13679 13679 13492 12790 13024 10769 ... 78 78 85 3 4 10 12551 12551 12551 13679
2015-10-31 24700 24700 23658 24700 24700 24700 24551 23061 23361 21244 ... 88 88 103 2 8 20 23007 23007 23007 24700
2015-11-30 16476 16476 15736 16476 16476 16476 16344 15242 15279 13740 ... 60 60 68 3 4 12 14999 14999 14999 16476
2015-12-31 3373 3373 3134 3373 3373 3373 3365 2960 3091 2776 ... 7 7 16 1 0 1 3026 3026 3026 3373
2016-01-31 3 3 2 3 3 3 3 3 3 3 ... 0 0 0 0 0 0 3 3 3 3

13 rows × 54 columns


In [22]:
df.resample('M').count().index[0]


Out[22]:
Timestamp('2015-01-31 00:00:00', offset='M')

In [23]:
import numpy as np
np.__version__


Out[23]:
'1.11.1'

In [24]:
df.resample('M').count().plot(y="Unique Key")


Out[24]:
<matplotlib.axes._subplots.AxesSubplot at 0x10a9c3e48>

In [25]:
ax= df.groupby(df.index.month).count().plot(y='Unique Key', legend=False)
ax.set_xticks([1,2,3,4,5,6,7,8,9,10,11, 12])
ax.set_xticklabels(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
ax.set_ylabel("Number of Complaints")
ax.set_title("311 complaints in 2015")


Out[25]:
<matplotlib.text.Text at 0x10b14b748>
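
The graph shows the shape, but the question also asks which month and how many. A small sketch on made-up timestamps: `.size()` gives one count per group (instead of a count for each of the 54 columns), and `idxmax()` / `max()` pull out the answer. The toy dates are assumptions, not rows from the real file.

```python
import pandas as pd

# Made-up timestamps standing in for the created_dt index.
toy = pd.DataFrame(
    {"Unique Key": range(6)},
    index=pd.to_datetime([
        "2015-01-10", "2015-05-01", "2015-05-02",
        "2015-05-20", "2015-10-03", "2015-10-28",
    ]),
)

# One number per calendar month (1 = January ... 12 = December).
per_month = toy.groupby(toy.index.month).size()

busiest_month = per_month.idxmax()   # month number with the most rows
busiest_count = per_month.max()      # how many rows that month has
print(busiest_month, busiest_count)
```

On the real data the monthly table above points to May, with 49715 reports.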

What week of the year has the most reports filed? How many? Graph the weekly complaints.


In [64]:
#df.resample('W').count().head(5)
df.resample('W').count().plot(y="Unique Key", color= "purple")


Out[64]:
<matplotlib.axes._subplots.AxesSubplot at 0x10f08aa58>
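
To name the busiest week rather than read it off the plot, the same `idxmax()` idea works on the weekly resample. A sketch with made-up dates; `resample('W')` labels each bin by the Sunday that ends the week.

```python
import pandas as pd

# Made-up timestamps: three in the week of May 4-10, one in the following week.
toy = pd.DataFrame(
    {"Unique Key": range(4)},
    index=pd.to_datetime(["2015-05-04", "2015-05-05", "2015-05-06", "2015-05-12"]),
)

# One count per week; the index holds the week-ending Sunday.
weekly = toy.resample("W").size()

busiest_week = weekly.idxmax()
busiest_week_count = weekly.max()
print(busiest_week.date(), busiest_week_count)
```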

Noise complaints are a big deal. Use .str.contains to select noise complaints, and make a chart of when they show up annually. Then make a chart about when they show up every day (cyclic).


In [27]:
df[df['Complaint Type'].str.contains("Noise")].head()


Out[27]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
created_dt
2015-07-03 02:18:32 31000038 07/03/2015 02:18:32 AM 07/03/2015 07:54:48 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 11372 84-16 NORTHERN BOULEVARD ... NaN NaN NaN NaN NaN NaN 40.755774 -73.883262 (40.755773786469966, -73.88326243225418) 2015-07-03 02:18:32
2015-07-04 00:03:27 30995614 07/04/2015 12:03:27 AM 07/04/2015 03:33:09 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11216 1057 BERGEN STREET ... NaN NaN NaN NaN NaN NaN 40.676175 -73.951269 (40.67617516102934, -73.9512690004692) 2015-07-04 00:03:27
2015-09-09 21:59:03 31492526 09/09/2015 09:59:03 PM 09/09/2015 11:17:39 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11238 238 SAINT JAMES PLACE ... NaN NaN NaN NaN NaN NaN 40.683308 -73.963775 (40.68330795503152, -73.96377504548408) 2015-09-09 21:59:03
2015-04-28 18:26:58 30502370 04/28/2015 06:26:58 PM 04/28/2015 07:29:34 PM NYPD New York City Police Department Noise - Commercial Car/Truck Music Store/Commercial 10035 1911 MADISON AVENUE ... NaN NaN NaN NaN NaN NaN 40.804617 -73.941505 (40.80461674564084, -73.9415053197214) 2015-04-28 18:26:58
2015-05-21 19:01:52 30668699 05/21/2015 07:01:52 PM 05/21/2015 09:56:29 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 10026 8 WEST 111 STREET ... NaN NaN NaN NaN NaN NaN 40.797731 -73.949399 (40.79773121644539, -73.94939942634502) 2015-05-21 19:01:52

5 rows × 54 columns


In [28]:
noise_df= df[df['Complaint Type'].str.contains("Noise")]

In [29]:
noise_graph= noise_df.groupby(noise_df.index.month).count().plot(y='Unique Key', legend=False)
noise_graph.set_xticks([1,2,3,4,5,6,7,8,9,10,11, 12])
noise_graph.set_xticklabels(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
noise_graph.set_ylabel("Number of Noise Complaints")
noise_graph.set_title("311 noise complaints in 2015")


Out[29]:
<matplotlib.text.Text at 0x10bab60b8>

In [65]:
noise_df.groupby(by=noise_df.index.hour)['Unique Key'].count().plot()


Out[65]:
<matplotlib.axes._subplots.AxesSubplot at 0x10f3925f8>

In [30]:
noise_graph= noise_df.groupby(noise_df.index.dayofweek).count().plot(y='Unique Key', legend=False)
# dayofweek runs from 0 (Monday) to 6 (Sunday), so the ticks must start at 0
noise_graph.set_xticks([0,1,2,3,4,5,6])
noise_graph.set_xticklabels(['Mon', 'Tues', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun'])
noise_graph.set_ylabel("Number of Noise Complaints")
noise_graph.set_title("311 noise complaints by day of the week in 2015")


Out[30]:
<matplotlib.text.Text at 0x10c04b6d8>
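
Since `dayofweek` numbers the days 0 (Monday) through 6 (Sunday), one way to avoid lining up ticks and labels by hand is to put the day names into the index before plotting. A sketch on made-up noise complaints; `DAY_NAMES` is a helper name introduced here for illustration.

```python
import pandas as pd

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Made-up noise complaints: one on a Monday, two on a Saturday, one on a Sunday.
noise = pd.DataFrame(
    {"Complaint Type": ["Noise - Commercial"] * 4},
    index=pd.to_datetime([
        "2015-05-04 23:10:00",
        "2015-05-09 01:30:00",
        "2015-05-09 22:45:00",
        "2015-05-10 00:15:00",
    ]),
)

# Count per weekday number, then make sure all seven days are present (0 if empty).
by_day = noise.groupby(noise.index.dayofweek).size().reindex(range(7), fill_value=0)

# Swap the numbers 0..6 for readable names; position 0 is Monday.
by_day.index = DAY_NAMES
print(by_day)
```

Calling `by_day.plot(kind='bar')` then labels each bar with its day name.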

Which were the top five days of the year for filing complaints? How many on each of those days? Graph it.


In [31]:
daily_count= df['Unique Key'].resample('D').count().sort_values(ascending=False)
top_5_days= daily_count.head(5)
top_5_days


Out[31]:
created_dt
2015-10-28    2697
2015-11-09    2529
2015-05-04    2465
2015-05-11    2293
2015-10-29    2258
Name: Unique Key, dtype: int64

In [32]:
ax = top_5_days.plot(kind='bar') # I don't know how to set custom names for the tick labels
ax.set_title("Top 5 days")
ax.set_xlabel("Day")
ax.set_ylabel("Complaints")


Out[32]:
<matplotlib.text.Text at 0x10c0991d0>
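
One possible way to control the bar labels is to turn the datetime index into plain strings before plotting, so the axis shows dates without the `00:00:00` part. A sketch using the first three values from Out[31]; the format string is just one choice.

```python
import pandas as pd

# First three of the top days reported in Out[31] above.
top_days = pd.Series(
    [2697, 2529, 2465],
    index=pd.to_datetime(["2015-10-28", "2015-11-09", "2015-05-04"]),
    name="Unique Key",
)

# strftime turns each timestamp into a string; the bar chart uses these as labels.
labeled = top_days.copy()
labeled.index = labeled.index.strftime("%Y-%m-%d")
print(labeled)
```

`labeled.plot(kind='bar')` should then show one bar per date string.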

What hour of the day are the most complaints? Graph a day of complaints.


In [33]:
hour_graph= df.groupby(df.index.hour).count().plot(y='Unique Key', legend=False)
hour_graph.set_xticks([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23])
hour_graph.set_title("A day of complaints")
hour_graph.set_xlabel("Hours")
hour_graph.set_ylabel("Complaints")


Out[33]:
<matplotlib.text.Text at 0x10b57f1d0>

One of the hours has an odd number of complaints. What are the most common complaints at that hour, and what are the most common complaints the hour before and after?


In [34]:
twelve_am_complaints= df[df.index.hour == 0]

In [35]:
twelve_am_complaints.head()


Out[35]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
created_dt
2015-07-04 00:03:27 30995614 07/04/2015 12:03:27 AM 07/04/2015 03:33:09 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Talking Street/Sidewalk 11216 1057 BERGEN STREET ... NaN NaN NaN NaN NaN NaN 40.676175 -73.951269 (40.67617516102934, -73.9512690004692) 2015-07-04 00:03:27
2015-07-09 00:00:00 31042454 07/09/2015 12:00:00 AM 07/20/2015 12:00:00 AM DOHMH Department of Health and Mental Hygiene Standing Water Other - Explain Below Other NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-07-09 00:00:00
2015-07-09 00:00:00 31037751 07/09/2015 12:00:00 AM NaN DOHMH Department of Health and Mental Hygiene Standing Water Puddle in Ground 3+ Family Apartment Building 10016 379 THIRD AVENUE ... NaN NaN NaN NaN NaN NaN 40.741537 -73.981163 (40.741536747969185, -73.98116258383294) 2015-07-09 00:00:00
2015-06-29 00:26:39 30956584 06/29/2015 12:26:39 AM 06/29/2015 04:27:24 AM NYPD New York City Police Department Noise - Park Loud Talking Park/Playground 11377 NaN ... NaN NaN NaN NaN NaN NaN 40.741280 -73.902565 (40.741280237793646, -73.90256544457489) 2015-06-29 00:26:39
2015-07-02 00:09:59 30986795 07/02/2015 12:09:59 AM 07/02/2015 12:37:47 AM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10035 20 PALADINO AVENUE ... NaN NaN NaN NaN NaN NaN 40.800365 -73.931212 (40.80036497064086, -73.9312115560449) 2015-07-02 00:09:59

5 rows × 54 columns


In [36]:
twelve_am_complaints['Complaint Type'].value_counts().head(5)


Out[36]:
HEAT/HOT WATER          4534
Rodent                  2112
PAINT/PLASTER           1946
UNSANITARY CONDITION    1820
PLUMBING                1502
Name: Complaint Type, dtype: int64

In [37]:
one_am_complaints= df[df.index.hour == 1]

In [38]:
one_am_complaints['Complaint Type'].value_counts().head(5)


Out[38]:
Noise - Commercial         1025
Noise - Street/Sidewalk     897
Blocked Driveway            479
Illegal Parking             400
Noise - Vehicle             249
Name: Complaint Type, dtype: int64

In [39]:
eleven_pm_complaints= df[df.index.hour == 23]
eleven_pm_complaints['Complaint Type'].value_counts().head(5)


Out[39]:
Noise - Street/Sidewalk    1599
Noise - Commercial         1503
Blocked Driveway            973
Illegal Parking             882
Noise - Vehicle             478
Name: Complaint Type, dtype: int64

So odd. What's the per-minute breakdown of complaints between 12am and 1am? You don't need to include 1am.


In [40]:
twelve_am_complaints.groupby(twelve_am_complaints.index.minute).count()


Out[40]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_dt
0 17116 17116 16721 17116 17116 17116 17116 17098 17098 16983 ... 0 0 0 0 0 0 17093 17093 17093 17116
1 109 109 108 109 109 109 109 105 103 90 ... 0 0 0 0 0 0 102 102 102 109
2 91 91 88 91 91 91 90 81 88 72 ... 0 0 1 0 0 0 87 87 87 91
3 99 99 97 99 99 99 99 94 96 83 ... 1 1 1 0 0 0 95 95 95 99
4 106 106 103 106 106 106 105 101 103 93 ... 1 1 1 0 0 0 103 103 103 106
5 94 94 91 94 94 94 93 89 90 80 ... 0 0 0 1 0 0 88 88 88 94
6 106 106 103 106 106 106 105 101 101 90 ... 2 2 2 0 0 0 101 101 101 106
7 106 106 103 106 106 106 106 101 102 90 ... 1 1 2 0 0 0 102 102 102 106
8 95 95 94 95 95 95 95 92 92 80 ... 1 1 1 0 0 0 90 90 90 95
9 82 82 80 82 82 82 81 80 78 71 ... 1 1 1 0 0 0 77 77 77 82
10 89 89 87 89 89 89 89 87 87 78 ... 0 0 0 0 0 0 84 84 84 89
11 101 101 99 101 101 101 100 92 99 84 ... 0 0 1 0 0 0 99 99 99 101
12 100 100 99 100 100 100 97 96 96 81 ... 0 0 0 0 0 0 95 95 95 100
13 100 100 98 100 100 100 99 93 94 83 ... 1 1 1 0 0 0 93 93 93 100
14 88 88 86 88 88 88 88 85 84 74 ... 0 0 0 0 0 0 84 84 84 88
15 100 100 100 100 100 100 99 96 97 83 ... 0 0 0 0 1 1 96 96 96 100
16 83 83 82 83 83 83 83 78 77 63 ... 0 0 0 0 0 0 76 76 76 83
17 93 93 91 93 93 93 92 91 89 81 ... 0 0 0 0 0 0 88 88 88 93
18 91 91 90 91 91 91 91 85 89 77 ... 0 0 0 0 0 0 89 89 89 91
19 93 93 91 93 93 93 92 88 88 76 ... 0 0 0 0 0 0 88 88 88 93
20 100 100 98 100 100 100 99 93 99 85 ... 0 0 0 0 0 0 99 99 99 100
21 109 109 108 109 109 109 109 101 106 96 ... 0 0 0 0 0 0 103 103 103 109
22 104 104 99 104 104 104 104 98 103 91 ... 0 0 1 0 0 0 101 101 101 104
23 92 92 87 92 92 92 92 91 86 80 ... 0 0 0 0 0 0 86 86 86 92
24 79 79 76 79 79 79 78 75 75 67 ... 0 0 0 0 0 0 75 75 75 79
25 100 100 96 100 100 100 100 96 96 91 ... 0 0 0 0 0 0 96 96 96 100
26 84 84 79 84 84 84 83 83 81 72 ... 0 0 0 0 0 0 81 81 81 84
27 94 94 91 94 94 94 94 90 88 81 ... 1 1 2 0 0 0 87 87 87 94
28 94 94 90 94 94 94 94 87 93 83 ... 1 1 1 0 0 0 93 93 93 94
29 87 87 86 87 87 87 86 83 82 65 ... 0 0 0 0 1 1 82 82 82 87
30 89 89 87 89 89 89 88 86 87 73 ... 0 0 0 0 0 0 86 86 86 89
31 89 89 87 89 89 89 89 85 86 79 ... 0 0 0 0 0 0 86 86 86 89
32 87 87 84 87 87 87 87 86 84 74 ... 1 1 1 0 0 0 82 82 82 87
33 98 98 94 98 98 98 94 95 91 79 ... 0 0 0 0 0 0 91 91 91 98
34 77 77 75 77 77 77 77 74 76 69 ... 0 0 0 0 0 0 73 73 73 77
35 89 89 85 89 89 89 88 85 84 75 ... 1 1 1 0 0 0 84 84 84 89
36 78 78 73 78 78 78 78 74 72 66 ... 1 1 1 0 0 0 72 72 72 78
37 84 84 82 84 84 84 82 82 81 69 ... 0 0 0 0 0 0 81 81 81 84
38 91 91 91 91 91 91 91 88 88 82 ... 1 1 1 0 0 1 88 88 88 91
39 106 106 100 106 106 106 106 102 99 88 ... 0 0 0 0 0 0 98 98 98 106
40 108 108 105 108 108 108 108 101 102 96 ... 0 0 0 0 0 0 101 101 101 108
41 95 95 92 95 95 95 95 88 88 78 ... 0 0 0 0 0 0 88 88 88 95
42 72 72 68 72 72 72 72 70 65 63 ... 0 0 0 0 0 0 65 65 65 72
43 89 89 89 89 89 89 87 87 89 81 ... 0 0 0 0 0 0 88 88 88 89
44 97 97 93 97 97 97 97 91 94 78 ... 0 0 0 0 0 0 93 93 93 97
45 81 81 78 81 81 81 80 77 78 67 ... 0 0 0 0 0 0 78 78 78 81
46 93 93 88 93 93 93 93 84 85 75 ... 0 0 0 0 0 0 85 85 85 93
47 76 76 74 76 76 76 76 74 74 64 ... 1 1 1 0 0 0 74 74 74 76
48 82 82 79 82 82 82 82 76 76 66 ... 1 1 2 0 0 0 74 74 74 82
49 70 70 68 70 70 70 69 68 70 60 ... 1 1 1 0 0 0 70 70 70 70
50 112 112 108 112 112 112 111 108 106 96 ... 2 2 2 0 0 0 106 106 106 112
51 75 75 71 75 75 75 75 73 71 61 ... 1 1 1 0 0 0 71 71 71 75
52 85 85 83 85 85 85 85 81 83 74 ... 1 1 1 0 0 1 82 82 82 85
53 83 83 79 83 83 83 83 81 81 70 ... 1 1 1 0 0 0 81 81 81 83
54 70 70 66 70 70 70 70 69 66 61 ... 0 0 0 0 0 0 66 66 66 70
55 60 60 59 60 60 60 60 57 56 47 ... 0 0 0 0 0 0 56 56 56 60
56 89 89 87 89 89 89 89 83 85 75 ... 0 0 0 0 0 0 85 85 85 89
57 76 76 75 76 76 76 76 74 76 65 ... 0 0 0 0 0 0 75 75 75 76
58 61 61 56 61 61 61 61 54 55 50 ... 1 1 1 0 0 0 55 55 55 61
59 80 80 77 80 80 80 80 77 76 66 ... 0 0 0 0 0 0 76 76 76 80

60 rows × 54 columns
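
The table above repeats the count for every column, but only one number per minute is needed: minute 0 has 17116 rows while every other minute has between 60 and 112. A sketch of a more compact version on made-up data, using `.size()` and also computing the share of rows stamped at minute 0.

```python
import pandas as pd

# Made-up rows: three stamped exactly at midnight, one at 00:03, one at 01:09.
toy = pd.DataFrame(
    {"Agency": ["HPD", "HPD", "DOHMH", "NYPD", "NYPD"]},
    index=pd.to_datetime([
        "2015-07-09 00:00:00",
        "2015-07-10 00:00:00",
        "2015-07-09 00:00:00",
        "2015-07-04 00:03:27",
        "2015-07-02 01:09:59",
    ]),
)

# Keep only the 12am hour, then count rows per minute (a single column of counts).
midnight_hour = toy[toy.index.hour == 0]
per_minute = midnight_hour.groupby(midnight_hour.index.minute).size()

# Fraction of the 12am rows that fall on minute 0.
share_at_minute_zero = per_minute.loc[0] / per_minute.sum()
print(per_minute)
print(share_at_minute_zero)
```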

Looks like midnight is a little bit of an outlier. Why might that be? Take the 5 most common agencies and graph the times they file reports at (all day, not just midnight).


In [41]:
df['Agency'].value_counts().head(5)


Out[41]:
NYPD     80000
HPD      39388
DOT      22308
DPR      15505
DOHMH     8250
Name: Agency, dtype: int64

In [42]:
df_NYPD = df[df['Agency'] == 'NYPD']

In [43]:
df_HPD = df[df['Agency'] == 'HPD']

In [44]:
df_DOT = df[df['Agency'] == 'DOT']

In [45]:
df_DPR= df[df['Agency'] == 'DPR']

In [46]:
df_DOHMH= df[df['Agency'] == 'DOHMH']

In [47]:
all_graph = df_NYPD.groupby(by= df_NYPD.index.hour).count().plot(y='Unique Key', label='NYPD complaints')
all_graph.set_xticks([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23])
all_graph.set_title("A day of complaints by the top 5 agencies")
all_graph.set_xlabel("Hours")
all_graph.set_ylabel("Complaints")


df_HPD.groupby(by= df_HPD.index.hour).count().plot(y='Unique Key', ax=all_graph , label='HPD complaints')

df_DOT.groupby(by= df_DOT.index.hour).count().plot(y='Unique Key', ax=all_graph , label='DOT complaints')

df_DPR.groupby(by= df_DPR.index.hour).count().plot(y='Unique Key', ax=all_graph , label='DPR complaints')

df_DOHMH.groupby(by= df_DOHMH.index.hour).count().plot(y='Unique Key', ax=all_graph , label='DOHMH complaints')


Out[47]:
<matplotlib.axes._subplots.AxesSubplot at 0x10b612198>
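
Instead of one DataFrame and one plot call per agency, the hour-by-agency counts can be built as a single table. This is a sketch on made-up data, not what the cells above do; `top_n` and the `hour` column are names introduced here, and the real notebook would use 5 instead of 2.

```python
import pandas as pd

# Made-up complaints for three agencies.
toy = pd.DataFrame(
    {"Agency": ["NYPD", "NYPD", "HPD", "HPD", "DOT", "NYPD"]},
    index=pd.to_datetime([
        "2015-05-01 23:10:00",
        "2015-05-02 23:40:00",
        "2015-05-03 00:00:00",
        "2015-05-04 00:00:00",
        "2015-05-04 09:15:00",
        "2015-05-05 01:05:00",
    ]),
)

top_n = 2  # the notebook would use 5
top_agencies = toy["Agency"].value_counts().head(top_n).index
subset = toy[toy["Agency"].isin(top_agencies)]

# Rows = hour of day, columns = agency, values = number of complaints.
hourly = (
    subset.assign(hour=subset.index.hour)
    .groupby(["hour", "Agency"])
    .size()
    .unstack(fill_value=0)
)
print(hourly)
```

`hourly.plot()` draws one line per agency on shared axes, with the legend filled in from the column names.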

Graph those same agencies on an annual basis - make it weekly. When do people like to complain? When does the NYPD have an odd number of complaints?


In [48]:
all_graph = df_NYPD.groupby(by= df_NYPD.index.weekofyear).count().plot(y='Unique Key', label='NYPD complaints')
#all_graph.set_xticks([1,50])
all_graph.set_title("A year of complaints by the top 5 agencies")
all_graph.set_xlabel("Weeks")

# Group every agency by week of the year (not by hour) so all five lines share the same x-axis
df_HPD.groupby(by= df_HPD.index.week).count().plot(y='Unique Key', ax=all_graph , label='HPD complaints')

df_DOT.groupby(by= df_DOT.index.week).count().plot(y='Unique Key', ax=all_graph , label='DOT complaints')

df_DPR.groupby(by= df_DPR.index.week).count().plot(y='Unique Key', ax=all_graph , label='DPR complaints')

df_DOHMH.groupby(by= df_DOHMH.index.week).count().plot(y='Unique Key', ax=all_graph , label='DOHMH complaints')

# Put the legend on this chart's own axes, outside the plot area
all_graph.legend(loc='center left', bbox_to_anchor=(1, 0.5))

print("""May and June are the months with more complaints, followed by October, November and December. 
In May the NYPD and HPD have an odd number of complaints""")


May and June are the months with more complaints, followed by October, November and December. 
In May the NYPD and HPD have an odd number of complaints
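
A possible alternative for the weekly view is to let `pd.Grouper(freq='W')` bin the timestamps into calendar weeks and group by agency at the same time. A sketch on made-up data; each week is labeled by the Sunday that ends it, so the x-axis shows real dates rather than week numbers.

```python
import pandas as pd

# Made-up complaints: three in the week ending Jan 11, one in the week ending Jan 18.
toy = pd.DataFrame(
    {"Agency": ["NYPD", "NYPD", "HPD", "NYPD"]},
    index=pd.to_datetime([
        "2015-01-05 10:00:00",
        "2015-01-06 11:00:00",
        "2015-01-07 00:00:00",
        "2015-01-12 09:00:00",
    ]),
)

# Rows = week (labeled by its final Sunday), columns = agency, values = complaints.
weekly = (
    toy.groupby([pd.Grouper(freq="W"), "Agency"])
    .size()
    .unstack(fill_value=0)
)
print(weekly)
```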

Maybe the NYPD deals with different issues at different times? Check the most popular complaints in July and August vs the month of May. Also check the most common complaints for the Department of Housing Preservation and Development (HPD) in winter vs. summer.


In [49]:
August_July = df["2015-07":"2015-08"]
August_July_complaints = August_July['Complaint Type'].value_counts().head(5)
August_July_complaints


Out[49]:
Illegal Parking            3444
Blocked Driveway           3258
Noise - Street/Sidewalk    3165
Street Condition           1480
Noise - Commercial         1201
Name: Complaint Type, dtype: int64

In [50]:
May = df['2015-05']
May_complaints= May['Complaint Type'].value_counts().head(5)
May_complaints


Out[50]:
Blocked Driveway           4114
Illegal Parking            3975
HEAT/HOT WATER             3583
Noise - Street/Sidewalk    3385
Noise - Commercial         2263
Name: Complaint Type, dtype: int64
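
The two periods differ in length (two months vs one), so raw counts are hard to compare directly. A sketch of one way to put them side by side as shares of each period, on made-up data; it assumes the index is sorted, which newer pandas versions may require for date-range slices.

```python
import pandas as pd

# Made-up complaints: four in May, one in July, one in August.
toy = pd.DataFrame(
    {"Complaint Type": [
        "HEAT/HOT WATER", "HEAT/HOT WATER", "Illegal Parking",
        "Blocked Driveway", "Illegal Parking", "Illegal Parking",
    ]},
    index=pd.to_datetime([
        "2015-05-02", "2015-05-10", "2015-05-15",
        "2015-05-28", "2015-07-04", "2015-08-19",
    ]),
).sort_index()

may = toy.loc["2015-05", "Complaint Type"]
summer = toy.loc["2015-07":"2015-08", "Complaint Type"]

# normalize=True returns each type's share of the period instead of a raw count.
comparison = pd.DataFrame({
    "May": may.value_counts(normalize=True),
    "Jul-Aug": summer.value_counts(normalize=True),
}).fillna(0)
print(comparison)
```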

In [51]:
# August_July_vs_May= August_July_complaints.plot(y='Unique Key', label='August - July complaints')
# August_July_vs_May.set_ylabel("Number of Complaints")
# August_July_vs_May.set_title("August-July vs May Complaints")
# May['Complaint Type'].value_counts().head(5).plot(y='Unique Key', ax=August_July_vs_May, label='May complaints')

# August_July_vs_May.set_xticks([1,2,3,4,5])
# August_July_vs_May.set_xticklabels(['Illegal Parking', 'Blocked Driveway', 'Noise - Street/Sidewalk', 'Street Condition', 'Noise - Commercial'])

In [52]:
#Most popular complaints of the HPD
df_HPD['Complaint Type'].value_counts().head(5)


Out[52]:
HEAT/HOT WATER            12408
UNSANITARY CONDITION       4774
PAINT/PLASTER              4306
PLUMBING                   3388
HPD Literature Request     3305
Name: Complaint Type, dtype: int64

In [79]:
summer_complaints= df_HPD["2015-06":"2015-08"]['Complaint Type'].value_counts().head(5)
summer_complaints


Out[79]:
HEAT/HOT WATER            617
UNSANITARY CONDITION      510
HPD Literature Request    462
PAINT/PLASTER             444
PLUMBING                  309
Name: Complaint Type, dtype: int64

In [80]:
winter_complaints= df_HPD["2015-01":"2015-02"]['Complaint Type'].value_counts().head(5)
winter_complaints


Out[80]:
UNSANITARY CONDITION    8
GENERAL                 3
PAINT/PLASTER           3
APPLIANCE               2
WATER LEAK              2
Name: Complaint Type, dtype: int64

In [82]:
winter_complaints_dec= df_HPD["2015-12"]['Complaint Type'].value_counts().head(5)
winter_complaints_dec


Out[82]:
HEAT/HOT WATER          353
UNSANITARY CONDITION    182
PLUMBING                138
PAINT/PLASTER           136
DOOR/WINDOW             103
Name: Complaint Type, dtype: int64

In [83]:
winter_results= df_HPD["2015-12"]['Complaint Type'].value_counts() + df_HPD["2015-01":"2015-02"]['Complaint Type'].value_counts()

In [84]:
winter_results


Out[84]:
APPLIANCE                  32.0
DOOR/WINDOW                 NaN
ELECTRIC                    NaN
FLOORING/STAIRS            57.0
GENERAL                    66.0
HEAT/HOT WATER              NaN
HPD Literature Request      NaN
OUTSIDE BUILDING            NaN
PAINT/PLASTER             139.0
PLUMBING                  139.0
SAFETY                      NaN
UNSANITARY CONDITION      190.0
WATER LEAK                 88.0
Name: Complaint Type, dtype: float64
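
The NaN values above come from adding two Series with `+`: any complaint type that appears in only one of the two periods has nothing to add to, so the result is NaN (HEAT/HOT WATER, for example, is in the December counts but not in the January-February ones). A sketch of the fix with `Series.add(..., fill_value=0)`, using a few of the counts shown in Out[80] and Out[82].

```python
import pandas as pd

# A few values taken from the December and January-February outputs above.
december = pd.Series({"HEAT/HOT WATER": 353, "UNSANITARY CONDITION": 182, "DOOR/WINDOW": 103})
jan_feb = pd.Series({"UNSANITARY CONDITION": 8, "GENERAL": 3})

# Plain + leaves NaN wherever a label exists on only one side.
with_gaps = december + jan_feb

# add() with fill_value=0 treats a missing label as zero instead.
winter_total = (
    december.add(jan_feb, fill_value=0)
    .astype(int)
    .sort_values(ascending=False)
)
print(winter_total)
```

With the gaps filled, HEAT/HOT WATER comes out on top of this small winter sample instead of disappearing.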
