In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
import dateutil.parser

First, I made a mistake naming the data set! It's 2015 data, not 2014 data. But yes, still use 311-2014.csv. You can rename it.

Importing and preparing your data

Import your data, but only the first 200,000 rows. Then change the index to a datetime based on the Created Date column: check whether the column is already a datetime, and parse it if not.


In [3]:
df=pd.read_csv("311-2014.csv", nrows=200000)


/home/sean/.local/lib/python3.5/site-packages/IPython/core/interactiveshell.py:2723: DtypeWarning: Columns (8,17,48) have mixed types. Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)

In [4]:
dateutil.parser.parse(df['Created Date'][0])


Out[4]:
datetime.datetime(2015, 7, 6, 10, 58, 27)

In [5]:
def parse_date(str_date):
    return dateutil.parser.parse(str_date)

In [6]:
df['created_datetime']=df['Created Date'].apply(parse_date)

In [7]:
df.index=df['created_datetime']
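The row-by-row `dateutil` parse above works, but it calls the parser once per row. A minimal sketch of a vectorized alternative follows, assuming every `Created Date` string uses the `MM/DD/YYYY hh:mm:ss AM/PM` pattern visible in the sample output. The helper name `ensure_datetime_index` and the two-row `sample` frame are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row stand-in for the 311 data
sample = pd.DataFrame({
    'Unique Key': [30311691, 30307701],
    'Created Date': ['04/01/2015 09:37:42 PM', '04/01/2015 11:12:04 PM'],
})

def ensure_datetime_index(frame, column='Created Date'):
    """Return a copy of frame indexed by a parsed version of `column`."""
    frame = frame.copy()
    # Only parse when the column is not already a datetime dtype
    if not pd.api.types.is_datetime64_any_dtype(frame[column]):
        # Assumed format, based on the strings shown in the notebook output
        frame[column] = pd.to_datetime(frame[column], format='%m/%d/%Y %I:%M:%S %p')
    frame.index = pd.DatetimeIndex(frame[column], name='created_datetime')
    return frame

indexed = ensure_datetime_index(sample)
print(indexed.index)
```

Passing an explicit `format` lets pandas skip format inference. If some rows use a different layout, the call will raise instead of guessing.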

What was the most popular type of complaint, and how many times was it filed?


In [8]:
df['Complaint Type'].describe()


Out[8]:
count               200000
unique                 180
top       Blocked Driveway
freq                 21779
Name: Complaint Type, dtype: object
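`describe()` reports the top value and its frequency in one go. As a sketch of another route to the same answer, `value_counts()` returns every category sorted by frequency, so the first entry is the most popular complaint. The six-row `complaints` Series below is invented sample data, not the real file.

```python
import pandas as pd

# Invented sample; the real column is df['Complaint Type']
complaints = pd.Series(
    ['Blocked Driveway', 'Illegal Parking', 'Blocked Driveway',
     'Noise - Commercial', 'Blocked Driveway', 'Illegal Parking'],
    name='Complaint Type',
)

counts = complaints.value_counts()   # sorted from most to least frequent
top_type = counts.index[0]           # label of the most frequent complaint
top_count = int(counts.iloc[0])      # how many times it was filed
print(top_type, top_count)
```

The same `counts.head(5)` would feed the horizontal bar graph asked for next.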

Make a horizontal bar graph of the top 5 most frequent complaint types.


In [9]:
df.groupby(by='Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5).plot(kind='barh').invert_yaxis()


Which borough has the most complaints per capita? Since it's only 5 boroughs, you can do the math manually.


In [10]:
df.groupby(by='Borough')['Borough'].count()


Out[10]:
Borough
BRONX            29610
BROOKLYN         57129
MANHATTAN        42050
QUEENS           46824
STATEN ISLAND     7387
Unspecified      17000
Name: Borough, dtype: int64

In [11]:
boro_pop={
    'BRONX': 1438159,
    'BROOKLYN': 2621793,
    'MANHATTAN': 1636268,
    'QUEENS': 2321580,
    'STATEN ISLAND': 473279}

In [12]:
# Complaint counts per borough, as a one-column DataFrame
boro_df=df.groupby(by='Borough')['Borough'].count().to_frame('Complaints')
# Align population by borough name; 'Unspecified' has no population, so it becomes NaN
boro_df['Population']=pd.Series(boro_pop)
boro_df['Per Capita']=boro_df['Complaints']/boro_df['Population']
boro_df['Per Capita'].plot(kind='bar')


Out[12]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf4bc59e8>
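The question can also be answered without a plot. The sketch below does the "manual math" with the complaint counts from Out[10] and the population figures from `boro_pop`, leaving out the `Unspecified` rows because they have no population to divide by. With these numbers, Manhattan comes out highest.

```python
import pandas as pd

# Counts copied from Out[10]; 'Unspecified' is left out on purpose
complaints = pd.Series({
    'BRONX': 29610, 'BROOKLYN': 57129, 'MANHATTAN': 42050,
    'QUEENS': 46824, 'STATEN ISLAND': 7387,
})
# Populations copied from the boro_pop dictionary
population = pd.Series({
    'BRONX': 1438159, 'BROOKLYN': 2621793, 'MANHATTAN': 1636268,
    'QUEENS': 2321580, 'STATEN ISLAND': 473279,
})

# Division aligns on the borough labels
per_capita = (complaints / population).sort_values(ascending=False)
print(per_capita)
print('Most complaints per capita:', per_capita.idxmax())
```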

According to your selection of data, how many cases were filed in March? How about May?


In [13]:
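# Use .loc for partial-string date selection; plain df['2015-03'] row lookup was removed in pandas 2.0
df.loc['2015-03']['Created Date'].count()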


Out[13]:
15025

In [14]:
df.loc['2015-05']['Created Date'].count()


Out[14]:
49715
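A sketch of the same monthly count written as an explicit boolean mask on the index, which does not depend on partial-string indexing or on the index being sorted. The helper `count_in_month` and the four-row `sample` frame are hypothetical.

```python
import pandas as pd

# Hypothetical sample with a DatetimeIndex, like df after the index is set
stamps = pd.to_datetime([
    '2015-03-02 08:00', '2015-05-04 09:30',
    '2015-05-11 00:00', '2016-01-01 12:00',
])
sample = pd.DataFrame({'Unique Key': [1, 2, 3, 4]}, index=stamps)

def count_in_month(frame, year, month):
    """Count rows whose index falls in the given year and month."""
    mask = (frame.index.year == year) & (frame.index.month == month)
    return int(mask.sum())

march = count_in_month(sample, 2015, 3)
may = count_in_month(sample, 2015, 5)
print(march, may)
```

Checking the year as well as the month matters here, since the monthly counts in Out[18] show that the selection includes a few January 2016 rows.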

I'd like to see all of the 311 complaints called in on April 1st.

Surprise! We couldn't do this in class, but it was just a limitation of our data set.


In [15]:
df.loc['2015-04-01']


Out[15]:
Unique Key Created Date Closed Date Agency Agency Name Complaint Type Descriptor Location Type Incident Zip Incident Address ... Bridge Highway Direction Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name Latitude Longitude Location created_datetime
created_datetime
2015-04-01 21:37:42 30311691 04/01/2015 09:37:42 PM 04/01/2015 10:49:33 PM NYPD New York City Police Department Illegal Parking Blocked Sidewalk Street/Sidewalk 11234 NaN ... NaN NaN NaN NaN NaN NaN 40.609810 -73.922498 (40.60980966645303, -73.92249759633725) 2015-04-01 21:37:42
2015-04-01 23:12:04 30307701 04/01/2015 11:12:04 PM 04/01/2015 11:32:40 PM NYPD New York City Police Department Noise - Commercial Loud Music/Party Store/Commercial 11205 700 MYRTLE AVENUE ... NaN NaN NaN NaN NaN NaN 40.694644 -73.955504 (40.694643700748486, -73.95550356170298) 2015-04-01 23:12:04
2015-04-01 13:10:35 30313389 04/01/2015 01:10:35 PM 04/07/2015 04:01:08 PM DPR Department of Parks and Recreation Root/Sewer/Sidewalk Condition Trees and Sidewalks Program Street 11422 245-16 149 AVENUE ... NaN NaN NaN NaN NaN NaN 40.653016 -73.738626 (40.653016256598534, -73.73862588133056) 2015-04-01 13:10:35
2015-04-01 17:37:38 30314393 04/01/2015 05:37:38 PM 04/03/2015 11:40:54 AM DPR Department of Parks and Recreation Maintenance or Facility Hours of Operation Park 11211 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 17:37:38
2015-04-01 12:32:40 30309207 04/01/2015 12:32:40 PM 04/17/2015 01:06:49 AM DCA Department of Consumer Affairs Consumer Complaint Installation/Work Quality NaN 11423 90-71 198 STREET ... NaN NaN NaN NaN NaN NaN 40.714299 -73.761158 (40.71429859671565, -73.76115807774032) 2015-04-01 12:32:40
2015-04-01 18:44:50 30311759 04/01/2015 06:44:50 PM 06/24/2015 11:27:00 AM DPR Department of Parks and Recreation Damaged Tree Entire Tree Has Fallen Down Street 10467 862 EAST 213 STREET ... NaN NaN NaN NaN NaN NaN 40.878028 -73.860237 (40.87802828144708, -73.86023734606933) 2015-04-01 18:44:50
2015-04-01 16:30:15 30309690 04/01/2015 04:30:15 PM 04/01/2015 11:27:22 PM NYPD New York City Police Department Animal Abuse Neglected Residential Building/House 11368 107-15 NORTHERN BOULEVARD ... NaN NaN NaN NaN NaN NaN 40.757811 -73.861677 (40.757811195752154, -73.86167714731972) 2015-04-01 16:30:15
2015-04-01 09:04:07 30307990 04/01/2015 09:04:07 AM 04/06/2015 09:17:10 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Miscellaneous Senior Address 10027 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 09:04:07
2015-04-01 07:46:58 30308253 04/01/2015 07:46:58 AM 04/01/2015 09:32:31 AM NYPD New York City Police Department Blocked Driveway No Access Street/Sidewalk 11370 32-51 80 STREET ... NaN NaN NaN NaN NaN NaN 40.756412 -73.887405 (40.75641194675221, -73.88740503059863) 2015-04-01 07:46:58
2015-04-01 17:12:17 30314214 04/01/2015 05:12:17 PM 04/09/2015 02:20:11 PM DOT Department of Transportation Highway Condition Pothole - Highway Highway NaN NaN ... West/Manhattan Bound Roadway Clearview Expwy (I-295) (Exit 27 S-N) - Utopia... NaN NaN NaN NaN NaN NaN 2015-04-01 17:12:17
2015-04-01 21:30:48 30307111 04/01/2015 09:30:48 PM NaN DOHMH Department of Health and Mental Hygiene Food Establishment Food Temperature Restaurant/Bar/Deli/Bakery 11215 709 5 AVENUE ... NaN NaN NaN NaN NaN NaN 40.660699 -73.994082 (40.660699296661825, -73.99408169463258) 2015-04-01 21:30:48
2015-04-01 15:51:04 30311571 04/01/2015 03:51:04 PM 04/14/2015 09:23:30 AM DPR Department of Parks and Recreation Maintenance or Facility Hours of Operation Park 11210 NaN ... NaN NaN NaN NaN NaN NaN 40.621474 -73.950711 (40.62147413119333, -73.95071097029123) 2015-04-01 15:51:04
2015-04-01 10:43:28 30313817 04/01/2015 10:43:28 AM NaN DPR Department of Parks and Recreation Damaged Tree Branch Cracked and Will Fall NaN 10009 620 EAST 12TH STREET ... NaN NaN NaN NaN NaN NaN 40.727725 -73.978204 (40.72772462544187, -73.97820435916094) 2015-04-01 10:43:28
2015-04-01 15:12:46 30308922 04/01/2015 03:12:46 PM 06/01/2015 06:25:48 AM DOHMH Department of Health and Mental Hygiene Food Establishment Letter Grading Restaurant/Bar/Deli/Bakery 11238 663 FRANKLIN AVENUE ... NaN NaN NaN NaN NaN NaN 40.675746 -73.956122 (40.67574618440852, -73.9561218336512) 2015-04-01 15:12:46
2015-04-01 06:15:42 30311132 04/01/2015 06:15:42 AM 04/01/2015 10:28:30 AM DOT Department of Transportation Highway Condition Pothole - Highway Highway 10304 NaN ... East/Brooklyn Bound Roadway Clove Rd/Richmond Rd (Exit 13) - Lily Pond Ave... NaN NaN NaN 40.606875 -74.085408 (40.60687536641399, -74.0854077221027) 2015-04-01 06:15:42
2015-04-01 11:28:02 30308180 04/01/2015 11:28:02 AM 04/01/2015 11:42:53 AM DOT Department of Transportation Highway Condition Pothole - Highway Highway 11432 NaN ... West/Toward Triborough Br Ramp 168th St (Exit 17) NaN NaN NaN 40.719228 -73.791963 (40.71922760413319, -73.791962929951) 2015-04-01 11:28:02
2015-04-01 17:35:18 30313207 04/01/2015 05:35:18 PM 06/01/2015 06:25:54 AM DOHMH Department of Health and Mental Hygiene Food Establishment Rodents/Insects/Garbage Restaurant/Bar/Deli/Bakery 10011 140 WEST 13 STREET ... NaN NaN NaN NaN NaN NaN 40.737182 -73.998585 (40.737182358685516, -73.99858548189518) 2015-04-01 17:35:18
2015-04-01 13:54:54 30310017 04/01/2015 01:54:54 PM 04/06/2015 10:11:11 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Miscellaneous Senior Address 11435 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 13:54:54
2015-04-01 23:49:33 30306774 04/01/2015 11:49:33 PM 04/02/2015 12:20:59 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Store/Commercial 10003 36 SAINT MARKS PLACE ... NaN NaN NaN NaN NaN NaN 40.728733 -73.988011 (40.72873338955463, -73.98801059255561) 2015-04-01 23:49:33
2015-04-01 07:50:49 30313339 04/01/2015 07:50:49 AM 07/08/2015 02:19:25 PM DOT Department of Transportation Street Condition Rough, Pitted or Cracked Roads Street 11385 NaN ... NaN NaN NaN NaN NaN NaN 40.703414 -73.862854 (40.70341423569781, -73.86285397616253) 2015-04-01 07:50:49
2015-04-01 13:50:29 30312146 04/01/2015 01:50:29 PM 06/01/2015 06:25:49 AM DOHMH Department of Health and Mental Hygiene Food Establishment Rodents/Insects/Garbage Restaurant/Bar/Deli/Bakery 10028 1291 LEXINGTON AVENUE ... NaN NaN NaN NaN NaN NaN 40.780069 -73.955158 (40.78006850471446, -73.95515761412761) 2015-04-01 13:50:29
2015-04-01 16:14:19 30313259 04/01/2015 04:14:19 PM 04/01/2015 04:21:53 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Medicaid NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 16:14:19
2015-04-01 19:27:34 30308920 04/01/2015 07:27:34 PM 04/01/2015 08:45:17 PM NYPD New York City Police Department Noise - Street/Sidewalk Loud Music/Party Street/Sidewalk 10017 210 EAST 46 STREET ... NaN NaN NaN NaN NaN NaN 40.753104 -73.972096 (40.75310402468627, -73.97209629231209) 2015-04-01 19:27:34
2015-04-01 05:30:02 30314164 04/01/2015 05:30:02 AM 04/01/2015 02:57:31 PM DOT Department of Transportation Highway Condition Pothole - Highway Highway NaN NaN ... East/Queens Bound Roadway Williamsburg Br / Metropolitan Ave (Exit 32) -... NaN NaN NaN NaN NaN NaN 2015-04-01 05:30:02
2015-04-01 10:33:26 30311790 04/01/2015 10:33:26 AM 04/01/2015 11:19:12 AM NYPD New York City Police Department Illegal Parking Blocked Sidewalk Street/Sidewalk 10033 2284 AMSTERDAM AVENUE ... NaN NaN NaN NaN NaN NaN 40.843149 -73.934539 (40.84314882753921, -73.93453937669832) 2015-04-01 10:33:26
2015-04-01 11:47:38 30310940 04/01/2015 11:47:38 AM 04/06/2015 09:23:32 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Miscellaneous Senior Address 11355 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 11:47:38
2015-04-01 11:01:27 30310409 04/01/2015 11:01:27 AM 04/17/2015 01:06:42 AM DCA Department of Consumer Affairs Consumer Complaint Exchange/Refund/Return NaN 10455 2997 3 AVENUE ... NaN NaN NaN NaN NaN NaN 40.819111 -73.913908 (40.819110789789214, -73.91390802507868) 2015-04-01 11:01:27
2015-04-01 08:51:52 30310350 04/01/2015 08:51:52 AM 04/03/2015 04:33:46 PM DCA Department of Consumer Affairs Consumer Complaint Cars Parked on Sidewalk/Street NaN 11223 1701 WEST 8 STREET ... NaN NaN NaN NaN NaN NaN 40.605657 -73.981194 (40.60565667868274, -73.98119372058547) 2015-04-01 08:51:52
2015-04-01 14:58:55 30313106 04/01/2015 02:58:55 PM 04/06/2015 10:06:35 AM DOF Senior Citizen Rent Increase Exemption Unit SCRIE Rent Discrepancy Senior Address 11201 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 14:58:55
2015-04-01 16:59:19 30309324 04/01/2015 04:59:19 PM 04/01/2015 07:48:33 PM NYPD New York City Police Department Blocked Driveway Partial Access Street/Sidewalk 11210 650 EAST 24 STREET ... NaN NaN NaN NaN NaN NaN 40.634497 -73.954167 (40.63449684441219, -73.95416735372353) 2015-04-01 16:59:19
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2015-04-01 17:12:09 30313532 04/01/2015 05:12:09 PM 04/30/2015 06:02:47 PM DOT Department of Transportation Street Condition Line/Marking - Faded Street 11207 NaN ... NaN NaN NaN NaN NaN NaN 40.679561 -73.898899 (40.67956105192572, -73.89889884573184) 2015-04-01 17:12:09
2015-04-01 17:09:29 30311473 04/01/2015 05:09:29 PM 04/30/2015 05:59:38 PM DOT Department of Transportation Street Condition Line/Marking - Faded Street 11203 NaN ... NaN NaN NaN NaN NaN NaN 40.658529 -73.939568 (40.6585289219231, -73.93956820621213) 2015-04-01 17:09:29
2015-04-01 18:30:22 30307427 04/01/2015 06:30:22 PM 05/06/2015 10:59:47 AM DOT Department of Transportation Street Condition Failed Street Repair Street 11234 J AVENUE ... NaN NaN NaN NaN NaN NaN 40.628542 -73.921838 (40.62854243316789, -73.92183818389044) 2015-04-01 18:30:22
2015-04-01 21:07:21 30314301 04/01/2015 09:07:21 PM 05/08/2015 11:30:22 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 10001 511 WEST 25 STREET ... NaN NaN NaN NaN NaN NaN 40.749380 -74.004169 (40.74937996228322, -74.00416853967121) 2015-04-01 21:07:21
2015-04-01 10:50:12 30312508 04/01/2015 10:50:12 AM 05/08/2015 10:21:38 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 10032 NaN ... NaN NaN NaN NaN NaN NaN 40.842154 -73.942278 (40.84215388602991, -73.94227827092928) 2015-04-01 10:50:12
2015-04-01 09:07:38 30310225 04/01/2015 09:07:38 AM 05/04/2015 10:43:15 AM DPR Department of Parks and Recreation Root/Sewer/Sidewalk Condition Trees and Sidewalks Program Street 10307 647 CRAIG AVENUE ... NaN NaN NaN NaN NaN NaN 40.506708 -74.252182 (40.50670803830861, -74.25218246259357) 2015-04-01 09:07:38
2015-04-01 16:18:25 30313554 04/01/2015 04:18:25 PM 05/08/2015 11:29:12 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 11369 22-19 93 STREET ... NaN NaN NaN NaN NaN NaN 40.769574 -73.877480 (40.769573850244676, -73.8774799367093) 2015-04-01 16:18:25
2015-04-01 10:23:09 30313061 04/01/2015 10:23:09 AM 05/07/2015 02:19:57 PM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint Street 10021 NaN ... NaN NaN NaN NaN NaN NaN 40.765546 -73.954702 (40.765545913197165, -73.95470170187454) 2015-04-01 10:23:09
2015-04-01 14:31:57 30312110 04/01/2015 02:31:57 PM 05/08/2015 06:05:44 PM DPR Department of Parks and Recreation Dead Tree Dead/Dying Tree Street 11229 2056 EAST 29 STREET ... NaN NaN NaN NaN NaN NaN 40.601403 -73.943106 (40.60140342407911, -73.94310580244269) 2015-04-01 14:31:57
2015-04-01 18:50:19 30307758 04/01/2015 06:50:19 PM 05/07/2015 07:46:48 AM DPR Department of Parks and Recreation Damaged Tree Branch Cracked and Will Fall Street NaN NaN ... NaN NaN NaN NaN NaN NaN 40.720103 -73.790376 (40.72010305201917, -73.79037648278602) 2015-04-01 18:50:19
2015-04-01 14:03:43 30313462 04/01/2015 02:03:43 PM 05/06/2015 12:48:52 PM DOT Department of Transportation Street Condition Blocked - Construction Street 11209 NaN ... NaN NaN NaN NaN NaN NaN 40.633428 -74.032876 (40.63342806685948, -74.03287604669814) 2015-04-01 14:03:43
2015-04-01 11:59:20 30310246 04/01/2015 11:59:20 AM 11/09/2015 03:58:34 PM DOT Department of Transportation Street Condition Rough, Pitted or Cracked Roads Street 11217 90 PROSPECT PLACE ... NaN NaN NaN NaN NaN NaN 40.679040 -73.974579 (40.67903998236064, -73.97457889877462) 2015-04-01 11:59:20
2015-04-01 09:17:40 30310085 04/01/2015 09:17:40 AM 05/07/2015 06:53:11 PM DOT Department of Transportation Highway Condition Graffiti - Highway Highway NaN NaN ... West/Staten Island Bound Roadway Crospey Ave Stillwell Ave (Exit 6N) - Crospey ... NaN NaN NaN NaN NaN NaN 2015-04-01 09:17:40
2015-04-01 21:13:08 30314474 04/01/2015 09:13:08 PM 05/08/2015 11:27:01 AM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint NaN 11429 214-16 110 AVENUE ... NaN NaN NaN NaN NaN NaN 40.708131 -73.743041 (40.70813050331176, -73.74304104617282) 2015-04-01 21:13:08
2015-04-01 12:59:08 30308968 04/01/2015 12:59:08 PM 04/01/2015 12:59:23 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Medicaid Address Outside of NYC NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 12:59:08
2015-04-01 13:31:23 30308389 04/01/2015 01:31:23 PM 04/01/2015 01:32:08 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Medicaid NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 13:31:23
2015-04-01 16:42:15 30309377 04/01/2015 04:42:15 PM 04/01/2015 04:43:11 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Food Stamp NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 16:42:15
2015-04-01 13:37:07 30310992 04/01/2015 01:37:07 PM 04/01/2015 01:37:28 PM HRA HRA Benefit Card Replacement Benefit Card Replacement Food Stamp NYC Street Address NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015-04-01 13:37:07
2015-04-01 23:44:04 30310652 04/01/2015 11:44:04 PM 04/02/2015 01:25:52 AM NYPD New York City Police Department Derelict Vehicle With License Plate Street/Sidewalk 11421 85-86 87 STREET ... NaN NaN NaN NaN NaN NaN 40.694888 -73.857927 (40.69488849346232, -73.85792744070989) 2015-04-01 23:44:04
2015-04-01 16:32:12 30309028 04/01/2015 04:32:12 PM 05/20/2015 05:36:29 PM TLC Taxi and Limousine Commission For Hire Vehicle Complaint Car Service Company Complaint Street 10451 215 EAST 161 STREET ... NaN NaN NaN NaN NaN NaN 40.826235 -73.920529 (40.8262353417949, -73.92052920426786) 2015-04-01 16:32:12
2015-04-01 08:26:06 30312622 04/01/2015 08:26:06 AM 06/01/2015 06:25:41 AM DOHMH Department of Health and Mental Hygiene Food Establishment Facility Construction Restaurant/Bar/Deli/Bakery 11234 2301 FLATBUSH AVENUE ... NaN NaN NaN NaN NaN NaN 40.613889 -73.927186 (40.61388875283825, -73.92718600732812) 2015-04-01 08:26:06
2015-04-01 15:08:20 30308371 04/01/2015 03:08:20 PM 06/01/2015 06:18:02 PM TLC Taxi and Limousine Commission Taxi Complaint Driver Complaint Street 10128 NaN ... NaN NaN NaN NaN NaN NaN 40.781533 -73.958320 (40.78153263581957, -73.9583197488706) 2015-04-01 15:08:20
2015-04-01 10:19:21 30311001 04/01/2015 10:19:21 AM 06/01/2015 06:25:39 AM DOHMH Department of Health and Mental Hygiene Food Establishment Rodents/Insects/Garbage Restaurant/Bar/Deli/Bakery 11377 59-21 ROOSEVELT AVENUE ... NaN NaN NaN NaN NaN NaN 40.745586 -73.904573 (40.74558568959288, -73.90457292624892) 2015-04-01 10:19:21
2015-04-01 20:20:13 30311341 04/01/2015 08:20:13 PM 04/01/2015 10:49:32 PM NYPD New York City Police Department Blocked Driveway Partial Access Street/Sidewalk 11691 348 BEACH 40 STREET ... NaN NaN NaN NaN NaN NaN 40.595019 -73.772153 (40.5950185756628, -73.77215306630436) 2015-04-01 20:20:13
2015-04-01 02:16:44 30308863 04/01/2015 02:16:44 AM 04/01/2015 02:54:17 AM NYPD New York City Police Department Noise - Commercial Loud Music/Party Club/Bar/Restaurant 10013 301 CHURCH STREET ... NaN NaN NaN NaN NaN NaN 40.719322 -74.004470 (40.71932215308254, -74.00446968948569) 2015-04-01 02:16:44
2015-04-01 13:12:58 30307673 04/01/2015 01:12:58 PM 04/01/2015 10:01:26 PM NYPD New York City Police Department Illegal Parking Posted Parking Sign Violation Street/Sidewalk 10306 200 ADELAIDE AVENUE ... NaN NaN NaN NaN NaN NaN 40.561690 -74.124622 (40.5616902523158, -74.12462211525013) 2015-04-01 13:12:58
2015-04-01 13:17:23 30307732 04/01/2015 01:17:23 PM 04/01/2015 01:31:22 PM NYPD New York City Police Department Traffic Congestion/Gridlock Street/Sidewalk 10013 NaN ... NaN NaN NaN NaN NaN NaN 40.720557 -74.003510 (40.72055732795014, -74.00351016018516) 2015-04-01 13:17:23
2015-04-01 21:39:04 30311958 04/01/2015 09:39:04 PM 04/01/2015 09:50:48 PM NYPD New York City Police Department Noise - Vehicle Car/Truck Music Street/Sidewalk 11207 184 JEROME STREET ... NaN NaN NaN NaN NaN NaN 40.677739 -73.887888 (40.677739297670584, -73.8878875660618) 2015-04-01 21:39:04
2015-04-01 12:53:45 30309365 04/01/2015 12:53:45 PM 04/02/2015 12:04:38 PM DCA Department of Consumer Affairs Consumer Complaint Overcharge NaN 11418 NaN ... NaN NaN NaN NaN NaN NaN 40.700108 -73.832667 (40.70010803283339, -73.83266746664873) 2015-04-01 12:53:45
2015-04-01 10:46:01 30312487 04/01/2015 10:46:01 AM 04/02/2015 03:34:31 PM DCA Department of Consumer Affairs Consumer Complaint Damaged/Defective Goods NaN 11232 807 42 STREET ... NaN NaN NaN NaN NaN NaN 40.645348 -73.998616 (40.64534787518196, -73.99861625677346) 2015-04-01 10:46:01

573 rows × 54 columns

What was the most popular type of complaint on April 1st?


In [16]:
df.loc['2015-04-01'].groupby(by='Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(1)


Out[16]:
Complaint Type
Illegal Parking    67
Name: Complaint Type, dtype: int64

What were the three most popular types of complaint on April 1st?


In [17]:
df.loc['2015-04-01'].groupby(by='Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(3)


Out[17]:
Complaint Type
Illegal Parking     67
Street Condition    64
Blocked Driveway    58
Name: Complaint Type, dtype: int64

What month has the most reports filed? How many? Graph it.


In [18]:
df.resample('M')['Unique Key'].count().sort_values(ascending=False)


Out[18]:
created_datetime
2015-05-31    49715
2015-10-31    24700
2015-04-30    20087
2015-11-30    16476
2015-07-31    15047
2015-03-31    15025
2015-06-30    14459
2015-09-30    13679
2015-08-31    12204
2015-02-28     8141
2015-01-31     7091
2015-12-31     3373
2016-01-31        3
Name: Unique Key, dtype: int64

In [19]:
df.resample('M').count().plot(y='Unique Key')


Out[19]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf52e0d30>
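To pull out the busiest month and its count directly instead of reading them off a sorted list, one option is to group by calendar period and use `idxmax()`. This is a sketch on invented timestamps. Grouping on `to_period('M')` is a substitute for `resample('M')` here, not what the notebook ran.

```python
import pandas as pd

# Invented timestamps standing in for the created_datetime index
stamps = pd.to_datetime([
    '2015-03-02', '2015-05-04', '2015-05-11', '2015-05-28', '2015-10-28',
])
keys = pd.Series(range(len(stamps)), index=stamps, name='Unique Key')

# One group per calendar month, labelled by a monthly Period
monthly = keys.groupby(keys.index.to_period('M')).count()
busiest_month = monthly.idxmax()        # Period of the month with most rows
busiest_count = int(monthly.max())      # number of reports in that month
print(busiest_month, busiest_count)
```

Unlike `resample`, this grouping only produces months that actually occur in the data, so empty months would be missing rather than shown as zero.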

What week of the year has the most reports filed? How many? Graph the weekly complaints.


In [20]:
df.resample('W')['Unique Key'].count().sort_values(ascending=False).head(5)


Out[20]:
created_datetime
2015-05-10    13559
2015-05-17    11683
2015-05-24    10351
2015-05-03    10184
2015-05-31     9387
Name: Unique Key, dtype: int64

In [21]:
df.resample('W').count().plot(y='Unique Key')


Out[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf521fbe0>

Noise complaints are a big deal. Use .str.contains to select noise complaints, and make a chart of when they show up over the year. Then make a chart of when they show up over the course of a day (cyclic).


In [32]:
noise_df=df[df['Complaint Type'].str.contains('Noise')]
noise_df.resample('M').count().plot(y='Unique Key')


Out[32]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf4ff6ba8>

In [39]:
noise_df.groupby(by=noise_df.index.hour).count().plot(y='Unique Key')


Out[39]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf49c18d0>
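A small sketch of the noise selection and the hour-of-day grouping on invented rows. Two details here are additions, not part of the cells above: `na=False` treats missing complaint types as non-matches, and `reindex(range(24), fill_value=0)` keeps hours with no noise complaints on the x-axis as zeros.

```python
import pandas as pd

# Invented rows; the real frame is df with its created_datetime index
stamps = pd.to_datetime([
    '2015-04-01 23:12:04', '2015-04-01 23:49:33',
    '2015-04-01 02:16:44', '2015-04-01 07:46:58',
])
sample = pd.DataFrame({
    'Complaint Type': ['Noise - Commercial', 'Noise - Commercial',
                       'Noise - Commercial', 'Blocked Driveway'],
}, index=stamps)

# Boolean mask of rows whose complaint type mentions noise
is_noise = sample['Complaint Type'].str.contains('Noise', na=False)
noise = sample[is_noise]

# Count per hour of day, filling in hours that never appear
by_hour = noise.groupby(noise.index.hour).size().reindex(range(24), fill_value=0)
print(by_hour[by_hour > 0])
```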

Which were the top five days of the year for filing complaints? How many on each of those days? Graph it.


In [42]:
df.resample('D')['Unique Key'].count().sort_values(ascending=False).head(5)


Out[42]:
created_datetime
2015-10-28    2697
2015-11-09    2529
2015-05-04    2465
2015-05-11    2293
2015-10-29    2258
Name: Unique Key, dtype: int64

In [47]:
df.resample('D')['Unique Key'].count().sort_values().tail(5).plot(kind='barh')


Out[47]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf47bd278>

What hour of the day are the most complaints? Graph a day of complaints.


In [74]:
df['Unique Key'].groupby(by=df.index.hour).count().sort_values(ascending=False)


Out[74]:
0     22427
11    12729
12    12469
10    12343
13    11745
9     11490
15    11454
14    11205
16    10966
17     9291
18     8965
8      8157
22     8085
21     7658
19     7636
23     7420
20     7322
7      4992
1      3927
6      2687
2      2400
3      1644
5      1528
4      1460
Name: Unique Key, dtype: int64

In [51]:
df['Unique Key'].groupby(df.index.hour).count().plot()


Out[51]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbdf2e9af28>

One of the hours has an odd number of complaints. What are the most common complaints at that hour, and what are the most common complaints the hour before and after?


In [71]:
df[df.index.hour==0].groupby(by='Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)


Out[71]:
Complaint Type
HEAT/HOT WATER          4534
Rodent                  2112
PAINT/PLASTER           1946
UNSANITARY CONDITION    1820
PLUMBING                1502
Name: Complaint Type, dtype: int64

In [72]:
df[df.index.hour==1].groupby(by='Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)


Out[72]:
Complaint Type
Noise - Commercial         1025
Noise - Street/Sidewalk     897
Blocked Driveway            479
Illegal Parking             400
Noise - Vehicle             249
Name: Complaint Type, dtype: int64

In [73]:
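# Note: hour 11 is 11 AM. The hour before midnight is hour 23, so df.index.hour==23 is the filter the question asks for; the output below is for 11 AM.
df[df.index.hour==11].groupby(by='Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)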


Out[73]:
Complaint Type
Illegal Parking      1184
Blocked Driveway     1170
Street Condition      694
Broken Muni Meter     596
Graffiti              566
Name: Complaint Type, dtype: int64

So odd. What's the per-minute breakdown of complaints between 12am and 1am? You don't need to include 1am.


In [112]:
midnight_df = df[df.index.hour==0]

In [113]:
midnight_df.groupby(midnight_df.index.minute)['Unique Key'].count().sort_values(ascending=False)


Out[113]:
0     17116
50      112
1       109
21      109
40      108
6       106
7       106
4       106
39      106
22      104
11      101
12      100
20      100
25      100
15      100
13      100
3        99
33       98
44       97
8        95
41       95
5        94
27       94
28       94
17       93
46       93
19       93
23       92
18       91
2        91
38       91
30       89
31       89
35       89
10       89
56       89
43       89
14       88
29       87
32       87
52       85
26       84
37       84
53       83
16       83
48       82
9        82
45       81
59       80
24       79
36       78
34       77
57       76
47       76
51       75
42       72
49       70
54       70
58       61
55       60
Name: Unique Key, dtype: int64
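In the output above, minute 0 accounts for 17,116 of the 22,427 complaints filed in the midnight hour (the hour total is in Out[74]), roughly three quarters. One plausible reading, which the data here does not confirm, is that some records carry only a date and default to 00:00:00. The sketch below shows one way to test that idea by flagging timestamps that sit exactly on midnight. The `sample` frame is invented.

```python
import pandas as pd

# Invented rows: two stamped exactly at midnight, two at ordinary times
stamps = pd.to_datetime([
    '2015-05-04 00:00:00', '2015-05-04 00:00:00',
    '2015-05-04 00:21:10', '2015-05-04 11:28:02',
])
sample = pd.DataFrame({'Agency': ['HPD', 'HPD', 'NYPD', 'DOT']}, index=stamps)

# normalize() floors each timestamp to midnight, so equality means
# the original time of day was exactly 00:00:00
exactly_midnight = sample.index == sample.index.normalize()
share = exactly_midnight.mean()

# Which agencies do the exact-midnight rows belong to?
midnight_by_agency = sample[exactly_midnight].groupby('Agency').size()
print(share)
print(midnight_by_agency)
```

If the exact-midnight rows cluster in one or two agencies, that would point to how those agencies record time, not to when people actually complain.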

Looks like midnight is a little bit of an outlier. Why might that be? Take the 5 most common agencies and graph the times they file reports at (all day, not just midnight).


In [120]:
df.groupby('Agency')['Unique Key'].count().sort_values(ascending=False).head(5)


Out[120]:
Agency
NYPD     80000
HPD      39388
DOT      22308
DPR      15505
DOHMH     8250
Name: Unique Key, dtype: int64

In [139]:
top_agencies=['NYPD', 'HPD', 'DOT', 'DPR', 'DOHMH']
ax=None
for agency in top_agencies:
    agency_df=df[df['Agency']==agency]
    # Count complaints per hour of day and draw every agency on the same axes
    ax=agency_df.groupby(agency_df.index.hour)['Unique Key'].count().plot(ax=ax, legend=True, label=agency)
ax


Out[139]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbde530d4a8>
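The same comparison can be built as a single hour-by-agency table, which is sometimes easier to inspect than overlaid lines. A sketch on invented rows follows. The `hour` column is a helper added for the grouping, and the sample keeps only the top two agencies where the exercise asks for the top five.

```python
import pandas as pd

# Invented rows standing in for df
stamps = pd.to_datetime([
    '2015-05-04 00:00:00', '2015-05-04 23:10:00', '2015-05-04 11:05:00',
    '2015-05-05 00:00:00', '2015-05-06 00:00:00', '2015-05-06 11:40:00',
])
sample = pd.DataFrame({
    'Agency': ['NYPD', 'NYPD', 'NYPD', 'HPD', 'HPD', 'DOT'],
}, index=stamps)

# Most common agencies (top 2 here; head(5) for the exercise)
top = sample['Agency'].value_counts().head(2).index
subset = sample[sample['Agency'].isin(top)]

# Rows are hours of the day, columns are agencies, cells are complaint counts
hourly = (
    subset.assign(hour=subset.index.hour)
    .groupby(['hour', 'Agency'])
    .size()
    .unstack(fill_value=0)
)
print(hourly)
```

Calling `hourly.plot()` on a table like this draws one line per agency with a shared legend.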

Graph those same agencies on an annual basis - make it weekly. When do people like to complain? When does the NYPD have an odd number of complaints?


In [141]:
top_agencies=['NYPD', 'HPD', 'DOT', 'DPR', 'DOHMH']
ax=None
for agency in top_agencies:
    agency_df=df[df['Agency']==agency]
    # .index.week works on older pandas; it was deprecated in 1.1 and removed in 2.0
    # in favor of .index.isocalendar().week
    ax=agency_df.groupby(agency_df.index.week)['Unique Key'].count().plot(ax=ax, legend=True, label=agency)
ax


Out[141]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbde5307278>
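On pandas 1.1 or later, the week number can be taken from `isocalendar()` instead of the `.week` attribute. This is a sketch under that version assumption, using three invented timestamps. The week column is converted to a plain integer array before grouping so that the grouping does not depend on index alignment.

```python
import pandas as pd

# Invented timestamps: a Monday, the following Sunday, and the Monday after
stamps = pd.to_datetime(['2015-05-04 09:00', '2015-05-10 18:30', '2015-05-11 08:15'])
sample = pd.DataFrame({'Unique Key': [1, 2, 3]}, index=stamps)

# ISO week number for every row, as a plain integer array
iso_week = sample.index.isocalendar().week.astype('int64').to_numpy()

# Complaints per ISO week of the year
weekly = sample.groupby(iso_week)['Unique Key'].count()
print(weekly)
```

Grouping by week number alone folds different years together, so rows from January 2016 would land in the same buckets as January 2015.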

Maybe the NYPD deals with different issues at different times? Check the most popular complaints in July and August vs. the month of May. Also check the most common complaints for the Department of Housing Preservation and Development (HPD) in winter vs. summer.


In [142]:
nypd=df[df['Agency']=='NYPD']

In [157]:
nypd[(nypd.index.month==7) | (nypd.index.month==8)].groupby('Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)


Out[157]:
Complaint Type
Illegal Parking            3444
Blocked Driveway           3258
Noise - Street/Sidewalk    3165
Noise - Commercial         1201
Noise - Vehicle             942
Name: Complaint Type, dtype: int64

In [158]:
nypd[nypd.index.month==5].groupby('Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)


Out[158]:
Complaint Type
Blocked Driveway           4114
Illegal Parking            3975
Noise - Street/Sidewalk    3385
Noise - Commercial         2263
Noise - Vehicle            1232
Name: Complaint Type, dtype: int64

In [ ]:
# Both periods look alike to me: mostly bad parking and noise complaints, with the same five types on top

In [159]:
hpd=df[df['Agency']=='HPD']

In [162]:
hpd[(hpd.index.month>=6) & (hpd.index.month<=8)].groupby('Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)
# I would consider summer to be June to August.


Out[162]:
Complaint Type
HEAT/HOT WATER            617
UNSANITARY CONDITION      510
HPD Literature Request    462
PAINT/PLASTER             444
PLUMBING                  309
Name: Complaint Type, dtype: int64

In [163]:
hpd[(hpd.index.month==12) | (hpd.index.month<=2)].groupby('Complaint Type')['Complaint Type'].count().sort_values(ascending=False).head(5)


Out[163]:
Complaint Type
HEAT/HOT WATER          353
UNSANITARY CONDITION    190
PLUMBING                139
PAINT/PLASTER           139
DOOR/WINDOW             103
Name: Complaint Type, dtype: int64

In [ ]:
# Pretty similar lists, but DOOR/WINDOW makes the winter top five; people probably notice a draft from a bad window or door more easily in winter than in summer
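Raw counts are hard to compare across seasons when the months hold very different numbers of rows, as they do in this selection. A sketch of comparing shares instead is below, with an invented `sample` frame and a hypothetical `season_of_month` mapping that follows the June to August and December to February definitions used above.

```python
import pandas as pd

# Invented HPD-style rows
stamps = pd.to_datetime([
    '2015-01-10', '2015-01-11', '2015-12-20',
    '2015-07-04', '2015-07-05', '2015-08-01',
])
sample = pd.DataFrame({
    'Complaint Type': ['HEAT/HOT WATER', 'DOOR/WINDOW', 'HEAT/HOT WATER',
                       'HEAT/HOT WATER', 'UNSANITARY CONDITION',
                       'UNSANITARY CONDITION'],
}, index=stamps)

# Hypothetical mapping; months outside both seasons map to None and are dropped
season_of_month = {12: 'winter', 1: 'winter', 2: 'winter',
                   6: 'summer', 7: 'summer', 8: 'summer'}
labelled = sample.assign(
    season=[season_of_month.get(month) for month in sample.index.month]
).dropna(subset=['season'])

# Share of each complaint type within its season
shares = labelled.groupby('season')['Complaint Type'].value_counts(normalize=True)
print(shares)
```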