In [2]:

    
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
import dateutil.parser

First, I made a mistake naming the data set! It's 2015 data, not 2014 data. But yes, still use 311-2014.csv. You can rename it.

Importing and preparing your data

Import your data, but only the first 200,000 rows. You'll also want to change the index to be a datetime based on the Created Date column - you'll want to check if it's already a datetime, and parse it if not.



In [216]:

    
df = pd.read_csv('311-2015.csv', dtype = str)



In [217]:

    
df.head()









    Out[217]:






  
    
      
      Unique Key
      Created Date
      Closed Date
      Agency
      Agency Name
      Complaint Type
      Descriptor
      Location Type
      Incident Zip
      Incident Address
      ...
      Bridge Highway Name
      Bridge Highway Direction
      Road Ramp
      Bridge Highway Segment
      Garage Lot Name
      Ferry Direction
      Ferry Terminal Name
      Latitude
      Longitude
      Location
    
  
  
    
      0
      31015465
      07/06/2015 10:58:27 AM
      07/22/2015 01:07:20 AM
      DCA
      Department of Consumer Affairs
      Consumer Complaint
      Demand for Cash
      NaN
      11360
      27-16 203 STREET
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      40.773539552542
      -73.78823697228408
      (40.773539552542, -73.78823697228408)
    
    
      1
      30997660
      07/03/2015 01:26:29 PM
      07/03/2015 02:08:20 PM
      NYPD
      New York City Police Department
      Vending
      In Prohibited Area
      Residential Building/House
      10019
      200 CENTRAL PARK SOUTH
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      40.76702142171206
      -73.97944780718524
      (40.76702142171206, -73.97944780718524)
    
    
      2
      31950223
      11/09/2015 03:55:09 AM
      11/09/2015 08:08:57 AM
      NYPD
      New York City Police Department
      Blocked Driveway
      No Access
      Street/Sidewalk
      10453
      1993 GRAND AVENUE
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      40.85267061877697
      -73.91060771362552
      (40.85267061877697, -73.91060771362552)
    
    
      3
      31000038
      07/03/2015 02:18:32 AM
      07/03/2015 07:54:48 AM
      NYPD
      New York City Police Department
      Noise - Commercial
      Loud Music/Party
      Club/Bar/Restaurant
      11372
      84-16 NORTHERN BOULEVARD
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      40.755773786469966
      -73.88326243225418
      (40.755773786469966, -73.88326243225418)
    
    
      4
      30995614
      07/04/2015 12:03:27 AM
      07/04/2015 03:33:09 AM
      NYPD
      New York City Police Department
      Noise - Street/Sidewalk
      Loud Talking
      Street/Sidewalk
      11216
      1057 BERGEN STREET
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      40.67617516102934
      -73.9512690004692
      (40.67617516102934, -73.9512690004692)
    
  

5 rows × 53 columns



In [133]:

    
import datetime



In [134]:

    
def created_date_to_datetime(date_str):
    return datetime.datetime.strptime(date_str, '%m/%d/%Y %I:%M:%S %p')



In [218]:

    
df['created_datetime'] = df['Created Date'].apply(created_date_to_datetime)



In [219]:

    
df = df.set_index('created_datetime')

What was the most popular type of complaint, and how many times was it filed?



In [220]:

    
freq_complaints = df[['Unique Key', 'Complaint Type']].groupby('Complaint Type').count().sort_values('Unique Key', ascending=False).head()



In [221]:

    
freq_complaints









    Out[221]:






  
    
      
      Unique Key
    
    
      Complaint Type
      
    
  
  
    
      HEAT/HOT WATER
      225706
    
    
      Noise - Residential
      208026
    
    
      Street Condition
      124379
    
    
      Blocked Driveway
      100881
    
    
      Street Light Condition
      98106

Make a horizontal bar graph of the top 5 most frequent complaint types.



In [222]:

    
ax = freq_complaints.plot(kind = 'barh', legend = False)
ax.set_title('5 Most Frequent 311 Complaints')
ax.set_xlabel('Number of Complaints in 2015')









    Out[222]:





<matplotlib.text.Text at 0x11fe9a940>

Which borough has the most complaints per capita? Since it's only 5 boroughs, you can do the math manually.



In [223]:

    
df[['Unique Key', 'Borough']].groupby('Borough').count().sort_values('Unique Key', ascending = False)









    Out[223]:






  
    
      
      Unique Key
    
    
      Borough
      
    
  
  
    
      BROOKLYN
      684215
    
    
      QUEENS
      515349
    
    
      MANHATTAN
      473782
    
    
      BRONX
      410768
    
    
      STATEN ISLAND
      106796
    
    
      Unspecified
      95841

According to your selection of data, how many cases were filed in March? How about May?



In [224]:

    
cases_in_mar = df[df.index.month == 3]['Unique Key'].count()
print('There were', cases_in_mar, 'cases filed in March.')









    



There were 206907 cases filed in March.



In [225]:

    
cases_in_may = df[df.index.month == 5]['Unique Key'].count()
print('There were', cases_in_may, 'cases filed in May.')









    



There were 187137 cases filed in May.

I'd like to see all of the 311 complaints called in on April 1st.

Surprise! We couldn't do this in class, but it was just a limitation of our data set



In [226]:

    
april_1_complaints = df[(df.index.month == 4) & (df.index.day == 1)][['Unique Key', 'Created Date', 'Complaint Type', 'Descriptor']]
april_1_complaints









    Out[226]:






  
    
      
      Unique Key
      Created Date
      Complaint Type
      Descriptor
    
    
      created_datetime
      
      
      
      
    
  
  
    
      2015-04-01 21:37:42
      30311691
      04/01/2015 09:37:42 PM
      Illegal Parking
      Blocked Sidewalk
    
    
      2015-04-01 23:12:04
      30307701
      04/01/2015 11:12:04 PM
      Noise - Commercial
      Loud Music/Party
    
    
      2015-04-01 13:10:35
      30313389
      04/01/2015 01:10:35 PM
      Root/Sewer/Sidewalk Condition
      Trees and Sidewalks Program
    
    
      2015-04-01 17:37:38
      30314393
      04/01/2015 05:37:38 PM
      Maintenance or Facility
      Hours of Operation
    
    
      2015-04-01 12:32:40
      30309207
      04/01/2015 12:32:40 PM
      Consumer Complaint
      Installation/Work Quality
    
    
      2015-04-01 18:44:50
      30311759
      04/01/2015 06:44:50 PM
      Damaged Tree
      Entire Tree Has Fallen Down
    
    
      2015-04-01 16:30:15
      30309690
      04/01/2015 04:30:15 PM
      Animal Abuse
      Neglected
    
    
      2015-04-01 09:04:07
      30307990
      04/01/2015 09:04:07 AM
      SCRIE
      Miscellaneous
    
    
      2015-04-01 07:46:58
      30308253
      04/01/2015 07:46:58 AM
      Blocked Driveway
      No Access
    
    
      2015-04-01 17:12:17
      30314214
      04/01/2015 05:12:17 PM
      Highway Condition
      Pothole - Highway
    
    
      2015-04-01 21:30:48
      30307111
      04/01/2015 09:30:48 PM
      Food Establishment
      Food Temperature
    
    
      2015-04-01 15:51:04
      30311571
      04/01/2015 03:51:04 PM
      Maintenance or Facility
      Hours of Operation
    
    
      2015-04-01 10:43:28
      30313817
      04/01/2015 10:43:28 AM
      Damaged Tree
      Branch Cracked and Will Fall
    
    
      2015-04-01 15:12:46
      30308922
      04/01/2015 03:12:46 PM
      Food Establishment
      Letter Grading
    
    
      2015-04-01 06:15:42
      30311132
      04/01/2015 06:15:42 AM
      Highway Condition
      Pothole - Highway
    
    
      2015-04-01 11:28:02
      30308180
      04/01/2015 11:28:02 AM
      Highway Condition
      Pothole - Highway
    
    
      2015-04-01 17:35:18
      30313207
      04/01/2015 05:35:18 PM
      Food Establishment
      Rodents/Insects/Garbage
    
    
      2015-04-01 13:54:54
      30310017
      04/01/2015 01:54:54 PM
      SCRIE
      Miscellaneous
    
    
      2015-04-01 23:49:33
      30306774
      04/01/2015 11:49:33 PM
      Noise - Commercial
      Loud Music/Party
    
    
      2015-04-01 07:50:49
      30313339
      04/01/2015 07:50:49 AM
      Street Condition
      Rough, Pitted or Cracked Roads
    
    
      2015-04-01 13:50:29
      30312146
      04/01/2015 01:50:29 PM
      Food Establishment
      Rodents/Insects/Garbage
    
    
      2015-04-01 16:14:19
      30313259
      04/01/2015 04:14:19 PM
      Benefit Card Replacement
      Medicaid
    
    
      2015-04-01 19:27:34
      30308920
      04/01/2015 07:27:34 PM
      Noise - Street/Sidewalk
      Loud Music/Party
    
    
      2015-04-01 05:30:02
      30314164
      04/01/2015 05:30:02 AM
      Highway Condition
      Pothole - Highway
    
    
      2015-04-01 10:33:26
      30311790
      04/01/2015 10:33:26 AM
      Illegal Parking
      Blocked Sidewalk
    
    
      2015-04-01 11:47:38
      30310940
      04/01/2015 11:47:38 AM
      SCRIE
      Miscellaneous
    
    
      2015-04-01 11:01:27
      30310409
      04/01/2015 11:01:27 AM
      Consumer Complaint
      Exchange/Refund/Return
    
    
      2015-04-01 08:51:52
      30310350
      04/01/2015 08:51:52 AM
      Consumer Complaint
      Cars Parked on Sidewalk/Street
    
    
      2015-04-01 14:58:55
      30313106
      04/01/2015 02:58:55 PM
      SCRIE
      Rent Discrepancy
    
    
      2015-04-01 16:59:19
      30309324
      04/01/2015 04:59:19 PM
      Blocked Driveway
      Partial Access
    
    
      ...
      ...
      ...
      ...
      ...
    
    
      2015-04-01 10:10:40
      30313973
      04/01/2015 10:10:40 AM
      DOF Property - Owner Issue
      Billing Address Incorrect
    
    
      2015-04-01 09:45:31
      30314014
      04/01/2015 09:45:31 AM
      DOF Property - Reduction Issue
      Commercial J51 Exemption
    
    
      2015-04-01 10:39:40
      30314031
      04/01/2015 10:39:40 AM
      Noise - Residential
      Banging/Pounding
    
    
      2015-04-01 14:02:44
      30314032
      04/01/2015 02:02:44 PM
      Noise - Residential
      Banging/Pounding
    
    
      2015-04-01 12:57:43
      30314033
      04/01/2015 12:57:43 PM
      Noise - Residential
      Loud Music/Party
    
    
      2015-04-01 21:13:06
      30314044
      04/01/2015 09:13:06 PM
      Noise - Residential
      Loud Music/Party
    
    
      2015-04-01 15:55:36
      30314045
      04/01/2015 03:55:36 PM
      Noise - Residential
      Loud Music/Party
    
    
      2015-04-01 21:19:35
      30314046
      04/01/2015 09:19:35 PM
      Noise - Residential
      Loud Music/Party
    
    
      2015-04-01 18:56:00
      30314047
      04/01/2015 06:56:00 PM
      Noise - Residential
      Loud Talking
    
    
      2015-04-01 03:29:02
      30314048
      04/01/2015 03:29:02 AM
      Noise - Residential
      Loud Talking
    
    
      2015-04-01 03:03:18
      30314071
      04/01/2015 03:03:18 AM
      Noise - Residential
      Loud Music/Party
    
    
      2015-04-01 21:36:56
      30314072
      04/01/2015 09:36:56 PM
      Noise - Residential
      Loud Television
    
    
      2015-04-01 16:59:18
      30314077
      04/01/2015 04:59:18 PM
      Housing Options
      Assisted Living
    
    
      2015-04-01 11:51:47
      30314087
      04/01/2015 11:51:47 AM
      DOF Parking - Request Copy
      List of Outstanding Tickets
    
    
      2015-04-01 10:04:23
      30314088
      04/01/2015 10:04:23 AM
      DOF Property - Owner Issue
      Billing Address Incorrect
    
    
      2015-04-01 14:31:19
      30314090
      04/01/2015 02:31:19 PM
      DOF Property - Request Copy
      Copy of Account Information
    
    
      2015-04-01 13:02:25
      30314091
      04/01/2015 01:02:25 PM
      DOF Property - Request Copy
      Copy of Statement
    
    
      2015-04-01 12:48:02
      30314092
      04/01/2015 12:48:02 PM
      DOF Property - Request Copy
      Copy of Statement
    
    
      2015-04-01 09:42:34
      30314234
      04/01/2015 09:42:34 AM
      Home Delivered Meal Complaint
      NaN
    
    
      2015-04-01 14:54:23
      30314238
      04/01/2015 02:54:23 PM
      DOF Parking - Request Copy
      Decision and Order
    
    
      2015-04-01 17:41:01
      30314261
      04/01/2015 05:41:01 PM
      Housing - Low Income Senior
      NaN
    
    
      2015-04-01 15:53:30
      30314291
      04/01/2015 03:53:30 PM
      DOF Property - Payment Issue
      Interest Dispute
    
    
      2015-04-01 10:31:33
      30314440
      04/01/2015 10:31:33 AM
      New Tree Request
      For One Address
    
    
      2015-04-01 13:55:04
      30314441
      04/01/2015 01:55:04 PM
      New Tree Request
      For One Address
    
    
      2015-04-01 10:16:08
      30314442
      04/01/2015 10:16:08 AM
      New Tree Request
      For One Address
    
    
      2015-04-01 10:24:52
      30314499
      04/01/2015 10:24:52 AM
      New Tree Request
      For One Address
    
    
      2015-04-01 15:01:35
      30314500
      04/01/2015 03:01:35 PM
      New Tree Request
      For One Address
    
    
      2015-04-01 15:26:36
      30314519
      04/01/2015 03:26:36 PM
      New Tree Request
      For One Address
    
    
      2015-04-01 11:28:23
      30314540
      04/01/2015 11:28:23 AM
      DOF Property - Payment Issue
      Other Billing Issue
    
    
      2015-04-01 10:09:13
      30314547
      04/01/2015 10:09:13 AM
      DOF Property - Payment Issue
      Other Billing Issue
    
  

6801 rows × 4 columns

What was the most popular type of complaint on April 1st?



In [227]:

    
april_1_complaints[['Unique Key', 'Complaint Type']].groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head(1)









    Out[227]:






  
    
      
      Unique Key
    
    
      Complaint Type
      
    
  
  
    
      Street Condition
      839

What were the most popular three types of complaint on April 1st



In [228]:

    
april_1_complaints[['Unique Key', 'Complaint Type']].groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head(3)









    Out[228]:






  
    
      
      Unique Key
    
    
      Complaint Type
      
    
  
  
    
      Street Condition
      839
    
    
      HEAT/HOT WATER
      581
    
    
      Noise - Residential
      330

What month has the most reports filed? How many? Graph it.



In [229]:

    
complaints_by_month = df['Unique Key'].groupby(df.index.month).count()
complaints_by_month









    Out[229]:





1     190501
2     202520
3     206907
4     182611
5     187137
6     198527
7     194254
8     185412
9     182099
10    197757
11    183091
12    175935
Name: Unique Key, dtype: int64



In [230]:

    
ax = complaints_by_month.plot()
ax.set_xticks([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
ax.set_xticklabels(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
ax.set_ylabel('Number of Complaints')
ax.set_title('311 Complaints by Month in 2015')

# x_values = df.groupby(df.index.month).median().index
min_values = 0
max_values = complaints_by_month
ax.fill_between(x_values, min_values, max_values, alpha = 0.4)









    Out[230]:





<matplotlib.collections.PolyCollection at 0x12cf887f0>

What week of the year has the most reports filed? How many? Graph the weekly complaints.



In [231]:

    
complaints_by_week = df['Unique Key'].groupby(df.index.week).count()
complaints_by_week









    Out[231]:





1     17522
2     54405
3     40467
4     39814
5     43132
6     49582
7     45936
8     55971
9     50964
10    47544
11    50035
12    46601
13    43726
14    41302
15    42120
16    42385
17    44138
18    42740
19    43769
20    42648
21    41321
22    43502
23    44435
24    45616
25    46575
26    46653
27    42723
28    43225
29    44749
30    44508
31    44164
32    43696
33    41367
34    42344
35    41424
36    43196
37    41233
38    42804
39    42059
40    45012
41    41819
42    44769
43    46320
44    43901
45    41293
46    44877
47    45365
48    39159
49    44761
50    41230
51    41442
52    32015
53    24393
Name: Unique Key, dtype: int64



In [232]:

    
ax = complaints_by_week.plot()
ax.set_xticks(range(1,53))
ax.set_xticklabels(['', '', '', '', '5', 
                    '', '', '', '', '10', 
                    '', '', '', '', '15',
                   '', '', '', '', '20',
                   '', '', '', '', '25',
                   '', '', '', '', '30',
                   '', '', '', '', '35',
                   '', '', '', '', '40',
                   '', '', '', '', '45',
                   '', '', '', '', '50',
                   '', '',])
ax.set_ylabel('Number of Complaints')
ax.set_xlabel('Week of the Year')
ax.set_title('311 Complaints by Week in 2015')









    Out[232]:





<matplotlib.text.Text at 0x112372780>

Noise complaints are a big deal. Use .str.contains to select noise complaints, and make an chart of when they show up annually. Then make a chart about when they show up every day (cyclic).



In [233]:

    
noise_complaints = df[df['Complaint Type'].str.contains('Noise') == True]



In [234]:

    
noise_complaints_hour = noise_complaints['Unique Key'].groupby(noise_complaints.index.hour).count()



In [235]:

    
ax = noise_complaints_hour.plot()
ax.set_xticks(range(0,24))
ax.set_xticklabels(['Midnight', '', '', '', '', '', 
                    '6 am', '', '', '', '', '',
                    'Noon', '', '', '', '', '',
                    '6 pm', '', '', '', '', ''])
ax.set_ylabel('Number of Complaints')
ax.set_xlabel('Time of Day')
ax.set_title('Noise Complaints by Time of Day in 2015')









    Out[235]:





<matplotlib.text.Text at 0x113f87c88>

Which were the top five days of the year for filing complaints? How many on each of those days? Graph it.



In [236]:

    
top_complaining_days = df['Unique Key'].resample('D').count().sort_values(ascending = False).head()
top_complaining_days









    Out[236]:





created_datetime
2015-01-08    11318
2015-02-16    10434
2015-11-13     9818
2015-10-19     9662
2015-02-20     9438
Name: Unique Key, dtype: int64



In [237]:

    
ax = top_complaining_days.plot(kind = 'barh')
ax.set_ylabel('Date')
ax.set_xlabel('Number of Complaints')
ax.set_title('The Top 5 Days for 311 Complaints in 2015')









    Out[237]:





<matplotlib.text.Text at 0x11503fb70>



In [238]:

    
complaining_days = df['Unique Key'].resample('D').count()
ax = complaining_days.plot()
ax.set_ylabel('Number of Complaints')
ax.set_xlabel('Day of Year')
ax.set_title('Noise Complaints by Day in 2015')









    Out[238]:





<matplotlib.text.Text at 0x115866668>

Interesting—it looks cyclical. Let's see what day of the week is most popular:



In [243]:

    
df.index[1]









    Out[243]:





Timestamp('2015-07-03 13:26:29')



In [254]:

    
def get_day_of_wk(timestamp):
    return datetime.datetime.strftime(timestamp, '%a')



In [255]:

    
df['datetime'] = df.index



In [256]:

    
df['day_of_wk'] = df['datetime'].apply(get_day_of_wk)



In [258]:

    
complaining_day_of_wk = df[['Unique Key', 'day_of_wk']].groupby('day_of_wk').count()



In [272]:

    
complaining_day_of_wk['number_of_day'] = [6, 2, 7, 1, 5, 3, 4]



In [275]:

    
complaining_day_of_wk_sorted = complaining_day_of_wk.sort_values('number_of_day')



In [277]:

    
complaining_day_of_wk_sorted









    Out[277]:






  
    
      
      Unique Key
      number_of_day
    
    
      day_of_wk
      
      
    
  
  
    
      Sun
      251452
      1
    
    
      Mon
      367697
      2
    
    
      Tue
      364313
      3
    
    
      Wed
      357175
      4
    
    
      Thu
      345647
      5
    
    
      Fri
      333317
      6
    
    
      Sat
      267150
      7



In [282]:

    
ax = complaining_day_of_wk_sorted.plot(y = 'Unique Key', legend = False)

What hour of the day are the most complaints? Graph a day of complaints.



In [283]:

    
hourly_complaints = df['Unique Key'].groupby(df.index.hour).count()



In [284]:

    
ax = hourly_complaints.plot()
ax.set_xticks(range(0,24))
ax.set_xticklabels(['Midnight', '', '', '', '', '', 
                    '6 am', '', '', '', '', '',
                    'Noon', '', '', '', '', '',
                    '6 pm', '', '', '', '', ''])
ax.set_ylabel('Number of Complaints')
ax.set_xlabel('Time of Day')
ax.set_title('311 Complaints by Time of Day in 2015')









    Out[284]:





<matplotlib.text.Text at 0x119319390>

One of the hours has an odd number of complaints. What are the most common complaints at that hour, and what are the most common complaints the hour before and after?



In [285]:

    
# 11 pm
df[df.index.hour == 23][['Unique Key', 'Complaint Type', 'Descriptor']].groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head()









    Out[285]:






  
    
      
      Unique Key
      Descriptor
    
    
      Complaint Type
      
      
    
  
  
    
      Noise - Residential
      27576
      27576
    
    
      Noise - Street/Sidewalk
      7293
      7293
    
    
      Noise - Commercial
      6955
      6955
    
    
      Blocked Driveway
      4775
      4775
    
    
      Illegal Parking
      4373
      4373



In [286]:

    
# 12 am
df[df.index.hour == 0][['Unique Key', 'Complaint Type', 'Descriptor']].groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head()









    Out[286]:






  
    
      
      Unique Key
      Descriptor
    
    
      Complaint Type
      
      
    
  
  
    
      HEAT/HOT WATER
      142330
      142330
    
    
      Rodent
      28567
      28567
    
    
      PAINT/PLASTER
      28165
      28165
    
    
      UNSANITARY CONDITION
      27962
      27962
    
    
      PLUMBING
      24137
      24137



In [287]:

    
# 1 am
df[df.index.hour == 1][['Unique Key', 'Complaint Type', 'Descriptor']].groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head()









    Out[287]:






  
    
      
      Unique Key
      Descriptor
    
    
      Complaint Type
      
      
    
  
  
    
      Noise - Residential
      16893
      16893
    
    
      Noise - Commercial
      4685
      4685
    
    
      Noise - Street/Sidewalk
      3819
      3819
    
    
      Noise
      2282
      2282
    
    
      Blocked Driveway
      2267
      2267

So odd. What's the per-minute breakdown of complaints between 12am and 1am? You don't need to include 1am.



In [291]:

    
midnight_complaints = df[df.index.hour == 0][['Unique Key', 'Complaint Type']]

for minute in range(0, 60):
    top_complaint = midnight_complaints[midnight_complaints.index.minute == minute].groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head(1)
    if minute < 10:
        minute = '0' + str(minute)
    else:
        minute = str(minute)
    print('12:' + minute + '\'s top complaint was:', top_complaint)
    print('')
# hourly_complaints = df['Unique Key'].groupby(df.index.hour).count()









    



12:00's top complaint was:                 Unique Key
Complaint Type            
HEAT/HOT WATER      140688

12:01's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         366

12:02's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         372

12:03's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         387

12:04's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         390

12:05's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         441

12:06's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         429

12:07's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         398

12:08's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         415

12:09's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         384

12:10's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         405

12:11's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         411

12:12's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         383

12:13's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         386

12:14's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         379

12:15's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         417

12:16's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         382

12:17's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         403

12:18's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         393

12:19's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         365

12:20's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         379

12:21's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         371

12:22's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         398

12:23's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         383

12:24's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         387

12:25's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         351

12:26's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         384

12:27's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         353

12:28's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         354

12:29's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         411

12:30's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         345

12:31's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         380

12:32's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         412

12:33's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         351

12:34's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         362

12:35's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         403

12:36's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         359

12:37's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         378

12:38's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         374

12:39's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         356

12:40's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         354

12:41's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         339

12:42's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         365

12:43's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         382

12:44's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         341

12:45's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         348

12:46's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         369

12:47's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         317

12:48's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         310

12:49's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         361

12:50's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         326

12:51's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         312

12:52's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         351

12:53's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         356

12:54's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         337

12:55's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         355

12:56's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         305

12:57's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         361

12:58's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         367

12:59's top complaint was:                      Unique Key
Complaint Type                 
Noise - Residential         296

Looks like midnight is a little bit of an outlier. Why might that be? Take the 5 most common agencies and graph the times they file reports at (all day, not just midnight).

5 Top Agencies:



In [292]:

    
df[['Unique Key', 'Agency']].groupby('Agency').count().sort_values('Unique Key', ascending = False).head()



In [293]:

    
def agency_hourly_complaints(agency_name_str):
    agency_complaints = df[df['Agency'] == agency_name_str]
    return agency_complaints['Unique Key'].groupby(agency_complaints.index.hour).count()



In [294]:

    
ax = agency_hourly_complaints('HPD').plot(label = 'HPD', legend = True)
for x in ['NYPD', 'DOT', 'DEP', 'DSNY']:
    agency_hourly_complaints(x).plot(ax = ax, label = x, legend = True)

Graph those same agencies on an annual basis - make it weekly. When do people like to complain? When does the NYPD have an odd number of complaints?



In [295]:

    
def agency_weekly_complaints(agency_name_str):
    agency_complaints = df[df['Agency'] == agency_name_str]
    return agency_complaints['Unique Key'].groupby(agency_complaints.index.week).count()



In [296]:

    
ax = agency_weekly_complaints('NYPD').plot(label = 'NYPD', legend = True)
for x in ['DOT', 'HPD', 'DPR', 'DSNY']:
    agency_weekly_complaints(x).plot(ax = ax, label = x, legend = True)



In [299]:

    
NYPD_complaints = df[df['Agency'] == 'NYPD']
NYPD_weekly_complaints = NYPD_complaints['Unique Key'].groupby(NYPD_complaints.index.week).count()
NYPD_weekly_complaints[NYPD_weekly_complaints > 13000]









    Out[299]:





19    13959
22    14404
23    13286
27    13836
35    13048
38    13198
39    13966
Name: Unique Key, dtype: int64



In [300]:

    
# # a way to use the function agency_weekly_complaints that's actually longer than not using it.
# week_number = 0
# for week in agency_weekly_complaints('NYPD'):
#     week_number += 1
#     if week > 1500:
#         print('In week', week_number)
#         print('there were', week, 'complaints.')
#         print('')

It looks like complaints are most popular in May, June, September—generally in the summer.



In [301]:

    
NYPD_weekly_complaints[NYPD_weekly_complaints < 6000]









    Out[301]:





1     5590
53    5505
Name: Unique Key, dtype: int64

It looks like complaints are least popular in around Christmas and New Year's.

Maybe the NYPD deals with different issues at different times? Check the most popular complaints in July and August vs the month of May. Also check the most common complaints for the Housing Preservation Bureau (HPD) in winter vs. summer.

Most popular NYPD complaints in July and August:



In [302]:

    
NYPD_complaints = df[df['Agency'] == 'NYPD']
NYPD_jul_aug_complaints = NYPD_complaints[(NYPD_complaints.index.month == 7) | (NYPD_complaints.index.month == 8)][['Unique Key', 'Complaint Type']]
NYPD_jul_aug_complaints.groupby('Complaint Type').count().sort_values('Unique Key', ascending = False).head()









    Out[302]:






  
    
      
      Unique Key
    
    
      Complaint Type
      
    
  
  
    
      Noise - Residential
      37509
    
    
      Illegal Parking
      16287
    
    
      Noise - Street/Sidewalk
      15463
    
    
      Blocked Driveway
      15428
    
    
      Noise - Commercial
      5973

	Unique Key
Complaint Type
UNSANITARY CONDITION	23687
PAINT/PLASTER	17653
PLUMBING	12617
DOOR/WINDOW	9440
WATER LEAK	8679

	Unique Key	Created Date	Closed Date	Agency	Agency Name	Complaint Type	Descriptor	Location Type	Incident Zip	Incident Address	...	Bridge Highway Name	Bridge Highway Direction	Road Ramp	Bridge Highway Segment	Garage Lot Name	Ferry Direction	Ferry Terminal Name	Latitude	Longitude	Location
0	31015465	07/06/2015 10:58:27 AM	07/22/2015 01:07:20 AM	DCA	Department of Consumer Affairs	Consumer Complaint	Demand for Cash	NaN	11360	27-16 203 STREET	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	40.773539552542	-73.78823697228408	(40.773539552542, -73.78823697228408)
1	30997660	07/03/2015 01:26:29 PM	07/03/2015 02:08:20 PM	NYPD	New York City Police Department	Vending	In Prohibited Area	Residential Building/House	10019	200 CENTRAL PARK SOUTH	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	40.76702142171206	-73.97944780718524	(40.76702142171206, -73.97944780718524)
2	31950223	11/09/2015 03:55:09 AM	11/09/2015 08:08:57 AM	NYPD	New York City Police Department	Blocked Driveway	No Access	Street/Sidewalk	10453	1993 GRAND AVENUE	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	40.85267061877697	-73.91060771362552	(40.85267061877697, -73.91060771362552)
3	31000038	07/03/2015 02:18:32 AM	07/03/2015 07:54:48 AM	NYPD	New York City Police Department	Noise - Commercial	Loud Music/Party	Club/Bar/Restaurant	11372	84-16 NORTHERN BOULEVARD	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	40.755773786469966	-73.88326243225418	(40.755773786469966, -73.88326243225418)
4	30995614	07/04/2015 12:03:27 AM	07/04/2015 03:33:09 AM	NYPD	New York City Police Department	Noise - Street/Sidewalk	Loud Talking	Street/Sidewalk	11216	1057 BERGEN STREET	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	40.67617516102934	-73.9512690004692	(40.67617516102934, -73.9512690004692)

	Unique Key
Complaint Type
HEAT/HOT WATER	225706
Noise - Residential	208026
Street Condition	124379
Blocked Driveway	100881
Street Light Condition	98106

	Unique Key
Borough
BROOKLYN	684215
QUEENS	515349
MANHATTAN	473782
BRONX	410768
STATEN ISLAND	106796
Unspecified	95841

	Unique Key	Created Date	Complaint Type	Descriptor
created_datetime
2015-04-01 21:37:42	30311691	04/01/2015 09:37:42 PM	Illegal Parking	Blocked Sidewalk
2015-04-01 23:12:04	30307701	04/01/2015 11:12:04 PM	Noise - Commercial	Loud Music/Party
2015-04-01 13:10:35	30313389	04/01/2015 01:10:35 PM	Root/Sewer/Sidewalk Condition	Trees and Sidewalks Program
2015-04-01 17:37:38	30314393	04/01/2015 05:37:38 PM	Maintenance or Facility	Hours of Operation
2015-04-01 12:32:40	30309207	04/01/2015 12:32:40 PM	Consumer Complaint	Installation/Work Quality
2015-04-01 18:44:50	30311759	04/01/2015 06:44:50 PM	Damaged Tree	Entire Tree Has Fallen Down
2015-04-01 16:30:15	30309690	04/01/2015 04:30:15 PM	Animal Abuse	Neglected
2015-04-01 09:04:07	30307990	04/01/2015 09:04:07 AM	SCRIE	Miscellaneous
2015-04-01 07:46:58	30308253	04/01/2015 07:46:58 AM	Blocked Driveway	No Access
2015-04-01 17:12:17	30314214	04/01/2015 05:12:17 PM	Highway Condition	Pothole - Highway
2015-04-01 21:30:48	30307111	04/01/2015 09:30:48 PM	Food Establishment	Food Temperature
2015-04-01 15:51:04	30311571	04/01/2015 03:51:04 PM	Maintenance or Facility	Hours of Operation
2015-04-01 10:43:28	30313817	04/01/2015 10:43:28 AM	Damaged Tree	Branch Cracked and Will Fall
2015-04-01 15:12:46	30308922	04/01/2015 03:12:46 PM	Food Establishment	Letter Grading
2015-04-01 06:15:42	30311132	04/01/2015 06:15:42 AM	Highway Condition	Pothole - Highway
2015-04-01 11:28:02	30308180	04/01/2015 11:28:02 AM	Highway Condition	Pothole - Highway
2015-04-01 17:35:18	30313207	04/01/2015 05:35:18 PM	Food Establishment	Rodents/Insects/Garbage
2015-04-01 13:54:54	30310017	04/01/2015 01:54:54 PM	SCRIE	Miscellaneous
2015-04-01 23:49:33	30306774	04/01/2015 11:49:33 PM	Noise - Commercial	Loud Music/Party
2015-04-01 07:50:49	30313339	04/01/2015 07:50:49 AM	Street Condition	Rough, Pitted or Cracked Roads
2015-04-01 13:50:29	30312146	04/01/2015 01:50:29 PM	Food Establishment	Rodents/Insects/Garbage
2015-04-01 16:14:19	30313259	04/01/2015 04:14:19 PM	Benefit Card Replacement	Medicaid
2015-04-01 19:27:34	30308920	04/01/2015 07:27:34 PM	Noise - Street/Sidewalk	Loud Music/Party
2015-04-01 05:30:02	30314164	04/01/2015 05:30:02 AM	Highway Condition	Pothole - Highway
2015-04-01 10:33:26	30311790	04/01/2015 10:33:26 AM	Illegal Parking	Blocked Sidewalk
2015-04-01 11:47:38	30310940	04/01/2015 11:47:38 AM	SCRIE	Miscellaneous
2015-04-01 11:01:27	30310409	04/01/2015 11:01:27 AM	Consumer Complaint	Exchange/Refund/Return
2015-04-01 08:51:52	30310350	04/01/2015 08:51:52 AM	Consumer Complaint	Cars Parked on Sidewalk/Street
2015-04-01 14:58:55	30313106	04/01/2015 02:58:55 PM	SCRIE	Rent Discrepancy
2015-04-01 16:59:19	30309324	04/01/2015 04:59:19 PM	Blocked Driveway	Partial Access
...	...	...	...	...
2015-04-01 10:10:40	30313973	04/01/2015 10:10:40 AM	DOF Property - Owner Issue	Billing Address Incorrect
2015-04-01 09:45:31	30314014	04/01/2015 09:45:31 AM	DOF Property - Reduction Issue	Commercial J51 Exemption
2015-04-01 10:39:40	30314031	04/01/2015 10:39:40 AM	Noise - Residential	Banging/Pounding
2015-04-01 14:02:44	30314032	04/01/2015 02:02:44 PM	Noise - Residential	Banging/Pounding
2015-04-01 12:57:43	30314033	04/01/2015 12:57:43 PM	Noise - Residential	Loud Music/Party
2015-04-01 21:13:06	30314044	04/01/2015 09:13:06 PM	Noise - Residential	Loud Music/Party
2015-04-01 15:55:36	30314045	04/01/2015 03:55:36 PM	Noise - Residential	Loud Music/Party
2015-04-01 21:19:35	30314046	04/01/2015 09:19:35 PM	Noise - Residential	Loud Music/Party
2015-04-01 18:56:00	30314047	04/01/2015 06:56:00 PM	Noise - Residential	Loud Talking
2015-04-01 03:29:02	30314048	04/01/2015 03:29:02 AM	Noise - Residential	Loud Talking
2015-04-01 03:03:18	30314071	04/01/2015 03:03:18 AM	Noise - Residential	Loud Music/Party
2015-04-01 21:36:56	30314072	04/01/2015 09:36:56 PM	Noise - Residential	Loud Television
2015-04-01 16:59:18	30314077	04/01/2015 04:59:18 PM	Housing Options	Assisted Living
2015-04-01 11:51:47	30314087	04/01/2015 11:51:47 AM	DOF Parking - Request Copy	List of Outstanding Tickets
2015-04-01 10:04:23	30314088	04/01/2015 10:04:23 AM	DOF Property - Owner Issue	Billing Address Incorrect
2015-04-01 14:31:19	30314090	04/01/2015 02:31:19 PM	DOF Property - Request Copy	Copy of Account Information
2015-04-01 13:02:25	30314091	04/01/2015 01:02:25 PM	DOF Property - Request Copy	Copy of Statement
2015-04-01 12:48:02	30314092	04/01/2015 12:48:02 PM	DOF Property - Request Copy	Copy of Statement
2015-04-01 09:42:34	30314234	04/01/2015 09:42:34 AM	Home Delivered Meal Complaint	NaN
2015-04-01 14:54:23	30314238	04/01/2015 02:54:23 PM	DOF Parking - Request Copy	Decision and Order
2015-04-01 17:41:01	30314261	04/01/2015 05:41:01 PM	Housing - Low Income Senior	NaN
2015-04-01 15:53:30	30314291	04/01/2015 03:53:30 PM	DOF Property - Payment Issue	Interest Dispute
2015-04-01 10:31:33	30314440	04/01/2015 10:31:33 AM	New Tree Request	For One Address
2015-04-01 13:55:04	30314441	04/01/2015 01:55:04 PM	New Tree Request	For One Address
2015-04-01 10:16:08	30314442	04/01/2015 10:16:08 AM	New Tree Request	For One Address
2015-04-01 10:24:52	30314499	04/01/2015 10:24:52 AM	New Tree Request	For One Address
2015-04-01 15:01:35	30314500	04/01/2015 03:01:35 PM	New Tree Request	For One Address
2015-04-01 15:26:36	30314519	04/01/2015 03:26:36 PM	New Tree Request	For One Address
2015-04-01 11:28:23	30314540	04/01/2015 11:28:23 AM	DOF Property - Payment Issue	Other Billing Issue
2015-04-01 10:09:13	30314547	04/01/2015 10:09:13 AM	DOF Property - Payment Issue	Other Billing Issue

	Unique Key	number_of_day
day_of_wk
Sun	251452	1
Mon	367697	2
Tue	364313	3
Wed	357175	4
Thu	345647	5
Fri	333317	6
Sat	267150	7

	Unique Key	Descriptor
Complaint Type
Noise - Residential	27576	27576
Noise - Street/Sidewalk	7293	7293
Noise - Commercial	6955	6955
Blocked Driveway	4775	4775
Illegal Parking	4373	4373

	Unique Key	Descriptor
Complaint Type
HEAT/HOT WATER	142330	142330
Rodent	28567	28567
PAINT/PLASTER	28165	28165
UNSANITARY CONDITION	27962	27962
PLUMBING	24137	24137

	Unique Key	Descriptor
Complaint Type
Noise - Residential	16893	16893
Noise - Commercial	4685	4685
Noise - Street/Sidewalk	3819	3819
Noise	2282	2282
Blocked Driveway	2267	2267

	Unique Key
Complaint Type
Noise - Residential	37509
Illegal Parking	16287
Noise - Street/Sidewalk	15463
Blocked Driveway	15428
Noise - Commercial	5973

	Unique Key
Complaint Type
Noise - Residential	22923
Illegal Parking	8105
Blocked Driveway	8103
Noise - Street/Sidewalk	7157
Noise - Commercial	4922

	Unique Key
Complaint Type
HEAT/HOT WATER	121055
UNSANITARY CONDITION	18118
PAINT/PLASTER	16632
PLUMBING	15799
DOOR/WINDOW	10801