Atlanta Police Department

The Atlanta Police Department provides Part 1 crime data at http://www.atlantapd.org/i-want-to/crime-data-downloads

A recent copy of the data file is stored in the cluster. Please, do not copy this data file into your home directory!

Introduction

This notebooks leads into an exploration of public crime data provided by the Atlanta Police Department.
The original data set and supplemental information can be found at http://www.atlantapd.org/i-want-to/crime-data-downloads
The data set is available on ARC, please, don't download into your home directory on ARC!



In [2]:

    
import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt



In [3]:

    
# load data set
df = pd.read_csv('/home/data/APD/COBRA-YTD2017.csv.gz')
print "Shape of table: ", df.shape









    



Shape of table:  (17425, 23)

Let's look at the structure of this table. We're actually creating some text output that can be used to create a data dictionary.



In [ ]:

    
dataDict = pd.DataFrame({'DataType': df.dtypes.values, 'Description': '', }, index=df.columns.values)

We need to enter the descriptions for each entry in our dictionary manually. However, why not just create a the Python code automatically...

Run the code below only if you haven't edited the datadict.py file in a different way, since it will overwrite what you have so far. (That's why the code is commented-out.)



In [ ]:

    
# with open("datadict.py", "w") as io:
#     for i in dataDict.index:
#         io.write("dataDict.loc['%s'].Description = '' # type: %s\n" % (i, str(dataDict.loc[i].DataType)))



In [ ]:

    
# %load datadict.py
dataDict.loc['MI_PRINX'].Description = '' # type: int64
dataDict.loc['offense_id'].Description = 'Unique ID in the format YYDDDNNNN with the year YY, the day of the year DDD and a counter NNNN' # type: int64
dataDict.loc['rpt_date'].Description = 'Date the crime was reported' # type: object
dataDict.loc['occur_date'].Description = 'Estimated date when the crime occured' # type: object
dataDict.loc['occur_time'].Description = 'Estimated time when the crime occured' # type: object
dataDict.loc['poss_date'].Description = '' # type: object
dataDict.loc['poss_time'].Description = '' # type: object
dataDict.loc['beat'].Description = '' # type: int64
dataDict.loc['apt_office_prefix'].Description = '' # type: object
dataDict.loc['apt_office_num'].Description = '' # type: object
dataDict.loc['location'].Description = '' # type: object
dataDict.loc['MinOfucr'].Description = '' # type: int64
dataDict.loc['MinOfibr_code'].Description = '' # type: object
dataDict.loc['dispo_code'].Description = '' # type: object
dataDict.loc['MaxOfnum_victims'].Description = '' # type: float64
dataDict.loc['Shift'].Description = 'Zones have 8 or 10 hour shifts' # type: object
dataDict.loc['Avg Day'].Description = '' # type: object
dataDict.loc['loc_type'].Description = '' # type: float64
dataDict.loc['UC2 Literal'].Description = '' # type: object
dataDict.loc['neighborhood'].Description = '' # type: object
dataDict.loc['npu'].Description = '' # type: object
dataDict.loc['x'].Description = '' # type: float64
dataDict.loc['y'].Description = '' # type: float64
dataDict.to_csv("COBRA_Data_Dictionary.csv")

Fixing Data Types



In [ ]:

    
print df.groupby("Shift").count().index

Date and Time

Working with dates can be tricky. Often dates and times are coded as strings and need to be converted to a date and time data format.
Python provides a module datetime to deal with converting parsing and formatting dates and times. See https://docs.python.org/2/library/datetime.html
The pandas package provides functionality to convert text fields into date/time fields...given the values adhere to a given format. See http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.to_datetime.html

Create a proper text field

In order to use the text to date/time converter our text columns need to have the appropriate format.



In [4]:

    
# function currying
def fixdatetime(fld):
    def _fix(s):
        date_col = '%s_date' % fld
        time_col = '%s_time' % fld
        if time_col in s.index:
            return str(s[date_col])+' '+str(s[time_col])
        else:
            return str(s[date_col])+' 00:00:00'
    return _fix

df.apply(fixdatetime('rpt'), axis=1)[:10]









    Out[4]:





0    09/06/2017 00:00:00
1    09/06/2017 00:00:00
2    09/06/2017 00:00:00
3    09/06/2017 00:00:00
4    09/06/2017 00:00:00
5    09/06/2017 00:00:00
6    09/06/2017 00:00:00
7    09/06/2017 00:00:00
8    09/06/2017 00:00:00
9    09/06/2017 00:00:00
dtype: object

Convert Columns



In [5]:

    
for col in ['rpt', 'occur', 'poss']:
    datser = df.apply(fixdatetime(col), axis=1)
    df['%s_dt'%col] = pd.to_datetime(datser, format="%m/%d/%Y %H:%M:%S", errors='coerce')



In [6]:

    
df.head()









    Out[6]:






  
    
      
      MI_PRINX
      offense_id
      rpt_date
      occur_date
      occur_time
      poss_date
      poss_time
      beat
      apt_office_prefix
      apt_office_num
      ...
      Avg Day
      loc_type
      UC2 Literal
      neighborhood
      npu
      x
      y
      rpt_dt
      occur_dt
      poss_dt
    
  
  
    
      0
      7693210
      172490115
      09/06/2017
      09/06/2017
      00:30:00
      09/06/2017
      00:35:00
      607
      NaN
      NaN
      ...
      Wed
      NaN
      AGG ASSAULT
      Custer/McDonough/Guice
      W
      -84.35850
      33.70839
      2017-09-06
      2017-09-06 00:30:00
      2017-09-06 00:35:00
    
    
      1
      7693211
      172490265
      09/06/2017
      09/05/2017
      11:15:00
      09/06/2017
      02:30:00
      512
      NaN
      NaN
      ...
      Tue
      99.0
      LARCENY-FROM VEHICLE
      Downtown
      M
      -84.39736
      33.74958
      2017-09-06
      2017-09-05 11:15:00
      2017-09-06 02:30:00
    
    
      2
      7693212
      172490322
      09/06/2017
      09/06/2017
      03:15:00
      09/06/2017
      03:45:00
      501
      NaN
      NaN
      ...
      Wed
      18.0
      LARCENY-FROM VEHICLE
      Atlantic Station
      E
      -84.39776
      33.79072
      2017-09-06
      2017-09-06 03:15:00
      2017-09-06 03:45:00
    
    
      3
      7693213
      172490390
      09/06/2017
      09/05/2017
      17:45:00
      09/06/2017
      04:57:00
      207
      NaN
      NaN
      ...
      Tue
      18.0
      LARCENY-FROM VEHICLE
      Brookwood Hills
      E
      -84.39361
      33.80774
      2017-09-06
      2017-09-05 17:45:00
      2017-09-06 04:57:00
    
    
      4
      7693214
      172490401
      09/06/2017
      09/05/2017
      17:00:00
      09/06/2017
      05:00:00
      203
      NaN
      NaN
      ...
      Tue
      18.0
      LARCENY-FROM VEHICLE
      Hills Park
      D
      -84.43337
      33.79848
      2017-09-06
      2017-09-05 17:00:00
      2017-09-06 05:00:00
    
  

5 rows × 26 columns

Beats and Zones

The City of Atlanta is divided into 6 zones. Each with 12 to 14 beats.

Let's create a separate column for the zones:



In [41]:

    
df['Zone'] = df['beat']//100



In [ ]:



In [ ]:

    
df['UC2 Literal'].unique()



In [42]:

    
df[df['UC2 Literal']=='LARCENY-FROM VEHICLE']









    Out[42]:






  
    
      
      MI_PRINX
      offense_id
      rpt_date
      occur_date
      occur_time
      poss_date
      poss_time
      beat
      apt_office_prefix
      apt_office_num
      ...
      UC2 Literal
      neighborhood
      npu
      x
      y
      rpt_dt
      occur_dt
      poss_dt
      Year
      Zone
    
  
  
    
      1
      7693211
      172490265
      09/06/2017
      09/05/2017
      11:15:00
      09/06/2017
      02:30:00
      512
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Downtown
      M
      -84.39736
      33.74958
      2017-09-06
      2017-09-05 11:15:00
      2017-09-06 02:30:00
      2017.0
      5
    
    
      2
      7693212
      172490322
      09/06/2017
      09/06/2017
      03:15:00
      09/06/2017
      03:45:00
      501
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Atlantic Station
      E
      -84.39776
      33.79072
      2017-09-06
      2017-09-06 03:15:00
      2017-09-06 03:45:00
      2017.0
      5
    
    
      3
      7693213
      172490390
      09/06/2017
      09/05/2017
      17:45:00
      09/06/2017
      04:57:00
      207
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Brookwood Hills
      E
      -84.39361
      33.80774
      2017-09-06
      2017-09-05 17:45:00
      2017-09-06 04:57:00
      2017.0
      2
    
    
      4
      7693214
      172490401
      09/06/2017
      09/05/2017
      17:00:00
      09/06/2017
      05:00:00
      203
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Hills Park
      D
      -84.43337
      33.79848
      2017-09-06
      2017-09-05 17:00:00
      2017-09-06 05:00:00
      2017.0
      2
    
    
      11
      7693221
      172490557
      09/06/2017
      09/05/2017
      21:00:00
      09/06/2017
      07:20:00
      207
      NaN
      323
      ...
      LARCENY-FROM VEHICLE
      Loring Heights
      E
      -84.40671
      33.79742
      2017-09-06
      2017-09-05 21:00:00
      2017-09-06 07:20:00
      2017.0
      2
    
    
      12
      7693222
      172490741
      09/06/2017
      09/06/2017
      07:30:00
      09/06/2017
      08:20:00
      506
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Midtown
      E
      -84.37102
      33.77686
      2017-09-06
      2017-09-06 07:30:00
      2017-09-06 08:20:00
      2017.0
      5
    
    
      13
      7693223
      172490781
      09/06/2017
      09/05/2017
      22:00:00
      09/06/2017
      05:00:00
      212
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Piedmont Heights
      F
      -84.37157
      33.80697
      2017-09-06
      2017-09-05 22:00:00
      2017-09-06 05:00:00
      2017.0
      2
    
    
      17
      7693227
      172490914
      09/06/2017
      09/05/2017
      19:30:00
      09/06/2017
      10:10:00
      408
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Venetian Hills
      S
      -84.44116
      33.72205
      2017-09-06
      2017-09-05 19:30:00
      2017-09-06 10:10:00
      2017.0
      4
    
    
      20
      7693230
      172491015
      09/06/2017
      09/06/2017
      07:00:00
      09/06/2017
      10:45:00
      208
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Peachtree Park
      B
      -84.36815
      33.84642
      2017-09-06
      2017-09-06 07:00:00
      2017-09-06 10:45:00
      2017.0
      2
    
    
      21
      7693231
      172491018
      09/06/2017
      09/06/2017
      10:30:00
      09/06/2017
      11:00:00
      413
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Ben Hill Pines
      P
      -84.50755
      33.68212
      2017-09-06
      2017-09-06 10:30:00
      2017-09-06 11:00:00
      2017.0
      4
    
    
      30
      7693240
      172491351
      09/06/2017
      09/06/2017
      14:10:00
      09/06/2017
      14:10:00
      401
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      West End
      T
      -84.41680
      33.74057
      2017-09-06
      2017-09-06 14:10:00
      2017-09-06 14:10:00
      2017.0
      4
    
    
      36
      7693246
      172491776
      09/06/2017
      09/06/2017
      17:00:00
      09/06/2017
      17:13:00
      109
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Center Hill
      J
      -84.47168
      33.77587
      2017-09-06
      2017-09-06 17:00:00
      2017-09-06 17:13:00
      2017.0
      1
    
    
      40
      7693250
      172491865
      09/06/2017
      09/06/2017
      14:00:00
      09/06/2017
      17:45:00
      605
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Grant Park
      W
      -84.37925
      33.74322
      2017-09-06
      2017-09-06 14:00:00
      2017-09-06 17:45:00
      2017.0
      6
    
    
      43
      7693253
      172492033
      09/06/2017
      09/03/2017
      08:30:00
      09/03/2017
      22:30:00
      206
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      South Tuxedo Park
      B
      -84.38104
      33.84316
      2017-09-06
      2017-09-03 08:30:00
      2017-09-03 22:30:00
      2017.0
      2
    
    
      50
      7693260
      172492215
      09/06/2017
      09/06/2017
      19:30:00
      09/06/2017
      21:15:00
      206
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Garden Hills
      B
      -84.37759
      33.83533
      2017-09-06
      2017-09-06 19:30:00
      2017-09-06 21:15:00
      2017.0
      2
    
    
      51
      7693261
      172492350
      09/06/2017
      09/06/2017
      19:30:00
      09/06/2017
      22:06:00
      505
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Midtown
      E
      -84.38533
      33.77831
      2017-09-06
      2017-09-06 19:30:00
      2017-09-06 22:06:00
      2017.0
      5
    
    
      55
      7693138
      172480028
      09/05/2017
      09/04/2017
      17:30:00
      09/04/2017
      23:30:00
      511
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Downtown
      M
      -84.39166
      33.75190
      2017-09-05
      2017-09-04 17:30:00
      2017-09-04 23:30:00
      2017.0
      5
    
    
      57
      7693140
      172480065
      09/05/2017
      09/04/2017
      17:00:00
      09/05/2017
      00:20:00
      508
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Downtown
      M
      -84.39178
      33.76002
      2017-09-05
      2017-09-04 17:00:00
      2017-09-05 00:20:00
      2017.0
      5
    
    
      59
      7693142
      172480144
      09/05/2017
      09/05/2017
      00:00:00
      09/05/2017
      01:00:00
      213
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Piedmont Heights
      F
      -84.37025
      33.79592
      2017-09-05
      2017-09-05 00:00:00
      2017-09-05 01:00:00
      2017.0
      2
    
    
      60
      7693143
      172480162
      09/05/2017
      09/05/2017
      00:00:00
      09/05/2017
      01:00:00
      213
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Lindridge/Martin Manor
      F
      -84.35601
      33.81123
      2017-09-05
      2017-09-05 00:00:00
      2017-09-05 01:00:00
      2017.0
      2
    
    
      62
      7693145
      172480270
      09/05/2017
      09/04/2017
      21:00:00
      09/05/2017
      02:26:00
      612
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      East Atlanta
      W
      -84.32549
      33.72743
      2017-09-05
      2017-09-04 21:00:00
      2017-09-05 02:26:00
      2017.0
      6
    
    
      64
      7693147
      172480383
      09/05/2017
      09/04/2017
      18:00:00
      09/05/2017
      03:42:00
      313
      NaN
      6104
      ...
      LARCENY-FROM VEHICLE
      South River Gardens
      Z
      -84.35549
      33.66061
      2017-09-05
      2017-09-04 18:00:00
      2017-09-05 03:42:00
      2017.0
      3
    
    
      67
      7693150
      172480528
      09/05/2017
      09/05/2017
      04:00:00
      09/05/2017
      06:05:00
      310
      NaN
      2601
      ...
      LARCENY-FROM VEHICLE
      Browns Mill Park
      Z
      -84.37006
      33.68838
      2017-09-05
      2017-09-05 04:00:00
      2017-09-05 06:05:00
      2017.0
      3
    
    
      69
      7693152
      172480682
      09/05/2017
      09/04/2017
      17:00:00
      09/05/2017
      08:00:00
      605
      NaN
      210
      ...
      LARCENY-FROM VEHICLE
      Capitol Gateway
      V
      -84.38612
      33.74513
      2017-09-05
      2017-09-04 17:00:00
      2017-09-05 08:00:00
      2017.0
      6
    
    
      70
      7693153
      172480709
      09/05/2017
      09/05/2017
      07:45:00
      09/05/2017
      07:50:00
      201
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      NaN
      C
      -84.45274
      33.82065
      2017-09-05
      2017-09-05 07:45:00
      2017-09-05 07:50:00
      2017.0
      2
    
    
      71
      7693154
      172480772
      09/05/2017
      09/03/2017
      23:00:00
      09/05/2017
      08:35:00
      310
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Browns Mill Park
      Z
      -84.36507
      33.68797
      2017-09-05
      2017-09-03 23:00:00
      2017-09-05 08:35:00
      2017.0
      3
    
    
      72
      7693155
      172480790
      09/05/2017
      09/05/2017
      05:45:00
      09/05/2017
      08:00:00
      301
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Adair Park
      V
      -84.40824
      33.73657
      2017-09-05
      2017-09-05 05:45:00
      2017-09-05 08:00:00
      2017.0
      3
    
    
      74
      7693157
      172480865
      09/05/2017
      09/04/2017
      20:00:00
      09/05/2017
      07:00:00
      612
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      East Atlanta
      W
      -84.34393
      33.73856
      2017-09-05
      2017-09-04 20:00:00
      2017-09-05 07:00:00
      2017.0
      6
    
    
      75
      7693158
      172480868
      09/05/2017
      09/05/2017
      21:13:00
      09/05/2017
      21:13:00
      210
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Lenox
      B
      -84.35947
      33.84800
      2017-09-05
      2017-09-05 21:13:00
      2017-09-05 21:13:00
      2017.0
      2
    
    
      76
      7693159
      172480883
      09/05/2017
      09/04/2017
      18:00:00
      09/05/2017
      09:19:00
      207
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Loring Heights
      E
      -84.39947
      33.79455
      2017-09-05
      2017-09-04 18:00:00
      2017-09-05 09:19:00
      2017.0
      2
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      17336
      7676041
      170021967
      01/02/2017
      01/02/2017
      19:00:00
      01/02/2017
      20:15:00
      602
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Inman Park
      N
      -84.34916
      33.76634
      2017-01-02
      2017-01-02 19:00:00
      2017-01-02 20:15:00
      2017.0
      6
    
    
      17340
      7676045
      170022145
      01/02/2017
      01/02/2017
      22:06:00
      01/02/2017
      22:30:00
      302
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Pittsburgh
      V
      -84.40221
      33.73365
      2017-01-02
      2017-01-02 22:06:00
      2017-01-02 22:30:00
      2017.0
      3
    
    
      17343
      7674777
      163522791
      01/02/2017
      12/18/2016
      01:00:00
      12/18/2016
      01:30:00
      303
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Mechanicsville
      V
      -84.39438
      33.73699
      2017-01-02
      2016-12-18 01:00:00
      2016-12-18 01:30:00
      2016.0
      3
    
    
      17345
      7675895
      170010079
      01/01/2017
      01/01/2017
      00:00:00
      01/01/2017
      00:10:00
      413
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      NaN
      P
      -84.50584
      33.68990
      2017-01-01
      2017-01-01 00:00:00
      2017-01-01 00:10:00
      2017.0
      4
    
    
      17346
      7675896
      170010151
      01/01/2017
      01/01/2017
      00:26:00
      01/31/2017
      12:17:00
      503
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Midtown
      E
      -84.39087
      33.78153
      2017-01-01
      2017-01-01 00:26:00
      2017-01-31 12:17:00
      2017.0
      5
    
    
      17348
      7675898
      170010193
      01/01/2017
      12/31/2016
      22:30:00
      01/01/2017
      00:45:00
      402
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Oakland City
      S
      -84.42916
      33.73214
      2017-01-01
      2016-12-31 22:30:00
      2017-01-01 00:45:00
      2016.0
      4
    
    
      17350
      7675900
      170010232
      01/01/2017
      12/30/2016
      23:30:00
      12/31/2016
      23:50:00
      605
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Oakland
      W
      -84.37875
      33.74803
      2017-01-01
      2016-12-30 23:30:00
      2016-12-31 23:50:00
      2016.0
      6
    
    
      17351
      7675901
      170010294
      01/01/2017
      12/31/2016
      19:00:00
      01/01/2017
      00:30:00
      507
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Downtown
      M
      -84.39475
      33.74875
      2017-01-01
      2016-12-31 19:00:00
      2017-01-01 00:30:00
      2016.0
      5
    
    
      17358
      7675908
      170010593
      01/01/2017
      12/31/2016
      01:00:00
      01/01/2017
      03:03:00
      309
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Perkerson
      X
      -84.40744
      33.68263
      2017-01-01
      2016-12-31 01:00:00
      2017-01-01 03:03:00
      2016.0
      3
    
    
      17361
      7675911
      170010712
      01/01/2017
      01/01/2017
      01:30:00
      01/01/2017
      03:00:00
      212
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Morningside/Lenox Park
      F
      -84.36122
      33.81038
      2017-01-01
      2017-01-01 01:30:00
      2017-01-01 03:00:00
      2017.0
      2
    
    
      17366
      7675916
      170010894
      01/01/2017
      12/15/2016
      08:00:00
      12/15/2016
      12:00:00
      405
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Harland Terrace
      I
      -84.48570
      33.75177
      2017-01-01
      2016-12-15 08:00:00
      2016-12-15 12:00:00
      2016.0
      4
    
    
      17372
      7675922
      170011007
      01/01/2017
      12/31/2016
      16:00:00
      01/01/2017
      09:00:00
      210
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Lenox
      B
      -84.35947
      33.84800
      2017-01-01
      2016-12-31 16:00:00
      2017-01-01 09:00:00
      2016.0
      2
    
    
      17373
      7675923
      170011015
      01/01/2017
      12/31/2016
      23:00:00
      01/01/2017
      09:00:00
      612
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      East Atlanta
      W
      -84.33704
      33.74340
      2017-01-01
      2016-12-31 23:00:00
      2017-01-01 09:00:00
      2016.0
      6
    
    
      17376
      7675926
      170011070
      01/01/2017
      01/01/2017
      10:25:00
      01/01/2017
      10:25:00
      101
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      The Villages at Castleberry H
      T
      -84.40523
      33.74505
      2017-01-01
      2017-01-01 10:25:00
      2017-01-01 10:25:00
      2017.0
      1
    
    
      17379
      7675929
      170011129
      01/01/2017
      12/31/2016
      12:00:00
      01/01/2017
      11:00:00
      206
      NaN
      103
      ...
      LARCENY-FROM VEHICLE
      Buckhead Village
      B
      -84.37533
      33.83888
      2017-01-01
      2016-12-31 12:00:00
      2017-01-01 11:00:00
      2016.0
      2
    
    
      17381
      7675931
      170011171
      01/01/2017
      12/31/2016
      21:00:00
      01/01/2017
      11:00:00
      504
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Marietta Street Artery
      E
      -84.41050
      33.77957
      2017-01-01
      2016-12-31 21:00:00
      2017-01-01 11:00:00
      2016.0
      5
    
    
      17386
      7675936
      170011255
      01/01/2017
      01/01/2017
      12:00:00
      01/01/2017
      13:00:00
      612
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      East Atlanta
      W
      -84.32324
      33.72602
      2017-01-01
      2017-01-01 12:00:00
      2017-01-01 13:00:00
      2017.0
      6
    
    
      17394
      7675958
      170011751
      01/01/2017
      01/01/2017
      15:30:00
      01/01/2017
      17:45:00
      508
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Downtown
      M
      -84.38905
      33.75832
      2017-01-01
      2017-01-01 15:30:00
      2017-01-01 17:45:00
      2017.0
      5
    
    
      17395
      7675959
      170011863
      01/01/2017
      12/31/2016
      10:30:00
      01/01/2017
      12:00:00
      112
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Collier Heights
      I
      -84.49110
      33.78133
      2017-01-01
      2016-12-31 10:30:00
      2017-01-01 12:00:00
      2016.0
      1
    
    
      17399
      7675963
      170012028
      01/01/2017
      01/01/2017
      14:30:00
      01/01/2017
      19:30:00
      102
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Vine City
      L
      -84.40771
      33.75916
      2017-01-01
      2017-01-01 14:30:00
      2017-01-01 19:30:00
      2017.0
      1
    
    
      17402
      7675966
      170012145
      01/01/2017
      01/01/2017
      20:30:00
      01/01/2017
      20:45:00
      209
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      North Buckhead
      B
      -84.38135
      33.85547
      2017-01-01
      2017-01-01 20:30:00
      2017-01-01 20:45:00
      2017.0
      2
    
    
      17403
      7675967
      170012182
      01/01/2017
      01/01/2017
      21:05:00
      01/01/2017
      21:07:00
      203
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Blandtown
      D
      -84.42561
      33.80058
      2017-01-01
      2017-01-01 21:05:00
      2017-01-01 21:07:00
      2017.0
      2
    
    
      17406
      7675970
      170012253
      01/01/2017
      01/01/2017
      21:00:00
      01/01/2017
      22:00:00
      212
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Piedmont Heights
      F
      -84.36831
      33.80965
      2017-01-01
      2017-01-01 21:00:00
      2017-01-01 22:00:00
      2017.0
      2
    
    
      17407
      7675971
      170012261
      01/01/2017
      01/01/2017
      22:00:00
      01/01/2017
      22:21:00
      414
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Princeton Lakes
      P
      -84.50908
      33.65571
      2017-01-01
      2017-01-01 22:00:00
      2017-01-01 22:21:00
      2017.0
      4
    
    
      17414
      7675944
      170011358
      01/01/2017
      12/31/2016
      19:00:00
      01/01/2017
      11:00:00
      505
      NaN
      1105
      ...
      LARCENY-FROM VEHICLE
      Midtown
      E
      -84.38424
      33.77800
      2017-01-01
      2016-12-31 19:00:00
      2017-01-01 11:00:00
      2016.0
      5
    
    
      17415
      7675945
      170011435
      01/01/2017
      12/31/2016
      19:05:00
      12/31/2016
      22:30:00
      504
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Marietta Street Artery
      E
      -84.40612
      33.77425
      2017-01-01
      2016-12-31 19:05:00
      2016-12-31 22:30:00
      2016.0
      5
    
    
      17416
      7675946
      170011504
      01/01/2017
      01/01/2017
      15:00:00
      01/01/2017
      15:30:00
      602
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Inman Park
      N
      -84.35289
      33.76341
      2017-01-01
      2017-01-01 15:00:00
      2017-01-01 15:30:00
      2017.0
      6
    
    
      17418
      7675948
      170011525
      01/01/2017
      01/01/2017
      15:37:00
      01/01/2017
      15:37:00
      303
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Mechanicsville
      V
      -84.39486
      33.73760
      2017-01-01
      2017-01-01 15:37:00
      2017-01-01 15:37:00
      2017.0
      3
    
    
      17419
      7675949
      170011531
      01/01/2017
      01/01/2017
      14:15:00
      01/01/2017
      15:45:00
      610
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Kirkwood
      O
      -84.31645
      33.76408
      2017-01-01
      2017-01-01 14:15:00
      2017-01-01 15:45:00
      2017.0
      6
    
    
      17422
      7675952
      170011645
      01/01/2017
      12/31/2016
      20:00:00
      01/01/2017
      14:00:00
      302
      NaN
      NaN
      ...
      LARCENY-FROM VEHICLE
      Pittsburgh
      V
      -84.40465
      33.73026
      2017-01-01
      2016-12-31 20:00:00
      2017-01-01 14:00:00
      2016.0
      3
    
  

6171 rows × 28 columns



In [ ]:

    
df.occur_dt.map(lambda d: d.year).unique()



In [43]:

    
df['Year'] = df.occur_dt.map(lambda d: d.year)
df2 = df[(df.Year>=2010) & (df.Year<=2017)]
df2.shape, df.shape









    Out[43]:





((17418, 28), (17425, 28))

Descriptive Statistics

https://pandas.pydata.org/pandas-docs/stable/basics.html#descriptive-statistics



In [ ]:



In [ ]:

Time Series



In [44]:

    
df_LarcenyFromVehicle = df2[(df2['UC2 Literal']=='LARCENY-FROM VEHICLE')&(df2.Year==2017)].copy()
agr_LarcenyFromVehicle = df_LarcenyFromVehicle.set_index('occur_dt').resample('W').offense_id.count()
agr_LarcenyFromVehicle









    Out[44]:





occur_dt
2017-01-01     21
2017-01-08    157
2017-01-15    154
2017-01-22    178
2017-01-29    182
2017-02-05    161
2017-02-12    137
2017-02-19    190
2017-02-26    135
2017-03-05    142
2017-03-12    112
2017-03-19    118
2017-03-26    117
2017-04-02    124
2017-04-09    165
2017-04-16    191
2017-04-23    181
2017-04-30    186
2017-05-07    198
2017-05-14    180
2017-05-21    191
2017-05-28    190
2017-06-04    171
2017-06-11    196
2017-06-18    180
2017-06-25    194
2017-07-02    183
2017-07-09    188
2017-07-16    211
2017-07-23    196
2017-07-30    188
2017-08-06    167
2017-08-13    161
2017-08-20    226
2017-08-27    201
2017-09-03    187
2017-09-10     61
Freq: W-SUN, Name: offense_id, dtype: int64



In [30]:

    
df_LarcenyFromVehicle["Hour"] = df_LarcenyFromVehicle.occur_dt.map(lambda d: d.hour)



In [33]:

    
df_LarcenyFromVehicle.groupby("Hour").offense_id.count()









    Out[33]:





Hour
0     290
1     220
2     121
3      94
4      55
5      57
6      85
7     113
8     210
9     198
10    195
11    258
12    317
13    278
14    254
15    310
16    288
17    339
18    431
19    483
20    441
21    363
22    403
23    317
Name: offense_id, dtype: int64



In [53]:

    
hourly = df_LarcenyFromVehicle.resample('H', on='occur_dt').offense_id.count()



In [71]:

    
hourly.reset_index().occur_dt.map(lambda d: d.week)









    Out[71]:





0       52
1       52
2       52
3       52
4       52
5       52
6       52
7       52
8       52
9       52
10      52
11      52
12      52
13      52
14      52
15      52
16      52
17      52
18      52
19      52
20      52
21      52
22      52
23      52
24       1
25       1
26       1
27       1
28       1
29       1
        ..
5942    36
5943    36
5944    36
5945    36
5946    36
5947    36
5948    36
5949    36
5950    36
5951    36
5952    36
5953    36
5954    36
5955    36
5956    36
5957    36
5958    36
5959    36
5960    36
5961    36
5962    36
5963    36
5964    36
5965    36
5966    36
5967    36
5968    36
5969    36
5970    36
5971    36
Name: occur_dt, dtype: int64



In [64]:

    
df3 = pd.DataFrame({"N": hourly})
##df3['Day'] = df3.reset_index().occur_dt ##.map(lambda d: d.day)
df3









    Out[64]:






  
    
      
      N
    
    
      occur_dt
      
    
  
  
    
      2017-01-01 00:00:00
      2
    
    
      2017-01-01 01:00:00
      1
    
    
      2017-01-01 02:00:00
      0
    
    
      2017-01-01 03:00:00
      0
    
    
      2017-01-01 04:00:00
      0
    
    
      2017-01-01 05:00:00
      0
    
    
      2017-01-01 06:00:00
      0
    
    
      2017-01-01 07:00:00
      0
    
    
      2017-01-01 08:00:00
      0
    
    
      2017-01-01 09:00:00
      0
    
    
      2017-01-01 10:00:00
      1
    
    
      2017-01-01 11:00:00
      0
    
    
      2017-01-01 12:00:00
      2
    
    
      2017-01-01 13:00:00
      0
    
    
      2017-01-01 14:00:00
      3
    
    
      2017-01-01 15:00:00
      3
    
    
      2017-01-01 16:00:00
      0
    
    
      2017-01-01 17:00:00
      0
    
    
      2017-01-01 18:00:00
      1
    
    
      2017-01-01 19:00:00
      0
    
    
      2017-01-01 20:00:00
      2
    
    
      2017-01-01 21:00:00
      3
    
    
      2017-01-01 22:00:00
      2
    
    
      2017-01-01 23:00:00
      1
    
    
      2017-01-02 00:00:00
      1
    
    
      2017-01-02 01:00:00
      0
    
    
      2017-01-02 02:00:00
      0
    
    
      2017-01-02 03:00:00
      1
    
    
      2017-01-02 04:00:00
      1
    
    
      2017-01-02 05:00:00
      0
    
    
      ...
      ...
    
    
      2017-09-05 14:00:00
      0
    
    
      2017-09-05 15:00:00
      2
    
    
      2017-09-05 16:00:00
      0
    
    
      2017-09-05 17:00:00
      2
    
    
      2017-09-05 18:00:00
      3
    
    
      2017-09-05 19:00:00
      3
    
    
      2017-09-05 20:00:00
      2
    
    
      2017-09-05 21:00:00
      2
    
    
      2017-09-05 22:00:00
      1
    
    
      2017-09-05 23:00:00
      0
    
    
      2017-09-06 00:00:00
      0
    
    
      2017-09-06 01:00:00
      0
    
    
      2017-09-06 02:00:00
      0
    
    
      2017-09-06 03:00:00
      1
    
    
      2017-09-06 04:00:00
      0
    
    
      2017-09-06 05:00:00
      0
    
    
      2017-09-06 06:00:00
      0
    
    
      2017-09-06 07:00:00
      2
    
    
      2017-09-06 08:00:00
      0
    
    
      2017-09-06 09:00:00
      0
    
    
      2017-09-06 10:00:00
      1
    
    
      2017-09-06 11:00:00
      0
    
    
      2017-09-06 12:00:00
      0
    
    
      2017-09-06 13:00:00
      0
    
    
      2017-09-06 14:00:00
      2
    
    
      2017-09-06 15:00:00
      0
    
    
      2017-09-06 16:00:00
      0
    
    
      2017-09-06 17:00:00
      1
    
    
      2017-09-06 18:00:00
      0
    
    
      2017-09-06 19:00:00
      2
    
  

5972 rows × 1 columns



In [73]:

    
ls









    



COBRA_Data_Dictionary.csv  CrimeData_orig.ipynb  datadict.py  HW06/  README.md

Plotting

The Pandas package provides a number of plotting features. Let's try them out.



In [ ]:

    
fig = plt.figure(figsize=(10,6)) # 10inx10in
#plt.plot(resdf['BURGLARY-RESIDENCE'].index, resdf['BURGLARY-RESIDENCE'])
plt.scatter(resdf['BURGLARY-RESIDENCE'].index, resdf['BURGLARY-RESIDENCE'], marker='x')
plt.scatter(resdf['BURGLARY-NONRES'].index, resdf['BURGLARY-NONRES'], marker='o')

plt.ylim(0, 500)
plt.title('BURGLARY-RESIDENCE')
plt.xticks(range(13), ['', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
fig.savefig('BurglaryResidence_over_month.svg')
x = 1



In [ ]:

    
def getTheMonth(x):
    return x.month

df['occur_month'] = df['occur_ts'].map(getTheMonth)



In [ ]:

    
resdf = df.groupby(['UC2 Literal', 'occur_month']).offense_id.count()
fig = plt.figure(figsize=(10,6))
plt.scatter(resdf['BURGLARY-RESIDENCE'].index, resdf['BURGLARY-RESIDENCE'], marker='x')
plt.ylim(0, 500)
plt.title('BURGLARY-RESIDENCE')
plt.xticks(range(13), ['', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
plt.savefig('quiz3-burglary-residence.png')

Seasonal Model



In [ ]:

    
fig = plt.figure(figsize=(40,30))
crime_types = crime_year.index.levels[0]
years = crime_year.index.levels[1]
for c in range(len(crime_types)):
    y_max = max(crime_year.loc[crime_types[c]])
    
    plt.subplot(4,3,c+1)
    plt.hlines(crime_year.loc[crime_types[c]].iloc[-1]*100/y_max, years[0], years[-1], linestyles="dashed", color="r")
    plt.bar(crime_year.loc[crime_types[c]].index, crime_year.loc[crime_types[c]]*100/y_max, label=crime_types[c], alpha=0.5)
    ##plt.legend()
    plt.ylim(0, 100)
    plt.xticks(years+0.4, [str(int(y)) for y in years], rotation=0, fontsize=24)
    plt.yticks([0,20,40,60,80,100], ['0%','20%','40%','60%','80%','100%'], fontsize=24)
    plt.title(crime_types[c], fontsize=30)
    None



In [ ]:

    
c = 3 ## 'BURGLARY-RESIDENCE'
resburglaries = crime_year_month.loc[crime_types[c]]
fig = plt.figure(figsize=(20,10))
for y in years:
    plt.plot(resburglaries.loc[y].index, resburglaries.loc[y], label=("%4.0f"%y))
plt.legend()
plt.title("Seasonal Trends - %s"%crime_types[c], fontsize=20)
plt.xticks(range(13), ['', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
plt.xlim(0,13)
None



In [ ]:

    
c = 3 ## 'BURGLARY-RESIDENCE'
fig = plt.figure(figsize=(20,10))
for y in years:
    avg = resburglaries.loc[y].mean()
    std = resburglaries.loc[y].std()
    ##plt.hlines(avg, 1, 13, linestyle='dashed')
    plt.plot(resburglaries.loc[y].index, (resburglaries.loc[y]-avg)/std, label=("%4.0f"%y))
plt.legend()
plt.title("Seasonal Trends - %s (normalized)"%crime_types[c], fontsize=20)
plt.xticks(list(range(1,13)), ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
plt.xlim(0,13)
plt.ylabel("Standard deviations $\sigma_y$")
None



In [ ]:



In [ ]:



In [ ]:

	MI_PRINX	offense_id	rpt_date	occur_date	occur_time	poss_date	poss_time	beat	apt_office_prefix	apt_office_num	...	Avg Day	loc_type	UC2 Literal	neighborhood	npu	x	y	rpt_dt	occur_dt	poss_dt
0	7693210	172490115	09/06/2017	09/06/2017	00:30:00	09/06/2017	00:35:00	607	NaN	NaN	...	Wed	NaN	AGG ASSAULT	Custer/McDonough/Guice	W	-84.35850	33.70839	2017-09-06	2017-09-06 00:30:00	2017-09-06 00:35:00
1	7693211	172490265	09/06/2017	09/05/2017	11:15:00	09/06/2017	02:30:00	512	NaN	NaN	...	Tue	99.0	LARCENY-FROM VEHICLE	Downtown	M	-84.39736	33.74958	2017-09-06	2017-09-05 11:15:00	2017-09-06 02:30:00
2	7693212	172490322	09/06/2017	09/06/2017	03:15:00	09/06/2017	03:45:00	501	NaN	NaN	...	Wed	18.0	LARCENY-FROM VEHICLE	Atlantic Station	E	-84.39776	33.79072	2017-09-06	2017-09-06 03:15:00	2017-09-06 03:45:00
3	7693213	172490390	09/06/2017	09/05/2017	17:45:00	09/06/2017	04:57:00	207	NaN	NaN	...	Tue	18.0	LARCENY-FROM VEHICLE	Brookwood Hills	E	-84.39361	33.80774	2017-09-06	2017-09-05 17:45:00	2017-09-06 04:57:00
4	7693214	172490401	09/06/2017	09/05/2017	17:00:00	09/06/2017	05:00:00	203	NaN	NaN	...	Tue	18.0	LARCENY-FROM VEHICLE	Hills Park	D	-84.43337	33.79848	2017-09-06	2017-09-05 17:00:00	2017-09-06 05:00:00

	N
occur_dt
2017-01-01 00:00:00	2
2017-01-01 01:00:00	1
2017-01-01 02:00:00	0
2017-01-01 03:00:00	0
2017-01-01 04:00:00	0
2017-01-01 05:00:00	0
2017-01-01 06:00:00	0
2017-01-01 07:00:00	0
2017-01-01 08:00:00	0
2017-01-01 09:00:00	0
2017-01-01 10:00:00	1
2017-01-01 11:00:00	0
2017-01-01 12:00:00	2
2017-01-01 13:00:00	0
2017-01-01 14:00:00	3
2017-01-01 15:00:00	3
2017-01-01 16:00:00	0
2017-01-01 17:00:00	0
2017-01-01 18:00:00	1
2017-01-01 19:00:00	0
2017-01-01 20:00:00	2
2017-01-01 21:00:00	3
2017-01-01 22:00:00	2
2017-01-01 23:00:00	1
2017-01-02 00:00:00	1
2017-01-02 01:00:00	0
2017-01-02 02:00:00	0
2017-01-02 03:00:00	1
2017-01-02 04:00:00	1
2017-01-02 05:00:00	0
...	...
2017-09-05 14:00:00	0
2017-09-05 15:00:00	2
2017-09-05 16:00:00	0
2017-09-05 17:00:00	2
2017-09-05 18:00:00	3
2017-09-05 19:00:00	3
2017-09-05 20:00:00	2
2017-09-05 21:00:00	2
2017-09-05 22:00:00	1
2017-09-05 23:00:00	0
2017-09-06 00:00:00	0
2017-09-06 01:00:00	0
2017-09-06 02:00:00	0
2017-09-06 03:00:00	1
2017-09-06 04:00:00	0
2017-09-06 05:00:00	0
2017-09-06 06:00:00	0
2017-09-06 07:00:00	2
2017-09-06 08:00:00	0
2017-09-06 09:00:00	0
2017-09-06 10:00:00	1
2017-09-06 11:00:00	0
2017-09-06 12:00:00	0
2017-09-06 13:00:00	0
2017-09-06 14:00:00	2
2017-09-06 15:00:00	0
2017-09-06 16:00:00	0
2017-09-06 17:00:00	1
2017-09-06 18:00:00	0
2017-09-06 19:00:00	2