This project examines crime data in Detroit, Michigan from January 1, 2009 to the present, using Detroit Open Data. The Detroit Open Data website hosts several datasets of interest. The main crime file has over 1 million records and is updated frequently. The other datasets used here give the locations of schools, police stations, and fire stations within Detroit city limits.
This version: Dave Backus messing around to see if we can overcome the row limits on downloading data.
In [4]:
    
import pandas as pd             # data package
import matplotlib.pyplot as plt # graphics 
import sys                      # system module, used to get Python version 
import datetime as dt           # date tools, used to note current date
#import geopy as geo             # geographical package
%matplotlib inline 
print('\nPython version: ', sys.version) 
print('Pandas version: ', pd.__version__)
print("Today's date:", dt.date.today())
    
    
The datasets are accessed live from Detroit Open Data in JSON format. Each dataset page provides a link that explains how to access the data through its API endpoint (shown as url, url1, url2, and url3 below). For one of the datasets no link was available, so I built one from the dataset's resource number.
Follow these instructions to access the data:
Follow these instructions to access the Detroit School data.
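One way to get past the row limit on a single request is to page through the endpoint. The sketch below assumes the endpoints are Socrata (SODA) resources that accept the `$limit`, `$offset`, and `$order` query parameters. The helper names `fetch_page` and `fetch_all`, the 50,000-row page size, and the 100-page cap are illustrative choices, not part of the original notebook.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page(base_url, limit, offset):
    """Download one page of records as a list of dicts (assumes a SODA endpoint)."""
    # $order=:id asks for a stable row order so pages do not overlap.
    query = urlencode({"$limit": limit, "$offset": offset, "$order": ":id"})
    with urlopen(f"{base_url}?{query}") as response:
        return json.load(response)


def fetch_all(base_url, page_size=50000, max_pages=100, get_page=fetch_page):
    """Request successive pages until a short page signals the end of the data.

    page_size and max_pages are hypothetical defaults; max_pages is a safety
    cap so a misbehaving endpoint cannot cause an endless loop.
    """
    records = []
    for page in range(max_pages):
        rows = get_page(base_url, page_size, page * page_size)
        records.extend(rows)
        if len(rows) < page_size:  # last (possibly empty) page reached
            break
    return records
```

The result is a plain list of dictionaries, so something like `crime = pd.DataFrame(fetch_all(url))` should give a frame with the same lowercase column names that `pd.read_json(url)` returns. Downloading over a million rows this way may take several minutes.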
In [ ]:
    
%%time
url = 'https://data.detroitmi.gov/resource/i9ph-uyrp.json'
# Note: without a $limit parameter, a Socrata endpoint returns only the first 1,000 rows
crime = pd.read_json(url)
crime = crime.rename(columns={'caseid':'Case ID',
                              'address':'Address',
                              'hour':'Hour',
                              'incidentdate':'Incident Date',
                              'lat':'Latitude',
                              'lon':'Longitude',
                              'neighborhood':'Neighborhood',
                              'category':'Category',
                              'offensedescription':'Offense Description'})
print('Dimensions:', crime.shape)
    
In [6]:
    
csv = pd.read_csv('DPD__All_Crime_Incidents__Provisional_.csv')
print('Dimensions:', csv.shape)
    
    
    
In [2]:
    
crime = crime[['Case ID','Longitude','Latitude','Address','Incident Date','Hour','Neighborhood','Category','Offense Description']].set_index('Case ID')
crime.head(2)
    
    Out[2]:
In [3]:
    
url1 = 'https://data.detroitmi.gov/resource/3n6r-g9kp.json'
police = pd.read_json(url1)
police = police.rename(columns={'address_1':'Address',
                                'zip_code':'Zip Code',
                                'id':'ID'})
police.insert(1, 'Longitude', 0.0)
police.insert(2, 'Latitude', 0.0)
for (i, ps) in police.iterrows():
    # Pull out the location dictionary for this row
    curr_dict = ps['location']
    # Pull out the [longitude, latitude] pair
    coord = curr_dict['coordinates']
    # Write each scalar into the frame (.at replaces set_value, removed in pandas 1.0)
    police.at[i, 'Longitude'] = coord[0]
    police.at[i, 'Latitude'] = coord[1]
    
police = police[['ID','Longitude','Latitude','Address','Zip Code']].set_index('ID')
police.head(2)
    
    Out[3]:
In [4]:
    
url2 = 'https://data.detroitmi.gov/resource/hz79-58xh.json'
fire = pd.read_json(url2)
fire = fire.rename(columns={'station':'Station',
                            'full_address_address':'Address',
                            'full_address_zip':'Zip Code'})
fire.insert(1, 'Longitude', 0.0)
fire.insert(2, 'Latitude', 0.0)
for (i, fs) in fire.iterrows():
    # Pull out the location dictionary for this row
    curr_dict = fs['full_address']
    # Pull out the [longitude, latitude] pair
    coord = curr_dict['coordinates']
    # Write each scalar into the frame (.at replaces set_value, removed in pandas 1.0)
    fire.at[i, 'Longitude'] = coord[0]
    fire.at[i, 'Latitude'] = coord[1]
fire = fire[['Station','Longitude','Latitude','Address','Zip Code']].set_index('Station')
fire.head(2)
    
    Out[4]:
In [5]:
    
url3 = 'https://data.detroitmi.gov/resource/8xpr-6ij9.json'
school = pd.read_json(url3)
school = school.rename(columns={'entityoffi':'School',
                                'the_geom':'Location',
                                'entityphys':'Address',
                                'entityph_4':'Zip Code'})
school.insert(1, 'Longitude', 0.0)
school.insert(2, 'Latitude', 0.0)
for (i, s) in school.iterrows():
    # Pull out the location dictionary for this row
    curr_dict = s['Location']
    # Pull out the [longitude, latitude] pair
    coord = curr_dict['coordinates']
    # Write each scalar into the frame (.at replaces set_value, removed in pandas 1.0)
    school.at[i, 'Longitude'] = coord[0]
    school.at[i, 'Latitude'] = coord[1]
school = school[['School', 'Longitude', 'Latitude', 'Address', 'Zip Code']].set_index('School')
school.head(2)
    
    Out[5]:
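The police, fire, and school cells above repeat the same row-by-row loop to pull coordinates out of a location dictionary. A small helper could replace all three loops. This is only a sketch: the name `lon_lat` is hypothetical, and it assumes each location value is a GeoJSON-style dictionary with a `coordinates` list ordered longitude first, as the loops above imply.

```python
def lon_lat(location):
    """Return (longitude, latitude) from a GeoJSON-style point dictionary.

    Returns (None, None) when the value is missing or malformed, so that
    rows without a location do not raise an error.
    """
    if not isinstance(location, dict):
        return (None, None)
    coords = location.get("coordinates")
    if not coords or len(coords) < 2:
        return (None, None)
    return (float(coords[0]), float(coords[1]))
```

With pandas, the helper can be applied to a whole column at once, for example `police['Longitude'] = police['location'].map(lambda loc: lon_lat(loc)[0])` and the same with index `1` for latitude. Unlike the loops, this leaves missing locations as empty values rather than stopping with an exception.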
In [6]:
    
crime.count()
    
    Out[6]:
In [ ]: