Mapping San Francisco crime in R

The data is from the Kaggle Competition, San Francisco Crime, with Dates, cataegory, and location of crime in San Francisco over ten years. For this notebook, I focus on location, using longitude/latitude and geospatial packages in R.


In [2]:
library(ggplot2)
library(ggmap)
library(sp)
library(maptools)
library(rgdal)
library(rgeos)
library(RColorBrewer)
library(dplyr)
options(jupyter.plot_mimetypes = 'image/png')

In [3]:
crime = read.csv('train.csv')
str(crime)


'data.frame':	878049 obs. of  9 variables:
 $ Dates     : Factor w/ 389257 levels "2003-01-06 00:01:00",..: 389257 389257 389256 389255 389255 389255 389255 389255 389254 389254 ...
 $ Category  : Factor w/ 39 levels "ARSON","ASSAULT",..: 38 22 22 17 17 17 37 37 17 17 ...
 $ Descript  : Factor w/ 879 levels "ABANDONMENT OF CHILD",..: 867 811 811 405 405 407 740 740 405 405 ...
 $ DayOfWeek : Factor w/ 7 levels "Friday","Monday",..: 7 7 7 7 7 7 7 7 7 7 ...
 $ PdDistrict: Factor w/ 10 levels "BAYVIEW","CENTRAL",..: 5 5 5 5 6 3 3 1 7 2 ...
 $ Resolution: Factor w/ 17 levels "ARREST, BOOKED",..: 1 1 1 12 12 12 12 12 12 12 ...
 $ Address   : Factor w/ 23228 levels "0 Block of  HARRISON ST",..: 19791 19791 22698 4267 1844 1506 13323 18055 11385 17659 ...
 $ X         : num  -122 -122 -122 -122 -122 ...
 $ Y         : num  37.8 37.8 37.8 37.8 37.8 ...

In [4]:
#remove those with Y=90.0
crime = crime[crime$Y!=90.0,]

Use ggmap to plot crime locations

The package ggmap in R makes mapping much easier. The function get_map() can get map data from google map, openstreetmap, at specificied locations and zoom level, and style. Then use ggplot() to add layers of data on top of the map.


In [5]:
locations = c(left = -122.5222, 
                bottom = 37.7073, 
                right = -122.3481,
                top = 37.8381)
map_data = get_map(location=locations, zoom=12, source='osm',color='bw')

Density Plot of Major Crime Categories

The largest category, larceny/theft has two major clsuters, tenderloin and SOMA
Assault, burglary, warrants, and suspicious OCC are concentrated around tenderloin, but not to the same degree as drug/narcotics,
Vehicle theft, vandalism are more spread out
Prostitution is concentrated in a few blocks


In [6]:
cat = 'LARCENY/THEFT'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [7]:
cat = 'ASSAULT'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [8]:
cat = 'DRUG/NARCOTIC'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [9]:
cat = 'VEHICLE THEFT'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [10]:
cat = 'VANDALISM'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [11]:
cat = 'WARRANTS'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [12]:
cat = 'BURGLARY'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [13]:
cat = 'SUSPICIOUS OCC'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [14]:
cat = 'ROBBERY'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [15]:
cat = 'FRAUD'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [16]:
cat = 'WEAPON LAWS'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [17]:
cat = 'PROSTITUTION'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [18]:
cat = 'SEX OFFENSES FORCIBLE'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [19]:
cat = 'DISORDERLY CONDUCT'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [20]:
cat = 'DRUNKENNESS'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [21]:
cat = 'DRIVING UNDER THE INFLUENCE'
crime_subset = filter(crime, Category == cat)
contours <- stat_density2d(
aes(x = X, y = Y, fill = ..level.., alpha=..level..),
size = 0.1, data = crime_subset, n=200,geom = "polygon")

ggmap(map_data, extent='device') + contours +
scale_alpha_continuous(range=c(0.1,0.5), guide='none') +
scale_fill_gradient2('Crime\nDensity',low = "white", mid = "orange", high = "red") +
ggtitle(cat)



In [ ]: