In [2]:
library(ggplot2)
library(ggmap)
library(sp)
library(maptools)
library(rgdal)
library(rgeos)
library(RColorBrewer)
library(dplyr)
options(jupyter.plot_mimetypes = 'image/png')
In [3]:
crime = read.csv('train.csv')
str(crime)
In [4]:
# Remove records with Y = 90.0: a latitude of 90 is the North Pole, so these coordinates are invalid
crime = crime[crime$Y!=90.0,]
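The filter above can be sanity-checked on a toy data frame. This is only a sketch: `toy`, `cleaned`, `in_sf`, and `bounded` are hypothetical names, the rows are made up, and the bounding-box variant is an assumed alternative rather than something the notebook does.

```r
# Toy stand-in for train.csv (made-up coordinates, for illustration only)
toy = data.frame(
  X = c(-122.42, -122.41, -120.50),
  Y = c(37.77, 37.78, 90.00)
)

# Same filter as above: drop rows whose latitude is exactly 90
cleaned = toy[toy$Y != 90.0, ]

# Assumed alternative: keep only points inside a San Francisco bounding box
in_sf = toy$X > -122.5222 & toy$X < -122.3481 &
        toy$Y > 37.7073 & toy$Y < 37.8381
bounded = toy[in_sf, ]

nrow(cleaned)
nrow(bounded)
```

A bounding-box filter would also catch coordinates that are wrong but not exactly 90, at the cost of hard-coding the map extent.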
The ggmap package makes mapping in R much easier. The function get_map() fetches map data from sources such as Google Maps or OpenStreetMap for a specified location, zoom level, and style. ggmap() then draws that map, and ggplot2 layers of data can be added on top of it.
In [5]:
locations = c(left = -122.5222,
              bottom = 37.7073,
              right = -122.3481,
              top = 37.8381)
map_data = get_map(location = locations, zoom = 12, source = 'osm', color = 'bw')
In [6]:
ggmap(map_data, extent = 'device') +
  geom_point(aes(x = X, y = Y), data = crime, alpha = 0.1, color = 'red', size = 0.1)
The aggregate plot of all crimes is not very informative, because the points overlap across most of the city. The function map_crime() plots only a selected category or set of categories, which makes it easier to see where a particular type of crime occurs.
In [8]:
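# Plot the locations of the selected crime categories on top of the base map
map_crime = function(df, categories){
  # Keep only the rows whose Category is one of the requested categories
  filtered = filter(df, Category %in% categories)
  plot = ggmap(map_data, extent='device') +
    geom_point(data=filtered, aes(x=X,y=Y,color=Category), alpha=0.1, size=0.3)
  return(plot)
}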
In [9]:
map_crime(crime, 'ASSAULT')
In [10]:
map_crime(crime, c('ASSAULT','DRUG/NARCOTIC','BURGLARY'))
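Before comparing categories on a map, it can help to check how many records each one contributes, since a larger category will look denser regardless of where it occurs. A minimal sketch, assuming a toy data frame (`toy`, `selected`, and `category_counts` are hypothetical names and the rows are made up):

```r
library(dplyr)

# Toy stand-in for the crime data (made-up rows, same column name as the source)
toy = data.frame(
  Category = c('ASSAULT', 'ASSAULT', 'BURGLARY', 'DRUG/NARCOTIC', 'VANDALISM')
)

selected = c('ASSAULT', 'DRUG/NARCOTIC', 'BURGLARY')

# Same filter as in map_crime(), followed by a per-category row count
category_counts = toy %>%
  filter(Category %in% selected) %>%
  count(Category, sort = TRUE)

print(category_counts)
```

Running the same pipeline on `crime` instead of `toy` would give the real counts.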
The density plots below make it clear that the Tenderloin is the main hotspot for crime.
Among three major categories (assault, burglary, and drug/narcotic), drug/narcotic offenses are heavily concentrated in the Tenderloin district, whereas assaults and burglaries are more spread out.
In [44]:
contours <- stat_density2d(
  aes(x = X, y = Y, fill = ..level.., alpha = ..level..),
  size = 0.1, data = crime, n = 200, geom = "polygon")
ggmap(map_data, extent = 'device') + contours +
  scale_alpha_continuous(range = c(0.1, 0.5), guide = 'none') +
  scale_fill_gradient2('Crime\nDensity', low = "white", mid = "orange", high = "red") +
  ggtitle('All crimes in SF')
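The density layer above rests on a two-dimensional kernel density estimate, which as far as I know ggplot2 computes with MASS::kde2d(). The sketch below calls kde2d() directly on synthetic points (not the real crime data) to show how the location of the density peak could be read off numerically. All variable names and the cluster coordinates are assumptions chosen for illustration.

```r
library(MASS)

set.seed(1)
# Synthetic points: one tight cluster plus uniform background noise
cluster = data.frame(X = rnorm(500, -122.41, 0.003),
                     Y = rnorm(500, 37.78, 0.003))
background = data.frame(X = runif(200, -122.52, -122.35),
                        Y = runif(200, 37.71, 37.84))
pts = rbind(cluster, background)

# 2D kernel density estimate on a 200 x 200 grid (default bandwidth)
dens = kde2d(pts$X, pts$Y, n = 200)

# Grid cell with the highest estimated density
peak = which(dens$z == max(dens$z), arr.ind = TRUE)[1, ]
peak_lon = dens$x[peak[1]]
peak_lat = dens$y[peak[2]]
c(lon = peak_lon, lat = peak_lat)
```

With the real data, `pts` would be replaced by `crime` (or a per-category subset), and the peak coordinates could be compared against the Tenderloin's location.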
In [13]:
crime_subset = filter(crime, Category %in% c('ASSAULT','DRUG/NARCOTIC','BURGLARY'))
contours <- stat_density2d(
  aes(x = X, y = Y, fill = ..level.., alpha = ..level..),
  size = 0.1, data = crime_subset, n = 200, geom = "polygon")
ggmap(map_data, extent = 'device') + contours +
  scale_alpha_continuous(range = c(0.1, 0.5), guide = 'none') +
  scale_fill_gradient2('Crime\nDensity', low = "white", mid = "orange", high = "red") +
  facet_wrap(~Category)