We will first load the SQL extension:
In [12]:
%load_ext sql
The sql extension is already loaded. To reload it, use:
%reload_ext sql
We will connect to a database which lives on AWS. The format for connecting to a database is:
database_type://username:password@host/database_name
Also, each sql command requires to begin with the magic %sql.
In [13]:
%sql postgresql://dssg_student:password@seds-sql.csya4zsfb6y4.us-east-1.rds.amazonaws.com/dssg2016
Out[13]:
'Connected: dssg_student@dssg2016'
The tables in the database which we are interested in are:
* seattlecrimenincidents
* census_data
Let's look at the data! We can view the first 10 columns of a table:
In [14]:
%sql select * from seattlecrimeincidents limit 10
10 rows affected.
Out[14]:
Offense Code
Offense Code Extension
Offense Type
Summary Offense Code
Summarized Offense Description
Date Reported
Occurred Date or Date Range Start
Occurred Date Range End
Hundred Block Location
District/Sector
Zone/Beat
census tract 2000
longitude
latitude
month
year
gid
geom
geom_utm
None
18
DISPUTE-OTH
None
DISPUTE
2015-01-26 13:25:00
2015-01-25 20:00:00
2015-01-25 20:00:00
MAYNARD AV S / S LUCILE ST
O
O2
10900.2042
-122.3253949
47.55337429
1
2015
56
0101000020E610000041722145D3945EC0A5FFFEF7D4C64740
0101000020850E0000979B9D69C6CE20419081F48370175441
None
21
DISTURBANCE-OTH
None
DISTURBANCE
2015-01-29 14:32:00
2015-01-29 14:29:00
None
20XX BLOCK OF 5 AV
M
M2
7200.1058
-122.3398236
47.61411132
1
2015
1067
0101000020E61000002D1A7CABBF955EC06FC021339BCE4740
0101000020850E0000D749456FD8C52041068E86B7051E5441
None
21
DISTURBANCE-OTH
None
DISTURBANCE
2015-01-22 04:35:00
2015-01-22 04:35:00
None
105XX BLOCK OF AURORA AV N
N
N3
1300.2002
-122.3447064
47.70579857
1
2015
468
0101000020E610000034AE6EAB0F965EC056DB879B57DA4740
0101000020850E00009ABDE7414EC220416114B96FF8275441
None
21
DISTURBANCE-OTH
None
DISTURBANCE
2015-01-17 01:21:00
2015-01-17 01:21:00
None
NW 100 ST / 12 AV NW
N
N1
1400.4013
-122.3714156
47.70152724
1
2015
976
0101000020E61000005ACEEF45C5975EC0CD8604A5CBD94740
0101000020850E00003D8C439CAEB22041891D459B7D275441
1313
0
ASSLT-NONAGG
1300
ASSAULT
2015-02-02 06:48:00
2015-02-02 06:48:00
None
3XX BLOCK OF 9 AV
G
G1
8500.3001
-122.3234439
47.60412808
2
2015
1268
0101000020E6100000CE250B4EB3945EC0FF18A51154CD4740
0101000020850E0000754B04A589CF2041895883FDF21C5441
None
21
DISTURBANCE-OTH
None
DISTURBANCE
2015-01-17 01:49:00
2015-01-17 01:45:00
None
22XX BLOCK OF 2 AV
D
D1
8001.2009
-122.3450035
47.6136555
1
2015
1398
0101000020E61000005B608F8914965EC059C16F438CCE4740
0101000020850E0000619840C1CEC22041213B293AF81D5441
None
47
PROPERTY FOUND
None
STOLEN PROPERTY
2015-01-21 05:28:00
2015-01-04 00:00:00
None
8XX BLOCK OF VIRGINIA ST
D
D2
7300.3022
-122.3361655
47.6160533
1
2015
1507
0101000020E6100000C9224DBC83955EC0E30BA4D5DACE4740
0101000020850E00000449DC8CFAC7204114C89E423C1E5441
1313
0
ASSLT-NONAGG
1300
ASSAULT
2015-02-05 20:59:00
2015-02-05 19:31:00
None
15 AV NW / NW MARKET ST
B
B1
4700.302
-122.3762156
47.6686649
2
2015
1582
0101000020E6100000AE8F98EA13985EC0D7BDBACF96D54740
0101000020850E0000C5414D2F19B0204194E04FCEEB235441
None
47
PROPERTY FOUND
None
STOLEN PROPERTY
2015-02-10 17:05:00
2015-02-10 17:05:00
None
1XX BLOCK OF BELMONT AV E
E
E1
7402.1007
-122.324008
47.61907482
2
2015
2538
0101000020E61000001DE90C8CBC945EC0ABD6FCD73DCF4740
0101000020850E000089F479E417CF2041694D1F2F921E5441
1313
0
ASSLT-NONAGG
1300
ASSAULT
2015-02-11 06:59:00
2015-02-11 06:59:00
None
5XX BLOCK OF 3 AV
K
K2
8100.2043
-122.3310822
47.60241242
2
2015
2630
0101000020E61000006A52657330955EC0574DA5D91BCD4740
0101000020850E00004F88F1AE10CB204190E9BE13C21C5441
In [15]:
%sql select * from census_data limit 10
10 rows affected.
Out[15]:
index
County Name
Census Tract
Total Population, 2010
Persons per Square Mile, 2010
Total Housing Units, 2010
Occupied Housing Units, 2010
Vacant Housing Units, 2010
Occupancy Rate, percent, 2010
Vacancy Rate, percent, 2010
Land Area, Square Miles, 2010
Total Area, Square Miles, 2010
Water Area, percent, 2010
gid
0
King County
100
6255
8479.27
3443
3146
297
91.37
8.63
0.74
1.31
43.76
399
1
King County
200
7646
6025.91
3698
3499
199
94.62
5.38
1.27
1.27
0.0
400
2
King County
300
2603
5569.31
1161
1090
71
93.88
6.12
0.47
0.47
0.0
401
3
King County
401
5551
11215.41
3714
3134
580
84.38
15.62
0.49
0.53
6.0
402
4
King County
402
4841
6969.75
2453
2317
136
94.46
5.54
0.69
0.69
0.0
403
5
King County
500
3165
2843.42
1347
1289
58
95.69
4.31
1.11
1.47
24.44
404
6
King County
600
7626
5220.03
3538
3345
193
94.54
5.46
1.46
1.48
1.41
405
7
King County
700
4438
8878.19
2293
2114
179
92.19
7.81
0.5
0.5
0.0
406
8
King County
800
2583
5860.21
1086
1033
53
95.12
4.88
0.44
0.44
0.0
407
9
King County
900
2019
5247.77
906
836
70
92.27
7.73
0.38
1.36
71.75
408
In [16]:
%%sql
select "Offense Type" from seattlecrimeincidents
limit 10;
10 rows affected.
Out[16]:
Offense Type
DISPUTE-OTH
DISTURBANCE-OTH
DISTURBANCE-OTH
DISTURBANCE-OTH
ASSLT-NONAGG
DISTURBANCE-OTH
PROPERTY FOUND
ASSLT-NONAGG
PROPERTY FOUND
ASSLT-NONAGG
Count the number of rows:
In [17]:
%%sql
select distinct "Offense Type" from seattlecrimeincidents
150 rows affected.
Out[17]:
Offense Type
THEFT OF SERVICES
PROSTITUTION
FRAUD-IDENTITY THEFT
WARRARR-MISDEMEANOR
PROSTITUTION LOITERING
FRAUD-CHECK
THEFT-CARPROWL
PORNOGRAPHY-OBSCENE MATERIAL
ENDANGERMENT
DUI-LIQUOR
THEFT-OTH
THEFT-BOAT
THEFT-SHOPLIFT
WEAPON-UNLAWFUL USE
NARC-POSSESS-OTHER
BURGLARY-FORCE-RES
DUI-DRUGS
FRAUD-CREDIT CARD
THEFT-COINOP
METRO TRANSIT - ON BUS, TUNNEL
THREATS-OTHER
NARC-SELL-MARIJU
ROBBERY-BUSINESS-GUN
BURGLARY-SECURE PARKING-RES
THEFT-BICYCLE
NARC-FOUND-PILL/TABLET
THEFT-PKPOCKET
ROBBERY-RESIDENCE-WEAPON
DISTURBANCE-OTH
VEH-THEFT-MTRCYCLE
GAMBLE-BETTING
ROBBERY-STREET-GUN
ROBBERY-BANK-GUN
NARC-POSSESS-HEROIN
ESCAPE
HOMICIDE-PREMEDITATED-GUN
NARC-FOUND-METH
WARRANT-FUGITIVE
WEAPON-DISCHARGE
RECKLESS BURNING
TRESPASS
NARC-FOUND-MARIJU
ASSLT-AGG-GUN
ILLEGAL DUMPING
BURGLARY-NOFORCE-RES
PROPERTY DAMAGE-RESIDENTIAL
NARC-POSSESS-COCAINE
DISTURBANCE-NOISE
DRIVE-BY
DISPUTE-OTH
VEH-RCVD-FOR OTHER AGENCY
ROBBERY-BUSINESS-BODYFORCE
WEAPON-POSSESSION
NARC-SELL-COCAINE
ASSLT-AGG-POLICE-BODYFORCE
WARRARR-FELONY
NARC-DRUG TRAFFIC LOITERING
NARC-MANUFACTURE-OTHER
EMBEZZLE
FRAUD-OTHER
FRAUD-WELFARE
PROPERTY STOLEN-POSSESS
LOITERING
PROP RECOVERED-OTHER AGENCY
INJURY - ACCIDENTAL
COUNTERFEIT
NARC-FOUND-HALLUCINOGEN
DISPUTE-CIVIL PROPERTY (AUTO)
NARC-PRODUCE-MARIJU
NARC-POSSESS-AMPHETAMINE
NARC-POSSESS-PRESCRIPTION
PROPERTY LOST
VEH-THEFT-TRAILER
NARC-FOUND-OTHER
THEFT-PRSNATCH
NARC-FRAUD-PRESCRIPTION
HOMICIDE-PREMEDITATED-BODYFORC
THREATS-KILL
FALSE REPORT
ASSLT-NONAGG-POLICE
NARC-POSSESS-PILL/TABLET
FORGERY-CHECK
THEFT-AUTO PARTS
BIAS INCIDENT
HARASSMENT
NARC-FORGERY-PRESCRIPTION
VEH-THEFT-TRUCK
PROPERTY STOLEN-TRAFFICKING
NARC-POSSESS-MARIJU
TRAFFIC
NARC-FOUND-AMPHETAMINE
NARC-SELL-METH
[INC - CASE DC USE ONLY]
MALICIOUS HARASSMENT
ROBBERY-STREET-WEAPON
VEH-THEFT-AUTO
NARC-FOUND-HEROIN
PROSTITUTION PATRONIZING
PROPERTY STOLEN-SELL
ROBBERY-BANK-BODYFORCE
FORGERY-OTH
FORGERY-CREDIT CARD
WEAPON-SURRENDER-EXCLUDING FIR
EXTORTION
ANIMAL-OTH
NARC-SELL-AMPHETAMINE
ANIMAL-CRUELTY
ROBBERY-BUSINESS-WEAPON
PROPERTY DAMAGE-NON RESIDENTIA
THEFT-BUILDING
VIOL-COURT ORDER
URINATING/DEFECATING-IN PUBLIC
PROSTITUTION-ASSIST-PROMOTE
FIREWORK-USE
ROBBERY-RESIDENCE-GUN
THEFT-MAIL
ASSLT-NONAGG
PROPERTY FOUND
BURGLARY-SECURE PARKING-NONRES
ROBBERY-STREET-BODYFORCE
NARC-EQUIPMENT/PARAPHENALIA
THREATS-WEAPON
OBSTRUCT
BURGLARY-FORCE-NONRES
ASSLT-AGG-WEAPON
BURGLARY-NOFORCE-NONRES
NARC-FOUND-SYNTHETIC
DISORDERLY CONDUCT
THEFT-AUTOACC
ANIMAL-BITE
PROPERTY DAMAGE - GRAFFITI
WEAPON-CONCEALED
NARC-POSSESS-METH
LIQUOR LAW VIOLATION
HOMICIDE-NEG-MANS-BODYFORCE
ROBBERY-RESIDENCE-BODYFORCE
INJURY - OTHER
HOMICIDE-PREMEDITATED-WEAPON
THEFT-LICENSE PLATE
NARC-SELL-HEROIN
ASSLT-AGG-BODYFORCE
PROPERTY LOST - POLICE EQUIPME
FRAUD-COMPUTER
ASSLT-AGG-POLICE-WEAPON
NARC-FOUND-COCAINE
PROPERTY RECOVERED - POLICE EQ
NARC-SELL-OTHER
FRAUD-WIRE-ELECTRONIC
DISPUTE-CIVIL PROPERTY (NON AU
ELUDING-FELONY FLIGHT
In [18]:
%%sql
SELECT count(*) FROM seattlecrimeincidents;
1 rows affected.
Out[18]:
count
30814
Find the range of latitude and longitude coordinates:
In [19]:
%%sql
SELECT min(longitude), max(longitude),min(latitude),max(latitude) FROM seattlecrimeincidents;
1 rows affected.
Out[19]:
min
max
min_1
max_1
-122.4193685
-122.2282241
47.46221396
47.75651395
Calculate number of all TRESPASS crimes:
In [20]:
%%sql
SELECT count(*) FROM seattlecrimeincidents WHERE "Offense Type" = 'TRESPASS';
1 rows affected.
Out[20]:
count
710
Count of the bike thefts in the month of january
In [21]:
%%sql
SELECT * FROM seattlecrimeincidents
WHERE "Offense Type" = 'THEFT-BICYCLE' and month = 1
LIMIT 10;
10 rows affected.
Out[21]:
Offense Code
Offense Code Extension
Offense Type
Summary Offense Code
Summarized Offense Description
Date Reported
Occurred Date or Date Range Start
Occurred Date Range End
Hundred Block Location
District/Sector
Zone/Beat
census tract 2000
longitude
latitude
month
year
gid
geom
geom_utm
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-26 22:01:00
2015-01-26 21:00:00
None
9XX BLOCK OF NW 45 ST
B
B2
4700.2075
-122.3696925
47.66146818
1
2015
93
0101000020E6100000E544BB0AA9975EC0EA3844FDAAD44740
0101000020850E00009D027297F9B320410CB67ED624235441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-26 17:38:00
2015-01-26 14:00:00
2015-01-26 15:00:00
E PIKE ST / BELLEVUE AV
E
E3
8400.3004
-122.3268236
47.6140855
1
2015
94
0101000020E6100000E76388ADEA945EC052B9895A9ACE4740
0101000020850E00000004FB5F7ACD204113974119071E5441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-25 10:26:00
2015-01-25 09:55:00
None
67XX BLOCK OF MARY AV NW
J
J2
3300.101
-122.3761688
47.67867058
1
2015
159
0101000020E6100000A3714D2613985EC0BEED74ADDED64740
0101000020850E00006809D74E0EB02041221A6AD101255441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-25 08:55:00
2015-01-11 12:01:00
2015-01-12 08:45:00
11XX BLOCK OF FAIRVIEW AV N
D
D3
6600.1031
-122.3288806
47.63030081
1
2015
198
0101000020E61000007D5237610C955EC0D1CB6AB2ADD04740
0101000020850E0000482F590526CC2041BA061B4CC91F5441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-24 15:57:00
2015-01-24 10:00:00
None
1XX BLOCK OF S WASHINGTON ST
K
K2
9200.2023
-122.3335291
47.60088239
1
2015
270
0101000020E6100000EB30708A58955EC06BE5D2B6E9CC4740
0101000020850E0000F5A56CC1A3C92041E651A62B971C5441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-23 19:26:00
2015-01-21 01:00:00
2015-01-21 08:00:00
2129 1 / 2 1 AV W
Q
Q2
6800.2007
-122.358572
47.63804375
1
2015
354
0101000020E6100000B950F9D7F2965EC067D5E76AABD14740
0101000020850E0000214ECE90AABA2041C0E23DB69B205441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-23 13:33:00
2015-01-16 17:00:00
2015-01-17 06:00:00
37XX BLOCK OF S HUDSON ST
R
R3
10300.5014
-122.2853849
47.55704307
1
2015
361
0101000020E61000006C1107BF43925EC0850EF42F4DC74740
0101000020850E0000A28D8E5643E62041F24A882EDD175441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-22 12:04:00
2015-01-21 00:01:00
2015-01-21 14:30:00
19XX BLOCK OF N 44 ST
B
B3
5100.1021
-122.3343702
47.66030252
1
2015
437
0101000020E61000007270445266955EC0E66E00CB84D44740
0101000020850E0000F0764BF0B3C8204148A73EFE09235441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-22 14:59:00
2015-01-22 10:00:00
2015-01-22 14:00:00
TERRY AV N / MERCER ST
D
D3
7300.3002
-122.337134
47.62454759
1
2015
467
0101000020E6100000A9177C9A93955EC03DECE82CF1CF4740
0101000020850E0000C4DBB9DA58C72041D8E8201D281F5441
2399
1
THEFT-BICYCLE
2300
BIKE THEFT
2015-01-22 00:39:00
2015-01-21 21:00:00
2015-01-21 23:30:00
5 AV NE / NE NORTHGATE WY
L
L2
1200.4011
-122.323277
47.70859386
1
2015
499
0101000020E610000023A30392B0945EC0576C1F34B3DA4740
0101000020850E000053635904D8CE20418043178F49285441
In [22]:
%%sql
SELECT count(*) FROM seattlecrimeincidents
WHERE "Offense Type" = 'THEFT-BICYCLE' and month = 1;
1 rows affected.
Out[22]:
count
94
In [ ]:
Count how many offenses are for each Offense Type
In [23]:
%%sql
select "Offense Type",count(*) from SeattleCrimeIncidents
group by "Offense Type" order by count DESC;
150 rows affected.
Out[23]:
Offense Type
count
THEFT-CARPROWL
5334
THEFT-OTH
1828
VEH-THEFT-AUTO
1684
PROPERTY DAMAGE-NON RESIDENTIA
1412
BURGLARY-FORCE-RES
1361
DISTURBANCE-OTH
1356
ASSLT-NONAGG
1210
THEFT-SHOPLIFT
1209
PROPERTY FOUND
1063
FRAUD-IDENTITY THEFT
856
BURGLARY-NOFORCE-RES
844
THEFT-BUILDING
822
THEFT-BICYCLE
741
TRESPASS
710
BURGLARY-FORCE-NONRES
628
BURGLARY-SECURE PARKING-RES
555
WARRARR-FELONY
540
FRAUD-CREDIT CARD
516
HARASSMENT
420
PROPERTY DAMAGE-RESIDENTIAL
398
WARRARR-MISDEMEANOR
397
PROPERTY DAMAGE - GRAFFITI
375
ASSLT-AGG-WEAPON
366
BURGLARY-NOFORCE-NONRES
326
THEFT-LICENSE PLATE
305
THREATS-OTHER
303
ROBBERY-STREET-BODYFORCE
301
TRAFFIC
274
FRAUD-OTHER
226
THEFT-MAIL
204
FRAUD-CHECK
193
THREATS-KILL
180
PROPERTY LOST
175
THEFT-AUTOACC
166
ROBBERY-BUSINESS-BODYFORCE
148
ASSLT-AGG-BODYFORCE
126
VEH-THEFT-TRUCK
118
THEFT OF SERVICES
116
COUNTERFEIT
112
VEH-THEFT-MTRCYCLE
103
VEH-RCVD-FOR OTHER AGENCY
94
DISPUTE-OTH
93
NARC-POSSESS-HEROIN
93
THEFT-PKPOCKET
93
PROPERTY STOLEN-POSSESS
92
PROSTITUTION
90
ASSLT-AGG-GUN
83
VIOL-COURT ORDER
80
PROSTITUTION PATRONIZING
79
NARC-POSSESS-COCAINE
78
WEAPON-POSSESSION
78
NARC-SELL-COCAINE
77
ROBBERY-STREET-GUN
75
THREATS-WEAPON
75
EMBEZZLE
72
ROBBERY-STREET-WEAPON
68
DISTURBANCE-NOISE
64
DISPUTE-CIVIL PROPERTY (NON AU
59
FRAUD-WIRE-ELECTRONIC
50
WEAPON-DISCHARGE
50
BURGLARY-SECURE PARKING-NONRES
50
ASSLT-NONAGG-POLICE
48
ROBBERY-BUSINESS-WEAPON
46
NARC-POSSESS-METH
45
OBSTRUCT
41
PROP RECOVERED-OTHER AGENCY
39
FORGERY-CHECK
39
WARRANT-FUGITIVE
38
ROBBERY-BUSINESS-GUN
35
THEFT-AUTO PARTS
33
ANIMAL-OTH
31
ANIMAL-BITE
30
NARC-FOUND-OTHER
30
NARC-EQUIPMENT/PARAPHENALIA
29
LIQUOR LAW VIOLATION
29
INJURY - OTHER
28
DUI-LIQUOR
25
NARC-FOUND-MARIJU
23
FALSE REPORT
23
WEAPON-UNLAWFUL USE
21
NARC-SELL-HEROIN
21
FRAUD-COMPUTER
19
NARC-FOUND-HEROIN
19
MALICIOUS HARASSMENT
18
THEFT-PRSNATCH
17
VEH-THEFT-TRAILER
17
RECKLESS BURNING
17
ILLEGAL DUMPING
17
NARC-SELL-METH
17
DISPUTE-CIVIL PROPERTY (AUTO)
16
NARC-POSSESS-AMPHETAMINE
15
NARC-DRUG TRAFFIC LOITERING
15
ASSLT-AGG-POLICE-WEAPON
15
ROBBERY-RESIDENCE-WEAPON
14
PROPERTY LOST - POLICE EQUIPME
14
NARC-FORGERY-PRESCRIPTION
13
ROBBERY-RESIDENCE-BODYFORCE
13
DISORDERLY CONDUCT
13
DRIVE-BY
12
NARC-FOUND-METH
12
INJURY - ACCIDENTAL
11
ROBBERY-RESIDENCE-GUN
11
NARC-SELL-MARIJU
11
NARC-POSSESS-MARIJU
11
ELUDING-FELONY FLIGHT
11
ENDANGERMENT
10
FORGERY-OTH
10
WEAPON-SURRENDER-EXCLUDING FIR
10
THEFT-COINOP
10
FORGERY-CREDIT CARD
10
PROPERTY STOLEN-TRAFFICKING
9
NARC-SELL-AMPHETAMINE
9
PROSTITUTION LOITERING
9
[INC - CASE DC USE ONLY]
8
EXTORTION
8
BIAS INCIDENT
8
NARC-FRAUD-PRESCRIPTION
7
THEFT-BOAT
7
ASSLT-AGG-POLICE-BODYFORCE
7
URINATING/DEFECATING-IN PUBLIC
7
PROSTITUTION-ASSIST-PROMOTE
7
NARC-FOUND-COCAINE
7
NARC-POSSESS-OTHER
6
ANIMAL-CRUELTY
6
NARC-FOUND-PILL/TABLET
5
NARC-POSSESS-PILL/TABLET
4
ROBBERY-BANK-BODYFORCE
4
WEAPON-CONCEALED
4
PORNOGRAPHY-OBSCENE MATERIAL
4
HOMICIDE-PREMEDITATED-GUN
4
NARC-FOUND-AMPHETAMINE
4
FIREWORK-USE
3
PROPERTY STOLEN-SELL
3
NARC-POSSESS-PRESCRIPTION
3
LOITERING
2
NARC-PRODUCE-MARIJU
2
ESCAPE
2
NARC-SELL-OTHER
2
DUI-DRUGS
2
HOMICIDE-PREMEDITATED-BODYFORC
2
ROBBERY-BANK-GUN
2
NARC-MANUFACTURE-OTHER
2
HOMICIDE-PREMEDITATED-WEAPON
2
NARC-FOUND-SYNTHETIC
1
FRAUD-WELFARE
1
GAMBLE-BETTING
1
NARC-FOUND-HALLUCINOGEN
1
METRO TRANSIT - ON BUS, TUNNEL
1
PROPERTY RECOVERED - POLICE EQ
1
HOMICIDE-NEG-MANS-BODYFORCE
1
Note: for homicide we see there are a lot of types of homicides -> use summarized offense description
In [24]:
%%sql
select distinct "Summarized Offense Description" from seattlecrimeincidents
52 rows affected.
Out[24]:
Summarized Offense Description
THEFT OF SERVICES
PROSTITUTION
BIKE THEFT
ELUDING
BIAS INCIDENT
BURGLARY
OTHER PROPERTY
FIREWORK
RECKLESS BURNING
DISPUTE
STOLEN PROPERTY
TRAFFIC
SHOPLIFTING
TRESPASS
THREATS
OBSTRUCT
[INC - CASE DC USE ONLY]
ILLEGAL DUMPING
PURSE SNATCH
CAR PROWL
DISORDERLY CONDUCT
PORNOGRAPHY
VIOLATION OF COURT ORDER
INJURY
PUBLIC NUISANCE
METRO
EXTORTION
GAMBLE
HOMICIDE
PICKPOCKET
MAIL THEFT
ANIMAL COMPLAINT
DUI
EMBEZZLE
NARCOTICS
RECOVERED PROPERTY
VEHICLE THEFT
BURGLARY-SECURE PARKING-RES
LOITERING
ASSAULT
FRAUD
DISTURBANCE
ROBBERY
COUNTERFEIT
WARRANT ARREST
WEAPON
LOST PROPERTY
FORGERY
ESCAPE
PROPERTY DAMAGE
FALSE REPORT
LIQUOR VIOLATION
Count how many offenses are for each Summarized Offense Description:
In [25]:
%%sql
select "Summarized Offense Description", count(*) from SeattleCrimeIncidents
group by "Summarized Offense Description"
ORDER BY count DESC
limit 10;
10 rows affected.
Out[25]:
Summarized Offense Description
count
CAR PROWL
5334
BURGLARY
3209
OTHER PROPERTY
3164
PROPERTY DAMAGE
2185
VEHICLE THEFT
2023
ASSAULT
1877
FRAUD
1861
DISTURBANCE
1420
SHOPLIFTING
1209
STOLEN PROPERTY
1167
How many crimes per year?
In [26]:
%%sql
select year, count(*) from SeattleCrimeIncidents
group by year;
17 rows affected.
Out[26]:
year
count
2010
10
2006
1
2000
1
1999
1
2005
3
2013
35
2009
2
2004
1
2007
5
1990
1
2012
17
2011
8
2002
1
2001
2
2014
691
2008
4
2015
30031
In [ ]:
In [ ]:
How many crimes for each month?
In [27]:
%%sql
SELECT month,count(*) FROM seattlecrimeincidents
GROUP BY month
ORDER BY month ASC;
12 rows affected.
Out[27]:
month
count
1
6339
2
4393
3
4993
4
4931
5
5206
6
4344
7
20
8
19
9
32
10
46
11
104
12
387
Which month is with highest number of bike thefts?
In [28]:
%%sql
SELECT month,count(*) FROM seattlecrimeincidents
WHERE "Offense Type" = 'THEFT-BICYCLE'
GROUP BY month
ORDER BY count DESC;
8 rows affected.
Out[28]:
month
count
5
180
6
128
4
118
3
103
2
101
1
94
12
15
11
2
How many crimes per census tract?
In [29]:
%%sql
SELECT "census tract 2000",count(*) FROM seattlecrimeincidents
group by "census tract 2000"
ORDER BY count DESC
LIMIT 10;
10 rows affected.
Out[29]:
census tract 2000
count
1200.401
184
8100.2007
149
7500.4006
142
8100.3003
128
8100.2043
126
8100.2009
125
8100.2002
116
8100.2004
111
7500.4016
105
7300.3022
99
Let's add better column names:
In [30]:
%%sql
SELECT "census tract 2000" as "CensusTract",count(*) as "crime_count" FROM seattlecrimeincidents
group by "census tract 2000"
ORDER BY "census tract 2000" DESC
LIMIT 10;
10 rows affected.
Out[30]:
CensusTract
crime_count
None
76
26500.1002
6
26500.1001
3
26500.1
3
26001.1028
1
26001.1027
1
12100.3005
1
12100.2021
2
12100.2019
3
12100.2017
2
Extracting the max:
In [31]:
%%sql
SELECT "census tract 2000" as "CensusTract",count(*) as "crime_count" FROM seattlecrimeincidents
group by "census tract 2000"
limit 10;
10 rows affected.
Out[31]:
CensusTract
crime_count
None
76
7800.1017
1
1400.3
1
300.1013
1
10001.1013
1
9000.1003
9
4700.5004
1
600.301
1
4200.4018
2
4301.2009
3
In [32]:
%%sql
SELECT max(crimeTable.crime_count) FROM
(SELECT "census tract 2000" as "CensusTract",count(*) as "crime_count" FROM seattlecrimeincidents
group by "census tract 2000") as crimeTable;
1 rows affected.
Out[32]:
max
184
Create Table 1 (crimeTable) which contains the crime count for each tract:
``` |tract | crime_count | |---|--- | ```
In [33]:
%%sql
SELECT round("census tract 2000"),count(*) FROM seattlecrimeincidents
group by "census tract 2000"
ORDER BY "census tract 2000" ASC
LIMIT 10;
10 rows affected.
Out[33]:
round
count
100.0
8
100.0
1
100.0
1
100.0
1
100.0
1
100.0
1
100.0
3
100.0
1
100.0
2
100.0
1
Create Table 2 (censusTable) which contains the population for each tract:
``` |tract | population | |---|--- | ```
In [34]:
%%sql
SELECT "Census Tract","Total Population, 2010" as population from census_data
ORDER BY "Census Tract" ASC;
398 rows affected.
Out[34]:
Census Tract
population
100
6255
200
7646
300
2603
401
5551
402
4841
500
3165
600
7626
700
4438
800
2583
900
2019
1000
1941
1100
2503
1200
6494
1300
4340
1400
4848
1500
2454
1600
4131
1701
3818
1702
4234
1800
4535
1900
4157
2000
3406
2100
3807
2200
5286
2400
2972
2500
2802
2600
4660
2700
5289
2800
4481
2900
4177
3000
5637
3100
6219
3200
7683
3300
6224
3400
3302
3500
3936
3600
5672
3800
2058
3900
2772
4000
2690
4100
7789
4200
7618
4301
3531
4302
3518
4400
6260
4500
2676
4600
3251
4700
6739
4800
4539
4900
6139
5000
3600
5100
3662
5200
6076
5301
7085
5302
5706
5400
4948
5600
6382
5700
6001
5801
4743
5802
5059
5900
6836
6000
5150
6100
4796
6200
3848
6300
4962
6400
3207
6500
4157
6600
3540
6700
6544
6800
2834
6900
4080
7000
6959
7100
3055
7200
5318
7300
4668
7401
4115
7402
4921
7500
6282
7600
3498
7700
4476
7800
5145
7900
5147
8001
5588
8002
3013
8100
4070
8200
3280
8300
2505
8400
3760
8500
4341
8600
5055
8700
3843
8800
3503
8900
4829
9000
2925
9100
2508
9200
2825
9300
2354
9400
5700
9500
5906
9600
5252
9701
5290
9702
5173
9800
6198
9900
4787
10001
3769
10002
4313
10100
6553
10200
4835
10300
5940
10401
4432
10402
4669
10500
6342
10600
7310
10701
3385
10702
3679
10800
4596
10900
1287
11001
4534
11002
4479
11101
4390
11102
4419
11200
3906
11300
5984
11401
4027
11402
4174
11500
4191
11600
6413
11700
5538
11800
7383
11900
7108
12000
3349
12100
2681
20100
3156
20200
5566
20300
6434
20401
3426
20402
5368
20500
6327
20600
3661
20700
3569
20800
4309
20900
3240
21000
5594
21100
4125
21300
3759
21400
3690
21500
4294
21600
4513
21700
7681
21802
5401
21803
4348
21804
4374
21903
5695
21904
5043
21905
4950
21906
3911
22001
4738
22003
5031
22005
4725
22006
3686
22101
4734
22102
5871
22201
4268
22202
7073
22203
4865
22300
2681
22400
7935
22500
7143
22603
5324
22604
4233
22605
5410
22606
5430
22701
2546
22702
3729
22703
2373
22801
8314
22802
4791
22803
4987
22901
2924
22902
6299
23000
5190
23100
3792
23201
6223
23202
4345
23300
6217
23401
3546
23403
3888
23404
3321
23500
3667
23601
3812
23603
5963
23604
5838
23700
3971
23801
2536
23803
3501
23804
3646
23900
6924
24000
7612
24100
4379
24200
2953
24300
7255
24400
2885
24500
4595
24601
4302
24602
3662
24701
3598
24702
7058
24800
5668
24901
5017
24902
3795
24903
6864
25001
4635
25003
6817
25005
4645
25006
5088
25101
5797
25102
7456
25200
7929
25301
5004
25302
4366
25400
6925
25500
4841
25601
4941
25602
5474
25701
5545
25702
3495
25803
4676
25804
3714
25805
4551
25806
4954
26001
5274
26002
7911
26100
6712
26200
4998
26300
1626
26400
5745
26500
3459
26600
2306
26700
5472
26801
5385
26802
4906
27000
3489
27100
3431
27200
2558
27300
6327
27400
5421
27500
4718
27600
4648
27701
5679
27702
4945
27800
3290
27900
6514
28000
3366
28100
2477
28200
4718
28300
4580
28402
5097
28403
4986
28500
3847
28600
6214
28700
5055
28801
3095
28802
6128
28901
3368
28902
6495
29001
4230
29003
5466
29004
3751
29101
2990
29102
4329
29203
3537
29204
6666
29205
4602
29206
3853
29303
7517
29304
4836
29305
3914
29306
3283
29307
3883
29403
5535
29405
5861
29406
5004
29407
4885
29408
4376
29502
6780
29503
7896
29504
5292
29601
7153
29602
5462
29700
7056
29801
7914
29802
8969
29901
4823
29902
4427
30003
5877
30004
8055
30005
5530
30006
3736
30100
7323
30201
5327
30202
6484
30304
2987
30305
4906
30306
5681
30308
6266
30309
5730
30310
5551
30311
5034
30312
4736
30313
3612
30314
3795
30401
8134
30403
4270
30404
4208
30501
1686
30503
4181
30504
4484
30600
5331
30700
3796
30801
6575
30802
3683
30901
3606
30902
5605
31000
3347
31100
6718
31202
6401
31204
6038
31205
6673
31206
4877
31301
2487
31302
4371
31400
5780
31501
3616
31502
3876
31601
5134
31603
5728
31604
4797
31605
3589
31703
6649
31704
5539
31705
3744
31706
6310
31800
4777
31903
6411
31904
3329
31906
3594
31907
7399
31908
4266
31909
3037
32002
2906
32003
4754
32005
6457
32006
4275
32007
3457
32008
4090
32010
5175
32011
4117
32102
4546
32103
4280
32104
6033
32203
6638
32207
3254
32208
7936
32210
10572
32211
4803
32212
7787
32213
4376
32214
6264
32215
4629
32307
6368
32309
5212
32311
5837
32313
5576
32315
5256
32316
5495
32317
6038
32318
5646
32319
4703
32320
4205
32321
4959
32322
3344
32323
5483
32324
5053
32325
5791
32326
3762
32327
3834
32328
3245
32329
6599
32401
5859
32402
6010
32500
5237
32601
2785
32602
11041
32702
6545
32703
2020
32704
6014
32800
2851
990100
0
Observations:
Taking all these observations into account, the query which joins the two tables along the census tract key is:
In [35]:
%%sql
SELECT crimeTable.CT,cast(crimeTable.count as float)/censusTable.population as crime_rate from
(select floor("census tract 2000") as CT, count(*) as count from SeattleCrimeIncidents group by "census tract 2000") as crimeTable,
(select "Total Population, 2010" as population,"Census Tract" as CT from census_data) as censusTable
WHERE crimeTable.CT = censusTable.CT order by "crime_rate" DESC
LIMIT 10;
10 rows affected.
Out[35]:
ct
crime_rate
8100.0
0.0366093366093366
8100.0
0.0314496314496314
8100.0
0.030958230958231
7100.0
0.0307692307692308
8100.0
0.0307125307125307
8200.0
0.0292682926829268
8100.0
0.0285012285012285
1200.0
0.0283338466276563
8100.0
0.0272727272727273
8200.0
0.0268292682926829
Joining using the JOIN command:
In [36]:
%%sql
select crimeTable.CT,cast(crimeTable.count as float)/censusTable.population as crime_rate from
(select round("census tract 2000") as CT, count(*) as count from SeattleCrimeIncidents group by "census tract 2000") crimeTable
join
(select "Total Population, 2010" as population,"Census Tract" as CT from census_data) censusTable
on crimeTable.CT = censusTable.CT order by "crime_rate" DESC
LIMIT 10;
10 rows affected.
Out[36]:
ct
crime_rate
8100.0
0.0366093366093366
8100.0
0.0314496314496314
8100.0
0.030958230958231
7100.0
0.0307692307692308
8100.0
0.0307125307125307
8200.0
0.0292682926829268
8100.0
0.0285012285012285
1200.0
0.0283338466276563
8100.0
0.0272727272727273
8200.0
0.0268292682926829
In [ ]:
In [ ]:
In [ ]:
In [ ]:
Content source: UWSEDS/LectureNotes
Similar notebooks: