Seattle Crime Data


Connect to a database

We will first load the SQL extension:


In [12]:
%load_ext sql


The sql extension is already loaded. To reload it, use:
  %reload_ext sql

We will connect to a database which lives on AWS. The format for connecting to a database is:

database_type://username:password@host/database_name

Also, each sql command requires to begin with the magic %sql.


In [13]:
%sql postgresql://dssg_student:password@seds-sql.csya4zsfb6y4.us-east-1.rds.amazonaws.com/dssg2016


Out[13]:
'Connected: dssg_student@dssg2016'

The tables in the database which we are interested in are:

* seattlecrimenincidents
* census_data

Let's look at the data! We can view the first 10 columns of a table:


In [14]:
%sql select * from seattlecrimeincidents limit 10


10 rows affected.
Out[14]:
Offense Code Offense Code Extension Offense Type Summary Offense Code Summarized Offense Description Date Reported Occurred Date or Date Range Start Occurred Date Range End Hundred Block Location District/Sector Zone/Beat census tract 2000 longitude latitude month year gid geom geom_utm
None 18 DISPUTE-OTH None DISPUTE 2015-01-26 13:25:00 2015-01-25 20:00:00 2015-01-25 20:00:00 MAYNARD AV S / S LUCILE ST O O2 10900.2042 -122.3253949 47.55337429 1 2015 56 0101000020E610000041722145D3945EC0A5FFFEF7D4C64740 0101000020850E0000979B9D69C6CE20419081F48370175441
None 21 DISTURBANCE-OTH None DISTURBANCE 2015-01-29 14:32:00 2015-01-29 14:29:00 None 20XX BLOCK OF 5 AV M M2 7200.1058 -122.3398236 47.61411132 1 2015 1067 0101000020E61000002D1A7CABBF955EC06FC021339BCE4740 0101000020850E0000D749456FD8C52041068E86B7051E5441
None 21 DISTURBANCE-OTH None DISTURBANCE 2015-01-22 04:35:00 2015-01-22 04:35:00 None 105XX BLOCK OF AURORA AV N N N3 1300.2002 -122.3447064 47.70579857 1 2015 468 0101000020E610000034AE6EAB0F965EC056DB879B57DA4740 0101000020850E00009ABDE7414EC220416114B96FF8275441
None 21 DISTURBANCE-OTH None DISTURBANCE 2015-01-17 01:21:00 2015-01-17 01:21:00 None NW 100 ST / 12 AV NW N N1 1400.4013 -122.3714156 47.70152724 1 2015 976 0101000020E61000005ACEEF45C5975EC0CD8604A5CBD94740 0101000020850E00003D8C439CAEB22041891D459B7D275441
1313 0 ASSLT-NONAGG 1300 ASSAULT 2015-02-02 06:48:00 2015-02-02 06:48:00 None 3XX BLOCK OF 9 AV G G1 8500.3001 -122.3234439 47.60412808 2 2015 1268 0101000020E6100000CE250B4EB3945EC0FF18A51154CD4740 0101000020850E0000754B04A589CF2041895883FDF21C5441
None 21 DISTURBANCE-OTH None DISTURBANCE 2015-01-17 01:49:00 2015-01-17 01:45:00 None 22XX BLOCK OF 2 AV D D1 8001.2009 -122.3450035 47.6136555 1 2015 1398 0101000020E61000005B608F8914965EC059C16F438CCE4740 0101000020850E0000619840C1CEC22041213B293AF81D5441
None 47 PROPERTY FOUND None STOLEN PROPERTY 2015-01-21 05:28:00 2015-01-04 00:00:00 None 8XX BLOCK OF VIRGINIA ST D D2 7300.3022 -122.3361655 47.6160533 1 2015 1507 0101000020E6100000C9224DBC83955EC0E30BA4D5DACE4740 0101000020850E00000449DC8CFAC7204114C89E423C1E5441
1313 0 ASSLT-NONAGG 1300 ASSAULT 2015-02-05 20:59:00 2015-02-05 19:31:00 None 15 AV NW / NW MARKET ST B B1 4700.302 -122.3762156 47.6686649 2 2015 1582 0101000020E6100000AE8F98EA13985EC0D7BDBACF96D54740 0101000020850E0000C5414D2F19B0204194E04FCEEB235441
None 47 PROPERTY FOUND None STOLEN PROPERTY 2015-02-10 17:05:00 2015-02-10 17:05:00 None 1XX BLOCK OF BELMONT AV E E E1 7402.1007 -122.324008 47.61907482 2 2015 2538 0101000020E61000001DE90C8CBC945EC0ABD6FCD73DCF4740 0101000020850E000089F479E417CF2041694D1F2F921E5441
1313 0 ASSLT-NONAGG 1300 ASSAULT 2015-02-11 06:59:00 2015-02-11 06:59:00 None 5XX BLOCK OF 3 AV K K2 8100.2043 -122.3310822 47.60241242 2 2015 2630 0101000020E61000006A52657330955EC0574DA5D91BCD4740 0101000020850E00004F88F1AE10CB204190E9BE13C21C5441

In [15]:
%sql select * from census_data limit 10


10 rows affected.
Out[15]:
index County Name Census Tract Total Population, 2010 Persons per Square Mile, 2010 Total Housing Units, 2010 Occupied Housing Units, 2010 Vacant Housing Units, 2010 Occupancy Rate, percent, 2010 Vacancy Rate, percent, 2010 Land Area, Square Miles, 2010 Total Area, Square Miles, 2010 Water Area, percent, 2010 gid
0 King County 100 6255 8479.27 3443 3146 297 91.37 8.63 0.74 1.31 43.76 399
1 King County 200 7646 6025.91 3698 3499 199 94.62 5.38 1.27 1.27 0.0 400
2 King County 300 2603 5569.31 1161 1090 71 93.88 6.12 0.47 0.47 0.0 401
3 King County 401 5551 11215.41 3714 3134 580 84.38 15.62 0.49 0.53 6.0 402
4 King County 402 4841 6969.75 2453 2317 136 94.46 5.54 0.69 0.69 0.0 403
5 King County 500 3165 2843.42 1347 1289 58 95.69 4.31 1.11 1.47 24.44 404
6 King County 600 7626 5220.03 3538 3345 193 94.54 5.46 1.46 1.48 1.41 405
7 King County 700 4438 8878.19 2293 2114 179 92.19 7.81 0.5 0.5 0.0 406
8 King County 800 2583 5860.21 1086 1033 53 95.12 4.88 0.44 0.44 0.0 407
9 King County 900 2019 5247.77 906 836 70 92.27 7.73 0.38 1.36 71.75 408

In [16]:
%%sql
select "Offense Type" from seattlecrimeincidents
limit 10;


10 rows affected.
Out[16]:
Offense Type
DISPUTE-OTH
DISTURBANCE-OTH
DISTURBANCE-OTH
DISTURBANCE-OTH
ASSLT-NONAGG
DISTURBANCE-OTH
PROPERTY FOUND
ASSLT-NONAGG
PROPERTY FOUND
ASSLT-NONAGG

Count the number of rows:


In [17]:
%%sql

select distinct "Offense Type" from seattlecrimeincidents


150 rows affected.
Out[17]:
Offense Type
THEFT OF SERVICES
PROSTITUTION
FRAUD-IDENTITY THEFT
WARRARR-MISDEMEANOR
PROSTITUTION LOITERING
FRAUD-CHECK
THEFT-CARPROWL
PORNOGRAPHY-OBSCENE MATERIAL
ENDANGERMENT
DUI-LIQUOR
THEFT-OTH
THEFT-BOAT
THEFT-SHOPLIFT
WEAPON-UNLAWFUL USE
NARC-POSSESS-OTHER
BURGLARY-FORCE-RES
DUI-DRUGS
FRAUD-CREDIT CARD
THEFT-COINOP
METRO TRANSIT - ON BUS, TUNNEL
THREATS-OTHER
NARC-SELL-MARIJU
ROBBERY-BUSINESS-GUN
BURGLARY-SECURE PARKING-RES
THEFT-BICYCLE
NARC-FOUND-PILL/TABLET
THEFT-PKPOCKET
ROBBERY-RESIDENCE-WEAPON
DISTURBANCE-OTH
VEH-THEFT-MTRCYCLE
GAMBLE-BETTING
ROBBERY-STREET-GUN
ROBBERY-BANK-GUN
NARC-POSSESS-HEROIN
ESCAPE
HOMICIDE-PREMEDITATED-GUN
NARC-FOUND-METH
WARRANT-FUGITIVE
WEAPON-DISCHARGE
RECKLESS BURNING
TRESPASS
NARC-FOUND-MARIJU
ASSLT-AGG-GUN
ILLEGAL DUMPING
BURGLARY-NOFORCE-RES
PROPERTY DAMAGE-RESIDENTIAL
NARC-POSSESS-COCAINE
DISTURBANCE-NOISE
DRIVE-BY
DISPUTE-OTH
VEH-RCVD-FOR OTHER AGENCY
ROBBERY-BUSINESS-BODYFORCE
WEAPON-POSSESSION
NARC-SELL-COCAINE
ASSLT-AGG-POLICE-BODYFORCE
WARRARR-FELONY
NARC-DRUG TRAFFIC LOITERING
NARC-MANUFACTURE-OTHER
EMBEZZLE
FRAUD-OTHER
FRAUD-WELFARE
PROPERTY STOLEN-POSSESS
LOITERING
PROP RECOVERED-OTHER AGENCY
INJURY - ACCIDENTAL
COUNTERFEIT
NARC-FOUND-HALLUCINOGEN
DISPUTE-CIVIL PROPERTY (AUTO)
NARC-PRODUCE-MARIJU
NARC-POSSESS-AMPHETAMINE
NARC-POSSESS-PRESCRIPTION
PROPERTY LOST
VEH-THEFT-TRAILER
NARC-FOUND-OTHER
THEFT-PRSNATCH
NARC-FRAUD-PRESCRIPTION
HOMICIDE-PREMEDITATED-BODYFORC
THREATS-KILL
FALSE REPORT
ASSLT-NONAGG-POLICE
NARC-POSSESS-PILL/TABLET
FORGERY-CHECK
THEFT-AUTO PARTS
BIAS INCIDENT
HARASSMENT
NARC-FORGERY-PRESCRIPTION
VEH-THEFT-TRUCK
PROPERTY STOLEN-TRAFFICKING
NARC-POSSESS-MARIJU
TRAFFIC
NARC-FOUND-AMPHETAMINE
NARC-SELL-METH
[INC - CASE DC USE ONLY]
MALICIOUS HARASSMENT
ROBBERY-STREET-WEAPON
VEH-THEFT-AUTO
NARC-FOUND-HEROIN
PROSTITUTION PATRONIZING
PROPERTY STOLEN-SELL
ROBBERY-BANK-BODYFORCE
FORGERY-OTH
FORGERY-CREDIT CARD
WEAPON-SURRENDER-EXCLUDING FIR
EXTORTION
ANIMAL-OTH
NARC-SELL-AMPHETAMINE
ANIMAL-CRUELTY
ROBBERY-BUSINESS-WEAPON
PROPERTY DAMAGE-NON RESIDENTIA
THEFT-BUILDING
VIOL-COURT ORDER
URINATING/DEFECATING-IN PUBLIC
PROSTITUTION-ASSIST-PROMOTE
FIREWORK-USE
ROBBERY-RESIDENCE-GUN
THEFT-MAIL
ASSLT-NONAGG
PROPERTY FOUND
BURGLARY-SECURE PARKING-NONRES
ROBBERY-STREET-BODYFORCE
NARC-EQUIPMENT/PARAPHENALIA
THREATS-WEAPON
OBSTRUCT
BURGLARY-FORCE-NONRES
ASSLT-AGG-WEAPON
BURGLARY-NOFORCE-NONRES
NARC-FOUND-SYNTHETIC
DISORDERLY CONDUCT
THEFT-AUTOACC
ANIMAL-BITE
PROPERTY DAMAGE - GRAFFITI
WEAPON-CONCEALED
NARC-POSSESS-METH
LIQUOR LAW VIOLATION
HOMICIDE-NEG-MANS-BODYFORCE
ROBBERY-RESIDENCE-BODYFORCE
INJURY - OTHER
HOMICIDE-PREMEDITATED-WEAPON
THEFT-LICENSE PLATE
NARC-SELL-HEROIN
ASSLT-AGG-BODYFORCE
PROPERTY LOST - POLICE EQUIPME
FRAUD-COMPUTER
ASSLT-AGG-POLICE-WEAPON
NARC-FOUND-COCAINE
PROPERTY RECOVERED - POLICE EQ
NARC-SELL-OTHER
FRAUD-WIRE-ELECTRONIC
DISPUTE-CIVIL PROPERTY (NON AU
ELUDING-FELONY FLIGHT

In [18]:
%%sql
SELECT count(*) FROM seattlecrimeincidents;


1 rows affected.
Out[18]:
count
30814

Applying Functions along Columns

Find the range of latitude and longitude coordinates:


In [19]:
%%sql
SELECT min(longitude), max(longitude),min(latitude),max(latitude) FROM seattlecrimeincidents;


1 rows affected.
Out[19]:
min max min_1 max_1
-122.4193685 -122.2282241 47.46221396 47.75651395

Calculate number of all TRESPASS crimes:


In [20]:
%%sql

SELECT count(*) FROM seattlecrimeincidents WHERE "Offense Type" = 'TRESPASS';


1 rows affected.
Out[20]:
count
710

Selecting Rows

Count of the bike thefts in the month of january


In [21]:
%%sql
SELECT * FROM seattlecrimeincidents
	WHERE "Offense Type" = 'THEFT-BICYCLE' and month = 1
    LIMIT 10;


10 rows affected.
Out[21]:
Offense Code Offense Code Extension Offense Type Summary Offense Code Summarized Offense Description Date Reported Occurred Date or Date Range Start Occurred Date Range End Hundred Block Location District/Sector Zone/Beat census tract 2000 longitude latitude month year gid geom geom_utm
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-26 22:01:00 2015-01-26 21:00:00 None 9XX BLOCK OF NW 45 ST B B2 4700.2075 -122.3696925 47.66146818 1 2015 93 0101000020E6100000E544BB0AA9975EC0EA3844FDAAD44740 0101000020850E00009D027297F9B320410CB67ED624235441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-26 17:38:00 2015-01-26 14:00:00 2015-01-26 15:00:00 E PIKE ST / BELLEVUE AV E E3 8400.3004 -122.3268236 47.6140855 1 2015 94 0101000020E6100000E76388ADEA945EC052B9895A9ACE4740 0101000020850E00000004FB5F7ACD204113974119071E5441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-25 10:26:00 2015-01-25 09:55:00 None 67XX BLOCK OF MARY AV NW J J2 3300.101 -122.3761688 47.67867058 1 2015 159 0101000020E6100000A3714D2613985EC0BEED74ADDED64740 0101000020850E00006809D74E0EB02041221A6AD101255441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-25 08:55:00 2015-01-11 12:01:00 2015-01-12 08:45:00 11XX BLOCK OF FAIRVIEW AV N D D3 6600.1031 -122.3288806 47.63030081 1 2015 198 0101000020E61000007D5237610C955EC0D1CB6AB2ADD04740 0101000020850E0000482F590526CC2041BA061B4CC91F5441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-24 15:57:00 2015-01-24 10:00:00 None 1XX BLOCK OF S WASHINGTON ST K K2 9200.2023 -122.3335291 47.60088239 1 2015 270 0101000020E6100000EB30708A58955EC06BE5D2B6E9CC4740 0101000020850E0000F5A56CC1A3C92041E651A62B971C5441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-23 19:26:00 2015-01-21 01:00:00 2015-01-21 08:00:00 2129 1 / 2 1 AV W Q Q2 6800.2007 -122.358572 47.63804375 1 2015 354 0101000020E6100000B950F9D7F2965EC067D5E76AABD14740 0101000020850E0000214ECE90AABA2041C0E23DB69B205441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-23 13:33:00 2015-01-16 17:00:00 2015-01-17 06:00:00 37XX BLOCK OF S HUDSON ST R R3 10300.5014 -122.2853849 47.55704307 1 2015 361 0101000020E61000006C1107BF43925EC0850EF42F4DC74740 0101000020850E0000A28D8E5643E62041F24A882EDD175441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-22 12:04:00 2015-01-21 00:01:00 2015-01-21 14:30:00 19XX BLOCK OF N 44 ST B B3 5100.1021 -122.3343702 47.66030252 1 2015 437 0101000020E61000007270445266955EC0E66E00CB84D44740 0101000020850E0000F0764BF0B3C8204148A73EFE09235441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-22 14:59:00 2015-01-22 10:00:00 2015-01-22 14:00:00 TERRY AV N / MERCER ST D D3 7300.3002 -122.337134 47.62454759 1 2015 467 0101000020E6100000A9177C9A93955EC03DECE82CF1CF4740 0101000020850E0000C4DBB9DA58C72041D8E8201D281F5441
2399 1 THEFT-BICYCLE 2300 BIKE THEFT 2015-01-22 00:39:00 2015-01-21 21:00:00 2015-01-21 23:30:00 5 AV NE / NE NORTHGATE WY L L2 1200.4011 -122.323277 47.70859386 1 2015 499 0101000020E610000023A30392B0945EC0576C1F34B3DA4740 0101000020850E000053635904D8CE20418043178F49285441

In [22]:
%%sql
SELECT count(*) FROM seattlecrimeincidents
	WHERE "Offense Type" = 'THEFT-BICYCLE' and month = 1;


1 rows affected.
Out[22]:
count
94

In [ ]:


Grouping

Count how many offenses are for each Offense Type


In [23]:
%%sql
select "Offense Type",count(*) from SeattleCrimeIncidents
	group by "Offense Type" order by count DESC;


150 rows affected.
Out[23]:
Offense Type count
THEFT-CARPROWL 5334
THEFT-OTH 1828
VEH-THEFT-AUTO 1684
PROPERTY DAMAGE-NON RESIDENTIA 1412
BURGLARY-FORCE-RES 1361
DISTURBANCE-OTH 1356
ASSLT-NONAGG 1210
THEFT-SHOPLIFT 1209
PROPERTY FOUND 1063
FRAUD-IDENTITY THEFT 856
BURGLARY-NOFORCE-RES 844
THEFT-BUILDING 822
THEFT-BICYCLE 741
TRESPASS 710
BURGLARY-FORCE-NONRES 628
BURGLARY-SECURE PARKING-RES 555
WARRARR-FELONY 540
FRAUD-CREDIT CARD 516
HARASSMENT 420
PROPERTY DAMAGE-RESIDENTIAL 398
WARRARR-MISDEMEANOR 397
PROPERTY DAMAGE - GRAFFITI 375
ASSLT-AGG-WEAPON 366
BURGLARY-NOFORCE-NONRES 326
THEFT-LICENSE PLATE 305
THREATS-OTHER 303
ROBBERY-STREET-BODYFORCE 301
TRAFFIC 274
FRAUD-OTHER 226
THEFT-MAIL 204
FRAUD-CHECK 193
THREATS-KILL 180
PROPERTY LOST 175
THEFT-AUTOACC 166
ROBBERY-BUSINESS-BODYFORCE 148
ASSLT-AGG-BODYFORCE 126
VEH-THEFT-TRUCK 118
THEFT OF SERVICES 116
COUNTERFEIT 112
VEH-THEFT-MTRCYCLE 103
VEH-RCVD-FOR OTHER AGENCY 94
DISPUTE-OTH 93
NARC-POSSESS-HEROIN 93
THEFT-PKPOCKET 93
PROPERTY STOLEN-POSSESS 92
PROSTITUTION 90
ASSLT-AGG-GUN 83
VIOL-COURT ORDER 80
PROSTITUTION PATRONIZING 79
NARC-POSSESS-COCAINE 78
WEAPON-POSSESSION 78
NARC-SELL-COCAINE 77
ROBBERY-STREET-GUN 75
THREATS-WEAPON 75
EMBEZZLE 72
ROBBERY-STREET-WEAPON 68
DISTURBANCE-NOISE 64
DISPUTE-CIVIL PROPERTY (NON AU 59
FRAUD-WIRE-ELECTRONIC 50
WEAPON-DISCHARGE 50
BURGLARY-SECURE PARKING-NONRES 50
ASSLT-NONAGG-POLICE 48
ROBBERY-BUSINESS-WEAPON 46
NARC-POSSESS-METH 45
OBSTRUCT 41
PROP RECOVERED-OTHER AGENCY 39
FORGERY-CHECK 39
WARRANT-FUGITIVE 38
ROBBERY-BUSINESS-GUN 35
THEFT-AUTO PARTS 33
ANIMAL-OTH 31
ANIMAL-BITE 30
NARC-FOUND-OTHER 30
NARC-EQUIPMENT/PARAPHENALIA 29
LIQUOR LAW VIOLATION 29
INJURY - OTHER 28
DUI-LIQUOR 25
NARC-FOUND-MARIJU 23
FALSE REPORT 23
WEAPON-UNLAWFUL USE 21
NARC-SELL-HEROIN 21
FRAUD-COMPUTER 19
NARC-FOUND-HEROIN 19
MALICIOUS HARASSMENT 18
THEFT-PRSNATCH 17
VEH-THEFT-TRAILER 17
RECKLESS BURNING 17
ILLEGAL DUMPING 17
NARC-SELL-METH 17
DISPUTE-CIVIL PROPERTY (AUTO) 16
NARC-POSSESS-AMPHETAMINE 15
NARC-DRUG TRAFFIC LOITERING 15
ASSLT-AGG-POLICE-WEAPON 15
ROBBERY-RESIDENCE-WEAPON 14
PROPERTY LOST - POLICE EQUIPME 14
NARC-FORGERY-PRESCRIPTION 13
ROBBERY-RESIDENCE-BODYFORCE 13
DISORDERLY CONDUCT 13
DRIVE-BY 12
NARC-FOUND-METH 12
INJURY - ACCIDENTAL 11
ROBBERY-RESIDENCE-GUN 11
NARC-SELL-MARIJU 11
NARC-POSSESS-MARIJU 11
ELUDING-FELONY FLIGHT 11
ENDANGERMENT 10
FORGERY-OTH 10
WEAPON-SURRENDER-EXCLUDING FIR 10
THEFT-COINOP 10
FORGERY-CREDIT CARD 10
PROPERTY STOLEN-TRAFFICKING 9
NARC-SELL-AMPHETAMINE 9
PROSTITUTION LOITERING 9
[INC - CASE DC USE ONLY] 8
EXTORTION 8
BIAS INCIDENT 8
NARC-FRAUD-PRESCRIPTION 7
THEFT-BOAT 7
ASSLT-AGG-POLICE-BODYFORCE 7
URINATING/DEFECATING-IN PUBLIC 7
PROSTITUTION-ASSIST-PROMOTE 7
NARC-FOUND-COCAINE 7
NARC-POSSESS-OTHER 6
ANIMAL-CRUELTY 6
NARC-FOUND-PILL/TABLET 5
NARC-POSSESS-PILL/TABLET 4
ROBBERY-BANK-BODYFORCE 4
WEAPON-CONCEALED 4
PORNOGRAPHY-OBSCENE MATERIAL 4
HOMICIDE-PREMEDITATED-GUN 4
NARC-FOUND-AMPHETAMINE 4
FIREWORK-USE 3
PROPERTY STOLEN-SELL 3
NARC-POSSESS-PRESCRIPTION 3
LOITERING 2
NARC-PRODUCE-MARIJU 2
ESCAPE 2
NARC-SELL-OTHER 2
DUI-DRUGS 2
HOMICIDE-PREMEDITATED-BODYFORC 2
ROBBERY-BANK-GUN 2
NARC-MANUFACTURE-OTHER 2
HOMICIDE-PREMEDITATED-WEAPON 2
NARC-FOUND-SYNTHETIC 1
FRAUD-WELFARE 1
GAMBLE-BETTING 1
NARC-FOUND-HALLUCINOGEN 1
METRO TRANSIT - ON BUS, TUNNEL 1
PROPERTY RECOVERED - POLICE EQ 1
HOMICIDE-NEG-MANS-BODYFORCE 1

Note: for homicide we see there are a lot of types of homicides -> use summarized offense description


In [24]:
%%sql

select distinct "Summarized Offense Description" from seattlecrimeincidents


52 rows affected.
Out[24]:
Summarized Offense Description
THEFT OF SERVICES
PROSTITUTION
BIKE THEFT
ELUDING
BIAS INCIDENT
BURGLARY
OTHER PROPERTY
FIREWORK
RECKLESS BURNING
DISPUTE
STOLEN PROPERTY
TRAFFIC
SHOPLIFTING
TRESPASS
THREATS
OBSTRUCT
[INC - CASE DC USE ONLY]
ILLEGAL DUMPING
PURSE SNATCH
CAR PROWL
DISORDERLY CONDUCT
PORNOGRAPHY
VIOLATION OF COURT ORDER
INJURY
PUBLIC NUISANCE
METRO
EXTORTION
GAMBLE
HOMICIDE
PICKPOCKET
MAIL THEFT
ANIMAL COMPLAINT
DUI
EMBEZZLE
NARCOTICS
RECOVERED PROPERTY
VEHICLE THEFT
BURGLARY-SECURE PARKING-RES
LOITERING
ASSAULT
FRAUD
DISTURBANCE
ROBBERY
COUNTERFEIT
WARRANT ARREST
WEAPON
LOST PROPERTY
FORGERY
ESCAPE
PROPERTY DAMAGE
FALSE REPORT
LIQUOR VIOLATION

Count how many offenses are for each Summarized Offense Description:


In [25]:
%%sql
select "Summarized Offense Description", count(*) from SeattleCrimeIncidents
	group by "Summarized Offense Description"
    ORDER BY count DESC
    limit 10;


10 rows affected.
Out[25]:
Summarized Offense Description count
CAR PROWL 5334
BURGLARY 3209
OTHER PROPERTY 3164
PROPERTY DAMAGE 2185
VEHICLE THEFT 2023
ASSAULT 1877
FRAUD 1861
DISTURBANCE 1420
SHOPLIFTING 1209
STOLEN PROPERTY 1167

How many crimes per year?


In [26]:
%%sql
select year, count(*) from SeattleCrimeIncidents
group by year;


17 rows affected.
Out[26]:
year count
2010 10
2006 1
2000 1
1999 1
2005 3
2013 35
2009 2
2004 1
2007 5
1990 1
2012 17
2011 8
2002 1
2001 2
2014 691
2008 4
2015 30031

In [ ]:


In [ ]:

How many crimes for each month?


In [27]:
%%sql
SELECT month,count(*) FROM seattlecrimeincidents 
    GROUP BY month 
    ORDER BY month ASC;


12 rows affected.
Out[27]:
month count
1 6339
2 4393
3 4993
4 4931
5 5206
6 4344
7 20
8 19
9 32
10 46
11 104
12 387

Which month is with highest number of bike thefts?


In [28]:
%%sql
SELECT month,count(*) FROM seattlecrimeincidents
	WHERE "Offense Type" = 'THEFT-BICYCLE'
	GROUP BY month
	ORDER BY count DESC;


8 rows affected.
Out[28]:
month count
5 180
6 128
4 118
3 103
2 101
1 94
12 15
11 2

How many crimes per census tract?


In [29]:
%%sql
SELECT "census tract 2000",count(*) FROM seattlecrimeincidents
	group by "census tract 2000"
	ORDER BY count DESC
    LIMIT 10;


10 rows affected.
Out[29]:
census tract 2000 count
1200.401 184
8100.2007 149
7500.4006 142
8100.3003 128
8100.2043 126
8100.2009 125
8100.2002 116
8100.2004 111
7500.4016 105
7300.3022 99

Aliasing

Let's add better column names:


In [30]:
%%sql
SELECT "census tract 2000" as "CensusTract",count(*) as "crime_count" FROM seattlecrimeincidents
	group by "census tract 2000"
	ORDER BY "census tract 2000" DESC
    LIMIT 10;


10 rows affected.
Out[30]:
CensusTract crime_count
None 76
26500.1002 6
26500.1001 3
26500.1 3
26001.1028 1
26001.1027 1
12100.3005 1
12100.2021 2
12100.2019 3
12100.2017 2

Nesting

We cannot simply store the table as a variable, but we can apply several transformations on it by nesting queries.

Extracting the max:


In [31]:
%%sql
SELECT "census tract 2000" as "CensusTract",count(*) as "crime_count" FROM seattlecrimeincidents
	group by "census tract 2000"
    limit 10;


10 rows affected.
Out[31]:
CensusTract crime_count
None 76
7800.1017 1
1400.3 1
300.1013 1
10001.1013 1
9000.1003 9
4700.5004 1
600.301 1
4200.4018 2
4301.2009 3

In [32]:
%%sql
SELECT max(crimeTable.crime_count) FROM 
    (SELECT "census tract 2000" as "CensusTract",count(*) as "crime_count" FROM seattlecrimeincidents
    group by "census tract 2000") as crimeTable;


1 rows affected.
Out[32]:
max
184

Combining Tables

Create Table 1 (crimeTable) which contains the crime count for each tract:

```
|tract | crime_count |
|---|--- |
```

In [33]:
%%sql 

SELECT round("census tract 2000"),count(*) FROM seattlecrimeincidents
	group by "census tract 2000"
	ORDER BY "census tract 2000" ASC
    LIMIT 10;


10 rows affected.
Out[33]:
round count
100.0 8
100.0 1
100.0 1
100.0 1
100.0 1
100.0 1
100.0 3
100.0 1
100.0 2
100.0 1

Create Table 2 (censusTable) which contains the population for each tract:

```
|tract | population |
|---|--- |
```

In [34]:
%%sql
SELECT "Census Tract","Total Population, 2010" as population from census_data
	ORDER BY "Census Tract" ASC;


398 rows affected.
Out[34]:
Census Tract population
100 6255
200 7646
300 2603
401 5551
402 4841
500 3165
600 7626
700 4438
800 2583
900 2019
1000 1941
1100 2503
1200 6494
1300 4340
1400 4848
1500 2454
1600 4131
1701 3818
1702 4234
1800 4535
1900 4157
2000 3406
2100 3807
2200 5286
2400 2972
2500 2802
2600 4660
2700 5289
2800 4481
2900 4177
3000 5637
3100 6219
3200 7683
3300 6224
3400 3302
3500 3936
3600 5672
3800 2058
3900 2772
4000 2690
4100 7789
4200 7618
4301 3531
4302 3518
4400 6260
4500 2676
4600 3251
4700 6739
4800 4539
4900 6139
5000 3600
5100 3662
5200 6076
5301 7085
5302 5706
5400 4948
5600 6382
5700 6001
5801 4743
5802 5059
5900 6836
6000 5150
6100 4796
6200 3848
6300 4962
6400 3207
6500 4157
6600 3540
6700 6544
6800 2834
6900 4080
7000 6959
7100 3055
7200 5318
7300 4668
7401 4115
7402 4921
7500 6282
7600 3498
7700 4476
7800 5145
7900 5147
8001 5588
8002 3013
8100 4070
8200 3280
8300 2505
8400 3760
8500 4341
8600 5055
8700 3843
8800 3503
8900 4829
9000 2925
9100 2508
9200 2825
9300 2354
9400 5700
9500 5906
9600 5252
9701 5290
9702 5173
9800 6198
9900 4787
10001 3769
10002 4313
10100 6553
10200 4835
10300 5940
10401 4432
10402 4669
10500 6342
10600 7310
10701 3385
10702 3679
10800 4596
10900 1287
11001 4534
11002 4479
11101 4390
11102 4419
11200 3906
11300 5984
11401 4027
11402 4174
11500 4191
11600 6413
11700 5538
11800 7383
11900 7108
12000 3349
12100 2681
20100 3156
20200 5566
20300 6434
20401 3426
20402 5368
20500 6327
20600 3661
20700 3569
20800 4309
20900 3240
21000 5594
21100 4125
21300 3759
21400 3690
21500 4294
21600 4513
21700 7681
21802 5401
21803 4348
21804 4374
21903 5695
21904 5043
21905 4950
21906 3911
22001 4738
22003 5031
22005 4725
22006 3686
22101 4734
22102 5871
22201 4268
22202 7073
22203 4865
22300 2681
22400 7935
22500 7143
22603 5324
22604 4233
22605 5410
22606 5430
22701 2546
22702 3729
22703 2373
22801 8314
22802 4791
22803 4987
22901 2924
22902 6299
23000 5190
23100 3792
23201 6223
23202 4345
23300 6217
23401 3546
23403 3888
23404 3321
23500 3667
23601 3812
23603 5963
23604 5838
23700 3971
23801 2536
23803 3501
23804 3646
23900 6924
24000 7612
24100 4379
24200 2953
24300 7255
24400 2885
24500 4595
24601 4302
24602 3662
24701 3598
24702 7058
24800 5668
24901 5017
24902 3795
24903 6864
25001 4635
25003 6817
25005 4645
25006 5088
25101 5797
25102 7456
25200 7929
25301 5004
25302 4366
25400 6925
25500 4841
25601 4941
25602 5474
25701 5545
25702 3495
25803 4676
25804 3714
25805 4551
25806 4954
26001 5274
26002 7911
26100 6712
26200 4998
26300 1626
26400 5745
26500 3459
26600 2306
26700 5472
26801 5385
26802 4906
27000 3489
27100 3431
27200 2558
27300 6327
27400 5421
27500 4718
27600 4648
27701 5679
27702 4945
27800 3290
27900 6514
28000 3366
28100 2477
28200 4718
28300 4580
28402 5097
28403 4986
28500 3847
28600 6214
28700 5055
28801 3095
28802 6128
28901 3368
28902 6495
29001 4230
29003 5466
29004 3751
29101 2990
29102 4329
29203 3537
29204 6666
29205 4602
29206 3853
29303 7517
29304 4836
29305 3914
29306 3283
29307 3883
29403 5535
29405 5861
29406 5004
29407 4885
29408 4376
29502 6780
29503 7896
29504 5292
29601 7153
29602 5462
29700 7056
29801 7914
29802 8969
29901 4823
29902 4427
30003 5877
30004 8055
30005 5530
30006 3736
30100 7323
30201 5327
30202 6484
30304 2987
30305 4906
30306 5681
30308 6266
30309 5730
30310 5551
30311 5034
30312 4736
30313 3612
30314 3795
30401 8134
30403 4270
30404 4208
30501 1686
30503 4181
30504 4484
30600 5331
30700 3796
30801 6575
30802 3683
30901 3606
30902 5605
31000 3347
31100 6718
31202 6401
31204 6038
31205 6673
31206 4877
31301 2487
31302 4371
31400 5780
31501 3616
31502 3876
31601 5134
31603 5728
31604 4797
31605 3589
31703 6649
31704 5539
31705 3744
31706 6310
31800 4777
31903 6411
31904 3329
31906 3594
31907 7399
31908 4266
31909 3037
32002 2906
32003 4754
32005 6457
32006 4275
32007 3457
32008 4090
32010 5175
32011 4117
32102 4546
32103 4280
32104 6033
32203 6638
32207 3254
32208 7936
32210 10572
32211 4803
32212 7787
32213 4376
32214 6264
32215 4629
32307 6368
32309 5212
32311 5837
32313 5576
32315 5256
32316 5495
32317 6038
32318 5646
32319 4703
32320 4205
32321 4959
32322 3344
32323 5483
32324 5053
32325 5791
32326 3762
32327 3834
32328 3245
32329 6599
32401 5859
32402 6010
32500 5237
32601 2785
32602 11041
32702 6545
32703 2020
32704 6014
32800 2851
990100 0

Observations:

  • the common column between the tables contains census tract
  • the tract columns have different names: "census tract 2000" vs "Census Tract"
  • the tract from the crime table has non-integer values (it contains subtracts) -> convert to floor
  • crime rate = crime count/population (SQL returns an integer for division of integers: need to convert to float for float division)

Taking all these observations into account, the query which joins the two tables along the census tract key is:


In [35]:
%%sql
SELECT crimeTable.CT,cast(crimeTable.count as float)/censusTable.population as crime_rate from
	(select floor("census tract 2000") as CT, count(*) as count from SeattleCrimeIncidents group by "census tract 2000") as crimeTable,
    (select "Total Population, 2010" as population,"Census Tract" as CT from census_data) as censusTable
    WHERE crimeTable.CT = censusTable.CT order by "crime_rate" DESC
    LIMIT 10;


10 rows affected.
Out[35]:
ct crime_rate
8100.0 0.0366093366093366
8100.0 0.0314496314496314
8100.0 0.030958230958231
7100.0 0.0307692307692308
8100.0 0.0307125307125307
8200.0 0.0292682926829268
8100.0 0.0285012285012285
1200.0 0.0283338466276563
8100.0 0.0272727272727273
8200.0 0.0268292682926829

Joining using the JOIN command:


In [36]:
%%sql
select crimeTable.CT,cast(crimeTable.count as float)/censusTable.population as crime_rate from 
	(select round("census tract 2000") as CT, count(*) as count from SeattleCrimeIncidents group by "census tract 2000") crimeTable
    join 
    (select "Total Population, 2010" as population,"Census Tract" as CT from census_data) censusTable
    on crimeTable.CT = censusTable.CT order by "crime_rate" DESC
    LIMIT 10;


10 rows affected.
Out[36]:
ct crime_rate
8100.0 0.0366093366093366
8100.0 0.0314496314496314
8100.0 0.030958230958231
7100.0 0.0307692307692308
8100.0 0.0307125307125307
8200.0 0.0292682926829268
8100.0 0.0285012285012285
1200.0 0.0283338466276563
8100.0 0.0272727272727273
8200.0 0.0268292682926829

In [ ]:


In [ ]:


In [ ]:


In [ ]: