Using two datasets you find on the internet, do the following for each:


In [1136]:
import pandas as pd
import numpy as np
import random
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline

1) Open your dataset up using pandas in a Jupyter notebook


In [1137]:
df = pd.read_csv('Mother Jones US Mass Shootings 1982-2016 - US mass shootings.csv')

2) Do a .head() to get a feel for your data


In [1138]:
df.head()


Out[1138]:
Case Location Date Year Summary Fatalities Wounded Total victims Venue Prior signs of possible mental illness ... Where obtained Type of weapons Weapon details Race Gender Sources Mental Health Sources latitude longitude Type
0 Dallas police shooting Dallas, Texas 7/7/2016 2016 Micah Xavier Johnson, a 25-year-old Army veter... 5 11 16 Other (pending) ... online and or gun show Semiautomatic rifle, semiautomatic handguns Izhmash-Saiga 5.45mm (AK-style) semiautomatic ... Black M http://www.nytimes.com/2016/07/11/us/dallas-sh... NaN NaN NaN Mass
1 Orlando nightclub massacre Orlando, Florida 6/12/2016 2016 Omar Mateen, 29, attacked the Pulse nighclub i... 49 53 102 Other (pending) ... Shooting center in Port St. Lucie, Florida Semiautomatic rifle, semiautomatic handgun Sig Sauer MCX rifle, Glock 17 9mm; high-capaci... Other M http://www.motherjones.com/politics/2016/06/as... NaN NaN NaN Mass
2 Excel Industries mass shooting Hesston, Kansas 2/25/2016 2016 Cedric L. Ford, who worked as a painter at a m... 3 14 17 Workplace Unclear ... NaN Semiautomatic rifle, semiautomatic handgun Zastava Serbia AK-47-style rifle, Glock Model ... Black M http://www.nytimes.com/2016/02/26/us/shooting-... NaN NaN NaN Spree
3 Kalamazoo shooting spree Kalamazoo County, Michigan 2/20/2016 2016 Jason B. Dalton, a driver for Uber, apparently... 6 2 8 Other Unclear ... NaN Semiautomatic handgun 9 mm handgun (ammo used unclear) White M http://www.nytimes.com/2016/02/22/us/kalamazoo... NaN NaN NaN Spree
4 San Bernardino mass shooting San Bernardino, California 12/2/2015 2015 Syed Rizwan Farook left a Christmas party held... 14 21 35 \nWorkplace Unclear ... The suspects purchased their handguns in the U... Two assault rifles and two semi-automatic pist... Two semiautomatic AR-15-style rifles—one a DPM... Other Male & Female http://www.motherjones.com/mojo/2015/12/san-be... NaN NaN NaN Mass

5 rows × 22 columns


In [1139]:
df.columns


Out[1139]:
Index(['Case', 'Location', 'Date', 'Year', 'Summary', 'Fatalities', 'Wounded',
       'Total victims', 'Venue', 'Prior signs of possible mental illness',
       'Mental Health', 'Weapons obtained legally', 'Where obtained',
       'Type of weapons', 'Weapon details', 'Race', 'Gender', 'Sources',
       'Mental Health Sources', 'latitude', 'longitude', 'Type'],
      dtype='object')

3) Write down 12 questions to ask your data, or 12 things to hunt for in the data

  1. Are more weapons obtained legally or illegally?
  2. Where are most weapons obtained?
  3. What state has the most mass shootings?
  4. Which mass shooting had the most wounded? / How many wounded?
  5. Which mass shootings had the most victims?
  6. Which mass shooting was the deadliest?
  7. What kind of weapons were used in the deadliest shootings? Are Glocks used more often than not?
  8. What kind of weapons were used in the shootings that had the most victims (wounded and fatal)?
  9. How many shooters showed prior signs of mental illness?
  10. In which kind of venue do most mass shootings occur?
  11. What kind of weapons are most common?
  12. What is the gender of most people who carry out mass shootings?

4) Attempt to answer those questions using the magic of pandas:

1) Are more weapons obtained legally or illegally?


In [1140]:
# Normalize the labels: trim stray whitespace/newlines, then collapse "Yes <extra detail>" to "Yes"
legal = df['Weapons obtained legally'].str.strip()
df['Weapons obtained legally'] = legal.str.replace(r'Yes\s.+', 'Yes', regex=True)

ax = df['Weapons obtained legally'].value_counts().plot(kind='bar', title='Was the weapon obtained legally?')
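
The cleaning step above can be checked on a small stand-in before running it on the real column. This is a minimal sketch: the values in `raw` are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Invented stand-in for the 'Weapons obtained legally' column
raw = pd.Series(["Yes", "\nYes", "Yes ", "Yes (one gun was a gift)", "No", "Unknown", None])

# Trim stray whitespace first, then collapse any "Yes <extra detail>" to plain "Yes"
clean = raw.str.strip().str.replace(r"^Yes\s.+", "Yes", regex=True)

# value_counts() ignores the missing value by default
counts = clean.value_counts()
print(counts)
```

Stripping before the regex means one pattern covers the leading-newline, trailing-space, and extra-detail variants.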


2) Where are most weapons obtained?


In [1141]:
# Flag each broad source category by keyword; na=False treats missing values as "no match"
where = df['Where obtained']
df['Gun Show'] = where.str.contains('[Ss]how', na=False)
df['Online'] = where.str.contains('[Oo]nline|[Ii]nternet', na=False)
df['Family/Friends'] = where.str.contains('[Gg]randfather|[Mm]other|[Ff]ather|[Ff]riend|[Ii]ndividual', na=False)
# Retailer keywords collected from the free-text values (store names, "dealer", "pawn", ...)
store_pattern = (
    "Trading|[Ss]ports|Big|[Ff]irearms|Gander|Galore|[Dd]ealer|[Ss]uppl(?:y|ies|iers)|Fin|"
    "[Ss]tore|[Cc]enter|[Pp]awn|[Rr]etailer|[Ff]lea|[Rr]ange|Frank's|[Ss]ales|"
    "[Ww]arehouse|Bullseye|Outdoorsman"
)
df['Store/Retailer'] = where.str.contains(store_pattern, na=False)
df['Stolen'] = where.str.contains('[Ss]tolen|[Bb]urglary', na=False)
df['Unknown'] = where.str.contains('Unknown|Unclear', na=False) | where.isnull()
df['Issued'] = where.str.contains('[Ii]ssued', na=False)
df['Other'] = where.str.contains('[Tt]hird party|[Aa]ssembled', na=False)
df


Out[1141]:
Case Location Date Year Summary Fatalities Wounded Total victims Venue Prior signs of possible mental illness ... longitude Type Gun Show Online Family/Friends Store/Retailer Stolen Unknown Issued Other
0 Dallas police shooting Dallas, Texas 7/7/2016 2016 Micah Xavier Johnson, a 25-year-old Army veter... 5 11 16 Other (pending) ... NaN Mass True True False False False False False False
1 Orlando nightclub massacre Orlando, Florida 6/12/2016 2016 Omar Mateen, 29, attacked the Pulse nighclub i... 49 53 102 Other (pending) ... NaN Mass False False False True False False False False
2 Excel Industries mass shooting Hesston, Kansas 2/25/2016 2016 Cedric L. Ford, who worked as a painter at a m... 3 14 17 Workplace Unclear ... NaN Spree False False False False False True False False
3 Kalamazoo shooting spree Kalamazoo County, Michigan 2/20/2016 2016 Jason B. Dalton, a driver for Uber, apparently... 6 2 8 Other Unclear ... NaN Spree False False False False False True False False
4 San Bernardino mass shooting San Bernardino, California 12/2/2015 2015 Syed Rizwan Farook left a Christmas party held... 14 21 35 \nWorkplace Unclear ... NaN Mass False False False False False False False True
5 Planned Parenthood clinic Colorado Springs, Colorado 11/27/2015 2015 Robert Lewis Dear, 57, shot and killed a polic... 3 9 12 Workplace Unclear ... NaN Mass False False False False False True False False
6 Colorado Springs shooting rampage Colorado Springs, Colorado 10/31/2015 2015 Noah Harpham, 33, shot three people before dea... 3 0 3 Other Unclear ... NaN Spree False False False False False True False False
7 Umpqua Community College shooting Roseburg, Oregon 10/1/2015 2015 26-year-old Chris Harper Mercer opened fire a... 9 9 18 School Unclear ... NaN Mass False False True False False False False False
8 Chattanooga military recruitment center Chattanooga, Tennessee 7/16/2015 2015 Kuwaiti-born Mohammod Youssuf Abdulazeez, 24, ... 5 2 7 Military Unclear ... -85.311819 Mass False True False False False False False False
9 Charleston Church Shooting Charleston, South Carolina 6/17/2015 2015 Dylann Storm Roof, 21, shot and killed 9 peopl... 9 1 10 Religious unknown ... -79.933143 Mass False False False True False False False False
10 Trestle Trail bridge shooting Menasha, Wisconsin 6/11/2015 2015 Sergio Valencia del Toro, 27, in what official... 3 1 4 Other Yes ... NaN Mass False False False False False True False False
11 Marysville-Pilchuck High School shooting Marysville, Washington 10/24/2014 2014 Jaylen Fryberg, 15, using a .40-caliber Berret... 5 1 6 School Unclear ... -122.176918 Mass False False True False False False False False
12 Isla Vista mass murder Santa Barbara, California 5/23/2014 2014 Elliot Rodger, 22, shot three people to death ... 6 13 19 School Yes ... NaN Spree False False False False False True False False
13 Fort Hood shooting 2 Fort Hood, Texas 4/3/2014 2014 Army Specialist Ivan Lopez, 34, opened fire at... 3 12 15 Military Unclear ... NaN Mass False False False True False False False False
14 Alturas tribal shooting Alturas, California 2/20/2014 2014 Cherie Lash Rhoades, 44, opened fire at the Ce... 4 2 6 Other Unknown ... -120.542237 Mass False False False False False True False False
15 Washington Navy Yard shooting Washington, D.C. 9/16/2013 2013 Aaron Alexis, 34, a military veteran and contr... 12 8 20 Military Yes ... -76.994530 Mass False False False True False False False False
16 Hialeah apartment shooting Hialeah, Florida 7/26/2013 2013 Pedro Vargas, 42, set fire to his apartment, k... 7 0 7 Other\n Unclear ... -80.291463 Mass False False False True False False False False
17 Santa Monica rampage Santa Monica, California 6/7/2013 2013 John Zawahri, 23, armed with a homemade assaul... 6 3 9 Other\n Yes ... -118.494754 Spree False False False False False False False True
18 Pinewood Village Apartment shooting Federal Way, Washington 4/21/2013 2013 Dennis Clark III, 27, shot and killed his girl... 5 0 5 Other\n No ... -122.339366 Mass False False False False False True False False
19 Mohawk Valley shootings Herkimer County, New York 3/13/2013 2013 Kurt Myers, 64, shot six people in neighboring... 5 2 7 Other No ... -74.984891 Spree False False False True False False False False
20 Newtown school shooting Newtown, Connecticut 12/14/2012 2012 Adam Lanza, 20, shot his mother dead at their ... 28 2 30 School Yes ... -73.311424 Mass False False True False True False False False
21 Accent Signage Systems shooting Minneapolis, Minnesota 9/27/2012 2012 Andrew Engeldinger, 36, upon learning he was b... 7 1 8 Workplace Yes ... -93.265469 Mass False False False False False True False False
22 Sikh temple shooting Oak Creek, Wisconsin 8/5/2012 2012 U.S. Army veteran Wade Michael Page, 40, opene... 7 3 10 Religious Yes ... -87.863136 Mass False False False False False True False False
23 Aurora theater shooting Aurora, Colorado 7/20/2012 2012 James Holmes, 24, opened fire in a movie theat... 12 58 70 Other Yes ... -104.823488 Mass False False False True False False False False
24 Seattle cafe shooting Seattle, Washington 5/20/2012 2012 Ian Stawicki, 40, gunned down four patrons at ... 6 1 7 Other Yes ... -122.330062 Spree False False False True False False False False
25 Oikos University killings Oakland, California 4/2/2012 2012 One L. Goh, 43, a former student, opened fire ... 7 3 10 School Yes ... -122.270817 Mass False False False True False False False False
26 Su Jung Health Sauna shooting Norcross, Georgia 2/22/2012 2012 Jeong Soo Paek, 59, returned to a Korean spa f... 5 0 5 Other Yes ... -84.213531 Mass False False False False False True False False
27 Seal Beach shooting Seal Beach, California 10/14/2011 2011 Scott Evans Dekraai, 42, opened fire inside a ... 8 1 9 Other Yes ... -118.104636 Mass False False False False False True False False
28 IHOP shooting Carson City, Nevada 9/6/2011 2011 Eduardo Sencion, 32, opened fire at an Interna... 5 7 12 Other Yes ... -119.767403 Mass False False True False False False False False
29 Tucson shooting Tucson, Arizona 1/8/2011 2011 Jared Loughner, 22, opened fire outside a Safe... 6 13 19 Other Yes ... -110.926479 Mass False False False True False False False False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
52 Xerox killings Honolulu, Hawaii 11/2/1999 1999 Byran Koji Uyesugi, 40, a Xerox service techni... 7 0 7 Workplace Yes ... -157.847306 Mass False False False True False False False False
53 Wedgwood Baptist Church shooting Fort Worth, Texas 9/15/1999 1999 Larry Gene Ashbrook, 47, opened fire inside th... 8 7 15 Religious Yes ... -97.470671 Mass False False False True False False False False
54 Atlanta day trading spree killings Atlanta, Georgia 7/29/1999 1999 Day trader Mark O. Barton, 44, who had recentl... 9 13 22 Workplace Yes ... -84.390185 Spree False False False True False False False False
55 Columbine High School massacre Littleton, Colorado 4/20/1999 1999 Eric Harris, 18, and Dylan Klebold, 17, opened... 13 24 37 School Yes ... -104.987727 Mass True False True False False False False False
56 Thurston High School shooting Springfield, Oregon 5/21/1998 1998 After he was expelled for having a gun in his ... 4 25 29 School Yes ... -123.022029 Spree False False True False True False False False
57 Westside Middle School killings Jonesboro, Arkansas 3/24/1998 1998 Mitchell Scott Johnson, 13, and Andrew Douglas... 5 10 15 School No ... -90.668261 Mass False False True False True False False False
58 Connecticut Lottery shooting Newington, Connecticut 3/6/1998 1998 Lottery worker Matthew Beck, 35, gunned down f... 5 1 6 Workplace Yes ... -72.729838 Mass False False False False False True False False
59 Caltrans maintenance yard shooting Orange, California 12/18/1997 1997 Former Caltrans employee Arturo Reyes Torres, ... 5 2 7 Workplace No ... -117.853112 Mass False False False True False False False False
60 R.E. Phelon Company shooting Aiken, South Carolina 9/15/1997 1997 Ex-con Hastings Arthur Wise, 43, opened fire a... 4 3 7 Workplace No ... -81.721952 Mass False False False False False True False False
61 Fort Lauderdale revenge shooting Fort Lauderdale, Florida 2/9/1996 1996 Fired city park employee Clifton McCree, 41, o... 6 1 7 Workplace Yes ... -80.143379 Mass False False False False False True False False
62 Walter Rossler Company massacre Corpus Christi, Texas 4/3/1995 1995 Disgruntled former metallurgist James Daniel S... 6 0 6 Workplace No ... -97.417398 Mass False False False False False True False False
63 Air Force base shooting Fairchild Air Force Base, Washington 6/20/1994 1994 Former airman Dean Allen Mellberg, 20, opened ... 5 23 28 Military Yes ... -117.648359 Mass False False False True False False False False
64 Chuck E. Cheese's killings Aurora, Colorado 12/14/1993 1993 Nathan Dunlap, 19, a recently fired Chuck E. C... 4 1 5 Workplace Unclear ... -104.835869 Mass False False False False False True False False
65 Long Island Rail Road massacre Garden City, New York 12/7/1993 1993 Colin Ferguson, 35, opened fire on an eastboun... 6 19 25 Other Yes ... -73.634295 Mass False False False True False False False False
66 Luigi's shooting Fayetteville, North Carolina 8/6/1993 1993 Army Sgt. Kenneth Junior French, 22, opened fi... 4 8 12 Other No ... -78.878706 Mass False False False False False True False False
67 101 California Street shootings San Francisco, California 7/1/1993 1993 Failed businessman Gian Luigi Ferri, 55, opene... 9 6 15 Other No ... -122.419199 Mass False False False True False False False False
68 Watkins Glen killings Watkins Glen, New York 10/15/1992 1992 John T. Miller, 50, killed four child-support ... 5 0 5 Other Yes ... -76.870578 Mass False False False True False False False False
69 Lindhurst High School shooting Olivehurst, California 5/1/1992 1992 Former Lindhurst High School student Eric Hous... 4 10 14 School No ... -121.547576 Mass False False False True False False False False
70 Royal Oak postal shootings Royal Oak, Michigan 11/14/1991 1991 Laid-off postal worker Thomas McIlvane, 31, op... 5 5 10 Workplace Yes ... -83.144649 Mass False False False True False False False False
71 University of Iowa shooting Iowa City, Iowa 11/1/1991 1991 Former graduate student Gang Lu, 28, went on a... 6 1 7 School Unclear ... -91.530221 Mass False False False True False False False False
72 Luby's massacre Killeen, Texas 10/16/1991 1991 George Hennard, 35, drove his pickup truck int... 24 20 44 Other No ... -97.727796 Mass False False False False False False False False
73 GMAC massacre Jacksonville, Florida 6/18/1990 1990 James Edward Pough, 42, opened fire at a Gener... 10 4 14 Other No ... -81.655651 Mass False False False False False True False False
74 Standard Gravure shooting Louisville, Kentucky 9/14/1989 1989 Joseph T. Wesbecker, 47, gunned down eight peo... 9 12 21 Workplace Yes ... -85.759407 Mass False False False True False False False False
75 Stockton schoolyard shooting Stockton, California 1/17/1989 1989 Patrick Purdy, 26, an alcoholic with a police ... 6 29 35 School Yes ... -121.290780 Mass False False False True False False False False
76 ESL shooting Sunnyvale, California 2/16/1988 1988 Former ESL Incorporated employee Richard Farle... 7 4 11 Workplace Yes ... -122.036350 Mass False False False True False False False False
77 Shopping centers spree killings Palm Bay, Florida 4/23/1987 1987 Retired librarian William Cruse, 59, was paran... 6 14 20 Other Yes ... -80.642969 Spree False False False True False False False False
78 United States Postal Service shooting Edmond, Oklahoma 8/20/1986 1986 Postal worker Patrick Sherrill, 44, opened fir... 15 6 21 Workplace Unclear ... -97.429370 Mass False False False False False False True False
79 San Ysidro McDonald's massacre San Ysidro, California 7/18/1984 1984 James Oliver Huberty, 41, opened fire in a McD... 22 19 41 Other Yes ... -117.043081 Mass False False False False False True False False
80 Dallas nightclub shooting Dallas, Texas 6/29/1984 1984 Abdelkrim Belachheb, 39, opened fire at an ups... 6 1 7 Other Yes ... -96.800008 Mass False False False True False False False False
81 Welding shop shooting Miami, Florida 8/20/1982 1982 Junior high school teacher Carl Robert Brown, ... 8 3 11 Other Yes ... -80.226683 Mass False False False True False False False False

82 rows × 30 columns


In [1142]:
df['Gun Show'].value_counts()


Out[1142]:
False    80
True      2
Name: Gun Show, dtype: int64

In [1143]:
df['Online'].value_counts()


Out[1143]:
False    78
True      4
Name: Online, dtype: int64

In [1144]:
df['Family/Friends'].value_counts()


Out[1144]:
False    68
True     14
Name: Family/Friends, dtype: int64

In [1145]:
df['Store/Retailer'].value_counts()


Out[1145]:
False    42
True     40
Name: Store/Retailer, dtype: int64

In [1146]:
df['Stolen'].value_counts()


Out[1146]:
False    75
True      7
Name: Stolen, dtype: int64

In [1147]:
df['Issued'].value_counts()


Out[1147]:
False    80
True      2
Name: Issued, dtype: int64

In [1148]:
df['Unknown'].value_counts()


Out[1148]:
False    60
True     22
Name: Unknown, dtype: int64
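
Calling `value_counts()` once per flag column works, but the totals can also be collected in one step. The sketch below is one possible way to do that, under stated assumptions: the `where` values and the shortened keyword `patterns` are hypothetical, and `case=False` stands in for the `[Ss]`-style character classes.

```python
import pandas as pd

# Invented stand-in for the 'Where obtained' column
where = pd.Series([
    "Gun show",
    "Local gun store",
    None,
    "Stolen from a friend",
    "online and gun show",
])

# Hypothetical, shortened keyword patterns: one regex per category
patterns = {
    "Gun Show": r"show",
    "Online": r"online|internet",
    "Store/Retailer": r"store|dealer|pawn",
    "Stolen": r"stolen|burglary",
}

# One boolean column per category; na=False counts missing text as "no match"
flags = pd.DataFrame({
    name: where.str.contains(pattern, case=False, na=False)
    for name, pattern in patterns.items()
})
flags["Unknown"] = where.isnull()

# Summing booleans counts the True values in every column at once
totals = flags.sum().sort_values(ascending=False)
print(totals)
```

Because a row can match several categories (the last row is both a gun show and an online source), the totals can add up to more than the number of rows.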

3) Which state has the most mass shootings?


In [1149]:
# Split each 'City, State' string on the comma and keep the second piece (the state).
# The state keeps the leading space that followed the comma, as the output below shows.
df['State'] = df['Location'].str.split(',').str[1]

# Count shootings per state and show the top five
df['State'].value_counts().head(5)


Out[1149]:
 California    13
 Florida        7
 Texas          7
 Washington     6
 Colorado       5
Name: State, dtype: int64
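
The leading space in the state names above comes from splitting on `','` without trimming. A minimal sketch of a trimmed version, using four location strings that appear in the table earlier in this notebook:

```python
import pandas as pd

# Location strings copied from rows shown earlier in this notebook
locations = pd.Series([
    "Dallas, Texas",
    "Orlando, Florida",
    "Fort Hood, Texas",
    "Washington, D.C.",
])

# Split once from the right so only the last comma separates city and state,
# then strip the space that followed the comma
states = locations.str.rsplit(",", n=1).str[-1].str.strip()

counts = states.value_counts()
print(counts)
```

Trimming matters for counting: `' Texas'` and `'Texas'` would otherwise be tallied as two different states.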

4) Which mass shooting had the most wounded? / How many were wounded?


In [1150]:
# idxmax() returns the index label of the row with the largest value;
# use it directly instead of hard-coding the row position
df.loc[df['Wounded'].idxmax(), 'Case']


Out[1150]:
'Aurora theater shooting'

In [1151]:
df['Wounded'].max()


Out[1151]:
58
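
Hard-coding a row position is fragile, because the row with the most wounded is not necessarily the row with the most fatalities or the most total victims. The sketch below looks up each maximum separately on a three-row sample; the numbers are copied from the Dallas, Orlando and Aurora rows shown earlier in this notebook, and the `sample` and `winners` names are only for illustration.

```python
import pandas as pd

# Three rows copied from the table shown earlier in this notebook
sample = pd.DataFrame({
    "Case": [
        "Dallas police shooting",
        "Orlando nightclub massacre",
        "Aurora theater shooting",
    ],
    "Fatalities": [5, 49, 12],
    "Wounded": [11, 53, 58],
    "Total victims": [16, 102, 70],
})

# For each measure, find the index label of the maximum and read the case name there
winners = {
    column: sample.loc[sample[column].idxmax(), "Case"]
    for column in ["Wounded", "Fatalities", "Total victims"]
}

for column, case in winners.items():
    print(column, "->", case, sample[column].max())
```

In this sample the Aurora row leads on wounded, while the Orlando row leads on both fatalities and total victims.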

5) Which mass shootings had the most victims?


In [1152]:
# Look up the case at the row idxmax() returns rather than at a hard-coded position
df.loc[df['Total victims'].idxmax(), 'Case']


Out[1152]:
'Orlando nightclub massacre'

6) Which mass shooting was the deadliest?


In [1153]:
df['Fatalities'].idxmax() # returns index of maximum of values in a series


Out[1153]:
1

In [1154]:
df['Case'].iloc[1]


Out[1154]:
'Orlando nightclub massacre'

7) What kind of weapons were used in the deadliest shootings? / Do shooters use a Glock more often than not?


In [1155]:
df['Weapon details'].iloc[1] #case = Orlando nightclub


Out[1155]:
'Sig Sauer MCX rifle, Glock 17 9mm; high-capacity magazines (30 rounds)'

In [1156]:
# '[Gg]lock' already matches 'Glocks'; na=False counts missing weapon details as "no Glock"
df['Glock'] = df['Weapon details'].str.contains('[Gg]lock', na=False)
df['Glock'].value_counts().plot(kind='bar', title = 'Shooter used a Glock?')


Out[1156]:
<matplotlib.axes._subplots.AxesSubplot at 0x10e66fa20>

8) What kind of weapons were used in the shootings that had the most victims (wounded and fatal)?


In [1159]:
# Row with the most total victims (Orlando nightclub massacre, 102 victims)
df.loc[df['Total victims'].idxmax(), 'Weapon details']


Out[1159]:
'Sig Sauer MCX rifle, Glock 17 9mm; high-capacity magazines (30 rounds)'

9) How many shooters showed prior signs of mental illness?


In [1157]:
# Fold every "no clear answer" label into a single 'Unknown' category.
# regex=False makes '(pending)' match literally instead of being read as a regex group.
col = 'Prior signs of possible mental illness'
df[col] = df[col].str.strip()
for label in ['Unclear', 'unknown', '(pending)']:
    df[col] = df[col].str.replace(label, 'Unknown', regex=False)
df[col].value_counts().plot(kind='bar', title='Prior Signs of Mental Illness?')


Out[1157]:
<matplotlib.axes._subplots.AxesSubplot at 0x10ebab630>
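
A label such as `(pending)` is easy to mishandle, because parentheses form a group when the pattern is treated as a regular expression. The sketch below shows the difference and one alternative: mapping whole values with `Series.replace` and a dict. The label values are the ones that appear in this column; the `labels` and `mapping` names are only for illustration.

```python
import pandas as pd

# Treated as a regex, '(pending)' is a group that matches only the word inside
as_regex = pd.Series(["(pending)"]).str.replace("(pending)", "Unknown", regex=True)
as_literal = pd.Series(["(pending)"]).str.replace("(pending)", "Unknown", regex=False)
print(as_regex.iloc[0], "vs", as_literal.iloc[0])

# Alternative: strip whitespace, then map whole values with a dict
labels = pd.Series(["Yes", "No", "Unclear", "unknown", "(pending)", "(pending) ", None])
mapping = {"Unclear": "Unknown", "unknown": "Unknown", "(pending)": "Unknown"}
cleaned = labels.str.strip().replace(mapping)

counts = cleaned.value_counts()
print(counts)
```

The dict form replaces only values that match a key exactly, so it cannot accidentally rewrite part of a longer label.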

10) In which kind of venue do most mass shootings occur?


In [1158]:
# Strip the stray newlines so 'Other\n' and '\nWorkplace' fold into their categories
df['Venue'] = df['Venue'].str.strip()
df['Venue'].value_counts()


Out[1158]:
Other        35
Workplace    23
School       15
Military      5
Religious     4
Name: Venue, dtype: int64