In [34]:
#Page 1
'''
Read the dataset in as a list using the csv module.
Import the csv module.
Open the file using the open() function.
Use the csv.reader() function to load the opened file.
Call list() on the result to get a list of all the data in the file.
Assign the result to the variable data.
Display the first 5 rows of data to verify everything.
'''
import csv

# Open the file in a with block so it is closed once the rows are read.
with open("guns.csv", "r", newline="") as f:
    data = list(csv.reader(f))
print(data[:5])


[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]

In [35]:
#Page 2
'''
Extract the first row of data, and assign it to the variable headers.
Remove the first row from data.
Display headers.
Display the first 5 rows of data to verify that you removed the header row properly.
'''
# The header row is the first row, at index 0.
headers = data[0]
data = data[1:]
print(headers)
print(data[:5])


['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']
[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]
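
Another way to separate the header from the data rows, without slicing, is to call next() on the reader before building the list. The sketch below assumes a small in-memory sample (the `sample` string is hypothetical and stands in for guns.csv, with two rows copied from the output above) so it can run on its own.

```python
import csv
import io

# Hypothetical two-row sample standing in for guns.csv.
sample = (
    ",year,month,intent,police,sex,age,race,hispanic,place,education\n"
    "1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n"
    "2,2012,01,Suicide,0,F,21,White,100,Street,3\n"
)

reader = csv.reader(io.StringIO(sample))
headers = next(reader)   # consumes the first row only
rows = list(reader)      # everything after the header

print(headers)
print(len(rows))
```

With this approach the header never enters the data list, so there is no index to get wrong when removing it.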

In [36]:
#Page 3
'''
Use a list comprehension to extract the year column from data.
Because the year column is the second column in the data, you'll need to get the element at index 1 in each row.
Assign the result to the variable years.
Create an empty dictionary called year_counts.
Loop through each element in years.
If the element isn't a key in year_counts, create it, and set the value to 1.
If the element is a key in year_counts, increment the value by one.
Display year_counts to see how many gun deaths occur in each year.
'''
years = [row[1] for row in data]
year_counts = {}
for y in years:
    if y in year_counts:
        year_counts[y] += 1
    else:
        year_counts[y] = 1
year_counts


Out[36]:
{'2012': 33563, '2013': 33636, '2014': 33599}
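
The counting loop above follows the same pattern as collections.Counter from the standard library. A minimal sketch, assuming a short hypothetical list of year strings rather than the full year column:

```python
from collections import Counter

# Hypothetical sample of year values, not the real column.
sample_years = ["2012", "2012", "2013", "2014", "2013", "2012"]

# Counter tallies each distinct value, like the manual if/else loop.
sample_year_counts = Counter(sample_years)

print(dict(sample_year_counts))
```

A Counter returns 0 for a key it has never seen, so no membership check is needed before incrementing.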

In [37]:
#Page 4
'''
Use a list comprehension to create a datetime.datetime object for each row. Assign the result to dates.
The year column is the second element in each row.
The month column is the third element in each row.
Make sure to convert year and month to integers using int().
Pass year, month, and day=1 into the datetime.datetime() function.
Display the first 5 rows in dates to verify everything worked.
Count up how many times each unique date occurs in dates. Assign the result to date_counts.
This follows a similar procedure to what we did in the last screen with year_counts.
Display date_counts.
'''
import datetime as dt

# Build one datetime per row from the year (index 1) and month (index 2) columns.
# The dataset has no day column, so day=1 is a placeholder.
dates = [dt.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]

date_counts = {}
for date in dates:
    if date in date_counts:
        date_counts[date] += 1
    else:
        date_counts[date] = 1
date_counts


Out[37]:
{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11, 1, 0, 0): 2758,
 datetime.datetime(2013, 12, 1, 0, 0): 2765,
 datetime.datetime(2014, 1, 1, 0, 0): 2651,
 datetime.datetime(2014, 2, 1, 0, 0): 2361,
 datetime.datetime(2014, 3, 1, 0, 0): 2684,
 datetime.datetime(2014, 4, 1, 0, 0): 2862,
 datetime.datetime(2014, 5, 1, 0, 0): 2864,
 datetime.datetime(2014, 6, 1, 0, 0): 2931,
 datetime.datetime(2014, 7, 1, 0, 0): 2884,
 datetime.datetime(2014, 8, 1, 0, 0): 2970,
 datetime.datetime(2014, 9, 1, 0, 0): 2914,
 datetime.datetime(2014, 10, 1, 0, 0): 2865,
 datetime.datetime(2014, 11, 1, 0, 0): 2756,
 datetime.datetime(2014, 12, 1, 0, 0): 2857}
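
Because datetime objects compare chronologically, monthly counts can be listed in calendar order with sorted(). The sketch below uses a hypothetical three-entry dictionary (counts copied from the first three months of 2012 above, with a placeholder day of 1) and strftime() to print a compact year-month label.

```python
import datetime as dt

# Hypothetical subset of the monthly counts; the day value is a placeholder.
sample_date_counts = {
    dt.datetime(2012, 3, 1): 2743,
    dt.datetime(2012, 1, 1): 2758,
    dt.datetime(2012, 2, 1): 2357,
}

# Sort by key (chronological) and label each month as YYYY-MM.
monthly = [(date.strftime("%Y-%m"), count)
           for date, count in sorted(sample_date_counts.items())]

for label, count in monthly:
    print(label, count)
```

Sorting explicitly is useful when the counts are printed rather than displayed as cell output, since a plain dictionary keeps insertion order.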

In [38]:
#Page 5
'''
Count up how many times each item in the sex column occurs.
Assign the result to sex_counts.
Count up how many times each item in the race column occurs.
Assign the result to race_counts.
Display race_counts and sex_counts to verify your work, and see if you can spot any patterns.
Write a markdown cell detailing what you've learned so far, and what you think might need further examination.
'''
sex_counts = {}
race_counts = {}
for row in data:
    if row[5] in sex_counts:
        sex_counts[row[5]] += 1
    else:
        sex_counts[row[5]] = 1
    if row[7] in race_counts:
        race_counts[row[7]] += 1
    else:
        race_counts[row[7]] = 1
print(sex_counts)
print(race_counts)


{'M': 86349, 'F': 14449}
{'Hispanic': 9022, 'White': 66237, 'Asian/Pacific Islander': 1326, 'Native American/Native Alaskan': 917, 'Black': 23296}
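
Raw counts are often easier to compare as shares of the total. A small sketch, using the sex counts printed above and a hypothetical helper named `shares`:

```python
def shares(counts):
    """Return each count as a fraction of the total (hypothetical helper)."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}

# Values copied from the printed output above.
printed_sex_counts = {"M": 86349, "F": 14449}
sex_shares = shares(printed_sex_counts)

for key, share in sex_shares.items():
    print(key, round(share * 100, 1))
```

The same helper would work on the race counts, although shares of deaths alone say nothing about population size; that is what the per-100000 rates later in the notebook address.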

In [60]:
#Page 6
'''
Read in census.csv, and convert to a list of lists. Assign the result to the census variable.
Display census to verify your work.
'''
with open("census.csv", "r", newline="") as f:
    census = list(csv.reader(f))
census


Out[60]:
[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]
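
Looking columns up by name avoids hard-coded positions when working with the census row. The sketch below assumes a trimmed copy of the header and data row shown above (only the six race columns are kept) and pairs them with zip(). The full header repeats the name 'Id', so a dictionary built from the whole row would keep only the last 'Id' value.

```python
# Trimmed copy of the census header and row shown above (race columns only).
census_header = [
    "Race Alone - White",
    "Race Alone - Hispanic",
    "Race Alone - Black or African American",
    "Race Alone - American Indian and Alaska Native",
    "Race Alone - Asian",
    "Race Alone - Native Hawaiian and Other Pacific Islander",
]
census_values = ["197318956", "44618105", "40250635", "3739506", "15159516", "674625"]

# Pair each column name with its value and convert the strings to int.
population = {name: int(value) for name, value in zip(census_header, census_values)}

print(population["Race Alone - Asian"])
```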

In [80]:
#Page 7
'''
Manually create a dictionary, mapping, that maps each key from race_counts to the population count of the race from census.
The keys in the dictionary should be Asian/Pacific Islander, Black, Native American/Native Alaskan, Hispanic, and White.
In the case of Asian/Pacific Islander, you'll need to add the counts from census for Race Alone - Asian, and Race Alone - Native Hawaiian and Other Pacific Islander.
Create an empty dictionary, race_per_hundredk.
Loop through each key in race_counts.
Divide the value associated with the key in race_counts by the value associated with the key in mapping.
Multiply by 100000.
Assign the result to the same key in race_per_hundredk.
When you're done, race_per_hundredk should contain the rate of gun deaths per 100000 people for each racial category.
Print race_per_hundredk to verify your work.
'''
# census[0] is the header row; census[1] holds the national totals.
census_row = census[1]
mapping = {
    "White": int(census_row[10]),
    "Hispanic": int(census_row[11]),
    "Black": int(census_row[12]),
    "Native American/Native Alaskan": int(census_row[13]),
    # The gun data combines Asian and Pacific Islander, so add both census columns.
    "Asian/Pacific Islander": int(census_row[14]) + int(census_row[15]),
}

race_per_hundredk = {}
for race in race_counts:
    race_per_hundredk[race] = (race_counts[race] / mapping[race]) * 100000
print(race_per_hundredk)


{'Hispanic': 20.220491210910907, 'Asian/Pacific Islander': 8.374309664161762, 'White': 33.56849303419181, 'Black': 57.8773477735196, 'Native American/Native Alaskan': 24.521955573811088}
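
The per-100000 calculation can be wrapped in a small function so it can be reused for other counts. This is a sketch: `rates_per_100k` is a hypothetical helper, and the inputs are two categories copied from the outputs above.

```python
def rates_per_100k(counts, population):
    """Divide each count by its population and scale to 100000 people."""
    return {key: counts[key] / population[key] * 100000 for key in counts}

# Counts and populations copied from the outputs above (two categories only).
sample_counts = {"Black": 23296, "White": 66237}
sample_population = {"Black": 40250635, "White": 197318956}

sample_rates = rates_per_100k(sample_counts, sample_population)
print(sample_rates)
```

Note that the death counts cover 2012 through 2014 while the population figures come from the April 1, 2010 census, so these values are three-year totals per 100000 people, not annual rates.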

In [85]:
#Page 8
'''
Extract the intent column using a list comprehension. The intent column is the fourth column in data.
Assign the result to intents.
Extract the race column using a list comprehension. The race column is the eighth column in data.
Assign the result to races.
Create an empty dictionary called homicide_race_counts.
Use the enumerate() function to loop through each item in races. The position should be assigned to the loop variable i, and the value to the loop variable race.
Check the value at position i in intents.
If the value at position i in intents is Homicide:
If the key race doesn't exist in homicide_race_counts, create it.
Add 1 to the value associated with race in homicide_race_counts.
When you're done, homicide_race_counts should have one key for each of the racial categories in data. The associated value should be the number of gun deaths by homicide for that race.
Perform the same procedure we did in the last screen using mapping on homicide_race_counts to get from raw numbers to rates per 100000.
Display homicide_race_counts to verify your work.
Write up your findings in a markdown cell.
Write up any next steps you want to pursue with the data in a markdown cell.
'''
intents = [row[3] for row in data]
races = [row[7] for row in data]
homicide_race_counts = {}
for i, race in enumerate(races):
    if intents[i] == "Homicide":
        if race in homicide_race_counts:
            homicide_race_counts[race] += 1
        else:
            homicide_race_counts[race] = 1

race_per_hundredk2 = {}
for race in homicide_race_counts:
    race_per_hundredk2[race] = (homicide_race_counts[race] / mapping[race]) * 100000
print(race_per_hundredk2)

# Next step: examine gun homicides in the Black category, which has by far the highest rate per 100000.


{'Hispanic': 12.627161104219914, 'Asian/Pacific Islander': 3.530346230970155, 'White': 4.6356417981453335, 'Black': 48.471284987180944, 'Native American/Native Alaskan': 8.717729026240365}
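
An equivalent way to pair the two columns is zip(), which avoids indexing into intents by position. A minimal sketch on a hypothetical five-row sample (the task above asks for enumerate(), so this is an alternative, not a replacement):

```python
# Hypothetical sample columns, aligned by row.
sample_intents = ["Homicide", "Suicide", "Homicide", "Suicide", "Homicide"]
sample_races = ["Black", "White", "White", "Hispanic", "Black"]

sample_homicides = {}
for intent, race in zip(sample_intents, sample_races):
    if intent == "Homicide":
        # dict.get supplies 0 the first time a race is seen.
        sample_homicides[race] = sample_homicides.get(race, 0) + 1

print(sample_homicides)
```

Categories with no homicides in the sample (Hispanic here) never get a key, so any later division step should loop over the keys of the counts dictionary, as the cell above does.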
