In [61]:
import pandas as pd
import re
import pprint as pp

The Excel file being read in has two sheets: the first is the data.

When I don't specify a sheetname, the default is the first (zero-indexed).


In [62]:
df = pd.read_excel('PPB_gang_records_UPDATED_100516.xlsx')

The second sheet, sheetname=1, is of metadata.


In [63]:
pd.read_excel('PPB_gang_records_UPDATED_100516.xlsx',sheetname=1).head()


Out[63]:
Database with the current 359 designated criminal gang organization members in Portland Police records. Unnamed: 1
0 NaN NaN
1 Variable Explanation
2 gang_code The system code for gang name
3 gang_name The street name of the Criminal Gang Organization
4 gang_type The predominate membership type in the crimina...

What looks like the header row here is just some exposition, then a blank row, and finally the real header: Variable and Explanation. So below, I'll set that row to the header.


In [64]:
metadata = pd.read_excel('PPB_gang_records_UPDATED_100516.xlsx',header=3,sheetname=1)

In [65]:
metadata_dict = dict(zip(metadata['Variable'], metadata['Explanation']))

In [66]:
pp.pprint(metadata_dict)


{nan: nan,
 'Data source: RegJIN; Created on 10/5/2016': nan,
 'Factors considered when documenting the membership to the criminal gang organization': nan,
 'crt10_checked': 'The person wears clothes or jewelry unique to a gang in '
                  'context which clearly indicates affiliation with a gang',
 'crt11_checked': 'This person uses a hand sign or language which, due to '
                  'content or context, Cleary indicates affiliation with the '
                  'gang',
 'crt12_checked': "The person's name appears on a criminal street gang "
                  'document',
 'crt13_checked': 'The person is in photograph with other people who '
                  'collectively display criminal gang signs or apparel to '
                  'exhibit solidarity ',
 'crt14_checked': 'The person possesses a gang tattoo',
 'crt1_checked': 'The person admits or asserts affiliation with a criminal '
                 'gang to police',
 'crt2_checked': 'The person participates in a criminal gang initiation ritual '
                 'or ceremony',
 'crt3_checked': 'The person conspires to commit, or commits, a crime which is '
                 'part of a pattern of street crimes facilitated by the '
                 'efforts of other gang affiliates which advance the interests '
                 'of the person',
 'crt4_checked': 'The person conspires to commit, or commits, a crime to '
                 'attract the attention of the criminal gang or enhance the '
                 'standing of the person in the criminal gang',
 'crt5_checked': 'The person conspires to commit, or commits, a crime for the '
                 'benefit of the gang',
 'crt6_checked': 'The person conspires to commit, or commits, a crime to '
                 'announce the existence of the gang; its membership or its '
                 'territorial claims.',
 'crt7_checked': 'The person conspires to commit, or commits, a crime in '
                 'response to the race, color, religion, sexual preference, '
                 'national origin or gang association of the victim',
 'crt8_checked': "The person displays knowledge of the gang's history, "
                 'leadership, activities or rituals in a context which clearly '
                 'indicates affiliation with the gang',
 'crt9_checked': 'The person announces to the police that the person is '
                 'willing to commit assaults, crimes, or make other sacrifices '
                 'for the gang',
 'diary_date': 'The date the record is required to be updated in the system '
               '(member end date)',
 'entry_date': 'The date that the criminal gang designation document was '
               'entered into the records system',
 'flag_entry_date': 'The date that the criminal gang was entered into the '
                    'records system',
 'flag_last_update': 'The date additional information associated with the '
                     'criminal gang record was entered, if changed from the '
                     'entry date',
 'gang_code': 'The system code for gang name',
 'gang_name': 'The street name of the Criminal Gang Organization',
 'gang_type': 'The predominate membership type in the criminal gang '
              'organization',
 'last_activity_date': 'The last criminal gang activity that would constitute '
                       'a criminal gang organization designation',
 'last_update': 'The date of any additional gang designation information being '
                'entered, if changed from the entry date',
 'member_end_date': 'The date the criminal gang organization membership will '
                    'be removed from their name index, as long as they have '
                    'not been involved with additional criminal gang activity',
 'membership_date': 'The date the person first began being designated as a '
                    'member of the criminal gang organization',
 'race': 'Race of criminal gang member',
 'rin': 'System created record index number (not unique across records in '
        'system)',
 'sex': 'Sex of criminal gang member',
 'state': 'State of residency when designated',
 'yob': 'Year of birth of criminal gang member',
 'zip': 'Zip code of home address when designated'}

In [67]:
df = pd.read_excel('PPB_gang_records_UPDATED_100516.xlsx')

In [68]:
df.head()


Out[68]:
gang_code gang_name gang_type flag_entry_date flag_last_update rin membership_date member_end_date entry_date last_update ... crt5_checked crt6_checked crt7_checked crt8_checked crt9_checked crt10_checked crt11_checked crt12_checked crt13_checked crt14_checked
0 111N 111TH NEIGHBORHOOD CRIP ... CRIPS ... 2015-05-06 2015-05-06 357 2014-06-29 2018-06-29 2015-05-06 2015-05-06 ... N N N N N N N N N Y
1 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 338 2010-09-25 2019-07-03 2015-05-04 2015-07-27 ... N N N N N N N N N N
2 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 339 2015-01-24 2019-01-24 2015-05-04 2015-05-04 ... N N N N N N N N N Y
3 13ST 13TH STREET ... HISPANIC ... 2015-04-29 2015-04-29 273 2014-11-06 2018-11-06 2015-04-29 2015-04-29 ... N N N N N N N N N N
4 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 234 2013-05-02 2017-05-02 2015-04-28 2015-04-28 ... N N N N Y Y N N N N

5 rows × 31 columns


In [69]:
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 359 entries, 0 to 358
Data columns (total 31 columns):
gang_code             359 non-null object
gang_name             359 non-null object
gang_type             359 non-null object
flag_entry_date       359 non-null datetime64[ns]
flag_last_update      359 non-null datetime64[ns]
rin                   359 non-null int64
membership_date       359 non-null datetime64[ns]
member_end_date       359 non-null datetime64[ns]
entry_date            359 non-null datetime64[ns]
last_update           359 non-null datetime64[ns]
last_activity_date    28 non-null datetime64[ns]
diary_date            359 non-null datetime64[ns]
race                  359 non-null object
sex                   359 non-null object
yob                   359 non-null int64
state                 329 non-null object
zip                   331 non-null object
crt1_checked          359 non-null object
crt2_checked          359 non-null object
crt3_checked          359 non-null object
crt4_checked          359 non-null object
crt5_checked          359 non-null object
crt6_checked          359 non-null object
crt7_checked          359 non-null object
crt8_checked          359 non-null object
crt9_checked          359 non-null object
crt10_checked         359 non-null object
crt11_checked         359 non-null object
crt12_checked         359 non-null object
crt13_checked         359 non-null object
crt14_checked         359 non-null object
dtypes: datetime64[ns](8), int64(2), object(21)
memory usage: 87.0+ KB

In [70]:
df['zip'].replace(regex=True,inplace=True,to_replace=r'\D',value=r'')

In [71]:
df.describe()


Out[71]:
rin yob
count 359.000000 359.000000
mean 261.518106 1987.289694
std 154.927931 8.220745
min 2.000000 1959.000000
25% 122.500000 1982.000000
50% 261.000000 1989.000000
75% 391.500000 1993.000000
max 545.000000 2008.000000

By default df.describe() returns summary statistics for all numerical columns. This is how I discovered that the most recent date of birth of a member of the gang list is 2008, making that person no more than eight years old at designation.

When Brosseau followed up with her source, she was told that this was a typo and that the year should be listed as 1998, which I will correct in a subsequent cell.


In [72]:
# Find the index of this maximum 'yob' (year of birth) row
df['yob'].idxmax(axis=1)


Out[72]:
306

In [73]:
df.ix[306]['yob']


Out[73]:
2008

In [74]:
# Replace year of birth according to correction we received from the Portland Police Bureau.
df.set_value(306, 'yob',1998)


Out[74]:
gang_code gang_name gang_type flag_entry_date flag_last_update rin membership_date member_end_date entry_date last_update ... crt5_checked crt6_checked crt7_checked crt8_checked crt9_checked crt10_checked crt11_checked crt12_checked crt13_checked crt14_checked
0 111N 111TH NEIGHBORHOOD CRIP ... CRIPS ... 2015-05-06 2015-05-06 357 2014-06-29 2018-06-29 2015-05-06 2015-05-06 ... N N N N N N N N N Y
1 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 338 2010-09-25 2019-07-03 2015-05-04 2015-07-27 ... N N N N N N N N N N
2 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 339 2015-01-24 2019-01-24 2015-05-04 2015-05-04 ... N N N N N N N N N Y
3 13ST 13TH STREET ... HISPANIC ... 2015-04-29 2015-04-29 273 2014-11-06 2018-11-06 2015-04-29 2015-04-29 ... N N N N N N N N N N
4 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 234 2013-05-02 2017-05-02 2015-04-28 2015-04-28 ... N N N N Y Y N N N N
5 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 235 2013-03-08 2017-03-08 2015-04-28 2015-04-28 ... N N N N N N N N N N
6 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 236 2013-05-04 2017-05-04 2015-04-28 2015-04-28 ... N N N N N N N N N Y
7 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 238 2014-11-06 2018-11-06 2015-04-28 2015-04-28 ... N N N N N N N N N Y
8 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 240 2012-03-24 2019-10-17 2015-04-28 2015-11-13 ... N N N N N N N N N N
9 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 341 2013-02-23 2017-02-23 2015-05-04 2015-05-04 ... N N N N N N N N N Y
10 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 401 2015-03-19 2019-09-22 2015-05-19 2015-11-30 ... N N N N N Y N N N Y
11 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 534 2016-01-30 2020-01-30 2016-03-28 2016-03-28 ... N N N N N N N N N N
12 TLOC 18TH STREET - TINY LOCOS ... HISPANIC ... 2015-05-04 2015-05-04 328 2013-10-31 2017-10-31 2015-05-04 2015-05-04 ... N N N N N N N N N N
13 21SS 21ST ST SOUTH SIDE LOCOS (HISPANIC) ... HISPANIC ... 2015-04-22 2015-04-22 168 2014-09-27 2018-09-27 2015-04-22 2015-04-22 ... N N N N N N N N Y Y
14 62DI 62 DIAMOND CRIP ... CRIPS ... 2015-05-06 2015-05-06 356 2014-10-25 2018-10-25 2015-05-06 2015-05-06 ... N N N Y N N N N N N
15 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 292 2014-01-17 2018-01-17 2015-04-30 2015-04-30 ... N N N N N Y N N N Y
16 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 293 2014-03-07 2018-03-07 2015-04-30 2015-04-30 ... N N N N N N N N N N
17 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 294 2014-02-01 2018-02-01 2015-04-30 2015-04-30 ... N N N N N N N N N N
18 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 295 2013-03-30 2017-03-30 2015-04-30 2015-04-30 ... N N N N N N N N N N
19 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 296 2013-06-26 2017-06-26 2015-04-30 2015-04-30 ... N Y N Y N Y Y N N N
20 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 298 2013-06-09 2017-06-09 2015-04-30 2015-04-30 ... N N N N N N N N N N
21 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 300 2011-08-19 2019-05-14 2015-04-30 2015-06-10 ... N N N N Y Y N N N Y
22 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 450 2015-06-13 2019-06-13 2015-07-27 2015-07-27 ... N N N N N N N N N Y
23 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 467 2015-08-01 2019-08-01 2015-09-10 2015-09-10 ... N N N N N Y N N N Y
24 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 5 2014-04-18 2018-04-18 2015-04-15 2015-04-15 ... N N N Y N N N N N Y
25 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 6 2014-09-29 2018-09-29 2015-04-15 2015-04-15 ... N N N N N N Y N N Y
26 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 7 2013-08-23 2017-08-23 2015-04-15 2015-04-15 ... N N N N N N N N N N
27 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 8 2014-08-14 2018-08-14 2015-04-15 2015-04-15 ... N N N N N N N N N Y
28 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 391 2015-01-13 2019-01-13 2015-05-12 2015-05-12 ... N N N N N N N N N Y
29 AR ARYAN ... WHITE ... 2015-05-06 2015-05-06 376 2014-08-06 2018-08-06 2015-05-06 2015-05-06 ... N N N N N N N N N N
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
329 UPH UNTHANK PARK HUSTLERS ... BLACK ... 2015-04-17 2015-04-17 115 2014-05-30 2019-10-08 2015-04-17 2015-11-13 ... N N N N N N N N N Y
330 UPH UNTHANK PARK HUSTLERS ... BLACK ... 2015-04-17 2015-04-17 116 2014-02-26 2018-02-26 2015-04-17 2015-04-17 ... Y N N Y Y N N N N N
331 UPH UNTHANK PARK HUSTLERS ... BLACK ... 2015-04-17 2015-04-17 472 2015-07-30 2019-07-30 2015-09-10 2015-09-10 ... N N N N N N N N N N
332 VC VARRIO CATORCE ... HISPANIC ... 2015-05-06 2015-05-06 372 2013-01-26 2017-01-26 2015-05-06 2015-05-06 ... N N N N N N N N N Y
333 VL VICE LORDS (FOLK) ... BLACK ... 2015-04-30 2015-04-30 319 2013-06-21 2017-06-21 2015-04-30 2015-04-30 ... N N N N N N N N N Y
334 VL VICE LORDS (FOLK) ... BLACK ... 2015-04-30 2015-04-30 320 2012-11-09 2016-11-09 2015-04-30 2015-04-30 ... N N N N N N N N N N
335 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 231 2013-08-23 2017-08-23 2015-04-28 2015-04-28 ... N N N N N N N N N N
336 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 232 2014-05-02 2018-05-02 2015-04-28 2015-04-28 ... N N N N N N N N N N
337 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 233 2013-12-21 2017-12-21 2015-04-28 2015-04-28 ... N N N N N N N N N N
338 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 281 2012-12-22 2016-12-22 2015-04-29 2015-04-29 ... N N N N N Y Y N N N
339 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 283 2014-11-28 2018-11-28 2015-04-29 2015-04-29 ... N N N N N N N N N N
340 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 453 2015-06-05 2019-06-05 2015-08-18 2015-08-18 ... N N N Y N Y Y N N Y
341 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 507 2015-08-31 2019-08-31 2015-11-10 2015-11-10 ... N N N Y N N N N N Y
342 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 119 2013-03-30 2017-03-30 2015-04-17 2015-04-17 ... N N N Y N N N N N Y
343 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 120 2014-04-18 2018-04-18 2015-04-17 2015-04-17 ... N N N N N N Y N N N
344 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 121 2014-11-22 2018-11-22 2015-04-17 2015-04-17 ... N N N Y N Y Y N N N
345 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 122 2014-10-12 2018-10-12 2015-04-17 2015-04-17 ... N N N Y N Y N Y N N
346 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 123 2014-08-09 2018-08-09 2015-04-17 2015-04-17 ... N N N N N Y N N N Y
347 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 125 2014-04-18 2018-04-18 2015-04-17 2015-04-17 ... N N N N N N N N N N
348 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 406 2015-02-14 2019-02-14 2015-05-19 2015-05-19 ... N N N Y N N N N N N
349 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 411 2015-02-12 2019-02-12 2015-05-19 2015-05-19 ... N N N N N N N N N N
350 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 436 2011-07-11 2019-06-08 2015-06-22 2015-06-22 ... N N N N N N N N N Y
351 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 445 2015-05-09 2019-05-09 2015-07-13 2015-07-13 ... N N N N N Y N N N Y
352 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 446 2015-05-06 2019-05-06 2015-07-13 2015-07-13 ... N N N N N N N N N N
353 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 451 2015-06-12 2019-06-12 2015-07-27 2015-07-27 ... N N N N N Y N N N Y
354 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 457 2015-06-11 2019-06-11 2015-08-18 2015-08-18 ... N N N N N Y N N N Y
355 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 464 2012-08-04 2019-08-30 2015-09-01 2015-09-01 ... N N N N N Y N N N Y
356 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 469 2015-07-09 2019-07-09 2015-09-10 2015-09-10 ... N N N N N N N N N N
357 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 509 2015-09-11 2019-09-11 2015-11-10 2015-11-10 ... N N N N N N N N N N
358 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 529 2015-10-06 2019-10-06 2016-03-07 2016-03-07 ... N N N N N N N N N N

359 rows × 31 columns


In [75]:
# Establish a new column for person's age at designation
df['Approximate Age at Designation'] = df['membership_date'].apply(lambda x: x.year)-df['yob']

In [76]:
df.head()


Out[76]:
gang_code gang_name gang_type flag_entry_date flag_last_update rin membership_date member_end_date entry_date last_update ... crt6_checked crt7_checked crt8_checked crt9_checked crt10_checked crt11_checked crt12_checked crt13_checked crt14_checked Approximate Age at Designation
0 111N 111TH NEIGHBORHOOD CRIP ... CRIPS ... 2015-05-06 2015-05-06 357 2014-06-29 2018-06-29 2015-05-06 2015-05-06 ... N N N N N N N N Y 27
1 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 338 2010-09-25 2019-07-03 2015-05-04 2015-07-27 ... N N N N N N N N N 34
2 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 339 2015-01-24 2019-01-24 2015-05-04 2015-05-04 ... N N N N N N N N Y 41
3 13ST 13TH STREET ... HISPANIC ... 2015-04-29 2015-04-29 273 2014-11-06 2018-11-06 2015-04-29 2015-04-29 ... N N N N N N N N N 20
4 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 234 2013-05-02 2017-05-02 2015-04-28 2015-04-28 ... N N N Y Y N N N N 18

5 rows × 32 columns


In [77]:
# Rather than count individual criteria in the "crime or conspiracy" subset,
# return `True` if any criteria therein are met

df['Any Crime'] = (df['crt3_checked'] == "Y") | (df['crt4_checked'] == "Y") | (df['crt5_checked'] == "Y") | (df['crt6_checked'] == "Y")|(df['crt7_checked'] == "Y")

In [78]:
# This is the subset of criteria not related to "crime or conspiracy to
# commit a crime," e.g. clothing, tattoos, etc.

df_criteria = df[['crt1_checked','crt2_checked','crt8_checked','crt9_checked','crt10_checked','crt12_checked','crt13_checked','crt14_checked']]

In [79]:
# sum non-crime criteria met
def y_count(dataframe):
    counter = 0
    for x in dataframe:
        if x == "Y":
            counter+=1
    return counter

In [80]:
# perform count on dataframe composed only of non-crime criteria
# but append that information to original dataset
df['Non-Crime Criteria Met'] = df_criteria.apply(lambda x: y_count(x),axis=1)

In [81]:
df


Out[81]:
gang_code gang_name gang_type flag_entry_date flag_last_update rin membership_date member_end_date entry_date last_update ... crt8_checked crt9_checked crt10_checked crt11_checked crt12_checked crt13_checked crt14_checked Approximate Age at Designation Any Crime Non-Crime Criteria Met
0 111N 111TH NEIGHBORHOOD CRIP ... CRIPS ... 2015-05-06 2015-05-06 357 2014-06-29 2018-06-29 2015-05-06 2015-05-06 ... N N N N N N Y 27 False 2
1 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 338 2010-09-25 2019-07-03 2015-05-04 2015-07-27 ... N N N N N N N 34 False 1
2 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 339 2015-01-24 2019-01-24 2015-05-04 2015-05-04 ... N N N N N N Y 41 False 2
3 13ST 13TH STREET ... HISPANIC ... 2015-04-29 2015-04-29 273 2014-11-06 2018-11-06 2015-04-29 2015-04-29 ... N N N N N N N 20 False 1
4 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 234 2013-05-02 2017-05-02 2015-04-28 2015-04-28 ... N Y Y N N N N 18 False 3
5 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 235 2013-03-08 2017-03-08 2015-04-28 2015-04-28 ... N N N N N N N 39 False 1
6 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 236 2013-05-04 2017-05-04 2015-04-28 2015-04-28 ... N N N N N N Y 25 False 2
7 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 238 2014-11-06 2018-11-06 2015-04-28 2015-04-28 ... N N N N N N Y 36 False 2
8 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 240 2012-03-24 2019-10-17 2015-04-28 2015-11-13 ... N N N N N N N 17 False 1
9 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 341 2013-02-23 2017-02-23 2015-05-04 2015-05-04 ... N N N N N N Y 21 False 2
10 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 401 2015-03-19 2019-09-22 2015-05-19 2015-11-30 ... N N Y N N N Y 28 False 3
11 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 534 2016-01-30 2020-01-30 2016-03-28 2016-03-28 ... N N N N N N N 36 False 1
12 TLOC 18TH STREET - TINY LOCOS ... HISPANIC ... 2015-05-04 2015-05-04 328 2013-10-31 2017-10-31 2015-05-04 2015-05-04 ... N N N N N N N 16 False 1
13 21SS 21ST ST SOUTH SIDE LOCOS (HISPANIC) ... HISPANIC ... 2015-04-22 2015-04-22 168 2014-09-27 2018-09-27 2015-04-22 2015-04-22 ... N N N N N Y Y 19 False 4
14 62DI 62 DIAMOND CRIP ... CRIPS ... 2015-05-06 2015-05-06 356 2014-10-25 2018-10-25 2015-05-06 2015-05-06 ... Y N N N N N N 32 False 2
15 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 292 2014-01-17 2018-01-17 2015-04-30 2015-04-30 ... N N Y N N N Y 34 False 3
16 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 293 2014-03-07 2018-03-07 2015-04-30 2015-04-30 ... N N N N N N N 36 False 1
17 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 294 2014-02-01 2018-02-01 2015-04-30 2015-04-30 ... N N N N N N N 27 False 1
18 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 295 2013-03-30 2017-03-30 2015-04-30 2015-04-30 ... N N N N N N N 16 False 1
19 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 296 2013-06-26 2017-06-26 2015-04-30 2015-04-30 ... Y N Y Y N N N 35 True 3
20 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 298 2013-06-09 2017-06-09 2015-04-30 2015-04-30 ... N N N N N N N 23 False 1
21 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 300 2011-08-19 2019-05-14 2015-04-30 2015-06-10 ... N Y Y N N N Y 20 False 5
22 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 450 2015-06-13 2019-06-13 2015-07-27 2015-07-27 ... N N N N N N Y 23 False 2
23 62EC 62 EAST COAST CRIP ... CRIPS ... 2015-04-30 2015-04-30 467 2015-08-01 2019-08-01 2015-09-10 2015-09-10 ... N N Y N N N Y 18 False 3
24 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 5 2014-04-18 2018-04-18 2015-04-15 2015-04-15 ... Y N N N N N Y 29 False 2
25 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 6 2014-09-29 2018-09-29 2015-04-15 2015-04-15 ... N N N Y N N Y 24 False 2
26 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 7 2013-08-23 2017-08-23 2015-04-15 2015-04-15 ... N N N N N N N 31 False 1
27 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 8 2014-08-14 2018-08-14 2015-04-15 2015-04-15 ... N N N N N N Y 27 False 2
28 AOB ALL ON A BITCH ... WHITE ... 2015-04-15 2015-04-15 391 2015-01-13 2019-01-13 2015-05-12 2015-05-12 ... N N N N N N Y 26 False 2
29 AR ARYAN ... WHITE ... 2015-05-06 2015-05-06 376 2014-08-06 2018-08-06 2015-05-06 2015-05-06 ... N N N N N N N 29 False 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
329 UPH UNTHANK PARK HUSTLERS ... BLACK ... 2015-04-17 2015-04-17 115 2014-05-30 2019-10-08 2015-04-17 2015-11-13 ... N N N N N N Y 36 False 2
330 UPH UNTHANK PARK HUSTLERS ... BLACK ... 2015-04-17 2015-04-17 116 2014-02-26 2018-02-26 2015-04-17 2015-04-17 ... Y Y N N N N N 21 True 4
331 UPH UNTHANK PARK HUSTLERS ... BLACK ... 2015-04-17 2015-04-17 472 2015-07-30 2019-07-30 2015-09-10 2015-09-10 ... N N N N N N N 27 False 1
332 VC VARRIO CATORCE ... HISPANIC ... 2015-05-06 2015-05-06 372 2013-01-26 2017-01-26 2015-05-06 2015-05-06 ... N N N N N N Y 16 False 3
333 VL VICE LORDS (FOLK) ... BLACK ... 2015-04-30 2015-04-30 319 2013-06-21 2017-06-21 2015-04-30 2015-04-30 ... N N N N N N Y 41 False 2
334 VL VICE LORDS (FOLK) ... BLACK ... 2015-04-30 2015-04-30 320 2012-11-09 2016-11-09 2015-04-30 2015-04-30 ... N N N N N N N 21 False 1
335 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 231 2013-08-23 2017-08-23 2015-04-28 2015-04-28 ... N N N N N N N 35 False 1
336 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 232 2014-05-02 2018-05-02 2015-04-28 2015-04-28 ... N N N N N N N 34 False 1
337 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 233 2013-12-21 2017-12-21 2015-04-28 2015-04-28 ... N N N N N N N 38 False 1
338 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 281 2012-12-22 2016-12-22 2015-04-29 2015-04-29 ... N N Y Y N N N 22 False 2
339 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 283 2014-11-28 2018-11-28 2015-04-29 2015-04-29 ... N N N N N N N 39 False 1
340 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 453 2015-06-05 2019-06-05 2015-08-18 2015-08-18 ... Y N Y Y N N Y 24 False 4
341 WSP WESTSIDE PIRU ... BLOODS ... 2015-04-28 2015-04-28 507 2015-08-31 2019-08-31 2015-11-10 2015-11-10 ... Y N N N N N Y 26 False 3
342 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 119 2013-03-30 2017-03-30 2015-04-17 2015-04-17 ... Y N N N N N Y 35 False 3
343 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 120 2014-04-18 2018-04-18 2015-04-17 2015-04-17 ... N N N Y N N N 21 False 1
344 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 121 2014-11-22 2018-11-22 2015-04-17 2015-04-17 ... Y N Y Y N N N 18 False 2
345 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 122 2014-10-12 2018-10-12 2015-04-17 2015-04-17 ... Y N Y N Y N N 30 False 3
346 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 123 2014-08-09 2018-08-09 2015-04-17 2015-04-17 ... N N Y N N N Y 26 False 3
347 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 125 2014-04-18 2018-04-18 2015-04-17 2015-04-17 ... N N N N N N N 21 False 1
348 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 406 2015-02-14 2019-02-14 2015-05-19 2015-05-19 ... Y N N N N N N 39 False 2
349 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 411 2015-02-12 2019-02-12 2015-05-19 2015-05-19 ... N N N N N N N 15 False 1
350 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 436 2011-07-11 2019-06-08 2015-06-22 2015-06-22 ... N N N N N N Y 17 False 3
351 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 445 2015-05-09 2019-05-09 2015-07-13 2015-07-13 ... N N Y N N N Y 19 False 3
352 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 446 2015-05-06 2019-05-06 2015-07-13 2015-07-13 ... N N N N N N N 18 False 2
353 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 451 2015-06-12 2019-06-12 2015-07-27 2015-07-27 ... N N Y N N N Y 23 False 3
354 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 457 2015-06-11 2019-06-11 2015-08-18 2015-08-18 ... N N Y N N N Y 24 False 2
355 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 464 2012-08-04 2019-08-30 2015-09-01 2015-09-01 ... N N Y N N N Y 21 False 3
356 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 469 2015-07-09 2019-07-09 2015-09-10 2015-09-10 ... N N N N N N N 18 False 1
357 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 509 2015-09-11 2019-09-11 2015-11-10 2015-11-10 ... N N N N N N N 38 False 1
358 WOOD WOODLAWN PARK BLOOD ... BLOODS ... 2015-04-17 2015-04-17 529 2015-10-06 2019-10-06 2016-03-07 2016-03-07 ... N N N N N N N 23 False 1

359 rows × 34 columns


In [82]:
# sum non-crime criteria and crime criteria, but counting
# the latter as one point, as `True` evaluates to 1.

df['Total Criteria Met'] = df['Any Crime'] + df['Non-Crime Criteria Met']

In [83]:
# Make human-readable labels for data, based on metadata sheet

rename_dict = {
"gang_name":"Gang Name",
"crt1_checked":"Claims Affiliation",
"crt2_checked":"Participated in Initiation",
"crt3_checked":"Crime: Gang-Assisted Self-Benefiting",
"crt4_checked":"Crime: Gang-Oriented Self-Benefiting",
"crt5_checked":"Crime: Gang-Benefiting",
"crt6_checked":"Crime: Gang-Promoting",
"crt7_checked":"Crime: Victim-Oriented",
"crt8_checked":"Knowledge of Gang Culture",
"crt9_checked":"Announces Allegiance",
"crt10_checked":"Gang Clothing or Jewelry",
"crt11_checked":"Gang Language",
"crt12_checked":"Named in a Gang Document",
"crt13_checked":"Appears in a Gang-Related Photograph",
"crt14_checked":"Gang Tattoo"}

In [84]:
df.rename(index=str, columns=rename_dict,inplace=True)

In [85]:
df.head()


Out[85]:
gang_code Gang Name gang_type flag_entry_date flag_last_update rin membership_date member_end_date entry_date last_update ... Announces Allegiance Gang Clothing or Jewelry Gang Language Named in a Gang Document Appears in a Gang-Related Photograph Gang Tattoo Approximate Age at Designation Any Crime Non-Crime Criteria Met Total Criteria Met
0 111N 111TH NEIGHBORHOOD CRIP ... CRIPS ... 2015-05-06 2015-05-06 357 2014-06-29 2018-06-29 2015-05-06 2015-05-06 ... N N N N N Y 27 False 2 2
1 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 338 2010-09-25 2019-07-03 2015-05-04 2015-07-27 ... N N N N N N 34 False 1 1
2 12ST 12TH STREET (HISPANIC) ... HISPANIC ... 2015-05-04 2015-05-04 339 2015-01-24 2019-01-24 2015-05-04 2015-05-04 ... N N N N N Y 41 False 2 2
3 13ST 13TH STREET ... HISPANIC ... 2015-04-29 2015-04-29 273 2014-11-06 2018-11-06 2015-04-29 2015-04-29 ... N N N N N N 20 False 1 1
4 18TH 18TH STREET ... HISPANIC ... 2015-04-28 2015-04-28 234 2013-05-02 2017-05-02 2015-04-28 2015-04-28 ... Y Y N N N N 18 False 3 3

5 rows × 35 columns


In [86]:
df.to_csv('gang_list_for_visualization.csv')