This demo is based on an example in the book: https://www.amazon.com/dp/1449319793/
Python for Data Analysis by William Wesley McKinney (O'Reilly). Copyright 2012 William McKinney, 978-1-449-31979-3
The data can be downloaded from: https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-level-data
Before starting jupyter notebook, set the environment variables SF_ACCOUNT, SF_USER, SF_PASSWORD, SF_S3_BUCKET, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY so the notebook can load the data from the given location. The data files are named in the form yob<year>.txt. For example, the baby names for 1880 are in yob1880.txt. The data row format is <name>,<gender>,<number of births>.
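For example, the first few rows of yob1880.txt look like this:
Mary,F,7065
Anna,F,2604
Emma,F,2003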
Since the year is not included in the rows, I ran a bash script to preprocess the data:
#!/bin/bash
gzip *.txt
for f in *.txt.gz; do
    # Extract the year from the file name, e.g. yob1880.txt.gz -> 1880
    year=${f:3:4}
    # Output name, e.g. yob1880u.tar.gz (a plain gzipped file, despite the suffix)
    new_name=${f:0:7}"u.tar.gz"
    # Strip trailing carriage returns and append the year as a fourth column
    zcat "$f" | awk -v year="$year" '{ gsub(/\r$/,""); print $0 "," year }' | gzip -c > "$new_name"
    aws s3 cp "$new_name" s3://{s3_bucket}/baby_names/data/
done
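After preprocessing, each row carries the year as a fourth column, matching the four-column table created in cell [3] below. For example:
Mary,F,7065  ->  Mary,F,7065,1880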
In [1]:
import os
account = os.getenv('SF_ACCOUNT')
user = os.getenv('SF_USER')
password = os.getenv('SF_PASSWORD')
aws_key = os.getenv('AWS_ACCESS_KEY_ID')
aws_secret = os.getenv('AWS_SECRET_ACCESS_KEY')
sf_s3_bucket = os.getenv('SF_S3_BUCKET')
In [2]:
from snowflake.connector import connect
con = connect(
account=account,
user=user,
password=password,
database='testdb',
schema='public',
)
cur = con.cursor()
In [3]:
cur.execute("""
create or replace table baby_names(
name string,
gender varchar(1),
births integer,
year integer)""")
Out[3]:
In [4]:
cur.execute("""
copy into baby_names from 's3://{s3_bucket}/baby_names/data/'
credentials=(
aws_key_id='{aws_key_id}'
aws_secret_key='{aws_secret_key}'
)
file_format= (
field_delimiter=','
type='csv'
)
""".format(
s3_bucket=sf_s3_bucket,
aws_key_id=aws_key,
aws_secret_key=aws_secret))
Out[4]:
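As a quick sanity check (this step is not part of the original demo), the row count of the freshly loaded table can be verified through the same cursor:
cur.execute("select count(*) from baby_names")
print(cur.fetchone()[0])  # total rows loaded across all years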
In [5]:
import pandas as pd
In [6]:
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
engine = create_engine(URL(
account=account,
user=user,
password=password,
database='testdb',
schema='public',
numpy=True,
))
In [7]:
names1880 = pd.read_sql("""
SELECT name, gender, births
FROM baby_names
WHERE year=1880""", engine)
In [8]:
names1880.groupby('gender')['births'].sum()
Out[8]:
In [9]:
names=pd.read_sql("""
SELECT *
FROM baby_names""", engine)
In [10]:
total_births=pd.pivot_table(
names,
values='births',
index='year',
columns='gender',
aggfunc='sum')
In [11]:
total_births.tail()
Out[11]:
In [12]:
%matplotlib notebook
In [13]:
total_births.plot(title='Total births by gender and year')
Out[13]:
In [14]:
def add_prop(group):
    # Cast to float so the division below yields fractions, not integers
    births = group.births.astype(float)
    group['prop'] = births / births.sum()
    return group
In [15]:
names=names.groupby(['year','gender']).apply(add_prop)
In [16]:
names.head()
Out[16]:
In [17]:
import numpy as np
In [18]:
np.allclose(names.groupby(['year','gender']).prop.sum(),1)
Out[18]:
In [19]:
def get_top1000(group):
    return group.sort_values(by='births', ascending=False)[:1000]
grouped = names.groupby(['year','gender'])
top1000 = grouped.apply(get_top1000)
top1000.index = np.arange(len(top1000))
In [20]:
pieces = []
for year, group in names.groupby(['year','gender']):
    pieces.append(group.sort_values(by='births', ascending=False)[:1000])
top1000 = pd.concat(pieces, ignore_index=True)
In [21]:
top1000
Out[21]:
In [22]:
boys=top1000[top1000.gender=='M']
In [23]:
girls=top1000[top1000.gender=='F']
In [24]:
total_births=pd.pivot_table(
names,
values='births',
index='year',
columns='name',
aggfunc='sum')
In [25]:
total_births.info()
In [26]:
subset=total_births[['John', 'Harry', 'Mary', 'Marilyn']]
In [27]:
subset.plot(subplots=True, figsize=(12,10), grid=False, title="Number of births per year")
Out[27]:
In [28]:
table=pd.pivot_table(
top1000,
values='prop',
index='year',
columns='gender',
aggfunc='sum'
)
In [29]:
table.plot(title='Sum of table1000.prop by year and gender',
yticks=np.linspace(0, 1.2, 13),
xticks=range(1880, 2020, 10))
Out[29]:
In [31]:
df=boys[boys.year==2010]
In [32]:
df
Out[32]:
In [33]:
prop_cumsum=df.sort_values(by='prop', ascending=False).prop.cumsum()
In [34]:
prop_cumsum[:10]
Out[34]:
In [35]:
prop_cumsum.values.searchsorted(0.5)
Out[35]:
In [36]:
df=boys[boys.year==1900]
In [37]:
in1900=df.sort_values(by='prop', ascending=False).prop.cumsum()
In [38]:
in1900.values.searchsorted(0.5)+1
Out[38]:
In [39]:
def get_quantile_count(group, q=0.5):
    group = group.sort_values(by='prop', ascending=False)
    return group.prop.cumsum().values.searchsorted(q) + 1
diversity = top1000.groupby(['year','gender']).apply(get_quantile_count)
diversity = diversity.unstack('gender')
In [40]:
diversity.head()
Out[40]:
In [41]:
diversity.plot(title='Number of popular names in top 50%')
Out[41]:
In [42]:
get_last_letter = lambda x: x[-1]
last_letters=names.name.map(get_last_letter)
last_letters.name = 'last_letter'
table = pd.pivot_table(
names,
values='births',
index=last_letters,
columns=['gender', 'year'],
aggfunc='sum'
)
In [43]:
subtable=table.reindex(columns=[1910, 1960, 2010], level='year')
In [44]:
subtable.head()
Out[44]:
In [45]:
subtable.sum()
Out[45]:
In [46]:
letter_prop=subtable/subtable.sum().astype(float)
In [47]:
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 1, figsize=(10,8))
letter_prop['M'].plot(kind='bar', rot=0, ax=axes[0], title='Male')
letter_prop['F'].plot(kind='bar', rot=0, ax=axes[1], title='Female', legend=False)
Out[47]:
In [50]:
letter_prop=table/table.sum().astype(float)
In [51]:
dny_ts = letter_prop.loc[['d','n','y'], 'M'].T
dny_ts.head()
Out[51]:
In [52]:
dny_ts.plot()
Out[52]:
In [53]:
all_names=top1000.name.unique()
In [54]:
mask=np.array(['lesl' in x.lower() for x in all_names])
In [55]:
lesley_like=all_names[mask]
In [56]:
lesley_like
Out[56]:
In [57]:
filtered=top1000[top1000.name.isin(lesley_like)]
In [58]:
filtered.groupby('name').births.sum()
Out[58]:
In [59]:
table=pd.pivot_table(
filtered,
values='births',
index='year',
columns=['gender'],
aggfunc='sum'
)
In [60]:
table=table.div(table.sum(1), axis=0)
In [61]:
table.tail()
Out[61]:
In [62]:
table.plot(style={'M': 'k-', 'F': 'k--'})
Out[62]:
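When the analysis is done, it's good practice to release the Snowflake resources; a minimal cleanup sketch using the objects created above:
cur.close()        # close the cursor from cell [2]
con.close()        # close the connector session
engine.dispose()   # dispose of the SQLAlchemy engine's connection pool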