In [2]:
import nsfg
df = nsfg.ReadFemPreg()
df
Out[2]:
Print value counts for birthord and compare them to the results published in the codebook.
In [3]:
df.birthord.value_counts().sort_index()
Out[3]:
We can also use isnull to count the number of NaNs.
In [10]:
df.birthord.isnull().sum()
Out[10]:
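A small sketch of why the null count is worth checking separately: value_counts drops NaN by default, so its total can fall short of the number of rows. The Series below is made up for illustration and is not the NSFG data.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df.birthord; the values are invented for illustration.
birthord = pd.Series([1, 2, np.nan, 1, np.nan, 3])

counts = birthord.value_counts().sort_index()  # NaN rows are excluded
n_missing = birthord.isnull().sum()            # True counts as 1 when summed

# Counted values plus missing values account for every row.
total = counts.sum() + n_missing
print(counts)
print(n_missing, total)
```

If you would rather see the missing values in the same table, value_counts(dropna=False) keeps a NaN row.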
Print value counts for prglngth and compare them to the results published in the codebook.
In [3]:
df.prglngth.value_counts().sort_index()
Out[3]:
Print value counts for agepreg and compare to results published in the codebook.
Looking at this data, please remember my comments in the book about the obligation to approach data with consideration for the context and respect for the respondents.
In [4]:
df.agepreg.value_counts().sort_index()
Out[4]:
Compute the mean birthweight.
In [5]:
df.totalwgt_lb.mean()
Out[5]:
Create a new column named totalwgt_kg that contains birth weight in kilograms, and compute its mean. Remember that when you create a new column, you have to use dictionary-style bracket syntax (df['totalwgt_kg']), not dot notation.
In [25]:
# 1 kg is about 2.2 lb, so this conversion is approximate
df['totalwgt_kg'] = df.totalwgt_lb / 2.2
df.totalwgt_kg.mean()
Out[25]:
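To see why bracket syntax matters here, the following sketch compares the two ways of assigning to a new name. It uses a toy DataFrame with invented weights, and the name totalwgt_attr is hypothetical, chosen only to show what dot assignment does.

```python
import warnings
import pandas as pd

# Toy frame standing in for the pregnancy data; weights are invented.
df = pd.DataFrame({"totalwgt_lb": [6.5, 7.25, 8.0]})

# Dot assignment to a new name does not create a column;
# pandas only sets a Python attribute (and emits a UserWarning).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.totalwgt_attr = df.totalwgt_lb / 2.2
created_by_dot = "totalwgt_attr" in df.columns

# Bracket assignment creates the column.
df["totalwgt_kg"] = df.totalwgt_lb / 2.2
created_by_bracket = "totalwgt_kg" in df.columns

print(created_by_dot, created_by_bracket)
```

Dot notation is still fine for reading a column that already exists, as in df.totalwgt_kg.mean().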
Look through the codebook and find a variable, other than the ones mentioned in the book, that you find interesting. Compute value counts, means, or other statistics.
In [26]:
df.nbrnaliv.value_counts().sort_index()
Out[26]:
Create a boolean Series.
In [27]:
df.outcome == 1
Out[27]:
Use a boolean Series to select the records for the pregnancies that ended in live birth.
In [28]:
live = df[df.outcome == 1]
len(live)
Out[28]:
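The two steps above (build a boolean Series, then index with it) can be sketched on a toy DataFrame. The rows are invented; only the use of outcome code 1 for a live birth comes from the cells above.

```python
import pandas as pd

# Toy records; outcome code 1 marks a live birth, as in the cell above.
df = pd.DataFrame({
    "outcome": [1, 2, 1, 4, 1],
    "prglngth": [39, 9, 40, 12, 38],
})

mask = df.outcome == 1  # boolean Series aligned with df's index
live = df[mask]         # keeps only the rows where mask is True

print(mask.tolist())
print(len(live), mask.sum())
```

Because True counts as 1, mask.sum() gives the same number as len(live) without building the selected DataFrame.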
Count the number of live births with birthwgt_lb between 0 and 5 pounds (including both). The result should be 1125.
In [32]:
len(live[(live.birthwgt_lb >= 0) & (live.birthwgt_lb <= 5)])
Out[32]:
Count the number of live births with birthwgt_lb between 9 and 95 pounds (including both). The result should be 798.
In [33]:
len(live[(live.birthwgt_lb >= 9) & (live.birthwgt_lb <= 95)])
Out[33]:
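A brief sketch of how these range counts work, using an invented Series rather than the NSFG weights. It shows the parenthesized form used above next to Series.between, which is one alternative way to write an inclusive range.

```python
import numpy as np
import pandas as pd

# Invented weights in pounds, including one missing value.
birthwgt_lb = pd.Series([4, 7, 5, np.nan, 9, 0, 6])

# Parentheses are required: & binds more tightly than >= and <=.
in_range = (birthwgt_lb >= 0) & (birthwgt_lb <= 5)

# Series.between is inclusive on both ends by default.
in_range_between = birthwgt_lb.between(0, 5)

print(in_range.sum(), in_range_between.sum())
```

In both forms a NaN weight compares as False, so missing values are never counted as inside the range.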
Use birthord to select the records for first babies and others. How many are there of each?
In [30]:
firsts = df[df.birthord==1]
others = df[df.birthord>1]
len(firsts), len(others)
Out[30]:
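One thing to keep in mind when comparing these two counts with len(df): rows with a missing birthord fall into neither group, because comparisons with NaN are False. A minimal sketch with invented data:

```python
import numpy as np
import pandas as pd

# Invented records; birthord is NaN where no birth order was recorded.
df = pd.DataFrame({"birthord": [1, 2, np.nan, 1, 3, np.nan]})

firsts = df[df.birthord == 1]
others = df[df.birthord > 1]

# Comparisons with NaN are False, so those rows land in neither group.
unassigned = len(df) - len(firsts) - len(others)
print(len(firsts), len(others), unassigned)
```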
Compute the mean weight for first babies and others.
In [12]:
firsts.totalwgt_lb.mean()
Out[12]:
In [13]:
others.totalwgt_lb.mean()
Out[13]:
Compute the mean prglngth for first babies and others. Compute the difference in means, expressed in hours.
In [19]:
firsts.prglngth.mean()
Out[19]:
In [20]:
others.prglngth.mean()
Out[20]:
In [22]:
diff = firsts.prglngth.mean() - others.prglngth.mean()
diff
Out[22]:
In [24]:
diff * 7 * 24
Out[24]:
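The last cell converts a difference in weeks to hours, assuming prglngth is recorded in weeks, which the 7 * 24 factor implies. The arithmetic can be sketched on its own; the helper name weeks_to_hours and the example value are hypothetical and are not results from the data.

```python
HOURS_PER_WEEK = 7 * 24  # 7 days per week, 24 hours per day


def weeks_to_hours(weeks):
    """Convert a duration in weeks to hours."""
    return weeks * HOURS_PER_WEEK


# Hypothetical difference in means, chosen only to show the arithmetic.
example_diff_weeks = 0.1
example_diff_hours = weeks_to_hours(example_diff_weeks)
print(example_diff_hours)
```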