Read the pregnancy file.


In [1]:
import nsfg
preg = nsfg.ReadFemPreg()

Select live births, then make a CDF of totalwgt_lb.


In [2]:
import thinkstats2
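live = preg[preg.outcome == 1]            # outcome code 1 marks a live birth
firsts = live[live.birthord == 1]         # first babies
others = live[live.birthord != 1]         # all other live births
cdf = thinkstats2.Cdf(live.totalwgt_lb)   # CDF of birth weight in pounds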

Display the CDF.


In [43]:
import thinkplot
thinkplot.Cdf(cdf, label='totalwgt_lb')
thinkplot.Show(loc='lower right')


Find out how much you weighed at birth, if you can, and compute CDF(x).


In [44]:
cdf.Prob(8.4)


Out[44]:
0.81422881168400085
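
For an empirical distribution, CDF(x) is the fraction of values less than or equal to x. The sketch below shows that definition using only the standard library. It is not thinkstats2's implementation, and the `eval_cdf` helper and the `weights` list are hypothetical, made up for illustration.

```python
from bisect import bisect_right

def eval_cdf(sample, x):
    """Return the fraction of values in sample that are <= x."""
    ordered = sorted(sample)
    # bisect_right gives the number of sorted values that are <= x
    return bisect_right(ordered, x) / len(ordered)

# Hypothetical birth weights in pounds, for illustration only
weights = [5.5, 6.25, 7.0, 7.375, 8.4, 9.125]
print(eval_cdf(weights, 8.4))   # 5 of the 6 values are <= 8.4
```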

If you are a first child, look up your birthweight in the CDF of first children; otherwise use the CDF of other children.


In [59]:
other_cdf = thinkstats2.Cdf(others.totalwgt_lb)
other_cdf.Prob(8.4)


Out[59]:
0.79657754010695192

Compute the percentile rank of your birthweight.


In [46]:
cdf.PercentileRank(8.4)


Out[46]:
81.422881168400082
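
The percentile rank is the CDF value expressed on a 0 to 100 scale, which is why the result here is 100 times the earlier `cdf.Prob(8.4)` output. A minimal sketch of that relationship, assuming a hypothetical `percentile_rank` helper and made-up `scores`:

```python
def percentile_rank(sample, x):
    """Return the percentage of values in sample that are <= x."""
    count = sum(1 for v in sample if v <= x)
    return 100.0 * count / len(sample)

# Hypothetical data, for illustration only
scores = [55, 66, 77, 88, 99]
print(percentile_rank(scores, 88))   # 4 of 5 values are <= 88
```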

Compute the median birth weight by looking up the value associated with p=0.5.


In [45]:
cdf.Value(0.5)


Out[45]:
7.375
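
Looking up a value from a probability is the inverse of evaluating the CDF. One way to sketch it, assuming the convention "smallest value whose cumulative fraction reaches p" (the `value_at` helper and `weights` list are hypothetical, and thinkstats2 may handle edge cases differently):

```python
def value_at(sample, p):
    """Return the smallest value v in sample with CDF(v) >= p."""
    ordered = sorted(sample)
    n = len(ordered)
    for i, v in enumerate(ordered, start=1):
        # i / n is the cumulative fraction of values <= v
        if i / n >= p:
            return v
    return ordered[-1]

# Hypothetical birth weights in pounds, for illustration only
weights = [5.5, 6.25, 7.0, 7.375, 8.4, 9.125]
median = value_at(weights, 0.5)
print(median)
```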

Compute the interquartile range (IQR) by computing the 25th and 75th percentiles; the IQR is the difference between them.


In [47]:
cdf.Percentile(25), cdf.Percentile(75)


Out[47]:
(6.5, 8.125)
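
The cell above returns the two quartiles rather than the range itself. The remaining step is a subtraction, sketched here with the two values copied from the output:

```python
# 25th and 75th percentiles of birth weight, in pounds, from the output above
q1, q3 = 6.5, 8.125

# The interquartile range is the width of the middle half of the distribution
iqr = q3 - q1
print(iqr)   # 1.625
```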

Make a random selection from cdf.


In [48]:
cdf.Random()


Out[48]:
7.0

Draw a random sample from cdf.


In [49]:
cdf.Sample(10)


Out[49]:
[6.25, 5.1875, 8.1875, 6.5, 7.9375, 6.6875, 5.75, 6.5625, 7.8125, 5.25]
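
A common way to draw from an empirical distribution is to pick a uniform random probability and map it to a position in the sorted data. The sketch below illustrates that idea with the standard library; the `sample_from` helper and `weights` list are hypothetical, and this is not necessarily how `cdf.Sample` is implemented.

```python
import random

def sample_from(values, n, rng=random):
    """Draw n values (with replacement) by mapping uniform p to sorted data."""
    ordered = sorted(values)
    size = len(ordered)
    out = []
    for _ in range(n):
        p = rng.random()                        # uniform in [0, 1)
        index = min(int(p * size), size - 1)    # guard against rounding up
        out.append(ordered[index])
    return out

# Hypothetical birth weights in pounds, for illustration only
weights = [5.5, 6.25, 7.0, 7.375, 8.4, 9.125]
draws = sample_from(weights, 10, rng=random.Random(1))
print(draws)
```

Each observed value is chosen in proportion to how often it appears in the data, so the sample follows the same distribution as the original values.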

Draw a random sample from cdf, then compute the percentile rank for each value, and plot the distribution of the percentile ranks.


In [50]:
t = [cdf.PercentileRank(x) for x in cdf.Sample(1000)]
cdf2 = thinkstats2.Cdf(t)
thinkplot.Cdf(cdf2)
thinkplot.Show(legend=False)
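
The plot should be close to a straight line, because percentile ranks of values drawn from a distribution are approximately uniform between 0 and 100. A rough numeric check of that claim, using hypothetical normally distributed data rather than the NSFG weights:

```python
import random
from bisect import bisect_right

rng = random.Random(17)

# Hypothetical data: 500 draws from a normal distribution
data = sorted(rng.gauss(7.3, 1.2) for _ in range(500))

def percentile_rank(x):
    """Percentage of values in data that are <= x."""
    return 100.0 * bisect_right(data, x) / len(data)

# Resample from the data and compute the rank of each draw
ranks = [percentile_rank(rng.choice(data)) for _ in range(1000)]

# If the ranks are uniform, about a quarter of them fall in each band
bands = [sum(1 for r in ranks if lo < r <= lo + 25) for lo in (0, 25, 50, 75)]
print(bands)
```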


Generate 1000 random values using random.random() and plot their PMF.


In [55]:
import random
t = [random.random() for _ in range(1000)]
pmf = thinkstats2.Pmf(t)
thinkplot.Pmf(pmf, linewidth=0.1)
thinkplot.Show()


If the PMF is not very informative, try plotting the CDF instead.


In [56]:
cdf = thinkstats2.Cdf(t)
thinkplot.Cdf(cdf)
thinkplot.Show()
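
The PMF is uninformative here because 1000 floating-point values from `random.random()` are almost certainly all distinct, so each one has probability 1/1000. The CDF, by contrast, stays close to the line y = x expected for a uniform distribution. A sketch that checks both points numerically, with an arbitrary seed chosen for repeatability:

```python
import random
from collections import Counter

rng = random.Random(3)
t = [rng.random() for _ in range(1000)]

# PMF view: count how often each exact value occurs
counts = Counter(t)
distinct = len(counts)
max_prob = max(counts.values()) / len(t)
print(distinct, max_prob)

# CDF view: largest gap between the empirical CDF and the line y = x
ordered = sorted(t)
n = len(ordered)
max_gap = max(abs((i + 1) / n - x) for i, x in enumerate(ordered))
print(max_gap)
```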



In [60]:
import scipy.stats

In [64]:
scipy.stats.norm.cdf(0)


Out[64]:
0.5
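
The last cell evaluates the standard normal CDF at 0, which is 0.5 because the distribution is symmetric about its mean. As a cross-check that does not need SciPy, the same value can be computed with the standard library, assuming Python 3.8 or later for `statistics.NormalDist`:

```python
import math
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)
p_zero = std_normal.cdf(0)

# Closed form: Phi(x) = (1 + erf(x / sqrt(2))) / 2
p_erf = 0.5 * (1.0 + math.erf(0 / math.sqrt(2)))

print(p_zero, p_erf)
```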
