T-tests in Python

Author: Paul M. Magwene


In [1]:
%matplotlib inline
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
import pandas as pd

t-tests in scipy.stats

The scipy.stats module has three functions for carrying out t-tests:

  1. ttest_1samp(a, popmean) -- carries out a one sample t-test, comparing the mean in a to the given popmean.

  2. ttest_ind(a,b) -- carries out a t-test for the mean of two independent samples a and b

  3. ttest_rel(a,b) -- carries out a paired t-test for related samples a and b

We will illustrate the use of the functions using sample data sets from the OpenIntro Statistics textbook.
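Before turning to those data sets, here is a minimal sketch of the three calls on synthetic data. The arrays `a`, `b`, and `c`, the seed, and the distribution parameters are all made up for illustration; only the three `scipy.stats` functions come from the list above.

```python
import numpy as np
import scipy.stats as stats

# Synthetic data, invented purely for illustration
rng = np.random.RandomState(0)
a = rng.normal(loc=10.0, scale=2.0, size=30)
b = a + rng.normal(loc=0.5, scale=1.0, size=30)  # measurements paired with a
c = rng.normal(loc=11.0, scale=2.0, size=40)     # a separate, independent sample

one_sample = stats.ttest_1samp(a, 10.0)  # is the mean of a equal to 10?
paired = stats.ttest_rel(a, b)           # is the mean of (a - b) equal to 0?
independent = stats.ttest_ind(a, c)      # do a and c share the same mean?

results = [("one-sample", one_sample), ("paired", paired), ("independent", independent)]
for name, res in results:
    print("{}: t = {:0.3f}, p = {:0.4f}".format(name, res.statistic, res.pvalue))
```

Each call returns a result with `statistic` and `pvalue` fields, which is the pattern used throughout the rest of this notebook.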

One sample t-tests

We'll start with a one-sample t-test. To illustrate this we'll use a data set on brushtail possums that we used previously (see the previous notebook).

Possum tail length

Previous studies of brushtail possums in Australia have established that the mean tail length of adult possums is 37.86cm. I am studying an isolated population of possums in the state of Victoria, and I am interested in whether mean tail length of Victorian possums is shorter than that of possums in the rest of Australia.

Null and Alternative Hypotheses

  • $H_0$: The average tail length of Victorian possums is the same as that of possums in the rest of Australia (μ = 37.86)
  • $H_A$: The average tail length of Victorian possums is less than that of possums in the rest of Australia (μ < 37.86)

Because I have an a priori reason to expect Victorian possums to have shorter tails, this is a "one-tailed" hypothesis test.

Read the possum data


In [4]:
possums = pd.read_table("http://roybatty.org/possum.txt")
# rename the pop column because pop is also the name of a DataFrame method
possums.rename(columns={'pop':'popn'}, inplace=True)

# get the victoria possums
vic = possums[possums.popn == 'Vic']
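To see why the rename matters, here is a small sketch using a hypothetical three-row table (the values are invented and this is not the possum data). With a column named `pop`, attribute access returns the `DataFrame.pop` method rather than the column, so a filter written as `df.pop == 'Vic'` would not do what we want.

```python
import pandas as pd

# Hypothetical miniature table; the values are made up for illustration
df = pd.DataFrame({"pop": ["Vic", "other", "Vic"],
                   "tailL": [36.0, 38.5, 35.5]})

# Attribute access finds the DataFrame.pop method, not the column
print(callable(df.pop))

# Bracket indexing always refers to the column
print(df["pop"].tolist())

# After renaming, attribute access works as expected
renamed = df.rename(columns={"pop": "popn"})
vic_demo = renamed[renamed.popn == "Vic"]
print(len(vic_demo))
```

An alternative that avoids the rename is to use bracket indexing everywhere, e.g. `df[df["pop"] == "Vic"]`.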

Carry out the one-sample t-test


In [12]:
stats.ttest_1samp(vic.tailL, 37.86)


Out[12]:
Ttest_1sampResult(statistic=-7.6007911986606871, pvalue=1.3217904910585221e-09)

Here's the same test as above, this time showing how to work with the named fields of the tuple returned by ttest_1samp.


In [15]:
vicT = stats.ttest_1samp(vic.tailL, 37.86)
print("The t-statistic for our test is: {:0.2f}".format(vicT.statistic))
print("The p-value for our test is: {:0.10f}".format(vicT.pvalue))


The t-statistic for our test is: -7.60
The p-value for our test is: 0.0000000013

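If you compare the p-value to the previous notebook where we carried out a one-sample hypothesis test of the mean, you'll see the value we calculate here is slightly larger, reflecting the difference between the t- and normal distributions. Note also that ttest_1samp reports a two-sided p-value by default; because our alternative hypothesis is one-tailed and the observed t-statistic is negative, the one-tailed p-value is half the reported value. Using either distribution we have strong evidence for rejecting the null hypothesis.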
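The t-statistic and a one-tailed p-value can also be computed by hand, which makes the relationship to the output of ttest_1samp explicit. The sketch below uses made-up tail lengths in the array `x` (not the possum data) together with the population mean of 37.86 from above. The statistic is $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, and the lower-tail p-value comes from the t distribution with $n - 1$ degrees of freedom.

```python
import numpy as np
import scipy.stats as stats

# Made-up tail lengths (cm) for illustration only; not the possum data
x = np.array([36.0, 35.5, 37.0, 34.5, 36.5, 35.0, 38.0, 36.0])
mu0 = 37.86

n = len(x)
se = x.std(ddof=1) / np.sqrt(n)   # standard error of the mean
t_manual = (x.mean() - mu0) / se  # t = (xbar - mu0) / (s / sqrt(n))

res = stats.ttest_1samp(x, mu0)
p_two_sided = res.pvalue

# Lower-tail probability, appropriate for H_A: mu < mu0
p_lower = stats.t.cdf(t_manual, df=n - 1)

print("manual t = {:0.4f}, scipy t = {:0.4f}".format(t_manual, res.statistic))
print("two-sided p = {:0.6f}, one-sided p = {:0.6f}".format(p_two_sided, p_lower))
```

Recent SciPy releases also accept an `alternative` argument to ttest_1samp for requesting a one-sided test directly; I believe this was added in version 1.6, so check your installed version before relying on it.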

Paired t-tests

We'll use the book price example from your textbook (see section 5.2) to illustrate a paired t-test. The data are prices from the UCLA bookstore and Amazon.com for 73 textbooks used in classes at UCLA.

Our null and alternative hypotheses are:

  • $H_0$: the mean prices of textbooks at the UCLA bookstore and Amazon.com are the same

  • $H_A$: the mean prices of textbooks at the UCLA bookstore and Amazon.com are different


In [16]:
books = pd.read_table("https://github.com/Bio204-class/bio204-datasets/raw/master/textbooks.txt")

In [17]:
books.columns


Out[17]:
Index(['deptAbbr', 'course', 'ibsn', 'uclaNew', 'amazNew', 'more', 'diff'], dtype='object')

In [20]:
books.head()


Out[20]:
deptAbbr course ibsn uclaNew amazNew more diff
0 Am Ind C170 978-0803272620 27.67 27.95 Y -0.28
1 Anthro 9 978-0030119194 40.59 31.14 Y 9.45
2 Anthro 135T 978-0300080643 31.68 32.00 Y -0.32
3 Anthro 191HB 978-0226206813 16.00 11.52 Y 4.48
4 Art His M102K 978-0892365999 18.95 14.21 Y 4.74

In [21]:
books.shape


Out[21]:
(73, 7)

In [22]:
booksT = stats.ttest_rel(books.uclaNew, books.amazNew)

In [23]:
booksT


Out[23]:
Ttest_relResult(statistic=7.6487711124797526, pvalue=6.927581126065491e-11)

Our calculated t-score is 7.65, with a corresponding p-value of ~$7 \times 10^{-11}$, so we have strong evidence to reject the null hypothesis. Compare this to the calculation of the t-score for these data on page 230 of your textbook.
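A paired t-test is equivalent to a one-sample t-test on the pairwise differences, with a null value of zero. The sketch below checks this using only the five rows displayed by books.head() above, so the numbers it prints will not match the full 73-book result.

```python
import numpy as np
import scipy.stats as stats

# First five rows shown by books.head() above (not the full data set)
ucla = np.array([27.67, 40.59, 31.68, 16.00, 18.95])
amazon = np.array([27.95, 31.14, 32.00, 11.52, 14.21])

# Pairwise differences; these correspond to the diff column of the table
diff = ucla - amazon

paired = stats.ttest_rel(ucla, amazon)
one_sample = stats.ttest_1samp(diff, 0.0)

print("paired:     t = {:0.4f}, p = {:0.4f}".format(paired.statistic, paired.pvalue))
print("one-sample: t = {:0.4f}, p = {:0.4f}".format(one_sample.statistic, one_sample.pvalue))
```

The two lines of output should agree, which is why the textbook works directly with the diff column.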

t-test for independent samples

To illustrate t-tests for independent samples we'll use the smoking and birthweight example from section 5.3 of your textbook.

This data set includes 150 cases of mothers and their newborns in North Carolina. As per the textbook (Diez et al. 2015), the null and alternative hypotheses we want to test are:

  • $H_0$: There is no difference in average birth weight for newborns from mothers who did and did not smoke. In statistical notation: $\mu_n - \mu_s = 0$, where $\mu_n$ represents non-smoking mothers and $\mu_s$ represents mothers who smoked.

  • $H_A$: There is some difference in average newborn weights from mothers who did and did not smoke ($\mu_n - \mu_s \neq 0$).


In [24]:
births = pd.read_table("https://github.com/Bio204-class/bio204-datasets/raw/master/births.txt")

In [25]:
births.head()


Out[25]:
fAge mAge weeks premature visits gained weight sexBaby smoke
0 31 30 39 full term 13 1 6.88 male smoker
1 34 36 39 full term 5 35 7.69 male nonsmoker
2 36 35 40 full term 12 29 8.88 male nonsmoker
3 41 40 40 full term 13 30 9.00 female nonsmoker
4 42 37 40 full term NaN 10 7.94 male nonsmoker

In [26]:
births.shape


Out[26]:
(150, 9)

Let's use the groupby and describe methods to generate some useful summary statistics on baby weight, grouped by whether the mother smoked or not.


In [35]:
births.groupby('smoke').weight.describe()


Out[35]:
smoke           
nonsmoker  count    100.000000
           mean       7.179500
           std        1.434152
           min        1.630000
           25%        6.702500
           50%        7.440000
           75%        8.060000
           max       10.130000
smoker     count     50.000000
           mean       6.779000
           std        1.597415
           min        1.690000
           25%        6.220000
           50%        6.970000
           75%        7.810000
           max        9.130000
dtype: float64

We see that the mean birthweight for babies of non-smoking mothers is greater than that for smoking mothers (7.18 vs 6.78 lbs). However, there is substantial variation within both groups.


In [27]:
# subset data based on smoke column
nonsmokers = births[births.smoke == 'nonsmoker']
smokers = births[births.smoke == "smoker"]

In [28]:
birthT = stats.ttest_ind(nonsmokers.weight, smokers.weight)

In [29]:
birthT


Out[29]:
Ttest_indResult(statistic=1.5516757058253177, pvalue=0.12287562175328536)

For this example, we fail to reject the null hypothesis of no difference in means at the significance level $\alpha=0.05$, because the p-value (0.123) is greater than $\alpha$.
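By default ttest_ind assumes the two groups have equal variances and uses a pooled variance estimate. As a rough check on how much that assumption matters here, the sketch below recomputes the test from the summary statistics in the describe() output above, once with pooled variances and once with Welch's unequal-variance version. It assumes a significance level of 0.05; the variable names are my own.

```python
import scipy.stats as stats

# Summary statistics copied from the describe() output above
mean_n, sd_n, n_n = 7.1795, 1.434152, 100  # nonsmokers
mean_s, sd_s, n_s = 6.7790, 1.597415, 50   # smokers

# Pooled-variance test (the ttest_ind default, equal_var=True)
pooled = stats.ttest_ind_from_stats(mean_n, sd_n, n_n,
                                    mean_s, sd_s, n_s, equal_var=True)

# Welch's t-test, which does not assume equal variances
welch = stats.ttest_ind_from_stats(mean_n, sd_n, n_n,
                                   mean_s, sd_s, n_s, equal_var=False)

alpha = 0.05  # assumed significance level
for name, res in [("pooled", pooled), ("Welch", welch)]:
    decision = "reject" if res.pvalue < alpha else "fail to reject"
    print("{}: t = {:0.4f}, p = {:0.4f} -> {} H0".format(
        name, res.statistic, res.pvalue, decision))
```

The pooled result should reproduce the statistic and p-value reported by ttest_ind above, up to rounding in the summary statistics. With the raw data, the Welch version can be requested with `stats.ttest_ind(nonsmokers.weight, smokers.weight, equal_var=False)`.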

