Title: T-Tests
Slug: t-tests
Summary: T-tests in Python.
Date: 2016-02-08 12:00
Category: Statistics
Tags: Basics
Authors: Chris Albon
In [1]:
from scipy import stats
import numpy as np
In [2]:
# Create an array of 20 observations drawn from a normal distribution
# with mean 1 and a standard deviation of 1.5
x = np.random.normal(1, 1.5, 20)
# Create an array of 20 observations drawn from a normal distribution
# with mean 0 and a standard deviation of 1.5
y = np.random.normal(0, 1.5, 20)
Picture the one-sample t-test as drawing a (normally shaped) hill centered at the sample mean (about 1 here), whose "spread" is set by the sample's standard deviation and narrows as the number of observations grows, then placing a flag at 0 and looking at where on the hill the flag is located. Is it near the top? Far away from the hill? If the flag is at the very bottom of the hill or farther out, the t-test p-value will be below 0.05.
In [3]:
# Run a one-sample t-test of whether the mean of x is statistically significantly
# different from 0; index [1] of the result is the p-value
pvalue = stats.ttest_1samp(x, 0)[1]
# View the p-value
pvalue
Out[3]:
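To make the hill-and-flag picture concrete, here is a minimal sketch that computes the one-sample t statistic and its two-sided p-value by hand and compares them with `stats.ttest_1samp`. The seed and the names `sample`, `t_manual`, and `p_manual` are assumptions chosen for illustration; they are not part of the cells above.

```python
import numpy as np
from scipy import stats

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
sample = rng.normal(1, 1.5, 20)

n = sample.size
# Standard error of the mean, using the sample standard deviation (ddof=1)
standard_error = sample.std(ddof=1) / np.sqrt(n)
# t statistic: how many standard errors the sample mean sits from the flag at 0
t_manual = (sample.mean() - 0) / standard_error
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# The library call should agree with the hand computation
t_scipy, p_scipy = stats.ttest_1samp(sample, 0)
print(np.isclose(t_manual, t_scipy), np.isclose(p_manual, p_scipy))
```

The width of the hill in this computation is the standard error, not the raw standard deviation, which is why more observations make the same gap between the mean and the flag look larger.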
Picture the two-sample t-test as drawing two (normally shaped) hills centered at their means, with their 'flatness' (individual spread) based on the standard deviation. The t-test looks at how much the two hills overlap. Are they basically on top of each other? Do the bottoms of the hills just barely touch? If the tails of the hills barely overlap or do not overlap at all, the t-test p-value will be below 0.05.
In [4]:
# Run an independent two-sample t-test (assumes equal variances) and view the p-value
stats.ttest_ind(x, y)[1]
Out[4]:
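As a rough illustration of what the equal-variance test measures, the sketch below rebuilds the pooled two-sample t statistic by hand and checks it against `stats.ttest_ind`. The arrays `a` and `b`, the seed, and the other variable names are assumptions made for this example.

```python
import numpy as np
from scipy import stats

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
a = rng.normal(1, 1.5, 20)
b = rng.normal(0, 1.5, 20)

na, nb = a.size, b.size
# Pooled variance: a weighted average of the two sample variances
pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
# t statistic: difference in means divided by its pooled standard error
t_pooled = (a.mean() - b.mean()) / np.sqrt(pooled_var * (1 / na + 1 / nb))
# Two-sided p-value with na + nb - 2 degrees of freedom
p_pooled = 2 * stats.t.sf(abs(t_pooled), df=na + nb - 2)

# The default ttest_ind (equal_var=True) should agree with the hand computation
t_scipy, p_scipy = stats.ttest_ind(a, b)
print(np.isclose(t_pooled, t_scipy), np.isclose(p_pooled, p_scipy))
```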
In [5]:
# Run Welch's t-test, which does not assume equal variances, and view the p-value
stats.ttest_ind(x, y, equal_var=False)[1]
Out[5]:
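With `equal_var=False`, SciPy performs Welch's t-test. The following sketch, under the assumption of two illustrative arrays `a` and `b` with different spreads, computes the Welch statistic and the Welch-Satterthwaite degrees of freedom by hand and compares the result with the library call.

```python
import numpy as np
from scipy import stats

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
a = rng.normal(1, 1.5, 20)
b = rng.normal(0, 3.0, 20)  # deliberately wider spread than a

# Squared standard error of each sample mean
se2_a = a.var(ddof=1) / a.size
se2_b = b.var(ddof=1) / b.size
# Welch t statistic: no pooling of the two variances
t_welch = (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)
# Welch-Satterthwaite approximation to the degrees of freedom
df_welch = (se2_a + se2_b) ** 2 / (
    se2_a ** 2 / (a.size - 1) + se2_b ** 2 / (b.size - 1)
)
# Two-sided p-value from the t distribution with the approximate df
p_welch = 2 * stats.t.sf(abs(t_welch), df=df_welch)

t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=False)
print(np.isclose(t_welch, t_scipy), np.isclose(p_welch, p_scipy))
```

When the two groups have similar variances and sizes, Welch's test and the pooled test give nearly the same answer, so Welch's version is often a reasonable default.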
In [6]:
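# Run a paired t-test, which treats x[i] and y[i] as two measurements of the same unit, and view the p-value
stats.ttest_rel(x, y)[1]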
Out[6]:
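A paired t-test is equivalent to a one-sample t-test on the pairwise differences. The sketch below illustrates this with hypothetical `before` and `after` measurements on the same 20 units; the data, the seed, and the variable names are assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
# Hypothetical paired data: each unit is measured twice
before = rng.normal(0, 1.5, 20)
after = before + rng.normal(1, 0.5, 20)  # same units, shifted by about 1

# Paired t-test on the two sets of measurements
t_rel, p_rel = stats.ttest_rel(before, after)
# One-sample t-test of whether the mean difference is 0
t_diff, p_diff = stats.ttest_1samp(before - after, 0)

print(np.isclose(t_rel, t_diff), np.isclose(p_rel, p_diff))
```

Because the test works on differences within each pair, it only makes sense when the two arrays are matched element by element, such as repeated measurements on the same subjects.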