A statistical t-test of parachute drop times using Python and numpy

In this post, we'll review how to complete a t-test using Python and numpy. A statistical t-test is one type of test used to determine whether two samples are statistically different from each other or statistically the same.

The data for the two samples comes from the measured drop times of parachutes built as part of a group project in an introductory Engineering class. Each group needed to build a parachute, and then there was a competition to determine which parachute slowed the fall of a bean bag the most. The group with the slowest-falling parachute was given extra credit.

Two of the parachutes in the class had pretty close drop times. We are going to use Python and numpy to determine if the parachute drop times are statistically different (only one group gets extra credit) or statistically the same (both groups get extra credit).

Each parachute was timed by at least 4 people. The recorded drop times for parachute a and parachute b are below. The drop times were measured in seconds. The longer the time (the higher the number of seconds), the slower the parachute descended.

parachute a    parachute b
4.4 s          3.78 s
4.11 s         4.1 s
4.7 s          3.93 s
4.64 s         4.72 s
               3.56 s
               3.92 s

Let's use Python to quickly compute the mean (average) drop time of each parachute. Remember the mean is the sum of the measurements divided by the number of measurements. We can create a list of drop times, then use Python's sum() function and len() function to compute the mean. sum() adds up the values in a list; len() returns the number of values in a list.


In [6]:
a = [4.4, 4.11, 4.7, 4.64]
b = [3.78, 4.1, 3.93, 4.72, 3.56, 3.92]
mean_a = sum(a)/len(a)
mean_b = sum(b)/len(b)
print(f' mean a = {round(mean_a,1)}, mean b = {round(mean_b,1)}')


 mean a = 4.5, mean b = 4.0

If we just go by the means, it seems like parachute a and parachute b have different fall times.

The mean of a is 4.5 s and the mean of b is 4.0 s.

But note that b contains the value 4.72, which is higher than the mean of a (4.5). Also note that a contains the value 4.11, which is very close to the mean of b (4.0).

So are the two samples different or the same? Time to build our t-test.

As part of a t-test we need to construct a hypothesis and a null hypothesis.

Hypothesis: The two parachutes have statistically different drop times.

Null Hypothesis: The two parachutes do not have statistically different drop times; any difference between the sample means is due to chance.

Next we need to choose a confidence level. How sure do we want to be that the two sets of drop times are different? For this comparison, a 90% confidence level seems reasonable.

We also need to specify what probability we will accept that the difference in drop times is just due to random variation. A 90% confidence level corresponds to a 10% significance level, so I am comfortable with a 10% chance that the two samples only look different by chance.
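As a quick sketch (the variable names confidence and alpha below are just illustrative, not part of the original analysis), the confidence level and the acceptable probability of a chance result are two sides of the same choice:


In [ ]:
# Sketch: a 90% confidence level corresponds to a 10% significance level (alpha)
confidence = 0.90
alpha = 1 - confidence
print(f'confidence level = {confidence}, significance level (alpha) = {alpha:.2f}')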


In [8]:
import numpy as np
from scipy import stats

Next we input the data as numpy arrays. Then we calculate the average number of data points per sample. Since parachute a has 4 measurements and parachute b has 6 measurements, the mean number of measurements is 5.


In [9]:
# Data from parachute drops
a = np.array([4.4,4.11,4.7,4.64])
b = np.array([3.78,4.1,3.93,4.72,3.56,3.92])
# average number of data points per sample
N = (len(a)+len(b))/2

We'll use numpy's np.mean() function to calculate the mean of each sample, and we'll use numpy's np.std() function to calculate the standard deviation of each sample (note that np.std() defaults to the population standard deviation, ddof=0). Printing the results with f-strings, rounded to a small number of decimal places, makes the means and standard deviations easier to read.


In [11]:
a_mean = np.mean(a)
b_mean = np.mean(b)
print(f' mean of a = {round(a_mean,1)}')
print(f' mean of b = {round(b_mean,1)}')
a_std = np.std(a)
b_std = np.std(b)
print(f' stdev of a = {round(a_std,2)}')
print(f' stdev of b = {round(b_std,2)}')


 mean of a = 4.5
 mean of b = 4.0
 stdev of a = 0.23
 stdev of b = 0.36

In [12]:
# Compute the sample variance of each set of drop times (ddof=1 divides by n-1)
var_a = a.var(ddof=1)
var_b = b.var(ddof=1)

In [13]:
# Compute the pooled standard deviation (square root of the average of the two sample variances)
s = np.sqrt((var_a + var_b)/2)
print(f'standard deviation = {round(s,2)} s')


standard deviation = 0.34 s
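Averaging the two variances like this treats the samples as if they were the same size. Since parachute a has 4 measurements and parachute b has 6, a version weighted by each sample's degrees of freedom is slightly more precise. The sketch below is just for comparison and isn't needed for the rest of the post (s_pooled is an illustrative name, not from the original analysis); for this data it works out to roughly 0.35 s, close to the 0.34 s above.


In [ ]:
# Sketch: pooled standard deviation weighted by degrees of freedom (n - 1)
s_pooled = np.sqrt(((len(a) - 1)*var_a + (len(b) - 1)*var_b)/(len(a) + len(b) - 2))
print(f'df-weighted pooled standard deviation = {round(s_pooled,2)} s')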

In [14]:
## Calculate the t-statistic
t = (a.mean() - b.mean())/(s*np.sqrt(2/N))
print(f't-value = {round(t,2)} ')


t-value = 2.16 

In [15]:
## Compare with the critical t-value
#Degrees of freedom
df = len(a)+len(b)-2
print(f'degrees of freedom = {round(df,0)} ')


degrees of freedom = 8 

In [16]:
# p-value from comparing t with the t distribution (one-tailed)
p = 1 - stats.t.cdf(t,df=df)
print(f'after comparison with t, p-value = {round(p,3)} ')


after comparison with t, p-value = 0.032 

In [17]:
print("t = " + str(t))
print("p = " + str(2*p))
# Note that we multiply the p-value by 2 because this is a two-tailed t-test
### Comparing the t-statistic to the t distribution gives a two-tailed p-value of about 0.063,
### which is below our 10% significance level, so we reject the null hypothesis
### and conclude that the means of the two samples are statistically different at the 90% confidence level.


t = 2.156340229982052
p = 0.0631484553773829

In [18]:
#based on chart https://towardsdatascience.com/inferential-statistics-series-t-test-using-numpy-2718f8f9bf2f
# one-tailed t @ 90% confidence and 8 degrees of freedom = 1.397

# if the calculated t-value (from the data) is greater than the critical t-value (from the table), 
# the two samples are statistically different at a 90% confidence level

t_crit = 1.397
if t>t_crit:
    print(f'calculated t = {round(t,2)} is greater than critical t = {round(t_crit,3)} ')
    print('The two samples are statistically different from each other at a 90% confidence level')
    print(f'p-value of p = {round(2*p,2)} indicates there is roughly a {100*round(2*p,2)} % chance of seeing a difference this large if the two parachutes actually fell at the same rate')
else:
    print(f'calculated t = {round(t,2)} is smaller than critical t = {round(t_crit,3)} ')
    print('The two samples are not statistically different from each other at a 90% confidence level')


calculated t = 2.16 is greater than critical t = 1.397 
The two samples are statistically different from each other at a 90% confidence level
p-value of p = 0.06 indicates there is roughly a 6.0 % chance of seeing a difference this large if the two parachutes actually fell at the same rate
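
We could also get the critical t-value from scipy instead of reading it off a chart. A minimal sketch under the same setup (the one-tailed value at a 10% significance level with 8 degrees of freedom, which is what the 1.397 above corresponds to; t_crit_scipy is just an illustrative name):


In [ ]:
# Sketch: compute the critical t-value with scipy's inverse CDF (percent point function)
alpha = 0.10                                  # 10% significance level (90% confidence)
t_crit_scipy = stats.t.ppf(1 - alpha, df=df)  # one-tailed critical value for df = 8
print(f'critical t from scipy = {round(t_crit_scipy,3)}')

This reproduces the 1.397 used above.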

In [25]:
# use scipy's built-in function to calculate the t-value and p-value. With equal_var=False this is
# Welch's t-test, which does not assume equal variances, so we get a very similar (but not identical) result.

t2, p2 = stats.ttest_ind(a,b, equal_var=False)
print(round(t2,3))
print(round(p2,3))


2.195
0.06
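
For comparison, setting equal_var=True runs the classic pooled-variance (Student's) two-sample t-test. Because the hand calculation above averages the two variances and sample sizes instead of weighting them by degrees of freedom, the numbers may still differ slightly; this sketch just shows the call (t3 and p3 are illustrative names):


In [ ]:
# Sketch: the pooled-variance (Student's) t-test, which assumes equal variances
t3, p3 = stats.ttest_ind(a, b, equal_var=True)
print(round(t3,3))
print(round(p3,3))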
