```
In [1]:
```%matplotlib inline
import numpy as np
import scipy.stats as stats
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib

```
In [2]:
```# download from http://roybatty.org/statplots.py
# and upload to your Jupyter instance if you haven't done so already
import statplots

See lecture slides for intro discussion on hypothesis testing.

I've posted a data set at the following URL -- http://roybatty.org/possum.txt --
that contains morphological measurements on 104 brushtail possums (*Trichosurus
vulpecula*). This data set contains information on variables such as sex, age,
head length, skull width, etc.

By Bryce McQuillan - originally
posted to Flickr as Brushtail Opossum
Trichosurus vulpecula, CC BY-SA 2.0,
https://commons.wikimedia.org/w/index.php?curid=4195284

Previous studies of brushtail possums in Australia have established that the mean tail length of adult possums is 37.86cm. I am studying an isolated population of possums in the state of Victoria, and I am interested in whether mean tail length of Victorian possums is shorter than that of possums in the rest of Australia.

- $H_0$: The average tail length of Victoria possums is the same as those in the rest of Australia ($\mu = 37.86$)
- $H_A$: The average tail length of Victoria possums is less than the rest of Australia ($\mu < 37.86$)

Since I have an a priori reason to believe the difference in tail length is shorter, this is a "one-tailed" hypothesis test (pun intended!).

```
In [3]:
```possums = pd.read_table("http://roybatty.org/possum.txt")
possums.rename(columns={'pop':'popn'}, inplace=True)
possums.columns

```
Out[3]:
```

```
In [4]:
```vic = vic = possums[possums.popn == 'Vic']
print(vic.tailL.describe())

```
```

```
In [5]:
```fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2)
fig.set_size_inches(10,4)
ax1.hist(vic.tailL, bins=9, color='steelblue', normed=True)
ax1.set_xlabel("tail length (cm)")
stats.probplot(vic.tailL, plot=ax2)
pass

```
```

```
In [6]:
```mean_tl = vic.tailL.mean()
std_tl = vic.tailL.std()
SE_tl = std_tl/np.sqrt(vic.tailL.count())
tl_ci95 = mean_tl - 1.96 * SE_tl, mean_tl + 1.96 * SE_tl
print("sample mean tail length = {:.2f}".format(mean_tl))
print("SE of mean tail length = {:.2f}".format(SE_tl))
print("95% CI for mean = ({:.2f}, {:.2f})".format(tl_ci95[0], tl_ci95[1]))

```
```

```
In [7]:
```fig = plt.figure()
fig.set_size_inches(10,4)
mean_null = 37.86
plt.plot([],[]) # draw an empty plot to setup axes
# draw line at x-value corresponding to H0
plt.vlines(mean_null, 0, 1, linestyle='dashed', color='k')
plt.text(mean_null+0.05, 0.2, "$H_0: \mu = {:.2f}$".format(mean_null),
fontsize=18)
plt.errorbar(mean_tl, 0.2, xerr=1.96 * SE_tl, fmt='ro',
label='95% CI of Victoria\nmean tail length')
plt.legend(numpoints=1, loc='upper left')
plt.yticks([]) # get ride of y-ticks
plt.xlabel("Tail length (cm)", fontsize=14)
plt.xlim(35, 40)
pass

```
```

The CI approach above let us assess, in a general way, whether there's evidence to reject the null hypothesis. In the above example, we might state our finding something along the lines of "We are 95% confident that the mean tail length of Victoria possums is between 35.44 and 36.43mm. We therefore reject the null hypothesis that Victoria possums have the same mean tail length as other Australian possums." We would make the same statement for any other estimated 95% CI that did not overlap 37.86. Thus the CI approach doesn't really tell us anything about the strength of evidence against the null hypothesis, it just gives us a basis by which to reject it (or fail to reject it).

To assess the strength of the evidence against the null hypothesis, we can ask "what is the probability of observing data (or a statistic of interest) **at least as favorable** to the alternative hypothesis, **if the null hypothesis were true**?"

When considering a statistic of interest the usual way to phrase this question is with respect to the expected *sampling distribution of the statistic of interest* under the null hypothesis. For example, when testing a hypothesis involving means, we would ask "What is the probability of observing my sample mean, with respect to the expected distribution of sample means, if the null hypothesis was true."

Let's apply this to our possum tail example. Our null hypothesis, $H_0$, is that the mean tail length in Victoria possums should be the same as the rest of Australia ($\mu=37.86$). Let's assume the standard deviation of tail length under the null and alternative hypothesis is the same, so that we can estimate the sampling distribution of the mean under the null hypothesis by using our sample standard deviation.

First we calculate the expected distribution of the sample means under the null hypothesis and plot this estimate.

```
In [8]:
```mean_null = 37.86
se_null = SE_tl # because we assumed H0 and HA have same SE
fig = plt.figure()
fig.set_size_inches(10,4)
null_distn, ax = statplots.norm_plot(mean_null, se_null, nstds=4, color='k',
label='Distn of Sample Means\nUnder $H_0$')
plt.xlabel("Mean tail length (cm)")
# make a fancy legend box
plt.legend(loc='upper right', fancybox=True, shadow=True, borderpad=1)
pass

```
```

Now let's modify this figure to draw the observed sample mean relative to the expected distribution of sample means under $H_0$.

Most of the following code is involved in generating a nice plot, not carrying out any statistical calculations. One thing to note is that I've written the `statplots.norm_plot`

function so that it returns the "frozen" `scipy.stats.norm`

distribution that corresponds to the plot you asked it to generate. This is convenient because you can then carry out subsequent calculations with this distribution.

```
In [9]:
```fig = plt.figure()
fig.set_size_inches(10,4)
distn, ax = statplots.norm_plot(mean_null, SE_tl, nstds=8, color='k',
label='Distn of Sample Means\nUnder $H_0$')
ax.annotate("Observed $\overline{x}$", xy=(mean_tl,0),
xytext=(0,40),
textcoords='offset points',
horizontalalignment="center",
verticalalignment="bottom",
arrowprops=dict(arrowstyle="->",color='black'))
plt.xlabel("Mean tail length (cm)")
plt.legend(fancybox=True, shadow=True, borderpad=1)
pass

```
```

```
In [10]:
```z = (mean_tl - mean_null)/se_null
z

```
Out[10]:
```

Our calculation confirms our graphical intuition. Our observed sample mean is 7.6 standard errors to the left of the the mean under the null hypothesis!

Finally, let's calculate the probability of observing a value this extreme under the null hypothesis. Since this is a one-sided hypothesis test, with our alternative hypothesis to the left of the null hypotehsis, we only have to the sum probability to the left of our observed value of the mean.

```
In [11]:
```# probability to the left of observed mean(x) under H0
pval = stats.norm.cdf(z) # since Z score we can use the std normal to compute
print("p-value =", pval)

```
```

Let's test another hypothesis, this time a two-side one. We wish to test the hypothesis that the mean total body length of Victoria possums is not equal to the mean total length of other Australian possums, which previous studies have shown to be 86.8 cm$

- $H_0$: The average body length of Victoria possums is the same as those in the rest of Australia ($\mu = 86.8$)
- $H_A$: The average body length of Victoria possums is differnt than the rest of Australia ($\mu \neq 86.8$)

```
In [12]:
```fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2)
fig.set_size_inches(10,4)
ax1.hist(vic.totalL, bins=10, color='steelblue', normed=True)
ax1.set_xlabel("head length (mm)")
stats.probplot(vic.totalL, plot=ax2)
pass

```
```

```
In [13]:
```vic.totalL.describe()

```
Out[13]:
```

```
In [14]:
```mean_totL = vic.totalL.mean()
std_totL = vic.totalL.std()
se_totL = std_totL/np.sqrt(vic.totalL.count())
totL_ci95 = mean_totL - 1.96 * se_totL, mean_totL + 1.96 * se_totL
print("sample mean total length = {:.3f}".format(mean_totL))
print("SE of mean total length = {:.2f}".format(se_totL))
print("95% CI for mean = ({:.2f}, {:.2f})".format(totL_ci95[0], totL_ci95[1]))

```
```

```
In [15]:
```fig = plt.figure()
fig.set_size_inches(10,4)
mean_null = 86.8
plt.plot([],[]) # draw an empty plot to setup axes
# draw line at x-value corresponding to H0
plt.vlines(mean_null, 0, 1, linestyle='dashed', color='k')
plt.text(mean_null+0.05, 0.5, "$H_0: \mu = {:.2f}$".format(mean_null),
fontsize=18)
plt.errorbar(mean_totL, 0.2, xerr=1.96 * se_totL, fmt='ro',
label='95% CI of Victoria\nmean total length')
plt.legend(numpoints=1, loc='upper left')
plt.yticks([]) # get ride of y-ticks
plt.xlabel("Total length (cm)", fontsize=14)
plt.xlim(85, 90)
pass

```
```

```
In [16]:
```se_null = se_totL # because we assumed H0 and HA have same SE
fig = plt.figure()
fig.set_size_inches(10,4)
null_distn, ax = statplots.norm_plot(mean_null, se_null, nstds=4, color='k',
label='Distn of Sample Means\nUnder $H_0$')
ax.annotate("Observed $\overline{x}$", xy=(mean_totL,0),
xytext=(0,20),
textcoords='offset points',
horizontalalignment="center",
verticalalignment="bottom",
arrowprops=dict(arrowstyle="->",color='black'))
plt.xlabel("Mean tail length (cm)")
# make a fancy legend box
plt.legend(loc='upper right', fancybox=True, shadow=True, borderpad=1)
plt.xlim(84,90)
pass

```
```

```
In [17]:
```z = (mean_totL - mean_null)/se_null
z

```
Out[17]:
```

```
In [18]:
```fig = plt.figure()
fig.set_size_inches(10,4)
distn,ax = statplots.norm_plot(mean_null, se_null, nstds=4, color='k',
label='Distn of Sample Means\nUnder $H_0$')
# left tail
statplots.area_under_distn(distn, mean_null - 4*se_null, mean_null - z*se_null, color='gray')
# right tail
statplots.area_under_distn(distn, mean_null + 4*se_null, mean_null + z*se_null, color='gray')
plt.xlim(84,90)
pass

```
```

```
In [20]:
``````
### p-value of two-tail hypothesis test
```

```
In [19]:
```pleft = stats.norm.cdf(-z) # left tail = cdf
pright = stats.norm.sf(z) # right tail = sf
pval = pleft + pright
print("p-value =", pval)

```
```