# Homework 10

#### CHE 116: Numerical Methods and Statistics

5/5/2018

Homework Requirements:

1. Write all equations in $\LaTeX$
2. Simplify all expressions
4. Explain or show your work

# 1. Conceptual Questions

1. In the following picture, which color area corresponds to the p-value?
2. If a significance level goes up, is it easier or harder to reject a null hypothesis?
3. If you only have one data point, which hypothesis test or tests can be used?
4. Is it meaningful to perform a Wilcoxon Signed Rank Test if the two paired data are in different unit systems?
5. We haven't learned about a "binomial hypothesis test", but what would the null hypothesis of such a test be? Describe a situation where you would use it.

### 1.1

Blue

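### 1.2

Easier. A higher significance level means more $p$-values fall below it, so the null hypothesis is rejected more often.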
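To make the answer to question 2 concrete, here is a small sketch that compares one fixed $p$-value against two significance levels. The $p$-value is the one reported in problem 2.2 of this homework; the two levels, 0.05 and 0.10, are illustrative assumptions and not part of the assignment.

```python
# p-value reported in problem 2.2 of this homework
p = 0.0965

# two illustrative significance levels (assumed, not part of the assignment)
alphas = [0.05, 0.10]

# the null hypothesis is rejected whenever p falls below the significance level
decisions = [(alpha, p < alpha) for alpha in alphas]
for alpha, reject in decisions:
    print(f"alpha = {alpha:.2f}: reject null? {reject}")
```

The same data fails to reject at the lower level and rejects at the higher one, which is why raising the significance level makes rejection easier.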

### 1.3

zM test

### 1.4

Yes

### 1.5

The null hypothesis is that a count out of $N$ trials came from the population binomial distribution. For example, you count how many questions out of 6 you get correct on a homework when you normally have a probability of 0.3 of getting a question correct.
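One possible way to carry out such a test is sketched below, following the same pattern as the Poisson test in 2.1. The values $N = 6$ and $p = 0.3$ come from the example above; the observed count of 5 correct answers is a hypothetical number chosen only for illustration, and the one-sided tail is likewise an assumption.

```python
import scipy.stats as ss

N = 6            # questions on the homework (from the example above)
p_correct = 0.3  # usual probability of a correct answer (from the example above)
observed = 5     # hypothetical count, chosen only for illustration

# one-sided p-value: probability of getting `observed` or more correct
# under the null hypothesis, i.e. one minus the CDF up to observed - 1
p_value = 1 - ss.binom.cdf(observed - 1, N, p_correct)
print(p_value)
```

With these assumed numbers the $p$-value is about 0.011, so the null hypothesis would be rejected at a significance level of 0.05.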

# 2. Hypothesis Tests

For the following questions, state the following in Markdown and show your numerical work in Python:

• The null hypothesis
• The choice of test
• The $p$-value and if you are considering both tails (extreme values above and below) or only one side
• If the null hypothesis is rejected

Each hypothesis test occurs once in the following, so make sure you do not repeat any of them!

1. On average, 3 people fall asleep in class. Today 11 fall asleep in class. Is this significant?
2. Your average running pace over the last few years has been an 8:00 minute mile. You've tried changing running shoes and recorded the following paces on your most recent runs: 7:56, 7:45, 7:34, 8:05, 7:35. Is your running pace significantly different?
3. You are comparing two batches of a compound prepared by different technicians. The following purities have been recorded for technician A: 0.87, 0.86, 0.88, 0.93, 0.85, 0.67 and the following by technician B: 0.86, 0.96, 0.90, 0.76, 0.87, 0.83, 0.84, 0.80. Are they achieving similar purity?
4. You are assessing the efficacy of a drug that helps people lose weight. 13 people who enrolled had the following weights at admission and after 8 weeks of the drug:

| Person | Weight at Start | Weight at 8 Weeks |
|---|---|---|
| 1 | 150 | 163 |
| 2 | 212 | 194 |
| 3 | 320 | 280 |
| 4 | 250 | 265 |
| 5 | 215 | 132 |
| 6 | 186 | 172 |
| 7 | 195 | 185 |
| 8 | 203 | 187 |
| 9 | 145 | 135 |
| 10 | 168 | 140 |
| 11 | 172 | 178 |
| 12 | 240 | 211 |
| 13 | 272 | 268 |

Is there a significant effect from the drug?

5. A chemical refinery has input crude with a concentration of sulfur of 0.7% on average with a variance of 0.015%. A sample from the crude reveals a concentration of 1.2%. Is this significant enough that you should investigate?

6. You are assessing if a correlation exists between literacy rate and birthrate. You've found the following data from countries:

| Country | Literacy Rate | Birthrate per 1000 |
|---|---|---|
| Afghanistan | 38.2% | 37.90 |
| Belize | 82.7% | 24.00 |
| Laos | 79.9% | 23.60 |
| Lebanon | 93.9% | 14.30 |
| India | 72.1% | 19.00 |
| Russia | 99.7% | 11.00 |
| Argentina | 98.1% | 16.70 |
| South Africa | 94.3% | 20.20 |
| Venezuela | 95.4% | 18.80 |
| Cameroon | 75% | 35.40 |

Is there a relationship between these two?

### 2.1

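• Null hypothesis: the count of 11 is a sample from the population Poisson distribution with a mean of 3
• Poisson test
• $p = 0.0003$, considering only the upper tail (11 or more)
• Reject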


In [4]:

# 2.1
import numpy as np
import scipy.stats as ss
# P(X >= 11) for a Poisson with mean 3: one minus the CDF up to and including 10
print(1 - ss.poisson.cdf(11 - 1, 3))




0.0002923369506473428
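As a cross-check on the upper-tail calculation above, the same probability can be obtained from the survival function, which returns $P(X > k)$ directly. This is only an alternative sketch of the same computation, not a different test.

```python
import scipy.stats as ss

# survival function: P(X > 10) = P(X >= 11) for a Poisson with mean 3
p_sf = ss.poisson.sf(11 - 1, 3)

# the form used in the solution: one minus the CDF up to and including 10
p_cdf = 1 - ss.poisson.cdf(11 - 1, 3)

print(p_sf, p_cdf)
```

The two values agree to floating-point precision. The survival function is often preferred for very small tail probabilities because it avoids subtracting a number close to 1 from 1.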



### 2.2

• Null hypothesis: the recent paces come from the population normal distribution with a mean of 8:00 per mile
• t-test
• $p = 0.0965$, considering both tails
• Do not reject


In [17]:

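# 2.2
# must convert to seconds!
times = [7 * 60 + 56, 7 * 60 + 45, 7 * 60 + 34, 8 * 60 + 5, 7 * 60 + 35]
# t statistic: distance of the sample mean from 8:00, in standard errors
T = (8 * 60 - np.mean(times)) / (np.std(times, ddof=1) / np.sqrt(len(times)))
# we look at both sides, so double the lower-tail probability at -|T|
p = 2 * ss.t.cdf(-abs(T), len(times) - 1)
# print the statistic, the p-value, and the new mean pace in minutes
print(T, p, np.mean(times) / 60)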




2.16366366222047 0.09649223504829538 7.783333333333333
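The hand-computed statistic above can be checked against SciPy's built-in one-sample t-test. This is a sketch of a cross-check, assuming the default two-sided alternative of `ss.ttest_1samp`; the sign of the statistic is flipped relative to the manual calculation because SciPy subtracts the population mean from the sample mean.

```python
import scipy.stats as ss

# running paces converted to seconds, as in the solution above
times = [7 * 60 + 56, 7 * 60 + 45, 7 * 60 + 34, 8 * 60 + 5, 7 * 60 + 35]

# one-sample, two-sided t-test against the long-run pace of 8:00 (480 seconds)
result = ss.ttest_1samp(times, 8 * 60)
print(result.statistic, result.pvalue)
```

The $p$-value matches the manual result, and the statistic has the same magnitude with a negative sign.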



### 2.3

• Null hypothesis: the two sets of purities are from the same distribution
• Wilcoxon sum of ranks test
• $p = 0.70$, considering both tails
• Do not reject


In [22]:

A = [0.87, 0.86, 0.88, 0.93, 0.85, 0.67]
B = [0.86, 0.96, 0.90, 0.76, 0.87, 0.83, 0.84, 0.80]

print(ss.ranksums(A, B).pvalue)




0.6985353583033387
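To show what the sum of ranks test does with these data, the sketch below rebuilds the $p$-value by hand. It assumes the normal approximation without a tie correction, which reproduces the number above; the variable names are illustrative.

```python
import numpy as np
import scipy.stats as ss

A = [0.87, 0.86, 0.88, 0.93, 0.85, 0.67]
B = [0.86, 0.96, 0.90, 0.76, 0.87, 0.83, 0.84, 0.80]
n1, n2 = len(A), len(B)

# rank all purities together; tied values share their average rank
ranks = ss.rankdata(np.concatenate([A, B]))

# sum of the ranks that belong to technician A
rank_sum_A = ranks[:n1].sum()

# mean and standard deviation of that sum under the null hypothesis
expected = n1 * (n1 + n2 + 1) / 2
std = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# z-score and two-sided p-value from the standard normal distribution
z = (rank_sum_A - expected) / std
p_manual = 2 * ss.norm.sf(abs(z))
print(rank_sum_A, z, p_manual)
```

Technician A's rank sum is 48 against an expected 45, less than half a standard deviation away, which is why the $p$-value is so large.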



### 2.4

• Null hypothesis: the two sets of weights are from the same distribution
• Wilcoxon Signed Rank Test
• $p = 0.028$, considering both tails
• Reject


In [31]:

#2.4
# use python list to array syntax
data = np.array([
[ 1, 150, 163],
[ 2, 212, 194],
[ 3, 320, 280],
[ 4, 250, 265],
[ 5, 215, 132],
[ 6, 186, 172],
[ 7, 195, 185],
[ 8, 203, 187],
[ 9, 145, 135],
[10, 168, 140],
[11, 172, 178],
[12, 240, 211],
[13, 272, 268]
])
ss.wilcoxon(data[:,1], data[:,2])




Out[31]:

WilcoxonResult(statistic=14.0, pvalue=0.027660332975047608)
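The reported statistic of 14.0 can be reproduced by hand, which also shows where it comes from. The sketch below ranks the absolute weight changes and sums the ranks separately for gains and losses; the variable names are illustrative.

```python
import numpy as np
import scipy.stats as ss

start = np.array([150, 212, 320, 250, 215, 186, 195, 203, 145, 168, 172, 240, 272])
after = np.array([163, 194, 280, 265, 132, 172, 185, 187, 135, 140, 178, 211, 268])

# paired differences: negative values mean the person lost weight
diff = after - start

# rank the sizes of the changes; tied sizes share their average rank
ranks = ss.rankdata(np.abs(diff))

w_plus = ranks[diff > 0].sum()   # rank sum for people who gained weight
w_minus = ranks[diff < 0].sum()  # rank sum for people who lost weight

# the two-sided test statistic is the smaller of the two sums
statistic = min(w_plus, w_minus)
print(w_plus, w_minus, statistic)
```

Only three people gained weight, and their changes were among the smaller ones, so the rank sum for gains (14) is far below the rank sum for losses (77).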



### 2.5

• Null hypothesis: the sample is from the population normal distribution
• zM test
• $p = 4.5 \times 10^{-5}$, considering both tails
• Reject


In [30]:

# 2.5
# quick syntax without computing a z-score
# the CDF integrates from -\infty up to the observed value,
# so 1 - CDF is the upper tail beyond 1.2
# and 2 * doubles it to include the matching lower tail
print(2 * (1 - ss.norm.cdf(1.2, loc=0.7, scale=np.sqrt(0.015))))




4.455709060402491e-05
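The same result can be written with an explicit z-score, which shows how far the sample lies from the mean. This sketch follows the solution in treating 0.015 as the variance in the same units as the 0.7 mean.

```python
import numpy as np
import scipy.stats as ss

mean = 0.7        # average sulfur concentration
variance = 0.015  # variance, interpreted as in the solution above
sample = 1.2      # observed concentration

# number of standard deviations between the sample and the mean
z = (sample - mean) / np.sqrt(variance)

# two-sided p-value from the standard normal survival function
p = 2 * ss.norm.sf(z)
print(z, p)
```

The sample is about 4.08 standard deviations above the mean, which matches the very small $p$-value.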



### 2.6

• Null hypothesis: there is no correlation between literacy rate and birthrate
• Spearman Correlation Test
• $p = 0.001$, considering both tails
• Reject


In [32]:

# 2.6
# columns: literacy rate (%), birthrate per 1000
data = np.array([
[38.2,37.90],
[82.7,24.00],
[79.9,23.60],
[93.9,14.30],
[72.1,19.00],
[99.7,11.00],
[98.1,16.70],
[94.3,20.20],
[95.4,18.80],
[75,35.40],
# this row does not appear in the table in the problem statement
[40.2,35.60]
])

ss.spearmanr(data[:,0], data[:,1])




Out[32]:

SpearmanrResult(correlation=-0.8363636363636365, pvalue=0.0013331850799508562)
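The Spearman coefficient above can be reproduced from the rank differences, using the formula that holds when there are no ties: $\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$. The sketch uses the same eleven rows as the code cell above, and the array names are illustrative.

```python
import numpy as np
import scipy.stats as ss

# same eleven rows as the code cell above
literacy = np.array([38.2, 82.7, 79.9, 93.9, 72.1, 99.7, 98.1, 94.3, 95.4, 75, 40.2])
birthrate = np.array([37.90, 24.00, 23.60, 14.30, 19.00, 11.00,
                      16.70, 20.20, 18.80, 35.40, 35.60])

# difference between each country's literacy rank and birthrate rank
d = ss.rankdata(literacy) - ss.rankdata(birthrate)
n = len(d)

# Spearman coefficient from the squared rank differences (valid without ties)
sum_d2 = np.sum(d**2)
rho = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(sum_d2, rho)
```

The squared rank differences sum to 404, giving the same coefficient of about $-0.836$: countries that rank high in literacy tend to rank low in birthrate.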