Homework 8 Key

CHE 116: Numerical Methods and Statistics

3/27/2020


1. Central Limit Theorem (10 Points)

State whether the central limit theorem (CLT) applies to each of the following cases. 2 points each

  1. You average crops harvested across 24 fields
  2. You sum the crops harvested from 24 fields
  3. You compute the maximum height among 25 people
  4. You compute the average tire pressure of the four tires on a car
  5. You measure the weight of 25 pumpkins

1.1

Yes

1.2

Yes

1.3

No, a maximum is neither a sum nor an average

1.4

No, four tires are not enough samples

1.5

No, the weights are only measured, not averaged or summed
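
The distinction between 1.1 and 1.3 can be checked numerically. The following is a minimal sketch and not part of the original key: it assumes an exponential population (chosen only because it is clearly non-normal), 25 samples per trial, and 20,000 repeated trials. Skewness near zero suggests a roughly symmetric, normal-like shape.

```python
import numpy as np
import scipy.stats as ss

# Assumptions for illustration: exponential population, 25 samples per trial
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=(20000, 25))

# averaging within each trial: the CLT applies
means = samples.mean(axis=1)
# taking the maximum within each trial: the CLT does not apply
maxima = samples.max(axis=1)

skew_means = ss.skew(means)
skew_maxima = ss.skew(maxima)
print('skewness of means:  {:.2f}'.format(skew_means))
print('skewness of maxima: {:.2f}'.format(skew_maxima))
```

Under these assumptions the means should show a noticeably smaller skewness than the maxima, and increasing the sample size should shrink it further, while the maxima stay skewed.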

2. Confidence Intervals (30 Points)

Report the requested confidence interval for the mean using the data given in each problem. 5 points each

2.1

95% Double.

data_21 = [65.58, -28.15, 21.17, -0.57, 6.04, -10.21, 36.46, 10.67, 77.98, 15.97]

2.2

80% Upper (lower bound: a value such that the mean lies above it with 80% confidence)

data_22 = [-8.78, -6.06, -6.03, -6.9, -13.57, -18.76, 1.5, -8.21, -3.21, -11.85, -2.72, -10.38, -11.03, -10.85, -7.6, -7.76, -5.99, -10.02, -6.32, -8.35, -19.28, -11.53, -6.04, -0.81, -12.01, -3.22, -9.25, -4.13, -7.22, -11.0, -14.42, 1.07]

2.3

80% Double

data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]

2.4

Redo 2.3 with a known population standard deviation of 3.5

2.5

90% Lower (upper bound)

data_25 = [2.47, 2.03, 1.82, 6.98, 2.41, 2.32, 7.11, 5.89, 5.77, 3.34, 2.75, 6.51]


In [1]:
# 2.1
import scipy.stats as ss
import numpy as np

data_21 = [65.58, -28.15, 21.17, -0.57, 6.04, -10.21, 36.46, 10.67, 77.98, 15.97]
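# two-sided 95%: leave 2.5% in each tail, with N - 1 degrees of freedom
T = ss.t.ppf(0.975, len(data_21) - 1)
# standard error of the mean from the sample standard deviation
y = np.std(data_21, ddof=1) / np.sqrt(len(data_21))
print('{:.1f} +/- {:.1f}'.format(np.mean(data_21), y * T))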


19.5 +/- 23.4

In [2]:
# 2.2
data_22 = [-8.78, -6.06, -6.03, -6.9, -13.57, -18.76, 1.5, -8.21, -3.21, -11.85, -2.72, -10.38, -11.03, -10.85, -7.6, -7.76, -5.99, -10.02, -6.32, -8.35, -19.28, -11.53, -6.04, -0.81, -12.01, -3.22, -9.25, -4.13, -7.22, -11.0, -14.42, 1.07]
# one-sided lower bound: the 20% quantile is negative, so mean + y * T lies below the mean
T = ss.t.ppf(0.2, len(data_22) - 1)
y = np.std(data_22, ddof=1) / np.sqrt(len(data_22))
print('mu > {:.2f} with 80%'.format(np.mean(data_22) + y * T))


mu > -8.88 with 80%

In [3]:
# 2.3
data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]
# two-sided 80%: leave 10% in each tail
T = ss.t.ppf(0.9, len(data_23) - 1)
y = np.std(data_23, ddof=1) / np.sqrt(len(data_23))
print('{:.1f} +/- {:.1f}'.format(np.mean(data_23), T * y))


12.6 +/- 2.4

In [4]:
# 2.4
data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]
# the population standard deviation is known, so use the normal distribution
Z = ss.norm.ppf(0.9)
y = 3.5 / np.sqrt(len(data_23))
print('{:.1f} +/- {:.1f}'.format(np.mean(data_23), Z * y))


12.6 +/- 2.0

In [5]:
# 2.5
data_25 = [2.47, 2.03, 1.82, 6.98, 2.41, 2.32, 7.11, 5.89, 5.77, 3.34, 2.75, 6.51]
# one-sided upper bound: the 90% quantile is positive, so mean + y * T lies above the mean
T = ss.t.ppf(0.9, len(data_25) - 1)
y = np.std(data_25, ddof=1) / np.sqrt(len(data_25))
print('mu < {:.2f} with 90% confidence'.format(np.mean(data_25) + y * T))


mu < 4.95 with 90% confidence
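
As a cross-check on 2.1, the same two-sided interval can be obtained from SciPy's built-in helpers rather than computing the quantile and standard error by hand. This is a sketch, not part of the original key; `ss.sem` uses `ddof=1` by default, and the confidence level is passed positionally because its keyword name differs between SciPy versions.

```python
import numpy as np
import scipy.stats as ss

data_21 = [65.58, -28.15, 21.17, -0.57, 6.04, -10.21, 36.46, 10.67, 77.98, 15.97]
n = len(data_21)

# standard error of the mean (sample standard deviation / sqrt(N))
sem = ss.sem(data_21)

# two-sided 95% interval from a t-distribution with N - 1 degrees of freedom
low, high = ss.t.interval(0.95, n - 1, loc=np.mean(data_21), scale=sem)
half_width = (high - low) / 2
print('{:.1f} +/- {:.1f}'.format(np.mean(data_21), half_width))
```

This should reproduce the 19.5 +/- 23.4 reported for 2.1. The one-sided bounds in 2.2 and 2.5 are not covered by `interval`, which is always two-sided, so the explicit `ppf` approach is still needed there.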

3. Identifying Distributions (12 Points)

For each problem, state whether it is a $t$ or normal distribution and report the distribution's parameters. Report your answer like: $T(0, 4.3, 4)$ to indicate a $t$-distribution with $\mu = 0$, $\sigma = 4.3$ and 4 degrees of freedom. Note that the $\mu$ and $\sigma$ values listed below are population values ($\sigma_x$ denotes a sample standard deviation), not the parameters of a $t$-distribution. 2 points each

  1. $P(\mu - \bar{x})$, $\sigma = 1$, $N = 5$
  2. $P(\mu - \bar{x})$, $\sigma_x = 1$, $N = 5$
  3. $P(\bar{x})$, $\mu = -1$, $\sigma = 2$, $N = 30$
  4. $P(\bar{x})$, $\mu = 1$, $\sigma_x = 2$, $N = 5$
  5. $P(\mu)$, $\bar{x} = 3$, $\sigma = 1.7$, $N = 4$
  6. $P(\mu)$, $\bar{x} = 2$, $\sigma_x = 2.1$, $N = 9$

3.1

$ \mathcal{N}(0, \frac{1}{\sqrt{5}}) $

3.2

$ \mathcal{T}(0, \frac{1}{\sqrt{5}}, 4) $

3.3

$ \mathcal{N}(-1, \frac{2}{\sqrt{30}}) $

3.4

$ \mathcal{T}(1, \frac{2}{\sqrt{5}}, 4) $

3.5

$ \mathcal{N}(3, \frac{1.7}{\sqrt{4}}) $

3.6

$ \mathcal{T}(2, \frac{2.1}{\sqrt{9}}, 8) $
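
The rule behind these answers can be written as a short helper. This is a minimal sketch: the function name `mean_distribution` and its arguments are hypothetical and not part of the original key. It assumes a known population standard deviation ($\sigma$) gives a normal distribution, a sample standard deviation ($\sigma_x$) gives a $t$-distribution with $N - 1$ degrees of freedom, and in both cases the spread is divided by $\sqrt{N}$.

```python
import math

def mean_distribution(center, spread, n, sigma_known):
    """Return ('N', mu, sigma) or ('T', mu, sigma, dof) for the mean of n samples.

    Hypothetical helper for illustration only.
    """
    # standard error: the spread shrinks with the square root of the sample size
    scale = spread / math.sqrt(n)
    if sigma_known:
        # population sigma known: normal distribution
        return ('N', center, scale)
    # only a sample standard deviation: t-distribution with n - 1 degrees of freedom
    return ('T', center, scale, n - 1)

# 3.5: sigma = 1.7 is a population value
ans_35 = mean_distribution(3, 1.7, 4, sigma_known=True)
# 3.6: sigma_x = 2.1 is a sample value
ans_36 = mean_distribution(2, 2.1, 9, sigma_known=False)
print(ans_35)
print(ans_36)
```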