[2 points] If you sum together 20 numbers sampled from a binomial distribution and 10 from a Poisson distribution, how is your sum distributed?
[2 points] If you sample 25 numbers from different beta distributions, how will each of the numbers be distributed?
[4 points] Assume a HW grade is determined as the sample mean of 3 HW problems. How is the HW grade distributed if we do not know the population standard deviation? Why?
[4 points] For part 3, how could not knowing the population standard deviation change how it's distributed? How does knowledge of that number change the behavior of a random variable?
Normal
We are not summing, so the CLT does not apply. Each number is beta distributed.
t-distribution, since we do not know population standard deviation and N < 25
We have to estimate the standard error using the sample standard deviation, which is itself a random variable. If we know the population standard deviation exactly, the sample mean is the only remaining source of randomness, so we no longer have two sources of randomness.
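As a minimal illustration of the last two answers, the sketch below compares the critical values for a two-sided 95% interval when the population standard deviation is known (normal) and when it is estimated from N = 3 samples (t-distribution with 2 degrees of freedom). The 95% level and the variable names are assumptions chosen for the example, not part of the problem.

```python
import scipy.stats as ss

# Critical values for a two-sided 95% interval (the level is an assumption for illustration)
z_crit = ss.norm.ppf(0.975)         # population standard deviation known
t_crit = ss.t.ppf(0.975, df=3 - 1)  # standard deviation estimated from N = 3 samples

# The t value is larger: the extra randomness from estimating the
# standard deviation widens the interval.
print(z_crit, t_crit)
```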
Report the given confidence interval for error in the mean using the data given for each problem and describe in words what the confidence interval is for each example. 6 points each
80% Double.
data_21 = [65.58, -28.15, 21.17, -0.57, 6.04, -10.21, 36.46, 10.67, 77.98, 15.97]
99% Upper (lower bound, a value such that the mean lies above that value 99% of the time)
data_22 = [-8.78, -6.06, -6.03, -6.9, -13.57, -18.76, 1.5, -8.21, -3.21, -11.85, -2.72, -10.38, -11.03, -10.85, -7.6, -7.76, -5.99, -10.02, -6.32, -8.35, -19.28, -11.53, -6.04, -0.81, -12.01, -3.22, -9.25, -4.13, -7.22, -11.0, -14.42, 1.07]
95% Double
data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]
Redo part 3 with a known standard deviation of 2
95% Lower (upper bound)
data_25 = [2.47, 2.03, 1.82, 6.98, 2.41, 2.32, 7.11, 5.89, 5.77, 3.34, 2.75, 6.51]
In [19]:
import numpy as np
import scipy.stats as ss
data_21 = [65.58, -28.15, 21.17, -0.57, 6.04, -10.21, 36.46, 10.67, 77.98, 15.97]
se = np.std(data_21, ddof=1) / np.sqrt(len(data_21))
T = ss.t.ppf(0.9, df=len(data_21) - 1)
print(np.mean(data_21), T * se)
In [20]:
data_22 = [-8.78, -6.06, -6.03, -6.9, -13.57, -18.76, 1.5, -8.21, -3.21, -11.85, -2.72, -10.38, -11.03, -10.85, -7.6, -7.76, -5.99, -10.02, -6.32, -8.35, -19.28, -11.53, -6.04, -0.81, -12.01, -3.22, -9.25, -4.13, -7.22, -11.0, -14.42, 1.07]
se = np.std(data_22, ddof=1) / np.sqrt(len(data_22))
Z = ss.norm.ppf(1 - 0.99)
print(Z * se + np.mean(data_22))
In [21]:
data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]
se = np.std(data_23, ddof=1) / np.sqrt(len(data_23))
T = ss.t.ppf(0.975, df=len(data_23) - 1)
print(np.mean(data_23), T * se)
In [23]:
data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]
# known population standard deviation of 2, so use the normal distribution
se = 2 / np.sqrt(len(data_23))
Z = ss.norm.ppf(0.975)
print(np.mean(data_23), Z * se)
In [28]:
data_25 = [2.47, 2.03, 1.82, 6.98, 2.41, 2.32, 7.11, 5.89, 5.77, 3.34, 2.75, 6.51]
se = np.std(data_25, ddof=1) / np.sqrt(len(data_25))
T = ss.t.ppf(0.95, df=len(data_25) - 1)
print(np.mean(data_25) + T * se)
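The cells above repeat the same steps, so they could be collected into one helper. This is only a sketch: the function name `two_sided_interval`, its parameters, and the `small_n=25` cutoff (taken from the N < 25 rule used in the answers above) are hypothetical choices for illustration, and it covers only the two-sided case.

```python
import numpy as np
import scipy.stats as ss

def two_sided_interval(data, confidence, sigma=None, small_n=25):
    """Return (sample mean, margin) for a two-sided confidence interval.

    sigma: known population standard deviation, or None if unknown.
    small_n: hypothetical cutoff below which the t-distribution is used.
    """
    n = len(data)
    tail = 1 - (1 - confidence) / 2  # e.g. 0.975 for a 95% interval
    if sigma is not None:
        # known population standard deviation: normal distribution
        se = sigma / np.sqrt(n)
        crit = ss.norm.ppf(tail)
    else:
        # unknown: estimate with the sample standard deviation
        se = np.std(data, ddof=1) / np.sqrt(n)
        crit = ss.t.ppf(tail, df=n - 1) if n < small_n else ss.norm.ppf(tail)
    return np.mean(data), crit * se

data_23 = [14.62, 10.34, 7.68, 15.81, 14.48]
mean_known, margin_known = two_sided_interval(data_23, 0.95, sigma=2)
mean_unknown, margin_unknown = two_sided_interval(data_23, 0.95)
print(mean_known, margin_known)
print(mean_unknown, margin_unknown)
```

The single-sided problems would need the full confidence level in `ppf` (for example `0.95` rather than `0.975`), as in the cells above.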
For each problem, state whether it is a t or normal distribution and report the distribution's $\mu$ and $\sigma$. Note that the $\mu$ and $\sigma$ values listed below are the population values. Report your answer like: $T(0, 4.3, 4)$ to indicate a $t$-distribution with $\mu = 0$, $\sigma = 4.3$ and degrees of freedom of 4. 2 Points each