Homework 10

CHE 116: Numerical Methods and Statistics

Prof. Andrew White

Version 2 (3/30/2016)


0. Revise a Problem (15 Bonus Points on HW 7)

Revisit a problem you got wrong on homework 7. If you got a perfect score on homework 7, state that fact. Go through each part you missed and state what your answer was and what your mistake was. If you already completed this on homework 8, state that you completed it on homework 8.

For example:

Problem 1.1

My answer used the scipy comb function instead of factorial.

1. Short Answer Problems (16 Points)

  1. A $t$-test and $zM$ test rely on the assumption of normality. How could you test that assumption?
  2. What is $\hat{\alpha}$ in OLS? Use words.
  3. What is $S_{\epsilon}$ in OLS? Use words.
  4. What is the difference between SSR and TSS? Use words.
  5. We learned three ways to do regression. One way was with algebraic equations (OLS-1D). What were the other two ways?
  6. What are the steps to complete for a good regression analysis?
  7. Is a goodness of fit applicable to a non-linear regression?
  8. If your residuals are not normal, is a regression still valid?
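
As a concrete illustration for question 1, here is one possible normality check (a minimal sketch; the Shapiro-Wilk test via scipy.stats.shapiro is only one option, and the sample data below is made up):

In [ ]:
import scipy.stats as ss

# Hypothetical sample; substitute the data you actually want to test
data = [-1.2, 0.4, 0.1, -0.6, 0.9, -0.3, 0.2, -1.0]

# Shapiro-Wilk test: the null hypothesis is that the data are normally distributed
W, p = ss.shapiro(data)
print(W, p)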

2. Exercises (24 Points)

  1. Are these numbers normally distributed? [-26.6,-24.0, -20.9, -25.8, -24.3, -22.6, -23.0, -26.8, -26.5, -23.6, -20.0, -23.1, -22.4, -22.5]
  2. Given $\hat{\alpha} = 1.2$, $\hat{\beta} = -5.3$, $N = 14$, $S^2_\alpha = 0.8$, $S^2_\epsilon = 0.2$, $S^2_\beta = 12$, conduct a hypothesis test on the existence of the intercept.
  3. Conduct a hypothesis test for the slope being negative using the above data. This is a one-sided hypothesis test. Hint: a good null hypothesis would be that the slope is positive.
  4. Write a function which computes the SSR for $\hat{y} = \beta_0 + \beta_1 \cos \beta_2 x$. Your function should take in one argument. You may assume $x$ and $y$ are defined (see the sketch after this list for the general pattern).
  5. In OLS-ND, if my ${\mathbf X}$ has dimensions of $53 \times 5$, how many degrees of freedom do I have?
  6. If my model equation is $\hat{z} = \beta_0 x y^{\,\beta_1}$, what would ${\mathbf F_{21}}$ be if $\hat{\beta_0} = 1.5$, $\hat{\beta_1} = 2.0$, $x_1 = 1.0$, $x_2 = 1.5$, $y_1 = 0.5$, $y_2 = 1.2$?
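
As referenced in exercise 4, here is a minimal sketch of the general pattern for an SSR function, shown for a plain straight-line model rather than the cosine model of the exercise; the hypothetical arrays below stand in for the $x$ and $y$ you may assume are defined:

In [ ]:
import numpy as np

# Hypothetical data standing in for the x and y the exercise says you may assume exist
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.2, 2.8])

def ssr(beta):
    # SSR for y-hat = beta[0] + beta[1] * x; swap this line for the
    # three-parameter cosine model of exercise 4
    yhat = beta[0] + beta[1] * x
    return np.sum((y - yhat)**2)

print(ssr([0.0, 1.0]))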

3. Regression in Excel (30 Points)

Regress the data in the next cell to a slope/intercept equation. Use np.savetxt to create a CSV file. Provide the following labeled/bolded quantities at the top of your Excel file:

  1. The slope with confidence interval
  2. The intercept with confidence interval
  3. A $p$-value for the existence of the slope. Use Excel to generate your $T$-value.

You do not need to do all the steps for a good regression, but do make a plot of your fit and the data. Use the LINEST function in Excel to compute the slope/intercept and standard errors.


In [1]:
x = [0.5,1.3, 2.1, 1.0, 2.1, 1.7, 1.2, 3.9, 3.9, 1.5, 3.5, 3.9, 5.7, 4.7, 5.8, 4.6, 5.1, 5.9, 5.5, 6.4, 6.7, 7.8, 7.4, 6.7, 8.4, 6.9, 10.2, 9.7, 10.0, 9.9]
y = [-1.6,0.5, 3.0, 3.1, 1.5, -1.8, -3.6, 7.0, 8.6, 2.2, 9.3, 3.6, 14.1, 9.5, 14.0, 7.4, 6.4, 17.2, 11.8, 12.2, 18.9, 21.9, 20.6, 15.7, 23.7, 13.6, 26.8, 22.0, 27.5, 23.3]
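
As noted above, np.savetxt can write this data out as a CSV file that Excel will open; a minimal sketch, assuming the x and y lists from the cell above (the file name hw10_data.csv is just an example):

In [ ]:
import numpy as np

# Stack x and y as two columns and write a comma-separated file for Excel;
# assumes the x and y lists defined in the cell above
data = np.column_stack((x, y))
np.savetxt('hw10_data.csv', data, delimiter=',', header='x,y', comments='')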

4. Regression in MATLAB (30 Points)

Regress the following non-linear equation in MATLAB:

$$y =\beta_0 + \beta_1 x + \beta_2 x^2 $$

Perform the regression with and without $\beta_2$. Should there be a $\beta_2$ term? Justify your answer. You do not need to do all the steps for a good regression. Do plot your two regressions and original data.

Hints:

  1. Try doing this in a MATLAB notebook so that you have syntax highlighting and autocomplete.
  2. We do not have the stats module installed for MATLAB, so if you have a $T$-statistic you need to evaluate, use a quick Python cell (see the sketch after this list) or look it up in a table.
  3. If you find yourself doing very complex optimization, stop and think.
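
For hint 2, a quick Python cell along these lines can turn a $T$-value from MATLAB into a $p$-value (the $T$-value and degrees of freedom below are made-up placeholders):

In [ ]:
import scipy.stats as ss

T = 2.5    # placeholder: paste the T-value computed in MATLAB here
dof = 17   # placeholder: N minus the number of fit coefficients

# two-sided p-value for the existence of a coefficient
p = 2 * (1 - ss.t.cdf(abs(T), dof))
print(p)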

In [4]:
x = [-5.8,-4.6, -3.9, -3.4, -1.8, -2.1, -3.0, -0.8, 0.4, -0.2, -0.4, -0.0, 2.0, 1.1, 1.4, 1.2, 3.3, 4.3, 4.3, 3.0]
y = [-6.4,-7.7, -9.3, -9.2, -8.9, -7.3, -9.5, -5.0, -3.7, -6.9, -4.0, -3.8, 2.6, -0.6, -0.7, -0.1, 5.0, 4.8, 8.5, 2.5]
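
Related to hint 3: both models here are linear in their coefficients, so no nonlinear optimization is needed. Here is a sketch of the design-matrix setup, written in Python only as a sanity check (the regression itself should still be done in MATLAB), assuming the x and y lists from the cell above:

In [ ]:
import numpy as np

# assumes the x and y lists from the cell above
xa = np.asarray(x)
ya = np.asarray(y)

# Design matrix for y = b0 + b1*x + b2*x^2; drop the last column for the model without b2
X = np.column_stack((np.ones_like(xa), xa, xa**2))
beta, res, rank, sv = np.linalg.lstsq(X, ya, rcond=None)
print(beta)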

5. Python Regression (40 Points)

Regress the following data to this equation:

$$ \hat{y} = \beta_0 \ln \frac{x}{\beta_1} $$

Follow regression best practices, including writing out all necessary equations in Markdown.


In [3]:
x = [1.4,2.3, 3.7, 5.3, 6.6, 8.2, 10.2, 11.8, 12.7, 13.3, 14.6, 17.3, 18.6, 19.5, 21.6, 22.7, 23.6, 24.1]
y = [1.0,0.3, -0.1, -0.1, -0.3, -0.4, -0.4, -0.5, -0.4, -0.5, -0.4, -0.6, -0.8, -0.8, -0.6, -0.9, -0.7, -1.1]
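
One possible starting point (a minimal sketch, not the full best-practices analysis the problem asks for): minimize the SSR for $\hat{y} = \beta_0 \ln \frac{x}{\beta_1}$ numerically, assuming the x and y lists from the cell above. The initial guess is a judgment call, and $\beta_1$ must stay positive for the logarithm to be defined.

In [ ]:
import numpy as np
import scipy.optimize as opt

# assumes the x and y lists from the cell above
xa = np.asarray(x)
ya = np.asarray(y)

def ssr(beta):
    # SSR for y-hat = beta[0] * ln(x / beta[1])
    if beta[1] <= 0:
        return np.inf  # log undefined; steer the optimizer away
    yhat = beta[0] * np.log(xa / beta[1])
    return np.sum((ya - yhat)**2)

# Nelder-Mead copes with the np.inf penalty better than gradient-based methods
result = opt.minimize(ssr, x0=[-1.0, 5.0], method='Nelder-Mead')
print(result.x)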