Homework 12

CHE 116: Numerical Methods and Statistics

4/19/2018


Homework Requirements:

  1. Write all equations in $\LaTeX$
  2. Simplify all expressions
  3. Put comments in your Python code
  4. Explain or show your work
  5. Follow the academic honesty guidelines in the syllabus

1. Conceptual Questions (20 Points)

Answer in Markdown. 2 points each.

  1. What assumption do we make on the noise terms when doing linear regression? How can we check it?

  2. Your friend tells you that it's important to minimize both the SSR and TSS. What's wrong with minimizing the TSS?

  3. How do you justify the presence of a slope?

  4. What is the best numeric value or statistic for justifying the existence of a correlation?

  5. What should you plot to justify an ordinary 4-dimensional least squares regression?

  6. Why do we use a different number of deducted degrees of freedom when doing hypothesis testing vs. performing the regression?

  7. Write a model equation for 3-dimensional ordinary least squares regression with an intercept. For example, a one-dimensional model equation without an intercept would be $y = \beta_0 x + \epsilon$.

  8. Write a model equation for when $y \propto \ln{x}$. Assume no intercept.

  9. Write a model equation for a person's life expectancy ($l$) assuming it depends on gender ($s$) and if the person eats vegetables ($v$). Assume for this problem that gender and eating vegetables are both binary (0 or 1).

  10. Write a model equation for homework performance ($h$) based on the music genre listened to while working. The following genres are considered: Kwaito, Electroswing, and Djent Metal. You can only listen to one genre at a time. Use the letters $k$, $e$, and $d$.

2. Short Answer Questions (16 Points)

Answer in Python or Markdown as appropriate. 4 points each.

  1. If $\sigma_{xy} = -2.1$, $\sigma_{x}^2 = 3.5$, $\sigma_{y}^2 = 1.7$, what is the best fit slope? How does it change if the intercept is $-2.1$?

  2. If your model equation is $y = \beta_0 + \beta_1 x + \beta_2 z + \epsilon$, what is the number of deducted degrees of freedom?

  3. If $N = 12$, $D = 2$, and $S^2_{\beta_0} = 2.5$, what is the width of a 90% confidence interval for $\beta_0$?

  4. If your best fit intercept is $\hat{\alpha} = 3$ with a standard error of $0.7$, what is the $p$-value for the existence of that intercept? Take $N = 15$ and assume it's 1D OLS.
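
For questions 3 and 4, the mechanics of a $t$-based interval and a two-sided $p$-value can be sketched as below. This is only a minimal sketch: the function names `t_interval_width` and `two_sided_p_value` are hypothetical helpers, the numbers in the usage lines are made up rather than taken from the questions, and the degrees of freedom `dof` is deliberately left as an input because choosing it is part of what the questions ask. Note that `std_err` is a standard error, i.e. the square root of a variance such as $S^2_{\beta_0}$.

```python
from scipy import stats


def t_interval_width(std_err, dof, confidence=0.95):
    """Full width of a two-sided t-based confidence interval."""
    # Two-sided interval: (1 - confidence) / 2 probability in each tail
    t_star = stats.t.ppf(1 - (1 - confidence) / 2, dof)
    return 2 * t_star * std_err


def two_sided_p_value(estimate, std_err, dof):
    """Two-sided p-value for the null hypothesis that the true value is 0."""
    t_stat = abs(estimate / std_err)
    # Probability of a statistic at least this extreme, counting both tails
    return 2 * stats.t.sf(t_stat, dof)


# Made-up illustrative numbers, NOT the values from the questions above
print(t_interval_width(std_err=0.5, dof=8, confidence=0.95))
print(two_sided_p_value(estimate=1.2, std_err=0.5, dof=8))
```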

3. Linearized Regression (24 Points)

Regress the following data to the model equation $y = \beta_0 \ln x + \beta_1 x + \beta_2 +\epsilon$ using a linearization so that you use ND OLS. Report the following:

  1. [4 points] Justification for regression. Use words and statistics.
  2. [12 points] Fit coefficients with 95% confidence intervals.
  3. [4 points] Plot the fit.
  4. [4 points] Show whether the residuals are normal.

x = [0.2, 0.29, 0.39, 0.48, 0.57, 0.66, 0.76, 0.85, 0.94, 1.04, 1.13, 1.22, 1.31, 1.41, 1.5]
y = [2.92, 2.58, 3.18, 4.27, 4.5, 3.93, 4.32, 4.57, 4.55, 4.7, 5.02, 4.21, 3.04, 4.98, 6.45]
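
One possible way to set up the linearization is to treat $\ln x$, $x$, and a constant as three separate features, so the problem becomes ordinary multidimensional least squares. The sketch below assumes NumPy and uses `np.linalg.lstsq` to solve it; the variable names `X`, `beta_hat`, and `residuals` are illustrative choices, and the confidence intervals, justification, and plots are left out.

```python
import numpy as np

x = np.array([0.2, 0.29, 0.39, 0.48, 0.57, 0.66, 0.76, 0.85,
              0.94, 1.04, 1.13, 1.22, 1.31, 1.41, 1.5])
y = np.array([2.92, 2.58, 3.18, 4.27, 4.5, 3.93, 4.32, 4.57,
              4.55, 4.7, 5.02, 4.21, 3.04, 4.98, 6.45])

# Linearization: columns are ln(x), x, and 1, matching beta_0, beta_1, beta_2
X = np.column_stack((np.log(x), x, np.ones_like(x)))

# Least squares solution of X @ beta ~ y
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted values and residuals, for use in the plot and the normality check
y_hat = X @ beta_hat
residuals = y - y_hat
print(beta_hat)
```

The column order in `X` fixes which entry of `beta_hat` corresponds to which coefficient in the model equation, so it is worth keeping them in the same order as written in the problem.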

4. Non-Linear Regression (12 Points)

Repeat problem 3 with non-linear least squares instead. Only compute the coefficients and their confidence intervals. Be sure to write out the partial derivatives that make up your $F$-matrix in Markdown.
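
As a reminder of what an $F$-matrix looks like, here is a minimal sketch for a different, hypothetical model, $f(x; a, b) = a e^{-b x}$, which is not the model from problem 3. It assumes the usual layout of one row per data point and one column per parameter, with entries $\partial f / \partial \beta_j$ evaluated at $x_i$. The function names and demo values are made up for illustration, and the finite-difference version is only there as a sanity check on the hand-derived partials.

```python
import numpy as np


def model(x, beta):
    # Hypothetical example model, NOT the model from problem 3
    a, b = beta
    return a * np.exp(-b * x)


def f_matrix(x, beta):
    """Analytic partials of the model with respect to each parameter."""
    a, b = beta
    df_da = np.exp(-b * x)            # partial of f with respect to a
    df_db = -a * x * np.exp(-b * x)   # partial of f with respect to b
    return np.column_stack((df_da, df_db))


def f_matrix_numeric(x, beta, h=1e-6):
    """Central finite-difference approximation of the same partials."""
    beta = np.asarray(beta, dtype=float)
    cols = []
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = h
        cols.append((model(x, beta + step) - model(x, beta - step)) / (2 * h))
    return np.column_stack(cols)


x_demo = np.linspace(0.1, 2.0, 5)     # made-up inputs
beta_demo = np.array([2.0, 0.7])      # made-up parameter values
F = f_matrix(x_demo, beta_demo)
print(np.allclose(F, f_matrix_numeric(x_demo, beta_demo), atol=1e-6))
```

Comparing hand-derived partials against a numerical approximation like this is a cheap way to catch sign or chain-rule mistakes before they propagate into the confidence intervals.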