We use a script that extracts your answers by looking for cells in between the cells containing the exercise statements (beginning with Exercise X.X). So you
To make markdown, please switch the cell type to markdown (from code) - you can hit 'm' when you are in command mode - and use the markdown language. For a brief tutorial see: https://daringfireball.net/projects/markdown/syntax
In the conceptual exercises you should provide an explanation, with math when necessary, for any answers. When answering with math you should use basic LaTeX, as in $$E(Y|X=x) = \int_{\mathcal{Y}} f_{Y|X}(y|x) dy = \int_{\mathcal{Y}} \frac{f_{Y,X}(y,x)}{f_{X}(x)} dy$$ for displayed equations, and $R_{i,j} = 2^{-|i-j|}$ for inline equations. (To see the contents of this cell in markdown, double click on it or hit Enter in escape mode.) To see a list of latex math symbols see here: http://web.ift.uib.no/Teori/KURS/WRK/TeX/symALL.html
Exercise 1.1. (5 pts) Recall that the Hamming loss for Binary classification ($y \in \{0,1\}$) is $$l(y,\hat y) = 1\{y \ne \hat y\} = (y - \hat y)^2$$ as long as $\hat y \in \{0,1\}$. This loss can be extended to multiclass classification where there are $K$ possible values that $y$ can take (for example 'dog','cat','squirrel' or 1-5 stars). Explain how you can re-encode $y$ and $\hat y$ to be a $K-1$ dimensional vector that generalizes binary classification, and rewrite the loss using vector operations.
Exercise 1.2 (5 pts) Ex. 2.7 in ESL
Exercise 1.3 (5 pts, 1 for each part) Recall that the true risk for a prediction function, $f$, a loss function, $\ell$, and a joint distribution for $Y,X$ is $$R(f) = E \ell(y,f(x))$$ For a training set $\{x_i,y_x\}_{i=1}^n$, the empirical risk is $$R_n = \frac{1}{n} \sum_{i=1}^n \ell(y_i,f(x_i)).$$ Let $y = x^\top \beta + \epsilon$ be a linear model for $Y|X$, where $x,\beta$ are $p$-dimensional such that $\epsilon$ is Gaussian with mean 0 and variance $\sigma^2$ (independent of X). Let $\ell(y,\hat y) = (y - \hat y)^2$ be square error loss.
Exercise 1.4 (5 pts) Ex. 3.5 in ESL
Exercise 1.5 (5 pts) Ex 3.9 in ESL
You will be graded based on several criteria, and each is on a 5 point scale (5 is excellent - A - 1 is poor - C - 0 is not answered - D/F). You should strive to 'impress us' if you want a 5. This means excellent code, well explained conclusions, well annotated plots, correct answers, etc.
We will be grading you on several criteria:
Exercise 2 You should run the following code cells to import the code and reduce the variable set. Address the questions after the code.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import LeaveOneOut
from sklearn import linear_model, neighbors
%matplotlib inline
plt.style.use('ggplot')
# dataset path
data_dir = "."
In [2]:
sample_data = pd.read_csv(data_dir+"/hw1.csv", delimiter=',')
sample_data.head()
The response variable is quality.
In [4]:
X = np.array(sample_data.iloc[:,range(1,5)])
y = np.array(sample_data.iloc[:,0])
In [6]:
def loo_risk(X,y,regmod):
"""
Construct the leave-one-out square error risk for a regression model
Input: design matrix, X, response vector, y, a regression model, regmod
Output: scalar LOO risk
"""
loo = LeaveOneOut()
loo_losses = []
for train_index, test_index in loo.split(X):
X_train, X_test = X[train_index], X[test_index]
y_train, y_test = y[train_index], y[test_index]
regmod.fit(X_train,y_train)
y_hat = regmod.predict(X_test)
loss = np.sum((y_hat - y_test)**2)
loo_losses.append(loss)
return np.mean(loo_losses)
def emp_risk(X,y,regmod):
"""
Return the empirical risk for square error loss
Input: design matrix, X, response vector, y, a regression model, regmod
Output: scalar empirical risk
"""
regmod.fit(X,y)
y_hat = regmod.predict(X)
return np.mean((y_hat - y)**2)
Exercise 2.1 (5 pts) Compare the leave-one-out risk with the empirical risk for linear regression, on this dataset.
Exercise 2.2 (10 pts) Perform kNN regression and compare the leave-one-out risk with the empirical risk for k from 1 to 50. Remark on the tradeoff between bias and variance for this dataset and compare against linear regression.
Exercise 2.3 (10 pts) Implement forward stepwise regression (ESL section 3.3.2) for the linear model and compare the LOO risk for each stage. Recall that at each step forward stepwise regression will select a new variable that most improves the empirical risk and include that in the model (starting with the intercept).