Homework 6

Due: Tuesday, October 10 at 11:59 PM

Problem 1: Bank Account Revisited

We are going to rewrite the bank account closure problem from a few assignments ago, only this time we will develop formal classes for a bank user and a bank account to use in our closure (recall that previously we just had a nonlocal variable amount that we changed).

Some Preliminaries:

First we are going to define two types of bank accounts. Use the code below to do this:


In [1]:
from enum import Enum
class AccountType(Enum):
    SAVINGS = 1
    CHECKING = 2

An Enum stands for an enumeration; it's a convenient way to define a fixed set of named values. Typing:


In [2]:
AccountType.SAVINGS


Out[2]:
<AccountType.SAVINGS: 1>

returns a Python representation of an enumeration. You can compare these account types:


In [3]:
AccountType.SAVINGS == AccountType.SAVINGS


Out[3]:
True

In [4]:
AccountType.SAVINGS == AccountType.CHECKING


Out[4]:
False

To get a string representation of an Enum, you can use:


In [5]:
AccountType.SAVINGS.name


Out[5]:
'SAVINGS'

Part 1: Create a BankAccount class with the following specification:

Constructor is BankAccount(self, owner, accountType) where owner is a string representing the name of the account owner and accountType is one of the AccountType enums

Methods withdraw(self, amount) and deposit(self, amount) to modify the account balance of the account

Override the __str__ method to return an informative string describing the account owner and the type of account, and __len__ to return the balance of the account
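
A minimal sketch of one way to lay this class out (the validation logic, error handling, and the exact wording of the __str__ output are up to you; the attribute names here are only placeholders):

In [ ]:
class BankAccount:
    def __init__(self, owner, accountType):
        self.owner = owner              # string name of the account owner
        self.accountType = accountType  # an AccountType enum member
        self.balance = 0                # a new account starts empty

    def withdraw(self, amount):
        # reject negative amounts and overdrafts here before updating
        self.balance -= amount

    def deposit(self, amount):
        # reject negative amounts here before updating
        self.balance += amount

    def __str__(self):
        return "{}'s {} account".format(self.owner, self.accountType.name)

    def __len__(self):
        return self.balance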

Part 2: Write a class BankUser with the following specification:

Constructor BankUser(self, owner) where owner is a string representing the name of the user.

Method addAccount(self, accountType) - a user has no accounts when the BankUser object is created; addAccount adds a new account of the specified accountType to the user. A user may have only one savings account and one checking account; return an appropriate error if a second account of the same type is requested

Methods getBalance(self, accountType), deposit(self, accountType, amount), and withdraw(self, accountType, amount) for a specific AccountType.

Override __str__ to return an informative summary of the user's accounts.
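
One possible layout for BankUser (a sketch only; the duplicate-account check, the missing-account errors, and the __str__ wording are yours to design):

In [ ]:
class BankUser:
    def __init__(self, owner):
        self.owner = owner
        self.accounts = {}  # maps AccountType -> BankAccount; starts empty

    def addAccount(self, accountType):
        # reject a second account of the same type here
        self.accounts[accountType] = BankAccount(self.owner, accountType)

    def getBalance(self, accountType):
        # report an error if the user has no account of this type
        return len(self.accounts[accountType])

    def deposit(self, accountType, amount):
        self.accounts[accountType].deposit(amount)

    def withdraw(self, accountType, amount):
        self.accounts[accountType].withdraw(amount)

    def __str__(self):
        return "\n".join("{}: {}".format(acct, len(acct))
                         for acct in self.accounts.values())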

Write some simple tests to make sure this is working. Think of edge cases a user might try.
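
For example, a quick sanity check (assuming the sketches above; what your implementation prints or raises for the bad inputs is your choice) might look like:

In [ ]:
user = BankUser("Sarah")
user.addAccount(AccountType.SAVINGS)
user.deposit(AccountType.SAVINGS, 100)
user.withdraw(AccountType.SAVINGS, 30)
print(user.getBalance(AccountType.SAVINGS))  # expect 70
print(user)

# Edge cases worth testing:
user.addAccount(AccountType.SAVINGS)         # second savings account -> error
user.deposit(AccountType.CHECKING, 50)       # no checking account yet -> error
user.withdraw(AccountType.SAVINGS, 1000)     # overdraft -> error
user.deposit(AccountType.SAVINGS, -5)        # negative amount -> error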

Part 3: ATM Closure

Finally, we are going to rewrite the closure to use our bank account classes. We will make use of the input function, which reads input from the user, to decide what actions to take.

Write a closure called ATMSession(bankUser) which takes in a BankUser object and returns a function called Interface that, when called, provides the following interface:

The first screen for the user will look like:

Enter Option:

1)Exit

2)Create Account

3)Check Balance

4)Deposit

5)Withdraw

Pressing 1 will exit; any other option will show the account-type options:

Enter Option:

1)Checking

2)Savings

If Deposit or Withdraw was chosen, then there must be a third screen:

Enter Integer Amount, Cannot Be Negative:

This is to keep the code relatively simple. If you'd like, you can also tailor the options to the BankUser object (for example, if the user has no accounts, only show the Create Account option), but this is up to you. In any case, you must handle any input from the user in a reasonable way that an actual bank would be okay with, and give the user a proper response to the action specified.

Upon finishing a transaction or viewing a balance, the interface should return to the original screen.
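
A skeleton of the closure (a sketch only; how you validate the menu choices and amounts, and what you print back to the user, is up to you):

In [ ]:
def ATMSession(bankUser):
    def Interface():
        while True:
            option = input("Enter Option:\n1)Exit\n2)Create Account\n"
                           "3)Check Balance\n4)Deposit\n5)Withdraw\n")
            if option == "1":
                return
            choice = input("Enter Option:\n1)Checking\n2)Savings\n")
            # validate choice here; this sketch assumes it is "1" or "2"
            accountType = AccountType.CHECKING if choice == "1" else AccountType.SAVINGS
            if option == "2":
                bankUser.addAccount(accountType)
            elif option == "3":
                print(bankUser.getBalance(accountType))
            elif option in ("4", "5"):
                amount = int(input("Enter Integer Amount, Cannot Be Negative:\n"))
                # reject negative or non-integer input before touching the account
                if option == "4":
                    bankUser.deposit(accountType, amount)
                else:
                    bankUser.withdraw(accountType, amount)
            # the loop then falls through and shows the first screen again
    return Interface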

Part 4: Put everything in a module Bank.py

We will be grading this problem with a test suite. Put the enum, classes, and closure in a single file named Bank.py. It is very important that the class and method specifications we provided are used (with the same capitalization); otherwise you will receive no credit.
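
Once everything lives in Bank.py, a quick import check (a hypothetical session, just to confirm the names resolve the way the test suite will expect) could be:

In [ ]:
from Bank import AccountType, BankAccount, BankUser, ATMSession

user = BankUser("Alice")
user.addAccount(AccountType.CHECKING)
interface = ATMSession(user)
# interface()  # uncomment to walk through the menus interactively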


Problem 2: Linear Regression Class

Let's say you want to create Python classes for three related types of linear regression: Ordinary Least Squares Linear Regression, Ridge Regression, and Lasso Regression.

Consider the multivariate linear model:

$$y = X\beta + \epsilon$$

where $y$ is a length-$m$ vector of responses, $X$ is an $m \times p$ matrix of predictors, and $\beta$ is a length-$p$ vector of coefficients.

Ordinary Least Squares Linear Regression

OLS Regression seeks to minimize the following cost function:

$$\|y - X\beta\|^{2}$$

The best fit coefficients can be obtained by:

$$\hat{\beta} = (X^T X)^{-1}X^Ty$$

where $X^T$ is the transpose of the matrix $X$ and $(X^TX)^{-1}$ is the inverse of the matrix $X^TX$.
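
As a quick numerical check of this formula (a sketch using NumPy; the column of ones represents an intercept term, which your class will also need to account for):

In [ ]:
import numpy as np

# small synthetic example: y depends linearly on one predictor plus noise
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(0, 10, size=50)])  # intercept column + predictor
beta_true = np.array([2.0, 0.5])
y = X @ beta_true + rng.normal(0, 0.1, size=50)

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y  # (X^T X)^{-1} X^T y
print(beta_hat)  # should be close to [2.0, 0.5]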

Ridge Regression

Ridge Regression introduces an L2 regularization term to the cost function:

$$\|y - X\beta\|^{2}+\|\Gamma \beta \|^{2}$$

Where $\Gamma = \alpha I$ for some constant $\alpha$ and the identity matrix $I$.

The best fit coefficients can be obtained by: $$\hat{\beta} = (X^T X+\Gamma^T\Gamma)^{-1}X^Ty$$
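
Continuing the NumPy sketch from the OLS section (same X and y), the ridge estimate only adds $\Gamma^T\Gamma$ inside the inverse:

In [ ]:
alpha = 0.1
Gamma = alpha * np.eye(X.shape[1])  # Gamma = alpha * I
beta_ridge = np.linalg.inv(X.T @ X + Gamma.T @ Gamma) @ X.T @ y
print(beta_ridge)  # slightly shrunk relative to the OLS estimate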

Lasso Regression

Lasso Regression introduces an L1 regularization term and restricts the total number of predictor variables in the model. The following cost function: $${\displaystyle \min _{\beta _{0},\beta }\left\{{\frac {1}{m}}\left\|y-\beta _{0}-X\beta \right\|_{2}^{2}\right\} \quad \textrm{subject to} \quad \|\beta \|_{1}\leq \alpha.}$$

does not have a nice closed-form solution. For the sake of this exercise, you may use the sklearn.linear_model.Lasso class, which uses a coordinate descent algorithm to find the best fit. You should only use this class inside the fit() method of your implementation (i.e., do not use sklearn for the other methods in your class).
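
Inside fit(), the sklearn call looks roughly like this (a sketch reusing the synthetic X and y from above; X[:, 1:] drops the column of ones because Lasso fits its own intercept):

In [ ]:
from sklearn.linear_model import Lasso

lasso = Lasso(alpha=0.1)
lasso.fit(X[:, 1:], y)   # predictors only; sklearn handles the intercept
print(lasso.coef_)       # fitted slope coefficient(s)
print(lasso.intercept_)  # fitted intercept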

$R^2$ score

The $R^2$ score is defined as: $$R^{2} = 1-\frac{SS_E}{SS_T}$$

Where:

$$SS_T=\sum_i (y_i-\bar{y})^2, \quad SS_R=\sum_i (\hat{y}_i-\bar{y})^2, \quad SS_E=\sum_i (y_i - \hat{y}_i)^2$$

where $y_i$ are the original data values, $\hat{y}_i$ are the predicted values, and $\bar{y}$ is the mean of the original data values.
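
In code, the score reduces to a couple of lines (a sketch, assuming NumPy is imported as np, y holds the observed values, and y_hat the predictions):

In [ ]:
def r2_score(y, y_hat):
    ss_e = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_t = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
    return 1 - ss_e / ss_t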

Part 1: Base Class

Write a class called Regression with the following methods:

fit(X, y): Fits the linear model to $X$ and $y$.

get_params(): Returns $\hat{\beta}$ for the fitted model. The parameters should be stored in a dictionary.

predict(X): Predicts new values with the fitted model given $X$.

score(X, y): Returns the $R^2$ value of the fitted model.

set_params(): Manually sets the parameters of the linear model.

This parent class should throw a NotImplementedError for methods that are intended to be implemented by subclasses.
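
A minimal sketch of the parent class (the attribute name and the dictionary keys are only suggestions; which methods stay abstract versus shared is also partly a design choice, e.g. predict and score could live here instead):

In [ ]:
class Regression:
    def __init__(self):
        self.params = {}  # e.g. {'coefficients': ..., 'intercept': ...}

    def fit(self, X, y):
        raise NotImplementedError("fit is implemented by each subclass")

    def get_params(self):
        return self.params

    def set_params(self, **kwargs):
        self.params.update(kwargs)

    def predict(self, X):
        raise NotImplementedError("predict is implemented by each subclass")

    def score(self, X, y):
        raise NotImplementedError("score is implemented by each subclass")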

Part 2: OLS Linear Regression

Write a class called OLSRegression that implements the OLS Regression model described above and inherits the Regression class.

Part 3: Ridge Regression

Write a class called RidgeRegression that implements Ridge Regression and inherits the OLSRegression class.

Part 4: Lasso Regression

Write a class called LassoRegression that implements Lasso Regression and inherits the OLSRegression class. You should only use Lasso(), Lasso.fit(), Lasso.coef_, and Lasso.intercept_ from the sklearn.linear_model.Lasso class.

Part 5: Model Scoring

You will use the Boston dataset for this part.

Instantiate each of the three models above. Using a for loop, fit (on the training data) and score (on the testing data) each model on the Boston dataset.

Print out the $R^2$ value for each model and the parameters for the best model using the get_params() method. Use an $\alpha$ value of 0.1.

Hint: You can consider using the sklearn.model_selection.train_test_split function to create the training and test datasets.
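
The split itself is one call (a sketch; X_boston and y_boston stand in for the Boston predictors and target, however you choose to load them):

In [ ]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_boston, y_boston, test_size=0.2, random_state=42)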

Part 6: Visualize Model Performance

We can evaluate how the models perform for various values of $\alpha$. Calculate the $R^2$ scores for each model for $\alpha \in [0.05, 1]$ and plot the three lines on the same graph. To change the parameters, use the set_params() method. Be sure to label each line and add axis labels.
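
One possible shape for this sweep (a sketch; the model instances ols, ridge, lasso, the set_params(alpha=...) keyword, and the train/test arrays all assume the design choices sketched earlier, so adapt it to your own classes). Note that the OLS curve will be flat, since it ignores $\alpha$:

In [ ]:
import numpy as np
import matplotlib.pyplot as plt

alphas = np.linspace(0.05, 1, 20)
for model, label in [(ols, "OLS"), (ridge, "Ridge"), (lasso, "Lasso")]:
    scores = []
    for a in alphas:
        model.set_params(alpha=a)  # assumes set_params accepts keyword arguments
        model.fit(X_train, y_train)
        scores.append(model.score(X_test, y_test))
    plt.plot(alphas, scores, label=label)
plt.xlabel(r"$\alpha$")
plt.ylabel(r"$R^2$ score")
plt.legend()
plt.show()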

