Question_1-1-3_Multiclass_Ridge

Janet Matsen

Code notes:

  • Individual regressions are done by instances of RidgeRegression, defined in ridge_regression.py.
    • RidgeRegression gets some methods from ClassificationBase, defined in classification_base.py.
  • The class HyperparameterExplorer, defined in hyperparameter_explorer.py, is used to tune hyperparameters on the training data.
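For reference, one-vs-all multiclass ridge regression with a closed-form solve can be sketched as below. This is only an assumption about what RidgeMulti does internally (the function names `ridge_multiclass_weights` and `predict` are hypothetical), not its actual implementation:

```python
import numpy as np

def ridge_multiclass_weights(X, y, lam, n_classes=10):
    """One-vs-all ridge: regress one-hot labels on X.

    Solves (X^T X + lam * I) W = X^T Y for the d x n_classes matrix W.
    """
    n, d = X.shape
    Y = np.zeros((n, n_classes))
    Y[np.arange(n), y] = 1.0           # one-hot encode the digit labels
    A = X.T @ X + lam * np.eye(d)      # regularized Gram matrix
    return np.linalg.solve(A, X.T @ Y)

def predict(X, W):
    """Predicted class = argmax over the per-class regression scores."""
    return np.argmax(X @ W, axis=1)
```

Larger lambda shrinks the weights toward zero, which is why the very large lambda values swept below drive most weights to (near) zero.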

In [1]:
import numpy as np
import matplotlib as mpl
%matplotlib inline
import time

import pandas as pd
import seaborn as sns

from mnist import MNIST  # public package for making arrays out of MNIST data.

In [2]:
import sys
sys.path.append('../code/')

In [3]:
from ridge_regression import RidgeMulti
from hyperparameter_explorer import HyperparameterExplorer

In [4]:
from mnist_helpers import mnist_training, mnist_testing

In [5]:
import matplotlib.pyplot as plt
from pylab import rcParams
rcParams['figure.figsize'] = 4, 3

Prepare MNIST training data


In [6]:
train_X, train_y = mnist_training()
test_X, test_y = mnist_testing()


[    0     1     2 ..., 59997 59998 59999]
[   0    1    2 ..., 9997 9998 9999]
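mnist_helpers presumably wraps the python-mnist loader imported above; a hedged sketch of the conversion step (the helper name `to_arrays` is assumed, not from mnist_helpers):

```python
import numpy as np

def to_arrays(images, labels):
    """Convert python-mnist's list-of-lists output to numpy arrays,
    scaling pixel intensities from [0, 255] to [0, 1]."""
    X = np.asarray(images, dtype=float) / 255.0
    y = np.asarray(labels, dtype=int)
    return X, y

# usage (assuming python-mnist is installed and raw files are in ./data):
# from mnist import MNIST
# images, labels = MNIST('./data').load_training()
# train_X, train_y = to_arrays(images, labels)
```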

Explore hyperparameters before training the model on all of the training data.


In [7]:
hyper_explorer = HyperparameterExplorer(X=train_X, y=train_y, 
                                        model=RidgeMulti, 
                                        validation_split=0.1, score_name = 'training RMSE', 
                                        use_prev_best_weights=False,
                                        test_X=test_X, test_y=test_y)


6000 of 60000 points from training are reserved for validation
variances of all training data: 8.347744528888887
variances of split-off training & validation data: 8.346716212620029, 8.354324333333334
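The scores tracked below (SSE, RMSE, 0/1 loss) are consistent with definitions along these lines — a minimal sketch, assuming one-hot targets, SSE summed over all entries, and RMSE = sqrt(SSE / N) per point:

```python
import numpy as np

def sse(Y_hat, Y):
    """Sum of squared errors over all entries of the one-hot targets."""
    return float(np.sum((Y_hat - Y) ** 2))

def rmse(Y_hat, Y):
    """Root-mean-square error per point: sqrt(SSE / N)."""
    return float(np.sqrt(sse(Y_hat, Y) / Y.shape[0]))

def zero_one_loss(y_pred, y_true):
    """Number of misclassified points."""
    return int(np.sum(y_pred != y_true))
```

As a sanity check, these definitions match the reported numbers: sqrt(21144.75 / 54000) ≈ 0.6258, the training RMSE shown for lambda = 1e6.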

In [ ]:
hyper_explorer.train_model(lam=1e10, verbose=False)

In [ ]:
hyper_explorer.train_model(lam=1e+08, verbose=False)
hyper_explorer.train_model(lam=1e+07, verbose=False)

In [8]:
hyper_explorer.train_model(lam=1e+06, verbose=False)


training RMSE:0.6257551354903268

In [ ]:
hyper_explorer.train_model(lam=1e5, verbose=False)
hyper_explorer.train_model(lam=1e4, verbose=False)
hyper_explorer.train_model(lam=1e03, verbose=False)
hyper_explorer.train_model(lam=1e2, verbose=False)

In [9]:
hyper_explorer.train_model(lam=1e1, verbose=False)


training RMSE:0.6248560063778434

In [ ]:
hyper_explorer.train_model(lam=1e0, verbose=False)
hyper_explorer.train_model(lam=1e-1, verbose=False)
hyper_explorer.train_model(lam=1e-2, verbose=False)
hyper_explorer.train_model(lam=1e-3, verbose=False)
hyper_explorer.train_model(lam=1e-4, verbose=False)
hyper_explorer.train_model(lam=1e-5, verbose=False)

In [10]:
hyper_explorer.summary


Out[10]:
   # nonzero weights     lambda  model number  training (0/1 loss)/N  training 0/1 loss  training RMSE  training SSE
0                  3  1000000.0             1               0.149093               8051       0.625755  21144.752438
1                445       10.0             2               0.148648               8027       0.624856  21084.031550

                                             weights  validation (0/1 loss)/N  validation 0/1 loss  validation RMSE  validation SSE
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...                 0.151333                  908         0.632408     2399.638236
1  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...                 0.153000                  918         0.635386     2422.293121

In [11]:
hyper_explorer.plot_fits()


../code/hyperparameter_explorer.py:185: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
  plot_data = df.sort(x)
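The FutureWarning above comes from the long-deprecated DataFrame.sort method; the drop-in pandas replacement in hyperparameter_explorer.py would be sort_values (illustrated here with a small stand-in frame):

```python
import pandas as pd

df = pd.DataFrame({'lambda': [1e6, 1e1],
                   'validation RMSE': [0.632408, 0.635386]})

# DataFrame.sort(columns=...) was deprecated and later removed;
# sort_values(by=...) is the supported equivalent.
plot_data = df.sort_values(by='lambda')
```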

In [ ]:
t = time.localtime(time.time())

hyper_explorer.plot_fits(filename = "Q-1-1-3_val_and_train_RMSE_{}-{}".format(t.tm_mon, t.tm_mday))

In [ ]:
hyper_explorer.plot_fits(ylim=(.6,.7),
                         filename = "Q-1-1-3_val_and_train_RMSE_zoomed_in{}-{}".format(t.tm_mon, t.tm_mday))

In [ ]:
hyper_explorer.best('score')

In [ ]:
hyper_explorer.best('summary')

In [ ]:
hyper_explorer.best('best score')

In [13]:
hyper_explorer.train_on_whole_training_set()


getting best model.
{'training SSE': [21144.752437995936], 'lambda': [1000000.0], 'weights': [array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])], 'training RMSE': [0.6257551354903268], 'training 0/1 loss': [8051], 'training (0/1 loss)/N': [0.14909259259259258], '# nonzero weights': [3]}

In [14]:
hyper_explorer.final_model.results_row()


Out[14]:
{'# nonzero weights': [3],
 'lambda': [1000000.0],
 'training (0/1 loss)/N': [0.14931666666666665],
 'training 0/1 loss': [8959],
 'training RMSE': [0.62642358770304951],
 'training SSE': [23544.39067384561],
 'weights': [array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
         [ 0.,  0.,  0., ...,  0.,  0.,  0.],
         [ 0.,  0.,  0., ...,  0.,  0.,  0.],
         ..., 
         [ 0.,  0.,  0., ...,  0.,  0.,  0.],
         [ 0.,  0.,  0., ...,  0.,  0.,  0.],
         [ 0.,  0.,  0., ...,  0.,  0.,  0.]])]}

In [15]:
hyper_explorer.evaluate_test_data()


                                                                   0
# nonzero weights                                                  3
lambda                                                         1e+06
test (0/1 loss)/N                                             0.1459
test 0/1 loss                                                   1459
test RMSE                                                   0.627847
test SSE                                                     3941.92
weights            [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...

In [ ]:


In [ ]: