Checking the ratio of update norms to parameter norms to evaluate the LR scale
In [19]:
import pylearn2.utils
import pylearn2.config
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import os.path
Model trained on a 0.1 split of the data
In [63]:
model = pylearn2.utils.serial.load(os.path.expandvars('${DATA_DIR}/plankton/models/learning_rate_experiment/ilr_5e-2_lin_decay_adj_on_recent.pkl'))
In [64]:
print(model)
Plot train and valid set NLL
In [65]:
plt.plot(model.monitor.channels['valid_y_y_1_nll'].val_record)
plt.plot(model.monitor.channels['train_y_y_1_nll'].val_record)
plt.legend(['Valid', 'Train'])
plt.ylabel('NLL')
plt.xlabel('Epochs')
Out[65]:
Strangely, though, overfitting to the training set does not seem to be increasing the validation set NLL.
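One way to check this impression numerically is to compare the epoch at which the validation NLL was lowest with the final epoch. The sketch below is a hypothetical helper (`summarise_nll` is not part of pylearn2) that works on plain lists of floats, such as `[float(v) for v in channel.val_record]`. The numbers in the usage example are made up for illustration.

```python
def summarise_nll(train_nll, valid_nll):
    """Summarise a pair of per-epoch NLL curves (hypothetical helper).

    Both arguments are equal-length sequences of floats, one per epoch.
    """
    if len(train_nll) != len(valid_nll) or not train_nll:
        raise ValueError('expected two non-empty curves of equal length')
    # Epoch (0-based) with the lowest validation NLL.
    best_epoch = min(range(len(valid_nll)), key=lambda i: valid_nll[i])
    return {
        'best_valid_epoch': best_epoch,
        'best_valid_nll': valid_nll[best_epoch],
        # How far validation NLL has risen since its minimum (0 if still at it).
        'valid_rise_since_best': valid_nll[-1] - valid_nll[best_epoch],
        # Final gap between validation and training NLL.
        'final_gap': valid_nll[-1] - train_nll[-1],
    }


# Illustrative values only, not taken from the model above.
summary = summarise_nll([2.0, 1.5, 1.0, 0.5], [2.1, 1.8, 1.7, 1.7])
print(summary)
```

A `valid_rise_since_best` close to zero alongside a large `final_gap` would match what the plot suggests: the training NLL keeps falling while the validation NLL plateaus rather than climbs.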
Get channel names corresponding to update norms and parameter norms
In [57]:
mean_update_channels = [c for c in model.monitor.channels if 'mean_update' in c]
print('\n'.join(mean_update_channels))
In [58]:
param_norm_channels = [c for c in model.monitor.channels if 'norms_mean' in c]
print('\n'.join(param_norm_channels))
Plot ratio of parameter norms to update norms across epochs for different layers
In [66]:
h1_W_up_norms = np.array([float(v) for v in model.monitor.channels['mean_update_h1_W_kernel_norm_mean'].val_record])
h1_W_norms = np.array([float(v) for v in model.monitor.channels['valid_h1_kernel_norms_mean'].val_record])
plt.plot(h1_W_norms / h1_W_up_norms)
Out[66]:
In [67]:
h2_W_up_norms = np.array([float(v) for v in model.monitor.channels['mean_update_h2_W_kernel_norm_mean'].val_record])
h2_W_norms = np.array([float(v) for v in model.monitor.channels['valid_h2_kernel_norms_mean'].val_record])
plt.plot(h2_W_norms / h2_W_up_norms)
Out[67]:
In [68]:
h3_W_up_norms = np.array([float(v) for v in model.monitor.channels['mean_update_h3_W_kernel_norm_mean'].val_record])
h3_W_norms = np.array([float(v) for v in model.monitor.channels['valid_h3_kernel_norms_mean'].val_record])
plt.plot(h3_W_norms / h3_W_up_norms)
Out[68]:
In [69]:
h4_W_up_norms = np.array([float(v) for v in model.monitor.channels['mean_update_h4_W_col_norm_mean'].val_record])
h4_W_norms = np.array([float(v) for v in model.monitor.channels['valid_h4_col_norms_mean'].val_record])
plt.plot(h4_W_norms / h4_W_up_norms)
Out[69]:
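The four cells above repeat the same pattern, so the per-layer ratios could also be collected in one pass. The following is a minimal sketch, not the approach used in this notebook: `param_to_update_ratios` and `LAYER_CHANNELS` are hypothetical names, the channel names are copied from the cells above, and the stand-in channels hold made-up values so the example runs without pylearn2 or the pickled model.

```python
from types import SimpleNamespace

# Channel name pairs copied from the cells above: (update norm, parameter norm).
LAYER_CHANNELS = {
    'h1': ('mean_update_h1_W_kernel_norm_mean', 'valid_h1_kernel_norms_mean'),
    'h2': ('mean_update_h2_W_kernel_norm_mean', 'valid_h2_kernel_norms_mean'),
    'h3': ('mean_update_h3_W_kernel_norm_mean', 'valid_h3_kernel_norms_mean'),
    'h4': ('mean_update_h4_W_col_norm_mean', 'valid_h4_col_norms_mean'),
}


def param_to_update_ratios(channels, layer_channels=LAYER_CHANNELS):
    """Return {layer: [param_norm / update_norm per epoch]} (hypothetical helper).

    `channels` is any mapping from channel name to an object with a
    `val_record` sequence, e.g. `model.monitor.channels`.
    """
    ratios = {}
    for layer, (update_name, norm_name) in layer_channels.items():
        updates = [float(v) for v in channels[update_name].val_record]
        norms = [float(v) for v in channels[norm_name].val_record]
        ratios[layer] = [n / u for n, u in zip(norms, updates)]
    return ratios


# Stand-in channels with made-up values, so the sketch runs without pylearn2.
fake_channels = {}
for update_name, norm_name in LAYER_CHANNELS.values():
    fake_channels[update_name] = SimpleNamespace(val_record=[0.01, 0.02])
    fake_channels[norm_name] = SimpleNamespace(val_record=[1.0, 1.0])

ratios = param_to_update_ratios(fake_channels)
print(ratios['h1'])
```

With the real model, `param_to_update_ratios(model.monitor.channels)` should give one list per layer, and each could be passed to `plt.plot(values, label=layer)` to put all four curves on a single axis.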
Annoyingly, the parameter-to-update norm ratios also seem to be significantly less than 1000 already, suggesting the LR scale does not need increasing significantly (on this heuristic at least).
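To make the heuristic concrete: the figure of 1000 corresponds to the commonly quoted rule of thumb that the update norm should be around 1e-3 of the parameter norm. The sketch below assumes that target, and also assumes that the update size scales linearly with the learning rate (true for plain SGD, only approximately so with momentum or adaptive methods). `suggested_lr_scale` is a hypothetical helper, and the ratios in the usage lines are illustrative values rather than ones read from the plots.

```python
TARGET_PARAM_TO_UPDATE = 1000.0  # assumed rule-of-thumb target (update/param ~ 1e-3)


def suggested_lr_scale(param_to_update_ratio, target=TARGET_PARAM_TO_UPDATE):
    """Factor by which to multiply the LR to reach the target ratio.

    Assumes the update norm is proportional to the learning rate, so the
    parameter-to-update ratio is inversely proportional to it.
    """
    if param_to_update_ratio <= 0:
        raise ValueError('ratio must be positive')
    return param_to_update_ratio / target


# A ratio of 250 means updates are 4x larger than the target,
# so the heuristic suggests a smaller LR, not a larger one.
print(suggested_lr_scale(250.0))   # 0.25
print(suggested_lr_scale(4000.0))  # 4.0
```

On this reading, any layer whose ratio is already below 1000 gives a scale factor below 1, which is why the heuristic offers no argument for raising the learning rate here.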