Goals of this notebook: take our best model file, look at how its parameters are distributed across the layers, and inspect the learned weights.
In [1]:
import pylearn2.utils
import pylearn2.config
import theano
import neukrill_net.dense_dataset
import neukrill_net.utils
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import holoviews as hl
%load_ext holoviews.ipython
import sklearn.metrics
At the time of writing, our best model is defined by the run settings file alexnet_based_40aug.json: essentially the AlexNet-based architecture with an extra convolutional layer and more augmentation. Full details are in the following YAML file:
In [2]:
cd ..
In [3]:
cat yaml_templates/alexnet_based_extra_convlayer.yaml
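As an aside, a YAML file like this can be instantiated directly with pylearn2's YAML parser. The following is a minimal sketch, assuming the template's placeholders (if any) have already been substituted by the run settings machinery:
from pylearn2.config import yaml_parse

# Build the Train object the YAML describes; train.model is then the
# pylearn2 MLP defined in the file.
with open("yaml_templates/alexnet_based_extra_convlayer.yaml") as f:
    train = yaml_parse.load(f.read())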
It has relatively few MLP layers, so it is worth looking at how the parameters in the model are distributed, comparing the MLP layers to the convolutional ones.
In [4]:
settings = neukrill_net.utils.Settings("settings.json")
run_settings = neukrill_net.utils.load_run_settings(
    "run_settings/quicker_learning_1_fc_layer_experiment_no_norms_repeat.json",
    settings, force=True)
In [5]:
model = pylearn2.utils.serial.load(run_settings["pickle abspath"])
In [6]:
params = model.get_params()
In [7]:
params[0].name
Out[7]:
In [8]:
total_params = sum(p.get_value().size for p in params)
print("Total parameters: {0}".format(total_params))
In [9]:
for l in params:
    print("Layer {0}: {1} parameters".format(l.name, l.get_value().size))
In [10]:
for l in params:
    print("Layer {0}: {1}% of the parameters.".format(
        l.name, 100*(float(l.get_value().size)/total_params)))
The reason we probably see little difference when adding extra MLP layers is that the weight matrix leading into the first MLP layer is far larger than the matrices between the MLP layers themselves, so adding more MLP layers barely increases the total number of parameters.
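To put rough numbers on this, here is a quick sketch using the 46208-dimensional flattened convolutional output and 1024-unit fully connected layers that appear further down in this notebook:
n_conv_out = 46208  # flattened output of the final convolutional layer
n_hidden = 1024     # units in a fully connected layer

# The matrix into the first MLP layer dwarfs any MLP-to-MLP matrix:
print("conv -> mlp: {0} weights".format(n_conv_out*n_hidden))  # ~47 million
print("mlp -> mlp: {0} weights".format(n_hidden*n_hidden))     # ~1 million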
In [11]:
%env PYLEARN2_VIEWER_COMMAND=/afs/inf.ed.ac.uk/user/s08/s0805516/repos/neukrill-net-work/image_hack.sh
In [12]:
%run ~/repos/pylearn2/pylearn2/scripts/show_weights.py /disk/scratch/neuroglycerin/models/quicker_learning_1_fc_layer_experiment_no_norms_repeat_recent.pkl
In [13]:
from IPython.display import Image
In [14]:
def plot_recent_pylearn2():
    # load the plot that show_weights.py wrote out (via image_hack.sh)
    pl2plt = Image(filename="/afs/inf.ed.ac.uk/user/s08/s0805516/tmp/pylearnplot.png", width=500)
    return pl2plt
plot_recent_pylearn2()
Out[14]:
Trying to repeat this, but doing it more cleverly with HoloViews.
In [16]:
weights = model.get_weights_topo()
In [31]:
#%%opts HeatMap style(cmap='Greys')
heatmaps = None
for w in weights:
    # flatten each kernel (first channel only) into a {(row, col): value}
    # dict that HeatMap understands
    wdict = {(i, j): w[i, j][0] for i in range(w.shape[0]) for j in range(w.shape[1])}
    if heatmaps is None:
        heatmaps = hl.HeatMap(wdict)
    else:
        heatmaps += hl.HeatMap(wdict)
heatmaps
Out[31]:
Pylearn2 initialises convolutional kernels uniformly in [-irange, irange] and constrains them with max_kernel_norm, so we can simulate the typical kernel norm a given irange produces and compare it against the max_kernel_norm settings. First, 5x5 single-channel kernels:
In [121]:
maxes = []
s = np.logspace(-5,-0.5,50)
for irange in s:
    # uniform initialisation in [-irange, irange]
    W = ((np.random.rand(1000,5,5,1)*2)-1)*irange
    maxes.append(np.mean(np.sqrt(np.sum(W**2,axis=(1,2,3)))))
In [124]:
plt.xlabel("irange")
plt.ylabel("max_kernel_norm")
plt.plot(s,maxes)
plt.grid()
In [128]:
maxes = []
s = np.logspace(-4,-1,50)
for irange in s:
    W = ((np.random.rand(10,3,3,48)*2)-1)*irange
    maxes.append(np.mean(np.sqrt(np.sum(W**2,axis=(1,2,3)))))
In [129]:
plt.xlabel("irange")
plt.ylabel("max_kernel_norm")
plt.plot(s,maxes)
plt.grid()
In [134]:
maxes = []
s = np.logspace(-4,-1.2,50)
for irange in s:
    W = ((np.random.rand(10,3,3,128)*2)-1)*irange
    maxes.append(np.mean(np.sqrt(np.sum(W**2,axis=(1,2,3)))))
In [135]:
plt.xlabel("irange")
plt.ylabel("max_kernel_norm")
plt.plot(s,maxes)
plt.grid()
In [3]:
10*46208*1024
Out[3]:
In [15]:
maxes = []
s = np.logspace(-5,-2,50)
for istdev in s:
    # Gaussian initialisation: the fully connected layer uses istdev, not irange
    W = np.random.randn(10,46208)*istdev
    maxes.append(np.mean(np.sqrt(np.sum(W**2,axis=1))))
In [16]:
plt.xlabel("istdev")
plt.ylabel("max_kernel_norm")
plt.plot(s,maxes)
plt.grid()
In [19]:
# MLP layer: search for the istdev that gives a mean input-weight norm of ~0.3
maxnorm = 0
istdev = 0.0005
while maxnorm < 0.3:
    W = np.random.randn(1024,46208)*istdev
    maxnorm = np.mean(np.sqrt(np.sum(W**2,axis=1)))
    # grow istdev by 10% each step, so the printed value overshoots slightly
    istdev = istdev*1.1
print("istdev should be approximately {0}".format(istdev))