From ~/Downloads/github/pylearn2/pylearn2/scripts/tutorials/stacked_autoencoders
In [1]:
import os
os.environ['PYLEARN2_DATA_PATH'] = '/Users/udi/Downloads/lisa/data'
by Mehdi Mirza
This notebook shows how to perform layer-wise pre-training using denoising autoencoders (DAEs), and then stack the layers to form a multilayer perceptron (MLP) that can be fine-tuned with supervised training. For more background, the Theano tutorial on denoising autoencoders covers training DAEs in more detail, and the Theano tutorial on stacked denoising autoencoders covers the stacked version.
The methods used here can be adapted to other models, such as contractive autoencoders (CAEs) or restricted Boltzmann machines (RBMs), with only small modifications.
In [2]:
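# Read the YAML template for the first layer.
with open('dae_l1.yaml', 'r') as f:
    layer1_yaml = f.read()
hyper_params_l1 = {'train_stop' : 50000,
                   'batch_size' : 100,
                   'monitoring_batches' : 5,
                   'nhid' : 500,
                   'max_epochs' : 10,
                   'save_path' : '.'}
# Fill the named placeholders in the template with the hyperparameters.
layer1_yaml = layer1_yaml % (hyper_params_l1)
print(layer1_yaml)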
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        # TODO: the one_hot: 1 is only necessary because one_hot: 0 is
        # broken, remove it after one_hot: 0 is fixed.
        one_hot: 1,
        start: 0,
        stop: 50000
    },
    model: !obj:pylearn2.models.autoencoder.DenoisingAutoencoder {
        nvis : 784,
        nhid : 500,
        irange : 0.05,
        corruptor: !obj:pylearn2.corruption.BinomialCorruptor {
            corruption_level: .2,
        },
        act_enc: "tanh",
        act_dec: null,    # Linear activation on the decoder side.
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate : 1e-3,
        batch_size : 100,
        monitoring_batches : 5,
        monitoring_dataset : *train,
        cost : !obj:pylearn2.costs.autoencoder.MeanSquaredReconstructionError {},
        termination_criterion : !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
    save_path: "./dae_l1.pkl",
    save_freq: 1
}
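The `%` operator used above fills named placeholders in the YAML template from the hyperparameter dictionary. A minimal sketch of that mechanism follows; the template fragment is a hypothetical stand-in, not the actual contents of dae_l1.yaml.

```python
# Hypothetical template fragment (assumption): the real dae_l1.yaml is not
# reproduced here, only the placeholder style it relies on.
template = """model: {
    nhid : %(nhid)i,
    batch_size : %(batch_size)i,
    save_path: "%(save_path)s/dae_l1.pkl"
}"""

hyper_params = {'nhid': 500,
                'batch_size': 100,
                'save_path': '.',
                'max_epochs': 10}  # keys without a placeholder are ignored

# Each %(name)X placeholder is looked up by key in the dictionary.
filled = template % hyper_params
```

A key that appears in the template but not in the dictionary raises a `KeyError`, which is a quick way to catch a misspelled hyperparameter name.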
Now we can train the model using the YAML string in the same way as the previous tutorials:
In [3]:
from pylearn2.config import yaml_parse
train = yaml_parse.load(layer1_yaml)
train.main_loop()
/Users/udi/Downloads/github/pylearn2/pylearn2/utils/call_check.py:99: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST's iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
/Users/udi/anaconda/lib/python2.7/site-packages/theano/sandbox/rng_mrg.py:1188: UserWarning: MRG_RandomStreams Can't determine #streams from size (Shape.0), guessing 60*256
nstreams = self.n_streams(size)
Parameter and initial learning rate summary:
vb: 0.001
hb: 0.001
W: 0.001
Wprime: 0.001
Compiling sgd_update...
/Users/udi/Downloads/github/pylearn2/pylearn2/models/model.py:72: UserWarning: The <class 'pylearn2.models.autoencoder.DenoisingAutoencoder'> Model subclass seems not to call the Model constructor. This behavior may be considered an error on or after 2014-11-01.
warnings.warn("The " + str(type(self)) + " Model subclass "
Compiling sgd_update done. Time elapsed: 10.284412 seconds
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.543545 seconds
Monitored channels:
learning_rate
objective
total_seconds_last_epoch
training_seconds_this_epoch
Compiling accum...
graph size: 19
Compiling accum done. Time elapsed: 3.985918 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.001
objective: 89.189888493
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
Time this epoch: 7.978305 seconds
Monitoring step:
Epochs seen: 1
Batches seen: 500
Examples seen: 50000
learning_rate: 0.001
objective: 30.2296487294
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 7.978305
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.788898 seconds
Time this epoch: 6.294500 seconds
Monitoring step:
Epochs seen: 2
Batches seen: 1000
Examples seen: 100000
learning_rate: 0.001
objective: 22.9704822504
total_seconds_last_epoch: 12.095553
training_seconds_this_epoch: 6.2945
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.606921 seconds
Time this epoch: 5.781151 seconds
Monitoring step:
Epochs seen: 3
Batches seen: 1500
Examples seen: 150000
learning_rate: 0.001
objective: 19.3272344275
total_seconds_last_epoch: 10.68441
training_seconds_this_epoch: 5.781151
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.587990 seconds
Time this epoch: 5.826722 seconds
Monitoring step:
Epochs seen: 4
Batches seen: 2000
Examples seen: 200000
learning_rate: 0.001
objective: 17.0729861682
total_seconds_last_epoch: 9.526738
training_seconds_this_epoch: 5.826722
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.631785 seconds
Time this epoch: 5.467358 seconds
Monitoring step:
Epochs seen: 5
Batches seen: 2500
Examples seen: 250000
learning_rate: 0.001
objective: 15.5457219544
total_seconds_last_epoch: 9.541899
training_seconds_this_epoch: 5.467358
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.573065 seconds
Time this epoch: 5.347703 seconds
Monitoring step:
Epochs seen: 6
Batches seen: 3000
Examples seen: 300000
learning_rate: 0.001
objective: 14.4348570196
total_seconds_last_epoch: 9.192685
training_seconds_this_epoch: 5.347703
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.589890 seconds
Time this epoch: 5.440438 seconds
Monitoring step:
Epochs seen: 7
Batches seen: 3500
Examples seen: 350000
learning_rate: 0.001
objective: 13.5968633264
total_seconds_last_epoch: 9.076605
training_seconds_this_epoch: 5.440438
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.564320 seconds
Time this epoch: 5.469406 seconds
Monitoring step:
Epochs seen: 8
Batches seen: 4000
Examples seen: 400000
learning_rate: 0.001
objective: 12.9241679727
total_seconds_last_epoch: 9.071418
training_seconds_this_epoch: 5.469406
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.579426 seconds
Time this epoch: 5.066210 seconds
Monitoring step:
Epochs seen: 9
Batches seen: 4500
Examples seen: 450000
learning_rate: 0.001
objective: 12.3858239701
total_seconds_last_epoch: 9.461765
training_seconds_this_epoch: 5.06621
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.560447 seconds
Time this epoch: 5.325952 seconds
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.001
objective: 11.9513600238
total_seconds_last_epoch: 8.866708
training_seconds_this_epoch: 5.325952
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.557787 seconds
Saving to ./dae_l1.pkl...
Saving to ./dae_l1.pkl done. Time elapsed: 0.558132 seconds
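The `objective` channel in the log above is the reconstruction cost of the denoising autoencoder. The following is a minimal pure-Python sketch of that computation for a single example, using the parameter names from the log (`W`, `Wprime`, `hb`, `vb`). The toy sizes, the helper names, and the exact way the cost is aggregated (summed over input dimensions here) are assumptions for illustration, not the library's implementation.

```python
import math
import random

def corrupt(x, corruption_level, rng):
    """Masking noise: zero each input with probability corruption_level."""
    return [xi if rng.random() >= corruption_level else 0.0 for xi in x]

def encode(x, W, hb):
    """Hidden code tanh(x W + hb); W has shape (nvis, nhid)."""
    return [math.tanh(sum(x[i] * W[i][j] for i in range(len(x))) + hb[j])
            for j in range(len(hb))]

def decode(h, Wprime, vb):
    """Linear reconstruction h Wprime + vb; Wprime has shape (nhid, nvis)."""
    return [sum(h[j] * Wprime[j][i] for j in range(len(h))) + vb[i]
            for i in range(len(vb))]

def reconstruction_error(x, x_hat):
    """Squared error between the clean input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

rng = random.Random(0)
nvis, nhid = 6, 4  # toy sizes; the tutorial uses 784 and 500
W = [[rng.uniform(-0.05, 0.05) for _ in range(nhid)] for _ in range(nvis)]
Wprime = [[rng.uniform(-0.05, 0.05) for _ in range(nvis)] for _ in range(nhid)]
hb = [0.0] * nhid
vb = [0.0] * nvis

x = [rng.random() for _ in range(nvis)]   # clean input
x_tilde = corrupt(x, 0.2, rng)            # corrupted copy fed to the encoder
x_hat = decode(encode(x_tilde, W, hb), Wprime, vb)
cost = reconstruction_error(x, x_hat)     # compared against the clean input
```

The key point is that the encoder sees the corrupted input while the cost is measured against the clean one, which is what makes the autoencoder "denoising".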
The second layer takes the output of the first layer as its input. Hence we must first apply the first layer's transformations to the raw data using datasets.transformer_dataset.TransformerDataset. This class takes two arguments:
raw: the raw data
transformer: a Pylearn2 block that transforms the raw data, which in our case is the dae_l1.pkl file from the previous step
To train the second layer, we load the YAML file as before and set the hyperparameters before starting the training loop.
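As a rough illustration of what wrapping a dataset with a transformer means, the sketch below passes every batch of raw data through a first-layer encoder before handing it to the next layer. `ToyEncoder`, `ToyTransformerDataset`, and their methods are hypothetical names invented for this example; they do not mirror the Pylearn2 class interfaces.

```python
import math

class ToyEncoder(object):
    """Hypothetical stand-in for a trained first layer (tanh encoder)."""
    def __init__(self, W, hb):
        self.W = W      # shape (nvis, nhid)
        self.hb = hb    # length nhid

    def apply(self, batch):
        """Map each example in the batch to its hidden representation."""
        return [[math.tanh(sum(x[i] * self.W[i][j] for i in range(len(x)))
                           + self.hb[j])
                 for j in range(len(self.hb))]
                for x in batch]

class ToyTransformerDataset(object):
    """Hypothetical wrapper: serves raw data after applying a transformer."""
    def __init__(self, raw, transformer):
        self.raw = raw
        self.transformer = transformer

    def get_batch(self, start, stop):
        # The transformation is applied on the fly, batch by batch.
        return self.transformer.apply(self.raw[start:stop])

# Toy data: two examples with two inputs each, identity weights.
encoder = ToyEncoder(W=[[1.0, 0.0], [0.0, 1.0]], hb=[0.0, 0.0])
dataset = ToyTransformerDataset(raw=[[0.0, 0.0], [1.0, 0.0]],
                                transformer=encoder)
codes = dataset.get_batch(0, 2)  # what the second layer would train on
```

With this arrangement the second layer never sees raw pixels, only the first layer's codes, so its input size must equal the first layer's number of hidden units.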
In [4]:
# Read the YAML template for the second layer.
with open('dae_l2.yaml', 'r') as f:
    layer2_yaml = f.read()
hyper_params_l2 = {'train_stop' : 50000,
                   'batch_size' : 100,
                   'monitoring_batches' : 5,
                   'nvis' : hyper_params_l1['nhid'],  # input size = first layer's hidden size
                   'nhid' : 500,
                   'max_epochs' : 10,
                   'save_path' : '.'}
layer2_yaml = layer2_yaml % (hyper_params_l2)
print(layer2_yaml)
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.transformer_dataset.TransformerDataset {
        raw: !obj:pylearn2.datasets.mnist.MNIST {
            which_set: 'train',
            # TODO: the one_hot: 1 is only necessary because one_hot: 0 is
            # broken, remove it after one_hot: 0 is fixed.
            one_hot: 1,
            start: 0,
            stop: 50000
        },
        transformer: !pkl: "./dae_l1.pkl"
    },
    model: !obj:pylearn2.models.autoencoder.DenoisingAutoencoder {
        nvis : 500,
        nhid : 500,
        irange : 0.05,
        corruptor: !obj:pylearn2.corruption.BinomialCorruptor {
            corruption_level: .3,
        },
        act_enc: "tanh",
        act_dec: null,    # Linear activation on the decoder side.
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate : 1e-3,
        batch_size : 100,
        monitoring_batches : 5,
        monitoring_dataset : *train,
        cost : !obj:pylearn2.costs.autoencoder.MeanSquaredReconstructionError {},
        termination_criterion : !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
    save_path: "./dae_l2.pkl",
    save_freq: 1
}
In [5]:
train = yaml_parse.load(layer2_yaml)
train.main_loop()
Parameter and initial learning rate summary:
vb: 0.001
hb: 0.001
W: 0.001
Wprime: 0.001
Compiling sgd_update...
Compiling sgd_update done. Time elapsed: 2.543316 seconds
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.055552 seconds
Monitored channels:
learning_rate
objective
total_seconds_last_epoch
training_seconds_this_epoch
Compiling accum...
graph size: 19
Compiling accum done. Time elapsed: 0.212405 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.001
objective: 52.3473862576
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
Time this epoch: 5.916752 seconds
Monitoring step:
Epochs seen: 1
Batches seen: 500
Examples seen: 50000
learning_rate: 0.001
objective: 20.403730567
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 5.916752
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.445217 seconds
Time this epoch: 8.161488 seconds
Monitoring step:
Epochs seen: 2
Batches seen: 1000
Examples seen: 100000
learning_rate: 0.001
objective: 13.3085194431
total_seconds_last_epoch: 10.261751
training_seconds_this_epoch: 8.161488
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.446662 seconds
Time this epoch: 7.390480 seconds
Monitoring step:
Epochs seen: 3
Batches seen: 1500
Examples seen: 150000
learning_rate: 0.001
objective: 9.98722106485
total_seconds_last_epoch: 13.724457
training_seconds_this_epoch: 7.39048
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.402655 seconds
Time this epoch: 5.774639 seconds
Monitoring step:
Epochs seen: 4
Batches seen: 2000
Examples seen: 200000
learning_rate: 0.001
objective: 8.00958744431
total_seconds_last_epoch: 11.817375
training_seconds_this_epoch: 5.774639
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.457210 seconds
Time this epoch: 6.381065 seconds
Monitoring step:
Epochs seen: 5
Batches seen: 2500
Examples seen: 250000
learning_rate: 0.001
objective: 6.75177105446
total_seconds_last_epoch: 9.740588
training_seconds_this_epoch: 6.381065
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.441539 seconds
Time this epoch: 6.016833 seconds
Monitoring step:
Epochs seen: 6
Batches seen: 3000
Examples seen: 300000
learning_rate: 0.001
objective: 5.90950617494
total_seconds_last_epoch: 10.551165
training_seconds_this_epoch: 6.016833
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.462518 seconds
Time this epoch: 5.173134 seconds
Monitoring step:
Epochs seen: 7
Batches seen: 3500
Examples seen: 350000
learning_rate: 0.001
objective: 5.3139621876
total_seconds_last_epoch: 9.939718
training_seconds_this_epoch: 5.173134
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.402607 seconds
Time this epoch: 5.052310 seconds
Monitoring step:
Epochs seen: 8
Batches seen: 4000
Examples seen: 400000
learning_rate: 0.001
objective: 4.88844513578
total_seconds_last_epoch: 8.790959
training_seconds_this_epoch: 5.05231
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.404440 seconds
Time this epoch: 4.980679 seconds
Monitoring step:
Epochs seen: 9
Batches seen: 4500
Examples seen: 450000
learning_rate: 0.001
objective: 4.57154658534
total_seconds_last_epoch: 9.117831
training_seconds_this_epoch: 4.980679
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.402997 seconds
Time this epoch: 5.802203 seconds
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.001
objective: 4.33914081349
total_seconds_last_epoch: 8.836268
training_seconds_this_epoch: 5.802203
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.398750 seconds
Saving to ./dae_l2.pkl...
Saving to ./dae_l2.pkl done. Time elapsed: 0.391285 seconds
Now that we have two pre-trained layers, we can stack them to form an MLP which can be trained in a supervised fashion. We use the MLP class as usual for this, except that we now use models.mlp.PretrainedLayer for the different layers so that we can pass our pre-trained layers (as pickle files) using the layer_content argument.
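Conceptually, the stacked model feeds the input through each pre-trained encoder in turn and puts a softmax classifier on top. The sketch below shows that forward pass in pure Python; the function names, toy sizes, and weights are assumptions made for illustration and are not the Pylearn2 implementation.

```python
import math

def tanh_layer(x, W, b):
    """One pre-trained encoder: tanh(x W + b), W of shape (n_in, n_out)."""
    return [math.tanh(sum(x[i] * W[i][j] for i in range(len(x))) + b[j])
            for j in range(len(b))]

def softmax_layer(h, W, b):
    """Class probabilities from the top hidden representation."""
    scores = [sum(h[i] * W[i][k] for i in range(len(h))) + b[k]
              for k in range(len(b))]
    m = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def stacked_forward(x, pretrained_layers, softmax_params):
    """Apply each (W, b) encoder in order, then the softmax output layer."""
    h = x
    for W, b in pretrained_layers:
        h = tanh_layer(h, W, b)
    W_out, b_out = softmax_params
    return softmax_layer(h, W_out, b_out)

# Toy example: 2 inputs -> 2 hidden -> 2 hidden -> 3 classes.
identity = [[1.0, 0.0], [0.0, 1.0]]
layers = [(identity, [0.0, 0.0]), (identity, [0.0, 0.0])]
softmax_params = ([[0.0] * 3, [0.0] * 3], [0.0] * 3)  # untrained output layer
probs = stacked_forward([0.5, -0.5], layers, softmax_params)
```

With an untrained (all-zero) output layer the prediction is uniform over the classes. For 10 classes that gives a negative log-likelihood of ln 10 ≈ 2.3026, which is close to the initial `valid_y_nll` reported in the fine-tuning log.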
In [6]:
# Read the YAML template for the stacked MLP.
with open('dae_mlp.yaml', 'r') as f:
    mlp_yaml = f.read()
hyper_params_mlp = {'train_stop' : 50000,
                    'valid_stop' : 60000,
                    'batch_size' : 100,
                    'max_epochs' : 50,
                    'save_path' : '.'}
mlp_yaml = mlp_yaml % (hyper_params_mlp)
print(mlp_yaml)
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        one_hot: 1,
        start: 0,
        stop: 50000
    },
    model: !obj:pylearn2.models.mlp.MLP {
        batch_size: 100,
        layers: [
            !obj:pylearn2.models.mlp.PretrainedLayer {
                layer_name: 'h1',
                layer_content: !pkl: "./dae_l1.pkl"
            },
            !obj:pylearn2.models.mlp.PretrainedLayer {
                layer_name: 'h2',
                layer_content: !pkl: "./dae_l2.pkl"
            },
            !obj:pylearn2.models.mlp.Softmax {
                max_col_norm: 1.9365,
                layer_name: 'y',
                n_classes: 10,
                irange: .005
            }
        ],
        nvis: 784
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: .05,
        learning_rule: !obj:pylearn2.training_algorithms.learning_rule.Momentum {
            init_momentum: .5,
        },
        monitoring_dataset:
            {
                'valid' : !obj:pylearn2.datasets.mnist.MNIST {
                    which_set: 'train',
                    one_hot: 1,
                    start: 50000,
                    stop: 60000
                },
            },
        cost: !obj:pylearn2.costs.mlp.Default {},
        termination_criterion: !obj:pylearn2.termination_criteria.And {
            criteria: [
                !obj:pylearn2.termination_criteria.MonitorBased {
                    channel_name: "valid_y_misclass",
                    prop_decrease: 0.,
                    N: 100
                },
                !obj:pylearn2.termination_criteria.EpochCounter {
                    max_epochs: 50
                }
            ]
        },
        update_callbacks: !obj:pylearn2.training_algorithms.sgd.ExponentialDecay {
            decay_factor: 1.00004,
            min_lr: .000001
        }
    },
    extensions: [
        !obj:pylearn2.training_algorithms.learning_rule.MomentumAdjustor {
            start: 1,
            saturate: 250,
            final_momentum: .7
        }
    ]
}
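The configuration above changes two quantities during fine-tuning: the learning rate (via `ExponentialDecay`) and the momentum (via `MomentumAdjustor`). The sketch below reproduces both schedules from the YAML values. The formulas are an assumption inferred from the configuration and checked against the monitoring log that follows, not quoted from the library source, and the function names are made up for this example.

```python
def learning_rate_after(n_batches, init_lr=0.05, decay_factor=1.00004,
                        min_lr=1e-6):
    """Learning rate after n_batches updates.

    Assumes the rate is divided by decay_factor once per batch and is
    never allowed to drop below min_lr.
    """
    return max(init_lr / decay_factor ** n_batches, min_lr)

def momentum_at(epochs_seen, init_momentum=0.5, final_momentum=0.7,
                start=1, saturate=250):
    """Momentum at a given epoch count.

    Assumes linear interpolation from init_momentum to final_momentum
    between epochs `start` and `saturate`, held constant outside that range.
    """
    alpha = float(epochs_seen - start) / float(saturate - start)
    alpha = min(max(alpha, 0.0), 1.0)
    return init_momentum * (1.0 - alpha) + final_momentum * alpha

# 50000 training examples in batches of 100 give 500 updates per epoch.
batches_per_epoch = 50000 // 100

lr_epoch_1 = learning_rate_after(1 * batches_per_epoch)
lr_epoch_10 = learning_rate_after(10 * batches_per_epoch)
momentum_epoch_2 = momentum_at(2)
momentum_epoch_10 = momentum_at(10)
```

With these defaults, `lr_epoch_1` is about 0.04901 and `momentum_epoch_2` is about 0.5008, in line with the `learning_rate` and `momentum` channels logged after the first and second epochs.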
In [7]:
train = yaml_parse.load(mlp_yaml)
train.main_loop()
Parameter and initial learning rate summary:
vb: 0.05
hb: 0.05
W: 0.05
Wprime: 0.05
vb: 0.05
hb: 0.05
W: 0.05
Wprime: 0.05
softmax_b: 0.05
softmax_W: 0.05
Compiling sgd_update...
Compiling sgd_update done. Time elapsed: 18.207897 seconds
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.071935 seconds
Monitored channels:
learning_rate
momentum
total_seconds_last_epoch
training_seconds_this_epoch
valid_objective
valid_y_col_norms_max
valid_y_col_norms_mean
valid_y_col_norms_min
valid_y_max_max_class
valid_y_mean_max_class
valid_y_min_max_class
valid_y_misclass
valid_y_nll
valid_y_row_norms_max
valid_y_row_norms_mean
valid_y_row_norms_min
Compiling accum...
graph size: 63
Compiling accum done. Time elapsed: 10.991465 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.05
momentum: 0.5
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
valid_objective: 2.3025427138
valid_y_col_norms_max: 0.0650026130651
valid_y_col_norms_mean: 0.0641744853852
valid_y_col_norms_min: 0.0624679393698
valid_y_max_max_class: 0.105517846447
valid_y_mean_max_class: 0.102751944167
valid_y_min_max_class: 0.101061860389
valid_y_misclass: 0.9045
valid_y_nll: 2.3025427138
valid_y_row_norms_max: 0.0125483545665
valid_y_row_norms_mean: 0.00897718040255
valid_y_row_norms_min: 0.00411555936503
Time this epoch: 5.424466 seconds
Monitoring step:
Epochs seen: 1
Batches seen: 500
Examples seen: 50000
learning_rate: 0.0490099532688
momentum: 0.5
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 5.424466
valid_objective: 0.285522835091
valid_y_col_norms_max: 1.37933177742
valid_y_col_norms_mean: 1.26006992057
valid_y_col_norms_min: 1.1055487119
valid_y_max_max_class: 0.999642237207
valid_y_mean_max_class: 0.891380619054
valid_y_min_max_class: 0.366655260832
valid_y_misclass: 0.0816
valid_y_nll: 0.285522835091
valid_y_row_norms_max: 0.305729549433
valid_y_row_norms_mean: 0.173910043492
valid_y_row_norms_min: 0.0764016240587
Time this epoch: 5.914395 seconds
Monitoring step:
Epochs seen: 2
Batches seen: 1000
Examples seen: 100000
learning_rate: 0.0480395103882
momentum: 0.500803212851
total_seconds_last_epoch: 5.916628
training_seconds_this_epoch: 5.914395
valid_objective: 0.247152456183
valid_y_col_norms_max: 1.53966505735
valid_y_col_norms_mean: 1.40253319321
valid_y_col_norms_min: 1.25545532251
valid_y_max_max_class: 0.999809042373
valid_y_mean_max_class: 0.91413155486
valid_y_min_max_class: 0.396858131932
valid_y_misclass: 0.0693
valid_y_nll: 0.247152456183
valid_y_row_norms_max: 0.349859366019
valid_y_row_norms_mean: 0.193156755311
valid_y_row_norms_min: 0.0768834575651
Time this epoch: 5.384064 seconds
Monitoring step:
Epochs seen: 3
Batches seen: 1500
Examples seen: 150000
learning_rate: 0.0470882831836
momentum: 0.501606425703
total_seconds_last_epoch: 6.467402
training_seconds_this_epoch: 5.384064
valid_objective: 0.209608763896
valid_y_col_norms_max: 1.67361362258
valid_y_col_norms_mean: 1.5175975166
valid_y_col_norms_min: 1.4172014207
valid_y_max_max_class: 0.999854004396
valid_y_mean_max_class: 0.925858884522
valid_y_min_max_class: 0.406040902917
valid_y_misclass: 0.0606
valid_y_nll: 0.209608763896
valid_y_row_norms_max: 0.39865097889
valid_y_row_norms_mean: 0.208246054914
valid_y_row_norms_min: 0.0794959523134
Time this epoch: 6.462137 seconds
Monitoring step:
Epochs seen: 4
Batches seen: 2000
Examples seen: 200000
learning_rate: 0.0461558911667
momentum: 0.502409638554
total_seconds_last_epoch: 5.87754
training_seconds_this_epoch: 6.462137
valid_objective: 0.182002507511
valid_y_col_norms_max: 1.88661737526
valid_y_col_norms_mean: 1.62843307965
valid_y_col_norms_min: 1.44947897479
valid_y_max_max_class: 0.999894541276
valid_y_mean_max_class: 0.934676843166
valid_y_min_max_class: 0.424385874472
valid_y_misclass: 0.0517
valid_y_nll: 0.182002507511
valid_y_row_norms_max: 0.445090109865
valid_y_row_norms_mean: 0.222652142005
valid_y_row_norms_min: 0.0809203018493
Time this epoch: 7.260037 seconds
Monitoring step:
Epochs seen: 5
Batches seen: 2500
Examples seen: 250000
learning_rate: 0.0452419613832
momentum: 0.503212851406
total_seconds_last_epoch: 7.185899
training_seconds_this_epoch: 7.260037
valid_objective: 0.15993592079
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.72137351889
valid_y_col_norms_min: 1.47563705063
valid_y_max_max_class: 0.999887268664
valid_y_mean_max_class: 0.940857921767
valid_y_min_max_class: 0.425844421556
valid_y_misclass: 0.0443
valid_y_nll: 0.15993592079
valid_y_row_norms_max: 0.469129537434
valid_y_row_norms_mean: 0.234461782035
valid_y_row_norms_min: 0.0818907474095
Time this epoch: 6.109988 seconds
Monitoring step:
Epochs seen: 6
Batches seen: 3000
Examples seen: 300000
learning_rate: 0.0443461282636
momentum: 0.504016064257
total_seconds_last_epoch: 7.832857
training_seconds_this_epoch: 6.109988
valid_objective: 0.143053170079
valid_y_col_norms_max: 1.93127251362
valid_y_col_norms_mean: 1.79741241555
valid_y_col_norms_min: 1.52140954332
valid_y_max_max_class: 0.999934626403
valid_y_mean_max_class: 0.948279377144
valid_y_min_max_class: 0.448804897479
valid_y_misclass: 0.0378
valid_y_nll: 0.143053170079
valid_y_row_norms_max: 0.50656723117
valid_y_row_norms_mean: 0.244059832993
valid_y_row_norms_min: 0.0839453832919
Time this epoch: 6.274909 seconds
Monitoring step:
Epochs seen: 7
Batches seen: 3500
Examples seen: 350000
learning_rate: 0.043468033477
momentum: 0.504819277108
total_seconds_last_epoch: 6.641488
training_seconds_this_epoch: 6.274909
valid_objective: 0.12899298538
valid_y_col_norms_max: 1.93629631377
valid_y_col_norms_mean: 1.8413293273
valid_y_col_norms_min: 1.56393053427
valid_y_max_max_class: 0.999935186738
valid_y_mean_max_class: 0.952691727165
valid_y_min_max_class: 0.457475747077
valid_y_misclass: 0.0371
valid_y_nll: 0.12899298538
valid_y_row_norms_max: 0.522436534903
valid_y_row_norms_mean: 0.249350677872
valid_y_row_norms_min: 0.0835793194683
Time this epoch: 5.780168 seconds
Monitoring step:
Epochs seen: 8
Batches seen: 4000
Examples seen: 400000
learning_rate: 0.0426073257879
momentum: 0.50562248996
total_seconds_last_epoch: 6.848056
training_seconds_this_epoch: 5.780168
valid_objective: 0.123507539177
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.87322047941
valid_y_col_norms_min: 1.61648739436
valid_y_max_max_class: 0.999963122695
valid_y_mean_max_class: 0.954592708979
valid_y_min_max_class: 0.463553841674
valid_y_misclass: 0.0348
valid_y_nll: 0.123507539177
valid_y_row_norms_max: 0.530573236568
valid_y_row_norms_mean: 0.253283284212
valid_y_row_norms_min: 0.0839513317413
Time this epoch: 5.862484 seconds
Monitoring step:
Epochs seen: 9
Batches seen: 4500
Examples seen: 450000
learning_rate: 0.0417636609155
momentum: 0.506425702811
total_seconds_last_epoch: 6.263422
training_seconds_this_epoch: 5.862484
valid_objective: 0.119219852832
valid_y_col_norms_max: 1.93649990006
valid_y_col_norms_mean: 1.89298464921
valid_y_col_norms_min: 1.65981380041
valid_y_max_max_class: 0.999965706697
valid_y_mean_max_class: 0.956153143933
valid_y_min_max_class: 0.464692993356
valid_y_misclass: 0.0325
valid_y_nll: 0.119219852832
valid_y_row_norms_max: 0.534978569943
valid_y_row_norms_mean: 0.255624289812
valid_y_row_norms_min: 0.083866479842
Time this epoch: 5.537590 seconds
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.040936701396
momentum: 0.507228915663
total_seconds_last_epoch: 6.380026
training_seconds_this_epoch: 5.53759
valid_objective: 0.107567259811
valid_y_col_norms_max: 1.9364999001
valid_y_col_norms_mean: 1.9034125219
valid_y_col_norms_min: 1.70328608813
valid_y_max_max_class: 0.999951104974
valid_y_mean_max_class: 0.960034365701
valid_y_min_max_class: 0.468327175933
valid_y_misclass: 0.0301
valid_y_nll: 0.107567259811
valid_y_row_norms_max: 0.542294959221
valid_y_row_norms_mean: 0.256801448936
valid_y_row_norms_min: 0.0852080089068
Time this epoch: 5.906574 seconds
Monitoring step:
Epochs seen: 11
Batches seen: 5500
Examples seen: 550000
learning_rate: 0.0401261164479
momentum: 0.508032128514
total_seconds_last_epoch: 6.028702
training_seconds_this_epoch: 5.906574
valid_objective: 0.107947427591
valid_y_col_norms_max: 1.93649990007
valid_y_col_norms_mean: 1.91489786344
valid_y_col_norms_min: 1.76231726369
valid_y_max_max_class: 0.999973748357
valid_y_mean_max_class: 0.959634515782
valid_y_min_max_class: 0.474093275273
valid_y_misclass: 0.03
valid_y_nll: 0.107947427591
valid_y_row_norms_max: 0.549992925871
valid_y_row_norms_mean: 0.258155136976
valid_y_row_norms_min: 0.0866275711933
Time this epoch: 5.412370 seconds
Monitoring step:
Epochs seen: 12
Batches seen: 6000
Examples seen: 600000
learning_rate: 0.0393315818394
momentum: 0.508835341365
total_seconds_last_epoch: 6.476826
training_seconds_this_epoch: 5.41237
valid_objective: 0.099866818925
valid_y_col_norms_max: 1.93649990003
valid_y_col_norms_mean: 1.92230766664
valid_y_col_norms_min: 1.80934364428
valid_y_max_max_class: 0.999977163009
valid_y_mean_max_class: 0.964608656832
valid_y_min_max_class: 0.500261865705
valid_y_misclass: 0.0275
valid_y_nll: 0.099866818925
valid_y_row_norms_max: 0.558781354335
valid_y_row_norms_mean: 0.259100870088
valid_y_row_norms_min: 0.0880999968826
Time this epoch: 6.015195 seconds
Monitoring step:
Epochs seen: 13
Batches seen: 6500
Examples seen: 650000
learning_rate: 0.0385527797588
momentum: 0.509638554217
total_seconds_last_epoch: 5.900172
training_seconds_this_epoch: 6.015195
valid_objective: 0.0978410464664
valid_y_col_norms_max: 1.93649990007
valid_y_col_norms_mean: 1.92794140834
valid_y_col_norms_min: 1.86112450513
valid_y_max_max_class: 0.999976969849
valid_y_mean_max_class: 0.964530701679
valid_y_min_max_class: 0.493560643367
valid_y_misclass: 0.0282
valid_y_nll: 0.0978410464664
valid_y_row_norms_max: 0.564768602442
valid_y_row_norms_mean: 0.259801533503
valid_y_row_norms_min: 0.0896851288266
Time this epoch: 5.360026 seconds
Monitoring step:
Epochs seen: 14
Batches seen: 7000
Examples seen: 700000
learning_rate: 0.0377893986872
momentum: 0.510441767068
total_seconds_last_epoch: 6.57288
training_seconds_this_epoch: 5.360026
valid_objective: 0.0951468312278
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93098837983
valid_y_col_norms_min: 1.90879801473
valid_y_max_max_class: 0.999983550414
valid_y_mean_max_class: 0.965579195639
valid_y_min_max_class: 0.49387690176
valid_y_misclass: 0.028
valid_y_nll: 0.0951468312278
valid_y_row_norms_max: 0.572342419681
valid_y_row_norms_mean: 0.260207611713
valid_y_row_norms_min: 0.0907168875795
Time this epoch: 6.135147 seconds
Monitoring step:
Epochs seen: 15
Batches seen: 7500
Examples seen: 750000
learning_rate: 0.0370411332743
momentum: 0.51124497992
total_seconds_last_epoch: 5.840326
training_seconds_this_epoch: 6.135147
valid_objective: 0.0946865767775
valid_y_col_norms_max: 1.9364999001
valid_y_col_norms_mean: 1.93528101845
valid_y_col_norms_min: 1.93273820413
valid_y_max_max_class: 0.99998445477
valid_y_mean_max_class: 0.966292634743
valid_y_min_max_class: 0.497419916571
valid_y_misclass: 0.0264
valid_y_nll: 0.0946865767775
valid_y_row_norms_max: 0.575220483639
valid_y_row_norms_mean: 0.26081324661
valid_y_row_norms_min: 0.0913843843503
Time this epoch: 5.447133 seconds
Monitoring step:
Epochs seen: 16
Batches seen: 8000
Examples seen: 800000
learning_rate: 0.0363076842159
momentum: 0.512048192771
total_seconds_last_epoch: 6.627608
training_seconds_this_epoch: 5.447133
valid_objective: 0.0891035756577
valid_y_col_norms_max: 1.93649990007
valid_y_col_norms_mean: 1.93567546133
valid_y_col_norms_min: 1.93226078666
valid_y_max_max_class: 0.99998511463
valid_y_mean_max_class: 0.968229560194
valid_y_min_max_class: 0.503004856839
valid_y_misclass: 0.0255
valid_y_nll: 0.0891035756577
valid_y_row_norms_max: 0.577980787976
valid_y_row_norms_mean: 0.260947657836
valid_y_row_norms_min: 0.0923057387534
Time this epoch: 6.057061 seconds
Monitoring step:
Epochs seen: 17
Batches seen: 8500
Examples seen: 850000
learning_rate: 0.0355887581344
momentum: 0.512851405622
total_seconds_last_epoch: 5.923046
training_seconds_this_epoch: 6.057061
valid_objective: 0.0881537906447
valid_y_col_norms_max: 1.93649990003
valid_y_col_norms_mean: 1.93483799999
valid_y_col_norms_min: 1.93229350347
valid_y_max_max_class: 0.999977816061
valid_y_mean_max_class: 0.968547838073
valid_y_min_max_class: 0.503032249459
valid_y_misclass: 0.026
valid_y_nll: 0.0881537906447
valid_y_row_norms_max: 0.580327280586
valid_y_row_norms_mean: 0.260898516117
valid_y_row_norms_min: 0.0930101525575
Time this epoch: 5.353301 seconds
Monitoring step:
Epochs seen: 18
Batches seen: 9000
Examples seen: 900000
learning_rate: 0.0348840674612
momentum: 0.513654618474
total_seconds_last_epoch: 6.544159
training_seconds_this_epoch: 5.353301
valid_objective: 0.0849988168168
valid_y_col_norms_max: 1.93649990007
valid_y_col_norms_mean: 1.93541869502
valid_y_col_norms_min: 1.93201480261
valid_y_max_max_class: 0.999984295216
valid_y_mean_max_class: 0.969765470533
valid_y_min_max_class: 0.510558770721
valid_y_misclass: 0.0242
valid_y_nll: 0.0849988168168
valid_y_row_norms_max: 0.581088622209
valid_y_row_norms_mean: 0.26107829656
valid_y_row_norms_min: 0.0923959534237
Time this epoch: 6.384324 seconds
Monitoring step:
Epochs seen: 19
Batches seen: 9500
Examples seen: 950000
learning_rate: 0.034193330322
momentum: 0.514457831325
total_seconds_last_epoch: 5.837694
training_seconds_this_epoch: 6.384324
valid_objective: 0.0860030965262
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93476177366
valid_y_col_norms_min: 1.93239041045
valid_y_max_max_class: 0.999982263957
valid_y_mean_max_class: 0.968567755339
valid_y_min_max_class: 0.500306941533
valid_y_misclass: 0.0244
valid_y_nll: 0.0860030965262
valid_y_row_norms_max: 0.583485659518
valid_y_row_norms_mean: 0.261092936398
valid_y_row_norms_min: 0.0937852230236
Time this epoch: 5.956484 seconds
Monitoring step:
Epochs seen: 20
Batches seen: 10000
Examples seen: 1000000
learning_rate: 0.0335162704237
momentum: 0.515261044177
total_seconds_last_epoch: 6.863189
training_seconds_this_epoch: 5.956484
valid_objective: 0.0827991311558
valid_y_col_norms_max: 1.93649990006
valid_y_col_norms_mean: 1.9356068121
valid_y_col_norms_min: 1.932339644
valid_y_max_max_class: 0.999988465173
valid_y_mean_max_class: 0.970974120441
valid_y_min_max_class: 0.512024650555
valid_y_misclass: 0.0253
valid_y_nll: 0.0827991311558
valid_y_row_norms_max: 0.585478955937
valid_y_row_norms_mean: 0.261291684144
valid_y_row_norms_min: 0.0935873984313
Time this epoch: 6.912120 seconds
Monitoring step:
Epochs seen: 21
Batches seen: 10500
Examples seen: 1050000
learning_rate: 0.0328526169442
momentum: 0.516064257028
total_seconds_last_epoch: 6.623968
training_seconds_this_epoch: 6.91212
valid_objective: 0.0818521815965
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93554695608
valid_y_col_norms_min: 1.93384153168
valid_y_max_max_class: 0.999989665122
valid_y_mean_max_class: 0.972759271353
valid_y_min_max_class: 0.532786606556
valid_y_misclass: 0.0242
valid_y_nll: 0.0818521815965
valid_y_row_norms_max: 0.582850746611
valid_y_row_norms_mean: 0.261402125409
valid_y_row_norms_min: 0.0943095869177
Time this epoch: 5.942600 seconds
Monitoring step:
Epochs seen: 22
Batches seen: 11000
Examples seen: 1100000
learning_rate: 0.0322021044239
momentum: 0.51686746988
total_seconds_last_epoch: 7.449036
training_seconds_this_epoch: 5.9426
valid_objective: 0.0818381087461
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.9354317889
valid_y_col_norms_min: 1.93240682737
valid_y_max_max_class: 0.999989762395
valid_y_mean_max_class: 0.971525657813
valid_y_min_max_class: 0.511311093519
valid_y_misclass: 0.0237
valid_y_nll: 0.0818381087461
valid_y_row_norms_max: 0.583744391912
valid_y_row_norms_mean: 0.261528473827
valid_y_row_norms_min: 0.0954943063263
Time this epoch: 6.306351 seconds
Monitoring step:
Epochs seen: 23
Batches seen: 11500
Examples seen: 1150000
learning_rate: 0.0315644726594
momentum: 0.517670682731
total_seconds_last_epoch: 6.45007
training_seconds_this_epoch: 6.306351
valid_objective: 0.0783682477947
valid_y_col_norms_max: 1.93649990006
valid_y_col_norms_mean: 1.93526822051
valid_y_col_norms_min: 1.93147578338
valid_y_max_max_class: 0.99999011425
valid_y_mean_max_class: 0.97293918071
valid_y_min_max_class: 0.526539137394
valid_y_misclass: 0.0227
valid_y_nll: 0.0783682477947
valid_y_row_norms_max: 0.582500635239
valid_y_row_norms_mean: 0.261607855034
valid_y_row_norms_min: 0.0961656137637
Time this epoch: 7.636177 seconds
Monitoring step:
Epochs seen: 24
Batches seen: 12000
Examples seen: 1200000
learning_rate: 0.0309394665998
momentum: 0.518473895582
total_seconds_last_epoch: 6.911065
training_seconds_this_epoch: 7.636177
valid_objective: 0.0787622672218
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93575002012
valid_y_col_norms_min: 1.93321450572
valid_y_max_max_class: 0.99998974409
valid_y_mean_max_class: 0.97318432332
valid_y_min_max_class: 0.518712142169
valid_y_misclass: 0.0225
valid_y_nll: 0.0787622672218
valid_y_row_norms_max: 0.583587850135
valid_y_row_norms_mean: 0.261799057069
valid_y_row_norms_min: 0.0973764887908
Time this epoch: 5.956405 seconds
Monitoring step:
Epochs seen: 25
Batches seen: 12500
Examples seen: 1250000
learning_rate: 0.0303268362444
momentum: 0.519277108434
total_seconds_last_epoch: 8.194458
training_seconds_this_epoch: 5.956405
valid_objective: 0.0773718887277
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93532355362
valid_y_col_norms_min: 1.93351647398
valid_y_max_max_class: 0.99999145149
valid_y_mean_max_class: 0.973759518777
valid_y_min_max_class: 0.529670173287
valid_y_misclass: 0.0233
valid_y_nll: 0.0773718887277
valid_y_row_norms_max: 0.582513357512
valid_y_row_norms_mean: 0.26186702499
valid_y_row_norms_min: 0.0979607422094
Time this epoch: 5.743074 seconds
Monitoring step:
Epochs seen: 26
Batches seen: 13000
Examples seen: 1300000
learning_rate: 0.0297263365425
momentum: 0.520080321285
total_seconds_last_epoch: 6.488384
training_seconds_this_epoch: 5.743074
valid_objective: 0.0761000400278
valid_y_col_norms_max: 1.93649990003
valid_y_col_norms_mean: 1.93572293722
valid_y_col_norms_min: 1.93308728002
valid_y_max_max_class: 0.999990865485
valid_y_mean_max_class: 0.974327828318
valid_y_min_max_class: 0.535549664623
valid_y_misclass: 0.0222
valid_y_nll: 0.0761000400278
valid_y_row_norms_max: 0.582412817396
valid_y_row_norms_mean: 0.262060779735
valid_y_row_norms_min: 0.0987484252388
Time this epoch: 5.530118 seconds
Monitoring step:
Epochs seen: 27
Batches seen: 13500
Examples seen: 1350000
learning_rate: 0.029137727296
momentum: 0.520883534137
total_seconds_last_epoch: 6.300112
training_seconds_this_epoch: 5.530118
valid_objective: 0.0745081666292
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93582241037
valid_y_col_norms_min: 1.93385572353
valid_y_max_max_class: 0.999992622678
valid_y_mean_max_class: 0.974613548424
valid_y_min_max_class: 0.528671878457
valid_y_misclass: 0.0224
valid_y_nll: 0.0745081666292
valid_y_row_norms_max: 0.580291571914
valid_y_row_norms_mean: 0.262198077457
valid_y_row_norms_min: 0.0991215943111
Time this epoch: 5.768094 seconds
Monitoring step:
Epochs seen: 28
Batches seen: 14000
Examples seen: 1400000
learning_rate: 0.0285607730628
momentum: 0.521686746988
total_seconds_last_epoch: 6.053191
training_seconds_this_epoch: 5.768094
valid_objective: 0.0740452001005
valid_y_col_norms_max: 1.93649990003
valid_y_col_norms_mean: 1.93625501779
valid_y_col_norms_min: 1.93551373359
valid_y_max_max_class: 0.999993221811
valid_y_mean_max_class: 0.975309055813
valid_y_min_max_class: 0.533641371741
valid_y_misclass: 0.0215
valid_y_nll: 0.0740452001005
valid_y_row_norms_max: 0.580056651905
valid_y_row_norms_mean: 0.262354805648
valid_y_row_norms_min: 0.100014501684
Time this epoch: 5.513620 seconds
Monitoring step:
Epochs seen: 29
Batches seen: 14500
Examples seen: 1450000
learning_rate: 0.0279952430625
momentum: 0.522489959839
total_seconds_last_epoch: 6.254529
training_seconds_this_epoch: 5.51362
valid_objective: 0.073504601426
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93584892773
valid_y_col_norms_min: 1.93416522858
valid_y_max_max_class: 0.999993403152
valid_y_mean_max_class: 0.975481711533
valid_y_min_max_class: 0.529656340405
valid_y_misclass: 0.0227
valid_y_nll: 0.073504601426
valid_y_row_norms_max: 0.5790139873
valid_y_row_norms_mean: 0.262423752802
valid_y_row_norms_min: 0.100872029594
Time this epoch: 5.889689 seconds
Monitoring step:
Epochs seen: 30
Batches seen: 15000
Examples seen: 1500000
learning_rate: 0.0274409110849
momentum: 0.523293172691
total_seconds_last_epoch: 6.052284
training_seconds_this_epoch: 5.889689
valid_objective: 0.0742676570822
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93607586676
valid_y_col_norms_min: 1.93510461646
valid_y_max_max_class: 0.999993515244
valid_y_mean_max_class: 0.975243305948
valid_y_min_max_class: 0.529464250743
valid_y_misclass: 0.0214
valid_y_nll: 0.0742676570822
valid_y_row_norms_max: 0.577061673794
valid_y_row_norms_mean: 0.262574402565
valid_y_row_norms_min: 0.101525157962
Time this epoch: 5.481031 seconds
Monitoring step:
Epochs seen: 31
Batches seen: 15500
Examples seen: 1550000
learning_rate: 0.0268975553984
momentum: 0.524096385542
total_seconds_last_epoch: 6.374491
training_seconds_this_epoch: 5.481031
valid_objective: 0.0728776691158
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93537613609
valid_y_col_norms_min: 1.93343938173
valid_y_max_max_class: 0.999993167924
valid_y_mean_max_class: 0.97504830818
valid_y_min_max_class: 0.530201316588
valid_y_misclass: 0.0212
valid_y_nll: 0.0728776691158
valid_y_row_norms_max: 0.575226588311
valid_y_row_norms_mean: 0.262591911926
valid_y_row_norms_min: 0.102804652481
Time this epoch: 5.813406 seconds
Monitoring step:
Epochs seen: 32
Batches seen: 16000
Examples seen: 1600000
learning_rate: 0.0263649586624
momentum: 0.524899598394
total_seconds_last_epoch: 6.010313
training_seconds_this_epoch: 5.813406
valid_objective: 0.0728836989039
valid_y_col_norms_max: 1.93649990003
valid_y_col_norms_mean: 1.93582905882
valid_y_col_norms_min: 1.93383688687
valid_y_max_max_class: 0.999994016591
valid_y_mean_max_class: 0.976248888568
valid_y_min_max_class: 0.523712212878
valid_y_misclass: 0.0216
valid_y_nll: 0.0728836989039
valid_y_row_norms_max: 0.573600835574
valid_y_row_norms_mean: 0.262762722983
valid_y_row_norms_min: 0.103200994169
Time this epoch: 5.543042 seconds
Monitoring step:
Epochs seen: 33
Batches seen: 16500
Examples seen: 1650000
learning_rate: 0.0258429078396
momentum: 0.525702811245
total_seconds_last_epoch: 6.272276
training_seconds_this_epoch: 5.543042
valid_objective: 0.0711776701676
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93601849576
valid_y_col_norms_min: 1.9347452523
valid_y_max_max_class: 0.999994804807
valid_y_mean_max_class: 0.976937162353
valid_y_min_max_class: 0.535316842768
valid_y_misclass: 0.0215
valid_y_nll: 0.0711776701676
valid_y_row_norms_max: 0.57294316861
valid_y_row_norms_mean: 0.26290736819
valid_y_row_norms_min: 0.103953794527
Time this epoch: 5.721724 seconds
Monitoring step:
Epochs seen: 34
Batches seen: 17000
Examples seen: 1700000
learning_rate: 0.025331194111
momentum: 0.526506024096
total_seconds_last_epoch: 6.074407
training_seconds_this_epoch: 5.721724
valid_objective: 0.0699203328758
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93592743587
valid_y_col_norms_min: 1.93428107334
valid_y_max_max_class: 0.999995376263
valid_y_mean_max_class: 0.976599284745
valid_y_min_max_class: 0.526191214552
valid_y_misclass: 0.021
valid_y_nll: 0.0699203328758
valid_y_row_norms_max: 0.571380504077
valid_y_row_norms_mean: 0.262986149627
valid_y_row_norms_min: 0.10382617767
Time this epoch: 5.558886 seconds
Monitoring step:
Epochs seen: 35
Batches seen: 17500
Examples seen: 1750000
learning_rate: 0.0248296127924
momentum: 0.527309236948
total_seconds_last_epoch: 6.196367
training_seconds_this_epoch: 5.558886
valid_objective: 0.0703058351436
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93556322238
valid_y_col_norms_min: 1.93352817916
valid_y_max_max_class: 0.999995617856
valid_y_mean_max_class: 0.977215418205
valid_y_min_max_class: 0.540389429489
valid_y_misclass: 0.0214
valid_y_nll: 0.0703058351436
valid_y_row_norms_max: 0.56730498694
valid_y_row_norms_mean: 0.263051858774
valid_y_row_norms_min: 0.103871414144
Time this epoch: 5.726418 seconds
Monitoring step:
Epochs seen: 36
Batches seen: 18000
Examples seen: 1800000
learning_rate: 0.0243379632528
momentum: 0.528112449799
total_seconds_last_epoch: 6.09765
training_seconds_this_epoch: 5.726418
valid_objective: 0.0702599115727
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93603299071
valid_y_col_norms_min: 1.93538484492
valid_y_max_max_class: 0.999995632432
valid_y_mean_max_class: 0.97787637084
valid_y_min_max_class: 0.538194459331
valid_y_misclass: 0.0213
valid_y_nll: 0.0702599115727
valid_y_row_norms_max: 0.568113383761
valid_y_row_norms_mean: 0.263222703155
valid_y_row_norms_min: 0.104112327856
Time this epoch: 5.610193 seconds
Monitoring step:
Epochs seen: 37
Batches seen: 18500
Examples seen: 1850000
learning_rate: 0.0238560488335
momentum: 0.528915662651
total_seconds_last_epoch: 6.197008
training_seconds_this_epoch: 5.610193
valid_objective: 0.0700176350165
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93587895351
valid_y_col_norms_min: 1.93341619328
valid_y_max_max_class: 0.999996322536
valid_y_mean_max_class: 0.977628071153
valid_y_min_max_class: 0.542568666563
valid_y_misclass: 0.0209
valid_y_nll: 0.0700176350165
valid_y_row_norms_max: 0.566012260871
valid_y_row_norms_mean: 0.263292142199
valid_y_row_norms_min: 0.103009402164
Time this epoch: 5.764235 seconds
Monitoring step:
Epochs seen: 38
Batches seen: 19000
Examples seen: 1900000
learning_rate: 0.0233836767702
momentum: 0.529718875502
total_seconds_last_epoch: 6.1322
training_seconds_this_epoch: 5.764235
valid_objective: 0.0708193370316
valid_y_col_norms_max: 1.93649990004
valid_y_col_norms_mean: 1.93591896523
valid_y_col_norms_min: 1.93506382472
valid_y_max_max_class: 0.999996173337
valid_y_mean_max_class: 0.978034880895
valid_y_min_max_class: 0.541540751774
valid_y_misclass: 0.0211
valid_y_nll: 0.0708193370316
valid_y_row_norms_max: 0.564171236477
valid_y_row_norms_mean: 0.263400685781
valid_y_row_norms_min: 0.103364593489
Time this epoch: 8.119455 seconds
Monitoring step:
Epochs seen: 39
Batches seen: 19500
Examples seen: 1950000
learning_rate: 0.0229206581152
momentum: 0.530522088353
total_seconds_last_epoch: 6.258956
training_seconds_this_epoch: 8.119455
valid_objective: 0.070476986816
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93585109817
valid_y_col_norms_min: 1.93342596378
valid_y_max_max_class: 0.999996190721
valid_y_mean_max_class: 0.978034580491
valid_y_min_max_class: 0.547503878931
valid_y_misclass: 0.0211
valid_y_nll: 0.070476986816
valid_y_row_norms_max: 0.561514221412
valid_y_row_norms_mean: 0.263508182543
valid_y_row_norms_min: 0.103347333576
Time this epoch: 6.312362 seconds
Monitoring step:
Epochs seen: 40
Batches seen: 20000
Examples seen: 2000000
learning_rate: 0.0224668076623
momentum: 0.531325301205
total_seconds_last_epoch: 8.714237
training_seconds_this_epoch: 6.312362
valid_objective: 0.0690301024797
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93612443028
valid_y_col_norms_min: 1.93513479545
valid_y_max_max_class: 0.99999639704
valid_y_mean_max_class: 0.978084482899
valid_y_min_max_class: 0.548426862892
valid_y_misclass: 0.0201
valid_y_nll: 0.0690301024797
valid_y_row_norms_max: 0.559634525084
valid_y_row_norms_mean: 0.263633786722
valid_y_row_norms_min: 0.102985762366
Time this epoch: 6.132795 seconds
Monitoring step:
Epochs seen: 41
Batches seen: 20500
Examples seen: 2050000
learning_rate: 0.0220219438726
momentum: 0.532128514056
total_seconds_last_epoch: 6.828239
training_seconds_this_epoch: 6.132795
valid_objective: 0.0694333851078
valid_y_col_norms_max: 1.93649990003
valid_y_col_norms_mean: 1.93617701772
valid_y_col_norms_min: 1.93556864871
valid_y_max_max_class: 0.999996268269
valid_y_mean_max_class: 0.978082685929
valid_y_min_max_class: 0.551582992126
valid_y_misclass: 0.021
valid_y_nll: 0.0694333851078
valid_y_row_norms_max: 0.557070974796
valid_y_row_norms_mean: 0.263730191482
valid_y_row_norms_min: 0.103568235915
Time this epoch: 5.379670 seconds
Monitoring step:
Epochs seen: 42
Batches seen: 21000
Examples seen: 2100000
learning_rate: 0.0215858888017
momentum: 0.532931726908
total_seconds_last_epoch: 6.611762
training_seconds_this_epoch: 5.37967
valid_objective: 0.0682303426279
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93583612318
valid_y_col_norms_min: 1.93494503131
valid_y_max_max_class: 0.99999624094
valid_y_mean_max_class: 0.978472639253
valid_y_min_max_class: 0.536463401277
valid_y_misclass: 0.0207
valid_y_nll: 0.0682303426279
valid_y_row_norms_max: 0.556202545699
valid_y_row_norms_mean: 0.263762636273
valid_y_row_norms_min: 0.102883379419
Time this epoch: 5.930689 seconds
Monitoring step:
Epochs seen: 43
Batches seen: 21500
Examples seen: 2150000
learning_rate: 0.0211584680287
momentum: 0.533734939759
total_seconds_last_epoch: 5.896643
training_seconds_this_epoch: 5.930689
valid_objective: 0.0681362986241
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93625812448
valid_y_col_norms_min: 1.93568812209
valid_y_max_max_class: 0.999996842896
valid_y_mean_max_class: 0.978984041993
valid_y_min_max_class: 0.545429982729
valid_y_misclass: 0.0208
valid_y_nll: 0.0681362986241
valid_y_row_norms_max: 0.555366264275
valid_y_row_norms_mean: 0.263893828294
valid_y_row_norms_min: 0.10304527795
Time this epoch: 5.453186 seconds
Monitoring step:
Epochs seen: 44
Batches seen: 22000
Examples seen: 2200000
learning_rate: 0.0207395105865
momentum: 0.53453815261
total_seconds_last_epoch: 6.391853
training_seconds_this_epoch: 5.453186
valid_objective: 0.0677736444362
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93613422245
valid_y_col_norms_min: 1.93494219442
valid_y_max_max_class: 0.999996661316
valid_y_mean_max_class: 0.978851094724
valid_y_min_max_class: 0.544526207014
valid_y_misclass: 0.0201
valid_y_nll: 0.0677736444362
valid_y_row_norms_max: 0.554979558689
valid_y_row_norms_mean: 0.263962321191
valid_y_row_norms_min: 0.103195129463
Time this epoch: 5.829830 seconds
Monitoring step:
Epochs seen: 45
Batches seen: 22500
Examples seen: 2250000
learning_rate: 0.0203288488932
momentum: 0.535341365462
total_seconds_last_epoch: 5.97628
training_seconds_this_epoch: 5.82983
valid_objective: 0.0676661977602
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93608486641
valid_y_col_norms_min: 1.93489012164
valid_y_max_max_class: 0.999997122489
valid_y_mean_max_class: 0.979207311934
valid_y_min_max_class: 0.541562838223
valid_y_misclass: 0.0213
valid_y_nll: 0.0676661977602
valid_y_row_norms_max: 0.55242065089
valid_y_row_norms_mean: 0.264012881039
valid_y_row_norms_min: 0.102884048645
Time this epoch: 5.430088 seconds
Monitoring step:
Epochs seen: 46
Batches seen: 23000
Examples seen: 2300000
learning_rate: 0.0199263186853
momentum: 0.536144578313
total_seconds_last_epoch: 6.305132
training_seconds_this_epoch: 5.430088
valid_objective: 0.0665784955302
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93639183725
valid_y_col_norms_min: 1.93598163006
valid_y_max_max_class: 0.999997269505
valid_y_mean_max_class: 0.97923932851
valid_y_min_max_class: 0.552940083313
valid_y_misclass: 0.02
valid_y_nll: 0.0665784955302
valid_y_row_norms_max: 0.549802901625
valid_y_row_norms_mean: 0.264148793386
valid_y_row_norms_min: 0.102842307922
Time this epoch: 5.897824 seconds
Monitoring step:
Epochs seen: 47
Batches seen: 23500
Examples seen: 2350000
learning_rate: 0.0195317589517
momentum: 0.536947791165
total_seconds_last_epoch: 5.959428
training_seconds_this_epoch: 5.897824
valid_objective: 0.0681469031182
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93598926751
valid_y_col_norms_min: 1.93463964799
valid_y_max_max_class: 0.999997257893
valid_y_mean_max_class: 0.979613624216
valid_y_min_max_class: 0.542315144824
valid_y_misclass: 0.0201
valid_y_nll: 0.0681469031182
valid_y_row_norms_max: 0.548223465559
valid_y_row_norms_mean: 0.264169306677
valid_y_row_norms_min: 0.102381753864
Time this epoch: 5.525158 seconds
Monitoring step:
Epochs seen: 48
Batches seen: 24000
Examples seen: 2400000
learning_rate: 0.0191450118696
momentum: 0.537751004016
total_seconds_last_epoch: 6.35558
training_seconds_this_epoch: 5.525158
valid_objective: 0.0698892905822
valid_y_col_norms_max: 1.93649990001
valid_y_col_norms_mean: 1.93575084477
valid_y_col_norms_min: 1.93358945446
valid_y_max_max_class: 0.999997643543
valid_y_mean_max_class: 0.979543125765
valid_y_min_max_class: 0.542149109755
valid_y_misclass: 0.0213
valid_y_nll: 0.0698892905822
valid_y_row_norms_max: 0.546325855752
valid_y_row_norms_mean: 0.264193353918
valid_y_row_norms_min: 0.10206665524
Time this epoch: 5.760655 seconds
Monitoring step:
Epochs seen: 49
Batches seen: 24500
Examples seen: 2450000
learning_rate: 0.0187659227412
momentum: 0.538554216867
total_seconds_last_epoch: 6.052614
training_seconds_this_epoch: 5.760655
valid_objective: 0.0663245377722
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93616316362
valid_y_col_norms_min: 1.93531479922
valid_y_max_max_class: 0.999997640275
valid_y_mean_max_class: 0.979900889515
valid_y_min_max_class: 0.557046970771
valid_y_misclass: 0.0205
valid_y_nll: 0.0663245377722
valid_y_row_norms_max: 0.546145159507
valid_y_row_norms_mean: 0.264324724773
valid_y_row_norms_min: 0.102420875042
Time this epoch: 5.505016 seconds
Monitoring step:
Epochs seen: 50
Batches seen: 25000
Examples seen: 2500000
learning_rate: 0.0183943399319
momentum: 0.539357429719
total_seconds_last_epoch: 6.23482
training_seconds_this_epoch: 5.505016
valid_objective: 0.0667406732564
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93614341932
valid_y_col_norms_min: 1.93520197097
valid_y_max_max_class: 0.999997716942
valid_y_mean_max_class: 0.980091116867
valid_y_min_max_class: 0.549003720211
valid_y_misclass: 0.0201
valid_y_nll: 0.0667406732564
valid_y_row_norms_max: 0.544320723906
valid_y_row_norms_mean: 0.264379014826
valid_y_row_norms_min: 0.102157939244
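The monitor output above is easier to compare across epochs once it is parsed into records. The following is a minimal sketch and not part of the original notebook: `parse_monitor_log` is a hypothetical helper, and it assumes the output has been captured as a string in the same `key: value` layout shown above, with each block introduced by a `Monitoring step:` line. The `sample` text reuses a few values from epochs 49 and 50.

```python
import re

# Matches "some_channel: 0.123" lines; lines such as
# "Time this epoch: 5.5 seconds" are skipped because of the trailing text.
_CHANNEL = re.compile(r'^([A-Za-z_ ]+):\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)$')


def parse_monitor_log(text):
    """Split monitor output into one dict of channel values per monitoring step."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('Monitoring step'):
            # A new monitoring block begins; start a fresh record.
            current = {}
            records.append(current)
            continue
        match = _CHANNEL.match(line)
        if current is not None and match:
            current[match.group(1).strip()] = float(match.group(2))
    return records


# Excerpt copied from the output above (epochs 49 and 50).
sample = """Monitoring step:
Epochs seen: 49
valid_y_misclass: 0.0205
valid_y_nll: 0.0663245377722
Time this epoch: 5.505016 seconds
Monitoring step:
Epochs seen: 50
valid_y_misclass: 0.0201
valid_y_nll: 0.0667406732564
"""

records = parse_monitor_log(sample)
# Pick the monitoring step with the lowest validation misclassification rate.
best = min(records, key=lambda r: r['valid_y_misclass'])
print('best epoch: %d, misclass: %.4f' % (best['Epochs seen'], best['valid_y_misclass']))
```

Run over the full output, the same helper would let you track `valid_y_misclass` or `valid_y_nll` epoch by epoch instead of reading the values off the log by eye.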
Content source: udibr/seizure-prediction