input (None, 1, 28, 28) produces 784 outputs
conv1 (None, 4, 25, 25) produces 2500 outputs
pool1 (None, 4, 9, 9) produces 324 outputs
dropout1 (None, 4, 9, 9) produces 324 outputs
conv2 (None, 8, 7, 7) produces 392 outputs
pool2 (None, 8, 4, 4) produces 128 outputs
dropout2 (None, 8, 4, 4) produces 128 outputs
conv3 (None, 16, 3, 3) produces 144 outputs
pool3 (None, 16, 2, 2) produces 64 outputs
dropout3 (None, 16, 2, 2) produces 64 outputs
hidden4 (None, 100) produces 100 outputs
hidden5 (None, 100) produces 100 outputs
output (None, 10) produces 10 outputs
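
The shapes in the layer summary above can be re-derived by hand. The following is a minimal sketch, assuming unpadded stride-1 convolutions and pooling that rounds partial windows up; the rounding rule is inferred from the printed shapes (25 to 9 with a 3x3 pool), not taken from the library source. The helper names `conv_out`, `pool_out`, and `trace_shapes` are hypothetical.

```python
import math


def conv_out(size, filter_size):
    # Unpadded, stride-1 convolution: each side shrinks by filter_size - 1.
    return size - filter_size + 1


def pool_out(size, pool_size):
    # Assumption: partial windows at the border are kept (round up).
    return math.ceil(size / pool_size)


def trace_shapes(input_size, stages):
    """Return (kind, channels, side, total outputs) for every conv/pool layer."""
    size = input_size
    rows = []
    for num_filters, filter_size, pool_size in stages:
        size = conv_out(size, filter_size)
        rows.append(("conv", num_filters, size, num_filters * size * size))
        size = pool_out(size, pool_size)
        rows.append(("pool", num_filters, size, num_filters * size * size))
    return rows


# (num_filters, filter_size, pool_size) read from the NeuralNet parameters.
STAGES = [(4, 4, 3), (8, 3, 2), (16, 2, 2)]

shapes = trace_shapes(28, STAGES)
for kind, channels, size, outputs in shapes:
    print(f"{kind}: ({channels}, {size}, {size}) produces {outputs} outputs")
```

Dropout layers leave the shape unchanged, so they are omitted from the trace.
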
| epoch | train loss | valid loss | valid best | train/val | val acc | dur |
|--------:|-------------:|-------------:|-------------:|------------:|----------:|-------:|
| 1 | 1.8062 | 0.8608 | 0.8608 | 2.0982 | 0.7483 | 5.0279 |
| 2 | 0.8855 | 0.6061 | 0.6061 | 1.4609 | 0.7926 | 5.0242 |
| 3 | 0.6766 | 0.4469 | 0.4469 | 1.5138 | 0.8522 | 4.9939 |
| 4 | 0.5713 | 0.3705 | 0.3705 | 1.5416 | 0.8808 | 5.0022 |
| 5 | 0.5115 | 0.3501 | 0.3501 | 1.4611 | 0.8866 | 5.0291 |
| 6 | 0.4619 | 0.2905 | 0.2905 | 1.5901 | 0.9083 | 5.0135 |
| 7 | 0.4325 | 0.2628 | 0.2628 | 1.6458 | 0.9158 | 5.0075 |
| 8 | 0.4022 | 0.2510 | 0.2510 | 1.6021 | 0.9192 | 5.0117 |
| 9 | 0.3807 | 0.2069 | 0.2069 | 1.8402 | 0.9367 | 5.0088 |
| 10 | 0.3614 | 0.2574 | | 1.4039 | 0.9152 | 5.0129 |
| 11 | 0.3424 | 0.1959 | 0.1959 | 1.7473 | 0.9397 | 5.0135 |
| 12 | 0.3238 | 0.2764 | | 1.1714 | 0.9086 | 5.0100 |
| 13 | 0.3127 | 0.1657 | 0.1657 | 1.8870 | 0.9524 | 5.0123 |
| 14 | 0.3034 | 0.1560 | 0.1560 | 1.9448 | 0.9506 | 5.0086 |
| 15 | 0.3010 | 0.1658 | | 1.8148 | 0.9496 | 5.1799 |
| 16 | 0.2904 | 0.1670 | | 1.7382 | 0.9487 | 5.2096 |
| 17 | 0.2876 | 0.1727 | | 1.6650 | 0.9484 | 5.1542 |
| 18 | 0.2791 | 0.1653 | | 1.6889 | 0.9500 | 5.1316 |
| 19 | 0.2760 | 0.1441 | 0.1441 | 1.9152 | 0.9565 | 5.1440 |
| 20 | 0.2681 | 0.1565 | | 1.7130 | 0.9521 | 5.1560 |
| 21 | 0.2649 | 0.1494 | | 1.7727 | 0.9534 | 5.1376 |
| 22 | 0.2564 | 0.1366 | 0.1366 | 1.8769 | 0.9584 | 5.0909 |
| 23 | 0.2532 | 0.1459 | | 1.7352 | 0.9534 | 5.0111 |
| 24 | 0.2546 | 0.1320 | 0.1320 | 1.9286 | 0.9595 | 5.0143 |
| 25 | 0.2456 | 0.1567 | | 1.5667 | 0.9499 | 5.0057 |
| 26 | 0.2422 | 0.1390 | | 1.7429 | 0.9570 | 5.0079 |
| 27 | 0.2394 | 0.1493 | | 1.6031 | 0.9546 | 5.0117 |
| 28 | 0.2446 | 0.1358 | | 1.8010 | 0.9590 | 5.0044 |
| 29 | 0.2386 | 0.1321 | | 1.8060 | 0.9592 | 5.0151 |
| 30 | 0.2325 | 0.1277 | 0.1277 | 1.8208 | 0.9609 | 5.0255 |
| 31 | 0.2350 | 0.1309 | | 1.7954 | 0.9605 | 5.0306 |
| 32 | 0.2280 | 0.1280 | | 1.7817 | 0.9596 | 5.0061 |
| 33 | 0.2282 | 0.1249 | 0.1249 | 1.8270 | 0.9613 | 5.0262 |
| 34 | 0.2247 | 0.1260 | | 1.7832 | 0.9592 | 5.0026 |
| 35 | 0.2214 | 0.1207 | 0.1207 | 1.8337 | 0.9632 | 5.0108 |
| 36 | 0.2213 | 0.1215 | | 1.8206 | 0.9609 | 5.0146 |
| 37 | 0.2186 | 0.1304 | | 1.6763 | 0.9592 | 5.0136 |
| 38 | 0.2188 | 0.1289 | | 1.6971 | 0.9585 | 5.0144 |
| 39 | 0.2161 | 0.1383 | | 1.5624 | 0.9564 | 5.0131 |
| 40 | 0.2135 | 0.1332 | | 1.6023 | 0.9573 | 5.0153 |
| 41 | 0.2100 | 0.1235 | | 1.7006 | 0.9624 | 5.0094 |
| 42 | 0.2117 | 0.1315 | | 1.6097 | 0.9588 | 5.0206 |
| 43 | 0.2093 | 0.1155 | 0.1155 | 1.8111 | 0.9651 | 5.0167 |
| 44 | 0.2132 | 0.1184 | | 1.8007 | 0.9637 | 5.0131 |
| 45 | 0.2050 | 0.1189 | | 1.7248 | 0.9644 | 5.0277 |
| 46 | 0.1989 | 0.1159 | | 1.7154 | 0.9646 | 5.0190 |
| 47 | 0.2067 | 0.1152 | 0.1152 | 1.7943 | 0.9658 | 5.0103 |
| 48 | 0.2028 | 0.1234 | | 1.6436 | 0.9628 | 5.0101 |
| 49 | 0.2019 | 0.1189 | | 1.6984 | 0.9639 | 5.0116 |
| 50 | 0.2014 | 0.1085 | 0.1085 | 1.8566 | 0.9674 | 5.0067 |
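
The `train/val` and `valid best` columns in the log above follow from the two loss columns. As a rough sketch (not the library's own code), the ratio is train loss divided by validation loss, and `valid best` is filled only when the validation loss drops below every earlier value. The rows are copied from epochs 9 to 12; `summarize` is a hypothetical helper, and because the inputs are already rounded to four decimals, the recomputed ratios can differ from the printed ones in the last digit.

```python
# (epoch, train loss, valid loss) copied from epochs 9-12 of the log.
ROWS = [
    (9, 0.3807, 0.2069),
    (10, 0.3614, 0.2574),
    (11, 0.3424, 0.1959),
    (12, 0.3238, 0.2764),
]


def summarize(rows):
    """Return (epoch, train/val ratio, is_new_best) for each row."""
    best = float("inf")
    summary = []
    for epoch, train_loss, valid_loss in rows:
        is_new_best = valid_loss < best
        if is_new_best:
            best = valid_loss
        summary.append((epoch, train_loss / valid_loss, is_new_best))
    return summary


summary = summarize(ROWS)
for epoch, ratio, is_new_best in summary:
    marker = "new best" if is_new_best else ""
    print(f"epoch {epoch}: train/val = {ratio:.3f} {marker}")
```

A ratio above 1 throughout the run means the training loss stays higher than the validation loss, which is what one would expect if dropout is active only during training.
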
Out[14]:
NeuralNet(X_tensor_type=<function tensor4 at 0x7efc0405e6e0>,
batch_iterator_test=<nolearn.lasagne.base.BatchIterator object at 0x7efbe9d726d0>,
batch_iterator_train=<nolearn.lasagne.base.BatchIterator object at 0x7efbe9d722d0>,
conv1_filter_size=(4, 4),
conv1_nonlinearity=<function rectify at 0x7efbeb794398>,
conv1_num_filters=4, conv2_filter_size=(3, 3),
conv2_nonlinearity=<function rectify at 0x7efbeb794398>,
conv2_num_filters=8, conv3_filter_size=(2, 2),
conv3_nonlinearity=<function rectify at 0x7efbeb794398>,
conv3_num_filters=16, custom_score=None, dropout1_p=0.1,
dropout2_p=0.1, dropout3_p=0.1, eval_size=0.2, hidden4_num_units=100,
hidden5_num_units=100, input_shape=(None, 1, 28, 28),
layers=[('input', <class 'lasagne.layers.input.InputLayer'>), ('conv1', <class 'lasagne.layers.conv.Conv2DLayer'>), ('pool1', <class 'lasagne.layers.pool.MaxPool2DLayer'>), ('dropout1', <class 'lasagne.layers.noise.DropoutLayer'>), ('conv2', <class 'lasagne.layers.conv.Conv2DLayer'>), ('pool2', <cla..., <class 'lasagne.layers.dense.DenseLayer'>), ('output', <class 'lasagne.layers.dense.DenseLayer'>)],
loss=None, max_epochs=50, more_params={},
objective=<class 'lasagne.objectives.Objective'>,
objective_loss_function=<function categorical_crossentropy at 0x7efc02b22758>,
on_epoch_finished=[<__main__.AdjustVariable object at 0x7efbea5c42d0>, <__main__.AdjustVariable object at 0x7efbe2e43f50>],
on_training_finished=(),
output_nonlinearity=<function softmax at 0x7efbeb7942a8>,
output_num_units=10, pool1_pool_size=(3, 3), pool2_pool_size=(2, 2),
pool3_pool_size=(2, 2), regression=False,
update=<function nesterov_momentum at 0x7efbeac0be60>,
update_learning_rate=<CudaNdarrayType(float32, scalar)>,
update_momentum=<CudaNdarrayType(float32, scalar)>,
use_label_encoder=False, verbose=1,
y_tensor_type=TensorType(int32, vector))
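
The repr above lists two `AdjustVariable` objects under `on_epoch_finished`, but their definition does not appear in this output. One common pattern for such a per-epoch hook is a linear schedule that moves a hyperparameter from a start value to a stop value over `max_epochs`. The sketch below illustrates that pattern in plain Python; the class name `LinearSchedule`, the `params` dictionary standing in for the shared variables, and the start and stop values are all assumptions rather than values taken from the run.

```python
class LinearSchedule:
    """Hypothetical per-epoch hook: moves a named value linearly from start to stop."""

    def __init__(self, name, start, stop):
        self.name = name
        self.start = start
        self.stop = stop

    def value_at(self, epoch, max_epochs):
        # Epochs are 1-based, matching the training log.
        if max_epochs <= 1:
            return self.stop
        fraction = (epoch - 1) / (max_epochs - 1)
        return self.start + fraction * (self.stop - self.start)

    def __call__(self, params, epoch, max_epochs):
        params[self.name] = self.value_at(epoch, max_epochs)


MAX_EPOCHS = 50  # from max_epochs=50 in the repr

# Start and stop values are illustrative assumptions, not read from the output.
hooks = [
    LinearSchedule("update_learning_rate", 0.03, 0.0001),
    LinearSchedule("update_momentum", 0.9, 0.999),
]

params = {}
history = []
for epoch in range(1, MAX_EPOCHS + 1):
    for hook in hooks:
        hook(params, epoch, MAX_EPOCHS)
    history.append(dict(params))

print(history[0], history[-1])
```

In the actual run the two hyperparameters are Theano shared scalars (`CudaNdarrayType(float32, scalar)`), so a real hook would write to those variables instead of a dictionary.
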