Brewing Logistic Regression then Going Deeper

While Caffe is made for deep networks, it can likewise represent "shallow" models like logistic regression for classification. We'll do simple logistic regression on synthetic data, which we'll generate and save to HDF5 in order to feed vectors to Caffe. Once that model is done, we'll add layers to improve accuracy. That's what Caffe is about: define a model, experiment, and then deploy.


In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

import os
os.chdir('..')

import sys
sys.path.insert(0, './python')
import caffe


import h5py
import shutil
import tempfile

import sklearn
import sklearn.cross_validation  # train_test_split (sklearn.model_selection in newer versions)
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics

import pandas as pd

Synthesize a dataset of 10,000 4-vectors for binary classification with 2 informative features and 2 noise features.


In [2]:
X, y = sklearn.datasets.make_classification(
    n_samples=10000, n_features=4, n_redundant=0, n_informative=2, 
    n_clusters_per_class=2, hypercube=False, random_state=0
)

# Split into train and test
X, Xt, y, yt = sklearn.cross_validation.train_test_split(X, y)

# Visualize a sample of the data.
# (In newer pandas, scatter_matrix lives at pd.plotting.scatter_matrix.)
ind = np.random.permutation(X.shape[0])[:1000]
df = pd.DataFrame(X[ind])
_ = pd.scatter_matrix(df, figsize=(9, 9), diagonal='kde', marker='o', s=40, alpha=.4, c=y[ind])


Learn and evaluate scikit-learn's logistic regression with stochastic gradient descent (SGD) training. Time and check the classifier's accuracy.


In [3]:
%%timeit
# Train and test the scikit-learn SGD logistic regression.
# (Newer scikit-learn renames n_iter to max_iter, class_weight 'auto' to
# 'balanced', and loss 'log' to 'log_loss'.)
clf = sklearn.linear_model.SGDClassifier(
    loss='log', n_iter=1000, penalty='l2', alpha=1e-3, class_weight='auto')

clf.fit(X, y)
yt_pred = clf.predict(Xt)
print('Accuracy: {:.3f}'.format(sklearn.metrics.accuracy_score(yt, yt_pred)))


Accuracy: 0.783
Accuracy: 0.783
Accuracy: 0.783
Accuracy: 0.783
1 loops, best of 3: 508 ms per loop

Save the dataset to HDF5 for loading in Caffe.


In [4]:
# Write out the data to HDF5 files in a scratch directory
# (cleaned up at the end of this notebook).
# This notebook is assumed to be at caffe_root/examples/hdf5_classification.ipynb,
# so the working directory is caffe_root after the os.chdir('..') above.
dirname = os.path.abspath('./examples/hdf5_classification/data')
if not os.path.exists(dirname):
    os.makedirs(dirname)

train_filename = os.path.join(dirname, 'train.h5')
test_filename = os.path.join(dirname, 'test.h5')

# HDF5DataLayer source should be a file containing a list of HDF5 filenames.
# To show this off, we'll list the same data file twice.
with h5py.File(train_filename, 'w') as f:
    f['data'] = X
    f['label'] = y.astype(np.float32)
with open(os.path.join(dirname, 'train.txt'), 'w') as f:
    f.write(train_filename + '\n')
    f.write(train_filename + '\n')
    
# HDF5 is pretty efficient, but can be further compressed.
comp_kwargs = {'compression': 'gzip', 'compression_opts': 1}
with h5py.File(test_filename, 'w') as f:
    f.create_dataset('data', data=Xt, **comp_kwargs)
    f.create_dataset('label', data=yt.astype(np.float32), **comp_kwargs)
with open(os.path.join(dirname, 'test.txt'), 'w') as f:
    f.write(test_filename + '\n')
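
As a quick sanity check, we can read the files back with h5py and confirm the shapes and dtypes the HDF5Data layer will load (data should be an N x 4 array and label a length-N float32 vector):

# Sanity check: read the training file back and confirm what the
# HDF5Data layer will see.
with h5py.File(train_filename, 'r') as f:
    print('data: ', f['data'].shape, f['data'].dtype)
    print('label:', f['label'].shape, f['label'].dtype)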

Let's define logistic regression in Caffe through Python net specification. This is a quick and natural way to define nets that sidesteps manually editing the protobuf model.


In [5]:
from caffe import layers as L
from caffe import params as P

def logreg(hdf5, batch_size):
    # logistic regression: data, matrix multiplication, and 2-class softmax loss
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=2, weight_filler=dict(type='xavier'))
    n.accuracy = L.Accuracy(n.ip1, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip1, n.label)
    return n.to_proto()
    
with open('examples/hdf5_classification/logreg_auto_train.prototxt', 'w') as f:
    f.write(str(logreg('examples/hdf5_classification/data/train.txt', 10)))
    
with open('examples/hdf5_classification/logreg_auto_test.prototxt', 'w') as f:
    f.write(str(logreg('examples/hdf5_classification/data/test.txt', 10)))
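
Since NetSpec's to_proto() returns a plain protobuf message, str() yields the text prototxt we just wrote to disk; printing it is an easy way to inspect the generated net (it should match the layer blocks echoed in the command-line log further down):

# Inspect the generated train net definition.
print(str(logreg('examples/hdf5_classification/data/train.txt', 10)))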

Time to learn and evaluate our Caffeinated logistic regression in Python.


In [6]:
%%timeit
caffe.set_mode_cpu()
solver = caffe.get_solver('examples/hdf5_classification/solver.prototxt')
solver.solve()

accuracy = 0
batch_size = solver.test_nets[0].blobs['data'].num
test_iters = int(len(Xt) / batch_size)
for i in range(test_iters):
    solver.test_nets[0].forward()
    accuracy += solver.test_nets[0].blobs['accuracy'].data
accuracy /= test_iters

print("Accuracy: {:.3f}".format(accuracy))


Accuracy: 0.782
Accuracy: 0.782
Accuracy: 0.782
Accuracy: 0.782
1 loops, best of 3: 287 ms per loop
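
The solver.prototxt loaded here isn't shown in a notebook cell, but the command-line run below echoes it in full; copied from that log, it reads:

train_net: "examples/hdf5_classification/logreg_auto_train.prototxt"
test_net: "examples/hdf5_classification/logreg_auto_test.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: CPU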

Do the same through the command line interface for detailed output on the model and solving.


In [7]:
!./build/tools/caffe train -solver examples/hdf5_classification/solver.prototxt


I0318 00:58:32.322571 2013098752 caffe.cpp:117] Use CPU.
I0318 00:58:32.643163 2013098752 caffe.cpp:121] Starting Optimization
I0318 00:58:32.643229 2013098752 solver.cpp:32] Initializing solver from parameters: 
train_net: "examples/hdf5_classification/logreg_auto_train.prototxt"
test_net: "examples/hdf5_classification/logreg_auto_test.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: CPU
I0318 00:58:32.643333 2013098752 solver.cpp:61] Creating training net from train_net file: examples/hdf5_classification/logreg_auto_train.prototxt
I0318 00:58:32.643465 2013098752 net.cpp:42] Initializing net from parameters: 
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip1"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip1"
  bottom: "label"
  top: "loss"
}
I0318 00:58:32.644197 2013098752 layer_factory.hpp:74] Creating layer data
I0318 00:58:32.644219 2013098752 net.cpp:84] Creating Layer data
I0318 00:58:32.644230 2013098752 net.cpp:338] data -> data
I0318 00:58:32.644256 2013098752 net.cpp:338] data -> label
I0318 00:58:32.644269 2013098752 net.cpp:113] Setting up data
I0318 00:58:32.644278 2013098752 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I0318 00:58:32.644327 2013098752 hdf5_data_layer.cpp:80] Number of HDF5 files: 2
I0318 00:58:32.646458 2013098752 net.cpp:120] Top shape: 10 4 (40)
I0318 00:58:32.646502 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:32.646518 2013098752 layer_factory.hpp:74] Creating layer label_data_1_split
I0318 00:58:32.646538 2013098752 net.cpp:84] Creating Layer label_data_1_split
I0318 00:58:32.646546 2013098752 net.cpp:380] label_data_1_split <- label
I0318 00:58:32.646556 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_0
I0318 00:58:32.646569 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_1
I0318 00:58:32.646579 2013098752 net.cpp:113] Setting up label_data_1_split
I0318 00:58:32.646586 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:32.646595 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:32.646601 2013098752 layer_factory.hpp:74] Creating layer ip1
I0318 00:58:32.646615 2013098752 net.cpp:84] Creating Layer ip1
I0318 00:58:32.646622 2013098752 net.cpp:380] ip1 <- data
I0318 00:58:32.646664 2013098752 net.cpp:338] ip1 -> ip1
I0318 00:58:32.646689 2013098752 net.cpp:113] Setting up ip1
I0318 00:58:32.652330 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:32.652371 2013098752 layer_factory.hpp:74] Creating layer ip1_ip1_0_split
I0318 00:58:32.652393 2013098752 net.cpp:84] Creating Layer ip1_ip1_0_split
I0318 00:58:32.652407 2013098752 net.cpp:380] ip1_ip1_0_split <- ip1
I0318 00:58:32.652421 2013098752 net.cpp:338] ip1_ip1_0_split -> ip1_ip1_0_split_0
I0318 00:58:32.652467 2013098752 net.cpp:338] ip1_ip1_0_split -> ip1_ip1_0_split_1
I0318 00:58:32.652480 2013098752 net.cpp:113] Setting up ip1_ip1_0_split
I0318 00:58:32.652489 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:32.652498 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:32.652505 2013098752 layer_factory.hpp:74] Creating layer accuracy
I0318 00:58:32.652521 2013098752 net.cpp:84] Creating Layer accuracy
I0318 00:58:32.652534 2013098752 net.cpp:380] accuracy <- ip1_ip1_0_split_0
I0318 00:58:32.652545 2013098752 net.cpp:380] accuracy <- label_data_1_split_0
I0318 00:58:32.652562 2013098752 net.cpp:338] accuracy -> accuracy
I0318 00:58:32.652577 2013098752 net.cpp:113] Setting up accuracy
I0318 00:58:32.652590 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:32.652642 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:32.652655 2013098752 net.cpp:84] Creating Layer loss
I0318 00:58:32.652663 2013098752 net.cpp:380] loss <- ip1_ip1_0_split_1
I0318 00:58:32.652672 2013098752 net.cpp:380] loss <- label_data_1_split_1
I0318 00:58:32.652679 2013098752 net.cpp:338] loss -> loss
I0318 00:58:32.652689 2013098752 net.cpp:113] Setting up loss
I0318 00:58:32.652701 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:32.652716 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:32.652724 2013098752 net.cpp:122]     with loss weight 1
I0318 00:58:32.652740 2013098752 net.cpp:167] loss needs backward computation.
I0318 00:58:32.652746 2013098752 net.cpp:169] accuracy does not need backward computation.
I0318 00:58:32.652753 2013098752 net.cpp:167] ip1_ip1_0_split needs backward computation.
I0318 00:58:32.652760 2013098752 net.cpp:167] ip1 needs backward computation.
I0318 00:58:32.652786 2013098752 net.cpp:169] label_data_1_split does not need backward computation.
I0318 00:58:32.652801 2013098752 net.cpp:169] data does not need backward computation.
I0318 00:58:32.652808 2013098752 net.cpp:205] This network produces output accuracy
I0318 00:58:32.652815 2013098752 net.cpp:205] This network produces output loss
I0318 00:58:32.652825 2013098752 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0318 00:58:32.652833 2013098752 net.cpp:217] Network initialization done.
I0318 00:58:32.652839 2013098752 net.cpp:218] Memory required for data: 528
I0318 00:58:32.652964 2013098752 solver.cpp:154] Creating test net (#0) specified by test_net file: examples/hdf5_classification/logreg_auto_test.prototxt
I0318 00:58:32.652986 2013098752 net.cpp:42] Initializing net from parameters: 
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/test.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip1"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip1"
  bottom: "label"
  top: "loss"
}
I0318 00:58:32.653069 2013098752 layer_factory.hpp:74] Creating layer data
I0318 00:58:32.653080 2013098752 net.cpp:84] Creating Layer data
I0318 00:58:32.653090 2013098752 net.cpp:338] data -> data
I0318 00:58:32.653128 2013098752 net.cpp:338] data -> label
I0318 00:58:32.653146 2013098752 net.cpp:113] Setting up data
I0318 00:58:32.653154 2013098752 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: examples/hdf5_classification/data/test.txt
I0318 00:58:32.653192 2013098752 hdf5_data_layer.cpp:80] Number of HDF5 files: 1
I0318 00:58:32.654850 2013098752 net.cpp:120] Top shape: 10 4 (40)
I0318 00:58:32.654897 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:32.654914 2013098752 layer_factory.hpp:74] Creating layer label_data_1_split
I0318 00:58:32.654933 2013098752 net.cpp:84] Creating Layer label_data_1_split
I0318 00:58:32.654943 2013098752 net.cpp:380] label_data_1_split <- label
I0318 00:58:32.654953 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_0
I0318 00:58:32.654966 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_1
I0318 00:58:32.654976 2013098752 net.cpp:113] Setting up label_data_1_split
I0318 00:58:32.654985 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:32.654992 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:32.655000 2013098752 layer_factory.hpp:74] Creating layer ip1
I0318 00:58:32.655010 2013098752 net.cpp:84] Creating Layer ip1
I0318 00:58:32.655017 2013098752 net.cpp:380] ip1 <- data
I0318 00:58:32.655030 2013098752 net.cpp:338] ip1 -> ip1
I0318 00:58:32.655041 2013098752 net.cpp:113] Setting up ip1
I0318 00:58:32.655061 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:32.655072 2013098752 layer_factory.hpp:74] Creating layer ip1_ip1_0_split
I0318 00:58:32.655148 2013098752 net.cpp:84] Creating Layer ip1_ip1_0_split
I0318 00:58:32.655159 2013098752 net.cpp:380] ip1_ip1_0_split <- ip1
I0318 00:58:32.655170 2013098752 net.cpp:338] ip1_ip1_0_split -> ip1_ip1_0_split_0
I0318 00:58:32.655180 2013098752 net.cpp:338] ip1_ip1_0_split -> ip1_ip1_0_split_1
I0318 00:58:32.655190 2013098752 net.cpp:113] Setting up ip1_ip1_0_split
I0318 00:58:32.655199 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:32.655206 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:32.655213 2013098752 layer_factory.hpp:74] Creating layer accuracy
I0318 00:58:32.655223 2013098752 net.cpp:84] Creating Layer accuracy
I0318 00:58:32.655230 2013098752 net.cpp:380] accuracy <- ip1_ip1_0_split_0
I0318 00:58:32.655237 2013098752 net.cpp:380] accuracy <- label_data_1_split_0
I0318 00:58:32.655251 2013098752 net.cpp:338] accuracy -> accuracy
I0318 00:58:32.655259 2013098752 net.cpp:113] Setting up accuracy
I0318 00:58:32.655267 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:32.655340 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:32.655354 2013098752 net.cpp:84] Creating Layer loss
I0318 00:58:32.655361 2013098752 net.cpp:380] loss <- ip1_ip1_0_split_1
I0318 00:58:32.655369 2013098752 net.cpp:380] loss <- label_data_1_split_1
I0318 00:58:32.655378 2013098752 net.cpp:338] loss -> loss
I0318 00:58:32.655388 2013098752 net.cpp:113] Setting up loss
I0318 00:58:32.655397 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:32.655414 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:32.655422 2013098752 net.cpp:122]     with loss weight 1
I0318 00:58:32.655438 2013098752 net.cpp:167] loss needs backward computation.
I0318 00:58:32.655446 2013098752 net.cpp:169] accuracy does not need backward computation.
I0318 00:58:32.655455 2013098752 net.cpp:167] ip1_ip1_0_split needs backward computation.
I0318 00:58:32.655462 2013098752 net.cpp:167] ip1 needs backward computation.
I0318 00:58:32.655469 2013098752 net.cpp:169] label_data_1_split does not need backward computation.
I0318 00:58:32.655477 2013098752 net.cpp:169] data does not need backward computation.
I0318 00:58:32.655483 2013098752 net.cpp:205] This network produces output accuracy
I0318 00:58:32.655489 2013098752 net.cpp:205] This network produces output loss
I0318 00:58:32.655503 2013098752 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0318 00:58:32.655511 2013098752 net.cpp:217] Network initialization done.
I0318 00:58:32.655517 2013098752 net.cpp:218] Memory required for data: 528
I0318 00:58:32.655547 2013098752 solver.cpp:42] Solver scaffolding done.
I0318 00:58:32.655567 2013098752 solver.cpp:222] Solving 
I0318 00:58:32.655575 2013098752 solver.cpp:223] Learning Rate Policy: step
I0318 00:58:32.655583 2013098752 solver.cpp:266] Iteration 0, Testing net (#0)
I0318 00:58:32.683643 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.3736
I0318 00:58:32.683686 2013098752 solver.cpp:315]     Test net output #1: loss = 1.00555 (* 1 = 1.00555 loss)
I0318 00:58:32.683846 2013098752 solver.cpp:189] Iteration 0, loss = 0.869394
I0318 00:58:32.683861 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.3
I0318 00:58:32.683871 2013098752 solver.cpp:204]     Train net output #1: loss = 0.869394 (* 1 = 0.869394 loss)
I0318 00:58:32.683883 2013098752 solver.cpp:464] Iteration 0, lr = 0.01
I0318 00:58:32.698721 2013098752 solver.cpp:266] Iteration 1000, Testing net (#0)
I0318 00:58:32.701917 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7848
I0318 00:58:32.701961 2013098752 solver.cpp:315]     Test net output #1: loss = 0.590972 (* 1 = 0.590972 loss)
I0318 00:58:32.702014 2013098752 solver.cpp:189] Iteration 1000, loss = 0.54742
I0318 00:58:32.702029 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.7
I0318 00:58:32.702041 2013098752 solver.cpp:204]     Train net output #1: loss = 0.54742 (* 1 = 0.54742 loss)
I0318 00:58:32.702051 2013098752 solver.cpp:464] Iteration 1000, lr = 0.01
I0318 00:58:32.718360 2013098752 solver.cpp:266] Iteration 2000, Testing net (#0)
I0318 00:58:32.721529 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7696
I0318 00:58:32.721562 2013098752 solver.cpp:315]     Test net output #1: loss = 0.593946 (* 1 = 0.593946 loss)
I0318 00:58:32.721593 2013098752 solver.cpp:189] Iteration 2000, loss = 0.729569
I0318 00:58:32.721603 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.5
I0318 00:58:32.721613 2013098752 solver.cpp:204]     Train net output #1: loss = 0.729569 (* 1 = 0.729569 loss)
I0318 00:58:32.721622 2013098752 solver.cpp:464] Iteration 2000, lr = 0.01
I0318 00:58:32.740182 2013098752 solver.cpp:266] Iteration 3000, Testing net (#0)
I0318 00:58:32.743494 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.77
I0318 00:58:32.743544 2013098752 solver.cpp:315]     Test net output #1: loss = 0.591229 (* 1 = 0.591229 loss)
I0318 00:58:32.744209 2013098752 solver.cpp:189] Iteration 3000, loss = 0.406097
I0318 00:58:32.744231 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.8
I0318 00:58:32.744249 2013098752 solver.cpp:204]     Train net output #1: loss = 0.406096 (* 1 = 0.406096 loss)
I0318 00:58:32.744266 2013098752 solver.cpp:464] Iteration 3000, lr = 0.01
I0318 00:58:32.764135 2013098752 solver.cpp:266] Iteration 4000, Testing net (#0)
I0318 00:58:32.769110 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7848
I0318 00:58:32.769170 2013098752 solver.cpp:315]     Test net output #1: loss = 0.590972 (* 1 = 0.590972 loss)
I0318 00:58:32.769223 2013098752 solver.cpp:189] Iteration 4000, loss = 0.54742
I0318 00:58:32.769242 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.7
I0318 00:58:32.769255 2013098752 solver.cpp:204]     Train net output #1: loss = 0.54742 (* 1 = 0.54742 loss)
I0318 00:58:32.769265 2013098752 solver.cpp:464] Iteration 4000, lr = 0.01
I0318 00:58:32.785846 2013098752 solver.cpp:266] Iteration 5000, Testing net (#0)
I0318 00:58:32.788722 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7696
I0318 00:58:32.788751 2013098752 solver.cpp:315]     Test net output #1: loss = 0.593946 (* 1 = 0.593946 loss)
I0318 00:58:32.788811 2013098752 solver.cpp:189] Iteration 5000, loss = 0.72957
I0318 00:58:32.788833 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.5
I0318 00:58:32.788846 2013098752 solver.cpp:204]     Train net output #1: loss = 0.729569 (* 1 = 0.729569 loss)
I0318 00:58:32.788856 2013098752 solver.cpp:464] Iteration 5000, lr = 0.001
I0318 00:58:32.804762 2013098752 solver.cpp:266] Iteration 6000, Testing net (#0)
I0318 00:58:32.808061 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7856
I0318 00:58:32.808112 2013098752 solver.cpp:315]     Test net output #1: loss = 0.59028 (* 1 = 0.59028 loss)
I0318 00:58:32.808732 2013098752 solver.cpp:189] Iteration 6000, loss = 0.415444
I0318 00:58:32.808753 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:32.808773 2013098752 solver.cpp:204]     Train net output #1: loss = 0.415444 (* 1 = 0.415444 loss)
I0318 00:58:32.808786 2013098752 solver.cpp:464] Iteration 6000, lr = 0.001
I0318 00:58:32.827118 2013098752 solver.cpp:266] Iteration 7000, Testing net (#0)
I0318 00:58:32.831614 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7848
I0318 00:58:32.831657 2013098752 solver.cpp:315]     Test net output #1: loss = 0.589454 (* 1 = 0.589454 loss)
I0318 00:58:32.831707 2013098752 solver.cpp:189] Iteration 7000, loss = 0.538038
I0318 00:58:32.831728 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.8
I0318 00:58:32.831745 2013098752 solver.cpp:204]     Train net output #1: loss = 0.538037 (* 1 = 0.538037 loss)
I0318 00:58:32.831759 2013098752 solver.cpp:464] Iteration 7000, lr = 0.001
I0318 00:58:32.849634 2013098752 solver.cpp:266] Iteration 8000, Testing net (#0)
I0318 00:58:32.852712 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7796
I0318 00:58:32.852748 2013098752 solver.cpp:315]     Test net output #1: loss = 0.589365 (* 1 = 0.589365 loss)
I0318 00:58:32.852792 2013098752 solver.cpp:189] Iteration 8000, loss = 0.684219
I0318 00:58:32.852840 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.5
I0318 00:58:32.852852 2013098752 solver.cpp:204]     Train net output #1: loss = 0.684219 (* 1 = 0.684219 loss)
I0318 00:58:32.852861 2013098752 solver.cpp:464] Iteration 8000, lr = 0.001
I0318 00:58:32.868440 2013098752 solver.cpp:266] Iteration 9000, Testing net (#0)
I0318 00:58:32.871438 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.7816
I0318 00:58:32.871461 2013098752 solver.cpp:315]     Test net output #1: loss = 0.589656 (* 1 = 0.589656 loss)
I0318 00:58:32.872109 2013098752 solver.cpp:189] Iteration 9000, loss = 0.421879
I0318 00:58:32.872131 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:32.872143 2013098752 solver.cpp:204]     Train net output #1: loss = 0.421879 (* 1 = 0.421879 loss)
I0318 00:58:32.872153 2013098752 solver.cpp:464] Iteration 9000, lr = 0.001
I0318 00:58:32.889981 2013098752 solver.cpp:334] Snapshotting to examples/hdf5_classification/data/train_iter_10000.caffemodel
I0318 00:58:32.890224 2013098752 solver.cpp:342] Snapshotting solver state to examples/hdf5_classification/data/train_iter_10000.solverstate
I0318 00:58:32.890362 2013098752 solver.cpp:248] Iteration 10000, loss = 0.538933
I0318 00:58:32.890380 2013098752 solver.cpp:266] Iteration 10000, Testing net (#0)
I0318 00:58:32.893728 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.782
I0318 00:58:32.893757 2013098752 solver.cpp:315]     Test net output #1: loss = 0.589366 (* 1 = 0.589366 loss)
I0318 00:58:32.893775 2013098752 solver.cpp:253] Optimization Done.
I0318 00:58:32.893786 2013098752 caffe.cpp:134] Optimization Done.

If you look at the output above or at logreg_auto_train.prototxt, you'll see that the model is simple logistic regression. We can make it a little more advanced by introducing a non-linearity between the weights that take the input and the weights that give the output -- now we have a two-layer network. That network is given in nonlinear_auto_train.prototxt, and that net is the only change made in nonlinear_solver.prototxt, which we will now use.

The final accuracy of the new network should be higher than that of logistic regression!
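
To make the change concrete, here is a minimal numpy sketch of the two forward passes (the weight names W, b, W1, etc. are placeholders, not blobs from the trained nets); in both cases Caffe then applies the softmax loss to the 2 class scores:

import numpy as np

def logreg_forward(x, W, b):
    # logistic regression: one linear map from 4 features to 2 class scores
    return W.dot(x) + b

def two_layer_forward(x, W1, b1, W2, b2):
    # two-layer net: 4 features -> 40 hidden units -> ReLU -> 2 class scores
    h = np.maximum(0, W1.dot(x) + b1)  # the ReLU non-linearity
    return W2.dot(h) + b2

# Example with random placeholder weights:
x = np.random.randn(4)
W1, b1 = np.random.randn(40, 4), np.zeros(40)
W2, b2 = np.random.randn(2, 40), np.zeros(2)
print(two_layer_forward(x, W1, b1, W2, b2))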


In [8]:
from caffe import layers as L
from caffe import params as P

def nonlinear_net(hdf5, batch_size):
    # one small nonlinearity, one leap for model kind
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    # define a hidden layer of dimension 40
    n.ip1 = L.InnerProduct(n.data, num_output=40, weight_filler=dict(type='xavier'))
    # transform the output through the ReLU (rectified linear) non-linearity
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    # score the (now non-linear) features
    n.ip2 = L.InnerProduct(n.ip1, num_output=2, weight_filler=dict(type='xavier'))
    # same accuracy and loss as before
    n.accuracy = L.Accuracy(n.ip2, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()
    
with open('examples/hdf5_classification/nonlinear_auto_train.prototxt', 'w') as f:
    f.write(str(nonlinear_net('examples/hdf5_classification/data/train.txt', 10)))
    
with open('examples/hdf5_classification/nonlinear_auto_test.prototxt', 'w') as f:
    f.write(str(nonlinear_net('examples/hdf5_classification/data/test.txt', 10)))

In [9]:
%%timeit
caffe.set_mode_cpu()
solver = caffe.get_solver('examples/hdf5_classification/nonlinear_solver.prototxt')
solver.solve()

accuracy = 0
batch_size = solver.test_nets[0].blobs['data'].num
test_iters = int(len(Xt) / batch_size)
for i in range(test_iters):
    solver.test_nets[0].forward()
    accuracy += solver.test_nets[0].blobs['accuracy'].data
accuracy /= test_iters

print("Accuracy: {:.3f}".format(accuracy))


Accuracy: 0.832
Accuracy: 0.832
Accuracy: 0.832
Accuracy: 0.831
1 loops, best of 3: 386 ms per loop
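
Since %%timeit discards the variables it creates, the trained solver above isn't available afterwards; a minimal sketch for inspecting the learned parameters is to repeat the solve in a regular cell and read the blobs off solver.net.params:

caffe.set_mode_cpu()
solver = caffe.get_solver('examples/hdf5_classification/nonlinear_solver.prototxt')
solver.solve()
# net.params maps each layer name to its [weights, biases] blob vector
for name, params in solver.net.params.items():
    print(name, params[0].data.shape, params[1].data.shape)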

Do the same through the command line interface for detailed output on the model and solving.


In [10]:
!./build/tools/caffe train -solver examples/hdf5_classification/nonlinear_solver.prototxt


I0318 00:58:43.336922 2013098752 caffe.cpp:117] Use CPU.
I0318 00:58:43.654698 2013098752 caffe.cpp:121] Starting Optimization
I0318 00:58:43.654747 2013098752 solver.cpp:32] Initializing solver from parameters: 
train_net: "examples/hdf5_classification/nonlinear_auto_train.prototxt"
test_net: "examples/hdf5_classification/nonlinear_auto_test.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: CPU
I0318 00:58:43.654855 2013098752 solver.cpp:61] Creating training net from train_net file: examples/hdf5_classification/nonlinear_auto_train.prototxt
I0318 00:58:43.655004 2013098752 net.cpp:42] Initializing net from parameters: 
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0318 00:58:43.655120 2013098752 layer_factory.hpp:74] Creating layer data
I0318 00:58:43.655139 2013098752 net.cpp:84] Creating Layer data
I0318 00:58:43.655264 2013098752 net.cpp:338] data -> data
I0318 00:58:43.655297 2013098752 net.cpp:338] data -> label
I0318 00:58:43.655310 2013098752 net.cpp:113] Setting up data
I0318 00:58:43.655318 2013098752 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I0318 00:58:43.655365 2013098752 hdf5_data_layer.cpp:80] Number of HDF5 files: 2
I0318 00:58:43.657317 2013098752 net.cpp:120] Top shape: 10 4 (40)
I0318 00:58:43.657342 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:43.657356 2013098752 layer_factory.hpp:74] Creating layer label_data_1_split
I0318 00:58:43.657373 2013098752 net.cpp:84] Creating Layer label_data_1_split
I0318 00:58:43.657384 2013098752 net.cpp:380] label_data_1_split <- label
I0318 00:58:43.657395 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_0
I0318 00:58:43.657407 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_1
I0318 00:58:43.657418 2013098752 net.cpp:113] Setting up label_data_1_split
I0318 00:58:43.657426 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:43.657433 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:43.657441 2013098752 layer_factory.hpp:74] Creating layer ip1
I0318 00:58:43.657451 2013098752 net.cpp:84] Creating Layer ip1
I0318 00:58:43.657459 2013098752 net.cpp:380] ip1 <- data
I0318 00:58:43.657467 2013098752 net.cpp:338] ip1 -> ip1
I0318 00:58:43.657479 2013098752 net.cpp:113] Setting up ip1
I0318 00:58:43.662454 2013098752 net.cpp:120] Top shape: 10 40 (400)
I0318 00:58:43.662477 2013098752 layer_factory.hpp:74] Creating layer relu1
I0318 00:58:43.662497 2013098752 net.cpp:84] Creating Layer relu1
I0318 00:58:43.662508 2013098752 net.cpp:380] relu1 <- ip1
I0318 00:58:43.662520 2013098752 net.cpp:327] relu1 -> ip1 (in-place)
I0318 00:58:43.662530 2013098752 net.cpp:113] Setting up relu1
I0318 00:58:43.662539 2013098752 net.cpp:120] Top shape: 10 40 (400)
I0318 00:58:43.662546 2013098752 layer_factory.hpp:74] Creating layer ip2
I0318 00:58:43.662555 2013098752 net.cpp:84] Creating Layer ip2
I0318 00:58:43.662562 2013098752 net.cpp:380] ip2 <- ip1
I0318 00:58:43.662571 2013098752 net.cpp:338] ip2 -> ip2
I0318 00:58:43.662580 2013098752 net.cpp:113] Setting up ip2
I0318 00:58:43.662595 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:43.662606 2013098752 layer_factory.hpp:74] Creating layer ip2_ip2_0_split
I0318 00:58:43.662654 2013098752 net.cpp:84] Creating Layer ip2_ip2_0_split
I0318 00:58:43.662665 2013098752 net.cpp:380] ip2_ip2_0_split <- ip2
I0318 00:58:43.662678 2013098752 net.cpp:338] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0318 00:58:43.662689 2013098752 net.cpp:338] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0318 00:58:43.662698 2013098752 net.cpp:113] Setting up ip2_ip2_0_split
I0318 00:58:43.662706 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:43.662714 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:43.662722 2013098752 layer_factory.hpp:74] Creating layer accuracy
I0318 00:58:43.662734 2013098752 net.cpp:84] Creating Layer accuracy
I0318 00:58:43.662740 2013098752 net.cpp:380] accuracy <- ip2_ip2_0_split_0
I0318 00:58:43.662749 2013098752 net.cpp:380] accuracy <- label_data_1_split_0
I0318 00:58:43.662756 2013098752 net.cpp:338] accuracy -> accuracy
I0318 00:58:43.662766 2013098752 net.cpp:113] Setting up accuracy
I0318 00:58:43.662818 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:43.662827 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:43.662839 2013098752 net.cpp:84] Creating Layer loss
I0318 00:58:43.662847 2013098752 net.cpp:380] loss <- ip2_ip2_0_split_1
I0318 00:58:43.662854 2013098752 net.cpp:380] loss <- label_data_1_split_1
I0318 00:58:43.662863 2013098752 net.cpp:338] loss -> loss
I0318 00:58:43.662873 2013098752 net.cpp:113] Setting up loss
I0318 00:58:43.662883 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:43.662901 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:43.662909 2013098752 net.cpp:122]     with loss weight 1
I0318 00:58:43.662922 2013098752 net.cpp:167] loss needs backward computation.
I0318 00:58:43.662930 2013098752 net.cpp:169] accuracy does not need backward computation.
I0318 00:58:43.662936 2013098752 net.cpp:167] ip2_ip2_0_split needs backward computation.
I0318 00:58:43.662942 2013098752 net.cpp:167] ip2 needs backward computation.
I0318 00:58:43.662976 2013098752 net.cpp:167] relu1 needs backward computation.
I0318 00:58:43.662988 2013098752 net.cpp:167] ip1 needs backward computation.
I0318 00:58:43.662997 2013098752 net.cpp:169] label_data_1_split does not need backward computation.
I0318 00:58:43.663003 2013098752 net.cpp:169] data does not need backward computation.
I0318 00:58:43.663009 2013098752 net.cpp:205] This network produces output accuracy
I0318 00:58:43.663017 2013098752 net.cpp:205] This network produces output loss
I0318 00:58:43.663028 2013098752 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0318 00:58:43.663035 2013098752 net.cpp:217] Network initialization done.
I0318 00:58:43.663041 2013098752 net.cpp:218] Memory required for data: 3728
I0318 00:58:43.663158 2013098752 solver.cpp:154] Creating test net (#0) specified by test_net file: examples/hdf5_classification/nonlinear_auto_test.prototxt
I0318 00:58:43.663179 2013098752 net.cpp:42] Initializing net from parameters: 
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/test.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0318 00:58:43.663349 2013098752 layer_factory.hpp:74] Creating layer data
I0318 00:58:43.663365 2013098752 net.cpp:84] Creating Layer data
I0318 00:58:43.663373 2013098752 net.cpp:338] data -> data
I0318 00:58:43.663385 2013098752 net.cpp:338] data -> label
I0318 00:58:43.663396 2013098752 net.cpp:113] Setting up data
I0318 00:58:43.663422 2013098752 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: examples/hdf5_classification/data/test.txt
I0318 00:58:43.663457 2013098752 hdf5_data_layer.cpp:80] Number of HDF5 files: 1
I0318 00:58:43.664719 2013098752 net.cpp:120] Top shape: 10 4 (40)
I0318 00:58:43.664739 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:43.664754 2013098752 layer_factory.hpp:74] Creating layer label_data_1_split
I0318 00:58:43.664772 2013098752 net.cpp:84] Creating Layer label_data_1_split
I0318 00:58:43.664783 2013098752 net.cpp:380] label_data_1_split <- label
I0318 00:58:43.664791 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_0
I0318 00:58:43.664803 2013098752 net.cpp:338] label_data_1_split -> label_data_1_split_1
I0318 00:58:43.664813 2013098752 net.cpp:113] Setting up label_data_1_split
I0318 00:58:43.664822 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:43.664829 2013098752 net.cpp:120] Top shape: 10 (10)
I0318 00:58:43.664837 2013098752 layer_factory.hpp:74] Creating layer ip1
I0318 00:58:43.664846 2013098752 net.cpp:84] Creating Layer ip1
I0318 00:58:43.664854 2013098752 net.cpp:380] ip1 <- data
I0318 00:58:43.664862 2013098752 net.cpp:338] ip1 -> ip1
I0318 00:58:43.664875 2013098752 net.cpp:113] Setting up ip1
I0318 00:58:43.664901 2013098752 net.cpp:120] Top shape: 10 40 (400)
I0318 00:58:43.664924 2013098752 layer_factory.hpp:74] Creating layer relu1
I0318 00:58:43.664945 2013098752 net.cpp:84] Creating Layer relu1
I0318 00:58:43.664958 2013098752 net.cpp:380] relu1 <- ip1
I0318 00:58:43.664966 2013098752 net.cpp:327] relu1 -> ip1 (in-place)
I0318 00:58:43.664975 2013098752 net.cpp:113] Setting up relu1
I0318 00:58:43.664983 2013098752 net.cpp:120] Top shape: 10 40 (400)
I0318 00:58:43.664990 2013098752 layer_factory.hpp:74] Creating layer ip2
I0318 00:58:43.665000 2013098752 net.cpp:84] Creating Layer ip2
I0318 00:58:43.665006 2013098752 net.cpp:380] ip2 <- ip1
I0318 00:58:43.665015 2013098752 net.cpp:338] ip2 -> ip2
I0318 00:58:43.665030 2013098752 net.cpp:113] Setting up ip2
I0318 00:58:43.665052 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:43.665066 2013098752 layer_factory.hpp:74] Creating layer ip2_ip2_0_split
I0318 00:58:43.665077 2013098752 net.cpp:84] Creating Layer ip2_ip2_0_split
I0318 00:58:43.665086 2013098752 net.cpp:380] ip2_ip2_0_split <- ip2
I0318 00:58:43.665093 2013098752 net.cpp:338] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0318 00:58:43.665103 2013098752 net.cpp:338] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0318 00:58:43.665113 2013098752 net.cpp:113] Setting up ip2_ip2_0_split
I0318 00:58:43.665122 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:43.665128 2013098752 net.cpp:120] Top shape: 10 2 (20)
I0318 00:58:43.665137 2013098752 layer_factory.hpp:74] Creating layer accuracy
I0318 00:58:43.665144 2013098752 net.cpp:84] Creating Layer accuracy
I0318 00:58:43.665153 2013098752 net.cpp:380] accuracy <- ip2_ip2_0_split_0
I0318 00:58:43.665168 2013098752 net.cpp:380] accuracy <- label_data_1_split_0
I0318 00:58:43.665180 2013098752 net.cpp:338] accuracy -> accuracy
I0318 00:58:43.665192 2013098752 net.cpp:113] Setting up accuracy
I0318 00:58:43.665200 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:43.665207 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:43.665216 2013098752 net.cpp:84] Creating Layer loss
I0318 00:58:43.665223 2013098752 net.cpp:380] loss <- ip2_ip2_0_split_1
I0318 00:58:43.665230 2013098752 net.cpp:380] loss <- label_data_1_split_1
I0318 00:58:43.665241 2013098752 net.cpp:338] loss -> loss
I0318 00:58:43.665251 2013098752 net.cpp:113] Setting up loss
I0318 00:58:43.665259 2013098752 layer_factory.hpp:74] Creating layer loss
I0318 00:58:43.665273 2013098752 net.cpp:120] Top shape: (1)
I0318 00:58:43.665282 2013098752 net.cpp:122]     with loss weight 1
I0318 00:58:43.665290 2013098752 net.cpp:167] loss needs backward computation.
I0318 00:58:43.665338 2013098752 net.cpp:169] accuracy does not need backward computation.
I0318 00:58:43.665351 2013098752 net.cpp:167] ip2_ip2_0_split needs backward computation.
I0318 00:58:43.665380 2013098752 net.cpp:167] ip2 needs backward computation.
I0318 00:58:43.665387 2013098752 net.cpp:167] relu1 needs backward computation.
I0318 00:58:43.665393 2013098752 net.cpp:167] ip1 needs backward computation.
I0318 00:58:43.665400 2013098752 net.cpp:169] label_data_1_split does not need backward computation.
I0318 00:58:43.665407 2013098752 net.cpp:169] data does not need backward computation.
I0318 00:58:43.665415 2013098752 net.cpp:205] This network produces output accuracy
I0318 00:58:43.665421 2013098752 net.cpp:205] This network produces output loss
I0318 00:58:43.665431 2013098752 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0318 00:58:43.665441 2013098752 net.cpp:217] Network initialization done.
I0318 00:58:43.665446 2013098752 net.cpp:218] Memory required for data: 3728
I0318 00:58:43.665534 2013098752 solver.cpp:42] Solver scaffolding done.
I0318 00:58:43.665568 2013098752 solver.cpp:222] Solving 
I0318 00:58:43.665577 2013098752 solver.cpp:223] Learning Rate Policy: step
I0318 00:58:43.665586 2013098752 solver.cpp:266] Iteration 0, Testing net (#0)
I0318 00:58:43.683938 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.5184
I0318 00:58:43.683981 2013098752 solver.cpp:315]     Test net output #1: loss = 0.716141 (* 1 = 0.716141 loss)
I0318 00:58:43.684236 2013098752 solver.cpp:189] Iteration 0, loss = 0.764954
I0318 00:58:43.684267 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.5
I0318 00:58:43.684285 2013098752 solver.cpp:204]     Train net output #1: loss = 0.764954 (* 1 = 0.764954 loss)
I0318 00:58:43.684305 2013098752 solver.cpp:464] Iteration 0, lr = 0.01
I0318 00:58:43.714700 2013098752 solver.cpp:266] Iteration 1000, Testing net (#0)
I0318 00:58:43.721762 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8168
I0318 00:58:43.721818 2013098752 solver.cpp:315]     Test net output #1: loss = 0.434918 (* 1 = 0.434918 loss)
I0318 00:58:43.721899 2013098752 solver.cpp:189] Iteration 1000, loss = 0.282425
I0318 00:58:43.721917 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:43.721932 2013098752 solver.cpp:204]     Train net output #1: loss = 0.282426 (* 1 = 0.282426 loss)
I0318 00:58:43.721942 2013098752 solver.cpp:464] Iteration 1000, lr = 0.01
I0318 00:58:43.750509 2013098752 solver.cpp:266] Iteration 2000, Testing net (#0)
I0318 00:58:43.754590 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8224
I0318 00:58:43.754621 2013098752 solver.cpp:315]     Test net output #1: loss = 0.416874 (* 1 = 0.416874 loss)
I0318 00:58:43.754660 2013098752 solver.cpp:189] Iteration 2000, loss = 0.51988
I0318 00:58:43.754672 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.7
I0318 00:58:43.754683 2013098752 solver.cpp:204]     Train net output #1: loss = 0.51988 (* 1 = 0.51988 loss)
I0318 00:58:43.754690 2013098752 solver.cpp:464] Iteration 2000, lr = 0.01
I0318 00:58:43.782609 2013098752 solver.cpp:266] Iteration 3000, Testing net (#0)
I0318 00:58:43.789728 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8176
I0318 00:58:43.789777 2013098752 solver.cpp:315]     Test net output #1: loss = 0.415907 (* 1 = 0.415907 loss)
I0318 00:58:43.790487 2013098752 solver.cpp:189] Iteration 3000, loss = 0.5093
I0318 00:58:43.790510 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.7
I0318 00:58:43.790530 2013098752 solver.cpp:204]     Train net output #1: loss = 0.509301 (* 1 = 0.509301 loss)
I0318 00:58:43.790544 2013098752 solver.cpp:464] Iteration 3000, lr = 0.01
I0318 00:58:43.817451 2013098752 solver.cpp:266] Iteration 4000, Testing net (#0)
I0318 00:58:43.821740 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8252
I0318 00:58:43.821770 2013098752 solver.cpp:315]     Test net output #1: loss = 0.409124 (* 1 = 0.409124 loss)
I0318 00:58:43.821822 2013098752 solver.cpp:189] Iteration 4000, loss = 0.284815
I0318 00:58:43.821835 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:43.821846 2013098752 solver.cpp:204]     Train net output #1: loss = 0.284815 (* 1 = 0.284815 loss)
I0318 00:58:43.821890 2013098752 solver.cpp:464] Iteration 4000, lr = 0.01
I0318 00:58:43.847015 2013098752 solver.cpp:266] Iteration 5000, Testing net (#0)
I0318 00:58:43.852102 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8256
I0318 00:58:43.852145 2013098752 solver.cpp:315]     Test net output #1: loss = 0.404445 (* 1 = 0.404445 loss)
I0318 00:58:43.852188 2013098752 solver.cpp:189] Iteration 5000, loss = 0.511566
I0318 00:58:43.852200 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.7
I0318 00:58:43.852210 2013098752 solver.cpp:204]     Train net output #1: loss = 0.511566 (* 1 = 0.511566 loss)
I0318 00:58:43.852219 2013098752 solver.cpp:464] Iteration 5000, lr = 0.001
I0318 00:58:43.876060 2013098752 solver.cpp:266] Iteration 6000, Testing net (#0)
I0318 00:58:43.880080 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8328
I0318 00:58:43.880105 2013098752 solver.cpp:315]     Test net output #1: loss = 0.396847 (* 1 = 0.396847 loss)
I0318 00:58:43.880700 2013098752 solver.cpp:189] Iteration 6000, loss = 0.397858
I0318 00:58:43.880718 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:43.880729 2013098752 solver.cpp:204]     Train net output #1: loss = 0.397858 (* 1 = 0.397858 loss)
I0318 00:58:43.880738 2013098752 solver.cpp:464] Iteration 6000, lr = 0.001
I0318 00:58:43.913795 2013098752 solver.cpp:266] Iteration 7000, Testing net (#0)
I0318 00:58:43.917851 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8316
I0318 00:58:43.917876 2013098752 solver.cpp:315]     Test net output #1: loss = 0.398135 (* 1 = 0.398135 loss)
I0318 00:58:43.917956 2013098752 solver.cpp:189] Iteration 7000, loss = 0.243849
I0318 00:58:43.917971 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:43.917989 2013098752 solver.cpp:204]     Train net output #1: loss = 0.243849 (* 1 = 0.243849 loss)
I0318 00:58:43.918002 2013098752 solver.cpp:464] Iteration 7000, lr = 0.001
I0318 00:58:43.943681 2013098752 solver.cpp:266] Iteration 8000, Testing net (#0)
I0318 00:58:43.947589 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8312
I0318 00:58:43.947615 2013098752 solver.cpp:315]     Test net output #1: loss = 0.394763 (* 1 = 0.394763 loss)
I0318 00:58:43.947651 2013098752 solver.cpp:189] Iteration 8000, loss = 0.513399
I0318 00:58:43.947664 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.7
I0318 00:58:43.947674 2013098752 solver.cpp:204]     Train net output #1: loss = 0.513399 (* 1 = 0.513399 loss)
I0318 00:58:43.947682 2013098752 solver.cpp:464] Iteration 8000, lr = 0.001
I0318 00:58:43.973080 2013098752 solver.cpp:266] Iteration 9000, Testing net (#0)
I0318 00:58:43.977033 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.834
I0318 00:58:43.977056 2013098752 solver.cpp:315]     Test net output #1: loss = 0.395663 (* 1 = 0.395663 loss)
I0318 00:58:43.977710 2013098752 solver.cpp:189] Iteration 9000, loss = 0.399341
I0318 00:58:43.977735 2013098752 solver.cpp:204]     Train net output #0: accuracy = 0.9
I0318 00:58:43.977746 2013098752 solver.cpp:204]     Train net output #1: loss = 0.399342 (* 1 = 0.399342 loss)
I0318 00:58:43.977756 2013098752 solver.cpp:464] Iteration 9000, lr = 0.001
I0318 00:58:44.003437 2013098752 solver.cpp:334] Snapshotting to examples/hdf5_classification/data/train_iter_10000.caffemodel
I0318 00:58:44.003702 2013098752 solver.cpp:342] Snapshotting solver state to examples/hdf5_classification/data/train_iter_10000.solverstate
I0318 00:58:44.003850 2013098752 solver.cpp:248] Iteration 10000, loss = 0.244639
I0318 00:58:44.003871 2013098752 solver.cpp:266] Iteration 10000, Testing net (#0)
I0318 00:58:44.008216 2013098752 solver.cpp:315]     Test net output #0: accuracy = 0.8308
I0318 00:58:44.008252 2013098752 solver.cpp:315]     Test net output #1: loss = 0.397291 (* 1 = 0.397291 loss)
I0318 00:58:44.008262 2013098752 solver.cpp:253] Optimization Done.
I0318 00:58:44.008270 2013098752 caffe.cpp:134] Optimization Done.

In [11]:
# Clean up (comment this out if you want to examine the hdf5_classification/data directory).
shutil.rmtree(dirname)