Caffeinated Logistic Regression of HDF5 Data

While Caffe is made for deep networks it can likewise represent "shallow" models like logistic regression for classification. We'll do simple logistic regression on synthetic data that we'll generate and save to HDF5 to feed vectors to Caffe. Once that model is done, we'll add layers to improve accuracy. That's what Caffe is about: define a model, experiment, and then deploy.



In [1]:

    
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# Make sure that caffe is on the python path:
caffe_root = '../'  # this file is expected to be in {caffe_root}/examples
import sys
sys.path.insert(0, caffe_root + 'python')

import caffe

import os
import h5py
import shutil
import tempfile

# You may need to 'pip install scikit-learn'
import sklearn
import sklearn.datasets
import sklearn.linear_model

Synthesize a dataset of 10,000 4-vectors for binary classification with 2 informative features and 2 noise features.



In [2]:

    
X, y = sklearn.datasets.make_classification(
    n_samples=10000, n_features=4, n_redundant=0, n_informative=2, 
    n_clusters_per_class=2, hypercube=False, random_state=0
)

# Split into train and test
X, Xt, y, yt = sklearn.cross_validation.train_test_split(X, y)

# Visualize sample of the data
ind = np.random.permutation(X.shape[0])[:1000]
df = pd.DataFrame(X[ind])
_ = pd.scatter_matrix(df, figsize=(9, 9), diagonal='kde', marker='o', s=40, alpha=.4, c=y[ind])

Learn and evaluate scikit-learn's logistic regression with stochastic gradient descent (SGD) training. Time and check the classifier's accuracy.



In [3]:

    
# Train and test the scikit-learn SGD logistic regression.
clf = sklearn.linear_model.SGDClassifier(
    loss='log', n_iter=1000, penalty='l2', alpha=1e-3, class_weight='auto')

%timeit clf.fit(X, y)
yt_pred = clf.predict(Xt)
print('Accuracy: {:.3f}'.format(sklearn.metrics.accuracy_score(yt, yt_pred)))









    



1 loops, best of 3: 499 ms per loop
Accuracy: 0.756

Save the dataset to HDF5 for loading in Caffe.



In [4]:

    
# Write out the data to HDF5 files in a temp directory.
# This file is assumed to be caffe_root/examples/hdf5_classification.ipynb
dirname = os.path.abspath('./hdf5_classification/data')
if not os.path.exists(dirname):
    os.makedirs(dirname)

train_filename = os.path.join(dirname, 'train.h5')
test_filename = os.path.join(dirname, 'test.h5')

# HDF5DataLayer source should be a file containing a list of HDF5 filenames.
# To show this off, we'll list the same data file twice.
with h5py.File(train_filename, 'w') as f:
    f['data'] = X
    f['label'] = y.astype(np.float32)
with open(os.path.join(dirname, 'train.txt'), 'w') as f:
    f.write(train_filename + '\n')
    f.write(train_filename + '\n')
    
# HDF5 is pretty efficient, but can be further compressed.
comp_kwargs = {'compression': 'gzip', 'compression_opts': 1}
with h5py.File(test_filename, 'w') as f:
    f.create_dataset('data', data=Xt, **comp_kwargs)
    f.create_dataset('label', data=yt.astype(np.float32), **comp_kwargs)
with open(os.path.join(dirname, 'test.txt'), 'w') as f:
    f.write(test_filename + '\n')

Learn and evaluate logistic regression in Caffe.



In [5]:

    
def learn_and_test(solver_file):
    caffe.set_mode_cpu()
    solver = caffe.get_solver(solver_file)
    solver.solve()

    accuracy = 0
    test_iters = int(len(Xt) / solver.test_nets[0].blobs['data'].num)
    for i in range(test_iters):
        solver.test_nets[0].forward()
        accuracy += solver.test_nets[0].blobs['accuracy'].data
    accuracy /= test_iters
    return accuracy

%timeit learn_and_test('hdf5_classification/solver.prototxt')
acc = learn_and_test('hdf5_classification/solver.prototxt')
print("Accuracy: {:.3f}".format(acc))









    



1 loops, best of 3: 240 ms per loop
Accuracy: 0.752

Do the same through the command line interface for detailed output on the model and solving.



In [6]:

    
!../build/tools/caffe train -solver hdf5_classification/solver.prototxt









    



I0307 01:34:29.141863 2099749632 caffe.cpp:103] Use CPU.
I0307 01:34:29.418283 2099749632 caffe.cpp:107] Starting Optimization
I0307 01:34:29.418323 2099749632 solver.cpp:32] Initializing solver from parameters: 
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "hdf5_classification/data/train"
solver_mode: CPU
net: "hdf5_classification/train_val.prototxt"
I0307 01:34:29.418416 2099749632 solver.cpp:70] Creating training net from net file: hdf5_classification/train_val.prototxt
I0307 01:34:29.418583 2099749632 net.cpp:257] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0307 01:34:29.418598 2099749632 net.cpp:257] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0307 01:34:29.418608 2099749632 net.cpp:42] Initializing net from parameters: 
name: "LogisticRegressionNet"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}
I0307 01:34:29.418692 2099749632 layer_factory.hpp:74] Creating layer data
I0307 01:34:29.418853 2099749632 net.cpp:84] Creating Layer data
I0307 01:34:29.418879 2099749632 net.cpp:338] data -> data
I0307 01:34:29.418905 2099749632 net.cpp:338] data -> label
I0307 01:34:29.418918 2099749632 net.cpp:113] Setting up data
I0307 01:34:29.418926 2099749632 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: hdf5_classification/data/train.txt
I0307 01:34:29.418992 2099749632 hdf5_data_layer.cpp:80] Number of HDF5 files: 2
I0307 01:34:29.420812 2099749632 net.cpp:120] Top shape: 10 4 (40)
I0307 01:34:29.420841 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:29.420852 2099749632 layer_factory.hpp:74] Creating layer fc1
I0307 01:34:29.420866 2099749632 net.cpp:84] Creating Layer fc1
I0307 01:34:29.420872 2099749632 net.cpp:380] fc1 <- data
I0307 01:34:29.420882 2099749632 net.cpp:338] fc1 -> fc1
I0307 01:34:29.420894 2099749632 net.cpp:113] Setting up fc1
I0307 01:34:29.425689 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:29.425709 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:29.425724 2099749632 net.cpp:84] Creating Layer loss
I0307 01:34:29.425731 2099749632 net.cpp:380] loss <- fc1
I0307 01:34:29.425739 2099749632 net.cpp:380] loss <- label
I0307 01:34:29.425747 2099749632 net.cpp:338] loss -> loss
I0307 01:34:29.425756 2099749632 net.cpp:113] Setting up loss
I0307 01:34:29.425767 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:29.425781 2099749632 net.cpp:120] Top shape: (1)
I0307 01:34:29.425789 2099749632 net.cpp:122]     with loss weight 1
I0307 01:34:29.425801 2099749632 net.cpp:167] loss needs backward computation.
I0307 01:34:29.425808 2099749632 net.cpp:167] fc1 needs backward computation.
I0307 01:34:29.425815 2099749632 net.cpp:169] data does not need backward computation.
I0307 01:34:29.425822 2099749632 net.cpp:205] This network produces output loss
I0307 01:34:29.425829 2099749632 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0307 01:34:29.425837 2099749632 net.cpp:217] Network initialization done.
I0307 01:34:29.425843 2099749632 net.cpp:218] Memory required for data: 284
I0307 01:34:29.425961 2099749632 solver.cpp:154] Creating test net (#0) specified by net file: hdf5_classification/train_val.prototxt
I0307 01:34:29.425984 2099749632 net.cpp:257] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0307 01:34:29.425997 2099749632 net.cpp:42] Initializing net from parameters: 
name: "LogisticRegressionNet"
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "hdf5_classification/data/test.txt"
    batch_size: 10
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc1"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
I0307 01:34:29.426126 2099749632 layer_factory.hpp:74] Creating layer data
I0307 01:34:29.426311 2099749632 net.cpp:84] Creating Layer data
I0307 01:34:29.426331 2099749632 net.cpp:338] data -> data
I0307 01:34:29.426343 2099749632 net.cpp:338] data -> label
I0307 01:34:29.426354 2099749632 net.cpp:113] Setting up data
I0307 01:34:29.426362 2099749632 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: hdf5_classification/data/test.txt
I0307 01:34:29.426484 2099749632 hdf5_data_layer.cpp:80] Number of HDF5 files: 1
I0307 01:34:29.427692 2099749632 net.cpp:120] Top shape: 10 4 (40)
I0307 01:34:29.427711 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:29.427721 2099749632 layer_factory.hpp:74] Creating layer label_data_1_split
I0307 01:34:29.427731 2099749632 net.cpp:84] Creating Layer label_data_1_split
I0307 01:34:29.427738 2099749632 net.cpp:380] label_data_1_split <- label
I0307 01:34:29.427747 2099749632 net.cpp:338] label_data_1_split -> label_data_1_split_0
I0307 01:34:29.427759 2099749632 net.cpp:338] label_data_1_split -> label_data_1_split_1
I0307 01:34:29.427768 2099749632 net.cpp:113] Setting up label_data_1_split
I0307 01:34:29.427777 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:29.427784 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:29.427791 2099749632 layer_factory.hpp:74] Creating layer fc1
I0307 01:34:29.427804 2099749632 net.cpp:84] Creating Layer fc1
I0307 01:34:29.427813 2099749632 net.cpp:380] fc1 <- data
I0307 01:34:29.427821 2099749632 net.cpp:338] fc1 -> fc1
I0307 01:34:29.427831 2099749632 net.cpp:113] Setting up fc1
I0307 01:34:29.427845 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:29.427857 2099749632 layer_factory.hpp:74] Creating layer fc1_fc1_0_split
I0307 01:34:29.427866 2099749632 net.cpp:84] Creating Layer fc1_fc1_0_split
I0307 01:34:29.427872 2099749632 net.cpp:380] fc1_fc1_0_split <- fc1
I0307 01:34:29.427881 2099749632 net.cpp:338] fc1_fc1_0_split -> fc1_fc1_0_split_0
I0307 01:34:29.427891 2099749632 net.cpp:338] fc1_fc1_0_split -> fc1_fc1_0_split_1
I0307 01:34:29.427942 2099749632 net.cpp:113] Setting up fc1_fc1_0_split
I0307 01:34:29.427955 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:29.427965 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:29.427976 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:29.427991 2099749632 net.cpp:84] Creating Layer loss
I0307 01:34:29.428001 2099749632 net.cpp:380] loss <- fc1_fc1_0_split_0
I0307 01:34:29.428009 2099749632 net.cpp:380] loss <- label_data_1_split_0
I0307 01:34:29.428017 2099749632 net.cpp:338] loss -> loss
I0307 01:34:29.428026 2099749632 net.cpp:113] Setting up loss
I0307 01:34:29.428035 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:29.428048 2099749632 net.cpp:120] Top shape: (1)
I0307 01:34:29.428056 2099749632 net.cpp:122]     with loss weight 1
I0307 01:34:29.428064 2099749632 layer_factory.hpp:74] Creating layer accuracy
I0307 01:34:29.428076 2099749632 net.cpp:84] Creating Layer accuracy
I0307 01:34:29.428084 2099749632 net.cpp:380] accuracy <- fc1_fc1_0_split_1
I0307 01:34:29.428092 2099749632 net.cpp:380] accuracy <- label_data_1_split_1
I0307 01:34:29.428102 2099749632 net.cpp:338] accuracy -> accuracy
I0307 01:34:29.428131 2099749632 net.cpp:113] Setting up accuracy
I0307 01:34:29.428140 2099749632 net.cpp:120] Top shape: (1)
I0307 01:34:29.428148 2099749632 net.cpp:169] accuracy does not need backward computation.
I0307 01:34:29.428154 2099749632 net.cpp:167] loss needs backward computation.
I0307 01:34:29.428161 2099749632 net.cpp:167] fc1_fc1_0_split needs backward computation.
I0307 01:34:29.428167 2099749632 net.cpp:167] fc1 needs backward computation.
I0307 01:34:29.428174 2099749632 net.cpp:169] label_data_1_split does not need backward computation.
I0307 01:34:29.428181 2099749632 net.cpp:169] data does not need backward computation.
I0307 01:34:29.428189 2099749632 net.cpp:205] This network produces output accuracy
I0307 01:34:29.428324 2099749632 net.cpp:205] This network produces output loss
I0307 01:34:29.428342 2099749632 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0307 01:34:29.428350 2099749632 net.cpp:217] Network initialization done.
I0307 01:34:29.428357 2099749632 net.cpp:218] Memory required for data: 528
I0307 01:34:29.428388 2099749632 solver.cpp:42] Solver scaffolding done.
I0307 01:34:29.428412 2099749632 solver.cpp:222] Solving LogisticRegressionNet
I0307 01:34:29.428421 2099749632 solver.cpp:223] Learning Rate Policy: step
I0307 01:34:29.428431 2099749632 solver.cpp:266] Iteration 0, Testing net (#0)
I0307 01:34:29.471674 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.4532
I0307 01:34:29.471724 2099749632 solver.cpp:315]     Test net output #1: loss = 0.694067 (* 1 = 0.694067 loss)
I0307 01:34:29.471853 2099749632 solver.cpp:189] Iteration 0, loss = 0.692695
I0307 01:34:29.471878 2099749632 solver.cpp:204]     Train net output #0: loss = 0.692695 (* 1 = 0.692695 loss)
I0307 01:34:29.471890 2099749632 solver.cpp:464] Iteration 0, lr = 0.01
I0307 01:34:29.483834 2099749632 solver.cpp:266] Iteration 1000, Testing net (#0)
I0307 01:34:29.486868 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7424
I0307 01:34:29.486896 2099749632 solver.cpp:315]     Test net output #1: loss = 0.601764 (* 1 = 0.601764 loss)
I0307 01:34:29.486922 2099749632 solver.cpp:189] Iteration 1000, loss = 0.472665
I0307 01:34:29.486934 2099749632 solver.cpp:204]     Train net output #0: loss = 0.472665 (* 1 = 0.472665 loss)
I0307 01:34:29.486944 2099749632 solver.cpp:464] Iteration 1000, lr = 0.01
I0307 01:34:29.498821 2099749632 solver.cpp:266] Iteration 2000, Testing net (#0)
I0307 01:34:29.501900 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7364
I0307 01:34:29.501941 2099749632 solver.cpp:315]     Test net output #1: loss = 0.60818 (* 1 = 0.60818 loss)
I0307 01:34:29.501988 2099749632 solver.cpp:189] Iteration 2000, loss = 0.6863
I0307 01:34:29.502003 2099749632 solver.cpp:204]     Train net output #0: loss = 0.6863 (* 1 = 0.6863 loss)
I0307 01:34:29.502013 2099749632 solver.cpp:464] Iteration 2000, lr = 0.01
I0307 01:34:29.513921 2099749632 solver.cpp:266] Iteration 3000, Testing net (#0)
I0307 01:34:29.517227 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.6964
I0307 01:34:29.517300 2099749632 solver.cpp:315]     Test net output #1: loss = 0.604707 (* 1 = 0.604707 loss)
I0307 01:34:29.518105 2099749632 solver.cpp:189] Iteration 3000, loss = 0.617542
I0307 01:34:29.518154 2099749632 solver.cpp:204]     Train net output #0: loss = 0.617542 (* 1 = 0.617542 loss)
I0307 01:34:29.518170 2099749632 solver.cpp:464] Iteration 3000, lr = 0.01
I0307 01:34:29.531672 2099749632 solver.cpp:266] Iteration 4000, Testing net (#0)
I0307 01:34:29.534873 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7424
I0307 01:34:29.534920 2099749632 solver.cpp:315]     Test net output #1: loss = 0.601764 (* 1 = 0.601764 loss)
I0307 01:34:29.534950 2099749632 solver.cpp:189] Iteration 4000, loss = 0.472666
I0307 01:34:29.534962 2099749632 solver.cpp:204]     Train net output #0: loss = 0.472665 (* 1 = 0.472665 loss)
I0307 01:34:29.534973 2099749632 solver.cpp:464] Iteration 4000, lr = 0.01
I0307 01:34:29.546567 2099749632 solver.cpp:266] Iteration 5000, Testing net (#0)
I0307 01:34:29.549762 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7364
I0307 01:34:29.549789 2099749632 solver.cpp:315]     Test net output #1: loss = 0.60818 (* 1 = 0.60818 loss)
I0307 01:34:29.549815 2099749632 solver.cpp:189] Iteration 5000, loss = 0.686301
I0307 01:34:29.549828 2099749632 solver.cpp:204]     Train net output #0: loss = 0.6863 (* 1 = 0.6863 loss)
I0307 01:34:29.549837 2099749632 solver.cpp:464] Iteration 5000, lr = 0.001
I0307 01:34:29.562142 2099749632 solver.cpp:266] Iteration 6000, Testing net (#0)
I0307 01:34:29.565335 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7476
I0307 01:34:29.565373 2099749632 solver.cpp:315]     Test net output #1: loss = 0.59775 (* 1 = 0.59775 loss)
I0307 01:34:29.566051 2099749632 solver.cpp:189] Iteration 6000, loss = 0.664614
I0307 01:34:29.566086 2099749632 solver.cpp:204]     Train net output #0: loss = 0.664614 (* 1 = 0.664614 loss)
I0307 01:34:29.566097 2099749632 solver.cpp:464] Iteration 6000, lr = 0.001
I0307 01:34:29.577900 2099749632 solver.cpp:266] Iteration 7000, Testing net (#0)
I0307 01:34:29.580993 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7524
I0307 01:34:29.581015 2099749632 solver.cpp:315]     Test net output #1: loss = 0.597349 (* 1 = 0.597349 loss)
I0307 01:34:29.581038 2099749632 solver.cpp:189] Iteration 7000, loss = 0.456775
I0307 01:34:29.581050 2099749632 solver.cpp:204]     Train net output #0: loss = 0.456774 (* 1 = 0.456774 loss)
I0307 01:34:29.581059 2099749632 solver.cpp:464] Iteration 7000, lr = 0.001
I0307 01:34:29.592854 2099749632 solver.cpp:266] Iteration 8000, Testing net (#0)
I0307 01:34:29.595973 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7568
I0307 01:34:29.596002 2099749632 solver.cpp:315]     Test net output #1: loss = 0.597265 (* 1 = 0.597265 loss)
I0307 01:34:29.596027 2099749632 solver.cpp:189] Iteration 8000, loss = 0.673885
I0307 01:34:29.596040 2099749632 solver.cpp:204]     Train net output #0: loss = 0.673885 (* 1 = 0.673885 loss)
I0307 01:34:29.596048 2099749632 solver.cpp:464] Iteration 8000, lr = 0.001
I0307 01:34:29.607822 2099749632 solver.cpp:266] Iteration 9000, Testing net (#0)
I0307 01:34:29.610930 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7432
I0307 01:34:29.610960 2099749632 solver.cpp:315]     Test net output #1: loss = 0.597777 (* 1 = 0.597777 loss)
I0307 01:34:29.611558 2099749632 solver.cpp:189] Iteration 9000, loss = 0.66526
I0307 01:34:29.611583 2099749632 solver.cpp:204]     Train net output #0: loss = 0.66526 (* 1 = 0.66526 loss)
I0307 01:34:29.611593 2099749632 solver.cpp:464] Iteration 9000, lr = 0.001
I0307 01:34:29.623009 2099749632 solver.cpp:334] Snapshotting to hdf5_classification/data/train_iter_10000.caffemodel
I0307 01:34:29.623209 2099749632 solver.cpp:342] Snapshotting solver state to hdf5_classification/data/train_iter_10000.solverstate
I0307 01:34:29.623319 2099749632 solver.cpp:248] Iteration 10000, loss = 0.457922
I0307 01:34:29.623333 2099749632 solver.cpp:266] Iteration 10000, Testing net (#0)
I0307 01:34:29.626454 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.752
I0307 01:34:29.626484 2099749632 solver.cpp:315]     Test net output #1: loss = 0.597362 (* 1 = 0.597362 loss)
I0307 01:34:29.626493 2099749632 solver.cpp:253] Optimization Done.
I0307 01:34:29.626502 2099749632 caffe.cpp:121] Optimization Done.

If you look at output or the train_val.prototxt, you'll see that the model is simple logistic regression. We can make it a little more advanced by introducing a non-linearity between weights that take the input and weights that give the output -- now we have a two-layer network. That network is given in train_val2.prototxt, and that's the only change made in solver2.prototxt which we will now use.

The final accuracy of the new network be higher than logistic regression!



In [7]:

    
def learn_and_test(solver_file):
    caffe.set_mode_cpu()
    solver = caffe.get_solver(solver_file)
    solver.solve()

    accuracy = 0
    test_iters = int(len(Xt) / solver.test_nets[0].blobs['data'].num)
    for i in range(test_iters):
        solver.test_nets[0].forward()
        accuracy += solver.test_nets[0].blobs['accuracy'].data
    accuracy /= test_iters
    return accuracy

%timeit learn_and_test('hdf5_classification/solver2.prototxt')
acc = learn_and_test('hdf5_classification/solver2.prototxt')
print("Accuracy: {:.3f}".format(acc))









    



1 loops, best of 3: 333 ms per loop
Accuracy: 0.818

Do the same through the command line interface for detailed output on the model and solving.



In [8]:

    
!../build/tools/caffe train -solver hdf5_classification/solver2.prototxt









    



I0307 01:34:31.589234 2099749632 caffe.cpp:103] Use CPU.
I0307 01:34:31.872560 2099749632 caffe.cpp:107] Starting Optimization
I0307 01:34:31.872596 2099749632 solver.cpp:32] Initializing solver from parameters: 
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "hdf5_classification/data/train"
solver_mode: CPU
net: "hdf5_classification/train_val2.prototxt"
I0307 01:34:31.872687 2099749632 solver.cpp:70] Creating training net from net file: hdf5_classification/train_val2.prototxt
I0307 01:34:31.872865 2099749632 net.cpp:257] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0307 01:34:31.872882 2099749632 net.cpp:257] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0307 01:34:31.872891 2099749632 net.cpp:42] Initializing net from parameters: 
name: "LogisticRegressionNet"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "fc1"
  top: "fc1"
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
I0307 01:34:31.873246 2099749632 layer_factory.hpp:74] Creating layer data
I0307 01:34:31.873276 2099749632 net.cpp:84] Creating Layer data
I0307 01:34:31.873292 2099749632 net.cpp:338] data -> data
I0307 01:34:31.873332 2099749632 net.cpp:338] data -> label
I0307 01:34:31.873352 2099749632 net.cpp:113] Setting up data
I0307 01:34:31.873361 2099749632 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: hdf5_classification/data/train.txt
I0307 01:34:31.873443 2099749632 hdf5_data_layer.cpp:80] Number of HDF5 files: 2
I0307 01:34:31.875783 2099749632 net.cpp:120] Top shape: 10 4 (40)
I0307 01:34:31.875816 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:31.875829 2099749632 layer_factory.hpp:74] Creating layer fc1
I0307 01:34:31.875846 2099749632 net.cpp:84] Creating Layer fc1
I0307 01:34:31.875857 2099749632 net.cpp:380] fc1 <- data
I0307 01:34:31.875875 2099749632 net.cpp:338] fc1 -> fc1
I0307 01:34:31.875892 2099749632 net.cpp:113] Setting up fc1
I0307 01:34:31.882478 2099749632 net.cpp:120] Top shape: 10 40 (400)
I0307 01:34:31.882505 2099749632 layer_factory.hpp:74] Creating layer relu1
I0307 01:34:31.882524 2099749632 net.cpp:84] Creating Layer relu1
I0307 01:34:31.882532 2099749632 net.cpp:380] relu1 <- fc1
I0307 01:34:31.882544 2099749632 net.cpp:327] relu1 -> fc1 (in-place)
I0307 01:34:31.882555 2099749632 net.cpp:113] Setting up relu1
I0307 01:34:31.882565 2099749632 net.cpp:120] Top shape: 10 40 (400)
I0307 01:34:31.882583 2099749632 layer_factory.hpp:74] Creating layer fc2
I0307 01:34:31.882609 2099749632 net.cpp:84] Creating Layer fc2
I0307 01:34:31.882619 2099749632 net.cpp:380] fc2 <- fc1
I0307 01:34:31.882632 2099749632 net.cpp:338] fc2 -> fc2
I0307 01:34:31.882644 2099749632 net.cpp:113] Setting up fc2
I0307 01:34:31.882663 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:31.882678 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:31.882694 2099749632 net.cpp:84] Creating Layer loss
I0307 01:34:31.882704 2099749632 net.cpp:380] loss <- fc2
I0307 01:34:31.882712 2099749632 net.cpp:380] loss <- label
I0307 01:34:31.882779 2099749632 net.cpp:338] loss -> loss
I0307 01:34:31.882796 2099749632 net.cpp:113] Setting up loss
I0307 01:34:31.882810 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:31.882833 2099749632 net.cpp:120] Top shape: (1)
I0307 01:34:31.882844 2099749632 net.cpp:122]     with loss weight 1
I0307 01:34:31.882860 2099749632 net.cpp:167] loss needs backward computation.
I0307 01:34:31.882869 2099749632 net.cpp:167] fc2 needs backward computation.
I0307 01:34:31.882877 2099749632 net.cpp:167] relu1 needs backward computation.
I0307 01:34:31.882886 2099749632 net.cpp:167] fc1 needs backward computation.
I0307 01:34:31.882894 2099749632 net.cpp:169] data does not need backward computation.
I0307 01:34:31.882904 2099749632 net.cpp:205] This network produces output loss
I0307 01:34:31.882931 2099749632 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0307 01:34:31.882942 2099749632 net.cpp:217] Network initialization done.
I0307 01:34:31.882951 2099749632 net.cpp:218] Memory required for data: 3484
I0307 01:34:31.883157 2099749632 solver.cpp:154] Creating test net (#0) specified by net file: hdf5_classification/train_val2.prototxt
I0307 01:34:31.883189 2099749632 net.cpp:257] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0307 01:34:31.883203 2099749632 net.cpp:42] Initializing net from parameters: 
name: "LogisticRegressionNet"
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "hdf5_classification/data/test.txt"
    batch_size: 10
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "fc1"
  top: "fc1"
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
I0307 01:34:31.883535 2099749632 layer_factory.hpp:74] Creating layer data
I0307 01:34:31.883548 2099749632 net.cpp:84] Creating Layer data
I0307 01:34:31.883556 2099749632 net.cpp:338] data -> data
I0307 01:34:31.883569 2099749632 net.cpp:338] data -> label
I0307 01:34:31.883579 2099749632 net.cpp:113] Setting up data
I0307 01:34:31.883585 2099749632 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: hdf5_classification/data/test.txt
I0307 01:34:31.883664 2099749632 hdf5_data_layer.cpp:80] Number of HDF5 files: 1
I0307 01:34:31.884842 2099749632 net.cpp:120] Top shape: 10 4 (40)
I0307 01:34:31.884860 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:31.884870 2099749632 layer_factory.hpp:74] Creating layer label_data_1_split
I0307 01:34:31.884879 2099749632 net.cpp:84] Creating Layer label_data_1_split
I0307 01:34:31.884886 2099749632 net.cpp:380] label_data_1_split <- label
I0307 01:34:31.884896 2099749632 net.cpp:338] label_data_1_split -> label_data_1_split_0
I0307 01:34:31.884909 2099749632 net.cpp:338] label_data_1_split -> label_data_1_split_1
I0307 01:34:31.884919 2099749632 net.cpp:113] Setting up label_data_1_split
I0307 01:34:31.884927 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:31.884934 2099749632 net.cpp:120] Top shape: 10 (10)
I0307 01:34:31.884941 2099749632 layer_factory.hpp:74] Creating layer fc1
I0307 01:34:31.884951 2099749632 net.cpp:84] Creating Layer fc1
I0307 01:34:31.884958 2099749632 net.cpp:380] fc1 <- data
I0307 01:34:31.884989 2099749632 net.cpp:338] fc1 -> fc1
I0307 01:34:31.885000 2099749632 net.cpp:113] Setting up fc1
I0307 01:34:31.885017 2099749632 net.cpp:120] Top shape: 10 40 (400)
I0307 01:34:31.885030 2099749632 layer_factory.hpp:74] Creating layer relu1
I0307 01:34:31.885041 2099749632 net.cpp:84] Creating Layer relu1
I0307 01:34:31.885048 2099749632 net.cpp:380] relu1 <- fc1
I0307 01:34:31.885056 2099749632 net.cpp:327] relu1 -> fc1 (in-place)
I0307 01:34:31.885064 2099749632 net.cpp:113] Setting up relu1
I0307 01:34:31.885071 2099749632 net.cpp:120] Top shape: 10 40 (400)
I0307 01:34:31.885079 2099749632 layer_factory.hpp:74] Creating layer fc2
I0307 01:34:31.885088 2099749632 net.cpp:84] Creating Layer fc2
I0307 01:34:31.885094 2099749632 net.cpp:380] fc2 <- fc1
I0307 01:34:31.885103 2099749632 net.cpp:338] fc2 -> fc2
I0307 01:34:31.885113 2099749632 net.cpp:113] Setting up fc2
I0307 01:34:31.885126 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:31.885138 2099749632 layer_factory.hpp:74] Creating layer fc2_fc2_0_split
I0307 01:34:31.885149 2099749632 net.cpp:84] Creating Layer fc2_fc2_0_split
I0307 01:34:31.885155 2099749632 net.cpp:380] fc2_fc2_0_split <- fc2
I0307 01:34:31.885164 2099749632 net.cpp:338] fc2_fc2_0_split -> fc2_fc2_0_split_0
I0307 01:34:31.885174 2099749632 net.cpp:338] fc2_fc2_0_split -> fc2_fc2_0_split_1
I0307 01:34:31.885182 2099749632 net.cpp:113] Setting up fc2_fc2_0_split
I0307 01:34:31.885190 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:31.885242 2099749632 net.cpp:120] Top shape: 10 2 (20)
I0307 01:34:31.885256 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:31.885267 2099749632 net.cpp:84] Creating Layer loss
I0307 01:34:31.885275 2099749632 net.cpp:380] loss <- fc2_fc2_0_split_0
I0307 01:34:31.885285 2099749632 net.cpp:380] loss <- label_data_1_split_0
I0307 01:34:31.885296 2099749632 net.cpp:338] loss -> loss
I0307 01:34:31.885308 2099749632 net.cpp:113] Setting up loss
I0307 01:34:31.885316 2099749632 layer_factory.hpp:74] Creating layer loss
I0307 01:34:31.885330 2099749632 net.cpp:120] Top shape: (1)
I0307 01:34:31.885337 2099749632 net.cpp:122]     with loss weight 1
I0307 01:34:31.885346 2099749632 layer_factory.hpp:74] Creating layer accuracy
I0307 01:34:31.885360 2099749632 net.cpp:84] Creating Layer accuracy
I0307 01:34:31.885368 2099749632 net.cpp:380] accuracy <- fc2_fc2_0_split_1
I0307 01:34:31.885375 2099749632 net.cpp:380] accuracy <- label_data_1_split_1
I0307 01:34:31.885383 2099749632 net.cpp:338] accuracy -> accuracy
I0307 01:34:31.885392 2099749632 net.cpp:113] Setting up accuracy
I0307 01:34:31.885401 2099749632 net.cpp:120] Top shape: (1)
I0307 01:34:31.885407 2099749632 net.cpp:169] accuracy does not need backward computation.
I0307 01:34:31.885413 2099749632 net.cpp:167] loss needs backward computation.
I0307 01:34:31.885419 2099749632 net.cpp:167] fc2_fc2_0_split needs backward computation.
I0307 01:34:31.885426 2099749632 net.cpp:167] fc2 needs backward computation.
I0307 01:34:31.885432 2099749632 net.cpp:167] relu1 needs backward computation.
I0307 01:34:31.885438 2099749632 net.cpp:167] fc1 needs backward computation.
I0307 01:34:31.885444 2099749632 net.cpp:169] label_data_1_split does not need backward computation.
I0307 01:34:31.885452 2099749632 net.cpp:169] data does not need backward computation.
I0307 01:34:31.885457 2099749632 net.cpp:205] This network produces output accuracy
I0307 01:34:31.885613 2099749632 net.cpp:205] This network produces output loss
I0307 01:34:31.885632 2099749632 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0307 01:34:31.885639 2099749632 net.cpp:217] Network initialization done.
I0307 01:34:31.885645 2099749632 net.cpp:218] Memory required for data: 3728
I0307 01:34:31.885685 2099749632 solver.cpp:42] Solver scaffolding done.
I0307 01:34:31.885711 2099749632 solver.cpp:222] Solving LogisticRegressionNet
I0307 01:34:31.885721 2099749632 solver.cpp:223] Learning Rate Policy: step
I0307 01:34:31.885730 2099749632 solver.cpp:266] Iteration 0, Testing net (#0)
I0307 01:34:31.901005 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.5944
I0307 01:34:31.901049 2099749632 solver.cpp:315]     Test net output #1: loss = 0.693021 (* 1 = 0.693021 loss)
I0307 01:34:31.901177 2099749632 solver.cpp:189] Iteration 0, loss = 0.693163
I0307 01:34:31.901192 2099749632 solver.cpp:204]     Train net output #0: loss = 0.693163 (* 1 = 0.693163 loss)
I0307 01:34:31.901203 2099749632 solver.cpp:464] Iteration 0, lr = 0.01
I0307 01:34:31.920586 2099749632 solver.cpp:266] Iteration 1000, Testing net (#0)
I0307 01:34:31.924612 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7556
I0307 01:34:31.924646 2099749632 solver.cpp:315]     Test net output #1: loss = 0.511002 (* 1 = 0.511002 loss)
I0307 01:34:31.924684 2099749632 solver.cpp:189] Iteration 1000, loss = 0.38536
I0307 01:34:31.924696 2099749632 solver.cpp:204]     Train net output #0: loss = 0.38536 (* 1 = 0.38536 loss)
I0307 01:34:31.924706 2099749632 solver.cpp:464] Iteration 1000, lr = 0.01
I0307 01:34:31.944727 2099749632 solver.cpp:266] Iteration 2000, Testing net (#0)
I0307 01:34:31.948729 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7824
I0307 01:34:31.948763 2099749632 solver.cpp:315]     Test net output #1: loss = 0.489214 (* 1 = 0.489214 loss)
I0307 01:34:31.948799 2099749632 solver.cpp:189] Iteration 2000, loss = 0.532582
I0307 01:34:31.948812 2099749632 solver.cpp:204]     Train net output #0: loss = 0.532582 (* 1 = 0.532582 loss)
I0307 01:34:31.948823 2099749632 solver.cpp:464] Iteration 2000, lr = 0.01
I0307 01:34:31.968670 2099749632 solver.cpp:266] Iteration 3000, Testing net (#0)
I0307 01:34:31.972393 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.7956
I0307 01:34:31.972411 2099749632 solver.cpp:315]     Test net output #1: loss = 0.454184 (* 1 = 0.454184 loss)
I0307 01:34:31.973024 2099749632 solver.cpp:189] Iteration 3000, loss = 0.541374
I0307 01:34:31.973057 2099749632 solver.cpp:204]     Train net output #0: loss = 0.541374 (* 1 = 0.541374 loss)
I0307 01:34:31.973067 2099749632 solver.cpp:464] Iteration 3000, lr = 0.01
I0307 01:34:31.994829 2099749632 solver.cpp:266] Iteration 4000, Testing net (#0)
I0307 01:34:31.998638 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.798
I0307 01:34:31.998663 2099749632 solver.cpp:315]     Test net output #1: loss = 0.456348 (* 1 = 0.456348 loss)
I0307 01:34:31.998705 2099749632 solver.cpp:189] Iteration 4000, loss = 0.490437
I0307 01:34:31.998718 2099749632 solver.cpp:204]     Train net output #0: loss = 0.490437 (* 1 = 0.490437 loss)
I0307 01:34:31.998725 2099749632 solver.cpp:464] Iteration 4000, lr = 0.01
I0307 01:34:32.021085 2099749632 solver.cpp:266] Iteration 5000, Testing net (#0)
I0307 01:34:32.024950 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.804
I0307 01:34:32.024981 2099749632 solver.cpp:315]     Test net output #1: loss = 0.46184 (* 1 = 0.46184 loss)
I0307 01:34:32.025017 2099749632 solver.cpp:189] Iteration 5000, loss = 0.467703
I0307 01:34:32.025028 2099749632 solver.cpp:204]     Train net output #0: loss = 0.467704 (* 1 = 0.467704 loss)
I0307 01:34:32.025038 2099749632 solver.cpp:464] Iteration 5000, lr = 0.001
I0307 01:34:32.044390 2099749632 solver.cpp:266] Iteration 6000, Testing net (#0)
I0307 01:34:32.048216 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.8208
I0307 01:34:32.048239 2099749632 solver.cpp:315]     Test net output #1: loss = 0.423084 (* 1 = 0.423084 loss)
I0307 01:34:32.048790 2099749632 solver.cpp:189] Iteration 6000, loss = 0.480104
I0307 01:34:32.048809 2099749632 solver.cpp:204]     Train net output #0: loss = 0.480105 (* 1 = 0.480105 loss)
I0307 01:34:32.048827 2099749632 solver.cpp:464] Iteration 6000, lr = 0.001
I0307 01:34:32.067795 2099749632 solver.cpp:266] Iteration 7000, Testing net (#0)
I0307 01:34:32.071524 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.8124
I0307 01:34:32.071542 2099749632 solver.cpp:315]     Test net output #1: loss = 0.423947 (* 1 = 0.423947 loss)
I0307 01:34:32.071570 2099749632 solver.cpp:189] Iteration 7000, loss = 0.447471
I0307 01:34:32.071617 2099749632 solver.cpp:204]     Train net output #0: loss = 0.447472 (* 1 = 0.447472 loss)
I0307 01:34:32.071626 2099749632 solver.cpp:464] Iteration 7000, lr = 0.001
I0307 01:34:32.091625 2099749632 solver.cpp:266] Iteration 8000, Testing net (#0)
I0307 01:34:32.095410 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.814
I0307 01:34:32.095432 2099749632 solver.cpp:315]     Test net output #1: loss = 0.423586 (* 1 = 0.423586 loss)
I0307 01:34:32.095461 2099749632 solver.cpp:189] Iteration 8000, loss = 0.386258
I0307 01:34:32.095474 2099749632 solver.cpp:204]     Train net output #0: loss = 0.386259 (* 1 = 0.386259 loss)
I0307 01:34:32.095481 2099749632 solver.cpp:464] Iteration 8000, lr = 0.001
I0307 01:34:32.117184 2099749632 solver.cpp:266] Iteration 9000, Testing net (#0)
I0307 01:34:32.121587 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.8208
I0307 01:34:32.121608 2099749632 solver.cpp:315]     Test net output #1: loss = 0.419969 (* 1 = 0.419969 loss)
I0307 01:34:32.122161 2099749632 solver.cpp:189] Iteration 9000, loss = 0.468262
I0307 01:34:32.122181 2099749632 solver.cpp:204]     Train net output #0: loss = 0.468262 (* 1 = 0.468262 loss)
I0307 01:34:32.122191 2099749632 solver.cpp:464] Iteration 9000, lr = 0.001
I0307 01:34:32.141635 2099749632 solver.cpp:334] Snapshotting to hdf5_classification/data/train_iter_10000.caffemodel
I0307 01:34:32.141860 2099749632 solver.cpp:342] Snapshotting solver state to hdf5_classification/data/train_iter_10000.solverstate
I0307 01:34:32.141978 2099749632 solver.cpp:248] Iteration 10000, loss = 0.441529
I0307 01:34:32.141995 2099749632 solver.cpp:266] Iteration 10000, Testing net (#0)
I0307 01:34:32.145747 2099749632 solver.cpp:315]     Test net output #0: accuracy = 0.8148
I0307 01:34:32.145771 2099749632 solver.cpp:315]     Test net output #1: loss = 0.4216 (* 1 = 0.4216 loss)
I0307 01:34:32.145779 2099749632 solver.cpp:253] Optimization Done.
I0307 01:34:32.145786 2099749632 caffe.cpp:121] Optimization Done.



In [9]:

    
# Clean up (comment this out if you want to examine the hdf5_classification/data directory).
shutil.rmtree(dirname)