Experiments using regular ensembles

We start by building the model, walking through the basic inference procedure, and computing performance on the MNIST classification and outlier detection tasks. We then perform multiple runs of the model with different numbers of samples in the ensemble to gather performance statistics. This experiment uses the same learning rate schedule as the SGLD example so that the results are comparable.
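
For reference, that schedule decays the step size harmonically: at iteration t (0-indexed) the learning rate fed into the optimizer is 0.005 / (t + 1), which is exactly what the training loops below pass through the lr placeholder. As a standalone sketch:

def lr_schedule(t, lr0=0.005):
    # Harmonic decay: lr0, lr0/2, lr0/3, ... for t = 0, 1, 2, ...
    return lr0 / (t + 1.)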


In [1]:
# Let's first set up the libraries, session and experimental data
import experiment
import inferences
import edward as ed
import tensorflow as tf
import numpy as np
import os

s = experiment.setup()
mnist, notmnist = experiment.get_data()


Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Extracting notMNIST_data/train-images-idx3-ubyte.gz
Extracting notMNIST_data/train-labels-idx1-ubyte.gz
Extracting notMNIST_data/t10k-images-idx3-ubyte.gz
Extracting notMNIST_data/t10k-labels-idx1-ubyte.gz

In [2]:
# Builds the model and the approximation variables used for inference
y_, model_variables = experiment.get_model_3layer()
approx_variables = experiment.get_pointmass_approximation_variables_3layer()
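
The experiment module itself is not reproduced in this notebook. As orientation only, the sketch below shows one plausible shape for a 3-layer Edward model with standard-normal priors on the weights and matching PointMass approximations; the layer sizes, variable names and label encoding are assumptions, not the module's actual definitions.

from edward.models import Categorical, Normal, PointMass

# Hypothetical sketch (not the experiment module's actual code).
D, H, K = 784, 100, 10
x = tf.placeholder(tf.float32, [None, D])
W0 = Normal(loc=tf.zeros([D, H]), scale=tf.ones([D, H]))
b0 = Normal(loc=tf.zeros(H), scale=tf.ones(H))
W1 = Normal(loc=tf.zeros([H, H]), scale=tf.ones([H, H]))
b1 = Normal(loc=tf.zeros(H), scale=tf.ones(H))
W2 = Normal(loc=tf.zeros([H, K]), scale=tf.ones([H, K]))
b2 = Normal(loc=tf.zeros(K), scale=tf.ones(K))
h = tf.nn.relu(tf.matmul(x, W0) + b0)
h = tf.nn.relu(tf.matmul(h, W1) + b1)
y_sketch = Categorical(logits=tf.matmul(h, W2) + b2)

# One PointMass per latent weight, shape-matched, as ed.MAP expects:
qW0 = PointMass(params=tf.Variable(tf.random_normal([D, H])))
qb0 = PointMass(params=tf.Variable(tf.random_normal([H])))
# ...and likewise for W1, b1, W2 and b2.

Under this reading, model_variables would hold the placeholders and priors, and approx_variables the PointMass counterparts keyed by the same names, so that inference_dict below pairs each prior with its point estimate.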

In [3]:
# Performs inference with Edward's MAP class and saves each model's state
models = []
num_models = 10

lr = tf.placeholder(tf.float32, shape=[])
optimizer = tf.train.GradientDescentOptimizer(lr)
inference_dict = {model_variables[key]: val for key, val in approx_variables.items()}

for _ in range(num_models):
    inference = ed.MAP(inference_dict, data={y_: model_variables['y']})
    n_iter = 1000
    inference.initialize(n_iter=n_iter, optimizer=optimizer)

    # Re-initialize all variables so each ensemble member starts from a
    # fresh random draw; this is what diversifies the fitted models.
    tf.global_variables_initializer().run()
    for i in range(n_iter):
        batch = mnist.train.next_batch(100)
        info_dict = inference.update({model_variables['x']: batch[0],
                                      model_variables['y']: batch[1],
                                      lr: 0.005 / (i + 1.)})
        inference.print_progress(info_dict)

    inference.finalize()
    # Snapshot the fitted point estimates into fresh Variables so that
    # later re-initializations cannot overwrite them.
    models.append({key: tf.Variable(val.eval()) for key, val in approx_variables.items()})


1000/1000 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 221189.375
1000/1000 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 221192.859
1000/1000 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 221169.828
1000/1000 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 221182.047
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221175.859
1000/1000 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 221172.453
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221183.641
1000/1000 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 221185.312
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221198.953
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221198.094

In [4]:
# Computes the accuracy and per-example disagreement of the ensemble
accuracy, disagreement = experiment.get_metrics_ensemble(model_variables, models, num_samples=10)
tf.global_variables_initializer().run()
print(accuracy.eval({model_variables['x']: mnist.test.images, model_variables['y']: mnist.test.labels}))
print(disagreement.eval({model_variables['x']: mnist.test.images, model_variables['y']: mnist.test.labels}))


0.8729
[ 0.49893495  1.93824971  0.23904318 ...,  0.96060151  1.65797067
  0.3496128 ]
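
get_metrics_ensemble is also defined in the experiment module. Below is a minimal sketch of one plausible implementation, assuming each ensemble member yields softmax probabilities: accuracy is computed from the averaged probabilities, and per-example disagreement as the mean KL divergence of each member from the ensemble mean. The module's exact definition may differ.

# Hypothetical sketch. probs: list of [N, K] softmax outputs, one per
# ensemble member; labels: [N, K] one-hot targets (an assumption).
def sketch_ensemble_metrics(probs, labels):
    mean_p = tf.add_n(probs) / float(len(probs))
    correct = tf.equal(tf.argmax(mean_p, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    # Disagreement: average KL(member || ensemble mean) per example.
    kls = [tf.reduce_sum(p * (tf.log(p + 1e-8) - tf.log(mean_p + 1e-8)), 1)
           for p in probs]
    disagreement = tf.add_n(kls) / float(len(probs))
    return accuracy, disagreement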

In [5]:
# Computes some statistics for the proposed outlier detection
outlier_stats = experiment.get_outlier_stats(model_variables, disagreement, mnist, notmnist)
print(outlier_stats)
print('TP/(FN+TP): {}'.format(float(outlier_stats['TP']) / (outlier_stats['TP'] + outlier_stats['FN'])))
print('FP/(FP+TN): {}'.format(float(outlier_stats['FP']) / (outlier_stats['FP'] + outlier_stats['TN'])))


{'FP': 63, 'TN': 9937, 'FN': 705, 'TP': 9295}
TP/(FN+TP): 0.9295
FP/(FP+TN): 0.0063
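
get_outlier_stats is likewise not shown here. One plausible reading of the counts above is a simple threshold on the disagreement score, with the 10,000 notMNIST test images treated as positives (true outliers) and the 10,000 MNIST test images as negatives; the two printed ratios are then the true positive rate and the false positive rate. A rough sketch under that assumption:

# Hypothetical sketch: d_mnist and d_notmnist would be disagreement
# scores evaluated on each test set; the threshold is illustrative.
def sketch_outlier_stats(d_mnist, d_notmnist, threshold):
    return {'TP': int(np.sum(d_notmnist > threshold)),
            'FN': int(np.sum(d_notmnist <= threshold)),
            'FP': int(np.sum(d_mnist > threshold)),
            'TN': int(np.sum(d_mnist <= threshold))}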

The following cell performs multiple runs of this model with different numbers of samples within the ensemble to capture performance statistics. Results are saved in Full_Ensemble_SGLD_LR.csv.


In [6]:
import pandas as pd

results = pd.DataFrame(columns=('run', 'samples', 'acc', 'TP', 'FN', 'TN', 'FP'))

for run in range(5):
    models = []
    num_models = 15

    lr = tf.placeholder(tf.float32, shape=[])
    optimizer = tf.train.GradientDescentOptimizer(lr)
    inference_dict = {model_variables[key]: val for key, val in approx_variables.items()}

    for _ in range(num_models):
        inference = ed.MAP(inference_dict, data={y_: model_variables['y']})
        n_iter = 1000
        inference.initialize(n_iter=n_iter, optimizer=optimizer)

        tf.global_variables_initializer().run()
        for i in range(n_iter):
            batch = mnist.train.next_batch(100)
            info_dict = inference.update({model_variables['x']: batch[0],
                                          model_variables['y']: batch[1],
                                          lr: 0.005 / (i + 1.)})
            inference.print_progress(info_dict)

        inference.finalize()
        models.append({key: tf.Variable(val.eval()) for key, val in approx_variables.items()})
    
    for num_samples in range(15):
        accuracy, disagreement = experiment.get_metrics_ensemble(model_variables, models,
                                                                 num_samples=num_samples + 1)
        tf.global_variables_initializer().run()
        acc = accuracy.eval({model_variables['x']: mnist.test.images, model_variables['y']: mnist.test.labels})
        outlier_stats = experiment.get_outlier_stats(model_variables, disagreement, mnist, notmnist)
        results.loc[len(results)] = [run, num_samples + 1, acc,
                                     outlier_stats['TP'], outlier_stats['FN'],
                                     outlier_stats['TN'], outlier_stats['FP']]
results.to_csv('Full_Ensemble_SGLD_LR.csv', index=False)


1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221189.812
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221188.016
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221169.609
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221177.719
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221203.828
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221169.047
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221203.672
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221168.969
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221193.344
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221178.297
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221196.844
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221177.078
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221176.016
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221189.188
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221193.516
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221192.016
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221169.172
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221223.391
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221188.016
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221184.953
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221175.484
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221176.672
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221185.594
1000/1000 [100%] ██████████████████████████████ Elapsed: 6s | Loss: 221187.172
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221198.109
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221194.438
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221192.875
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221208.266
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221181.875
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221196.500
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221182.406
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221195.703
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221210.656
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221202.203
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221165.766
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221196.031
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221207.094
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221185.094
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221208.906
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221187.812
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221203.828
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221171.406
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221201.141
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221178.062
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221193.156
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221171.766
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221207.141
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221186.078
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221177.625
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221217.547
1000/1000 [100%] ██████████████████████████████ Elapsed: 7s | Loss: 221200.234
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221192.719
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221189.906
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221172.484
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221202.125
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221202.203
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221191.969
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221184.203
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221182.219
1000/1000 [100%] ██████████████████████████████ Elapsed: 9s | Loss: 221181.766
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221178.266
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221184.172
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221200.281
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221194.859
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221174.547
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221189.375
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221187.953
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221201.219
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221185.281
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221193.250
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221179.609
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221212.547
1000/1000 [100%] ██████████████████████████████ Elapsed: 8s | Loss: 221193.141
1000/1000 [100%] ██████████████████████████████ Elapsed: 9s | Loss: 221195.594
1000/1000 [100%] ██████████████████████████████ Elapsed: 9s | Loss: 221194.766
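
Once the runs are saved, the CSV can be summarized directly with pandas, e.g. mean and standard deviation of accuracy and the two detection rates per ensemble size across the five runs (column names as written by the cell above):

df = pd.read_csv('Full_Ensemble_SGLD_LR.csv')
df['tpr'] = df['TP'] / (df['TP'] + df['FN'])  # sensitivity, TP/(FN+TP)
df['fpr'] = df['FP'] / (df['FP'] + df['TN'])  # false positive rate
print(df.groupby('samples')[['acc', 'tpr', 'fpr']].agg(['mean', 'std']))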