Experiment:

Evaluate pruning by magnitude weighted by coactivations (a more thorough evaluation) and compare it to the baseline (SET).

Motivation

Check whether the results are consistently above the baseline.

Conclusion

  • No significant difference between the two models
  • No support for early stopping

In [1]:
%load_ext autoreload
%autoreload 2

In [6]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import glob
import tabulate
import pprint
import click
import numpy as np
import pandas as pd
from ray.tune.commands import *
from nupic.research.frameworks.dynamic_sparse.common.browser import *

Load and check data


In [8]:
exps = ['improved_magpruning_eval1', ]
paths = [os.path.expanduser("~/nta/results/{}".format(e)) for e in exps]
df = load_many(paths)

In [9]:
df.head(5)


Out[9]:
Experiment Name train_acc_max train_acc_max_epoch train_acc_min train_acc_min_epoch train_acc_median train_acc_last val_acc_max val_acc_max_epoch val_acc_min ... momentum network num_classes on_perc optim_alg pruning_early_stop test_noise use_kwinners weight_decay weight_prune_perc
0 0_model=DSNNWeightedMag,on_perc=0.2,pruning_ea... 0.999483 93 0.922517 0 0.998808 0.999083 0.9802 35 0.9597 ... 0.9 MLPHeb 10 0.2 SGD 0 False False 0.0001 NaN
1 1_model=DSNNMixedHeb,on_perc=0.2,pruning_early... 0.999433 87 0.926283 0 0.998683 0.999333 0.9798 35 0.9603 ... 0.9 MLPHeb 10 0.2 SGD 0 False False 0.0001 NaN
2 2_model=DSNNWeightedMag,on_perc=0.1,pruning_ea... 0.993517 92 0.908733 0 0.990750 0.993150 0.9733 92 0.9506 ... 0.9 MLPHeb 10 0.1 SGD 0 False False 0.0001 NaN
3 3_model=DSNNMixedHeb,on_perc=0.1,pruning_early... 0.993217 94 0.905483 0 0.990275 0.993017 0.9725 38 0.9502 ... 0.9 MLPHeb 10 0.1 SGD 0 False False 0.0001 NaN
4 4_model=DSNNWeightedMag,on_perc=0.2,pruning_ea... 0.999400 75 0.927883 0 0.998633 0.999050 0.9818 44 0.9640 ... 0.9 MLPHeb 10 0.2 SGD 1 False False 0.0001 NaN

5 rows × 42 columns


In [10]:
# replace NaN in the pruning percentage columns with 0.0
df['hebbian_prune_perc'] = df['hebbian_prune_perc'].replace(np.nan, 0.0, regex=True)
df['weight_prune_perc'] = df['weight_prune_perc'].replace(np.nan, 0.0, regex=True)

In [11]:
df.columns


Out[11]:
Index(['Experiment Name', 'train_acc_max', 'train_acc_max_epoch',
       'train_acc_min', 'train_acc_min_epoch', 'train_acc_median',
       'train_acc_last', 'val_acc_max', 'val_acc_max_epoch', 'val_acc_min',
       'val_acc_min_epoch', 'val_acc_median', 'val_acc_last', 'epochs',
       'experiment_file_name', 'trial_time', 'mean_epoch_time', 'batch_norm',
       'data_dir', 'dataset_name', 'debug_sparse', 'debug_weights', 'device',
       'hebbian_grow', 'hebbian_prune_perc', 'hidden_sizes', 'input_size',
       'learning_rate', 'lr_gamma', 'lr_milestones', 'lr_scheduler', 'model',
       'momentum', 'network', 'num_classes', 'on_perc', 'optim_alg',
       'pruning_early_stop', 'test_noise', 'use_kwinners', 'weight_decay',
       'weight_prune_perc'],
      dtype='object')

In [12]:
df.shape


Out[12]:
(288, 42)

In [13]:
df.iloc[1]


Out[13]:
Experiment Name         1_model=DSNNMixedHeb,on_perc=0.2,pruning_early...
train_acc_max                                                    0.999433
train_acc_max_epoch                                                    87
train_acc_min                                                    0.926283
train_acc_min_epoch                                                     0
train_acc_median                                                 0.998683
train_acc_last                                                   0.999333
val_acc_max                                                        0.9798
val_acc_max_epoch                                                      35
val_acc_min                                                        0.9603
val_acc_min_epoch                                                       0
val_acc_median                                                     0.9784
val_acc_last                                                       0.9789
epochs                                                                100
experiment_file_name    /Users/lsouza/nta/results/improved_magpruning_...
trial_time                                                        57.1836
mean_epoch_time                                                  0.571836
batch_norm                                                           True
data_dir                                        /home/ubuntu/nta/datasets
dataset_name                                                        MNIST
debug_sparse                                                         True
debug_weights                                                        True
device                                                               cuda
hebbian_grow                                                        False
hebbian_prune_perc                                                      0
hidden_sizes                                                          100
input_size                                                            784
learning_rate                                                         0.1
lr_gamma                                                              0.1
lr_milestones                                                          60
lr_scheduler                                                  MultiStepLR
model                                                        DSNNMixedHeb
momentum                                                              0.9
network                                                            MLPHeb
num_classes                                                            10
on_perc                                                               0.2
optim_alg                                                             SGD
pruning_early_stop                                                      0
test_noise                                                          False
use_kwinners                                                        False
weight_decay                                                       0.0001
weight_prune_perc                                                       0
Name: 1, dtype: object

In [14]:
df.groupby('model')['model'].count()


Out[14]:
model
DSNNMixedHeb       144
DSNNWeightedMag    144
Name: model, dtype: int64

Analysis

Experiment Details

base_exp_config = dict(
    device="cuda",
    # dataset related
    dataset_name="CIFAR10",
    input_size=3072,
    num_classes=10,
    stats_mean=(0.4914, 0.4822, 0.4465),
    stats_std=(0.2023, 0.1994, 0.2010),
    data_dir="~/nta/datasets",
    # model related
    model="DSNNMixedHeb",
    network="MLPHeb",
    init_weights=True,
    batch_norm=True,
    dropout=False,
    kwinners=True,
    percent_on=0.3,
    boost_strength=1.4,
    boost_strength_factor=0.7,
    # optimizer related
    optim_alg="SGD",
    momentum=0.9,
    learning_rate=0.01,
    weight_decay=1e-4,
    # sparse related
    epsilon=100,
    start_sparse=1,
    end_sparse=None,
    weight_prune_perc=0.45,
    hebbian_prune_perc=0.45,
    pruning_es=True,
    pruning_es_patience=0,
    pruning_es_window_size=5,
    pruning_es_threshold=0.02,
    pruning_interval=1,
    # additional validation
    test_noise=True,
    noise_level=0.1,
    # debugging
    debug_weights=True,
    debug_sparse=True,
)

# ray configurations
tune_config = dict(
    name="hebbian-gs-test",
    num_samples=1,
    local_dir=os.path.expanduser("~/nta/results"),
    checkpoint_freq=0,
    checkpoint_at_end=False,
    stop={"training_iteration": 1000},  # 300 in cifar
    resources_per_trial={"cpu": 1, "gpu": 0.33},
    loggers=DEFAULT_LOGGERS,
    verbose=1,
)
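
The pruning_es_* parameters above control when pruning is switched off during training. As a rough illustration of one plausible window-based rule (a sketch only, reusing the window/threshold names from the config above; the actual DSNN implementation may differ), pruning would stop once the recent gain in validation accuracy falls below the threshold:

def should_stop_pruning(val_accs, window_size=5, threshold=0.02):
    """Sketch of a window-based early-stop rule for pruning.

    val_accs: per-epoch validation accuracies observed so far.
    Stops when the mean of the last `window_size` epochs improves on
    the mean of the window before it by less than `threshold`.
    """
    if len(val_accs) < 2 * window_size:
        return False
    recent = np.mean(val_accs[-window_size:])
    previous = np.mean(val_accs[-2 * window_size:-window_size])
    return (recent - previous) < threshold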

In [15]:
# Did any trials fail?
df[df["epochs"]<30]["epochs"].count()


Out[15]:
0

In [16]:
# Removing failed or incomplete trials
df_origin = df.copy()
df = df_origin[df_origin["epochs"]>=30]
df.shape


Out[16]:
(288, 42)

In [17]:
# Which trials failed, or are still ongoing?
df_origin['failed'] = df_origin["epochs"]<30
df_origin[df_origin['failed']]['epochs']


Out[17]:
Series([], Name: epochs, dtype: int64)

In [18]:
# helper functions
def mean_and_std(s):
    return "{:.3f} ± {:.3f}".format(s.mean(), s.std())

def round_mean(s):
    return "{:.0f}".format(round(s.mean()))

stats = ['min', 'max', 'mean', 'std']

def agg(columns, filter=None, round=3):
    data = df if filter is None else df[filter]
    return (data.groupby(columns)
                .agg({'val_acc_max_epoch': round_mean,
                      'val_acc_max': stats,
                      'model': ['count']})).round(round)
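
The filter argument takes a boolean mask over df, useful for slicing before aggregating. An illustrative call (not from the original run):

agg(['model'], filter=(df['on_perc'] == 0.1))
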
Does improved weight pruning outperform regular SET?

In [20]:
agg(['model'])


Out[20]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
model
DSNNMixedHeb 49 0.972 0.986 0.982 0.003 144
DSNNWeightedMag 48 0.973 0.986 0.981 0.003 144

In [22]:
agg(['on_perc', 'model'])


Out[22]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
on_perc model
0.1 DSNNMixedHeb 46 0.972 0.984 0.980 0.003 72
DSNNWeightedMag 47 0.973 0.983 0.980 0.003 72
0.2 DSNNMixedHeb 53 0.980 0.986 0.983 0.002 72
DSNNWeightedMag 49 0.979 0.986 0.983 0.001 72

In [24]:
agg(['weight_prune_perc', 'model'])


Out[24]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
weight_prune_perc model
0.0 DSNNMixedHeb 38 0.972 0.982 0.978 0.003 24
DSNNWeightedMag 42 0.973 0.983 0.978 0.003 24
0.1 DSNNMixedHeb 48 0.980 0.986 0.982 0.002 24
DSNNWeightedMag 49 0.979 0.985 0.983 0.002 24
0.2 DSNNMixedHeb 48 0.979 0.985 0.982 0.002 24
DSNNWeightedMag 51 0.980 0.985 0.982 0.002 24
0.3 DSNNMixedHeb 55 0.980 0.985 0.983 0.001 24
DSNNWeightedMag 52 0.980 0.986 0.983 0.002 24
0.4 DSNNMixedHeb 53 0.981 0.985 0.983 0.002 24
DSNNWeightedMag 52 0.979 0.985 0.982 0.002 24
0.5 DSNNMixedHeb 53 0.979 0.985 0.982 0.002 24
DSNNWeightedMag 43 0.978 0.984 0.981 0.002 24

In [28]:
agg(['on_perc', 'pruning_early_stop', 'model'])


Out[28]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
on_perc pruning_early_stop model
0.1 0 DSNNMixedHeb 47 0.972 0.983 0.980 0.003 18
DSNNWeightedMag 52 0.973 0.982 0.979 0.003 18
1 DSNNMixedHeb 43 0.973 0.983 0.980 0.003 18
DSNNWeightedMag 54 0.974 0.983 0.980 0.003 18
2 DSNNMixedHeb 50 0.974 0.984 0.980 0.003 18
DSNNWeightedMag 38 0.973 0.983 0.980 0.003 18
3 DSNNMixedHeb 43 0.974 0.982 0.980 0.003 18
DSNNWeightedMag 46 0.975 0.983 0.980 0.002 18
0.2 0 DSNNMixedHeb 46 0.980 0.985 0.983 0.002 18
DSNNWeightedMag 48 0.980 0.985 0.983 0.001 18
1 DSNNMixedHeb 54 0.980 0.985 0.984 0.002 18
DSNNWeightedMag 49 0.979 0.986 0.983 0.001 18
2 DSNNMixedHeb 53 0.980 0.986 0.983 0.002 18
DSNNWeightedMag 49 0.979 0.985 0.983 0.002 18
3 DSNNMixedHeb 57 0.980 0.985 0.984 0.001 18
DSNNWeightedMag 52 0.981 0.985 0.983 0.001 18
  • No significant difference between the two approaches (a quick significance check is sketched below)
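
To put a number on "no significant difference", the two models' val_acc_max distributions can be compared with Welch's t-test. A minimal sketch, assuming scipy is available (it is not imported above):

from scipy import stats

heb = df[df['model'] == 'DSNNMixedHeb']['val_acc_max']
mag = df[df['model'] == 'DSNNWeightedMag']['val_acc_max']
# Welch's t-test (unequal variances); a large p-value is consistent
# with the "no significant difference" reading above
t, p = stats.ttest_ind(heb, mag, equal_var=False)
print("t = {:.3f}, p = {:.3f}".format(t, p))
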
What is the impact of early stopping?

In [29]:
agg(['pruning_early_stop'])


Out[29]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
pruning_early_stop
0 48 0.972 0.985 0.982 0.003 72
1 50 0.973 0.986 0.982 0.003 72
2 48 0.973 0.986 0.981 0.003 72
3 49 0.974 0.985 0.981 0.003 72

In [32]:
agg(['model', 'pruning_early_stop'])


Out[32]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
model pruning_early_stop
DSNNMixedHeb 0 46 0.972 0.985 0.982 0.003 36
1 49 0.973 0.985 0.982 0.003 36
2 52 0.974 0.986 0.982 0.003 36
3 50 0.974 0.985 0.982 0.003 36
DSNNWeightedMag 0 50 0.973 0.985 0.981 0.003 36
1 51 0.974 0.986 0.982 0.003 36
2 44 0.973 0.985 0.981 0.003 36
3 49 0.975 0.985 0.981 0.002 36

In [30]:
agg(['on_perc', 'pruning_early_stop'])


Out[30]:
val_acc_max_epoch val_acc_max model
round_mean min max mean std count
on_perc pruning_early_stop
0.1 0 49 0.972 0.983 0.980 0.003 36
1 48 0.973 0.983 0.980 0.003 36
2 44 0.973 0.984 0.980 0.003 36
3 44 0.974 0.983 0.980 0.002 36
0.2 0 47 0.980 0.985 0.983 0.001 36
1 51 0.979 0.986 0.983 0.001 36
2 51 0.979 0.986 0.983 0.002 36
3 54 0.980 0.985 0.983 0.001 36
  • Results are not strong enough to be conclusive. There is a slight increase in accuracy (~0.1%) when pruning stops at the first learning rate decay for the DSNNWeightedMag model and at the second learning rate decay for the DSNNMixedHeb model, but the gain is smaller than the standard deviation across runs (compared directly in the sketch below).
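
That comparison can be made explicit with the df already loaded. A minimal sketch:

# spread of the per-setting mean val_acc_max across pruning_early_stop
# values, side by side with the per-model std of individual runs
by_es = df.groupby(['model', 'pruning_early_stop'])['val_acc_max'].mean()
gain = by_es.groupby('model').agg(lambda s: s.max() - s.min())
noise = df.groupby('model')['val_acc_max'].std()
print(pd.concat([gain.rename('gain'), noise.rename('std')], axis=1))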
