From ~/Downloads/github/pylearn2/pylearn2/scripts/tutorials/stacked_autoencoders


In [10]:
target = 'Dog_1'

In [11]:
target2batch_size = {'Dog_1':128}

Stacked Autoencoders

by Mehdi Mirza

Introduction

This notebook shows how to perform layer-wise pre-training using denoising autoencoders (DAEs), and then stack the trained layers into a multilayer perceptron (MLP) that can be fine-tuned with supervised training. For more depth, you can also look at the Theano tutorial on training DAEs, as well as the tutorial covering the stacked version.
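Before diving into the pylearn2 machinery, it may help to see what one denoising-autoencoder update actually computes. The sketch below is a toy numpy stand-in (made-up sizes, tied weights, not the pylearn2 implementation): corrupt the input by masking, encode with tanh, decode linearly, and take an SGD step on the mean squared reconstruction error against the *clean* input.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy minibatch standing in for the real data (the notebook's input
# has nvis = 3996 features; these sizes are made up for illustration).
X = rng.randn(128, 20)

n_vis, n_hid = 20, 8
W = rng.uniform(-0.05, 0.05, size=(n_vis, n_hid))   # like irange: 0.05
b_hid = np.zeros(n_hid)
b_vis = np.zeros(n_vis)

def dae_step(X, W, b_hid, b_vis, corruption_level=0.2, lr=0.05):
    """One SGD step on the mean squared reconstruction error."""
    # Binomial (masking) corruption: zero each input with probability 0.2.
    mask = rng.binomial(1, 1.0 - corruption_level, size=X.shape)
    X_tilde = X * mask
    H = np.tanh(X_tilde.dot(W) + b_hid)   # act_enc: tanh
    R = H.dot(W.T) + b_vis                # tied weights, linear decoder
    err = R - X                           # reconstruct the clean input
    loss = (err ** 2).sum(axis=1).mean()
    # Backprop through decoder and encoder (tied-weight gradient).
    dR = 2.0 * err / X.shape[0]
    dH = dR.dot(W) * (1.0 - H ** 2)
    gW = X_tilde.T.dot(dH) + dR.T.dot(H)
    W -= lr * gW
    b_hid -= lr * dH.sum(axis=0)
    b_vis -= lr * dR.sum(axis=0)
    return loss

losses = [dae_step(X, W, b_hid, b_vis) for _ in range(200)]
```

The reconstruction error drifts downward over the 200 steps, which is the same quantity the monitor below reports as `objective`.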

The methods used here can easily be adapted to other models such as contractive auto-encoders (CAEs) or restricted Boltzmann machines (RBMs) with only small modifications.
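For instance, turning the DAE into a contractive autoencoder mainly means dropping the input corruption and adding a contraction penalty on the encoder's Jacobian to the cost. A rough numpy sketch of that penalty (toy sizes, not the pylearn2 API): for h = tanh(xW + b) the Jacobian rows are (1 - h_j^2) * W[:, j], so its squared Frobenius norm factorizes nicely.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(5, 6)                       # toy minibatch
W = rng.uniform(-0.05, 0.05, (6, 4))
b = np.zeros(4)

H = np.tanh(X.dot(W) + b)

# Contractive penalty: squared Frobenius norm of the encoder Jacobian,
# averaged over the batch.  ||J||_F^2 = sum_j (1 - h_j^2)^2 * sum_i W_ij^2.
penalty = ((1.0 - H ** 2) ** 2).dot((W ** 2).sum(axis=0)).mean()
```

Adding `penalty` (times a weighting coefficient) to the reconstruction cost is the "small modification" in question; the rest of the training loop stays the same.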

First layer

The first layer and its training algorithm are defined in the YAML file my_dae_l1.yaml. Here we load that file and fill in some of its hyperparameters.


In [13]:
layer1_yaml = open('my_dae_l1.yaml', 'r').read()
hyper_params_l1 = {
                   'batch_size' : target2batch_size.get(target, 128),
                   'monitoring_batches' : 5,
                   'nvis' : 3996,
                   'target' : target,
                   'nhid' : 1000,
                   'max_epochs' : 10,
                   'save_path' : '../data-cache'}
layer1_yaml = layer1_yaml % (hyper_params_l1)
print layer1_yaml


!obj:pylearn2.train.Train {
    dataset: &train !obj:my_pylearn2_dataset.MyPyLearn2Dataset {
        target: "Dog_1",
        # TODO: the one_hot: 1 is only necessary because one_hot: 0 is
        # broken, remove it after one_hot: 0 is fixed.
        one_hot: 1,
        skip: 20,
    },
    model: !obj:pylearn2.models.autoencoder.DenoisingAutoencoder {
        nvis : 3996,
        nhid : 1000,
        irange : 0.05,
        corruptor: !obj:pylearn2.corruption.BinomialCorruptor {
            corruption_level: .2,
        },
        act_enc: "tanh",
        act_dec: null,    # Linear activation on the decoder side.
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate : 1e-3,
        batch_size : 128,
        monitoring_batches : 5,
        monitoring_dataset : *train,
        cost : !obj:pylearn2.costs.autoencoder.MeanSquaredReconstructionError {},
        termination_criterion : !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
    save_path: "../data-cache/Dog_1_dae_l1.pkl",
    save_freq: 1
}

Now we can train the model from the YAML string, just as in the previous tutorials:


In [14]:
from pylearn2.config import yaml_parse
train = yaml_parse.load(layer1_yaml)
train.main_loop()


Number of segments 504
Number of channels 16
Number of examples 24192
Number of features 3996
(24192, 3996) (24192, 2) [ 0.5  0.5  0.5 ...,  0.5  0.5  0.5]
time 52s
Parameter and initial learning rate summary:
	vb: 0.001
	hb: 0.001
	W: 0.001
	Wprime: 0.001
Compiling sgd_update...
Compiling sgd_update done. Time elapsed: 0.300831 seconds
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.036552 seconds
Monitored channels: 
	learning_rate
	objective
	total_seconds_last_epoch
	training_seconds_this_epoch
Compiling accum...
graph size: 19
Compiling accum done. Time elapsed: 0.249010 seconds
Monitoring step:
	Epochs seen: 0
	Batches seen: 0
	Examples seen: 0
	learning_rate: 0.001
	objective: 4244.32909518
	total_seconds_last_epoch: 0.0
	training_seconds_this_epoch: 0.0
Time this epoch: 19.008504 seconds
Monitoring step:
	Epochs seen: 1
	Batches seen: 189
	Examples seen: 24192
	learning_rate: 0.001
	objective: 2562.62217026
	total_seconds_last_epoch: 0.0
	training_seconds_this_epoch: 19.008504
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.692388 seconds
Time this epoch: 21.530655 seconds
Monitoring step:
	Epochs seen: 2
	Batches seen: 378
	Examples seen: 48384
	learning_rate: 0.001
	objective: 2079.31153631
	total_seconds_last_epoch: 33.847182
	training_seconds_this_epoch: 21.530655
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.628441 seconds
Time this epoch: 23.530632 seconds
Monitoring step:
	Epochs seen: 3
	Batches seen: 567
	Examples seen: 72576
	learning_rate: 0.001
	objective: 1811.81045629
	total_seconds_last_epoch: 37.811571
	training_seconds_this_epoch: 23.530632
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.694599 seconds
Time this epoch: 20.020923 seconds
Monitoring step:
	Epochs seen: 4
	Batches seen: 756
	Examples seen: 96768
	learning_rate: 0.001
	objective: 1625.58644433
	total_seconds_last_epoch: 39.845403
	training_seconds_this_epoch: 20.020923
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.468819 seconds
Time this epoch: 20.248166 seconds
Monitoring step:
	Epochs seen: 5
	Batches seen: 945
	Examples seen: 120960
	learning_rate: 0.001
	objective: 1514.78205021
	total_seconds_last_epoch: 34.826727
	training_seconds_this_epoch: 20.248166
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.489679 seconds
Time this epoch: 19.717466 seconds
Monitoring step:
	Epochs seen: 6
	Batches seen: 1134
	Examples seen: 145152
	learning_rate: 0.001
	objective: 1410.87572642
	total_seconds_last_epoch: 35.065903
	training_seconds_this_epoch: 19.717466
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.349537 seconds
Time this epoch: 21.022654 seconds
Monitoring step:
	Epochs seen: 7
	Batches seen: 1323
	Examples seen: 169344
	learning_rate: 0.001
	objective: 1485.76221057
	total_seconds_last_epoch: 34.011133
	training_seconds_this_epoch: 21.022654
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.332337 seconds
Time this epoch: 18.538898 seconds
Monitoring step:
	Epochs seen: 8
	Batches seen: 1512
	Examples seen: 193536
	learning_rate: 0.001
	objective: 1293.59967292
	total_seconds_last_epoch: 35.659905
	training_seconds_this_epoch: 18.538898
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.448066 seconds
Time this epoch: 24.358437 seconds
Monitoring step:
	Epochs seen: 9
	Batches seen: 1701
	Examples seen: 217728
	learning_rate: 0.001
	objective: 1209.51421808
	total_seconds_last_epoch: 32.772341
	training_seconds_this_epoch: 24.358437
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.729353 seconds
Time this epoch: 23.948276 seconds
Monitoring step:
	Epochs seen: 10
	Batches seen: 1890
	Examples seen: 241920
	learning_rate: 0.001
	objective: 1162.51018096
	total_seconds_last_epoch: 40.645838
	training_seconds_this_epoch: 23.948276
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.511518 seconds
Saving to ../data-cache/Dog_1_dae_l1.pkl...
Saving to ../data-cache/Dog_1_dae_l1.pkl done. Time elapsed: 5.368170 seconds

Second layer

The second layer takes the output of the first layer as its input. Rather than transforming the data up front, we wrap it in a datasets.transformer_dataset.TransformerDataset, which applies the first layer's encoding on the fly. This class takes two arguments:

  • raw: the raw dataset
  • transformer: a Pylearn2 block that transforms the raw data, in our case the Dog_1_dae_l1.pkl model saved in the previous step

To train the second layer, we load the YAML file as before and set the hyperparameters before starting the training loop.
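Conceptually, TransformerDataset just feeds each raw batch through the trained layer-1 encoder before the second DAE ever sees it. A toy numpy stand-in (made-up class names and sizes, not the pylearn2 call signatures) makes the data flow explicit:

```python
import numpy as np

rng = np.random.RandomState(0)

class ToyEncoder(object):
    """Stand-in for the encoder half of the pickled layer-1 autoencoder."""
    def __init__(self, n_vis, n_hid):
        self.W = rng.uniform(-0.05, 0.05, (n_vis, n_hid))
        self.b = np.zeros(n_hid)
    def __call__(self, X):
        return np.tanh(X.dot(self.W) + self.b)   # act_enc: tanh

class ToyTransformerDataset(object):
    """Mimics TransformerDataset: wraps raw data plus a transformer block."""
    def __init__(self, raw, transformer):
        self.raw = raw
        self.transformer = transformer
    def batches(self, batch_size):
        for start in range(0, len(self.raw), batch_size):
            # The second-layer DAE only ever sees the transformed features.
            yield self.transformer(self.raw[start:start + batch_size])

raw = rng.randn(300, 20)                  # stand-in for the 3996-dim input
dataset = ToyTransformerDataset(raw, ToyEncoder(20, 8))
first = next(iter(dataset.batches(128)))  # shape (128, 8): hidden features
```

This is why `nvis` for the second layer below equals `nhid` of the first layer (1000): the second DAE's visible units are the first DAE's hidden units.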


In [15]:
layer2_yaml = open('my_dae_l2.yaml', 'r').read()
hyper_params_l2 = {
                   'target' : target,
                   'batch_size' : target2batch_size.get(target, 128),
                   'monitoring_batches' : 5,
                   'nvis' : hyper_params_l1['nhid'],
                   'nhid' : 500,
                   'max_epochs' : 10,
                   'save_path' : '../data-cache'}
layer2_yaml = layer2_yaml % (hyper_params_l2)
print layer2_yaml


!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.transformer_dataset.TransformerDataset {
        raw: !obj:my_pylearn2_dataset.MyPyLearn2Dataset {
            target: "Dog_1",
            # TODO: the one_hot: 1 is only necessary because one_hot: 0 is
            # broken, remove it after one_hot: 0 is fixed.
            one_hot: 1,
            skip: 20,
        },
        transformer: !pkl: "../data-cache/Dog_1_dae_l1.pkl"
    },
    model: !obj:pylearn2.models.autoencoder.DenoisingAutoencoder {
        nvis : 1000,
        nhid : 500,
        irange : 0.05,
        corruptor: !obj:pylearn2.corruption.BinomialCorruptor {
            corruption_level: .3,
        },
        act_enc: "tanh",
        act_dec: null,    # Linear activation on the decoder side.
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate : 1e-3,
        batch_size : 128,
        monitoring_batches : 5,
        monitoring_dataset : *train,
        cost : !obj:pylearn2.costs.autoencoder.MeanSquaredReconstructionError {},
        termination_criterion : !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
    save_path: "../data-cache/Dog_1_dae_l2.pkl",
    save_freq: 1
}


In [16]:
train = yaml_parse.load(layer2_yaml)
train.main_loop()


Number of segments 504
Number of channels 16
Number of examples 24192
Number of features 3996
(24192, 3996) (24192, 2) [ 0.5  0.5  0.5 ...,  0.5  0.5  0.5]
time 57s
Parameter and initial learning rate summary:
	vb: 0.001
	hb: 0.001
	W: 0.001
	Wprime: 0.001
Compiling sgd_update...
Compiling sgd_update done. Time elapsed: 0.428997 seconds
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.073893 seconds
Monitored channels: 
	learning_rate
	objective
	total_seconds_last_epoch
	training_seconds_this_epoch
Compiling accum...
graph size: 19
Compiling accum done. Time elapsed: 0.192914 seconds
Monitoring step:
	Epochs seen: 0
	Batches seen: 0
	Examples seen: 0
	learning_rate: 0.001
	objective: 562.973165818
	total_seconds_last_epoch: 0.0
	training_seconds_this_epoch: 0.0
Time this epoch: 7.786953 seconds
Monitoring step:
	Epochs seen: 1
	Batches seen: 189
	Examples seen: 24192
	learning_rate: 0.001
	objective: 518.411233016
	total_seconds_last_epoch: 0.0
	training_seconds_this_epoch: 7.786953
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.809702 seconds
Time this epoch: 7.130184 seconds
Monitoring step:
	Epochs seen: 2
	Batches seen: 378
	Examples seen: 48384
	learning_rate: 0.001
	objective: 485.426489799
	total_seconds_last_epoch: 13.967449
	training_seconds_this_epoch: 7.130184
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.792078 seconds
Time this epoch: 7.280351 seconds
Monitoring step:
	Epochs seen: 3
	Batches seen: 567
	Examples seen: 72576
	learning_rate: 0.001
	objective: 453.795471769
	total_seconds_last_epoch: 12.759258
	training_seconds_this_epoch: 7.280351
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.786692 seconds
Time this epoch: 7.185426 seconds
Monitoring step:
	Epochs seen: 4
	Batches seen: 756
	Examples seen: 96768
	learning_rate: 0.001
	objective: 423.199625832
	total_seconds_last_epoch: 12.858925
	training_seconds_this_epoch: 7.185426
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.783186 seconds
Time this epoch: 6.874295 seconds
Monitoring step:
	Epochs seen: 5
	Batches seen: 945
	Examples seen: 120960
	learning_rate: 0.001
	objective: 395.382596971
	total_seconds_last_epoch: 12.706713
	training_seconds_this_epoch: 6.874295
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.782726 seconds
Time this epoch: 7.172230 seconds
Monitoring step:
	Epochs seen: 6
	Batches seen: 1134
	Examples seen: 145152
	learning_rate: 0.001
	objective: 371.212530953
	total_seconds_last_epoch: 12.448513
	training_seconds_this_epoch: 7.17223
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.804245 seconds
Time this epoch: 7.120830 seconds
Monitoring step:
	Epochs seen: 7
	Batches seen: 1323
	Examples seen: 169344
	learning_rate: 0.001
	objective: 350.468058904
	total_seconds_last_epoch: 12.808857
	training_seconds_this_epoch: 7.12083
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.778049 seconds
Time this epoch: 7.187064 seconds
Monitoring step:
	Epochs seen: 8
	Batches seen: 1512
	Examples seen: 193536
	learning_rate: 0.001
	objective: 332.553496939
	total_seconds_last_epoch: 12.568535
	training_seconds_this_epoch: 7.187064
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.795121 seconds
Time this epoch: 7.924122 seconds
Monitoring step:
	Epochs seen: 9
	Batches seen: 1701
	Examples seen: 217728
	learning_rate: 0.001
	objective: 317.197505656
	total_seconds_last_epoch: 12.634313
	training_seconds_this_epoch: 7.924122
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.799766 seconds
Time this epoch: 6.698593 seconds
Monitoring step:
	Epochs seen: 10
	Batches seen: 1890
	Examples seen: 241920
	learning_rate: 0.001
	objective: 303.752628152
	total_seconds_last_epoch: 13.920985
	training_seconds_this_epoch: 6.698593
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.785581 seconds
Saving to ../data-cache/Dog_1_dae_l2.pkl...
Saving to ../data-cache/Dog_1_dae_l2.pkl done. Time elapsed: 0.789089 seconds