In this assignment, you will build a Variational Autoencoder (VAE), train it on the MNIST dataset, and experiment with its architecture and hyperparameters.
In [1]:
%tensorflow_version 1.x
In [2]:
try:
    import google.colab
    IN_COLAB = True
except:
    IN_COLAB = False
if IN_COLAB:
    print("Downloading Colab files")
    ! shred -u setup_google_colab.py
    ! wget https://raw.githubusercontent.com/hse-aml/bayesian-methods-for-ml/master/setup_google_colab.py -O setup_google_colab.py
    import setup_google_colab
    setup_google_colab.load_data_week5()
In [3]:
import tensorflow as tf
import keras
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense, Lambda, InputLayer, concatenate
from keras.models import Model, Sequential
from keras import backend as K
from keras import metrics
from keras.datasets import mnist
from keras.utils import np_utils
from w5_grader import VAEGrader
We will create a grader instance below and use it to collect your answers. Note that these outputs will be stored locally inside the grader and will be uploaded to the platform only after running the submit function in the last part of this assignment. If you want to make a partial submission, you can run that cell at any time.
In [0]:
grader = VAEGrader()
Recall that a Variational Autoencoder is a probabilistic model of data based on a continuous mixture of distributions. In the lecture we covered the mixture of Gaussians case, but here we will apply a VAE to binary MNIST images (each pixel is either black or white). To better model binary data we will use a continuous mixture of binomial distributions: $p(x \mid w) = \int p(x \mid t, w) p(t) dt$, where the prior distribution on the latent code $t$ is standard normal, $p(t) = \mathcal{N}(0, I)$, and the probability that the $(i, j)$-th pixel is black equals the $(i, j)$-th output of the decoder neural network: $p(x_{i, j} \mid t, w) = \text{decoder}(t, w)_{i, j}$.
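Written out per image, this likelihood factorizes over pixels, with each pixel an independent Bernoulli variable given the latent code:
$$p(x \mid t, w) = \prod_{i, j} \text{decoder}(t, w)_{i, j}^{\,x_{i, j}} \left(1 - \text{decoder}(t, w)_{i, j}\right)^{1 - x_{i, j}}.$$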
To train this model we would like to maximize the marginal log-likelihood of our dataset, $\max_w \log p(X \mid w)$, but this is computationally intractable. Instead we maximize the Variational Lower Bound w.r.t. both the original parameters $w$ and a variational distribution $q$, which we define as an encoder neural network with parameters $\phi$ that takes an input image $x$ and outputs the parameters of the Gaussian distribution $q(t \mid x, \phi)$: $\log p(X \mid w) \geq \mathcal{L}(w, \phi) \rightarrow \max_{w, \phi}$.
So overall our model looks as follows: the encoder takes an image $x$ and produces a distribution over latent codes $q(t \mid x)$, which should approximate the posterior distribution $p(t \mid x)$ (at least after training); a point is sampled from this distribution, $\widehat{t} \sim q(t \mid x, \phi)$; and finally this sample is fed into a decoder that outputs a distribution over images.
In the lecture, we also discussed that the variational lower bound contains an expected value, which we are going to approximate with sampling. This is not trivial, since we need to differentiate through the approximation. However, the reparametrization trick suggests that instead of sampling from the distribution $\widehat{t} \sim q(t \mid x, \phi)$, we sample from a distribution that doesn't depend on any parameters, e.g. the standard normal, and then deterministically transform this sample into the desired one: $\varepsilon \sim \mathcal{N}(0, I); ~~\widehat{t} = m(x, \phi) + \varepsilon \sigma(x, \phi)$. This way our stochastic gradient stays unbiased and we can straightforwardly differentiate the loss w.r.t. all the parameters while treating the current sample $\varepsilon$ as a constant.
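To see the trick in isolation, here is a minimal NumPy sketch (separate from the model code below; the names m and sigma are illustrative): it draws parameter-free standard normal noise and transforms it deterministically, and the empirical statistics match the target distribution.
In [0]:
import numpy as np

np.random.seed(0)
m, sigma = 2.0, 0.5            # target mean and standard deviation of q
eps = np.random.randn(100000)  # eps ~ N(0, I): sampling involves no parameters
t_hat = m + sigma * eps        # deterministic, differentiable transform of the fixed sample
print(t_hat.mean(), t_hat.std())  # approximately 2.0 and 0.5, i.e. t_hat ~ N(m, sigma^2)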
Task 1 Derive and implement the Variational Lower Bound for the continuous mixture of binomial distributions.
Note that in the lectures we discussed maximizing the VLB (which is typically a negative number), but in this assignment, for convenience, we will instead minimize the negated VLB (which will be a positive number). In what follows we always talk about the negated VLB, even when we use the term VLB for short.
Also note that to pass the test, your code should work with any mini-batch size.
To do that, we need a stochastic estimate of the VLB: $$\text{VLB} = \sum_{i=1}^N \text{VLB}_i \approx \frac{N}{M}\sum_{s=1}^M \text{VLB}_{i_s},$$ where $N$ is the dataset size, $\text{VLB}_i$ is the term of the VLB corresponding to the $i$-th object, $M$ is the mini-batch size, and $i_1, \dots, i_M$ are the indices of the objects in the mini-batch. But instead of this stochastic estimate of the full VLB we will use an estimate of the negated VLB normalized by the dataset size, i.e. in the function below you need to return the average across the mini-batch: $-\frac{1}{M}\sum_{s=1}^M \text{VLB}_{i_s}$. People usually optimize this normalized version of the VLB since it doesn't depend on the dataset size: you can write the VLB function once and use it for different datasets, and the dataset size won't affect the learning rate too much. The correct value for this normalized negated VLB should be around 100–170 in the example below.
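For reference, each $\text{VLB}_i$ splits into a reconstruction term and a KL term, and for a diagonal Gaussian encoder the KL divergence from the standard normal prior is available in closed form (the expectation is approximated with the single reparametrized sample $\widehat{t}$):
$$-\text{VLB}_i = -\mathbb{E}_{q(t \mid x_i)} \log p(x_i \mid t, w) + KL\big(q(t \mid x_i) \,\|\, p(t)\big),$$
$$\log p(x_i \mid \widehat{t}, w) = \sum_{j} \Big[ x_{ij} \log d_{ij} + (1 - x_{ij}) \log (1 - d_{ij}) \Big], \qquad d = \text{decoder}(\widehat{t}, w),$$
$$KL\big(q(t \mid x_i) \,\|\, p(t)\big) = \frac{1}{2} \sum_{k} \big( \exp(s_{ik}) + m_{ik}^2 - 1 - s_{ik} \big),$$
where $m_i$ and $s_i$ are the mean and log-variance vectors produced by the encoder for the $i$-th object, and $k$ runs over the latent dimensions.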
In [0]:
def vlb_binomial(x, x_decoded_mean, t_mean, t_log_var):
    """Returns the value of the negated Variational Lower Bound.

    The inputs are tf.Tensor
        x: (batch_size x number_of_pixels) matrix with one image per row with zeros and ones
        x_decoded_mean: (batch_size x number_of_pixels) mean of the distribution p(x | t), real numbers from 0 to 1
        t_mean: (batch_size x latent_dim) mean vector of the (normal) distribution q(t | x)
        t_log_var: (batch_size x latent_dim) logarithm of the variance vector of the (normal) distribution q(t | x)
    Returns:
        A tf.Tensor with one element (averaged across the batch), the negated VLB
    """
    ### YOUR CODE HERE
    # Closed-form KL(q(t | x) || N(0, I)) for a diagonal Gaussian, summed over latent dimensions.
    kl = K.mean(0.5 * K.sum(-t_log_var + K.exp(t_log_var) + K.square(t_mean) - 1, axis=1))
    # Negated expected reconstruction log-likelihood (binary cross-entropy),
    # with a small epsilon inside the logs for numerical stability.
    eq = K.mean(-K.sum(x * K.log(x_decoded_mean + 1e-6)
                       + (1 - x) * K.log(1 - x_decoded_mean + 1e-6), axis=1))
    return eq + kl
In [0]:
# Start a TF session so we can run code.
# (On TF 2.x one would instead use the tf.compat.v1 equivalents.)
sess = tf.InteractiveSession()
# Connect Keras to the created session.
K.set_session(sess)
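As an optional sanity check before submitting, you can evaluate vlb_binomial on a tiny hand-made batch (the values below are illustrative only): with zero mean and zero log-variance the KL term vanishes, so the result is just the average negated reconstruction log-likelihood.
In [0]:
# Optional sanity check on a toy batch of two 4-pixel "images".
x_check = tf.constant([[0., 1., 1., 0.], [1., 0., 0., 1.]])
x_dec_check = tf.constant([[0.1, 0.9, 0.8, 0.2], [0.7, 0.3, 0.4, 0.6]])
t_mean_check = tf.zeros((2, 3))     # q = N(0, I), so KL(q || p) = 0
t_log_var_check = tf.zeros((2, 3))
print(sess.run(vlb_binomial(x_check, x_dec_check, t_mean_check, t_log_var_check)))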
In [7]:
grader.submit_vlb(sess, vlb_binomial)
Task 2 Read the code below that defines the encoder and decoder networks and implement sampling with the reparametrization trick in the provided space.
In [8]:
batch_size = 100
original_dim = 784      # Number of pixels in MNIST images.
latent_dim = 100        # d, dimensionality of the latent code t.
intermediate_dim = 256  # Size of the hidden layer.
epochs = 50

x = Input(batch_shape=(batch_size, original_dim))

def create_encoder(input_dim):
    # Encoder network.
    # We instantiate these layers separately so as to reuse them later.
    encoder = Sequential(name='encoder')
    encoder.add(InputLayer([input_dim]))
    encoder.add(Dense(intermediate_dim, activation='relu'))
    # The output holds both the means and the log-variances, hence 2 * latent_dim.
    encoder.add(Dense(2 * latent_dim))
    return encoder

encoder = create_encoder(original_dim)

get_t_mean = Lambda(lambda h: h[:, :latent_dim])
get_t_log_var = Lambda(lambda h: h[:, latent_dim:])
h = encoder(x)
t_mean = get_t_mean(h)
t_log_var = get_t_log_var(h)

# Sampling from the distribution
#     q(t | x) = N(t_mean, exp(t_log_var))
# with the reparametrization trick.
def sampling(args):
    """Returns sample from a distribution N(args[0], diag(args[1]))

    The sample should be computed with the reparametrization trick.

    The inputs are tf.Tensor
        args[0]: (batch_size x latent_dim) mean of the desired distribution
        args[1]: (batch_size x latent_dim) logarithm of the variance vector of the desired distribution

    Returns:
        A tf.Tensor of size (batch_size x latent_dim), the samples.
    """
    t_mean, t_log_var = args
    # YOUR CODE HERE
    # t_log_var is the log of the variance, so exp(0.5 * t_log_var) is the standard deviation.
    return t_mean + K.exp(0.5 * t_log_var) * K.random_normal(t_mean.shape)

t = Lambda(sampling)([t_mean, t_log_var])

def create_decoder(input_dim):
    # Decoder network.
    # We instantiate these layers separately so as to reuse them later.
    decoder = Sequential(name='decoder')
    decoder.add(InputLayer([input_dim]))
    decoder.add(Dense(intermediate_dim, activation='relu'))
    # Sigmoid activation so each output pixel is a valid probability.
    decoder.add(Dense(original_dim, activation='sigmoid'))
    return decoder

decoder = create_decoder(latent_dim)
x_decoded_mean = decoder(t)
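Optionally, you can check the sampler empirically before submitting: with a constant mean and log-variance, the statistics of many reparametrized samples should match the target distribution. The check below is a sketch, not part of the assignment.
In [0]:
# Optional: empirical check of the reparametrized sampler.
check_mean = 3.0 * tf.ones((batch_size, latent_dim))
check_log_var = float(np.log(0.25)) * tf.ones((batch_size, latent_dim))  # variance 0.25, std 0.5
check_t = sampling([check_mean, check_log_var])
samples = sess.run(check_t)
print(samples.shape)                  # (100, 100)
print(samples.mean(), samples.std())  # roughly 3.0 and 0.5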
In [9]:
grader.submit_samples(sess, sampling)
Task 3 Run the cells below to train the model with the default settings, then modify the parameters to get better results. Pay special attention to the encoder/decoder architectures (e.g. using more layers, maybe making them convolutional), the learning rate, and the number of epochs.
In [0]:
loss = vlb_binomial(x, x_decoded_mean, t_mean, t_log_var)
vae = Model(x, x_decoded_mean)
# Keras will provide the input (x) and output (x_decoded_mean) to the function that
# should construct the loss, but since our loss also depends on other
# tensors (e.g. t_mean), it is easier to build the loss tensor in advance and pass
# a function that always returns it.
vae.compile(optimizer=keras.optimizers.RMSprop(lr=0.001), loss=lambda x, y: loss)
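An equivalent wiring, used in the official Keras VAE example, is to register the precomputed loss tensor with add_loss and compile without a loss argument. The sketch below is an alternative, not a required step; vae_alt is an illustrative name.
In [0]:
# Alternative (sketch): attach the symbolic loss tensor directly to the model.
vae_alt = Model(x, x_decoded_mean)
vae_alt.add_loss(loss)  # loss already depends on x and x_decoded_mean
vae_alt.compile(optimizer=keras.optimizers.RMSprop(lr=0.001))  # no loss argument needed
# vae_alt.fit(x_train, None, epochs=epochs, batch_size=batch_size)  # targets are None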
In [0]:
# train the VAE on MNIST digits
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# One hot encoding.
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
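Since the likelihood above models binary images, you may optionally threshold the grayscale intensities to exact zeros and ones; the [0, 1] intensities also work well in practice. A sketch (x_train_bin and x_test_bin are illustrative names):
In [0]:
# Optional preprocessing: binarize pixels so the data matches the binomial likelihood exactly.
x_train_bin = (x_train > 0.5).astype('float32')
x_test_bin = (x_test > 0.5).astype('float32')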
In [12]:
hist = vae.fit(x=x_train, y=x_train,
               shuffle=True,
               epochs=epochs,
               batch_size=batch_size,
               validation_data=(x_test, x_test),
               verbose=2)
In the picture below you can see the reconstruction ability of your network on training and validation data. In each of the two images, the left column contains MNIST images and the right column contains the corresponding images after passing through the autoencoder (or, more precisely, the means of the binomial distributions over the output images).
Note that getting the best possible reconstruction is not the point of a VAE: the KL term of the objective specifically hurts reconstruction performance. But the reconstructions should still be reasonable, and they provide a useful visual debugging tool.
In [13]:
fig = plt.figure(figsize=(10, 10))
for fid_idx, (data, title) in enumerate(
        zip([x_train, x_test], ['Train', 'Validation'])):
    n = 10  # figure with 10 x 2 digits
    digit_size = 28
    figure = np.zeros((digit_size * n, digit_size * 2))
    decoded = sess.run(x_decoded_mean, feed_dict={x: data[:batch_size, :]})
    for i in range(10):
        figure[i * digit_size: (i + 1) * digit_size,
               :digit_size] = data[i, :].reshape(digit_size, digit_size)
        figure[i * digit_size: (i + 1) * digit_size,
               digit_size:] = decoded[i, :].reshape(digit_size, digit_size)
    ax = fig.add_subplot(1, 2, fid_idx + 1)
    ax.imshow(figure, cmap='Greys_r')
    ax.set_title(title)
    ax.axis('off')
plt.show()
In [14]:
grader.submit_best_val_loss(hist)
Task 4 Write code to generate new samples of images from your trained VAE. To do that you have to sample from the prior distribution $p(t)$ and then from the likelihood $p(x \mid t)$.
Note that the sampling you've written in Task 2 was for the variational distribution $q(t \mid x)$, while here you need to sample from the prior.
In [0]:
n_samples = 10  # To pass automatic grading please use at least 2 samples here.
# YOUR CODE HERE.
# sampled_im_mean is a tf.Tensor of size 10 x 784 with 10 random
# images sampled from the vae model.
# Sample latent codes from the prior p(t) = N(0, I), then decode them.
z = tf.random_normal((n_samples, latent_dim))
sampled_im_mean = decoder(z)
In [16]:
sampled_im_mean_np = sess.run(sampled_im_mean)
# Show the sampled images.
plt.figure()
for i in range(n_samples):
    ax = plt.subplot(n_samples // 5 + 1, 5, i + 1)
    plt.imshow(sampled_im_mean_np[i, :].reshape(28, 28), cmap='gray')
    ax.axis('off')
plt.show()
In [17]:
grader.submit_hallucinating(sess, sampled_im_mean)
In the final task, you will modify your code to obtain a Conditional Variational Autoencoder (CVAE) [1]. The idea is very simple: to be able to control the samples you generate, we condition all the distributions on some additional information. In our case, this additional information will be the class label (the digit on the image, from 0 to 9).
So now both the likelihood and the variational distributions are conditioned on the class label: $p(x \mid t, \text{label}, w)$, $q(t \mid x, \text{label}, \phi)$.
The only thing you have to change in your code is to concatenate the input image $x$ with the (one-hot) label of this image before passing it into the encoder $q$, and to concatenate the latent code $t$ with the same label before passing it into the decoder $p$. Note that this is slightly harder to do with a convolutional encoder/decoder model.
[1] Sohn, Kihyuk, Honglak Lee, and Xinchen Yan. “Learning Structured Output Representation using Deep Conditional Generative Models.” Advances in Neural Information Processing Systems. 2015.
Task 5.1 Implement the CVAE model. You may reuse the create_encoder and create_decoder functions defined previously (now you can see why they accept the input size as an argument ;) ). You may also need the concatenate Keras layer to concatenate the labels with the input data and the latent code.
To finish this task, you should go to the Conditionally hallucinate data section and find Task 5.2 there.
In [0]:
x = Input(batch_shape=(batch_size, original_dim))
# One-hot labels placeholder.
label = Input(batch_shape=(batch_size, 10))

# YOUR CODE HERE.
# The encoder takes the image concatenated with its one-hot label.
cencoder = create_encoder(original_dim + 10)
stacked_x = concatenate([x, label])
h = cencoder(stacked_x)
cond_t_mean = get_t_mean(h)        # Mean of the latent code (without label) for the cvae model.
cond_t_log_var = get_t_log_var(h)  # Logarithm of the variance of the latent code (without label) for the cvae model.
t = Lambda(sampling)([cond_t_mean, cond_t_log_var])
# The decoder takes the latent code concatenated with the same label.
stacked_t = concatenate([t, label])
cdecoder = create_decoder(latent_dim + 10)
cond_x_decoded_mean = cdecoder(stacked_t)  # Final output of the cvae model.
In [0]:
conditional_loss = vlb_binomial(x, cond_x_decoded_mean, cond_t_mean, cond_t_log_var)
cvae = Model([x, label], cond_x_decoded_mean)
cvae.compile(optimizer=keras.optimizers.RMSprop(lr=0.001), loss=lambda x, y: conditional_loss)
In [20]:
hist = cvae.fit(x=[x_train, y_train],
                y=x_train,
                shuffle=True,
                epochs=epochs,
                batch_size=batch_size,
                validation_data=([x_test, y_test], x_test),
                verbose=2)
In [21]:
fig = plt.figure(figsize=(10, 10))
for fid_idx, (x_data, y_data, title) in enumerate(
        zip([x_train, x_test], [y_train, y_test], ['Train', 'Validation'])):
    n = 10  # figure with 10 x 2 digits
    digit_size = 28
    figure = np.zeros((digit_size * n, digit_size * 2))
    decoded = sess.run(cond_x_decoded_mean,
                       feed_dict={x: x_data[:batch_size, :],
                                  label: y_data[:batch_size, :]})
    for i in range(10):
        figure[i * digit_size: (i + 1) * digit_size,
               :digit_size] = x_data[i, :].reshape(digit_size, digit_size)
        figure[i * digit_size: (i + 1) * digit_size,
               digit_size:] = decoded[i, :].reshape(digit_size, digit_size)
    ax = fig.add_subplot(1, 2, fid_idx + 1)
    ax.imshow(figure, cmap='Greys_r')
    ax.set_title(title)
    ax.axis('off')
plt.show()
In [0]:
# Prepare one-hot labels of the form
#   0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 ...
# to sample five zeros, five ones, etc.
curr_labels = np.eye(10)
curr_labels = np.repeat(curr_labels, 5, axis=0)  # Its shape is 50 x 10.
# YOUR CODE HERE.
# cond_sampled_im_mean is a tf.Tensor of size 50 x 784 with 5 random zeros,
# then 5 random ones, etc sampled from the cvae model.
# Sample latent codes from the prior and decode them together with the labels.
z = tf.random_normal((50, latent_dim))
labels = tf.convert_to_tensor(curr_labels, dtype=tf.float32)
stacked_z = concatenate([z, labels])
cond_sampled_im_mean = cdecoder(stacked_z)
In [23]:
cond_sampled_im_mean_np = sess.run(cond_sampled_im_mean)
# Show the sampled images.
plt.figure(figsize=(10, 10))
global_idx = 0
for digit in range(10):
    for _ in range(5):
        ax = plt.subplot(10, 5, global_idx + 1)
        plt.imshow(cond_sampled_im_mean_np[global_idx, :].reshape(28, 28), cmap='gray')
        ax.axis('off')
        global_idx += 1
plt.show()
In [24]:
# Submit Task 5 (both 5.1 and 5.2).
grader.submit_conditional_hallucinating(sess, cond_sampled_im_mean)
In [25]:
STUDENT_EMAIL = ""  # EMAIL HERE
STUDENT_TOKEN = ""  # TOKEN HERE
grader.status()
In [26]:
grader.submit(STUDENT_EMAIL, STUDENT_TOKEN)