In [ ]:
"""This area sets up the Jupyter environment.
Please do not modify anything in this cell.
"""
import os
import sys
import time

# Add project to PYTHONPATH for future use
sys.path.insert(1, os.path.join(sys.path[0], '..'))

# Import miscellaneous modules
from IPython.core.display import display, HTML

# Set CSS styling
with open('../admin/custom.css', 'r') as f:
    style = """<style>\n{}\n</style>""".format(f.read())
    display(HTML(style))

# Plots will be shown inside the notebook
%matplotlib notebook
import matplotlib.pyplot as plt

Generative Adversarial Networks 1

In this notebook we will use what we have learned about artificial neural networks to explore generative modelling with generative adversarial networks.

Generative adversarial networks, or GANs, are a generative modelling methodology introduced by Goodfellow et al. [1] in 2014 that has garnered much interest in recent years. It is based on the idea of transforming samples of latent variables $\mathbf{z}$ to samples $\mathbf{x}$ of a probability distribution that we would like to learn. The transformation is done via a differentiable function, which typically is defined as an artificial neural network.

When viewed through the lens of game theory, a GAN consists of a generator and an adversary called the discriminator. The generator network $\mathbf{G}$ produces samples $\mathbf{x}$ by transforming latent variables $\mathbf{z}$ with the help of a neural network. The adversary, the discriminator network $\mathbf{D}$, attempts to discriminate between the samples $\mathbf{x}$ generated by $\mathbf{G}$ and the training data. In other words, the discriminator seeks to detect whether the input data is fake or real. At the same time, the generator attempts to fool the discriminator by generating plausible samples. A GAN has converged when the discriminator can no longer differentiate between real data and samples generated by the generator.

Distinguishing between fake and real data sounds like something we have done several times before; indeed, it is a binary classification problem. The original formulation of GANs as a zero-sum game can be seen below:

$$ \begin{equation*} \underset{\mathbf{G}}{\arg\min}\,\underset{\mathbf{D}}{\max} \frac{1}{N}\sum_{i=1}^{N} \left[\ln\mathbf{D}(\mathbf{x}_i)+\ln\left(1-\mathbf{D}(\mathbf{G}(\mathbf{z}_i))\right)\right] \end{equation*} $$

We can see that the discriminator tries to maximise the log-likelihood of giving the correct prediction, whilst the generator tries to minimise this quantity. In practice, the training is done by alternating optimisation between the generator and the discriminator.
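
One way to read this alternating optimisation is to split the objective into the two steps that are actually performed, dropping the terms that do not depend on the player being updated:

$$ \begin{align*} \text{D-step:}\quad & \underset{\mathbf{D}}{\max}\ \frac{1}{N}\sum_{i=1}^{N} \left[\ln\mathbf{D}(\mathbf{x}_i)+\ln\left(1-\mathbf{D}(\mathbf{G}(\mathbf{z}_i))\right)\right] && (\mathbf{G}\ \text{fixed}) \\ \text{G-step:}\quad & \underset{\mathbf{G}}{\min}\ \frac{1}{N}\sum_{i=1}^{N} \ln\left(1-\mathbf{D}(\mathbf{G}(\mathbf{z}_i))\right) && (\mathbf{D}\ \text{fixed}) \end{align*} $$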

GANs are notoriously difficult to train, so for this and the next notebook we are going to make some simplifications. First, we are going to train on what we consider to be easy datasets:

  • Notebook 1: A 1-d multimodal distribution
  • Notebook 2: The MNIST database of handwritten digits

Secondly, most of the implementation will already be done for you; the focus will be on testing out different kinds of network definitions for the generator and the discriminator. This notebook will start with the 1-d multimodal distribution dataset, while the next one will handle the familiar MNIST dataset.

[1] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, “Generative Adversarial Nets,” in: Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.

Example: Multimodal Distribution

In this notebook we will use a GAN to generate samples from a 1-d multimodal distribution.

We will start by loading our one-dimensional data from a CSV file.

In the following snippet of code we will:
  • Load data from a CSV file
  • Plot the normalised data histogram

In [ ]:
import pandas as pd
import numpy as np
import admin.tools as tools

data = pd.read_csv('resources/multinomial.csv', index_col=0)

Unpack the Pandas DataFrame to a NumPy array:


In [ ]:
X_data = data.values

Plot normalised histogram:


In [ ]:
fig, ax = plt.subplots(1,1)
tools.plot_normalized_histogram(ax, X_data)
ax.set_title('Data Histogram')
ax.set_xlabel('Data Value')
ax.set_ylabel('Normalized frequency')
ax.grid()

fig.canvas.draw()
time.sleep(0.04)

Task I: Implement a Generator Network

As previously mentioned, the generator network is built to map a latent space to a specific data distribution.

In this task we will make a network that takes as input a vector of `z_dim` dimensions and maps it to a pre-defined number of outputs. The number of outputs and their shape are defined by the data distribution we are learning.

Task:
  • Make a network that accepts inputs where the shape is defined by `z_dim` $\rightarrow$ `shape=(z_dim,)`
  • The number of outputs of your network needs to be defined as `nb_outputs`
  • Reshape the final layer to be in the shape of `output_shape`
  • Since the data lies in the range $[-1, 1]$, try using `tanh` as the final activation function.

Keras references: Reshape()


In [ ]:
# Import some useful Keras libraries
import keras
from keras.models import Model
from keras.layers import *


def generator(z_dim, nb_outputs, output_shape):
    
    # Define the latent input latent_var using Input()
    latent_var = None

    # Insert the desired number of layers for your network
    x = None

    # Map your last layer to nb_outputs
    x = None

    # Reshape your data to output_shape
    x = Reshape(output_shape)(x)

    model = Model(inputs=latent_var, outputs=x)

    return model
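
If you are unsure how to fill in the skeleton, the cell below sketches one possible generator (a minimal example, not the reference solution): two small fully connected hidden layers with ReLU activations, a `tanh` output mapped to `nb_outputs`, and a final `Reshape` to `output_shape`. The hidden-layer sizes are arbitrary choices for illustration.


In [ ]:
# A sketch of one possible generator; the hidden-layer sizes are illustrative only
def example_generator(z_dim, nb_outputs, output_shape):

    # Latent input of shape (z_dim,)
    latent_var = Input(shape=(z_dim,))

    # A couple of small fully connected hidden layers
    x = Dense(64, activation='relu')(latent_var)
    x = Dense(64, activation='relu')(x)

    # Map the last hidden layer to nb_outputs; tanh keeps the outputs in [-1, 1]
    x = Dense(nb_outputs, activation='tanh')(x)

    # Reshape to the shape of a single data sample
    x = Reshape(output_shape)(x)

    return Model(inputs=latent_var, outputs=x)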

Now, let's build a generative network using the function you just made.

In the following code snippet we will:
  • Define the number of dimensions of the latent vector $\mathbf{z}$
  • Find out the shape of a sample of data
  • Compute numbers of dimensions in a sample of data
  • Create the network using your function
  • Display a summary of your generator network

In [ ]:
# Define the dimension of the latent vector
z_dim = 100

# Dimension of our sample
sample_dimensions = X_data[0].shape

# Calculate the number of dimensions in a sample
n_dimensions = 1
for x in list(sample_dimensions):
    n_dimensions *= x

print('A sample of data has shape {} composed of {} dimension(s)'.format(sample_dimensions, n_dimensions))

# Create the generative network
G = generator(z_dim, n_dimensions, sample_dimensions)

# We recommend the following optimizer
g_optim = keras.optimizers.Adam(lr=0.002, beta_1=0.5, beta_2=0.999, epsilon=1e-08, decay=0.0)

# Compile network
G.compile(loss='binary_crossentropy', optimizer=g_optim)

# Network Summary
G.summary()
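
As an optional sanity check, you can draw a few latent vectors and pass them through the untrained generator to verify that the output has the expected shape (this is not part of the training procedure).

In [ ]:
# Optional sanity check: the untrained generator should already
# return samples with the same shape as the real data
z_check = np.random.uniform(-1, 1, (5, z_dim))
generated_check = G.predict(z_check, verbose=0)
print('Generated samples have shape {}'.format(generated_check.shape))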

Task II: Implement a Discriminative Network

The discriminator network is a simple binary classifier where the output indicates the probability of the input data being real or fake.

Task:
  • Create a network where the input shape is `input_shape`
  • We recommend reshaping the input just after the input layer. This way you can work with a vector of shape `(None, nb_inputs)`
  • Implement a simple network that can classify data

Keras references: Reshape()


In [ ]:
def discriminator(input_shape, nb_inputs):
    # Define the network input to have shape input_shape
    input_x = None
    
    # Reshape the input
    x = None

    # Implement the rest of your classifier
    x = None

    # Get the output activation (binary classification)
    probabilities = Dense(1, activation='sigmoid')(x)
    
    model = Model(inputs=input_x, outputs=probabilities)

    return model
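
Again, if you are unsure how to fill in the skeleton, the cell below sketches one possible discriminator (a minimal example, not the reference solution): the input is flattened to a vector of shape `(None, nb_inputs)` and passed through two small fully connected hidden layers. The hidden-layer sizes are arbitrary choices for illustration.


In [ ]:
# A sketch of one possible discriminator; the hidden-layer sizes are illustrative only
def example_discriminator(input_shape, nb_inputs):

    # Input with the shape of a data sample
    input_x = Input(shape=input_shape)

    # Flatten the input to a vector of shape (None, nb_inputs)
    x = Reshape((nb_inputs,))(input_x)

    # A couple of small fully connected hidden layers
    x = Dense(64, activation='relu')(x)
    x = Dense(64, activation='relu')(x)

    # Output the probability of the input being real
    probabilities = Dense(1, activation='sigmoid')(x)

    return Model(inputs=input_x, outputs=probabilities)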

Now, let's build a discriminator network using the function you just made.

In the following code snippet we will:
  • Create the network using your function
  • Display a summary of your discriminator network

In [ ]:
# We already computed the shape and number of dimensions in a data sample
print('The data has shape {} composed of {} dimension(s)'.format(sample_dimensions, n_dimensions))

# Discriminative network
D = discriminator(sample_dimensions, n_dimensions)

# Recommended optimiser
d_optim = keras.optimizers.Adam(lr=0.002, beta_1=0.5, beta_2=0.999, epsilon=1e-08, decay=0.0)

# Compile network
D.compile(loss='binary_crossentropy', optimizer=d_optim)

# Network summary
D.summary()

Putting the GAN Together

In the following code we will put the generator and discriminator together so we can train our adversarial model.

In the following code snippet we will:
  • Use the generator and discriminator to construct a GAN

In [ ]:
from keras.models import Sequential


def build(generator, discriminator):
    """Build a base model for a Generative Adversarial Networks.
    Parameters
    ----------
    generator : keras.engine.training.Model
        A keras model built either with keras.models ( Model, or Sequential ).
        This is the model that generates the data for the Generative Adversarial networks.
    Discriminator : keras.engine.training.Model
        A keras model built either with keras.models ( Model, or Sequential ).
        This is the model that is a binary classifier for REAL/GENERATED data.
    Returns
    -------
    (keras.engine.training.Model)
        It returns a Sequential Keras Model by connecting a Generator model to a
        Discriminator model.  [ generator-->discriminator]
    """
    model = Sequential()
    model.add(generator)
    discriminator.trainable = False
    model.add(discriminator)
    return model


# Create GAN
G_plus_D = build(G, D)
G_plus_D.compile(loss='binary_crossentropy', optimizer=g_optim)
D.trainable = True
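
As a quick optional check, you can print a summary of the stacked model; it should list the generator followed by the discriminator.

In [ ]:
# Summary of the stacked generator --> discriminator model
G_plus_D.summary()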

Task III: Define Hyperparameters

Before training, we need to choose a couple of hyperparameters.

Task: Please define the following hyperparameters to train your GAN:
  • Batch size
  • Number of training epochs

In [ ]:
BATCH_SIZE = 32
NB_EPOCHS = 50

In the following code snippet we will:
  • Train the constructed GAN
  • Live plot the histogram of the generated data

In [ ]:
# Figure for live plot
fig, ax = plt.subplots(1,1)

# Allocate space for noise variable
z = np.zeros((BATCH_SIZE, z_dim))

# Number of batches per epoch
number_of_batches = int(X_data.shape[0] / BATCH_SIZE)

for epoch in range(NB_EPOCHS):
    for index in range(number_of_batches):
        # Sample a minibatch of m = BATCH_SIZE samples from the data-generating distribution,
        # in other words: grab a batch of the real data
        data_batch = X_data[index*BATCH_SIZE:(index+1)*BATCH_SIZE]

        # Sample minibatch of m= BATCH_SIZE noise samples
        # in other words, we sample from a uniform distribution
        z = np.random.uniform(-1, 1, (BATCH_SIZE,z_dim))

        # Transform the noise samples with the generator,
        # in other words: use the generator to create new fake samples
        generated_batch = G.predict(z, verbose=0)

        # Update/Train discriminator D
        X = np.concatenate((data_batch, generated_batch))
        y = [1] * BATCH_SIZE + [0] * BATCH_SIZE

        d_loss = D.train_on_batch(X, y)

        # Sample minibatch of m= BATCH_SIZE noise samples
        # in other words, we sample from a uniform distribution
        z = np.random.uniform(-1, 1, (BATCH_SIZE,z_dim))
        
        # Update the generator while not updating the discriminator
        D.trainable = False
        # To train the generator we flip the labels, i.e. we label the fake samples as real
        g_loss = G_plus_D.train_on_batch(z, [1] * BATCH_SIZE)
        D.trainable = True
        
        # Plot data every 10 mini batches
        if index % 10 == 0:
            ax.clear() 
            
            # Histogram of generated data
            tools.plot_normalized_histogram(ax, generated_batch.flatten(), color='b', label='Generated')

            # Histogram of real data
            tools.plot_normalized_histogram(ax, X_data, color='y', label='Real')

            
            # Plot details
            ax.legend()
            ax.grid()
            ax.set_xlim([-1,1])

            fig.canvas.draw()
            time.sleep(0.01)


    # End of epoch ....
    print("epoch %d : g_loss : %f  | d_loss : %f" % (epoch, g_loss,  d_loss))

In [ ]: