In [ ]:
"""This area sets up the Jupyter environment.
Please do not modify anything in this cell.
"""
import os
import sys
import time
# Add project to PYTHONPATH for future use
sys.path.insert(1, os.path.join(sys.path[0], '..'))
# Import miscellaneous modules
from IPython.display import display, HTML
# Set CSS styling
with open('../admin/custom.css', 'r') as f:
    style = """<style>\n{}\n</style>""".format(f.read())
display(HTML(style))
# Plots will be shown inside the notebook
%matplotlib notebook
import matplotlib.pyplot as plt
Generative adversarial networks (GANs) are a generative modelling methodology introduced by Goodfellow et al. [1] in 2014 that has attracted considerable interest in recent years. The idea is to transform samples of latent variables $\mathbf{z}$ into samples $\mathbf{x}$ of a probability distribution that we would like to learn. The transformation is done by a differentiable function, which is typically an artificial neural network.
When viewed through the lens of game theory, a GAN consists of a generator and an adversary called the discriminator. The generator network $\mathbf{G}$ produces samples $\mathbf{x}$ by transforming latent variables $\mathbf{z}$ with the help of a neural network. The adversary, the discriminator network $\mathbf{D}$, attempts to discriminate between the samples $\mathbf{x}$ generated by $\mathbf{G}$ and the training data. In other words, the discriminator seeks to detect whether the input data is fake or real. At the same time, the generator attempts to fool the discriminator by generating plausible samples. A GAN has converged when the discriminator can no longer differentiate between real data and samples generated by the generator.
Distinguishing between fake and real data sounds like something we have done several times before; indeed, it is a binary classification problem. The original formulation of GANs as a zero-sum game can be seen below:
$$ \begin{equation*} \underset{\mathbf{G}}{\arg\min}\max_{\mathbf{D}} \frac{1}{N}\sum_{i=1}^{N} \left[ \ln\mathbf{D}(\mathbf{x}_i)+\ln(1-\mathbf{D}(\mathbf{G}(\mathbf{z}_i))) \right] \end{equation*} $$

The discriminator tries to maximise the log-likelihood of giving the correct prediction, whilst the generator tries to minimise this same quantity. In practice, training alternates between an optimisation step for the discriminator and one for the generator.
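To make the objective concrete, the sketch below evaluates the averaged quantity for a small batch. The function name `gan_value` and the probability lists are hypothetical stand-ins for the discriminator outputs $\mathbf{D}(\mathbf{x}_i)$ and $\mathbf{D}(\mathbf{G}(\mathbf{z}_i))$; they are not part of the notebook's code.

```python
import math

def gan_value(d_real, d_fake):
    """Average of ln D(x_i) + ln(1 - D(G(z_i))) over a batch.

    d_real: discriminator outputs on real samples, each in (0, 1)
    d_fake: discriminator outputs on generated samples, each in (0, 1)
    """
    assert len(d_real) == len(d_fake)
    total = 0.0
    for p_real, p_fake in zip(d_real, d_fake):
        total += math.log(p_real) + math.log(1.0 - p_fake)
    return total / len(d_real)

# A discriminator that cannot tell real from fake outputs 0.5 everywhere
undecided = gan_value([0.5, 0.5], [0.5, 0.5])
# A confident and correct discriminator pushes the value towards 0, its upper bound
confident = gan_value([0.99, 0.98], [0.01, 0.02])
print(undecided, confident)
```

The discriminator wants this value as high as possible, while the generator wants its samples to drag it down by making $\mathbf{D}(\mathbf{G}(\mathbf{z}_i))$ large.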
GANs are notoriously difficult to train, so for this and the next notebook we will make some simplifications. First, we are going to train on what we consider to be easy datasets: a 1-d multimodal distribution and MNIST.
Secondly, most of the implementation will already be done for you; the focus will be on testing out different kinds of network definitions for the generator and the discriminator. This notebook will start with the 1-d multimodal distribution dataset, while the next one will handle the familiar MNIST dataset.
[1] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. “Generative Adversarial Nets”. In: Advances in Neural Information Processing Systems. 2014, pp. 2672–2680.
In [ ]:
import pandas as pd
import numpy as np
import admin.tools as tools
data = pd.read_csv('resources/multinomial.csv', index_col=0)
Unpack the pandas DataFrame into a NumPy array:
In [ ]:
X_data = data.values
Plot normalised histogram:
In [ ]:
fig, ax = plt.subplots(1,1)
tools.plot_normalized_histogram(ax,X_data)
ax.set_title('Data Histogram')
ax.set_xlabel('Data Value')
ax.set_ylabel('Normalized frequency')
ax.grid()
fig.canvas.draw()
time.sleep(0.04)
As previously mentioned, the generator network maps a latent space to a specific data distribution.
In this task we will build a network that takes a vector of z_dim dimensions as input and maps it to a pre-defined number of outputs. The number of outputs and their shape are determined by the data distribution we are learning.
Keras references: Reshape()
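As a rough illustration of the mapping described above, here is a toy generator in plain Python with a single dense layer. It is only a sketch of the shape flow from a latent vector to the outputs, not the Keras network you are asked to build; `toy_generator`, the tanh activation, and the small sizes are all assumptions made for readability.

```python
import math
import random

def toy_generator(z, weights, biases):
    """Map a latent vector z to len(biases) outputs with one tanh dense layer."""
    outputs = []
    for w_row, b in zip(weights, biases):
        activation = sum(w * z_j for w, z_j in zip(w_row, z)) + b
        outputs.append(math.tanh(activation))
    return outputs

random.seed(0)
z_dim, nb_outputs = 4, 1   # hypothetical sizes, chosen small for readability
# One row of weights per output unit, initialised with small random values
weights = [[random.gauss(0, 0.1) for _ in range(z_dim)] for _ in range(nb_outputs)]
biases = [0.0] * nb_outputs
# Latent vector drawn uniformly from [-1, 1]
z = [random.uniform(-1, 1) for _ in range(z_dim)]
sample = toy_generator(z, weights, biases)
print(len(z), '->', len(sample))
```

With untrained weights the output is meaningless; training is what shapes this mapping so that the outputs follow the data distribution.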
In [ ]:
# Import some useful Keras libraries
import keras
from keras.models import Model
from keras.layers import *
def generator(z_dim, nb_outputs, output_shape):
    # Define the latent input with Input(); it should accept vectors of size z_dim
    latent_var = None
    # Insert the desired number of layers for your network
    x = None
    # Map your last layer to nb_outputs
    x = None
    # Reshape your data to the shape of a data sample
    x = Reshape(output_shape)(x)
    model = Model(inputs=latent_var, outputs=x)
    return model
Now, let's build a generative network using the function you just made.
In [ ]:
# Define the dimension of the latent vector
z_dim = 100
# Shape of a single data sample
sample_dimentions = X_data[0].shape
# Calculate the number of dimensions in a sample
n_dimensions = 1
for x in list(sample_dimentions):
    n_dimensions *= x
print('A sample of data has shape {} composed of {} dimension(s)'.format(sample_dimentions, n_dimensions))
# Create the generative network
G = generator(z_dim, n_dimensions, sample_dimentions)
# We recommend the following optimizer
g_optim = keras.optimizers.Adam(lr=0.002, beta_1=0.5, beta_2=0.999, epsilon=1e-08, decay=0.0)
# Compile network
G.compile(loss='binary_crossentropy', optimizer=g_optim)
# Network summary
G.summary()
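The loop that multiplies the entries of the sample shape simply counts how many scalar values one sample holds. A minimal equivalent, assuming Python 3.8 or newer for `math.prod`, is sketched below; `count_dimensions` is a hypothetical helper name, and the shapes are examples rather than values read from the dataset.

```python
import math

def count_dimensions(shape):
    """Number of scalar values in one sample of the given shape."""
    return math.prod(shape)

# A 1-d sample holds a single value
print(count_dimensions((1,)))
# A 28x28 image, the size of an MNIST digit, holds 784 values
print(count_dimensions((28, 28)))
```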
The discriminator network is a simple binary classifier where the output indicates the probability of the input data being real or fake.
Keras references: Reshape()
In [ ]:
def discriminator(input_shape, nb_inputs):
    # Define the network input to have shape input_shape
    input_x = None
    # Reshape the input
    x = None
    # Implement the rest of your classifier
    x = None
    # Get the output activation (binary classification)
    probabilities = Dense(1, activation='sigmoid')(x)
    model = Model(inputs=input_x, outputs=probabilities)
    return model
Now, let's build a discriminator network using the function you just made.
In [ ]:
# We already computed the shape and number of dimensions in a data sample
print('The data has shape {} composed of {} dimension(s)'.format(sample_dimentions, n_dimensions))
# Discriminative network
D = discriminator(sample_dimentions, n_dimensions)
# Recommended optimiser
d_optim = keras.optimizers.Adam(lr=0.002, beta_1=0.5, beta_2=0.999, epsilon=1e-08, decay=0.0)
# Compile network
D.compile(loss='binary_crossentropy', optimizer=d_optim)
# Network summary
D.summary()
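The discriminator ends in a sigmoid unit and is compiled with the `binary_crossentropy` loss. The plain-Python sketch below shows what that loss measures; the helper names and the example pre-activation values are assumptions for illustration, and Keras' own implementation may differ in details such as the clipping constant.

```python
import math

def sigmoid(a):
    """Squash a real-valued pre-activation into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and predictions in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

labels = [1, 1, 0, 0]  # 1 = real, 0 = generated
# Predictions that lean the right way give a small loss
good = binary_crossentropy(labels, [sigmoid(3.0), sigmoid(2.0), sigmoid(-2.0), sigmoid(-3.0)])
# Predicting 0.5 for everything gives ln(2), the loss of a coin flip
chance = binary_crossentropy(labels, [0.5] * 4)
print(good, chance)
```

A discriminator loss that hovers around ln(2), roughly 0.69, therefore suggests that the discriminator can no longer separate real from generated samples.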
In [ ]:
from keras.models import Sequential
def build(generator, discriminator):
    """Build a base model for a generative adversarial network.

    Parameters
    ----------
    generator : keras.engine.training.Model
        A Keras model built with keras.models (Model or Sequential).
        This is the model that generates the data for the GAN.
    discriminator : keras.engine.training.Model
        A Keras model built with keras.models (Model or Sequential).
        This is the binary classifier for REAL/GENERATED data.

    Returns
    -------
    keras.engine.training.Model
        A Sequential Keras model that connects the generator to the
        discriminator [generator --> discriminator].
    """
    model = Sequential()
    model.add(generator)
    # Freeze the discriminator so that only the generator is updated
    # when the stacked model is trained
    discriminator.trainable = False
    model.add(discriminator)
    return model
# Create GAN
G_plus_D = build(G, D)
G_plus_D.compile(loss='binary_crossentropy', optimizer=g_optim)
D.trainable = True
In [ ]:
BATCH_SIZE = 32
NB_EPOCHS = 50
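The training loop walks through the data in full mini-batches of `BATCH_SIZE` samples, so any samples left over at the end of an epoch are skipped. The sketch below shows the index arithmetic; `batch_slices` is a hypothetical helper and 1000 is an assumed dataset size, not the size of the actual dataset.

```python
def batch_slices(n_samples, batch_size):
    """Start/stop indices of every full mini-batch; a trailing partial batch is dropped."""
    n_batches = n_samples // batch_size
    return [(i * batch_size, (i + 1) * batch_size) for i in range(n_batches)]

slices = batch_slices(1000, 32)   # 1000 is a hypothetical dataset size
# 31 full batches cover samples 0..991; the last 8 samples are not used
print(len(slices), slices[0], slices[-1])
```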
In [ ]:
# Figure for live plot
fig, ax = plt.subplots(1, 1)
# Number of full mini-batches per epoch
number_of_batches = int(X_data.shape[0] / BATCH_SIZE)
for epoch in range(NB_EPOCHS):
    for index in range(number_of_batches):
        # Grab a mini-batch of BATCH_SIZE real samples
        data_batch = X_data[index*BATCH_SIZE:(index+1)*BATCH_SIZE]
        # Sample a mini-batch of BATCH_SIZE noise vectors from a uniform distribution
        z = np.random.uniform(-1, 1, (BATCH_SIZE, z_dim))
        # Use the generator to create a mini-batch of fake samples
        generated_batch = G.predict(z, verbose=0)
        # Update the discriminator D: real samples are labelled 1, generated samples 0
        X = np.concatenate((data_batch, generated_batch))
        y = np.array([1.0] * BATCH_SIZE + [0.0] * BATCH_SIZE)
        d_loss = D.train_on_batch(X, y)
        # Sample a fresh mini-batch of BATCH_SIZE noise vectors
        z = np.random.uniform(-1, 1, (BATCH_SIZE, z_dim))
        # Update the generator while keeping the discriminator fixed
        D.trainable = False
        # Instead of doing gradient ascent, flip the labels:
        # the generated samples are labelled as real (1)
        g_loss = G_plus_D.train_on_batch(z, np.ones(BATCH_SIZE))
        D.trainable = True
        # Plot data every 10 mini-batches
        if index % 10 == 0:
            ax.clear()
            # Histogram of generated data
            tools.plot_normalized_histogram(ax, generated_batch.flatten(), color='b', label='Generated')
            # Histogram of real data
            tools.plot_normalized_histogram(ax, X_data, color='y', label='Real')
            # Plot details
            ax.legend()
            ax.grid()
            ax.set_xlim([-1, 1])
            fig.canvas.draw()
            time.sleep(0.01)
    # End of epoch: report the losses of the last mini-batch
    print("epoch %d : g_loss : %f | d_loss : %f" % (epoch, g_loss, d_loss))
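The generator step in the loop trains the stacked model with all labels set to 1, which under binary cross-entropy means minimising $-\ln\mathbf{D}(\mathbf{G}(\mathbf{z}))$ rather than the original term $\ln(1-\mathbf{D}(\mathbf{G}(\mathbf{z})))$. The sketch below compares the derivatives of the two losses with respect to the discriminator output; the function names and the value 0.01 are assumptions chosen for illustration.

```python
import math

def grad_minimax(d_fake):
    """Derivative of ln(1 - D) with respect to D: the original generator term."""
    return -1.0 / (1.0 - d_fake)

def grad_flipped(d_fake):
    """Derivative of -ln(D) with respect to D: generated samples labelled as real."""
    return -1.0 / d_fake

# Hypothetical early-training situation: the discriminator easily rejects a fake sample
early = 0.01
print(grad_minimax(early), grad_flipped(early))
```

Both derivatives are negative, so minimising either loss pushes $\mathbf{D}(\mathbf{G}(\mathbf{z}))$ upwards, but the flipped-label version gives a much larger signal when the discriminator is confidently rejecting the generated samples.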
In [ ]: