Facial composites are widely used in forensics to generate images of suspects. Since victims and witnesses are usually not good at drawing, computer-aided generation is used to reconstruct the attacker's face. One of the most common techniques is an evolutionary system that composes the final face from many predefined parts.
In this project, we will try to implement an app for creating a facial composite that will be able to construct desired faces without explicitly providing databases of templates. We will apply Variational Autoencoders and Gaussian processes for this task.
The final project is designed so that you can apply the techniques you have learned to a real project yourself. We include the main guidelines and hints, but a large part of the project relies on your creativity and experience from previous assignments.
In [1]:
%tensorflow_version 1.x
In [2]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    # Not running inside Google Colab
    IN_COLAB = False
if IN_COLAB:
    print("Downloading Colab files")
    ! shred -u setup_google_colab.py
    ! wget https://raw.githubusercontent.com/hse-aml/bayesian-methods-for-ml/master/setup_google_colab.py -O setup_google_colab.py
    import setup_google_colab
    setup_google_colab.load_data_final_project()
In [3]:
! pip install GPy gpyopt
In [4]:
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
import tensorflow as tf
import GPy
import GPyOpt
import keras
from keras.layers import Input, Dense, Lambda, InputLayer, concatenate, Activation, Flatten, Reshape
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D, Deconv2D
from keras.losses import MSE
from keras.models import Model, Sequential
from keras import backend as K
from keras import metrics
from keras.datasets import mnist
from keras.utils import np_utils
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
import utils
import os
%matplotlib inline
As some of the final project tasks can be graded only visually, the final assignment is graded using the peer-review procedure. You will be asked to upload your Jupyter notebook to the web and attach a link to it in the submission form. Detailed submission instructions and grading criteria are given at the end of this notebook.
We will first train a variational autoencoder (VAE) on face images to compress them into a low-dimensional latent space. One important feature of a VAE is that the constructed latent space is dense. That means we can traverse the latent space and decode any point along our path into a valid face.
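Traversing the latent space can be sketched as a straight-line walk between two latent vectors. The snippet below is a minimal illustration, not part of the assignment: `interpolate_latents` is a hypothetical helper, and the latent size of 8 is assumed to match the value used later in this notebook.

```python
import numpy as np

def interpolate_latents(z_start, z_end, n_steps=5):
    """Return n_steps latent vectors evenly spaced on the segment z_start -> z_end."""
    z_start = np.asarray(z_start, dtype=np.float32)
    z_end = np.asarray(z_end, dtype=np.float32)
    # Mixing weights run from 0 (pure z_start) to 1 (pure z_end).
    alphas = np.linspace(0.0, 1.0, n_steps, dtype=np.float32)[:, None]
    return (1.0 - alphas) * z_start[None, :] + alphas * z_end[None, :]

rng = np.random.RandomState(0)
z_a, z_b = rng.normal(size=(2, 8))  # assumed latent size of 8
path = interpolate_latents(z_a, z_b, n_steps=5)
print(path.shape)  # (5, 8)
```

Each row of `path` could then be decoded one at a time with the decoder to see how one face morphs into another.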
Using this continuous latent space, we can use Bayesian optimization to maximize a similarity function between the face in the victim's or witness's memory and the face reconstructed from the current point of the latent space. Bayesian optimization is an appropriate choice here since people start to forget details about the attacker after they have been shown many similar photos. Because of this, we want to reconstruct the photo in as few trials as possible.
For this task, you will need a database of face images. Several datasets are available on the web: for example, CelebA or Labeled Faces in the Wild. We used the Aligned & Cropped version of CelebA, which you can find here, to pretrain a VAE model for you. See the optional part of the final project if you wish to train a VAE on your own.
Task 1: Train a VAE on a faces dataset and draw some samples from it. (You can use code from previous assignments. You may also want to use convolutional encoders and decoders and to tune hyperparameters.)
In [5]:
sess = tf.InteractiveSession()
K.set_session(sess)
In [6]:
latent_size = 8
In [7]:
vae, encoder, decoder = utils.create_vae(batch_size=128, latent=latent_size)
sess.run(tf.global_variables_initializer())
vae.load_weights('CelebA_VAE_small_8.h5')
In [8]:
K.set_learning_phase(False)
In [9]:
latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))
decode = decoder(latent_placeholder)
As the first part of the assignment, you need to become familiar with the trained model. For all tasks, you will only need a decoder to reconstruct samples from a latent space.
To decode a latent variable, you need to run the decode operation defined above with random samples from a standard normal distribution.
In [10]:
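### TODO: Draw 25 samples from VAE here
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    # Sample in NumPy and reuse the decode op defined above,
    # so that no new graph nodes are created on every iteration
    z = np.random.normal(size=(1, latent_size))
    image = sess.run(decode, feed_dict={latent_placeholder: z}).reshape(64, 64, 3)
    plt.imshow(np.clip(image, 0, 1))
    plt.axis('off')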
Now that we have a way to reconstruct images, we need to set up an optimization procedure to find the face most similar to the one we have in mind. To do so, we need a scoring utility. Imagine that you want to generate an image of Brad Pitt. You start with a small number of random samples, say 5, and rank them according to their similarity to your vision of Brad Pitt: 1 for the worst, 5 for the best. You then rate new images one by one, with GPyOpt proposing each image as a point in the latent space of the VAE. Each new image must be assigned a real number that shows how good it is. A simple idea is to ask the user to compare the new image with previous images (shown along with their scores) and then enter a score for the current image.
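The "1 for the worst, 5 for the best" ranking described above can be turned into numeric scores with a few lines of NumPy. This is only a sketch: `ranking_to_scores` is a hypothetical helper, and it assumes the user supplies the image indices ordered from worst to best.

```python
import numpy as np

def ranking_to_scores(order_worst_to_best):
    """Map a user ranking to scores 1..n (1 = worst, n = best).

    order_worst_to_best[k] is the index of the image ranked k-th from the bottom.
    The returned scores are aligned with the original image order.
    """
    order = np.asarray(order_worst_to_best, dtype=int)
    n = len(order)
    if sorted(order.tolist()) != list(range(n)):
        raise ValueError("ranking must be a permutation of 0..n-1")
    scores = np.empty(n, dtype=float)
    scores[order] = np.arange(1, n + 1)
    return scores

# Image 3 is the worst match, image 2 the best.
scores = ranking_to_scores([3, 0, 4, 1, 2])
print(scores)  # [2. 4. 5. 1. 3.]
```

The resulting array could serve as the initial ratings passed to the optimizer alongside the corresponding latent vectors.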
The proposed scoring has a lot of drawbacks, so feel free to come up with a different scheme: e.g. showing the user 9 different images and asking which one looks the "best".
Note that the goal of this task is for you to implement a new algorithm by yourself. You may try different techniques for your task and select one that works the best.
Task 2: Implement person search using Bayesian optimization. (You can use code from the assignment on Gaussian Processes)
Note: try varying the acquisition_type and acquisition_par parameters.
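To get a feeling for what an acquisition function does, here is a NumPy-only sketch of the maximum probability of improvement (MPI) criterion on top of a simple Gaussian process. It is an illustration under stated assumptions, not GPyOpt's implementation: the RBF kernel, its fixed hyperparameters, the noise level and the margin `xi` are all choices made for this example, and how GPyOpt uses `acquisition_par` internally is not shown here.

```python
import math
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = np.maximum(np.diag(rbf_kernel(x_test, x_test)) - np.sum(v**2, axis=0), 1e-12)
    return mean, np.sqrt(var)

def probability_of_improvement(mean, std, best, xi=0.3):
    """P(f(x) > best + xi) under the GP posterior; a larger xi favors exploration."""
    z = (mean - best - xi) / std
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])

rng = np.random.RandomState(0)
x_train = rng.normal(size=(5, 8))            # latent vectors already rated
y_train = np.array([1., 2., 3., 4., 5.])     # their user ratings
x_cand = rng.normal(size=(100, 8))           # random candidate latent vectors

y_mean = y_train.mean()                      # center ratings for the zero-mean GP
mean, std = gp_posterior(x_train, y_train - y_mean, x_cand)
pi = probability_of_improvement(mean + y_mean, std, best=y_train.max())
next_point = x_cand[np.argmax(pi)]           # candidate to show the user next
```

Increasing `xi` demands a larger improvement over the current best rating, which pushes the search towards less explored regions of the latent space.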
In [13]:
class FacialComposit:
    def __init__(self, decoder, latent_size):
        self.latent_size = latent_size
        self.latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))
        self.decode = decoder(self.latent_placeholder)
        self.samples = None
        self.images = None
        self.rating = None

    def _get_image(self, latent):
        img = sess.run(self.decode,
                       feed_dict={self.latent_placeholder: latent[None, :]})[0]
        img = np.clip(img, 0, 1)
        return img

    @staticmethod
    def _show_images(images, titles):
        assert len(images) == len(titles)
        clear_output()
        plt.figure(figsize=(3*len(images), 3))
        n = len(titles)
        for i in range(n):
            plt.subplot(1, n, i+1)
            plt.imshow(images[i])
            plt.title(str(titles[i]))
            plt.axis('off')
        plt.show()

    @staticmethod
    def _draw_border(image, w=2):
        bordred_image = image.copy()
        bordred_image[:, :w] = [1, 0, 0]
        bordred_image[:, -w:] = [1, 0, 0]
        bordred_image[:w, :] = [1, 0, 0]
        bordred_image[-w:, :] = [1, 0, 0]
        return bordred_image

    def query_initial(self, n_start=5, select_top=None):
        '''
        Creates initial points for Bayesian optimization.
        Generates *n_start* random images and asks the user to rate them.
        Keeps the *select_top* best-rated images, sorted from best to worst.
        :param n_start: number of images to rate initially.
        :param select_top: number of images to keep
        '''
        if select_top is None:
            select_top = n_start
        samples = np.random.normal(size=(n_start, self.latent_size))
        images = np.array([self._get_image(samples[i]) for i in range(n_start)])
        titles = ['img-{}'.format(i+1) for i in range(n_start)]
        self._show_images(images, titles)
        ratings = []
        # Rate every displayed image; only the top select_top are kept below
        for i in range(n_start):
            rating = input("Enter rating for img-{}: ".format(i+1))
            rating = float(rating)
            rating = min(max(0, rating), 10)  # clamp to [0, 10]
            ratings.append(rating)
        # Indices of the best-rated images, in descending order of rating
        select_top_images = np.argsort(ratings)[::-1][:select_top]
        self.samples = samples[select_top_images]  # shape: select_top x latent_size
        self.images = images[select_top_images]  # shape: select_top x 64 x 64 x 3
        self.rating = np.array(ratings)[select_top_images]  # shape: select_top
        # Check that tensor sizes are correct
        np.testing.assert_equal(self.rating.shape, [select_top])
        np.testing.assert_equal(self.images.shape, [select_top, 64, 64, 3])
        np.testing.assert_equal(self.samples.shape, [select_top, self.latent_size])

    def evaluate(self, candidate):
        '''
        Queries candidate vs known image set.
        Adds candidate into images pool.
        :param candidate: latent vector of size 1xlatent_size
        '''
        initial_size = len(self.images)
        ## Show user an image and ask to assign score to it.
        ## You may want to show some images to user along with their scores
        ## You should also save candidate, corresponding image and rating
        candidate_image = self._get_image(candidate[0]).reshape(1, 64, 64, 3)

        def _diff(query, target):
            # Index of the element of query closest to target
            query = np.asarray(query)
            idx = (np.abs(query - target)).argmin()
            return idx

        # Stored images are sorted by rating (best first), so index 0 is the
        # best and the last index is the worst
        avg_rating_idx = _diff(self.rating, np.average(self.rating))
        select_image_idx = np.array([0, avg_rating_idx, initial_size - 1])
        images = self.images[select_image_idx]
        ratings = self.rating[select_image_idx]
        comp = np.append(images, candidate_image, axis=0)
        self._show_images(comp, [f'Best (r:{ratings[0]})',
                                 f'Avg (r:{ratings[1]})',
                                 f'Worst (r:{ratings[2]})', 'Candidate'])
        print("Rate the new candidate by comparing it to the previous images.")
        candidate_rating = input("Candidate img rating: ")
        candidate_rating = float(candidate_rating)
        candidate_rating = min(max(0, candidate_rating), 10)  # clamp to [0, 10]
        # Add the candidate and keep all three arrays sorted by rating, best first
        rating = np.append(self.rating, candidate_rating)
        sorted_rating_idx = np.argsort(rating)[::-1]
        self.images = np.append(self.images, candidate_image, axis=0)[sorted_rating_idx]
        self.rating = rating[sorted_rating_idx]
        self.samples = np.append(self.samples, candidate, axis=0)[sorted_rating_idx]
        assert len(self.images) == initial_size + 1
        assert len(self.rating) == initial_size + 1
        assert len(self.samples) == initial_size + 1
        return candidate_rating

    def optimize(self, n_iter=10, w=4, acquisition_type='MPI', acquisition_par=0.3):
        if self.samples is None:
            self.query_initial()
        # Search each latent coordinate in the box [-w, w]
        bounds = [{'name': 'z_{0:03d}'.format(i),
                   'type': 'continuous',
                   'domain': (-w, w)}
                  for i in range(self.latent_size)]
        optimizer = GPyOpt.methods.BayesianOptimization(f=self.evaluate, domain=bounds,
                                                        acquisition_type=acquisition_type,
                                                        acquisition_par=acquisition_par,
                                                        exact_feval=False,  # Ratings are noisy
                                                        model_type='GP',
                                                        X=self.samples,
                                                        Y=self.rating[:, None],
                                                        maximize=True)
        optimizer.run_optimization(max_iter=n_iter, eps=-1)

    def get_best(self):
        index_best = np.argmax(self.rating)
        return self.images[index_best]

    def draw_best(self, title=''):
        index_best = np.argmax(self.rating)
        image = self.images[index_best]
        plt.imshow(image)
        plt.title(title)
        plt.axis('off')
        plt.show()
How do you assign a score to a new image?
Ten images are randomly generated from the decoder and the user is asked to rate all of them based on how close their features are to the target image (as desired by the user). The top candidates are then used to initialize the Bayesian optimization. In each subsequent iteration, a new sample is generated and presented to the user for rating.
How do you select reference images to help user assign a new score?
The user is shown the best, the average and the worst images so far and is asked to rate the candidate with respect to the previous ratings.
What are the limitations of your approach?
Since the optimization depends on user input and there is no check in place to see if the user is indeed going towards an optimum, it is possible to get completely random results.
In these sections, we will apply the implemented app to search for different people. Each task will ask you to generate images that will have some property like "dark hair" or "mustache". You will need to run your search algorithm and provide the best discovered image.
In [14]:
composit = FacialComposit(decoder, 8)
composit.optimize()
In [15]:
composit.draw_best('Darkest hair')
In [16]:
composit = FacialComposit(decoder, 8)
composit.optimize()
In [17]:
composit.draw_best('Widest smile')
Note: this task depends heavily on the quality of the VAE and the search algorithm. You may need to restart your search algorithm a few times and start with a larger initial set.
In [18]:
composit = FacialComposit(decoder, 8)
composit.optimize()
In [19]:
composit.draw_best('Lecturer')
Now that you have a good sense of what your algorithm can do, here is an optional assignment for you. Think of a famous person and take a look at his/her picture for a minute. Then use your app to create an image of the person you thought of. You can post it in the forum Final project: guess who!
In [ ]:
### Your code here
You need to share this notebook via a link: click SHARE (in the top-right corner) $\rightarrow$ Get shareable link and then paste the link into the assignment page.
Note that the reviewers always see the current version of the notebook, so please do not remove or change the contents of the notebook until the review is done.
Please upload your notebook to Colab and share it via the link, or upload the notebook to any file sharing service (e.g. Dropbox) and submit a link to the notebook through https://nbviewer.jupyter.org/, so that by clicking on the link the reviewers will see its contents (and will not need to download the file).