DeepDream is a computer vision program created by Google that uses a convolutional neural network to find and enhance patterns in images, giving them a dreamlike, hallucinogenic appearance.
To demonstrate DeepDream, we use the Inception model (a deep convolutional network) and TensorFlow. The Inception model has many layers, and TensorFlow is used to compute the gradient of a chosen layer with respect to the input image.
In [ ]:
from IPython.display import Image, display
The main function of the algorithm is optimize_image(). It takes the layer tensor to maximize (one of the 12 layers, index 0-11), the image to be processed, the number of iterations, the step size, the tile size, and show_gradient (whether to plot the intermediate gradients). The gradient for the layer tensor is obtained by squaring the tensor, taking its reduce_mean, and then finding the gradient of this mean with respect to the input image on the default graph. Once we have the gradient, we iterate (for the number of optimization steps we want to run) to blend the image with the patterns. In each iteration the value of the gradient is calculated, which tells us how to change the image so as to maximize the mean of the given layer tensor. The gradient is blurred to enhance the patterns and obtain a smoother image. Finally the image is updated with the calculated gradient, and this process is repeated for the number of iterations (10 by default).
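As a rough illustration of the update rule described above, the sketch below replaces the Inception layer with a tiny linear "layer". Everything in it (`WEIGHTS`, `layer_output`, `objective_gradient`, `gradient_ascent`) is a made-up stand-in and not part of inception5h or TensorFlow; it only shows that repeatedly adding the gradient of the mean squared activation to the input increases that mean.

```python
# Toy stand-in for a network layer: 2 outputs computed from 3 "pixels".
# WEIGHTS is an arbitrary, hypothetical matrix chosen for illustration.
WEIGHTS = [[0.5, -1.0, 0.25],
           [1.0, 0.5, -0.5]]


def layer_output(pixels):
    """Return the activations of the toy linear layer."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in WEIGHTS]


def objective(pixels):
    """Mean of the squared activations (the quantity being maximized)."""
    acts = layer_output(pixels)
    return sum(a * a for a in acts) / len(acts)


def objective_gradient(pixels):
    """Analytic gradient of objective() with respect to each pixel."""
    acts = layer_output(pixels)
    n = len(acts)
    return [sum(2.0 * a * row[j] for a, row in zip(acts, WEIGHTS)) / n
            for j in range(len(pixels))]


def gradient_ascent(pixels, num_iterations=10, step_size=0.1):
    """Repeatedly add the scaled gradient to the input."""
    pixels = list(pixels)
    history = [objective(pixels)]
    for _ in range(num_iterations):
        grad = objective_gradient(pixels)
        pixels = [p + step_size * g for p, g in zip(pixels, grad)]
        history.append(objective(pixels))
    return pixels, history


if __name__ == "__main__":
    _, history = gradient_ascent([1.0, 2.0, 3.0])
    print("objective before:", history[0])
    print("objective after: ", history[-1])
```

In the notebook itself the gradient comes from TensorFlow rather than a hand-written formula, and it is additionally tiled, normalized, and blurred before being added to the image.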
The Inception model was trained on fairly low-resolution images (200-300 pixels), so DeepDream works best when the input image is downscaled before the algorithm is run. A single downscaling does not give good results, however, so the process of downscaling the image and running DeepDream is done recursively to obtain proper patterns in the output image. The image is first downscaled num_repeats times. The smallest version is passed to the optimize_image function; each result is then upscaled, blended with the next larger version, and passed to optimize_image again. The final output has the same size as the original image, with enhanced patterns.
In [ ]:
# Imports
get_ipython().magic('matplotlib inline')
import matplotlib.pyplot as plt  # 2D plotting library which produces publication-quality figures
import tensorflow as tf
import numpy as np #for scientific computing in python
import random
import math
# Image manipulation.
import PIL.Image
from scipy.ndimage import gaussian_filter
from random import randrange
In [ ]:
import inception5h
Download the data for the Inception model (if it doesn't already exist)
In [ ]:
inception5h.maybe_download()
Load the Inception Model
In [ ]:
model = inception5h.Inception5h()
Number of layers in the Inception model used for this implementation: 12
In [ ]:
len(model.layer_tensors)
# Display the first layer tensor.
model.layer_tensors[0]
In [ ]:
def load_image(imageFileName):
    image = PIL.Image.open(imageFileName)
    return np.float32(image)
In [ ]:
img = load_image('images/elon_musk_100x100.jpg')
# print(img)
In [ ]:
def save_image(image, filename):
    # Ensure the pixel-values are between 0 and 255.
    image = np.clip(image, 0.0, 255.0)
    # Convert to bytes.
    image = image.astype(np.uint8)
    # Write the image-file in jpeg-format.
    with open(filename, 'wb') as file:
        PIL.Image.fromarray(image).save(file, 'jpeg')
Plot the image using PIL, since matplotlib gives low-resolution images.
In [ ]:
def plot_image(image):
    # Assume the pixel-values are scaled between 0 and 255.
    if False:
        # Convert the pixel-values to the range between 0.0 and 1.0
        image = np.clip(image/255.0, 0.0, 1.0)
        # Plot using matplotlib.
        plt.imshow(image, interpolation='lanczos')
        plt.show()
    else:
        # Ensure the pixel-values are between 0 and 255.
        image = np.clip(image, 0.0, 255.0)
        # Convert pixels to bytes.
        image = image.astype(np.uint8)
        # Convert to a PIL-image and display it.
        display(PIL.Image.fromarray(image))
In [ ]:
def normalize_image(x):
    # Get the min and max values for all pixels in the input.
    x_min = x.min()
    x_max = x.max()
    # Normalize so all values are between 0.0 and 1.0
    x_norm = (x - x_min) / (x_max - x_min)
    return x_norm
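A small worked example of the min-max scaling used by normalize_image(), written on plain Python lists so it can be run without NumPy. The `normalize_values` helper and its guard for constant input are assumptions added for illustration; the notebook's version divides directly and does not handle the case where all values are equal.

```python
def normalize_values(values):
    """Min-max scale a flat list of numbers into the range [0.0, 1.0]."""
    v_min = min(values)
    v_max = max(values)
    span = v_max - v_min
    if span == 0:
        # Hypothetical guard: a constant input has no range to stretch,
        # so map everything to 0.0 instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - v_min) / span for v in values]


if __name__ == "__main__":
    print(normalize_values([-2.0, 0.0, 6.0]))  # [0.0, 0.25, 1.0]
    print(normalize_values([3.0, 3.0]))        # [0.0, 0.0]
```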
In [ ]:
def plot_gradient(gradient):
    # Normalize the gradient so it is between 0.0 and 1.0
    gradient_normalized = normalize_image(gradient)
    # Plot the normalized gradient.
    plt.imshow(gradient_normalized, interpolation='bilinear')
    plt.show()
Resize the image: this function resizes the image either to the desired size in pixels or by the given rescaling factor.
In [ ]:
def resize_image(image, size=None, factor=None):
    # If a rescaling-factor is provided then use it.
    if factor is not None:
        # Scale the numpy array's shape for height and width.
        size = np.array(image.shape[0:2]) * factor
        # The size is floating-point because it was scaled.
        # PIL requires the size to be integers.
        size = size.astype(int)
    else:
        # Ensure the size has length 2.
        size = size[0:2]
    # The height and width is reversed in numpy vs. PIL.
    size = tuple(reversed(size))
    # Ensure the pixel-values are between 0 and 255.
    img = np.clip(image, 0.0, 255.0)
    # Convert the pixels to 8-bit bytes.
    img = img.astype(np.uint8)
    # Create PIL-object from numpy array.
    img = PIL.Image.fromarray(img)
    # Resize the image.
    img_resized = img.resize(size, PIL.Image.LANCZOS)
    # Convert 8-bit pixel values back to floating-point.
    img_resized = np.float32(img_resized)
    # print(img_resized)
    return img_resized
The Inception model can accept images of any size, but larger images may require more RAM for processing. If we downscale the image directly to 200x200 pixels (roughly the size on which the model was trained) to get results from the DeepDream algorithm, the patterns in the resulting image may not be clearly visible. Instead, this algorithm splits the image into smaller tiles and then uses TensorFlow to calculate the gradient for each of the tiles.
The function below determines the appropriate tile size. The desired tile size defaults to 400x400 pixels, and the actual tile size depends on the image dimensions.
In [ ]:
def get_tile_size(num_pixels, tile_size=400):
    """
    num_pixels is the number of pixels in a dimension of the image.
    tile_size is the desired tile-size.
    """
    # How many times can we repeat a tile of the desired size.
    num_tiles = int(round(num_pixels / tile_size))
    # Ensure that there is at least 1 tile.
    num_tiles = max(1, num_tiles)
    # The actual tile-size.
    actual_tile_size = math.ceil(num_pixels / num_tiles)
    return actual_tile_size
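To make the tile-size arithmetic concrete, here is a standalone copy of the same calculation applied to a few image dimensions. The name `actual_tile_size` and the sample dimensions are chosen only for this illustration.

```python
import math


def actual_tile_size(num_pixels, tile_size=400):
    """Same arithmetic as get_tile_size(), repeated so the example runs alone."""
    num_tiles = max(1, int(round(num_pixels / tile_size)))
    return math.ceil(num_pixels / num_tiles)


if __name__ == "__main__":
    for num_pixels in (300, 1080, 1920):
        print(num_pixels, "->", actual_tile_size(num_pixels))
    # 300 -> 300   (1 tile)
    # 1080 -> 360  (3 tiles)
    # 1920 -> 384  (5 tiles)
```

An image dimension smaller than the desired tile size becomes a single tile, while larger dimensions are divided into roughly equal tiles close to the desired size.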
This function calculates the gradient for an input image. The input image is split into tiles and the gradient is calculated for each tile. The tiles are placed at random offsets to avoid visible lines in the final output image from DeepDream.
In [ ]:
def tiled_gradient(gradient, image, tile_size=400):
    # Allocate an array for the gradient of the entire image.
    grad = np.zeros_like(image)
    # Number of pixels for the x- and y-axes.
    x_max, y_max, _ = image.shape
    # Tile-size for the x-axis.
    x_tile_size = get_tile_size(num_pixels=x_max, tile_size=tile_size)
    # 1/4 of the tile-size.
    x_tile_size4 = x_tile_size // 4
    # Tile-size for the y-axis.
    y_tile_size = get_tile_size(num_pixels=y_max, tile_size=tile_size)
    # 1/4 of the tile-size
    y_tile_size4 = y_tile_size // 4
    # Random start-position for the tiles on the x-axis.
    # The random value is between -3/4 and -1/4 of the tile-size.
    # This is so the border-tiles are at least 1/4 of the tile-size,
    # otherwise the tiles may be too small which creates noisy gradients.
    x_start = random.randint(-3*x_tile_size4, -x_tile_size4)
    while x_start < x_max:
        # End-position for the current tile.
        x_end = x_start + x_tile_size
        # Ensure the tile's start- and end-positions are valid.
        x_start_lim = max(x_start, 0)
        x_end_lim = min(x_end, x_max)
        # Random start-position for the tiles on the y-axis.
        # The random value is between -3/4 and -1/4 of the tile-size.
        y_start = random.randint(-3*y_tile_size4, -y_tile_size4)
        while y_start < y_max:
            # End-position for the current tile.
            y_end = y_start + y_tile_size
            # Ensure the tile's start- and end-positions are valid.
            y_start_lim = max(y_start, 0)
            y_end_lim = min(y_end, y_max)
            # Get the image-tile.
            img_tile = image[x_start_lim:x_end_lim,
                             y_start_lim:y_end_lim, :]
            # Create a feed-dict with the image-tile.
            feed_dict = model.create_feed_dict(image=img_tile)
            # Use TensorFlow to calculate the gradient-value.
            # Note: this relies on the global `session` created further below.
            g = session.run(gradient, feed_dict=feed_dict)
            # Normalize the gradient for the tile. This is
            # necessary because the tiles may have very different
            # values. Normalizing gives a more coherent gradient.
            g /= (np.std(g) + 1e-8)
            # Store the tile's gradient at the appropriate location.
            grad[x_start_lim:x_end_lim,
                 y_start_lim:y_end_lim, :] = g
            # Advance the start-position for the y-axis.
            y_start = y_end
        # Advance the start-position for the x-axis.
        x_start = x_end
    return grad
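The random tile placement can be looked at in isolation, without TensorFlow. The sketch below reproduces only the start/end arithmetic of tiled_gradient() along a single axis; `tile_spans` and its `rng` parameter are hypothetical names introduced for this example.

```python
import math
import random


def tile_spans(num_pixels, tile_size=400, rng=random):
    """Return the clipped (start, end) tile spans along one image axis."""
    num_tiles = max(1, int(round(num_pixels / tile_size)))
    size = math.ceil(num_pixels / num_tiles)
    quarter = size // 4
    # Random negative offset between -3/4 and -1/4 of the tile size.
    start = rng.randint(-3 * quarter, -quarter)
    spans = []
    while start < num_pixels:
        end = start + size
        # Clip the tile to the valid pixel range of the image.
        spans.append((max(start, 0), min(end, num_pixels)))
        start = end
    return spans


if __name__ == "__main__":
    # Two runs on the same axis generally give different tile borders.
    print(tile_spans(1000))
    print(tile_spans(1000))
```

Because the offset changes on every call, the tile borders fall in different places in each optimization iteration, which is what keeps seams from building up in the output.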
To process images quickly and avoid unnecessary memory usage, the get_gradient function in inception5h is called just once per tensor layer, before any image is processed, and the resulting gradient is reused.
In [ ]:
def call_get_gradient(layer_tensor):
    gradient = model.get_gradient(layer_tensor)
    return gradient
This optimization loop is the main part of the DeepDream algorithm. It calculates the gradient of the given layer of the Inception model with respect to the input image and adds it to the input image. This increases the mean value of the layer tensor, and repeating the process a number of times amplifies the patterns that the Inception model sees in the input image.
In [ ]:
def optimize_image(layer_tensor, image, gradient,
                   num_iterations=10, step_size=3.0, tile_size=400,
                   show_gradient=True, filename='test'):
    """
    Use gradient ascent to optimize an image so it maximizes the
    mean value of the given layer_tensor.
    Parameters:
    layer_tensor: Reference to a tensor that will be maximized.
    image: Input image used as the starting point.
    gradient: Gradient tensor obtained from call_get_gradient(layer_tensor).
    num_iterations: Number of optimization iterations to perform.
    step_size: Scale for each step of the gradient ascent.
    tile_size: Size of the tiles when calculating the gradient.
    show_gradient: Plot the gradient in each iteration.
    filename: Base name used when saving intermediate results.
    """
    # Copy the image so we don't overwrite the original image.
    img = image.copy()
    print("Image before:")
    plot_image(img)
    # save the file showing the before image
    filename1 = 'images/deepdream_BeforeO_'+filename+'.jpg'
    # kruti sharma - uncomment the below line to save intermediate results
    #save_image(img,filename=filename1)
    print("Processing image: ", end="")
    # kruti sharma - the below function is called outside optimize function now. This is called only once for each tensor layer.
    # Use TensorFlow to get the mathematical function for the
    # gradient of the given layer-tensor with regard to the
    # input image. This may cause TensorFlow to add the same
    # math-expressions to the graph each time this function is called.
    #gradient = model.get_gradient(layer_tensor)
    for i in range(num_iterations):
        # Calculate the value of the gradient.
        # This tells us how to change the image so as to
        # maximize the mean of the given layer-tensor.
        # Pass tile_size through so the parameter actually takes effect.
        grad = tiled_gradient(gradient=gradient, image=img,
                              tile_size=tile_size)
        # Blur the gradient with different amounts and add
        # them together. The blur amount is also increased
        # during the optimization. This was found to give
        # nice, smooth images. You can try and change the formulas.
        # The blur-amount is called sigma (0=no blur, 1=low blur, etc.)
        # We could call gaussian_filter(grad, sigma=(sigma, sigma, 0.0))
        # which would not blur the colour-channel. This tends to
        # give psychedelic / pastel colours in the resulting images.
        # When the colour-channel is also blurred the colours of the
        # input image are mostly retained in the output image.
        sigma = (i * 4.0) / num_iterations + 0.5
        grad_smooth1 = gaussian_filter(grad, sigma=sigma)
        grad_smooth2 = gaussian_filter(grad, sigma=sigma*2)
        grad_smooth3 = gaussian_filter(grad, sigma=sigma*0.5)
        grad = (grad_smooth1 + grad_smooth2 + grad_smooth3)
        # Scale the step-size according to the gradient-values.
        # This may not be necessary because the tiled-gradient
        # is already normalized.
        step_size_scaled = step_size / (np.std(grad) + 1e-8)
        # Update the image by following the gradient.
        img += grad * step_size_scaled
        # kruti sharma - subtracting the gradient instead of adding that to the image.
        #img -= grad * step_size_scaled
        if show_gradient:
            # Print statistics for the gradient.
            msg = "Gradient min: {0:>9.6f}, max: {1:>9.6f}, stepsize: {2:>9.2f}"
            print(msg.format(grad.min(), grad.max(), step_size_scaled))
            # Plot the gradient.
            plot_gradient(grad)
        else:
            # Otherwise show a little progress-indicator.
            print(". ", end="")
    print()
    print("Image after:")
    plot_image(img)
    filename1 = 'images/deepdream_AfterO_'+filename+'.jpg'
    # kruti sharma - uncomment the below line to save intermediate results
    #save_image(img,filename=filename1)
    return img
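The blur amounts used inside optimize_image() grow with the iteration index. The small helper below (the name `blur_sigmas` is made up for this illustration) lists the three sigma values that are passed to gaussian_filter in each iteration, using the same formula as the loop above.

```python
def blur_sigmas(num_iterations):
    """Return the (sigma*0.5, sigma, sigma*2) blur amounts per iteration."""
    schedule = []
    for i in range(num_iterations):
        # Same formula as in optimize_image(): the blur grows with i.
        sigma = (i * 4.0) / num_iterations + 0.5
        schedule.append((sigma * 0.5, sigma, sigma * 2))
    return schedule


if __name__ == "__main__":
    for i, sigmas in enumerate(blur_sigmas(4)):
        print(i, sigmas)
    # 0 (0.25, 0.5, 1.0)
    # 1 (0.75, 1.5, 3.0)
    # 2 (1.25, 2.5, 5.0)
    # 3 (1.75, 3.5, 7.0)
```

The base sigma always starts at 0.5 and stays below 4.5, so early iterations add fine detail and later iterations add smoother, larger-scale changes.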
The helper function below downscales the input image, which speeds up the DeepDream algorithm and also produces better patterns from the Inception model. It downscales the image several times (depending on the num_repeats parameter) and runs each of the downscaled versions through the optimize_image() function defined above.
In [ ]:
def recursive_optimize(layer_tensor, image, gradient,
                       num_repeats=4, rescale_factor=0.7, blend=0.2,
                       num_iterations=10, step_size=3.0,
                       tile_size=400, filename='test'):
    """
    Recursively blur and downscale the input image.
    Each downscaled image is run through the optimize_image()
    function to amplify the patterns that the Inception model sees.
    Parameters:
    image: Input image used as the starting point.
    rescale_factor: Downscaling factor for the image.
    num_repeats: Number of times to downscale the image.
    blend: Factor for blending the original and processed images.
    Parameters passed to optimize_image():
    layer_tensor: Reference to a tensor that will be maximized.
    gradient: Gradient tensor obtained from call_get_gradient(layer_tensor).
    num_iterations: Number of optimization iterations to perform.
    step_size: Scale for each step of the gradient ascent.
    tile_size: Size of the tiles when calculating the gradient.
    """
    # Do a recursive step?
    if num_repeats>0:
        # Blur the input image to prevent artifacts when downscaling.
        # The blur amount is controlled by sigma. Note that the
        # colour-channel is not blurred as it would make the image gray.
        sigma = 0.5
        # kruti sharma : changing the blur value to check how the downscaling is impacted
        #sigma = 1.0
        # kruti sharma : changing the blur value to check how the downscaling is impacted
        #sigma = 0.25
        img_blur = gaussian_filter(image, sigma=(sigma, sigma, 0.0))
        # Downscale the image.
        img_downscaled = resize_image(image=img_blur,
                                      factor=rescale_factor)
        print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Downscale image in Recursive Level: ', num_repeats)
        plot_image(img_downscaled)
        dfilename = 'images/downscale_'+filename+'_'+str(num_repeats)+'.jpg'
        # kruti sharma - uncomment the below line to save the downscaled file
        #save_image(img_downscaled, filename=dfilename)
        # Recursive call to this function.
        # Subtract one from num_repeats and use the downscaled image.
        img_result = recursive_optimize(layer_tensor=layer_tensor,
                                        image=img_downscaled,
                                        gradient=gradient,
                                        num_repeats=num_repeats-1,
                                        rescale_factor=rescale_factor,
                                        blend=blend,
                                        num_iterations=num_iterations,
                                        step_size=step_size,
                                        tile_size=tile_size,
                                        filename=filename)
        # Upscale the resulting image back to its original size.
        img_upscaled = resize_image(image=img_result, size=image.shape)
        print('*****************************Upscaled Image in Recursive Level: ', num_repeats)
        plot_image(img_upscaled)
        ufilename = 'images/upscale_'+filename+'_'+str(num_repeats)+'.jpg'
        # kruti sharma - uncomment the below line to save the upscaled file
        #save_image(img_upscaled, filename=ufilename)
        # Blend the original and processed images.
        image = blend * image + (1.0 - blend) * img_upscaled
    print("Recursive level:", num_repeats)
    # Process the image using the DeepDream algorithm.
    filename1 = filename+'_'+str(num_repeats)
    img_result = optimize_image(layer_tensor=layer_tensor,
                                image=image,
                                gradient=gradient,
                                num_iterations=num_iterations,
                                step_size=step_size,
                                tile_size=tile_size,
                                filename=filename1)
    return img_result
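To see what the recursion does to the image dimensions, the sketch below lists the size at each level and applies the blend formula to a single pixel value. `pyramid_shapes` and `blend_pixel` are hypothetical helpers written for this example; they mirror the integer truncation in resize_image() and the blend line in recursive_optimize(), but do not touch any image data.

```python
def pyramid_shapes(height, width, rescale_factor=0.7, num_repeats=4):
    """List the (height, width) of the image at each recursion level."""
    shapes = [(height, width)]
    for _ in range(num_repeats):
        # resize_image() truncates the scaled size to integers.
        height = int(height * rescale_factor)
        width = int(width * rescale_factor)
        shapes.append((height, width))
    return shapes


def blend_pixel(original, processed, blend=0.2):
    """Blend one pixel value the same way recursive_optimize() does."""
    return blend * original + (1.0 - blend) * processed


if __name__ == "__main__":
    # Halving three times, from the full-size image down to the smallest.
    print(pyramid_shapes(400, 200, rescale_factor=0.5, num_repeats=3))
    # [(400, 200), (200, 100), (100, 50), (50, 25)]
    print(blend_pixel(100.0, 200.0, blend=0.25))  # 175.0
```

With the default blend of 0.2, each level keeps 20% of the unprocessed image at that scale and takes 80% from the upscaled result of the level below it.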
In [ ]:
session = tf.InteractiveSession(graph=model.graph)
Test the algorithm on the old Willy Wonka image.
In [ ]:
image = load_image('images/willy_wonka_old.jpg')
filename = 'willy_wonka_old'
plot_image(image)
Now we use the 3rd layer (layer index = 2) of the Inception model on the input image.
The layer_tensor variable holds the Inception model's 3rd layer, and displaying it shows that it has 192 channels.
In [ ]:
layer_tensor = model.layer_tensors[2]
layer_tensor
Running the DeepDream optimization algorithm with 20 iterations and a step size of 3.0.
In [ ]:
gradient = call_get_gradient(layer_tensor)
img_result = optimize_image(layer_tensor, image, gradient,
num_iterations=20, step_size=3.0, tile_size=400,
show_gradient=False, filename=filename)
In [ ]:
def process_inputs():
    print('Tensor Layer to be Used: '+layer_tensor_ip)
    new_layer_tensor_ip = model.layer_tensors[int(layer_tensor_ip)]
    print('*************************************************')
    print('layer tensor actual value after input from user: ')
    print(new_layer_tensor_ip)
    print('*************************************************')
    # Fall back to the default image when no file has been selected.
    # A local name is used so the global image_ip is left untouched.
    image_name = image_ip if image_ip != "" else 'willy_wonka_new.jpg'
    filename_ip = 'images/'+image_name
    new_image_ip = load_image(filename_ip)
    print('New Input image from user')
    print('*************************************************')
    plot_image(new_image_ip)
    print('*************************************************')
    print('Step Size: '+step_size_ip)
    print('*************************************************')
    print('Rescale factor: '+rescale_factor_ip)
    print('*************************************************')
    print('Number of Iterations: '+num_iterations_ip)
    print('*************************************************')
    print('Number of Repeats: '+num_repeats_ip)
    print('*************************************************')
    print('*************** PROCESSING with Optimize Image **********************')
    parts = image_name.split('.')
    inputImage = parts[0]
    print('New input image: ',inputImage)
    # calling the gradient function outside the optimize_image() function - to reduce the memory consumption
    gradient = call_get_gradient(new_layer_tensor_ip)
    img_result = optimize_image(new_layer_tensor_ip, new_image_ip, gradient,
                                num_iterations=int(num_iterations_ip), step_size=float(step_size_ip), tile_size=400,
                                show_gradient=True, filename=inputImage)
    # The output filename encodes the layer, step size and rescale factor.
    frac= str(rescale_factor_ip).split('.')
    ss = str(step_size_ip).split('.')
    filename_ip = 'images/deepdream_O'+parts[0]+'_'+layer_tensor_ip+'_'+ss[0]+'_0'+frac[1]+'.'+parts[1]
    print('New Filename for Optimize: '+filename_ip)
    save_image(img_result, filename=filename_ip)
    print('*************** PROCESSING with Recursive Optimize Image **********************')
    img_result = recursive_optimize(new_layer_tensor_ip, new_image_ip, gradient,
                                    num_repeats=int(num_repeats_ip), rescale_factor=float(rescale_factor_ip), blend=0.2,
                                    num_iterations=10, step_size=float(step_size_ip),
                                    tile_size=400, filename=inputImage)
    filename_ip = 'images/deepdream_R'+parts[0]+'_'+layer_tensor_ip+'_'+ss[0]+'_0'+frac[1]+'.'+parts[1]
    print('New Filename for Recursive Optimize: '+filename_ip)
    save_image(img_result, filename=filename_ip)
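The form values reach process_inputs() as strings, so they have to be converted before use. One possible way to convert and range-check them in a single place is sketched below; `parse_params`, its argument names, and the specific limits are assumptions for illustration, not part of the notebook.

```python
def parse_params(layer, step_size, rescale_factor, num_iterations,
                 num_repeats, num_layers=12):
    """Convert the form's string values to numbers and validate them."""
    params = {
        "layer": int(layer),
        "step_size": float(step_size),
        "rescale_factor": float(rescale_factor),
        "num_iterations": int(num_iterations),
        "num_repeats": int(num_repeats),
    }
    # Hypothetical limits, loosely based on the hints shown in the form.
    if not 0 <= params["layer"] < num_layers:
        raise ValueError("layer must be between 0 and %d" % (num_layers - 1))
    if params["step_size"] <= 0:
        raise ValueError("step_size must be positive")
    if not 0.0 < params["rescale_factor"] < 1.0:
        raise ValueError("rescale_factor must be between 0 and 1")
    if params["num_iterations"] < 1:
        raise ValueError("num_iterations must be at least 1")
    if params["num_repeats"] < 0:
        raise ValueError("num_repeats must not be negative")
    return params


if __name__ == "__main__":
    # The strings below match the default values of the input form.
    print(parse_params("3", "3.0", "0.7", "10", "4"))
```

Failing early with a ValueError gives a clearer message than an index error or a malformed output filename later in the run.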
In [ ]:
from IPython.display import HTML
input_form = """
<div style="background-color:gainsboro; border:solid black; width:800px; padding:20px;">
<B>Tensor Layer:</B> <input type="text" id="layer_tensor" value="3"> Value between 0 - 11 <br> <br>
<B>Step Size:</B> <input type="text" id="step_size" value="3.0"> <br> <br>
<B>Rescale Factor:</B> <input type="text" id="rescale_factor" value="0.7"> <br> <br>
<B>Iterations:</B> <input type="text" id="num_iterations" value="10"> Value >= 10 <br> <br>
<B>Repeats:</B>   <input type="text" id="num_repeats" value="4"> Value >= 3 <br> <br>
<input type="file" id="file"/><br><br>
<button onclick="process_image()">Set Parameters</button><br> <br>
<span id="output"></span>
</div>
"""
javascript = """
<script type="text/Javascript">
var count=0;
process_image();
document.getElementById('file').onchange = function(event) {
var value = this.value;
console.log(event.target.files[0].name);
var image_name = 'image_ip';
var image_value = event.target.files[0].name;
count++;
var filecommand = image_name + " = '" + image_value + "'";
console.log("File Click: Executing Command: " + filecommand);
var kernel = IPython.notebook.kernel;
kernel.execute(filecommand);
};
function process_image(){
var layer_tensor_name = 'layer_tensor_ip';
var layer_tensor_value = document.getElementById('layer_tensor').value;
var step_size_name = 'step_size_ip';
var step_size_value = document.getElementById('step_size').value;
var rescale_factor_name = 'rescale_factor_ip';
var rescale_factor_value = document.getElementById('rescale_factor').value;
var num_iterations_name = 'num_iterations_ip';
var num_iterations_value = document.getElementById('num_iterations').value;
var num_repeats_name = 'num_repeats_ip';
var num_repeats_value = document.getElementById('num_repeats').value;
var kernel = IPython.notebook.kernel;
var command = layer_tensor_name + " = '" + layer_tensor_value + "'";
console.log("Executing Command: " + command);
kernel.execute(command);
command = step_size_name + " = '" + step_size_value + "'";
console.log("Executing Command: " + command);
kernel.execute(command);
command = rescale_factor_name + " = '" + rescale_factor_value + "'";
console.log("Executing Command: " + command);
kernel.execute(command);
command = num_iterations_name + " = '" + num_iterations_value + "'";
console.log("Executing Command: " + command);
kernel.execute(command);
command = num_repeats_name + " = '" + num_repeats_value + "'";
console.log("Executing Command: " + command);
kernel.execute(command);
if(count == 0){
var image_name = 'image_ip';
var image_value = 'willy_wonka_new.jpg';
var filecommand = image_name + " = '" + image_value + "'";
console.log("Executing Command: " + filecommand);
var kernel = IPython.notebook.kernel;
kernel.execute(filecommand);
}
document.getElementById("output").textContent="Change parameters and uncomment and execute process_inputs() to see output";
}
</script>
"""
HTML(input_form + javascript)
In [ ]:
#process_inputs()
In [ ]:
# The line below is commented out. Uncomment it to close the session once you have finished the run-through.
# session.close()
Running over different sets of parameters, we found that better results are generated with a rescale factor between 0.4 and 0.8. Running the optimize function for 10-20 iterations gives a smooth image with well-defined patterns; with fewer iterations the patterns are not visible. Running the recursive optimize function with at least 4-5 repeats (parameter: number of repeats) blends the image with more lines and patterns, but if the number of repeats is increased too much, the output is no longer a smooth image.
The gradient plays a major role. Adding up gradients with varying amounts of blur helps create a smooth final image in which the patterns and the original image blend well. With a very high blur, the original image itself loses its lines and smoothness. Thus a blur of 0.5 works well.
For analysis and understanding, each of the intermediate outputs can be saved. The corresponding lines are commented out to avoid saving multiple files unnecessarily; they can be uncommented to save each of the intermediate outputs. The final images are saved on the local drive in the images folder.
Copyright (c) 2016 by Magnus Erik Hvass Pedersen
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.