In [0]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
This notebook uses Lucid to produce feature visualizations on 3D mesh surfaces by using a Differentiable Image Parameterization.
This notebook doesn't introduce the abstractions behind lucid; you may wish to also read the Lucid tutorial first.
Note: The easiest way to use this tutorial is as a colab notebook, which allows you to dive in with no setup.
This notebook uses OpenGL and thus requires a GPU, unlike most of our notebooks. You can check whether your GPU is available and configured correctly for TensorFlow:
In [0]:
import tensorflow as tf
assert tf.test.is_gpu_available()
If the above assert statement fails, you can always run the notebook on Colab and use a free GPU by selecting:
Runtime → Change runtime type → Hardware Accelerator: GPU
In [0]:
!pip -q install "lucid>=0.2.3"  # quotes keep the shell from treating >= as a redirection
In [0]:
import os
import io
import sys
from string import Template
import numpy as np
import PIL.Image
import matplotlib.pylab as pl
from IPython.display import clear_output, display, Image, HTML
from lucid.misc.gl.glcontext import create_opengl_context
import OpenGL.GL as gl
from lucid.misc.gl import meshutil
from lucid.misc.gl import glrenderer
import lucid.misc.io.showing as show
from lucid.misc.io import load
from lucid.misc.tfutil import create_session
from lucid.modelzoo import vision_models
from lucid.optvis import objectives
from lucid.optvis import param
from lucid.optvis import render as lucid_render
from lucid.optvis.param.spatial import sample_bilinear
Let's create an OpenGL context and check the installed version of OpenGL:
In [5]:
create_opengl_context()
gl.glGetString(gl.GL_VERSION)
Out[5]:
In [6]:
!gsutil cp gs://deepdream/article_models.zip . && \
unzip -qo article_models.zip && \
cat article_models/readme.txt
The 3D models are in the common .obj format and come with textures. Let's take a brief look:
In [7]:
!ls article_models
Let's ensure they load…
In [0]:
mesh = meshutil.load_obj('article_models/bunny.obj')
mesh = meshutil.normalize_mesh(mesh)
original_texture = load('article_models/bunny.png')
…and look reasonable. This also shows how to use our built-in 3D viewer:
In [9]:
show.textured_mesh(mesh, original_texture)
We describe this process in the article's section Efficient Texture Optimization through 3D Rendering. Remember that beside the 3D model we just loaded, the main ingredients are a renderer that rasterizes the model into a UV map for a given viewpoint, a parameterized texture that we sample through that map to produce a flat image, and a pretrained CNN whose activations define our objective. From there we can use our knowledge of which parts of the 3D model were visible in the flat image to backpropagate the objective's gradient through the rendering process and into the learned texture.
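To make that gradient path concrete, here is a toy sketch (plain NumPy with made-up shapes and random values, not the code we'll actually use) of how a UV map turns a texture into a flat image: each screen pixel simply reads the texel selected by its UV coordinates. The real pipeline below uses lucid's differentiable bilinear sampling, so the same mapping routes gradients from the rendered image back into the texture.
In [0]:
# Toy illustration only: nearest-neighbor texture lookup driven by a UV map.
# Shapes and values are arbitrary; the actual code later uses sample_bilinear.
toy_texture = np.random.rand(64, 64, 3)           # the texture we would be learning
toy_uv = np.random.rand(128, 128, 2)              # per-pixel UV coordinates from the renderer
toy_alpha = (np.random.rand(128, 128, 1) > 0.5)   # foreground mask
tex_h, tex_w = toy_texture.shape[:2]
ix = np.clip((toy_uv[..., 0] * (tex_w - 1)).astype(int), 0, tex_w - 1)
iy = np.clip((toy_uv[..., 1] * (tex_h - 1)).astype(int), 0, tex_h - 1)
flat_image = toy_texture[iy, ix] * toy_alpha      # each screen pixel reads one texel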
We provide a way to sample random views onto a 3D model. You can specify the range of distances, and the resulting views will be centered on the object. The resulting 4x4 matrix is interpreted as a ModelView matrix.
In [0]:
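# Sample a random camera view at a distance between 11 and 13 units, centered on the object.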
random_view = meshutil.sample_view(11.0, 13.0)
Let's initialize a renderer and take a look at our mesh from the direction of random_view.
In [0]:
renderer = glrenderer.MeshRenderer((512, 512))
In [22]:
random_view_image = renderer.render_mesh(modelview=random_view, **mesh)
show.image(random_view_image)
Note that this image has an alpha channel to separate foreground from background, and the colors red and green encode the UV coordinates—where a pixel in the texture would end up on the model. We will use this information to take the gradient of the flat image coming from our CNN model and translate it back onto the texture we're learning/optimizing.
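To see this concretely, here's an optional quick look at the individual channels of the buffer we just rendered; the channel layout assumed here is [U, V, unused, alpha], matching the description above.
In [0]:
# Optional: plot the U, V and alpha channels of the rendered buffer separately.
fig, axes = pl.subplots(1, 3, figsize=(12, 4))
for ax, idx, name in zip(axes, [0, 1, 3], ['U', 'V', 'alpha']):
    ax.imshow(random_view_image[..., idx])
    ax.set_title(name)
    ax.axis('off')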
We want to synthesize a texture with some property that we can describe in the feature space of our pretrained CNN. For simplicity, we focus on a simple Feature Visualization objective here, but in the follow-up notebook we will use a more complex style transfer objective for even more interesting results.
Let's start by loading up our CNN model as usual:
In [0]:
model = vision_models.InceptionV1()
model.load_graphdef()
And quickly see a simple 2D image optimized for the same Feature Visualization objective we'll later use to generate the 3D model's texture:
In [14]:
objective = objectives.channel('mixed4b_pool_reduce_pre_relu', 17)
# lucid.optvis.render is imported as lucid_render to distinguish it from the 3D renderer.
vis = lucid_render.render_vis(model, objective, verbose=False)
show.image(vis)
In [0]:
sess = create_session()
# t_fragments is used to feed rasterized UV coordinates for the current view.
# Channels: [U, V, _, Alpha]. Alpha is 1 for pixels covered by the object, and
# 0 for background.
t_fragments = tf.placeholder(tf.float32, [None, None, 4])
t_uv = t_fragments[...,:2]
t_alpha = t_fragments[...,3:]
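# The texture is the thing we actually optimize: a 1024x1024 image parameterized in a
# decorrelated Fourier (FFT) basis, which tends to make the optimization better behaved.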
t_texture = param.image(1024, fft=True, decorrelate=True)[0]
t_frame = sample_bilinear(t_texture, t_uv) * t_alpha
model.import_graph(t_frame)
def T(layer):
  return sess.graph.get_tensor_by_name("import/%s:0" % layer)
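# A few alternative feature channels to try; uncomment one to use it as the objective instead.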
# obj = objectives.channel('mixed3a_1x1_pre_relu', 1)(T)
# obj = objectives.channel('mixed4a_1x1_pre_relu', 26)(T)
# obj = objectives.channel('mixed4a_1x1_pre_relu', 11)(T)
# obj = objectives.channel('mixed4a_3x3_pre_relu', 27)(T)
# obj = objectives.channel('mixed4a_3x3_pre_relu', 174)(T)
# obj = objectives.channel('mixed4a_1x1_pre_relu', 179)(T)
# obj = objectives.channel('mixed4a_1x1_pre_relu', 190)(T)
# obj = objectives.channel('mixed4a_1x1_pre_relu', 5)(T)
obj = objectives.channel('mixed4b_pool_reduce_pre_relu', 17)(T)
tf.losses.add_loss(-obj)
t_lr = tf.constant(0.01)
t_loss = tf.losses.get_total_loss()
trainer = tf.train.AdamOptimizer(t_lr)
train_op = trainer.minimize(t_loss)
init_op = tf.global_variables_initializer()
init_op.run()
We can sanity-check that at least our parameterization fits together by generating the UV map again with the renderer ("fragments") and then evaluating the t_frame tensor while feeding the original texture:
In [26]:
fragments = renderer.render_mesh(modelview=meshutil.sample_view(11.0, 13.0), **mesh)
img = t_frame.eval({t_fragments: fragments, t_texture: original_texture})
show.images([fragments, img])
Looks reasonable! Let's run the actual optimization loop and see if we can generate a texture!
In [17]:
loss_log = []
init_op.run()
for i in range(400):
  # Render the mesh UVs with OpenGL from a random viewpoint.
  fragments = renderer.render_mesh(modelview=meshutil.sample_view(11.0, 13.0), **mesh)
  # Perform one optimization step for the current view.
  _, loss = sess.run([train_op, t_loss], {t_fragments: fragments, t_lr: 0.03})
  loss_log.append(loss)
  # Reporting
  if i == 0 or (i+1) % 50 == 0:
    clear_output()
    last_frame = sess.run(t_frame, {t_fragments: fragments})
    show.images([last_frame, fragments], ['current', 'uv'])
  if i == 0 or (i+1) % 10 == 0:
    print(len(loss_log), loss)
Since this is such a stochastic procedure, it's good to sanity-check that the loss is going down overall. Remember the loss only captures how well the final rendered image activates the feature we are optimizing for, while we view the 3D model from a different random perspective at each step, so expect high variance.
In [18]:
pl.plot(loss_log);
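If the raw curve is hard to read, a simple moving average (window size chosen arbitrarily here) makes the overall trend easier to see through the per-view noise:
In [0]:
# Optional: smooth the loss with a simple moving average; the window of 20 is arbitrary.
window = 20
smoothed = np.convolve(loss_log, np.ones(window) / window, mode='valid')
pl.plot(loss_log, alpha=0.3, label='raw')
pl.plot(np.arange(window - 1, len(loss_log)), smoothed, label='smoothed')
pl.legend();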
In [19]:
texture = t_texture.eval()
show.textured_mesh(mesh, texture)
You can also view the texture we optimized directly:
In [20]:
show.image(texture, 'jpeg')