This exercise shows how to apply convolutions to an image using TensorFlow. The idea is to create a weight matrix and apply the conv2d function with 'SAME' and 'VALID' padding to see the effect of each on the output image.
To show how the output changes with the convolution parameters, the exercise builds the first layer of a Convolutional Neural Network (CNN), including max pooling and ReLU, and visualizes the outputs for a given input image.
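Before running anything, it can help to know what output sizes to expect from the two padding modes. The sketch below follows the output-size rules TensorFlow documents for 'SAME' and 'VALID' padding; the helper name conv_output_size is hypothetical and is not part of TensorFlow.

```python
import math

def conv_output_size(n, k, stride=1, padding="SAME"):
    """Output size along one spatial dimension (hypothetical helper).

    n: input size, k: filter size, stride: step between filter positions.
    """
    if padding == "SAME":
        # Input is zero-padded so that only the stride shrinks the output.
        return math.ceil(n / stride)
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return math.ceil((n - k + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

same_size = conv_output_size(512, 5, stride=1, padding="SAME")
valid_size = conv_output_size(512, 5, stride=1, padding="VALID")
print(same_size, valid_size)  # 512 508
```

For the 512 x 512 image and 5 x 5 filter used in this exercise, 'VALID' loses 4 pixels per dimension (2 on each side), while 'SAME' keeps the input size.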
In [1]:
#Importing
import numpy as np
from scipy import signal
from scipy import misc
import matplotlib.pyplot as plt
from PIL import Image
import tensorflow as tf
In [2]:
import scipy
scipy.__version__
Out[2]:
In [3]:
import PIL
PIL.__version__
Out[3]:
First download a sample image:
In [2]:
!wget --output-document ../../data/lena.png https://ibm.box.com/shared/static/o4x2cvlqfre9lax05ihbn47cxpnzwlvb.png
In [4]:
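#read the image as Float data type
#(scipy.misc.imread was removed in SciPy 1.2 and np.float in NumPy 1.24, so PIL and np.float64 are used here)
im=np.array(Image.open("../../data/lena.png")).astype(np.float64)
#im=np.array(Image.open("one.png")).astype(np.float64)
#Convert image to gray scale using the standard luma weights for R, G and B
grayim=np.dot(im[...,:3], [0.299, 0.587, 0.114])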
#Plot the images
%matplotlib inline
plt.subplot(1, 2, 1)
#imshow expects float RGB values in [0, 1], so show the 0-255 data as uint8
plt.imshow(im.astype(np.uint8))
plt.xlabel(" Float Image ")
plt.subplot(1, 2, 2)
plt.imshow(grayim, cmap=plt.get_cmap("gray"))
plt.xlabel(" Gray Scale Image ")
Out[4]:
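The grayscale conversion above is a weighted sum over the colour channels. As a minimal sketch, the same np.dot call can be tried on a made-up 2 x 2 RGB array (the pixel values are illustrative assumptions, not taken from the image):

```python
import numpy as np

# Luma weights for R, G, B, as used in the cell above.
luma_weights = np.array([0.299, 0.587, 0.114])

# Toy 2 x 2 RGB image: white, red, green and blue pixels.
rgb = np.array([[[255.0, 255.0, 255.0], [255.0, 0.0, 0.0]],
                [[0.0, 255.0, 0.0], [0.0, 0.0, 255.0]]])

# The dot product contracts the last axis (channels), leaving an (H, W) array.
gray = np.dot(rgb[..., :3], luma_weights)
print(gray.shape)  # (2, 2)
print(gray)
```

Because the three weights sum to 1, a white pixel stays at full intensity, while pure green maps to a brighter gray than pure red or pure blue.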
Print the shape of the grayscale image:
In [5]:
# Your Code Goes Here
grayim.shape
Out[5]:
In [6]:
Image = np.expand_dims(np.expand_dims(grayim, 0), -1)
Image.shape
Out[6]:
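The two expand_dims calls add a batch axis in front and a channel axis at the end, giving the [batch, height, width, channels] layout that conv2d expects. A small sketch on a dummy array (the 4 x 6 size is an arbitrary assumption for illustration):

```python
import numpy as np

gray = np.zeros((4, 6))  # stand-in for an (H, W) grayscale image

# Add a leading batch axis and a trailing channel axis.
batched = np.expand_dims(np.expand_dims(gray, 0), -1)

# Equivalent indexing form using np.newaxis.
also_batched = gray[np.newaxis, ..., np.newaxis]

print(batched.shape)       # (1, 4, 6, 1)
print(also_batched.shape)  # (1, 4, 6, 1)
```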
In [7]:
# Your Code Goes Here
in_img = tf.placeholder(dtype=tf.float32, shape=[None, 512, 512, 1])
in_img.get_shape().as_list()
Out[7]:
In [8]:
# Your Code Goes Here
weights = tf.Variable(tf.constant(1., shape=[5, 5, 1, 1], dtype=tf.float32))
weights.get_shape().as_list()
Out[8]:
In [9]:
# Your Code Goes Here
# For convolution output 1
ConOut = tf.nn.conv2d(input=in_img, filter=weights, strides=[1, 1, 1, 1], padding='SAME')
# For convolution output 2
ConOut2 = tf.nn.conv2d(input=in_img, filter=weights, strides=[1, 1, 1, 1], padding='VALID')
In [10]:
init = tf.global_variables_initializer()
sess= tf.Session()
sess.run(init)
Run the session to get the results of the two convolution operations
In [11]:
# Your Code Goes Here
# Session for Result 1
result = sess.run(ConOut, feed_dict={in_img: Image})
# Session for Result 2
result2 = sess.run(ConOut2, feed_dict={in_img: Image})
In [12]:
# for the result with 'SAME' Padding
# drop the batch and channel dimensions: (1, 512, 512, 1) -> (512, 512)
image = np.reshape(result, (512, 512))
print(image.shape)
# for the result with 'VALID' Padding
# drop the batch and channel dimensions: (1, 508, 508, 1) -> (508, 508)
image2 = np.reshape(result2, (508, 508))
print(image2.shape)
Display the images
In [13]:
#Plot the images
%matplotlib inline
plt.subplot(1, 2, 1)
plt.imshow(image,cmap=plt.get_cmap("gray"))
plt.xlabel(" SAME Padding ")
plt.subplot(1, 2, 2)
plt.imshow(image2, cmap=plt.get_cmap("gray"))
plt.xlabel(" VALID Padding ")
Out[13]:
Feel free to change the weight matrix and experiment with different padding settings to see how the output images change.
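To see what the two padding modes do without TensorFlow, here is a minimal NumPy sketch of the same operation with an all-ones kernel. The function name window_sum is hypothetical, the 8 x 8 all-ones input is a toy assumption, and the 'SAME' branch assumes an odd kernel size and stride 1.

```python
import numpy as np

def window_sum(img, k=5, padding="VALID"):
    """Sum each k x k window of a 2D array (an all-ones kernel), stride 1."""
    if padding == "SAME":
        # Zero-pad k // 2 pixels on every side so the output keeps the input size.
        img = np.pad(img, k // 2, mode="constant")
    h, w = img.shape
    out = np.empty((h - k + 1, w - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = img[r:r + k, c:c + k].sum()
    return out

toy = np.ones((8, 8))
valid = window_sum(toy, padding="VALID")
same = window_sum(toy, padding="SAME")
print(valid.shape, same.shape)  # (4, 4) (8, 8)
print(same[0, 0], same[4, 4])   # 9.0 25.0
```

With 'SAME' padding, windows near the border overlap the zero padding, so corner values are smaller than interior ones. This is why the border of the 'SAME' output can look darker than the rest of the image.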
Using the conv2d function above, let's build our first convolutional layer. In most CNN architectures, Layer 1 consists of convolution, ReLU and max pooling. Let's create these functions to check their effect on the "Lena" image. Depending on the architecture these functions may change; for this exercise, let's assume Layer 1 has just these three: convolution, ReLU and max pooling.
A deep CNN is most often a repetition of these layers stacked on top of each other.
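ReLU simply replaces negative values with zero, and the per-channel bias is added by broadcasting over the last axis. A minimal NumPy sketch, using made-up values and a hypothetical relu helper in place of tf.nn.relu:

```python
import numpy as np

def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

# Toy feature maps in [batch, height, width, channels] layout with 3 channels,
# each channel holding the same four values.
values = np.array([-2.0, -0.5, 0.0, 1.5]).reshape(1, 2, 2, 1)
feature_maps = values * np.ones((1, 2, 2, 3))

# One bias per output channel; broadcasting adds it to every pixel of that channel.
bias = np.array([0.0, 1.0, -1.0])

activated = relu(feature_maps + bias)
print(activated.shape)     # (1, 2, 2, 3)
print(activated[0, :, :, 1])
```

The shape is unchanged by ReLU; only the max pooling step that follows it reduces the spatial size.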
In [14]:
#let's create functions for convolution and max pooling
def conv2d(X,W):
    return tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME')
def MaxPool(X):
    return tf.nn.max_pool(X, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
In [15]:
weights = {
    # 5 x 5 convolution, 1 input channel, 32 output channels
    'W_conv1': tf.Variable(tf.random_normal([5, 5, 1, 32]))
}
biases = {
    # one bias per output channel
    'b_conv1': tf.Variable(tf.random_normal([32]))
}
The output of conv2d is passed through ReLU, and the output of ReLU is then given as input to the max pooling layer. Let's define the graph and print the shapes. The image size is reduced after passing through the max pooling layer: with a 2 x 2 window and a stride of 2, the height and width are halved. You can change the window size and strides in the max pooling layer to check how the image size varies.
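As a rough illustration of why the size is halved, the sketch below implements 2 x 2 max pooling with stride 2 in NumPy. The helper max_pool_2x2 is hypothetical and assumes a 2D input with even height and width, so no padding is needed.

```python
import numpy as np

def max_pool_2x2(x):
    """2 x 2 max pooling with stride 2 on an (H, W) array with even H and W."""
    h, w = x.shape
    # Split each axis into (blocks, 2) and take the max inside every 2 x 2 block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(x)
print(pooled.shape)  # (2, 2)
print(pooled)        # [[ 5.  7.] [13. 15.]]
```

Each output pixel keeps only the largest value of its 2 x 2 block, so a 512 x 512 feature map becomes 256 x 256.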
In [16]:
conv1 = tf.nn.relu(conv2d(in_img, weights['W_conv1']) + biases['b_conv1'])
Mxpool = MaxPool(conv1)
print(conv1.get_shape().as_list())
print(Mxpool.get_shape().as_list())
In [17]:
init = tf.global_variables_initializer()
sess= tf.Session()
sess.run(init)
In [18]:
Layer1 = sess.run(Mxpool, feed_dict={in_img: Image})
In [19]:
print(Layer1.shape)
# drop the batch dimension: (1, 256, 256, 32) -> (256, 256, 32)
vec = np.reshape(Layer1, (256,256,32))
print(vec.shape)
# show each of the 32 feature maps, labelled with its channel index
for i in range(32):
    image=vec[:,:,i]
    #print(image)
    #image *= 255.0/image.max()
    #print(image)
    plt.imshow(image,cmap=plt.get_cmap("gray"))
    plt.xlabel( i , fontsize=20, color='red')
    plt.show()
    plt.close()
The idea behind this exercise is to understand how to apply convolutions and related functions to images. We are NOT training a neural network here; we are only checking the effects of changing the parameters of the functions above, which are the basic building blocks of any deep convolutional neural network.
Created by Shashibushan Yenkanchi