Convolutions

Objectives:

Application of convolution on images

Reading and opening images

The following code reads an image, stores it in a numpy array and displays it in the notebook.



In [ ]:

    
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

from skimage.io import imread
from skimage.transform import resize



In [ ]:

    
sample_image = imread("bumblebee.png")
sample_image = sample_image.astype("float32")

size = sample_image.shape
print("sample image shape: ", sample_image.shape)

plt.imshow(sample_image.astype('uint8'));

A simple convolution filter

The goal of this section is to use TensorFlow / Keras to perform individual convolutions on images. This section does not involve training any model yet.



In [ ]:

    
import tensorflow as tf
print(tf.__version__)



In [ ]:

    
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D



In [ ]:

    
conv = Conv2D(filters=3, kernel_size=(5, 5), padding="same",
              input_shape=(None, None, 3))

Remember: in Keras, None is used as a marker for tensor dimensions with dynamic size. In this case batch_size, width and height are all dynamic: they can depend on the input. Only the number of input channels (3 colors) has been fixed.



In [ ]:

    
sample_image.shape



In [ ]:

    
img_in = np.expand_dims(sample_image, 0)
img_in.shape

Questions:

If we apply this convolution to this image, what will be the shape of the generated feature map?

Hints:

in Keras, padding="same" means that the convolution uses as much padding as necessary so as to preserve the spatial dimensions of the input maps or image;
in Keras, convolutions use a stride of 1 by default.

Bonus: how much padding does Keras have to use to preserve the spatial dimensions in this particular case?



In [ ]:

    
img_out = conv(img_in)
print(type(img_out), img_out.shape)
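
The spatial shape printed above can be checked by hand. The following is a minimal sketch, assuming the output-size rules that TensorFlow documents for "same" and "valid" padding; the helper names `conv_output_size` and `same_padding` and the width of 600 pixels are hypothetical, chosen only for illustration.

```python
import math


def conv_output_size(n, kernel_size, stride=1, padding="same"):
    """Output size along one spatial axis (assumed TensorFlow-style rules)."""
    if padding == "same":
        return math.ceil(n / stride)
    if padding == "valid":
        return math.ceil((n - kernel_size + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")


def same_padding(n, kernel_size, stride=1):
    """(before, after) zero padding needed along one axis for padding='same'."""
    out = math.ceil(n / stride)
    total = max((out - 1) * stride + kernel_size - n, 0)
    before = total // 2
    return before, total - before


# Hypothetical axis of 600 pixels, 5x5 kernel, stride 1:
print(conv_output_size(600, 5))  # 600
print(same_padding(600, 5))      # (2, 2)
```

With a 5x5 kernel and a stride of 1, two rows or columns of zeros on each side are enough to keep the spatial size unchanged.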

The output is a TensorFlow EagerTensor, which can be converted to a standard numpy array:



In [ ]:

    
np_img_out = img_out[0].numpy()
print(type(np_img_out))



In [ ]:

    
fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(10, 5))
ax0.imshow(sample_image.astype('uint8'))
ax1.imshow(np_img_out.astype('uint8'));

The output has 3 channels, so matplotlib can also display it as an RGB image. However, it is the result of applying randomly initialized convolution kernels to the original image.

Let's look at the parameters:



In [ ]:

    
conv.count_params()

Question: can you compute the number of trainable parameters from the layer hyperparameters?

Hints:

the input image has 3 colors and a single convolution kernel mixes information from all the three input channels to compute its output;
a convolutional layer outputs many channels at once: each channel is the output of a distinct convolution operation (aka unit) of the layer;
do not forget the biases!



In [ ]:

Solution: let's introspect the Keras layer:



In [ ]:

    
len(conv.get_weights())



In [ ]:

    
weights = conv.get_weights()[0]
weights.shape

Each of the 3 output channels is generated by a distinct convolution kernel.

Each convolution kernel has a spatial size of 5x5 and operates across 3 input channels.



In [ ]:

    
biases = conv.get_weights()[1]
biases.shape

One bias per output channel.
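
The count can be reproduced from the hyperparameters alone. This is a small sketch; the function name `conv2d_param_count` is hypothetical and not part of Keras.

```python
def conv2d_param_count(kernel_size, in_channels, filters, use_bias=True):
    """Number of trainable parameters of a standard 2D convolution layer."""
    kernel_height, kernel_width = kernel_size
    # One kernel of shape (height, width, in_channels) per output filter.
    n_weights = kernel_height * kernel_width * in_channels * filters
    # One bias per output filter.
    n_biases = filters if use_bias else 0
    return n_weights + n_biases


print(conv2d_param_count((5, 5), in_channels=3, filters=3))  # 228
```

Here 5 * 5 * 3 * 3 = 225 kernel weights plus 3 biases give 228, which should match the value returned by conv.count_params().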

We can instead build the kernel ourselves by defining an initializer function and passing it to the Conv2D layer. Each filter is filled with 1/25 on its own input channel and with zeros elsewhere, so the color channels stay separated.



In [ ]:

    
def my_init(shape=(5, 5, 3, 3), dtype=None):
    array = np.zeros(shape=shape, dtype="float32")
    array[:, :, 0, 0] = 1 / 25
    array[:, :, 1, 1] = 1 / 25
    array[:, :, 2, 2] = 1 / 25
    return array

We can display the numpy filters by moving the spatial dimensions to the end (using np.transpose):



In [ ]:

    
np.transpose(my_init(), (2, 3, 0, 1))



In [ ]:

    
conv = Conv2D(filters=3, kernel_size=(5, 5), padding="same",
              input_shape=(None, None, 3), kernel_initializer=my_init)



In [ ]:

    
fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(10, 5))
ax0.imshow(img_in[0].astype('uint8'))

img_out = conv(img_in)
np_img_out = img_out[0].numpy()
ax1.imshow(np_img_out.astype('uint8'));
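
To see what this kernel computes, here is a plain numpy sketch of the same operation: a 5x5 mean filter applied to each channel independently, with zero padding and a stride of 1. The function `box_blur_same` and the random 8x8 toy image are assumptions made for illustration; this is not how Keras implements the layer internally.

```python
import numpy as np


def box_blur_same(image, k=5):
    """Per-channel k x k mean filter with zero padding and stride 1."""
    pad = k // 2
    h, w, _ = image.shape
    # Zero-pad the two spatial axes only, not the channel axis.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    out = np.zeros_like(image, dtype="float32")
    # Sum the k * k shifted copies of the image, then divide by k * k.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / (k * k)


rng = np.random.default_rng(0)
toy = rng.uniform(0, 255, size=(8, 8, 3)).astype("float32")
blurred = box_blur_same(toy)

# An interior pixel is the mean of its 5x5 neighbourhood, channel by channel.
print(np.allclose(blurred[4, 4], toy[2:7, 2:7].mean(axis=(0, 1)), rtol=1e-4))
```

Near the borders the window overlaps the zero padding, so the sum is still divided by 25 and the output is darker there than a true local mean would be.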

Exercise

Define a Conv2D layer with 3 filters (5x5) that computes the identity function (preserves the input image without mixing the colors).
Change the stride to 2. What is the size of the output image?
Change the padding to 'VALID'. What do you observe?



In [ ]:



In [ ]:

    
# %load solutions/strides_padding.py

Edge detection on a grayscale image



In [ ]:

    
# convert image to greyscale
grey_sample_image = sample_image.mean(axis=2)

# add the channel dimension even if it's only one channel so
# as to be consistent with Keras expectations.
grey_sample_image = grey_sample_image[:, :, np.newaxis]


# matplotlib does not like the extra dim for the color channel
# when plotting gray-level images. Let's use squeeze:
plt.imshow(np.squeeze(grey_sample_image.astype(np.uint8)),
           cmap=plt.cm.gray);

Exercise

Build an edge detector using Conv2D on a greyscale image.
You may experiment with several kernels to find a way to detect edges:
https://en.wikipedia.org/wiki/Kernel_(image_processing)

Try Conv2D? or press shift-tab to get the documentation. You may get help at https://keras.io/layers/convolutional/
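
As a starting point, the sketch below applies one classical kernel, the horizontal Sobel operator, to a synthetic image with a single vertical edge, using plain numpy rather than Conv2D. The helper `correlate2d_valid` and the 6x8 step image are hypothetical; plugging such a kernel into a Conv2D layer is left to the exercise.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Sliding-window weighted sum without padding (no kernel flip)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    out = np.zeros((out_h, out_w), dtype="float32")
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * image[dy:dy + out_h, dx:dx + out_w]
    return out


# Horizontal Sobel kernel: responds to intensity changes from left to right.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype="float32")

# Synthetic image: dark left half, bright right half.
step = np.zeros((6, 8), dtype="float32")
step[:, 4:] = 1.0

response = correlate2d_valid(step, sobel_x)
print(response.shape)  # (4, 6)
print(response[0])     # [0. 0. 4. 4. 0. 0.]
```

The response is zero in the flat regions and non-zero only in the windows that straddle the edge.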



In [ ]:



In [ ]:

    
# %load solutions/edge_detection

Pooling and strides with convolutions

Exercise

Use MaxPool2D to apply a 2x2 max pooling with a stride of 2 to the image. What is the impact on the shape of the image?
Use AvgPool2D to apply an average pooling.
Is it possible to compute a max pooling and an average pooling with well-chosen kernels?

Bonus

Implement a 3x3 average pooling with a regular Conv2D convolution, using well-chosen strides, kernel and padding.
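
Before using the Keras layers, it may help to see the mechanics of 2x2 pooling with a stride of 2 in plain numpy. This is only a sketch: the function `pool2x2` and the 4x6 toy image are hypothetical, and odd trailing rows or columns are simply dropped.

```python
import numpy as np


def pool2x2(image, reduce_fn):
    """Non-overlapping 2x2 pooling (stride 2) on a (height, width, channels) array."""
    h, w, c = image.shape
    h2, w2 = h // 2, w // 2
    # Split each spatial axis into (number of blocks, 2), then reduce over each block.
    blocks = image[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2, c)
    return reduce_fn(blocks, axis=(1, 3))


toy = np.arange(4 * 6, dtype="float32").reshape(4, 6, 1)
max_pooled = pool2x2(toy, np.max)
avg_pooled = pool2x2(toy, np.mean)

print(max_pooled.shape)      # (2, 3, 1)
print(max_pooled[:, :, 0])   # [[ 7.  9. 11.] [19. 21. 23.]]
print(avg_pooled[:, :, 0])   # [[ 3.5  5.5  7.5] [15.5 17.5 19.5]]
```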



In [ ]:

    
from tensorflow.keras.layers import MaxPool2D, AvgPool2D



In [ ]:

    
# %load solutions/pooling.py



In [ ]:

    
# %load solutions/average_as_conv.py


