Convolutions

Objectives:

  • Application of convolution on images

Reading and opening images

The following code reads an image, puts it in a numpy array and displays it in the notebook.


In [ ]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

from skimage.io import imread
from skimage.transform import resize

In [ ]:
sample_image = imread("bumblebee.png")
sample_image = sample_image.astype("float32")

size = sample_image.shape
print("sample image shape: ", sample_image.shape)

plt.imshow(sample_image.astype('uint8'));

A simple convolution filter

The goal of this section is to use TensorFlow / Keras to perform individual convolutions on images. This section does not involve training any model yet.


In [ ]:
import tensorflow as tf
print(tf.__version__)

In [ ]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

In [ ]:
conv = Conv2D(filters=3, kernel_size=(5, 5), padding="same",
              input_shape=(None, None, 3))

Remember: in Keras, None is used as a marker for tensor dimensions with a dynamic size. In this case the batch size, height and width are all dynamic: they can depend on the input. Only the number of input channels (3 colors) has been fixed.
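
As a quick illustration (a minimal sketch with made-up input sizes), the same layer can be applied to batches of images of different heights and widths, as long as they have 3 channels:


In [ ]:
# The spatial dimensions are dynamic: only the number of channels is fixed.
for height, width in [(32, 48), (64, 64)]:
    dummy_batch = np.zeros(shape=(1, height, width, 3), dtype="float32")
    print(dummy_batch.shape, "->", conv(dummy_batch).shape)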


In [ ]:
sample_image.shape

In [ ]:
img_in = np.expand_dims(sample_image, 0)
img_in.shape

Questions:

If we apply this convolution to this image, what will be the shape of the generated feature map?

Hints:

  • in Keras, padding="same" means that the convolution uses as much zero padding as necessary so as to preserve the spatial dimensions of the input maps or image;

  • in Keras, convolutions use a stride of 1 by default (no subsampling).

Bonus: how much padding does Keras have to use to preserve the spatial dimensions in this particular case?
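
One way to check (a small sketch, using the fact that "same" padding with stride 1 preserves the spatial size):


In [ ]:
# For an odd kernel of size k and stride 1, "same" padding adds
# (k - 1) // 2 pixels of zero padding on each side of the image.
kernel_size = 5
print((kernel_size - 1) // 2)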


In [ ]:
img_out = conv(img_in)
print(type(img_out), img_out.shape)

The output is a TensorFlow EagerTensor, which can be converted to a standard numpy array:


In [ ]:
np_img_out = img_out[0].numpy()
print(type(np_img_out))

In [ ]:
fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(10, 5))
ax0.imshow(sample_image.astype('uint8'))
ax1.imshow(np_img_out.astype('uint8'));

The output has 3 channels, hence it can also be interpreted as an RGB image with matplotlib. However, it is the result of a randomly initialized convolutional filter applied to the original image.

Let's look at the parameters:


In [ ]:
conv.count_params()

Question: can you compute the number of trainable parameters from the layer hyperparameters?

Hints:

  • the input image has 3 colors and a single convolution kernel mixes information from all the three input channels to compute its output;

  • a convolutional layer outputs many channels at once: each channel is the output of a distinct convolution operation (aka unit) of the layer;

  • do not forget the biases!


In [ ]:

Solution: let's introspect the Keras layer:


In [ ]:
len(conv.get_weights())

In [ ]:
weights = conv.get_weights()[0]
weights.shape

Each of the 3 output channels is generated by a distinct convolution kernel.

Each convolution kernel has a spatial size of 5x5 and operates across 3 input channels.


In [ ]:
biases = conv.get_weights()[1]
biases.shape

One bias per output channel.
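
As a cross-check (a small sketch of the arithmetic), the parameter count can be recomputed from the layer hyperparameters:


In [ ]:
# 5x5 spatial kernel x 3 input channels x 3 output channels,
# plus one bias per output channel.
n_kernel_params = 5 * 5 * 3 * 3
n_biases = 3
print(n_kernel_params + n_biases, conv.count_params())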

We can instead build the kernel ourselves, by defining a function which will be passed to the Conv2D layer as its kernel_initializer. We'll create an array filled with 1/25 so that each output channel is the 5x5 box average of the matching input channel, keeping the color channels separate.


In [ ]:
def my_init(shape=(5, 5, 3, 3), dtype=None):
    array = np.zeros(shape=shape, dtype="float32")
    array[:, :, 0, 0] = 1 / 25
    array[:, :, 1, 1] = 1 / 25
    array[:, :, 2, 2] = 1 / 25
    return array

We can display the numpy filters by moving the spatial dimensions to the end (using np.transpose):


In [ ]:
np.transpose(my_init(), (2, 3, 0, 1))

In [ ]:
conv = Conv2D(filters=3, kernel_size=(5, 5), padding="same",
              input_shape=(None, None, 3), kernel_initializer=my_init)

In [ ]:
fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(10, 5))
ax0.imshow(img_in[0].astype('uint8'))

img_out = conv(img_in)
np_img_out = img_out[0].numpy()
ax1.imshow(np_img_out.astype('uint8'));

Exercise

  • Define a Conv2D layer with 3 filters (5x5) that computes the identity function (preserves the input image without mixing the colors); a possible sketch is given after this list.
  • Change the stride to 2. What is the size of the output image?
  • Change the padding to 'VALID'. What do you observe?
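
A possible sketch for the first item (one way to build an identity kernel, assuming the default zero bias): the kernel is zero everywhere except for a 1 at the spatial center, connecting each input channel only to the matching output channel.


In [ ]:
def identity_init(shape=(5, 5, 3, 3), dtype=None):
    array = np.zeros(shape=shape, dtype="float32")
    # 1 at the spatial center (row 2, column 2) of each channel-matching slice.
    array[2, 2, 0, 0] = 1
    array[2, 2, 1, 1] = 1
    array[2, 2, 2, 2] = 1
    return array

conv_identity = Conv2D(filters=3, kernel_size=(5, 5), padding="same",
                       input_shape=(None, None, 3),
                       kernel_initializer=identity_init)
np.allclose(conv_identity(img_in)[0].numpy(), img_in[0])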

In [ ]:


In [ ]:
# %load solutions/strides_padding.py

Working on edge detection on a grayscale image


In [ ]:
# convert image to greyscale
grey_sample_image = sample_image.mean(axis=2)

# add the channel dimension even if it's only one channel so
# as to be consistent with Keras expectations.
grey_sample_image = grey_sample_image[:, :, np.newaxis]


# matplotlib does not like the extra dim for the color channel
# when plotting gray-level images. Let's use squeeze:
plt.imshow(np.squeeze(grey_sample_image.astype(np.uint8)),
           cmap=plt.cm.gray);
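
Before feeding this image to a Conv2D layer, it also needs a batch dimension (a small sketch of the expected 4D shape; grey_img_in is just an illustrative variable name):


In [ ]:
# Keras convolution layers expect 4D inputs: (batch, height, width, channels).
grey_img_in = np.expand_dims(grey_sample_image, 0)
print(grey_sample_image.shape, "->", grey_img_in.shape)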

Exercise

Try Conv2D? or press shift-tab to get the documentation. You may get help at https://keras.io/layers/convolutional/


In [ ]:


In [ ]:
# %load solutions/edge_detection

Pooling and strides with convolutions

Exercise

  • Use MaxPool2D to apply a 2x2 max pooling with strides 2 to the image. What is the impact on the shape of the image? (A possible sketch is given below.)
  • Use AvgPool2D to apply an average pooling.
  • Is it possible to compute a max pooling and an average pooling with a regular convolution and well chosen kernels?

Bonus

  • Implement a 3x3 average pooling with a regular convolution Conv2D, with well chosen strides, kernel and padding
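
A possible sketch for the first two items (assuming a 2x2 pool size with strides of 2): both pooling layers halve the height and width of the image.


In [ ]:
from tensorflow.keras.layers import MaxPool2D, AvgPool2D

max_pooled = MaxPool2D(pool_size=(2, 2), strides=2)(img_in)
avg_pooled = AvgPool2D(pool_size=(2, 2), strides=2)(img_in)
# Both halve the height and width of the image (rounding down with
# the default "valid" padding).
print(img_in.shape, max_pooled.shape, avg_pooled.shape)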

In [ ]:
from tensorflow.keras.layers import MaxPool2D, AvgPool2D

In [ ]:
# %load solutions/pooling.py

In [ ]:
# %load solutions/average_as_conv.py

In [ ]:


In [ ]: