Computer Vision: Tutorial 1

This tutorial is for complete beginners in computer vision and image processing who are also not very familiar with NumPy and the other Python tools used here. It aims to balance the practical and the theoretical aspects of the area. I expect to finish the tutorial within 2 hours.

Prerequisites

Anaconda package for the latest Python 2.x. It includes the Python interpreter, IPython, NumPy, SciPy and matplotlib. Anaconda can be downloaded from Continuum's website.
Python Imaging Library (PIL). (NOTE: PIL is not supported for Python 3.x.) PIL can be installed with conda (installed with Anaconda) using the command conda install pil
Very basic familiarity with Python: this can be gained by spending an hour or two on Codecademy.

What is an Image?

(10 minutes)

The matrix view
The samples of a function view
Color spaces

Using Python tools for image analysis and manipulation

Basic numpy operations

NumPy is a Python library for very fast manipulation of large arrays, multidimensional arrays and matrices. It has many built-in features that make these manipulations easy.

Creating Numpy arrays

The np.array function creates a NumPy array out of Python lists or nested lists. Unlike Python lists, which can hold elements of any type at any position, NumPy arrays are homogeneous: all elements have the same type. The data type of the array can be specified with the keyword argument dtype.



In [2]:

    
import numpy as np

arr = np.array([1, 2, 3, 4], dtype=np.uint8)
print(type(arr))









    



<type 'numpy.ndarray'>



In [5]:

    
arr = np.array([[1, 2],[2, 3],[3, 4]], dtype=np.uint8)
print(arr)









    



[[1 2]
 [2 3]
 [3 4]]
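A small sketch of the homogeneity rule described above (the variable names are only illustrative): when no dtype is given, NumPy picks a single type that can hold every element, and the dtype and shape attributes show what the array ended up with.

```python
import numpy as np

# Mixing ints and floats: NumPy upcasts everything to one float type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# An explicit dtype converts every element to that type.
small = np.array([[1, 2], [2, 3], [3, 4]], dtype=np.uint8)
print(small.dtype)   # uint8
print(small.shape)   # (3, 2): three rows, two columns
```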

Creating a NumPy array from nested lists whose rows have different lengths, so that the elements cannot all be represented in the requested type, raises a ValueError.



In [4]:

    
arr = np.array([[1],[2, 3],[3, 4]], dtype=np.uint8)









    



---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-b72712c0bb92> in <module>()
----> 1 arr = np.array([[1],[2, 3],[3, 4]], dtype=np.uint8)

ValueError: setting an array element with a sequence.



In [12]:

    
ten_cross_ten = np.array([range(i*10, (i + 1)*10) for i in range(10)])
print(ten_cross_ten)









    



[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]

The individual elements of the array can be accessed in the same way as the values of a regular Python nested list.



In [8]:

    
print(ten_cross_ten[1][1])

Slicing

The basic slice syntax is i:j:k where i is the starting index, j is the stopping index, and k is the step ($k\neq0$). This selects the m elements (in the corresponding dimension) with index values i, i + k, ..., i + (m - 1) k, where m = q + ($r\neq0$) and q and r are the quotient and remainder obtained by dividing j - i by k: j - i = q k + r, so that i + (m - 1) k < j. Here ($r\neq0$) counts as 1 when the remainder is nonzero and as 0 otherwise.



In [6]:

    
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x[1:7:2]









    Out[6]:





array([1, 3, 5])
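The element count from the slice formula can be checked by hand. The sketch below assumes a positive step and indices that lie inside the array; the names q, r, m and indices are only illustrative.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
i, j, k = 1, 7, 2

# j - i = q*k + r; the slice has m = q + (1 if r != 0 else 0) elements.
q, r = divmod(j - i, k)
m = q + (r != 0)

# Index values i, i + k, ..., i + (m - 1)*k
indices = [i + step * k for step in range(m)]
print(indices)        # [1, 3, 5]
print(x[indices])     # the same elements as x[i:j:k]
```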

Negative i and j are interpreted as n + i and n + j where n is the number of elements in the corresponding dimension. Negative k makes stepping go towards smaller indices.



In [7]:

    
x[-2:10]









    Out[7]:





array([8, 9])



In [9]:

    
x[-3:3:-1]









    Out[9]:





array([7, 6, 5, 4])

Assume n is the number of elements in the dimension being sliced. Then, if i is not given it defaults to 0 for k > 0 and n - 1 for k < 0. If j is not given it defaults to n for k > 0 and -n - 1 for k < 0. If k is not given it defaults to 1. Note that :: is the same as : and means select all indices along this axis.



In [10]:

    
x[5:]









    Out[10]:





array([5, 6, 7, 8, 9])
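The default rules for omitted slice parts can be tried on a small array. This is only a sketch; the array below is the same 0 to 9 range used in the cells above.

```python
import numpy as np

x = np.arange(10)
n = len(x)

print(x[::-1])     # whole array reversed: starts at the last element
print(x[:3:-1])    # from the last element down to, but excluding, index 3
print(x[::2])      # every second element, starting at index 0
print(x[::])       # same as x[:], selects everything
```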

Now suppose you want to select the first two columns from the first five rows.



In [23]:

    
ten_cross_ten[:5][:2]









    Out[23]:





array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

Well, it didn't work out that way. Can any of you explain why?



In [24]:

    
ten_cross_ten[:5]









    Out[24]:





array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

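As you must have guessed, ten_cross_ten[:5][:2] selects the first two elements (rows) from the output of ten_cross_ten[:5]. To slice rows and columns together, put both slices inside a single pair of brackets, separated by a comma. The following selects the first five columns from the first five rows; replace the second slice with :2 to get only the first two columns: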



In [25]:

    
ten_cross_ten[:5, :5]









    Out[25]:





array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

I hope you get the point. This works for arrays with more dimensions as well.
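The difference between chained brackets and a single comma-separated index is easiest to see from the resulting shapes. A minimal sketch, using an illustrative array named grid with the same values as ten_cross_ten and a made-up three-axis array named cube:

```python
import numpy as np

grid = np.arange(100).reshape(10, 10)   # same values as ten_cross_ten

# Chained indexing: the second [:2] slices the rows of the first result again.
print(grid[:5][:2].shape)     # (2, 10)

# One slice per axis, separated by commas: rows first, then columns.
print(grid[:5, :2].shape)     # (5, 2)

# The same idea with three axes (a hypothetical 4 x 6 x 3 array).
cube = np.zeros((4, 6, 3))
print(cube[:2, :3, 0].shape)  # (2, 3): an integer index drops that axis
```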

Exercise: Print the matrix formed by the even elements

Point wise operations



In [26]:

    
ten_cross_ten + 1









    Out[26]:





array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])



In [27]:

    
ten_cross_ten * ten_cross_ten









    Out[27]:





array([[   0,    1,    4,    9,   16,   25,   36,   49,   64,   81],
       [ 100,  121,  144,  169,  196,  225,  256,  289,  324,  361],
       [ 400,  441,  484,  529,  576,  625,  676,  729,  784,  841],
       [ 900,  961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521],
       [1600, 1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401],
       [2500, 2601, 2704, 2809, 2916, 3025, 3136, 3249, 3364, 3481],
       [3600, 3721, 3844, 3969, 4096, 4225, 4356, 4489, 4624, 4761],
       [4900, 5041, 5184, 5329, 5476, 5625, 5776, 5929, 6084, 6241],
       [6400, 6561, 6724, 6889, 7056, 7225, 7396, 7569, 7744, 7921],
       [8100, 8281, 8464, 8649, 8836, 9025, 9216, 9409, 9604, 9801]])
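Both cells above work element by element: the scalar 1 is added to every entry, and * multiplies matching entries rather than computing a matrix product. A minimal sketch on a 2 x 2 array (the names are illustrative). The last lines point out a detail that matters for images, which are usually stored as uint8: arithmetic in that type wraps around at 256.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

print(a * a)          # elementwise product: [[1, 4], [9, 16]]
print(np.dot(a, a))   # matrix product: [[7, 10], [15, 22]]

# uint8 holds 0..255, so 250 + 10 wraps around to 4.
pixels = np.array([250, 100], dtype=np.uint8)
print(pixels + 10)    # values 4 and 110
```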

Operations on regions



In [31]:

    
ten_cross_ten[:5, :5] = 0
print(ten_cross_ten)









    



[[ 0  0  0  0  0  5  6  7  8  9]
 [ 0  0  0  0  0 15 16 17 18 19]
 [ 0  0  0  0  0 25 26 27 28 29]
 [ 0  0  0  0  0 35 36 37 38 39]
 [ 0  0  0  0  0 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]



In [32]:

    
ten_cross_ten[6, :] = 0
print(ten_cross_ten)









    



[[ 0  0  0  0  0  5  6  7  8  9]
 [ 0  0  0  0  0 15 16 17 18 19]
 [ 0  0  0  0  0 25 26 27 28 29]
 [ 0  0  0  0  0 35 36 37 38 39]
 [ 0  0  0  0  0 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [ 0  0  0  0  0  0  0  0  0  0]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]
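Assigning to a slice changes the array in place, as the cells above show. The sketch below (with illustrative names) adds one related point: a slice is a view onto the same data, so a change made through it shows up in the original array, while .copy() produces independent data.

```python
import numpy as np

grid = np.arange(16).reshape(4, 4)

view = grid[:2, :2]     # a view: shares memory with grid
view[:] = 0             # also zeroes the top-left block of grid
print(grid[0, 0])       # 0

backup = grid.copy()    # independent copy
backup[3, :] = -1       # grid is not affected
print(grid[3])          # still 12, 13, 14, 15
```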

Exercise 1:



In [ ]:

Basic Image Processing (40 minutes)

Loading an image

We will need the imread function from SciPy to read images from files. Note that scipy.misc.imread is only available in older SciPy releases; it was removed in SciPy 1.2.



In [1]:

    
from scipy.misc import imread
panda = imread('panda.jpg')

Displaying an image

To display images you will need another library called matplotlib. The %matplotlib inline magic of IPython has to be set to display results inline in the notebook.



In [2]:

    
%matplotlib inline
import matplotlib.pyplot as plt



In [3]:

    
plt.imshow(panda)









    Out[3]:





<matplotlib.image.AxesImage at 0x7f02aebd00d0>



In [4]:

    
panda.shape









    Out[4]:





(1400, 1170, 3)

A colored image can be seen as a 3-dimensional array: rows, columns and color channels. So every pixel is an array of size 3.



In [6]:

    
panda[0][0]









    Out[6]:





array([255, 255, 255], dtype=uint8)
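The rows x columns x channels layout can be explored without an image file. The sketch below builds a small all-white array as a stand-in for a real photo; the 4 x 6 size and the names are assumptions made for illustration.

```python
import numpy as np

# A hypothetical 4 x 6 RGB image filled with white, standing in for a real photo.
image = np.full((4, 6, 3), 255, dtype=np.uint8)

print(image.shape)    # (4, 6, 3): rows, columns, channels
print(image[0, 0])    # one pixel: red, green and blue are all 255

# Each channel is a 2-D array with the same height and width as the image.
red = image[:, :, 0]
green = image[:, :, 1]
blue = image[:, :, 2]
print(red.shape)      # (4, 6)
```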

Since the images read by imread are NumPy arrays, they can be manipulated in the same way we manipulated arrays earlier. Try to guess the output of the following code.



In [4]:

    
panda[100:200, 100:200] = [0, 0, 255]
plt.imshow(panda)









    Out[4]:





<matplotlib.image.AxesImage at 0x7f02ae8bce90>
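The assignment in the cell above relies on broadcasting: the three-element list is copied into every pixel of the selected region. A minimal sketch on a small white stand-in image (the sizes and names are illustrative, not taken from the panda image):

```python
import numpy as np

image = np.full((6, 6, 3), 255, dtype=np.uint8)   # small white stand-in image

# The 3-element list is broadcast to every pixel in the 2 x 2 region.
image[1:3, 1:3] = [0, 0, 255]

print(image[1, 1])   # inside the region: R=0, G=0, B=255 (blue)
print(image[0, 0])   # outside the region: still white
```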

Now guess what this will do.



In [6]:

    
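rows, cols, channels = panda.shape
mirror_panda = panda.copy()  # copy, because we want to preserve the original panda image
# // is integer division; plain / gives a float on Python 3, which cannot be used as an index
mirror_panda[:rows//2, :] = panda[-rows//2 - 1::-1, :]
plt.imshow(mirror_panda)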









    Out[6]:





<matplotlib.image.AxesImage at 0x7f02ae802790>
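Slices with a negative step are what make flips possible. The sketch below tries them on a tiny array instead of an image, so the effect can be read off directly; the array and its name are made up for illustration.

```python
import numpy as np

small = np.arange(8).reshape(4, 2)   # rows: [0 1], [2 3], [4 5], [6 7]

print(small[::-1, :])     # rows in reverse order: a top-to-bottom flip
print(small[:, ::-1])     # columns in reverse order: a left-to-right flip
print(small[-3::-1, :])   # start at the third row from the end and walk upwards
```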

Exercise: Given the coordinates, write a function to draw a red "H" on the image, with thickness 3px, width 12px and height 24px.



In [ ]:

You can do other crazy stuff like



In [7]:

    
mirror_panda = panda.copy()
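mirror_panda[:rows//2, :cols//2, :-1] = 0  # top-left quadrant: zero red and green, keep blue
mirror_panda[rows//2:, cols//2:, 1:] = 0   # bottom-right quadrant: zero green and blue, keep red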
plt.imshow(mirror_panda)









    Out[7]:





<matplotlib.image.AxesImage at 0x7f02ae73e2d0>

Exercise: make the other two quadrants yellow and green

Manipulating Pixels
More manipulation of Pixels
Exercise 2: Extract the red, green and blue images from the given colored image (teaches about the color space of an RGB image and how to manipulate it)
Exercise 3: Write an H on the image (teaches similar things as above)
Homework 1: Staining the image

Matplotlib and Histograms

(20 minutes)

Matplotlib provides easy procedures to create publication-quality plots.

matplotlib.pyplot is a collection of command-style functions that make matplotlib work like MATLAB. Each pyplot function makes some change to a figure: e.g., create a figure, create a plotting area in a figure, plot some lines in a plotting area, decorate the plot with labels, etc. matplotlib.pyplot is stateful, in that it keeps track of the current figure and plotting area, and the plotting functions are directed to the current axes. Without the %matplotlib inline directive we specified earlier, plt.show() has to be used to display the plots.



In [14]:

    
import matplotlib.pyplot as plt
import numpy as np
plt.plot([1,2,3,4])
plt.ylabel('some numbers')









    Out[14]:





<matplotlib.text.Text at 0x7f02ae438d90>

To plot the points (x1, y1), (x2, y2), ..., (xn, yn) the pyplot syntax is plot([x1, x2, ..., xn], [y1, y2, ..., yn], <other parameters>). When only one array is provided, it is taken to be the y values, and the x values automatically take the values [0, 1, ..., n-1], where n is the number of y values. When no styling option is provided, pyplot joins all the plotted points with a blue line.



In [16]:

    
y_vals = np.sin(np.arange(0, np.pi, 0.01))
x_vals = np.arange(0, np.pi, 0.01)
plt.plot(x_vals, y_vals)









    Out[16]:





[<matplotlib.lines.Line2D at 0x7f02ae2970d0>]
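The two calling styles of plot described above can be compared side by side. This is a sketch with made-up data; the format string 'ro' (red circles, no connecting line) is one example of the "other parameters".

```python
import matplotlib.pyplot as plt

# Only y values are given: x defaults to 0, 1, 2, 3.
(line,) = plt.plot([1, 4, 9, 16])
print(line.get_xdata())

# Explicit x and y values plus a format string: 'ro' draws red circles.
(dots,) = plt.plot([1, 2, 3, 4], [1, 4, 9, 16], 'ro')
plt.xlabel('x')
plt.ylabel('x squared')
# Outside a notebook with %matplotlib inline, call plt.show() to display the figure.
```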

Exercise: plot a circle

What are histograms



In [19]:

    
plt.hist(panda.flatten())









    Out[19]:





(array([  613525.,   356927.,   200242.,   149056.,   148335.,   384407.,
          147526.,    63398.,    81599.,  2768985.]),
 array([   0. ,   25.5,   51. ,   76.5,  102. ,  127.5,  153. ,  178.5,
         204. ,  229.5,  255. ]),
 <a list of 10 Patch objects>)
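plt.hist returns the bin counts, the bin edges and the drawn patches, which is what the output above shows: 10 equal-width bins between the smallest and the largest value. The counts and edges can be reproduced without plotting by using np.histogram. A minimal sketch on a tiny made-up array (not the panda image):

```python
import numpy as np

# Hypothetical 8-bit intensities standing in for panda.flatten().
values = np.array([0, 0, 10, 30, 128, 200, 255, 255], dtype=np.uint8)

# 10 equal-width bins between the minimum and the maximum, as in the plt.hist default.
counts, edges = np.histogram(values, bins=10)
print(counts)    # number of values falling into each bin
print(edges)     # 11 edges: 0, 25.5, 51, ..., 255

# For 8-bit images, one bin per intensity level is often more useful.
per_level, _ = np.histogram(values, bins=256, range=(0, 256))
print(per_level[255])   # 2
```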

Plotting a histogram
Exercise: plotting a histogram of colored image
Linear stretching of histogram
Histogram equalization
Homework 2:
Thresholding

Questions and buffer time

(30 minutes)

References

Numpy indexing tutorial: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
Matplotlib tutorial: http://matplotlib.org/users/pyplot_tutorial.html
OpenCV Python Tutorial, histogram equalization