This Problem Set covers the topics of frequency domain filtering and color image processing.
Your solutions to the following problems should include commented source code and a short description of each function. You should test your functions with several input images, besides the ones provided here. Include the input and output images that you used for experimentation. Analyze your results. If you discover something interesting, let us know!
In [146]:
'''This is a definition script, so we do not have to rewrite code'''
import numpy as np
import os
import cv2
import matplotlib.pyplot as mplt
import random
# set matplotlib to print inline (Jupyter)
%matplotlib inline
# path prefix
pth = '../data/'
# files to be used as samples
# list *files* holds the names of the test images
files = sorted(os.listdir(pth))
print(files)
# Useful helper: read an image as gray-scale
def rg(img_path):
    return cv2.imread(pth + img_path, cv2.IMREAD_GRAYSCALE)
Visualize the Fourier spectrum of an image. Write a function that visualizes the spectrum of an input gray-scale image. The function should perform the following steps: compute the 2-D FFT of the image, shift the zero-frequency component to the center of the spectrum, compute the magnitude, and apply a logarithmic (dB) transformation so the spectrum can be displayed.
Test your function by applying it to the images face.png, blown_ic.png, test_pattern_blurring_orig.png, translated_rectangle.png and rectangle.png.
Please comment your results.
The code below visualizes the spectrum of an image. The function spectrumVisualization(img) takes an image as input and returns a tuple: the first value is the magnitude spectrum of the FFT, the second is the decibel (log) transformation of that magnitude. Without the decibel transformation the plot of the spectrum is nearly useless: the DC and low-frequency components dominate the dynamic range, so almost all the visible energy accumulates in a bright spot in the middle of the plot and the rest of the spectrum is invisible.
Now, what can be concluded from the results? Recall from signals and systems that a signal that spans infinitely in time has a spectrum concentrated at discrete frequencies, while a signal that is cut off in time has a spectrum that spreads out in frequency. We see something similar here: in images 1, 2 and 3 the intensity varies smoothly over the full intensity scale, so the FFT spectrum concentrates in a small region close to the center; in images 4 and 5 the images are binary (two intensity values with sharp transitions, nothing in between), so the FFT spectrum spreads out, which is consistent with the analogy.
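As a quick 1-D illustration of that analogy (an optional sketch using the imports above, not part of the assignment): the FFT magnitude of a sharp-edged rectangular pulse spreads across many frequencies, while a smooth Gaussian bump concentrates near the origin.
In [ ]:
# Optional 1-D analogy: sharp edges spread in frequency, smooth signals do not
t = np.arange(256)
pulse = np.zeros(256)
pulse[120:136] = 1.0                          # sharp rectangular pulse
smooth = np.exp(-0.5*((t - 128)/20.0)**2)     # smooth Gaussian bump
for sig, name in [(pulse, 'rectangular pulse'), (smooth, 'Gaussian bump')]:
    mag = np.abs(np.fft.fftshift(np.fft.fft(sig)))
    mplt.figure(figsize=(8, 2))
    mplt.plot(mag), mplt.title('FFT magnitude of a ' + name)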
In [3]:
def spectrumVisualization(img):
    # Compute the bidimensional Fourier transform
    fft = np.fft.fft2(img)
    # Center the fft origin in the middle of the image
    ffts = np.fft.fftshift(fft)
    # Generate the magnitude spectrum of the fft
    mag = np.abs(ffts)
    # The raw magnitude is hard to read; apply a log intensity
    # transform (i.e. convert to dB). Avoid log of 0 first.
    mag[mag == 0] = 1e-6
    mag_dB = 20 * np.log10(mag)
    return (mag, mag_dB)
# Gather images for computation
imgs_idx = [13, 3, 31, 32, 25]
imgs = [rg(files[i]) for i in imgs_idx]
# Get the spectrum of all images
spec = [spectrumVisualization(img) for img in imgs]
# Printing
f, ax = mplt.subplots(len(spec), 3, figsize=(15, 15))
for i in range(len(spec)):
    ax[i][0].imshow(imgs[i], cmap='gray')
    ax[i][1].imshow(spec[i][0], cmap='gray')
    ax[i][2].imshow(spec[i][1], cmap='gray')
Frequency domain filtering. Write a PYTHON function that performs high-frequency emphasis (hfe) filtering. Recall that an hfe filter is defined as: \begin{equation} H_{hfe}(u,v) = a + b \cdot H_{hp}(u,v) \end{equation} where $a$ is the offset, $b$ is the high-frequency amplification and $H_{hp}$ is the transfer function of a high-pass filter. Your function should be based on a second-order high-pass Butterworth filter with a cut-off frequency $D_0$, which is given by: \begin{equation} H_{hp}(u,v) = \frac{1}{1 + \left[\frac{D_0}{D(u,v)}\right]^{2n}}. \end{equation} Note. $D(u,v)$ measures the Euclidean distance from the point $(u,v)$ to the center of the frequency plane.
Use your function and filters on the image chest.jpg to reproduce the image enhancement illustrated below.
<img src="../data/files/chest.jpg" width=200> <img src="../data/files/out.png" width=200>
The outline of the process is the following: compute the (padded) 2-D FFT of the image, build the $H_{hfe}$ transfer function on the same grid, multiply the spectrum by the filter, take the inverse FFT and crop to the original size, and finally apply histogram equalization for display.
Use the following parameters for the filters: $a = 0.5$, $b = 2$, $n=2$ and $D_0$ should be set to $5\%$ of the vertical dimension of the filter.
Please comment your results.
The Butterworth and hfe filter functions do what is requested. Both accept one image as input and return a list with the array of the filter, the FFT of the input image and the filtered image. The difference between the two lies in the filter constants: the hfe reduces to a plain Butterworth high-pass if the constants are set to $a = 0$ and $b = 1$. With the constants set to $a = 0.5$ and $b = 2$, the hfe result is clearer in the sense that it lets us see more details of the image, so different filters can be obtained depending on the values of the constants.
In [333]:
def eucDist(x1, y1, x2, y2):
    """
    Calculates the Euclidean distance between the points (x1,y1) and (x2,y2).
    Input
        x1, y1: First coordinates in x and y.
        x2, y2: Second coordinates in x and y.
    Output
        Euclidean distance (float).
    """
    return ((x2-x1)**2 + (y2-y1)**2)**0.5

def butter(img, n=2, pD0=5):
    """
    Butterworth high-pass filter.
    Input
        img: Array of the image.
        n: Filter order (also used here as the padding factor).
        pD0: Cut-off as a percentage of the vertical dimension of the filter.
    Output
        [Hhp, imgf, imgfilt]: List with the array of the filter, the fft
        of the input image and the filtered image.
    """
    y0, x0 = img.shape  # x0 = width, y0 = height
    D0 = y0 * pD0 / 100.0
    # Butterworth high-pass filter, centered at (y0, x0) of the padded grid
    Hhp = np.zeros([n*y0, n*x0])
    for i in range(n*y0):
        for j in range(n*x0):
            if not (i == y0 and j == x0):  # avoid division by zero at the center
                x = -x0 + j
                y = y0 - i
                Hhp[i][j] = 1 / (1 + (D0/eucDist(0, 0, x, y))**(2.0*n))
    # Getting the padded, centered fft of the image
    imgf = np.fft.fft2(img, [n*y0, n*x0])
    imgf = np.fft.fftshift(imgf)
    # Applying the centered filter to the centered spectrum,
    # then undoing the shift before inverting
    imgfilt = abs(np.fft.ifft2(np.fft.ifftshift(Hhp * imgf)))
    imgfilt = imgfilt[:y0, :x0]
    return [Hhp, imgf, imgfilt]
def hfe(img, a=0.5, b=2, n=2, pD0=5):
    """
    High-frequency emphasis (hfe) filter.
    Input
        img: Array of the image.
        n: Filter order (also used here as the padding factor).
        pD0: Cut-off as a percentage of the vertical dimension of the filter.
        a, b: Offset and high-frequency amplification.
    Output
        [Hfe, imgf, imgfilt]: List with the array of the filter, the fft
        of the input image and the filtered image.
    """
    y0, x0 = img.shape  # x0 = width, y0 = height
    D0 = y0 * pD0 / 100.0
    # Hfe filter: a + b*Hhp, centered at (y0, x0) of the padded grid;
    # at the center Hhp = 0, so the filter value there is a
    Hfe = np.full([n*y0, n*x0], a, dtype=np.float64)
    for i in range(n*y0):
        for j in range(n*x0):
            if not (i == y0 and j == x0):  # avoid division by zero at the center
                x = -x0 + j
                y = y0 - i
                Hfe[i][j] = a + b * (1 / (1 + (D0/eucDist(0, 0, x, y))**(2.0*n)))
    # Getting the padded, centered fft of the image
    imgf = np.fft.fft2(img, [n*y0, n*x0])
    imgf = np.fft.fftshift(imgf)
    # Applying the centered filter, undoing the shift before inverting
    imgfilt = abs(np.fft.ifft2(np.fft.ifftshift(Hfe * imgf)))
    imgfilt = np.array(imgfilt[:y0, :x0])
    return [Hfe, imgf, imgfilt]
# Loading image
img = rg(files[8])
# Getting the array of the filter, the fft of the input image and the filtered image
butterw = butter(img)
hfe_res = hfe(img)  # call hfe (not butter), and avoid shadowing the function name
# plotting
mplt.figure(), mplt.imshow(butterw[0], cmap='gray'), mplt.title("Butterworth filter")
mplt.figure(), mplt.imshow(hfe_res[0], cmap='gray'), mplt.title("Hfe filter")
mplt.figure(), mplt.imshow(abs(butterw[1]), cmap='gray'), mplt.title("Frequency domain image")
mplt.figure(), mplt.imshow(butterw[2], cmap='gray'), mplt.title("Image filtered with Butterworth")
mplt.figure(), mplt.imshow(hfe_res[2], cmap='gray'), mplt.title("Image filtered with HFE")
# Histogram equalization and plot of original and filtered.
# Scale to [0, 255] before the uint8 cast to avoid wrap-around of large values.
scaled = cv2.normalize(hfe_res[2], None, 0, 255, cv2.NORM_MINMAX)
equ = cv2.equalizeHist(scaled.astype(np.uint8))
mplt.figure()
mplt.subplot(121), mplt.imshow(img, cmap='gray'), mplt.title(files[8])
mplt.xticks([]), mplt.yticks([])
mplt.subplot(122), mplt.imshow(equ, cmap='gray'), mplt.title("High-frequency emphasis")
mplt.xticks([]), mplt.yticks([])
Out[333]:
Color spaces. Write a function that converts an image from the RGB color space to HSI color space. Write another function that converts an image from the HSI color space to RGB color space. You should use the homework equations to implement these conversions.
When plotting images in the HSI space, you should plot each channel separately, using one image for the H channel, another for the S channel and another for the I channel.
Test your function converting images from RGB to HSI and back to RGB.
Changing between color spaces makes image processing more versatile, in the sense that we can use methods like histogram equalization on a color image. The code below has two functions, one to transform from RGB to HSI and another to do the opposite. The equations are applied to every pixel of the three channels, and the image in the other color space is returned once all the values of its three channels have been calculated. The functions are based on the equations from this page, which accomplish the transformation in both directions.
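For reference, the RGB-to-HSI relations implemented in rgb2hsi below (with $R$, $G$, $B$ normalized to $[0,1]$) are: \begin{equation} I = \frac{R+G+B}{3}, \qquad S = 1 - \frac{\min(R,G,B)}{I}, \end{equation} \begin{equation} \theta = \arccos\!\left[\frac{R - \tfrac{1}{2}G - \tfrac{1}{2}B}{\sqrt{R^2 + G^2 + B^2 - RG - RB - GB}}\right], \qquad H = \begin{cases} \theta & \text{if } G \geq B \\ 360^\circ - \theta & \text{if } G < B. \end{cases} \end{equation}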
In [345]:
def rgb2hsi(img):
    """
    Transform an RGB image into an HSI image.
    Input
        img: Array of the image in RGB color space (BGR order as loaded by OpenCV).
    Output
        img_hsi: The image in HSI space.
    """
    hg, wd, _ = img.shape
    H = np.zeros((hg, wd))
    S = np.zeros((hg, wd))
    I = np.zeros((hg, wd))
    for i in range(hg):
        for j in range(wd):
            # Getting R,G,B values (OpenCV stores channels in B,G,R order)
            B = img[i][j][0] / 255.0
            G = img[i][j][1] / 255.0
            R = img[i][j][2] / 255.0
            Cmin = min([B, G, R])
            I[i][j] = (B + G + R) / 3.0
            if I[i][j] > 0:
                S[i][j] = 1 - Cmin/I[i][j]
            den = (R**2.0 + G**2.0 + B**2.0 - R*G - R*B - G*B)**0.5
            if den > 0:  # gray pixels (R = G = B) have undefined hue, left at 0
                # Clip to [-1, 1] to guard arccos against rounding errors
                h = np.clip((R - 0.5*G - 0.5*B)/den, -1.0, 1.0)
                theta = np.arccos(h) * 180 / np.pi
                H[i][j] = theta if G >= B else 360 - theta
    return cv2.merge((H, S, I*255.0))
def hsi2rgb(img):
    """
    Transform an HSI image into an RGB image.
    Input
        img: Array of the image in HSI color space.
    Output
        img_rgb: The image in RGB space (BGR channel order).
    """
    hg, wd, _ = img.shape
    R = np.zeros((hg, wd))
    G = np.zeros((hg, wd))
    B = np.zeros((hg, wd))
    for i in range(hg):
        for j in range(wd):
            H = img[i][j][0]
            Hrad = H * np.pi / 180
            S = img[i][j][1]
            I = img[i][j][2] / 255.0
            if 0 <= H < 120:
                # RG sector (at H = 0 this reduces to r = I + 2IS, g = b = I - IS)
                r = I + (I*S*np.cos(Hrad)) / np.cos(np.pi/3 - Hrad)
                g = I + I*S*(1 - np.cos(Hrad)/np.cos(np.pi/3 - Hrad))
                b = I - I*S
            elif 120 <= H < 240:
                # GB sector
                r = I - I*S
                g = I + I*S*np.cos(Hrad - np.pi*2/3) / np.cos(np.pi - Hrad)
                b = I + I*S*(1 - np.cos(Hrad - np.pi*2/3)/np.cos(np.pi - Hrad))
            else:
                # BR sector
                r = I + I*S*(1 - np.cos(Hrad - np.pi*4/3)/np.cos(np.pi*5/3 - Hrad))
                g = I - I*S
                b = I + I*S*np.cos(Hrad - np.pi*4/3) / np.cos(np.pi*5/3 - Hrad)
            B[i][j] = b
            G[i][j] = g
            R[i][j] = r
    return cv2.merge((B, G, R))
# Loading image
f = files[12]
img = cv2.imread(pth+f, 1)
# Getting the HSI image
hsiimg = rgb2hsi(img)
h,s,i = cv2.split(hsiimg)
# Retrieving the original RGB through conversion
imgrest = hsi2rgb(hsiimg)
# Plotting (OpenCV/hsi2rgb use BGR channel order, so reverse channels for imshow)
mplt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), mplt.title(f)
mplt.figure(), mplt.imshow(h, cmap='gray'), mplt.title('Hue of '+f+' in HSI space')
mplt.figure(), mplt.imshow(s, cmap='gray'), mplt.title('Saturation of '+f+' in HSI space')
mplt.figure(), mplt.imshow(i, cmap='gray'), mplt.title('Intensity of '+f+' in HSI space')
mplt.figure(), mplt.imshow(imgrest[:, :, ::-1]), mplt.title('Image in RGB converted back from HSI space')
Out[345]:
Manipulating color images. Write a function that converts a color image from the RGB to the HSI color space, modifies the 'I' channel by applying histogram equalization, and converts the image back to the RGB color space. Apply this function to enhance the image dark_fountain.jpg.
If someone wants to apply histogram equalization (HE) to a color image, it is not possible to do it in the RGB space: one would have to equalize all three channels independently, and this approach is wrong because HE works with intensity values, not with colors, so equalizing each channel separately shifts the hues. This is why we change the color space, so as to obtain a channel holding the intensity values, apply HE to it, and then transform the merge of the channels H, S and the new I back to RGB.
We applied histogram equalization to the I channel and it increases the contrast level in the image. In dark_fountain.jpg this operation causes some parts of the image to display only one pure color, like the cyan. The effect changes with the image used; in most images it is very mild.
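As a quick check of that claim (an optional sketch, assuming dark_fountain.jpg is available under the pth data folder), the "wrong" per-channel RGB equalization can be compared against the HSI approach below; the per-channel version typically distorts the hues:
In [ ]:
# Optional comparison: equalize B, G and R independently (hues may shift)
img_bgr = cv2.imread(pth + 'dark_fountain.jpg')
eq_rgb = cv2.merge([cv2.equalizeHist(c) for c in cv2.split(img_bgr)])
mplt.figure(), mplt.imshow(cv2.cvtColor(eq_rgb, cv2.COLOR_BGR2RGB))
mplt.title('Per-channel RGB equalization (note the hue shifts)')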
In [346]:
# Loading the image
f = files[12]
img = cv2.imread(pth+f, 1)
# Getting the HSI image
hsiimg = rgb2hsi(img)
h,s,i = cv2.split(hsiimg)
# Applying Histogram equalization to the I channel and merging again
i = cv2.equalizeHist(np.array(i,dtype = np.uint8))
i = np.array(i, np.float64)
hsiimg = cv2.merge((h,s,i))
# Plotting (reverse BGR channel order for imshow)
mplt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), mplt.title(f)
mplt.figure(), mplt.imshow(h, cmap='gray'), mplt.title('Hue of '+f+' in HSI space')
mplt.figure(), mplt.imshow(s, cmap='gray'), mplt.title('Saturation of '+f+' in HSI space')
mplt.figure(), mplt.imshow(i, cmap='gray'), mplt.title('Equalized intensity of '+f+' in HSI space')
imgrest = hsi2rgb(hsiimg)
mplt.figure(), mplt.imshow(imgrest[:, :, ::-1]), mplt.title('Image in RGB converted back from HSI space')
Out[346]:
Manipulating color images. Consider the RGB color image squares.jpg, where the squares are pure red, green and blue.
In this problem we explore color spaces other than RGB. The selected color space, HSI, was switched to HLS due to implementation problems: OpenCV does not provide the former as one of its built-in transformations of the RGB space. HLS was selected because it is similar and has a fast implementation in OpenCV. In fact, for the current exercise it does not matter whether we select HSI, HLS or HSV, as they all calculate the same hue parameter.
Once the new color space is selected, we apply a smoothing filter with a 25x25 kernel over the H matrix; as a result, the borders of the boxes are no longer binary (there is a gradual transition). When we convert back to RGB we notice that the colors now blend (resulting in yellow between red and green, and cyan between blue and green); this is because the hue parameter carries the color tone. As can be seen, we can modify the colors of an image without affecting the lighting via a color space transformation.
In [284]:
# Read the image
img = cv2.imread(pth + 'squares.jpg')
mplt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
# Convert to HSI
# OpenCV offers no such conversion (only to HSV and HLS);
# HLS is recommended, and will be used.
# We can argue that Hue is calculated in the same way in all three systems.
img_hls = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)
# Extract H/L/S channels
H, L, S = cv2.split(img_hls)
# Average (box) filter over the hue channel
kernel = np.ones((25, 25), np.float32) / (25**2)
H = cv2.filter2D(H, -1, kernel)
img_sm = cv2.merge([H, L, S])
# Revert to RGB
img_bck = cv2.cvtColor(img_sm, cv2.COLOR_HLS2BGR)
mplt.imshow(cv2.cvtColor(img_bck, cv2.COLOR_BGR2RGB))
Out[284]:
False color. Write a Python function that produces a false color visualization of the input gray-scale image as follows: pixels with intensity below a threshold are mapped to blue, and pixels at or above the threshold to yellow.
Use your function to reproduce the following result on the image weld_x-ray.jpg:
First we store the image we are going to use in a variable called "img". Then we create three zero matrices with the size of the image, called R, G and B, which we will later use to assign the blue or yellow color to each pixel depending on the threshold. Besides the main function "falseColor" we define two functions, "pixelMapper" and "cambiarPixel". The first receives the image and the matrices R, G, B; it traverses every pixel of the original image and stores in the three matrices the new value each pixel will take, calling the second function, cambiarPixel. That function receives the value of the pixel to change as a parameter and defines a threshold, in this case 250, since it is very close to 255, which is white; it assigns the corresponding blue or yellow value from a list (which can be seen as a vector with the R, G, B values that compose it), comparing the received pixel value against the threshold: blue if the pixel value is lower, yellow if it is higher. Finally, the main function receives the image matrix and the R, G and B matrices as parameters and returns the image that results from traversing the whole image.
In [8]:
img = rg(files[-1])
R = np.zeros(img.shape, np.uint8)
G = np.zeros(img.shape, np.uint8)
B = np.zeros(img.shape, np.uint8)
def pixelMapper(oldimage, R, G, B):
    height, width = oldimage.shape
    for row in range(height):
        for col in range(width):
            originalPixel = oldimage[row, col]
            R[row][col], G[row][col], B[row][col] = cambiarPixel(originalPixel)
    return np.dstack([R, G, B])

def cambiarPixel(pixel, threshold=250):
    # Map pixels below the threshold to blue, near-white pixels to yellow
    blue = [0, 0, 255]
    yellow = [255, 255, 0]
    return blue if pixel < threshold else yellow

def falseColor(image, R, G, B):
    return pixelMapper(image, R, G, B)
wi = falseColor(img, R, G, B)
mplt.figure()
mplt.imshow(img, cmap='gray')
mplt.title(files[-1])
mplt.figure()
mplt.imshow(wi)
mplt.title(files[-1])
Out[8]:
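As a design note, the per-pixel loop above is easy to read but slow in Python; the same threshold mapping can be written in vectorized form (an equivalent optional sketch under the same threshold = 250 assumption):
In [ ]:
def falseColorVectorized(image, threshold=250):
    # Broadcast the threshold test over the whole image at once
    mask = image >= threshold
    R = np.where(mask, 255, 0).astype(np.uint8)   # yellow: R = 255
    G = np.where(mask, 255, 0).astype(np.uint8)   # yellow: G = 255
    B = np.where(mask, 0, 255).astype(np.uint8)   # blue:   B = 255
    return np.dstack([R, G, B])
mplt.figure(), mplt.imshow(falseColorVectorized(img)), mplt.title('Vectorized false color')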
Green screen effects. Write a function that replaces the green background of the image hiro.jpg with a background image of your choice.
First we store the image to which we want to apply the function in a variable called "img", and the image we want as the replacement background in "background". Now, before we can replace the background image, we must make sure it is the same size as the green-screen image; for this we use the cv2 resize function. Then, with the OpenCV function grabCut,
we extract the green background, which returns a mask, mask2
; this mask contains zeros in the background and ones over the person. Multiplying the mask by the original image removes the green background. To place the person onto the new background we must subtract that region from it; for this, mask2 is inverted (leaving zeros in the person's area and ones everywhere else) and multiplied with the new background. To the result of the previous operation we add the image with the background removed.
In [20]:
img = cv2.imread('../data/'+'hiro.jpg')
background = cv2.imread('../data/'+'Fall-Nature-Background-Pictures.jpg')
# Resizing: the new background must match the green-screen image size
hg, wd = img.shape[:2]
bk_resized = cv2.resize(background, (wd, hg), interpolation=cv2.INTER_AREA)
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
rect = (170, 8, 290, 290)
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
mask2 = np.where((mask == 2)|(mask == 0), 0, 1).astype('uint8')
img = img * mask2[:,:, np.newaxis]
mplt.figure()
mplt.imshow(mask2, cmap='gray')
mplt.figure()
mplt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
mplt.title(files[15])
mplt.figure()
mplt.imshow(cv2.cvtColor(bk_resized, cv2.COLOR_BGR2RGB))
mplt.title(files[0])
# Subtract person area from background: invert mask2
mask2 = 1 - mask2
# multiply with background
bk_cut = bk_resized*np.dstack([mask2,mask2,mask2])
mplt.figure()
mplt.imshow(cv2.cvtColor(bk_cut, cv2.COLOR_BGR2RGB))
mplt.title(files[15])
# Add person to background
final_img = img + bk_cut
mplt.figure()
mplt.imshow(cv2.cvtColor(final_img, cv2.COLOR_BGR2RGB))
mplt.title(files[15])
Out[20]: