Michael duPont - CodeCamp 2017
The first thing we need to do is pick out faces from a larger image. Because face detection isn't user- or case-specific, we can use an existing model, load it with OpenCV, and tune its hyperparameters instead of building one from scratch, which we will have to do later for the recognition model.
In [ ]:
import cv2
import numpy as np
CASCADE = cv2.CascadeClassifier('findme/haar_cc_front_face.xml')
def find_faces(img: np.ndarray, sf=1.16, mn=5) -> np.array([[int]]):
    """Returns a list of bounding boxes for every face found in an image"""
    return CASCADE.detectMultiScale(
        cv2.cvtColor(img, cv2.COLOR_RGB2GRAY),
        scaleFactor=sf,
        minNeighbors=mn,
        minSize=(45, 45),
        flags=cv2.CASCADE_SCALE_IMAGE
    )
That's really all we need. Now let's test it by drawing rectangles around a few images of groups. Here's one example:
In [ ]:
import matplotlib.pyplot as plt
from matplotlib.image import imread, imsave
%matplotlib inline
plt.imshow(imread('test_imgs/initial/group0.jpg'))
In [ ]:
from glob import glob
def draw_boxes(bboxes: [[int]], img: 'np.array', line_width: int=2) -> 'np.array':
    """Returns an image array with the bounding boxes drawn around potential faces"""
    for x, y, w, h in bboxes:
        cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), line_width)
    return img

#Find faces for each test image
for fname in glob('test_imgs/initial/group*.jpg'):
    img = imread(fname)
    bboxes = find_faces(img)
    print(bboxes)
    imsave(fname.replace('initial', 'find_faces'), draw_boxes(bboxes, img))
plt.imshow(imread('test_imgs/find_faces/group0.jpg'))
In [ ]:
#Creates cropped faces for imgs matching 'test_imgs/initial/group*.jpg'
def crop(img: np.ndarray, x: int, y: int, width: int, height: int) -> np.ndarray:
    """Returns an image cropped to a given bounding box of top-left coords, width, and height"""
    return img[y:y+height, x:x+width]

def pull_faces(glob_in: str, path_out: str) -> int:
    """Pulls faces out of images found in glob_in and saves them as path_out
    Returns the total number of faces found
    """
    i = 0
    for fname in glob(glob_in):
        print(fname)
        img = imread(fname)
        bboxes = find_faces(img)
        for bbox in bboxes:
            cropped = crop(img, *bbox)
            imsave(path_out.format(i), cropped)
            i += 1
    return i
found = pull_faces('test_imgs/initial/group*.jpg', 'test_imgs/corpus/face{}.jpg')
print('Total number of base corpus faces found:', found)
plt.imshow(imread('test_imgs/corpus/face0.jpg'))
Now that we have some faces to work with, let's save them to a pickle file for use later on.
In [ ]:
from pickle import dump
#Creates base_corpus.pkl from face imgs in test_imgs/corpus
imgs = [imread(fname) for fname in glob('test_imgs/corpus/face*.jpg')]
dump(imgs, open('findme/base_corpus.pkl', 'wb'))
In [ ]:
found = pull_faces('test_imgs/initial/me*.jpg', 'test_imgs/corpus/me{}.jpg')
print('Total number of target faces found:', found)
plt.imshow(imread('test_imgs/corpus/me0.jpg'))
That was easy enough. In order to have a large enough corpus of target faces, I included pictures of myself with other people and deleted their faces after the code block ran. It ended up having eleven target faces.
Now that we have our faces, we need to create the features and labels that will be used to train our facial recognition model. We've already classified our data based on the face's filename; all we need to do is assign a 1 or 0 to each group for our labels. We'll also need to scale each image to a standard size. Thankfully the output for each bounding box is a square, so we don't have to worry about introducing distortions.
In [ ]:
#Load the two sets of images
from pickle import load
notme = load(open('findme/base_corpus.pkl', 'rb'))
me = [imread(fname) for fname in glob('test_imgs/corpus/me*.jpg')]
#Create features and labels
features = notme + me
labels = [0] * len(notme) + [1] * len(me)
#Preprocess images for the model
def preprocess(img: np.ndarray) -> np.ndarray:
    """Resizes a given image and removes the alpha channel"""
    img = cv2.resize(img, (45, 45), interpolation=cv2.INTER_AREA)[:, :, :3]
    return img
features = [preprocess(face) for face in features]
Simple enough. Let's do a quick check before shuffling. The first image should be part of the base corpus:
In [ ]:
print('Is the target:', labels[0] == 1)
plt.imshow(features[0], cmap='gray')
And the last image should be of the target:
In [ ]:
print('Is the target:', labels[-1] == 1)
plt.imshow(features[-1], cmap='gray')
Looks good. Let's create a quick data and file checkpoint. This means we'll be able to load the file in from this point on without having to run most of the above code.
In [ ]:
#Convert into numpy arrays
features = np.array(features)
labels = np.array(labels)
dump(features, open('test_imgs/features.pkl', 'wb'))
dump(labels, open('test_imgs/labels.pkl', 'wb'))
In [ ]:
# DATA/FILE CHECKPOINT
from pickle import load
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.image import imread, imsave
%matplotlib inline
from findme.imageutil import crop, draw_boxes, preprocess
from findme.models import find_faces
features = load(open('test_imgs/features.pkl', 'rb'))
labels = load(open('test_imgs/labels.pkl', 'rb'))
features = features[-24:]
labels = labels[-24:]
That's it for our data. You'll notice that we only kept the last slice of the dataset. This keeps the number of target and non-target images equal, which leads to a better model even though we have less data overall. We'll shuffle our data when we train the model in the next section.
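As a quick sanity check (not part of the original pipeline, just a throwaway snippet run against the arrays we just loaded), we can count how many examples fall in each class:

# Count non-target (0) and target (1) faces in the trimmed dataset
unique, counts = np.unique(labels, return_counts=True)
print(dict(zip(unique, counts)))  # the two counts should be equal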
We've already created all of our data. Now for the model we're going to train. First, we need to convert our labels to one-hot encoding for use in the model. This means our output layer will have two nodes: True and False.
In [ ]:
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
labels = enc.fit_transform(labels.reshape(-1, 1)).toarray()
print('Not target label:', labels[0])
print('Is target label:', labels[-1])
Now we need to define our model architecture one layer at a time. We'll create three convolutional layers, two fully-connected layers, and the output layer.
In [ ]:
from keras.layers import Activation, Convolution2D, Dense, Dropout, Flatten, MaxPooling2D
from keras.metrics import binary_accuracy
from keras.models import Sequential
SHAPE = features[0].shape
NB_FILTER = 16
def make_model() -> Sequential:
    """Create a Sequential Keras model to boolean classify faces"""
    model = Sequential()
    # First Convolution
    model.add(Convolution2D(NB_FILTER, (3, 3), input_shape=SHAPE))
    model.add(Activation('relu'))
    model.add(MaxPooling2D())
    model.add(Dropout(0.1))
    # Second Convolution
    model.add(Convolution2D(NB_FILTER*2, (2, 2)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D())
    model.add(Dropout(0.2))
    # Third Convolution
    model.add(Convolution2D(NB_FILTER*4, (2, 2)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D())
    model.add(Dropout(0.3))
    # Flatten for Fully Connected
    model.add(Flatten())
    # First Fully Connected
    model.add(Dense(1024))
    model.add(Activation('relu'))
    model.add(Dropout(0.4))
    # Second Fully Connected
    model.add(Dense(1024))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    # Output
    model.add(Dense(2))
    model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=[binary_accuracy])
    return model
print(make_model().summary())
Now we need to train the model. Even though the model is large in terms of parameters, we can let it train for many epochs because our feature set is so small. On a MacBook Air, training for 500 epochs takes around 30 seconds. To save space, I've disabled the full training printout that Keras provides, but you can watch the accuracy progress yourself by changing verbose from 0 to 1.
We also need to shuffle our data: feeding all of the non-target faces and then all of the target faces into the model in order would lead to a biased model. Scikit-Learn has a convenient function to do this for us. Rather than shuffling each array independently, it preserves the correspondence between feature and label indexes.
In [ ]:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.utils import shuffle
model = KerasClassifier(build_fn=make_model, epochs=500, batch_size=len(labels), verbose=0)
X, Y = shuffle(features, labels, random_state=42)
model.fit(X, Y)
Let's quickly see how well it trained to the given data. Because the dataset is so small, we didn't want to keep any for a test or validation set. We'll test it on a new image later.
In [ ]:
preds = model.predict(features)
print('Non-target faces predicted correctly:', np.all(preds[:12] == 0))
print('Target faces predicted correctly:', np.all(preds[-12:] == 1))
That's it. While Keras has its own mechanisms for training and validating models, we're using a wrapper around our Keras model so that it conforms to the Scikit-Learn model API. We can still call fit and predict when working with the model in our code, and the wrapper lets us train and evaluate the model with the other helpers scikit-learn provides. For example, we could have evaluated the model using StratifiedKFold and cross_val_score, which would look like this:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
model = KerasClassifier(build_fn=make_model, epochs=5, batch_size=len(labels), verbose=0)
# evaluate using 10-fold cross validation
kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
result = cross_val_score(model, features, labels, cv=kfold)
print(result.mean())
This method allows us to determine how effective our model is but does not return a trained model for us to use.
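One caveat worth noting (this is an assumption about the scikit-learn API rather than something from the original notebook): cross_val_score only fits temporary clones of the estimator for scoring, so to get a usable model afterward we would still call fit on the full dataset ourselves. A minimal sketch:

# Hypothetical follow-up: score the architecture, then fit on all the data for actual use
result = cross_val_score(model, features, labels, cv=kfold)
print('Mean CV accuracy:', result.mean())
model.fit(features, labels)  # cross_val_score doesn't train `model`; this call does

Now let's load a test image the model hasn't seen before: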
In [ ]:
test_img = imread('test_imgs/evaluate/me1.jpg')
plt.imshow(test_img)
Now for the function itself. Because we've already built functions around the core parts of our data pipeline, this function is going to be incredibly short yet powerful.
In [ ]:
def target_in_img(img: np.ndarray) -> (bool, np.array([int])):
    """Returns whether the target is in a given image and where"""
    for bbox in find_faces(img):
        face = preprocess(crop(img, *bbox))
        if model.predict(np.array([face])) == 1:
            return True, bbox
    return False, None
Yeah. That's it. Let's break down the steps:

- find_faces returns a list of bounding boxes containing faces
- For each bounding box, the face is cropped and preprocessed, and model predicts whether it is or is not the target
- If a face is predicted to be the target (pred == 1), return True and the current bounding box

Now let's test it. If it works properly, we should see a bounding box appear around the target's face.
In [ ]:
found, bbox = target_in_img(test_img)
print('Target face found in test image:', found)
if found:
    plt.imshow(draw_boxes([bbox], test_img, line_width=20))
We're finally done.