Traditional methods - OpenCV+scikit_learn

It's for traditional methods of object detection, using OpenCV to preprocess and extract features and then use machine learning algorithm to classify. Generally, it can be divided into three modules: preprocessing, feature extraction and classification. First, some prepration works.



In [1]:

    
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from lib.data_utils import get_MNIST_data

Then read the MNIST data.



In [2]:

    
data = get_MNIST_data(subtract_mean=False)

# check if we load the data successfully
print(data['X_train'].shape)









    



(41000, 28, 28, 1)

Preprocessing



In [ ]:

Feature extraction

Different methods exist to extract feature. Here we try ORB (Oriented FAST and Rotated BRIEF).



In [56]:

    
# check the min number of keypoints
orb = cv2.ORB_create(edgeThreshold=2, patchSize=2)
len_k = 500
for key in ['X_train', 'X_test']:
    for img in data[key]:
        k = orb.detect(img.astype(np.uint8).reshape((28,28)))
        if len(k) < len_k:
            len_k = len(k)
print('minimum number of keypoints:', len_k)









    



minimum number of keypoints: 11



In [108]:

    
# compute the ORB descriptors
feats = {'X_train': np.zeros((41000,len_k*32)), 'X_test': np.zeros((1000,len_k*32))}
for key in feats.keys():
    print('compute for data: ', key)
    for i, img in zip(range(data[key].shape[0]), data[key]):
        k = orb.detect(img.astype(np.uint8).reshape((28,28)))
        _, feat = orb.compute(img.astype(np.uint8).reshape((28,28)), k[:len_k])
        feats[key][i,:] = feat.reshape(-1)









    



compute for data:  X_test
compute for data:  X_train



In [109]:

    
# check the computed features size
print(feats['X_train'].shape)
print(feats['X_test'].shape)









    



(41000, 352)
(1000, 352)

Here we try HOG (Histogram of Oriented Gradients).



In [20]:

    
# compute the HOG for each image
feats = {'X_train': [], 'X_test': []}
for key in feats.keys():
    print('compute for data: ', key)
    for img in data[key]:
        feat = hog(img.reshape((28,28)),
                   pixels_per_cell=(7,7),
                   cells_per_block=(4,4),
                   block_norm='L2-Hys')
        feats[key].append(feat.reshape(-1))









    



compute for data:  X_test
compute for data:  X_train



In [21]:

    
feats['X_train'] = np.array(feats['X_train'])
feats['X_test'] = np.array(feats['X_test'])
# check the computed features size
print(feats['X_train'].shape)
print(feats['X_test'].shape)









    



(41000, 144)
(1000, 144)

It's possible to use PCA to reduce dimensions of feature to avoid curse of dimensionality for common classifiers.



In [44]:

    
# initialize PCA with top 50
pca = PCA(n_components=50)
pca.fit(feats['X_train'])
feats_reduce = {'X_train': [], 'X_test': []}
for key in feats.keys():
    feats_reduce[key] = pca.transform(feats[key])



In [45]:

    
# check the computed features size
print(feats_reduce['X_train'].shape)
print(feats_reduce['X_test'].shape)









    



(41000, 50)
(1000, 50)



In [ ]:

Classification

Different machine learning methods are used to classify the digits.



In [22]:

    
# decision tree
dt = DecisionTreeClassifier()
dt.fit(feats['X_train'],data['y_train'])
print(dt.score(feats['X_test'], data['y_test']))
# test accuracy of 57.2% using ORB
# test accuracy of 90.2% using HOG (7, 2)
# test accuracy of 90.3% using HOG (7, 4)



In [46]:

    
# decision tree for reduced data
dt = DecisionTreeClassifier()
dt.fit(feats_reduce['X_train'],data['y_train'])
print(dt.score(feats_reduce['X_test'], data['y_test']))
# test accuracy of 89% using HOG (7, 2)



In [19]:

    
# k nearest neighbors
knn = KNeighborsClassifier(n_neighbors=10)
knn.fit(feats['X_train'],data['y_train'])
print(knn.score(feats['X_test'], data['y_test']))
# test accuracy of 29.9% using ORB
# test accuracy of 94.2% using HOG (7, 2)
# test accuracy of 97.3% using HOG (7, 4)



In [47]:

    
# k nearest neighbors for reduced data
knn = KNeighborsClassifier(n_neighbors=10)
knn.fit(feats_reduce['X_train'],data['y_train'])
print(knn.score(feats_reduce['X_test'], data['y_test']))
# test accuracy of 94% using HOG (7, 2)



In [17]:

    
# random forest
rf = RandomForestClassifier()
rf.fit(feats['X_train'],data['y_train'])
print(rf.score(feats['X_test'], data['y_test']))
# test accuracy of 59.6% using ORB
# test accuracy of 96% using HOG (7, 2)
# test accuracy of 94.3% using HOG (8, 3)
# test accuracy of 96% using HOG (7, 4)



In [23]:

    
# SVM
svm = SVC()
svm.fit(feats['X_train'],data['y_train'])
print(svm.score(feats['X_test'], data['y_test']))
# test accuracy of 51.1% using ORB
# test accuracy of 11.5% using HOG (7, 4)



In [ ]:



In [ ]: