Image classification: assigning an input image one label from a fixed set of categories. That is, given a fixed set of category labels, the task is to pick one label from that set for each input image and assign it to the image.
As we present (an inexhaustive) list of challenges below, keep in mind the raw representation of images as a 3-D array of brightness values.
The nearest neighbor classifier will take a test image, compare it to every single one of the training images, and predict the label of the closest training image.
L1 distance: $$d_{1}(I_{1},I_{2})=\sum_{p}{|I_1^p-I_2^p|}$$
L2 distance: $$d_{2}(I_{1},I_{2})=\sqrt{\sum_{p}{(I_1^p-I_2^p)^2}}$$
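For concreteness, here is a tiny worked example of both distances; the two 4-pixel "images" are made-up numbers, not data from the notes:

import numpy as np

# two hypothetical 4-pixel grayscale images
I1 = np.array([56, 32, 10, 18], dtype=np.float64)
I2 = np.array([10, 20, 24, 17], dtype=np.float64)

d1 = np.sum(np.abs(I1 - I2))          # L1: 46 + 12 + 14 + 1 = 73
d2 = np.sqrt(np.sum((I1 - I2) ** 2))  # L2: sqrt(46^2 + 12^2 + 14^2 + 1^2) ≈ 49.57
print(d1, d2)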
Sample Code:

import numpy as np

class NearestNeighbor(object):
    def __init__(self):
        pass

    def train(self, X, y):
        """ X is N x D where each row is an example. y is 1-dimensional of size N """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """ X is N x D where each row is an example we wish to predict the label for """
        num_test = X.shape[0]
        # make sure the output type matches the input type
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)

        # loop over all test rows
        for i in range(num_test):
            # find the nearest training image to the i'th test image
            # using the L1 distance (sum of absolute value differences)
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            # alternatively, the L2 distance:
            # distances = np.sqrt(np.sum(np.square(self.Xtr - X[i, :]), axis=1))
            min_index = np.argmin(distances)  # index of the smallest distance
            Ypred[i] = self.ytr[min_index]    # predict the label of the nearest example
        return Ypred
Xtr, Ytr, Xte, Yte = load_CIFAR10('data/cifar10/') # a magic function we provide
# flatten out all images to be one-dimensional
Xtr_rows = Xtr.reshape(Xtr.shape[0], 32 * 32 * 3) # Xtr_rows becomes 50000 x 3072
Xte_rows = Xte.reshape(Xte.shape[0], 32 * 32 * 3) # Xte_rows becomes 10000 x 3072
nn = NearestNeighbor() # create a Nearest Neighbor classifier class
nn.train(Xtr_rows, Ytr) # train the classifier on the training images and labels
Yte_predict = nn.predict(Xte_rows) # predict labels on the test images
# and now print the classification accuracy, i.e. the fraction
# of test examples whose predicted label matches the true label
print('accuracy: %f' % ( np.mean(Yte_predict == Yte)))
Comparing L1 and L2: the L2 distance is much less forgiving than L1 when it comes to differences between two vectors. That is, the L2 distance prefers many medium disagreements over one large one.
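A small numeric check of this point, using two hypothetical difference vectors with the same L1 distance:

import numpy as np

# one big disagreement vs. four medium ones; both have L1 distance 4
big    = np.array([4., 0., 0., 0.])
medium = np.array([1., 1., 1., 1.])
print(np.sum(np.abs(big)), np.sum(np.abs(medium)))          # L1: 4.0 vs 4.0
print(np.sqrt(np.sum(big**2)), np.sqrt(np.sum(medium**2)))  # L2: 4.0 vs 2.0

Under L2, the single large difference is scored as twice as far as the four medium ones, even though their L1 distances are identical.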
Instead of finding the single closest image in the training set, we will find the top k closest images and have them vote on the label of the test image. In particular, when k = 1, we recover the Nearest Neighbor classifier. Intuitively, higher values of k have a smoothing effect that makes the classifier more resistant to outliers.
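A minimal sketch of the k-nearest-neighbor prediction step (not from the notes; it assumes the same Xtr/ytr fields as the NearestNeighbor class above and uses collections.Counter for the majority vote):

import numpy as np
from collections import Counter

def predict_knn(self, X, k=5):
    """ For each test row, vote among the labels of the k closest training rows. """
    num_test = X.shape[0]
    Ypred = np.zeros(num_test, dtype=self.ytr.dtype)
    for i in range(num_test):
        distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)  # L1 distance
        closest_k = self.ytr[np.argsort(distances)[:k]]         # labels of the k nearest
        Ypred[i] = Counter(closest_k).most_common(1)[0][0]      # majority vote
    return Ypred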
Do NOT use the test set for the purpose of tweaking hyperparameters. If you only use the test set once, at the end, it remains a good proxy for measuring the generalization of your classifier.
Split your training set into a (smaller) training set and a validation set. Use the validation set to tune all hyperparameters. At the end, run a single time on the test set and report performance.
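A sketch of how tuning k against a validation set could look, assuming the CIFAR-10 arrays loaded above and the hypothetical predict_knn helper from the earlier sketch (the 1,000-example split is an arbitrary choice for illustration):

# hold out 1,000 training examples for validation
Xval_rows = Xtr_rows[:1000, :]
Yval = Ytr[:1000]
Xtr_small = Xtr_rows[1000:, :]
Ytr_small = Ytr[1000:]

validation_accuracies = []
for k in [1, 3, 5, 10, 20, 50, 100]:
    nn = NearestNeighbor()
    nn.train(Xtr_small, Ytr_small)
    Yval_predict = predict_knn(nn, Xval_rows, k=k)  # hypothetical kNN predict from above
    acc = np.mean(Yval_predict == Yval)
    print('k = %d, accuracy: %f' % (k, acc))
    validation_accuracies.append((k, acc))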
Cross-validation: split the training set evenly into several folds. Hold out one fold as the validation set and train on the rest, cycling through so that each fold serves as the validation set exactly once. The average of the validation results across all folds is the final validation result for the algorithm.
In practice: cross-validation is computationally expensive, so people usually prefer a single split, keeping 50%-90% of the training data for training and the rest for validation. However, if there are many hyperparameters and the validation set would be too small, cross-validation (typically with 3, 5, or 10 folds) is the safer choice. Once the best hyperparameters are found, run the algorithm with them on the test set once and only once, and report that result as the evaluation of the algorithm.
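A minimal 5-fold cross-validation sketch for the same k search (np.array_split and the reuse of the hypothetical predict_knn helper are illustrative assumptions, not the notes' own code):

num_folds = 5
X_folds = np.array_split(Xtr_rows, num_folds)
y_folds = np.array_split(Ytr, num_folds)

for k in [1, 3, 5, 10]:
    accs = []
    for fold in range(num_folds):
        # fold `fold` is the validation set; the remaining folds form the training set
        Xval, yval = X_folds[fold], y_folds[fold]
        Xtrn = np.concatenate(X_folds[:fold] + X_folds[fold+1:])
        ytrn = np.concatenate(y_folds[:fold] + y_folds[fold+1:])
        nn = NearestNeighbor()
        nn.train(Xtrn, ytrn)
        accs.append(np.mean(predict_knn(nn, Xval, k=k) == yval))
    print('k = %d, mean accuracy over %d folds: %f' % (k, num_folds, np.mean(accs)))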