Nearest Neighbor Classification

Nearest neighbor classification here uses Pearson correlation as the similarity measure: each sample receives the label of the training sample with which it is most highly correlated. The paper reports a 91.6% classification rate, while we obtain 33/34 = 97% on the test data, as shown below.
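To make the "correlation as distance" idea concrete, here is a minimal sketch on made-up data. The names `toy_train`, `toy_label`, and `toy_new` are hypothetical and not part of the analysis. It assumes the usual convention of treating 1 - r as the distance, so the nearest neighbor is the training sample with the highest correlation.

```r
# Toy data (hypothetical): rows are samples, columns are features
toy_train <- rbind(a = c(1, 2, 3, 4),
                   b = c(4, 3, 2, 1))
toy_label <- c("ALL", "AML")
toy_new   <- c(2, 4, 6, 9)

# Pearson correlation of the new sample with every training sample
r <- apply(toy_train, 1, cor, toy_new)

# Assumed convention: correlation distance is 1 - r,
# so the smallest distance corresponds to the largest correlation
d <- 1 - r
toy_pred <- toy_label[which.min(d)]
toy_pred
```

Because `toy_new` rises along with sample `a` and against sample `b`, the sketch picks the label attached to `a`.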


In [1]:
load("../transformed data/paper3.rda")
load("DP.rda")

In [2]:
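# Build the classifier
cl_nn = function(new_s, train, train_label){
    # Pearson correlation between new_s and each training sample (row of train)
    corr = apply(train, 1, cor, new_s)
    # Label of the most correlated training sample.
    # which.max returns a single index, so ties cannot yield several labels.
    train_label[which.max(corr)]
}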

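# prediction
# Note: each training sample correlates perfectly with itself, so the
# training-set prediction below always recovers its own label.
nn_train_pr = apply(train_cl, 1, cl_nn, train_cl, train_response)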
nn_test_pr = apply(test_cl,1, cl_nn, train_cl, train_response)

# show result of prediction
table(Train_Predict = nn_train_pr, Train_Actual = train_response)
table(Test_Predict = nn_test_pr, Test_Actual = test_response)


             Train_Actual
Train_Predict ALL AML
          ALL  27   0
          AML   0  11
            Test_Actual
Test_Predict ALL AML
         ALL  20   1
         AML   0  13
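The 33/34 figure quoted at the top can be checked directly from the test confusion matrix. The sketch below retypes the printed counts by hand into a matrix (the names `test_tab` and `test_acc` are illustrative) rather than reusing the notebook objects, so it runs on its own.

```r
# Test-set confusion matrix, copied from the printed output above
# (matrix() fills column by column)
test_tab <- matrix(c(20, 0, 1, 13), nrow = 2,
                   dimnames = list(Test_Predict = c("ALL", "AML"),
                                   Test_Actual  = c("ALL", "AML")))

# Correct predictions lie on the diagonal
n_correct <- sum(diag(test_tab))
n_total   <- sum(test_tab)
test_acc  <- n_correct / n_total
test_acc
```

In the notebook itself, the same quantity could presumably be obtained with `mean(nn_test_pr == test_response)`, assuming both vectors use the same label coding.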
