MSc Machine Learning Assignment - Classification task. Private Kaggle Competition: "Are you sure Brighton's seagull is not a man-made object?" The aim of the assignment was to build a classifier able to distinguish between man-made and non-man-made objects. Each data instance was represented by a 4608-dimensional feature vector: a concatenation of 4096-dimensional deep Convolutional Neural Network (CNN) features extracted from the fc7 activation layer of CaffeNet and 512-dimensional GIST features.
Three additional pieces of information were provided: a confidence label for each training instance, the test data class proportions, and additional training data containing missing values.
This Notebook contains the final workflow employed to produce the model used to make the final predictions. The original Notebook contained a lot of trial and error, such as fine-tuning the ranges of parameters fitted in the model. Some of these details from the original, rather messy Notebook have been excluded here; this Notebook only intends to show the code for the key processes leading up to the final model. In addition, it contains a report/commentary documenting the theory behind each step of the workflow. The theory is adopted from the literature and referenced appropriately.
The report is also available as a PDF; contact me to request it.
1.1) Introduction of SVM
The approach of choice here was the Support Vector Machine (SVM). SVMs were pioneered in the late seventies [1]. SVMs are supervised learning models, which are extensively used for classification [2] and regression tasks [3]. In this context, SVM was employed for a binary classification task.
In layman’s terms, the basic premise of SVM for classification is to find the optimal separating hyperplane (also called the decision boundary) between classes by maximizing the margin between the decision boundary and the data points closest to it. The points closest to the decision boundary are termed support vectors. The margin is maximized to improve the generalisation of the decision boundary: many separating boundaries may exist, but the one that maximizes the margin increases the likelihood that future, slightly outlying points will still be correctly classified [4]. This intuition seems relatively simple, but is complicated by the distinction between ‘hard’ and ‘soft’ margins. A hard margin is only applicable when the data set is linearly separable: it tolerates no misclassification of the training data. A soft margin is applicable when the data set is not linearly separable: it anticipates that some points cannot be separated correctly and tolerates misclassifications through a penalty term. These concepts are formalized below.
Assume the problem of binary classification on a dataset $\{(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)\}$, where $x_i \in R^d$, i.e. $x_i$ is a data point represented as a d-dimensional vector, and $y_i \in \{-1, 1\}$ is the class label of that data point, for $i = 1, 2, ..., n$. A better separation can often be found by first transforming the data into a higher-dimensional feature space via a non-linear mapping $\phi$ [2]; the kernel function is the inner product of these mappings, $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$. A candidate decision boundary can then be represented by $w \cdot \phi(x) + b = 0$, where $w$ is the weight vector orthogonal to the decision boundary and $b$ is an intercept term. If the data set is linearly separable in the feature space, the decision boundary that maximizes the margin is found by solving the optimization $\min (\frac{1}{2} w \cdot w)$ subject to $y_i (w \cdot \phi(x_i) + b) \ge 1$ for $i = 1, 2, ..., n$. This encapsulates the concept of a ‘hard’ margin. In the case of non-linearly separable data, the constraint is relaxed by introducing slack variables $\varepsilon_i$. The optimization problem becomes $\min(\frac{1}{2} w \cdot w + C \sum_{i=1}^n \varepsilon_i)$ subject to $y_i (w \cdot \phi(x_i) + b) \ge 1 - \varepsilon_i$ and $\varepsilon_i \ge 0$ for $i = 1, 2, ..., n$. The term $\sum_{i=1}^n \varepsilon_i$ can be interpreted as the misclassification cost. This objective comprises two aims: the first is still to maximize the margin, and the second is to reduce the number and extent of margin violations. The trade-off between these two aims is controlled by the parameter $C$. This encapsulates the concept of a ‘soft’ margin.
$C$ is termed the regularization parameter. A high value of $C$ increases the penalty for misclassifications and thus places more emphasis on the second aim: a large misclassification penalty forces the model to reduce the number of misclassifications, so a high enough value of $C$ can induce over-fitting. A small value of $C$ decreases the penalty for misclassifications and thus places more emphasis on the first aim: the model tolerates misclassifications more readily, so a small enough value of $C$ can induce under-fitting.
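To make the role of $C$ concrete, here is a minimal, purely illustrative sketch (synthetic data and hypothetical values, not part of the assignment workflow) showing how $C$ is passed to scikit-learn's SVC and how cross-validated accuracy can be compared across settings:
#Illustration only: comparing small, moderate and large C on synthetic data
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X_demo, y_demo = make_classification(n_samples=300, n_features=20, random_state=0)
for C_value in (0.01, 1.0, 100.0):
    demo_svm = SVC(C=C_value, kernel='rbf')
    scores = cross_val_score(demo_svm, X_demo, y_demo, cv=5, scoring='accuracy')
    #a very small C tolerates many margin violations (risk of under-fitting);
    #a very large C penalizes violations heavily (risk of over-fitting)
    print(C_value, scores.mean())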
The SVM classifier is trained using the hinge-loss as the loss function [5].
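Explicitly, for a labelled point $(x_i, y_i)$ and decision function $f(x) = w \cdot \phi(x) + b$, the hinge loss is $\ell(y_i, f(x_i)) = \max(0, 1 - y_i f(x_i))$: it is zero when the point lies on the correct side of the margin and grows linearly with the size of the margin violation, which corresponds exactly to the slack term above, $\varepsilon_i = \max(0, 1 - y_i(w \cdot \phi(x_i) + b))$.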
1.2) Suitability of SVM
SVM is a popular technique because of its solid mathematical foundations, high generalisation capability, ability to find global solutions and ability to find non-linear decision boundaries [6]. However, SVMs can be adversely affected by data sets with unequal class balances. Methods for dealing with this are discussed in Section 2.3 of this report; with such methods, SVMs remain applicable to imbalanced data sets, such as the one provided here. It has also been argued that SVMs show superior performance to other techniques on high-dimensional data [7]. The dataset here, even after pre-processing, has many dimensions, so the use of SVM in this context is justified. Another drawback of SVM is its dependency on feature scaling: the performance of an SVM can be strongly affected by the choice of scaling method. Feature scaling is nevertheless an important pre-processing technique; one encouraging reason for employing it is that optimisation algorithms such as gradient descent converge much faster on scaled features than on unscaled ones. In particular, feature scaling reduces the time it takes for the SVM to find its support vectors [8].
This section will cover how the training data for the final model was prepared. Several additional pieces of information were provided in the assignment outline; this section will demonstrate how these strands of information were incorporated, if they were incorporated at all.
In [1]:
#Import Relevant Modules and Packages
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn import preprocessing
from sklearn.model_selection import GridSearchCV
from sklearn.decomposition import PCA
from scipy import stats
from sklearn.feature_selection import VarianceThreshold
#see all rows of dataframe
#pd.set_option('display.max_rows', 500)
In [2]:
#Load the complete training data set
training_data = pd.read_csv("/Users/Max/Desktop/Max's Folder/Uni Work/Data Science MSc/Machine Learning/ML Kaggle Competition /Data Sets/Training Data Set.csv", header=0, index_col=0)
In [3]:
#Observe the original training data
training_data.head()
Out[3]:
In [4]:
#quantify class counts of original training data
training_data.prediction.value_counts()
Out[4]:
2.1) Dealing with Missing Values – Imputation
Imputation is the act of replacing missing values in a data set with meaningful values. Simply removing rows with missing feature values is bad practice when data are scarce, as a lot of information can be lost; in addition, deletion methods can introduce bias [9]. The incomplete additional training data was combined with the complete original training data because the original data was scarce in number. The additional training data contained missing values, therefore imputation was appropriate, if not required. Two methods of imputation were tried. The first was imputation via feature means. However, this method has been heavily criticized; in particular, it has been argued that mean imputation introduces bias and underestimates variability [10]. The second was k-Nearest-Neighbours (kNN) imputation [11], one of the family of hot-deck imputation techniques [12], in which missing feature values are filled in from data points that are similar, or geometrically speaking, closest in distance. Given the flaws of mean imputation, this method is more appropriate, and kNN was therefore the imputation method used to build the final model. The kNN implementation was taken from the ‘fancyimpute’ package [13]. The k of kNN is a parameter that needs to be chosen carefully; fortunately, the literature provides some direction. The work of [14] suggests that kNN with 3 nearest neighbours gives the best trade-off between imputation error and preservation of data structure. In summary, kNN was employed for imputation, and k was set to 3.
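For reference, an equivalent kNN imputation can be sketched with scikit-learn's KNNImputer (the workflow below uses fancyimpute instead); full_training_data_inc refers to the concatenated training frame with NaN entries created in the cells that follow:
#Sketch only: kNN imputation via scikit-learn's KNNImputer (the final model used fancyimpute)
import pandas as pd
from sklearn.impute import KNNImputer

imputer = KNNImputer(n_neighbors=3)   #k = 3, following [14]
imputed_array = imputer.fit_transform(full_training_data_inc)
imputed_df = pd.DataFrame(imputed_array,
                          index=full_training_data_inc.index,
                          columns=full_training_data_inc.columns)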
This section will cover how the incomplete additional training data set was incorporated to develop a larger training data set. In particular, the additional training data was combined with the original training data. The additional training data was incomplete, with several NaN entries; thus, imputation was performed to replace the NaN entries with meaningful values.
In [5]:
#Load additional training data
add_training_data = pd.read_csv("/Users/Max/Desktop/Max's Folder/Uni Work/Data Science MSc/Machine Learning/ML Kaggle Competition /Data Sets/Additional Training Data Set .csv", header=0, index_col=0)
In [6]:
#observe additional training data
add_training_data
Out[6]:
In [7]:
#quantify class counts of additional training data
add_training_data.prediction.value_counts()
Out[7]:
In [8]:
#find number of NAs for each column for additional training data
add_training_data.isnull().sum()
Out[8]:
In [9]:
#concatenate original training data with additional training data
full_training_data_inc = pd.concat([training_data, add_training_data])
#observe concatenated training data
full_training_data_inc
Out[9]:
A couple of imputation methods were tried in the original Notebook: imputation via feature means and imputation via K-Nearest Neighbours (kNN).
The most effective, and theoretically best-supported, method was the second: imputation using K-Nearest Neighbours. Note: the fancyimpute package may need installing before running. K was set to 3 here; see the report above for justification.
In [10]:
#imputation via KNN
from fancyimpute import KNN
knn_trial = full_training_data_inc
knn_trial
complete_knn = KNN(k=3).complete(knn_trial)
In [11]:
#convert imputed matrix back to dataframe for visualisation and convert 'prediction' dtype to int
complete_knn_df = pd.DataFrame(complete_knn, index=full_training_data_inc.index, columns=full_training_data_inc.columns)
full_training_data = complete_knn_df
full_training_data.prediction = full_training_data.prediction.astype('int')
full_training_data
Out[11]:
In [12]:
#quantify class counts for full training data
full_training_data.prediction.value_counts()
Out[12]:
2.2) Dealing with Confidence Labels
One approach for incorporating the confidence labels was to use the confidence label of each instance as the corresponding sample weight. Theoretically, a confidence label smaller than 1 scales down the effective C parameter for that instance, which results in a lower penalty for misclassifying an instance whose label is not known with certainty. However, in practice this did not follow the theory; introducing the sample weights reduced the overall accuracy of the model. The matter was further complicated by the fact that samples generated from over-sampling via SMOTE would also have to be assigned a confidence label, which is difficult to determine objectively. Thus, it was decided that only data instances with a confidence label of 1 should be retained in the training data. This obviously leads to a considerable loss of information; however, after removing instances without a confidence label of 1, 1922 training instances remained, which can be assumed to be a reasonable training data size. After truncating the data set, the class-balancing procedure described in Section 2.3 was applied to the truncated training data. In summary, the training data was truncated to include only instances with a confidence label of 1, the minority class was over-sampled using SMOTE to balance the class split, and class weights were then applied during the training of the SVM to make the model more sensitive to correctly classifying the majority class of the test data.
This section will cover how the confidence labels, one of the additional pieces of information provided in the assignment outline, were incorporated into the final training data set.
In [13]:
#Load confidence annotations
confidence_labels = pd.read_csv("/Users/Max/Desktop/Max's Folder/Uni Work/Data Science MSc/Machine Learning/ML Kaggle Competition /Data Sets/Annotation Confidence .csv", header=0, index_col=0)
In [14]:
#quantify confidence labels (how many are 1, how many are 0.66)
print(confidence_labels.confidence.value_counts())
#observe confidence annotations
confidence_labels
Out[14]:
In [15]:
#adding confidence of label column to imputed full training data set
full_train_wcl = pd.merge(full_training_data, confidence_labels, left_index=True, right_index=True)
full_train_wcl
Out[15]:
The original Notebook tried a couple of methods of incorporating the confidence labels into the model: (1) using the confidence labels as per-instance sample weights, and (2) retaining only the instances with a confidence label of 1.
The best model was based on Method 2; thus, only Method 2 is carried through in the remainder of this section. A rough sketch of the abandoned Method 1 is included below for reference.
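This is a minimal sketch only, assuming the merged frame full_train_wcl created above; Method 1 passes the confidence values as per-sample weights to scikit-learn's SVC, which scales the misclassification penalty C per instance. It did not make it into the final workflow.
#Sketch only: Method 1 - confidence labels as per-sample weights (not used in the final model)
from sklearn.svm import SVC

#split the merged frame into features, labels and weights
X_wcl = full_train_wcl.drop(['prediction', 'confidence'], axis=1).values
y_wcl = full_train_wcl['prediction'].values
weights = full_train_wcl['confidence'].values   #1 or 0.66 per instance

#sample_weight scales C per instance, so uncertain labels are penalized less when misclassified
weighted_svm = SVC(C=1.0, kernel='rbf')
weighted_svm.fit(X_wcl, y_wcl, sample_weight=weights)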
In [16]:
#only keep data instance with confidence label = 1
conf_full_train = full_train_wcl.loc[full_train_wcl['confidence'] == 1]
conf_full_train
Out[16]:
In [17]:
#quantify class counts
conf_full_train.prediction.value_counts()
Out[17]:
In [18]:
#convert full training data dataframe with confidence instances only to matrix
conf_ft_matrix = conf_full_train.as_matrix(columns=None)
conf_ft_matrix
conf_ft_matrix.shape
Out[18]:
In [19]:
#splitting full training data with confidence into inputs and outputs
conf_ft_inputs = conf_ft_matrix[:,0:4608]
print(conf_ft_inputs.shape)
conf_ft_outputs = conf_ft_matrix[:,4608]
print(conf_ft_outputs.shape)
2.3) Dealing with Class Imbalance
Binary classification tasks often suffer from imbalanced class splits. Training a model on a data set with more instances of one class than the other can result in a bias towards the majority class, as sensitivity is lost in detecting the minority class [17]. This is pertinent because the training data (additional and original data combined) has an unbalanced class split, with more instances of Class 1 than Class 0; training on this data would therefore result in a model biased towards Class 1 predictions. To exacerbate the issue, the test data is also unbalanced, but its majority class is Class 0. There are two primary methods of dealing with class imbalance: re-balancing (or deliberately further unbalancing) the data set as needed, or introducing class weights, where the underlying algorithm applies different misclassification penalties to different classes [15]. Both approaches were combined here: first the data set was balanced, and then the model was trained to be biased towards Class 0, as Class 0 is the majority class in the test data. The ‘imbalanced-learn’ API [16] has implementations of class-balancing strategies from the literature, such as SMOTE [17]. The premise of SMOTE is to over-sample the minority class, synthesising new minority instances from their k nearest neighbours, until the data set is balanced; unlike kNN for imputation, the suggested k here is 5. Once the data set was balanced through SMOTE, class weights were introduced. Considering the test data has more instances belonging to Class 0, the class weights were adjusted so that misclassification of Class 0 is penalized more heavily than misclassification of Class 1. The ratio of class weights used for training was set to match the class proportions of the test data, i.e. Class 0 weight = 1.33 and Class 1 weight = 1. Over-sampling of the minority class was preferred over under-sampling of the majority class because the data was already scarce (evident from the preceding sections). Furthermore, over-sampling to a class balance permits the use of plain accuracy as the evaluation metric, as opposed to AUC, which is more complex. In summary, as well as balancing the training data class split, the model itself was adjusted to place more emphasis on correct Class 0 classifications.
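As a quick sanity check on the chosen weights, the arithmetic below derives the 1.33 ratio from the stated test-set proportions (57.14% Class 0 and 42.86% Class 1 over 4200 test instances):
#Deriving the class-weight ratio from the test class proportions
n_test = 4200
n_class0 = int(round(0.5714 * n_test))      #2400 expected Class 0 instances
n_class1 = n_test - n_class0                #1800 expected Class 1 instances
weight_class0 = n_class0 / float(n_class1)  #= 1.33..., applied to Class 0
weight_class1 = 1.0
print(weight_class0, weight_class1)         #passed to SVC as class_weight={0:1.33, 1:1}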
This section will cover how the class imbalance of the training data was addressed. The best approach was over-sampling using SMOTE, which over-samples the minority class until the data set is completely balanced. Note: the imblearn package may need to be installed first.
In [20]:
from imblearn.over_sampling import SMOTE
from collections import Counter
In [21]:
#fit over-sampling to training data inputs and outputs
over_sampler = SMOTE(ratio='auto', k_neighbors=5, kind='regular', random_state=0)
over_sampler.fit(conf_ft_inputs, conf_ft_outputs)
Out[21]:
In [22]:
#create new inputs and outputs with correct class proportions
resampled_x, resampled_y = over_sampler.fit_sample(conf_ft_inputs, conf_ft_outputs)
In [23]:
#quantify original class proportions prior to over-sampling
Counter(conf_ft_outputs)
Out[23]:
In [24]:
#quantify class proportions after over-sampling
Counter(resampled_y)
Out[24]:
In [25]:
#assign newly sampled input and outputs to old variable name used for inputs and outputs before
#over-sampling
conf_ft_inputs = resampled_x
conf_ft_outputs = resampled_y
print(Counter(conf_ft_outputs))
The pre-processing of the data consisted of several steps. First, the features were rescaled appropriately. Second, feature extraction was performed to reduce the unwieldy dimensionality of the training data, concomitantly increasing the signal-to-noise ratio and decreasing time complexity.
This section will cover the pre-processing that produced the model capable of the best predictions. Feature scaling was attempted via several methods, of which standardisation proved best. Feature extraction was achieved via PCA.
3.1) Feature Scaling
Feature scaling is important because it ensures that all features are measured on the same scale, irrespective of the units used to describe the original features. Feature scaling can take the form of standardization, normalization or rescaling. The correct choice of feature scaling method is somewhat arbitrary and highly dependent on context; thus, all three approaches were tried, and the best results were obtained with standardization.
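A minimal sketch of the three scaling options that were compared, assuming they map onto scikit-learn's StandardScaler (standardization), MinMaxScaler (rescaling) and Normalizer (normalization); only standardization is carried through below:
#Sketch: the three feature-scaling options compared in the original Notebook
from sklearn import preprocessing

standardizer = preprocessing.StandardScaler()   #zero mean, unit variance per feature (used in the final model)
rescaler = preprocessing.MinMaxScaler()         #rescale each feature to the [0, 1] range
normalizer = preprocessing.Normalizer()         #scale each sample (row) to unit norm

X_std = standardizer.fit_transform(conf_ft_inputs)
X_minmax = rescaler.fit_transform(conf_ft_inputs)
X_norm = normalizer.fit_transform(conf_ft_inputs)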
In [26]:
#standardise the full training data with confidence labels 1 only
scaler_2 = preprocessing.StandardScaler().fit(conf_ft_inputs)
std_conf_ft_in = scaler_2.transform(conf_ft_inputs)
std_conf_ft_in
Out[26]:
3.2) Principal Component Analysis (PCA)
High dimensionality should be reduced because the data are likely to contain noisy features and because high dimensionality increases computational time complexity [18]. Dimensionality reduction can be achieved via feature selection methods, such as filters and wrappers [19], or via feature extraction methods, such as PCA [20]. Here, dimensionality reduction was conducted via feature extraction, specifically PCA. The rationale is that the relative importance of the GIST and CNN features is undetermined, and feature selection methods may require some domain expertise to be effective. PCA uses the eigenvectors and eigenvalues of the covariance matrix to construct principal components: uncorrelated directions that each explain some proportion of the variance found in the dataset. The optimal number of principal components to retain is not known a priori, so it was configured experimentally, by plotting the variance explained as a function of the number of principal components included and by calculating the cross-validation test score for data transformed with different numbers of principal components.
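To make the covariance/eigenvector description concrete, the sketch below computes principal components directly with NumPy on the standardized inputs; it is for intuition only (and slow for 4608 features), since the workflow uses sklearn.decomposition.PCA, which yields the same components up to sign:
#Sketch: PCA via eigendecomposition of the covariance matrix (intuition only; the workflow uses sklearn's PCA)
import numpy as np

X_centered = std_conf_ft_in - std_conf_ft_in.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)              #4608 x 4608 feature covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)              #eigh, since the covariance matrix is symmetric
order = np.argsort(eigvals)[::-1]                   #sort components by variance explained
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_components = 230
pc_scores = X_centered.dot(eigvecs[:, :n_components])           #project data onto the top components
explained_ratio = eigvals[:n_components].sum() / eigvals.sum()  #proportion of variance retained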
In [27]:
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
#preprocessing: PCA (feature construction). High number of pcs chosen to plot a graph
#showing how much more variance is explained as pc number increases
pca_2 = PCA(n_components=700, random_state=0)
std_conf_ft_in_pca = pca_2.fit_transform(std_conf_ft_in)
#quantify amount of variance explained by principal components
print("Total Variance Explained by PCs (%): ", np.sum(pca_2.explained_variance_ratio_))
The cell below will plot how much more of the variance in the data set is explained as the number of principal components included is increased.
In [28]:
#calculate a list of cumulative sums for amount of variance explained
cumulative_variance = np.cumsum(pca_2.explained_variance_ratio_)
len(cumulative_variance)
#add 0 to the beginning of the list, otherwise list starts with variance explained by 1 pc
cumulative_variance = np.insert(cumulative_variance, 0, 0)
#define range of pcs
pcs_4_var_exp = np.arange(0,701,1)
len(pcs_4_var_exp)
fig_1 = plt.figure(figsize=(7,4))
plt.title('Number of PCs and Change In Variance Explained')
plt.xlabel('Number of PCs')
plt.ylabel('Variance Explained (%)')
plt.plot(pcs_4_var_exp, cumulative_variance, 'x-', color="r")
plt.show()
The graph above suggests that the number of principal components should not exceed 300, as the additional variance explained diminishes rapidly beyond that point. For the optimisation, the number of principal components was initially set to 230.
In [29]:
#preprocessing: PCA (feature construction)
pca_2 = PCA(n_components=230, random_state=0)
std_conf_ft_in_pca = pca_2.fit_transform(std_conf_ft_in)
#quantify ratio of variance explain by principal components
print("Total Variance Explained by PCs (%): ", np.sum(pca_2.explained_variance_ratio_))
The optimization was conducted using an exhaustive grid search, for two kernels: the polynomial kernel and the RBF kernel. The initial search was conducted on a logarithmic scale to explore as much of the parameter space as possible. From the results, the parameter ranges were refined and pruned to include only the most promising candidates. The choice of parameters was based purely on accuracy, not on practical factors such as memory consumption or prediction time. The best model was selected on the merits of the cross-validated grid-search accuracy, the validation curve and the learning curve shown below.
4.1) Parameter Optimisation
In [30]:
#this cell takes around 7 minutes to run
#parameter optimisation with Exhaustive Grid Search, with class weight
original_c_range = np.arange(0.85, 1.01, 0.01)
gamma_range = np.arange(0.00001, 0.00023, 0.00002)
#define parameter ranges to test
param_grid = [{'C': original_c_range, 'gamma': gamma_range, 'kernel': ['rbf'],
'class_weight':[{0:1.33, 1:1}]}]
#define model to do parameter search on
svr = SVC()
#return_train_score is needed for the training-accuracy heatmap below
clf = GridSearchCV(svr, param_grid, scoring='accuracy', cv=5, return_train_score=True)
clf.fit(std_conf_ft_in_pca, conf_ft_outputs)
#create dictionary of results
results_dict = clf.cv_results_
#convert the results into a dataframe
df_results = pd.DataFrame.from_dict(results_dict)
df_results
Out[30]:
The cell below will plot two heat-maps side by side: one showing the mean cross-validation (test) accuracy and one showing the mean training accuracy, for each combination of C and gamma.
In [31]:
#Draw heatmap of the validation accuracy as a function of gamma and C
fig = plt.figure(figsize=(10, 10))
ix=fig.add_subplot(1,2,1)
val_scores = clf.cv_results_['mean_test_score'].reshape(len(original_c_range),len(gamma_range))
val_scores
ax = sns.heatmap(val_scores, linewidths=0.5, square=True, cmap='PuBuGn',
xticklabels=gamma_range, yticklabels=original_c_range, cbar_kws={'shrink':0.5})
ax.invert_yaxis()
plt.yticks(rotation=0, fontsize=10)
plt.xticks(rotation= 70,fontsize=10)
plt.xlabel('Gamma', fontsize=15)
plt.ylabel('C', fontsize=15)
plt.title('Validation Accuracy', fontsize=15)
#Draw heatmap of the training accuracy as a function of gamma and C
ix=fig.add_subplot(1,2,2)
train_scores = clf.cv_results_['mean_train_score'].reshape(len(original_c_range),len(gamma_range))
train_scores
#plt.figure(figsize=(6, 6))
ax_1 = sns.heatmap(train_scores, linewidths=0.5, square=True, cmap='PuBuGn',
xticklabels=gamma_range, yticklabels=original_c_range, cbar_kws={'shrink':0.5})
ax_1.invert_yaxis()
plt.yticks(rotation=0, fontsize=10)
plt.xticks(rotation= 70,fontsize=10)
plt.xlabel('Gamma', fontsize=15)
plt.ylabel('C', fontsize=15)
plt.title('Training Accuracy', fontsize=15)
plt.show()
The cells below will plot a validation curve for gamma.
In [32]:
#import module/library
from sklearn.model_selection import validation_curve
import matplotlib.pyplot as plt
%matplotlib inline
In [33]:
#specifying gamma parameter range to plot for validation curve
param_range = gamma_range
param_range
Out[33]:
In [34]:
#calculating train and validation scores
train_scores, valid_scores = validation_curve(SVC(C=0.92, kernel='rbf', class_weight={0:1.33, 1:1}), std_conf_ft_in_pca, conf_ft_outputs, param_name='gamma',param_range=param_range,scoring='accuracy')
train_scores_mean = np.mean(train_scores, axis=1)
train_scores_std = np.std(train_scores, axis=1)
valid_scores_mean = np.mean(valid_scores, axis=1)
valid_scores_std = np.std(valid_scores, axis=1)
In [35]:
#plotting validation curve
plt.title('Gamma Validation Curve for SVM With RBF Kernel | C=0.92')
plt.xlabel('Gamma')
plt.ylabel('Score')
plt.xticks(rotation=70)
plt.ylim(0.8,1.0)
plt.xlim(0.0001,0.00021)
plt.xticks(param_range)
lw=2
plt.plot(param_range, train_scores_mean, 'o-',label="Training Score", color='darkorange', lw=lw)
plt.fill_between(param_range, train_scores_mean-train_scores_std, train_scores_mean+train_scores_std, alpha=0.2, color='darkorange', lw=lw)
plt.plot(param_range, valid_scores_mean, 'o-',label="Testing Score", color='navy', lw=lw)
plt.fill_between(param_range, valid_scores_mean-valid_scores_std, valid_scores_mean+valid_scores_std, alpha=0.2, color='navy', lw=lw)
plt.legend(loc='best')
plt.show()
The cells below will plot the Learning Curve.
In [36]:
#import module/library
from sklearn.model_selection import learning_curve
In [37]:
#define training data size increments
td_size = np.arange(0.1, 1.1, 0.1)
#calculating train and validation scores
train_sizes, train_scores, valid_scores = learning_curve(SVC(C=0.92, kernel='rbf', gamma=0.00011, class_weight={0:1.33, 1:1}), std_conf_ft_in_pca, conf_ft_outputs, train_sizes=td_size ,scoring='accuracy')
train_scores_mean = np.mean(train_scores, axis=1)
train_scores_std = np.std(train_scores, axis=1)
valid_scores_mean = np.mean(valid_scores, axis=1)
valid_scores_std = np.std(valid_scores, axis=1)
In [38]:
#plotting learning curve
fig = plt.figure(figsize=(5,5))
plt.title('Learning Curve with SVM with RBF Kernel| C=0.92 & Gamma = 0.00011', fontsize=9)
plt.xlabel('Train Data Size')
plt.ylabel('Score')
plt.ylim(0.8,1)
lw=2
plt.plot(train_sizes, train_scores_mean, 'o-', color="r", label="Training Score")
plt.fill_between(train_sizes, train_scores_mean-train_scores_std, train_scores_mean+train_scores_std, alpha=0.2, color='red', lw=lw)
plt.plot(train_sizes, valid_scores_mean, 'o-', color="g",label="Testing Score")
plt.fill_between(train_sizes, valid_scores_mean-valid_scores_std, valid_scores_mean+valid_scores_std, alpha=0.2, color='green', lw=lw)
plt.legend(loc='best')
plt.show()
The cells below show the optimisation of the number of principal components to include. This is done by specifying a range of principal-component counts, conducting PCA for each count in the range, and calculating the average test score over 3-fold cross-validation. The procedure is repeated 5 times to account for the randomness of the PCA solver, and the average test accuracy over the 5 runs is plotted against the number of principal components included.
In [39]:
#this cell may take several minutes to run
#plot how the number of PC's changes the test accuracy
no_pcs = np.arange(20, 310, 10)
compute_average_of_5 = []
for t in range(0,5):
    pcs_accuracy_change = []
    for i in no_pcs:
        dummy_inputs = std_conf_ft_in
        dummy_outputs = conf_ft_outputs
        pca_dummy = PCA(n_components=i)
        pca_dummy.fit(dummy_inputs)
        dummy_inputs_pca = pca_dummy.transform(dummy_inputs)
        dummy_model = SVC(C=0.92, kernel='rbf', gamma=0.00011, class_weight={0:1.33, 1:1})
        dummy_model.fit(dummy_inputs_pca, dummy_outputs)
        dummy_scores = cross_val_score(dummy_model, dummy_inputs_pca, dummy_outputs, cv=3, scoring='accuracy')
        mean_cv = dummy_scores.mean()
        pcs_accuracy_change.append(mean_cv)
    print(len(pcs_accuracy_change))
    compute_average_of_5.append(pcs_accuracy_change)
In [40]:
#calculate position specific average for the five trials
from __future__ import division
average_acc_4_pcs = [sum(e)/len(e) for e in zip(*compute_average_of_5)]
In [41]:
plt.title('Number of PCs and Change In Accuracy')
plt.xlabel('Number of PCs')
plt.ylabel('Accuracy (%)')
plt.plot(no_pcs, average_acc_4_pcs, 'o-', color="r")
plt.show()
The following cells will prepare the test data by getting it into the right format.
In [43]:
#Load the test data set
test_data = pd.read_csv("/Users/Max/Desktop/Max's Folder/Uni Work/Data Science MSc/Machine Learning/ML Kaggle Competition /Data Sets/Testing Data Set.csv", header=0, index_col=0)
In [44]:
##Observe the test data
test_data
Out[44]:
In [45]:
#turn test dataframe into matrix
test_data_matrix = test_data.as_matrix(columns=None)
test_data_matrix.shape
Out[45]:
The following cell will apply the same pre-processing applied to the training data to the test data.
In [46]:
#pre-process test data in same way as train data
scaled_test = scaler_2.transform(test_data_matrix)
transformed_test = pca_2.transform(scaled_test)
transformed_test.shape
Out[46]:
The following cells will produce predictions on the test data using the final model.
In [47]:
#define and fit final model with best parameters from grid search
final_model = SVC(C=0.92, cache_size=1000, kernel='rbf', gamma=0.00011, class_weight={0:1.33, 1:1})
final_model.fit(std_conf_ft_in_pca, conf_ft_outputs)
Out[47]:
In [48]:
#make test data predictions
predictions = final_model.predict(transformed_test)
#create dictionary for outputs matched with ID
to_export = {'ID': np.arange(1, 4201, 1), 'prediction': predictions}
to_export
#convert to dataframe
final_predictions = pd.DataFrame.from_dict(to_export)
final_predictions
Out[48]:
In [49]:
#convert prediction column float type entries to integers
final_predictions = final_predictions.astype('int')
final_predictions
Out[49]:
In [50]:
#check properties of predictions: class balance should be 42.86(1):57.14(0)
#i.e. should predict 2400 Class 0 instances, and 1800 Class 1 instances
final_predictions.prediction.value_counts()
Out[50]:
[1] Vapnik V. (1982) Estimation of Dependences Based on Empirical Data. Springer-Verlag, New York. (Original Russian edition, 1979.)
[2] Cortes C, Vapnik V. (1995) Support Vector Networks. Machine Learning. Vol. 20: pages 273-297.
[3] Drucker H, Burges CJC, Kaufman L, Smola A, Vapnik V. (1997) Support vector regression machines. Advances in Neural Information Processing Systems. Vol. 9: pages 155-161.
[4] Vapnik VN. (1982) Estimation of Dependences Based on Empirical Data. Addendum 1, New York: Springer-Verlag.
[5] Rosasco L, De Vito E, Caponnetto A, Piana M, Verri A. (2004) Are Loss Functions All the Same? Neural Computation. Vol. 16: pages 1063-1076.
[6] Batuwita R, Palade V. (2012) Class Imbalance learning methods for Support Vector Machines. In: Imbalanced Learning: Foundations, Algorithms and Applications, by He H, Ma Y. John Wiley & Sons: Chapter 6.
[7] Lian H. (2012) On feature selection with principal component analysis for one-class SVM. Pattern Recognition Letters. Vol. 33: pages 1027-1031.
[8] Juszczak P, Tax DMJ, Duin RPW. (2002) Feature scaling in support vector data descriptions. Proc. 8th Annual Conf. Adv. School Comput. Imaging: pages 1-8. Available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.100.2524&rep=rep1&type=pdf
[9] Greenland S, Finkle WD. (1995) A critical look at methods for handling missing covariates in epidemiologic regression analyses. Am J Epidemiol. Vol. 142: pages 1255-1264.
[10] Horton NJ, Kleinman KP. (2007) Much ado about nothing: A comparison of missing data methods and software to fit incomplete data regression models. Am Stat. Vol. 61: pages 79-90.
[11] Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB. (2001) Missing value estimation methods for DNA microarrays. Bioinformatics. Vol. 17: pages 520-525.
[12] Andridge RR, Little RJ. (2010) A Review of Hot Deck Imputation for Survey Non-response. Int Stat Review. Vol. 78: pages 40-64.
[13] Rubinsteyn A, Feldman S, O’Donnell T, Beaulieu-Jones B. (2015) fancyimpute 0.2.0. Package found on: https://github.com/hammerlab/fancyimpute.
[14] Beretta L, Santaniello A. (2016) Nearest neighbour imputation algorithms: a critical evaluation. BMC Medical Informatics and Decision Making. Vol. 16: pages 197-208.
[15] Barandela R, Valdovinos RM, Sanchez JS, Ferri FJ. (2004) The Imbalanced Training Sample Problem: Under or Over Sampling? Springer-Verlag, Berlin: pages 806-814.
[16] Lemaître G, Nogueira F, Aridas CK. (2017) Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning. Journal of Machine Learning Research. Vol. 18: pages 1-5.
[17] Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. (2002) SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research. Vol. 16: pages 321-357.
[18] Strong DM, Lee YW, Wang RY. (1997) Data Quality in context. Communications of the ACM. Vol. 40: pages 103-110.
[19] Blum AL, Langley P. (1997) Selection of relevant features and examples in Machine Learning. Artificial Intelligence. Vol. 97: pages 245-271.
[20] Hira ZM, Gillies DF. (2015) A Review of Feature Selection and Feature Extraction Methods Applied on Microarray data. Advances in Bioinformatics. Vol. 2015: pages 1-13.