ROIseries "Wine Tasting" part 2

This is the second part of the Wine Tasting tutorial. Please refer to the first part for more information.

Feature Sommelier setup and cross-validation


In [1]:
import ROIseries_feature_sommelier as RS_test
%matplotlib inline

Replace the following paths with the paths printed at the end of the first part of the tutorial.


In [2]:
features_csv = [
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\R\\features\\R_features_2017-04-21T15-26-10.12803554534927Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\R\\features\\R_features_2017-04-21T15-26-10.23702710866943Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\R\\features\\R_features_2017-04-21T15-26-10.3300461173059Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\R\\features\\R_features_2017-04-21T15-26-10.40805816650405Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\G\\features\\G_features_2017-04-21T15-26-10.59602737426772Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\G\\features\\G_features_2017-04-21T15-26-10.70505917072311Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\G\\features\\G_features_2017-04-21T15-26-10.78303098678603Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\G\\features\\G_features_2017-04-21T15-26-10.87604999542251Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\B\\features\\B_features_2017-04-21T15-26-11.07906639575973Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\B\\features\\B_features_2017-04-21T15-26-11.17305099964156Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\B\\features\\B_features_2017-04-21T15-26-11.26602977514282Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\B\\features\\B_features_2017-04-21T15-26-11.36005461215988Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NIR\\features\\NIR_features_2017-04-21T15-26-11.54705822467819Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NIR\\features\\NIR_features_2017-04-21T15-26-11.64104282856002Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NIR\\features\\NIR_features_2017-04-21T15-26-11.7340618371965Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NIR\\features\\NIR_features_2017-04-21T15-26-11.82804644107833Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NDVI\\features\\NDVI_features_2017-04-21T15-26-12.01505005359664Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NDVI\\features\\NDVI_features_2017-04-21T15-26-12.10903465747848Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NDVI\\features\\NDVI_features_2017-04-21T15-26-12.20205366611495Z.csv",
    "C:\\Program Files\\Harris\\ENVI54\\IDL86\\NDVI\\features\\NDVI_features_2017-04-21T15-26-12.29603826999679Z.csv"
]
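
As an alternative to pasting every path by hand, the list could be collected from disk. The following is only a sketch: the helper name `collect_feature_csvs` and its parameters are hypothetical and not part of ROIseries, and it assumes the `<root>/<band>/features/<band>_features_*.csv` layout visible in the paths above.

```python
import glob
import os


def collect_feature_csvs(root, bands=("R", "G", "B", "NIR", "NDVI")):
    """Return feature CSV paths found under root/<band>/features/.

    Hypothetical helper, not part of ROIseries. Paths are grouped by band
    in the order given and sorted by file name within each band.
    """
    paths = []
    for band in bands:
        pattern = os.path.join(root, band, "features", band + "_features_*.csv")
        paths.extend(sorted(glob.glob(pattern)))
    return paths
```

Note that this picks up every matching file in those folders, including files left over from earlier runs, so the pasted list stays the safer choice when several runs share one directory.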

Replace the following path with the path to the data/sentinel_2a/table/scene_properties.csv within the ROIseries repo.


In [3]:
scene_properties_csv = r"D:\Programming\code\ROIseries\data\sentinel_2a\table\scene_properties.csv"

Reformat the features and the ground truth into one CSV of ROWS X COLUMNS = SAMPLES X FEATURES.

The returned value in the csv variable stores the path to the resulting table.


In [4]:
csv = RS_test.ROIseries_feature_sommelier.read_features_and_groundtruth(features_csv, scene_properties_csv)

Instantiate the feature_sommelier and run a 10-fold cross-validation.


In [5]:
class_column = "cloudy"
strata_column = "id"
positive_classname = True
RS_cloudy = RS_test.ROIseries_feature_sommelier(csv, class_column, strata_column, positive_classname)
RS_cloudy.folds = 10
RS_cloudy.CV()


Fold 0/10
Fold 1/10
Fold 2/10
Fold 3/10
Fold 4/10
Fold 5/10
Fold 6/10
Fold 7/10
Fold 8/10
Fold 9/10
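
How ROIseries uses the `strata_column` internally is not shown in this tutorial. One common reason to pass an identifier column to a cross-validation routine is to keep all samples that share an identifier in the same fold, so that a model is never tested on an object it has already seen during training. The sketch below illustrates that idea under this assumption; `grouped_folds` is a hypothetical name and not the ROIseries implementation.

```python
from collections import defaultdict


def grouped_folds(group_ids, n_folds):
    """Yield (train_idx, test_idx) pairs; no group appears on both sides.

    Hypothetical illustration, not the ROIseries implementation.
    """
    # Collect the sample indices of each group, in first-seen order.
    by_group = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        by_group[gid].append(idx)
    groups = list(by_group)
    if n_folds > len(groups):
        raise ValueError("cannot build more folds than there are groups")

    for fold in range(n_folds):
        # Every n_folds-th group, starting at `fold`, goes to the test side.
        test_groups = set(groups[fold::n_folds])
        test_idx = [i for g in groups if g in test_groups for i in by_group[g]]
        train_idx = [i for g in groups if g not in test_groups for i in by_group[g]]
        yield train_idx, test_idx
```

With `group_ids` taken from the `id` column and `n_folds=10`, each sample lands in the test set of exactly one fold.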

Plots of the 10-fold cross-validation results

Average and standard deviation of the precision and recall curves from the CV


In [6]:
RS_cloudy.plot_pr()
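
To make "average and standard deviation of curves" concrete: each fold produces its own curve with its own x positions, so the curves have to be brought onto a shared grid before they can be averaged point by point. The sketch below does this with linear interpolation, which is a simplification for precision-recall curves. The function names are hypothetical and this is not necessarily how `plot_pr` computes its result.

```python
import statistics


def interpolate(x, y, grid):
    """Linearly interpolate the points (x, y), with x ascending, onto grid."""
    out = []
    for g in grid:
        if g <= x[0]:
            out.append(y[0])
        elif g >= x[-1]:
            out.append(y[-1])
        else:
            # Find the first segment whose right end reaches g.
            for k in range(1, len(x)):
                if g <= x[k]:
                    span = x[k] - x[k - 1]
                    w = 0.0 if span == 0 else (g - x[k - 1]) / span
                    out.append(y[k - 1] + w * (y[k] - y[k - 1]))
                    break
    return out


def mean_std_curve(curves, grid):
    """Return (mean, std) lists over several (x, y) curves on a shared grid.

    std is the population standard deviation across the curves.
    """
    resampled = [interpolate(x, y, grid) for x, y in curves]
    columns = list(zip(*resampled))
    mean = [statistics.fmean(col) for col in columns]
    std = [statistics.pstdev(col) for col in columns]
    return mean, std
```

Plotting `mean` as a line and `mean ± std` as a shaded band over `grid` gives the kind of summary figure described here.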


Average and standard deviation of ROC curves from the CV


In [7]:
RS_cloudy.plot_roc()
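
For readers unfamiliar with ROC curves, the following sketch shows how the points of a single curve can be derived from classifier scores: the decision threshold is swept from the highest score to the lowest, and the false positive rate and true positive rate are recorded at each step. `roc_points` and `auc` are illustrative names, not ROIseries functions.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists for boolean labels and numeric scores."""
    pos = sum(1 for label in labels if label)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")

    # Highest score first: lowering the threshold admits one sample at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Samples with equal scores share a threshold, so emit one point per tie group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )
```

A perfectly separating classifier yields an area of 1.0, while scores that carry no information give about 0.5.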


Average feature importance from the CV


In [8]:
RS_cloudy.plot_feature_importance()


Average performance measures derived from the confusion matrix over the CV: bar plot and numbers


In [9]:
RS_cloudy.plot_performance()



In [10]:
RS_cloudy.plot_performance(get_data=True)


Out[10]:
F                     0.965449
G                     0.959715
deviation            -0.041071
kappa                 0.907565
overall_acc           0.956818
precision             0.987500
recall                0.946429
true_negative_rate    0.975000
dtype: float64
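
Several of the measures above follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions and assumes that `F` denotes the F1 score, which the tutorial does not state. `G` and `deviation` are left out because their definitions are not given here. The function name and the counts in the usage note are made up for illustration.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute common measures from binary confusion-matrix counts.

    No guard against empty classes: a zero denominator raises ZeroDivisionError.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    true_negative_rate = tn / (tn + fp)
    overall_acc = (tp + tn) / total
    f1 = 2 * precision * recall / (precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_yes + p_no
    kappa = (overall_acc - p_chance) / (1 - p_chance)

    return {
        "F": f1,  # assumed here to mean the F1 score
        "kappa": kappa,
        "overall_acc": overall_acc,
        "precision": precision,
        "recall": recall,
        "true_negative_rate": true_negative_rate,
    }
```

For example, `confusion_metrics(tp=40, fp=10, fn=5, tn=45)` gives a precision of 0.8, an overall accuracy of 0.85 and a kappa of 0.7. In a cross-validation, such values would be computed per fold and then averaged.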