Dummy data for classification


In [56]:
import seaborn as sns 
import matplotlib.pyplot as plt

%matplotlib inline

from sklearn.datasets import make_blobs

data, labels = make_blobs(n_features=2, centers=2, cluster_std=2, random_state=3)


plt.scatter(data[:,0], data[:,1], c = labels, cmap='coolwarm');


Attempt to Classify Data


In [59]:
#Import LinearSVC
from sklearn.svm import LinearSVC

#Create instance of Support Vector Classifier
svc = LinearSVC()

#Fit estimator to the first 70 samples (70% of the 100 points make_blobs generates by default)
svc.fit(data[:70], labels[:70])

#Predict final 30%
y_pred = svc.predict(data[70:])

#Establish true y values
y_true = labels[70:]
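
Slicing the first 70 rows works here because make_blobs shuffles its output by default. A more general way to get the same 70/30 split is train_test_split; a minimal sketch of the equivalent call (not used in the cells below):


In [ ]:
from sklearn.model_selection import train_test_split

# Equivalent 70/30 split, with shuffling handled for us
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.3, random_state=3)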

Metrics

Precision Score

TP - True Positives
FP - False Positives

Precision - Accuracy of the positive predictions: the fraction of predicted positives that are truly positive.
Precision = TP/(TP + FP)


In [74]:
from sklearn.metrics import precision_score

print("Precision score: {}".format(precision_score(y_true,y_pred)))


Precision score: 1.0
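
As a sanity check, the same number can be reproduced by counting true and false positives directly; a minimal numpy sketch, assuming the y_true and y_pred arrays from above (class 1 is the positive class):


In [ ]:
import numpy as np

TP = np.sum((y_pred == 1) & (y_true == 1))  # predicted 1, actually 1
FP = np.sum((y_pred == 1) & (y_true == 0))  # predicted 1, actually 0
print("Manual precision: {}".format(TP / (TP + FP)))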

Recall Score

FN - False Negatives

Recall (aka sensitivity or true positive rate): Fraction of actual positives that were correctly identified.
Recall = TP/(TP+FN)


In [75]:
from sklearn.metrics import recall_score

print("Recall score: {}".format(recall_score(y_true,y_pred)))


Recall score: 0.9333333333333333
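
Recall can be checked the same way, now counting false negatives; a short sketch in the same style (TP is recomputed so the cell stands alone):


In [ ]:
import numpy as np

TP = np.sum((y_pred == 1) & (y_true == 1))  # predicted 1, actually 1
FN = np.sum((y_pred == 0) & (y_true == 1))  # predicted 0, actually 1
print("Manual recall: {}".format(TP / (TP + FN)))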

Accuracy Score


In [62]:
from sklearn.metrics import accuracy_score

print("Accuracy score: {}".format(accuracy_score(y_true,y_pred)))


Accuracy score: 0.9666666666666667
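
Accuracy is simply the fraction of predictions that match the true labels, so it reduces to a one-line check on the same arrays:


In [ ]:
print("Manual accuracy: {}".format((y_true == y_pred).mean()))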

Confusion Matrix


In [67]:
from sklearn.metrics import confusion_matrix
import pandas as pd

confusion_df = pd.DataFrame(confusion_matrix(y_true,y_pred),
             columns=["Predicted Class " + str(class_name) for class_name in [0,1]],
             index = ["Class " + str(class_name) for class_name in [0,1]])

print(confusion_df)


         Predicted Class 0  Predicted Class 1
Class 0                 15                  0
Class 1                  1                 14
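
For a binary problem the four cells of the matrix can be unpacked directly; a quick sketch using ravel() on the same matrix (rows are true classes, columns are predicted classes):


In [ ]:
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TN={} FP={} FN={} TP={}".format(tn, fp, fn, tp))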

Classification Report


In [68]:
from sklearn.metrics import classification_report

print(classification_report(y_true,y_pred))


             precision    recall  f1-score   support

          0       0.94      1.00      0.97        15
          1       1.00      0.93      0.97        15

avg / total       0.97      0.97      0.97        30

F1 Score


In [72]:
from sklearn.metrics import f1_score

print("F1 Score: {}".format(f1_score(y_true,y_pred)))


F1 Score: 0.9655172413793104
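
F1 is the harmonic mean of precision and recall; as a sanity check it can be recomputed from the two scores found earlier (assuming those cells have been run):


In [ ]:
p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)
print("Harmonic mean: {}".format(2 * p * r / (p + r)))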

Metric Curves


In [ ]:
from sklearn.metrics import precision_recall_curve

# LinearSVC has no predict_proba, so use its decision_function output as scores
y_scores = svc.decision_function(data[70:])
precisions, recalls, thresholds = precision_recall_curve(y_true, y_scores)
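
A common use of these arrays is to plot precision and recall against the decision threshold. Note that precision_recall_curve returns one more precision/recall value than thresholds, hence the [:-1] slicing in this sketch:


In [ ]:
plt.plot(thresholds, precisions[:-1], label="precision")
plt.plot(thresholds, recalls[:-1], label="recall")
plt.xlabel("Decision threshold")
plt.legend();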

Other classification metrics available from sklearn.metrics

  • auc
  • average_precision_score
  • brier_score_loss
  • cohen_kappa_score
  • dcg_score
  • fbeta_score
  • hamming_loss
  • hinge_loss
  • jaccard_similarity_score
  • log_loss
  • matthews_corrcoef
  • precision_recall_curve
  • precision_recall_fscore_support
  • roc_auc_score
  • roc_curve
  • zero_one_loss
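
As one example from this list, roc_curve and roc_auc_score accept the same decision-function scores used for the precision-recall curve above (a minimal sketch, assuming the y_scores array from that cell):


In [ ]:
from sklearn.metrics import roc_curve, roc_auc_score

fpr, tpr, roc_thresholds = roc_curve(y_true, y_scores)
print("ROC AUC: {}".format(roc_auc_score(y_true, y_scores)))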

sklearn.metrics also offers Regression Metrics, Model Selection Scorers, Multilabel Ranking Metrics, Clustering Metrics, Biclustering Metrics, and Pairwise Metrics.

