prelim_month - confusion matrix
2017.09.20 - work log - prelim_month - confusion matrix
2017.09.20-work_log-prelim_month-confusion_matrix.ipynb
In [1]:
import datetime
import math
import pandas
import pandas_ml
import sklearn
import sklearn.metrics
import six
import statsmodels
import statsmodels.api
print( "packages imported at " + str( datetime.datetime.now() ) )
In [2]:
%pwd
Out[2]:
First, initialize my dev django project, so I can run code in this notebook that references my django models and can talk to the database using my project's settings.
You need to have installed your virtualenv (with django installed in it) as a Jupyter kernel, then select that kernel for this notebook.
In [3]:
%run ../django_init.py
Import any sourcenet or context_analysis models or classes.
In [4]:
# python_utilities
from python_utilities.analysis.statistics.confusion_matrix_helper import ConfusionMatrixHelper
from python_utilities.analysis.statistics.stats_helper import StatsHelper
from python_utilities.dictionaries.dict_helper import DictHelper
# context_analysis models.
from context_analysis.models import Reliability_Names
print( "sourcenet and context_analysis packages imported at " + str( datetime.datetime.now() ) )
Write functions here to do the math, so that we can reuse them below.
A basic confusion matrix ( https://en.wikipedia.org/wiki/Confusion_matrix ) contains counts of true positives, true negatives, false positives, and false negatives for a given binary or boolean (yes/no) classification decision you are asking someone or something to make.
To create a confusion matrix, you need two associated vectors of classification decisions (0s and 1s): one that contains ground truth, and one that contains the values predicted by whatever coder you are testing. For each associated pair of values: if both are 1, count a true positive; if both are 0, a true negative; if the ground truth is 0 but the prediction is 1, a false positive; and if the ground truth is 1 but the prediction is 0, a false negative.
Once you have your basic confusion matrix, the counts of true positives, true negatives, false positives, and false negatives can be used to calculate a range of scores for assessing the quality of a predictive model. These scores include precision and recall, accuracy, the F1 score (the harmonic mean of precision and recall), and the diagnostic odds ratio, among many others.
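As a quick toy illustration of the terms above (made-up lists, not project data; a minimal sketch using the already-imported sklearn.metrics):
# toy example: 1 = person detected, 0 = not detected.
toy_ground_truth = [ 1, 1, 1, 0, 0, 1, 0, 1 ]
toy_predicted = [ 1, 0, 1, 0, 1, 1, 0, 1 ]
# sklearn returns the flattened 2x2 matrix in the order TN, FP, FN, TP.
tn, fp, fn, tp = sklearn.metrics.confusion_matrix( toy_ground_truth, toy_predicted ).ravel()
print( "TP = " + str( tp ) + ", TN = " + str( tn ) + ", FP = " + str( fp ) + ", FN = " + str( fn ) )
# derived scores discussed above.
precision = tp / float( tp + fp )   # share of predicted positives that are correct
recall = tp / float( tp + fn )      # share of ground truth positives that were found
accuracy = ( tp + tn ) / float( tp + tn + fp + fn )
f1_score = 2.0 / ( ( 1.0 / recall ) + ( 1.0 / precision ) )   # harmonic mean of precision and recall
diagnostic_odds_ratio = ( tp * tn ) / float( fp * fn )        # equals LR+ / LR-
print( "precision = " + str( precision ) + ", recall = " + str( recall ) + ", accuracy = " + str( accuracy ) )
print( "F1 = " + str( f1_score ) + ", DOR = " + str( diagnostic_odds_ratio ) )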
For each person detected across the set of articles, look at whether the automated coder correctly detected the person, independent of eventual lookup or person type.
First, build lists of ground truth and predicted values per person.
In [5]:
# declare variables
reliability_names_label = None
label_in_list = []
reliability_names_qs = None
ground_truth_coder_index = 1
predicted_coder_index = 2
# processing
column_name = ""
predicted_value = -1
predicted_list = []
ground_truth_value = -1
ground_truth_list = []
reliability_names_instance = None
# set label
reliability_names_label = "prelim_month"
# lookup Reliability_Names for selected label
label_in_list.append( reliability_names_label )
reliability_names_qs = Reliability_Names.objects.filter( label__in = label_in_list )
print( "Found " + str( reliability_names_qs.count() ) + " rows with label in " + str( label_in_list ) )
# loop over records
predicted_value = -1
predicted_list = []
ground_truth_value = -1
ground_truth_list = []
ground_truth_positive_count = 0
predicted_positive_count = 0
true_positive_count = 0
false_positive_count = 0
ground_truth_negative_count = 0
predicted_negative_count = 0
true_negative_count = 0
false_negative_count = 0
for reliability_names_instance in reliability_names_qs:
# get detected flag from ground truth and predicted columns and add them to list.
# ==> ground truth
column_name = Reliability_Names.FIELD_NAME_PREFIX_CODER
column_name += str( ground_truth_coder_index )
column_name += "_" + Reliability_Names.FIELD_NAME_SUFFIX_DETECTED
ground_truth_value = getattr( reliability_names_instance, column_name )
ground_truth_list.append( ground_truth_value )
# ==> predicted
column_name = Reliability_Names.FIELD_NAME_PREFIX_CODER
column_name += str( predicted_coder_index )
column_name += "_" + Reliability_Names.FIELD_NAME_SUFFIX_DETECTED
predicted_value = getattr( reliability_names_instance, column_name )
predicted_list.append( predicted_value )
#-- END loop over Reliability_Names instances. --#
print( "==> population values count: " + str( len( ground_truth_list ) ) )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
print( "==> percentage agreement = " + str( StatsHelper.percentage_agreement( ground_truth_list, predicted_list ) ) )
In [6]:
print( "==> population values: " + str( len( ground_truth_list ) ) )
list_name = "ACTUAL_VALUE_LIST"
string_list = map( str, ground_truth_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
list_name = "PREDICTED_VALUE_LIST"
string_list = map( str, predicted_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
In [14]:
confusion_matrix = pandas_ml.ConfusionMatrix( ground_truth_list, predicted_list )
print("Confusion matrix:\n%s" % confusion_matrix)
confusion_matrix.print_stats()
stats_dict = confusion_matrix.stats()
print( str( stats_dict ) )
print( str( stats_dict[ 'TPR' ] ) )
# get counts in variables
true_positive_count = confusion_matrix.TP
false_positive_count = confusion_matrix.FP
true_negative_count = confusion_matrix.TN
false_negative_count = confusion_matrix.FN
# and derive population and predicted counts
ground_truth_positive_count = true_positive_count + false_negative_count
predicted_positive_count = true_positive_count + false_positive_count
ground_truth_negative_count = true_negative_count + false_positive_count
predicted_negative_count = true_negative_count + false_negative_count
print( "==> Predicted positives: " + str( predicted_positive_count ) + " ( " + str( ( true_positive_count + false_positive_count ) ) + " )" )
print( "==> Ground truth positives: " + str( ground_truth_positive_count ) + " ( " + str( ( true_positive_count + false_negative_count ) ) + " )" )
print( "==> True positives: " + str( true_positive_count ) )
print( "==> False positives: " + str( false_positive_count ) )
print( "==> Predicted negatives: " + str( predicted_negative_count ) + " ( " + str( ( true_negative_count + false_negative_count ) ) + " )" )
print( "==> Ground truth negatives: " + str( ground_truth_negative_count ) + " ( " + str( ( true_negative_count + false_positive_count ) ) + " )" )
print( "==> True negatives: " + str( true_negative_count ) )
print( "==> False negatives: " + str( false_negative_count ) )
print( "==> Precision (true positive/predicted positive): " + str( ( true_positive_count / predicted_positive_count ) ) )
print( "==> Recall (true positive/ground truth positive): " + str( ( true_positive_count / ground_truth_positive_count ) ) )
In [40]:
confusion_helper = ConfusionMatrixHelper.populate_confusion_matrix( ground_truth_list, predicted_list )
print( str( confusion_helper ) )
For each person detected across the set of articles, look at whether the automated coder correctly looked up the person (so compare person IDs).
First, build lists of ground truth and predicted values per person.
In [41]:
# declare variables
reliability_names_label = None
label_in_list = []
reliability_names_qs = None
ground_truth_coder_index = 1
predicted_coder_index = 2
# processing
column_name = ""
predicted_value = -1
predicted_list = []
ground_truth_value = -1
ground_truth_list = []
reliability_names_instance = None
# set label
reliability_names_label = "prelim_month"
# lookup Reliability_Names for selected label
label_in_list.append( reliability_names_label )
reliability_names_qs = Reliability_Names.objects.filter( label__in = label_in_list )
print( "Found " + str( reliability_names_qs.count() ) + " rows with label in " + str( label_in_list ) )
# loop over records
predicted_value = -1
predicted_list = []
ground_truth_value = -1
ground_truth_list = []
ground_truth_positive_count = 0
predicted_positive_count = 0
true_positive_count = 0
false_positive_count = 0
ground_truth_negative_count = 0
predicted_negative_count = 0
true_negative_count = 0
false_negative_count = 0
for reliability_names_instance in reliability_names_qs:
# get person_id from ground truth and predicted columns and add them to list.
# ==> ground truth
column_name = Reliability_Names.FIELD_NAME_PREFIX_CODER
column_name += str( ground_truth_coder_index )
column_name += "_" + Reliability_Names.FIELD_NAME_SUFFIX_PERSON_ID
ground_truth_value = getattr( reliability_names_instance, column_name )
ground_truth_list.append( ground_truth_value )
# ==> predicted
column_name = Reliability_Names.FIELD_NAME_PREFIX_CODER
column_name += str( predicted_coder_index )
column_name += "_" + Reliability_Names.FIELD_NAME_SUFFIX_PERSON_ID
predicted_value = getattr( reliability_names_instance, column_name )
predicted_list.append( predicted_value )
#-- END loop over Reliability_Names instances. --#
print( "==> population values count: " + str( len( ground_truth_list ) ) )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
print( "==> percentage agreement = " + str( StatsHelper.percentage_agreement( ground_truth_list, predicted_list ) ) )
In [ ]:
print( "==> population values: " + str( len( ground_truth_list ) ) )
list_name = "ACTUAL_VALUE_LIST"
string_list = map( str, ground_truth_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
list_name = "PREDICTED_VALUE_LIST"
string_list = map( str, predicted_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
In [42]:
confusion_helper = ConfusionMatrixHelper.populate_confusion_matrix( ground_truth_list, predicted_list )
print( str( confusion_helper ) )
In [22]:
def build_confusion_lists( column_name_suffix_IN,
desired_value_IN,
label_list_IN = [ "prelim_month", ],
ground_truth_coder_index_IN = 1,
predicted_coder_index_IN = 2,
debug_flag_IN = False ):
'''
Accepts the suffix of the column name of interest and the desired value. Also accepts an
optional label list, indexes of the ground_truth and predicted coder users, and a debug
flag. Uses these values to loop over records whose label matches one in the list passed
in. For each record, checks whether the ground_truth and predicted values in the
specified column match the desired value. If so, positive, so 1 is stored for the row;
if not, negative, so 0 is stored for the row.
Returns dictionary with value lists inside, ground truth values list mapped to key
"ground_truth" and predicted values list mapped to key "predicted".
'''
# return reference
lists_OUT = {}
# declare variables
reliability_names_label = None
label_in_list = []
reliability_names_qs = None
ground_truth_coder_index = -1
predicted_coder_index = -1
# processing
debug_flag = False
desired_column_suffix = None
desired_value = None
ground_truth_column_name = None
ground_truth_column_value = None
ground_truth_value = -1
ground_truth_list = []
predicted_column_name = None
predicted_column_value = None
predicted_value = -1
predicted_list = []
reliability_names_instance = None
# got required values?
# column name suffix?
if ( column_name_suffix_IN is not None ):
# desired value?
if ( desired_value_IN is not None ):
# ==> initialize
desired_column_suffix = column_name_suffix_IN
desired_value = desired_value_IN
label_in_list = label_list_IN
ground_truth_coder_index = ground_truth_coder_index_IN
predicted_coder_index = predicted_coder_index_IN
debug_flag = debug_flag_IN
# create ground truth column name
ground_truth_column_name = Reliability_Names.FIELD_NAME_PREFIX_CODER
ground_truth_column_name += str( ground_truth_coder_index )
ground_truth_column_name += "_" + desired_column_suffix
# create predicted column name.
predicted_column_name = Reliability_Names.FIELD_NAME_PREFIX_CODER
predicted_column_name += str( predicted_coder_index )
predicted_column_name += "_" + desired_column_suffix
# ==> processing
# lookup Reliability_Names for selected label(s)
reliability_names_qs = Reliability_Names.objects.filter( label__in = label_in_list )
print( "Found " + str( reliability_names_qs.count() ) + " rows with label in " + str( label_in_list ) )
# reset all lists and values.
ground_truth_column_value = ""
ground_truth_value = -1
ground_truth_list = []
predicted_column_value = ""
predicted_value = -1
predicted_list = []
# loop over records to build ground_truth and predicted value lists
# where 1 = value matching desired value in multi-value categorical
# variable and 0 = any value other than the desired value.
for reliability_names_instance in reliability_names_qs:
# get detected flag from ground truth and predicted columns and add them to list.
# ==> ground truth
# get column value.
ground_truth_column_value = getattr( reliability_names_instance, ground_truth_column_name )
# does it match desired value?
if ( ground_truth_column_value == desired_value ):
# it does - True (or positive or 1!)!
ground_truth_value = 1
else:
# it does not - False (or negative or 0!)!
ground_truth_value = 0
#-- END check to see if current value matches desired value. --#
# add value to list.
ground_truth_list.append( ground_truth_value )
# ==> predicted
# get column value.
predicted_column_value = getattr( reliability_names_instance, predicted_column_name )
# does it match desired value?
if ( predicted_column_value == desired_value ):
# it does - True (or positive or 1!)!
predicted_value = 1
else:
# it does not - False (or negative or 0!)!
predicted_value = 0
#-- END check to see if current value matches desired value. --#
# add to predicted list.
predicted_list.append( predicted_value )
if ( debug_flag == True ):
print( "----> gt: " + str( ground_truth_column_value ) + " ( " + str( ground_truth_value ) + " ) - p: " + str( predicted_column_value ) + " ( " + str( predicted_value ) + " )" )
#-- END DEBUG --#
#-- END loop over Reliability_Names instances. --#
else:
print( "ERROR - you must specify a desired value." )
#-- END check to see if desired value passed in. --#
else:
print( "ERROR - you must provide the suffix of the column you want to examine." )
#-- END check to see if column name suffix passed in. --#
# package up and return lists.
lists_OUT[ "ground_truth" ] = ground_truth_list
lists_OUT[ "predicted" ] = predicted_list
return lists_OUT
#-- END function build_confusion_lists() --#
print( "Function build_confusion_lists() defined at " + str( datetime.datetime.now() ) )
For each person detected across the set of articles, look at whether the automated coder assigned the correct type.
First, build lists of ground truth and predicted values per person.
In [23]:
confusion_lists = build_confusion_lists( Reliability_Names.FIELD_NAME_SUFFIX_PERSON_TYPE,
Reliability_Names.PERSON_TYPE_AUTHOR )
ground_truth_list = confusion_lists.get( "ground_truth", None )
predicted_list = confusion_lists.get( "predicted", None )
print( "==> population values count: " + str( len( ground_truth_list ) ) )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
print( "==> percentage agreement = " + str( StatsHelper.percentage_agreement( ground_truth_list, predicted_list ) ) )
In [ ]:
print( "==> population values: " + str( len( ground_truth_list ) ) )
list_name = "ACTUAL_VALUE_LIST"
string_list = map( str, ground_truth_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
list_name = "PREDICTED_VALUE_LIST"
string_list = map( str, predicted_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
In [16]:
confusion_helper = ConfusionMatrixHelper.populate_confusion_matrix( ground_truth_list, predicted_list )
print( str( confusion_helper ) )
For each person detected across the set of articles and classified by ground truth as a subject, look at whether the automated coder assigned the correct person type.
First, build lists of ground truth and predicted values per person.
In [24]:
# subjects = "mentioned"
confusion_lists = build_confusion_lists( Reliability_Names.FIELD_NAME_SUFFIX_PERSON_TYPE,
Reliability_Names.SUBJECT_TYPE_MENTIONED )
ground_truth_list = confusion_lists.get( "ground_truth", None )
predicted_list = confusion_lists.get( "predicted", None )
print( "==> population values count: " + str( len( ground_truth_list ) ) )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
print( "==> percentage agreement = " + str( StatsHelper.percentage_agreement( ground_truth_list, predicted_list ) ) )
In [ ]:
print( "==> population values: " + str( len( ground_truth_list ) ) )
list_name = "ACTUAL_VALUE_LIST"
string_list = map( str, ground_truth_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
list_name = "PREDICTED_VALUE_LIST"
string_list = map( str, predicted_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
In [18]:
confusion_helper = ConfusionMatrixHelper.populate_confusion_matrix( ground_truth_list, predicted_list )
print( str( confusion_helper ) )
For each person detected across the set of articles and classified by ground truth as a source, look at whether the automated coder assigned the correct person type.
First, build lists of ground truth and predicted values per person.
In [25]:
# sources = "quoted"
confusion_lists = build_confusion_lists( Reliability_Names.FIELD_NAME_SUFFIX_PERSON_TYPE,
Reliability_Names.SUBJECT_TYPE_QUOTED )
ground_truth_list = confusion_lists.get( "ground_truth", None )
predicted_list = confusion_lists.get( "predicted", None )
print( "==> population values count: " + str( len( ground_truth_list ) ) )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
print( "==> percentage agreement = " + str( StatsHelper.percentage_agreement( ground_truth_list, predicted_list ) ) )
In [ ]:
print( "==> population values: " + str( len( ground_truth_list ) ) )
list_name = "ACTUAL_VALUE_LIST"
string_list = map( str, ground_truth_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
list_name = "PREDICTED_VALUE_LIST"
string_list = map( str, predicted_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
In [26]:
confusion_helper = ConfusionMatrixHelper.populate_confusion_matrix( ground_truth_list,
predicted_list,
calc_type_IN = ConfusionMatrixHelper.CALC_TYPE_PANDAS_ML )
print( str( confusion_helper ) )
TODO:
Want a way to limit to disagreements where quoted? Might not - this is a start to assessing erroneous agreement. If yes, 1 < coding time < 4 hours.
Reliability_Names.person_type only has three values - "author", "subject", "source" - might need a row-level measure of "has_mention", "has_quote" to more readily capture rows where disagreement is over quoted-or-not.
DONE:
Use lists of population and predicted values to derive confusion matrix counts:
In [7]:
# loop over lists to derive counts
predicted_value = -1
ground_truth_value = -1
ground_truth_positive_count = 0
predicted_positive_count = 0
true_positive_count = 0
false_positive_count = 0
ground_truth_negative_count = 0
predicted_negative_count = 0
true_negative_count = 0
false_negative_count = 0
list_index = -1
for predicted_value in predicted_list:
# increment index and get associated item from ground_truth_list
list_index += 1
ground_truth_value = ground_truth_list[ list_index ]
# add to counts
# ==> ground truth
if ( ground_truth_value == 0 ):
# ground truth negative
ground_truth_negative_count += 1
# not zero - so 1 (or supports other integer values)
else:
# ground truth positive
ground_truth_positive_count += 1
#-- END check to see if positive or negative --#
if ( predicted_value == 0 ):
# predicted negative
predicted_negative_count += 1
# equal to ground_truth?
if ( predicted_value == ground_truth_value ):
# true negative
true_negative_count += 1
else:
# false negative
false_negative_count += 1
#-- END check to see if true or false --#
# not zero - so 1 (or supports other integer values)
else:
# predicted positive
predicted_positive_count += 1
# equal to ground_truth?
if ( predicted_value == ground_truth_value ):
# true positive
true_positive_count += 1
else:
# false positive
false_positive_count += 1
#-- END check to see if true or false --#
#-- END check to see if positive or negative --#
#-- END loop over list items. --#
print( "==> Predicted positives: " + str( predicted_positive_count ) + " ( " + str( ( true_positive_count + false_positive_count ) ) + " )" )
print( "==> Ground truth positives: " + str( ground_truth_positive_count ) + " ( " + str( ( true_positive_count + false_negative_count ) ) + " )" )
print( "==> True positives: " + str( true_positive_count ) )
print( "==> False positives: " + str( false_positive_count ) )
print( "==> Predicted negatives: " + str( predicted_negative_count ) + " ( " + str( ( true_negative_count + false_negative_count ) ) + " )" )
print( "==> Ground truth negatives: " + str( ground_truth_negative_count ) + " ( " + str( ( true_negative_count + false_positive_count ) ) + " )" )
print( "==> True negatives: " + str( true_negative_count ) )
print( "==> False negatives: " + str( false_negative_count ) )
print( "==> Precision (true positive/predicted positive): " + str( ( true_positive_count / predicted_positive_count ) ) )
print( "==> Recall (true positive/ground truth positive): " + str( ( true_positive_count / ground_truth_positive_count ) ) )
In [8]:
# try scikit-learn: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html
sklearn.metrics.precision_recall_fscore_support( ground_truth_list, predicted_list )
Out[8]:
In [ ]:
# scikit-learn confusion matrix
# http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
conf_matrix = sklearn.metrics.confusion_matrix( ground_truth_list, predicted_list )
print( str( conf_matrix ) )
# get counts in variables
true_positive_count = conf_matrix[ 1 ][ 1 ]
false_positive_count = conf_matrix[ 0 ][ 1 ]
true_negative_count = conf_matrix[ 0 ][ 0 ]
false_negative_count = conf_matrix[ 1 ][ 0 ]
# and derive population and predicted counts
ground_truth_positive_count = true_positive_count + false_negative_count
predicted_positive_count = true_positive_count + false_positive_count
ground_truth_negative_count = true_negative_count + false_positive_count
predicted_negative_count = true_negative_count + false_negative_count
print( "==> Predicted positives: " + str( predicted_positive_count ) + " ( " + str( ( true_positive_count + false_positive_count ) ) + " )" )
print( "==> Ground truth positives: " + str( ground_truth_positive_count ) + " ( " + str( ( true_positive_count + false_negative_count ) ) + " )" )
print( "==> True positives: " + str( true_positive_count ) )
print( "==> False positives: " + str( false_positive_count ) )
print( "==> Predicted negatives: " + str( predicted_negative_count ) + " ( " + str( ( true_negative_count + false_negative_count ) ) + " )" )
print( "==> Ground truth negatives: " + str( ground_truth_negative_count ) + " ( " + str( ( true_negative_count + false_positive_count ) ) + " )" )
print( "==> True negatives: " + str( true_negative_count ) )
print( "==> False negatives: " + str( false_negative_count ) )
print( "==> Precision (true positive/predicted positive): " + str( ( true_positive_count / predicted_positive_count ) ) )
print( "==> Recall (true positive/ground truth positive): " + str( ( true_positive_count / ground_truth_positive_count ) ) )
In [ ]:
# pandas
# https://stackoverflow.com/questions/2148543/how-to-write-a-confusion-matrix-in-python
y_actu = pandas.Series( ground_truth_list, name='Actual')
y_pred = pandas.Series( predicted_list, name='Predicted')
df_confusion = pandas.crosstab(y_actu, y_pred)
print( str( df_confusion ) )
# get counts in variables
true_positive_count = df_confusion[ 1 ][ 1 ]
false_positive_count = df_confusion[ 1 ][ 0 ]
true_negative_count = df_confusion[ 0 ][ 0 ]
false_negative_count = df_confusion[ 0 ][ 1 ]
# and derive population and predicted counts
ground_truth_positive_count = true_positive_count + false_negative_count
predicted_positive_count = true_positive_count + false_positive_count
ground_truth_negative_count = true_negative_count + false_positive_count
predicted_negative_count = true_negative_count + false_negative_count
print( "==> Predicted positives: " + str( predicted_positive_count ) + " ( " + str( ( true_positive_count + false_positive_count ) ) + " )" )
print( "==> Ground truth positives: " + str( ground_truth_positive_count ) + " ( " + str( ( true_positive_count + false_negative_count ) ) + " )" )
print( "==> True positives: " + str( true_positive_count ) )
print( "==> False positives: " + str( false_positive_count ) )
print( "==> Predicted negatives: " + str( predicted_negative_count ) + " ( " + str( ( true_negative_count + false_negative_count ) ) + " )" )
print( "==> Ground truth negatives: " + str( ground_truth_negative_count ) + " ( " + str( ( true_negative_count + false_positive_count ) ) + " )" )
print( "==> True negatives: " + str( true_negative_count ) )
print( "==> False negatives: " + str( false_negative_count ) )
print( "==> Precision (true positive/predicted positive): " + str( ( true_positive_count / predicted_positive_count ) ) )
print( "==> Recall (true positive/ground truth positive): " + str( ( true_positive_count / ground_truth_positive_count ) ) )
Use confusion matrix to derive related metrics:
In [9]:
# build up confusion outputs from confusion matrix.
# assume we have the following set one way or another above.
#ground_truth_positive_count = 0
#predicted_positive_count = 0
#ground_truth_negative_count = 0
#predicted_negative_count = 0
#true_positive_count = 0
#false_positive_count = 0
#true_negative_count = 0
#false_negative_count = 0
# initialize output dictionary, then add base measures to confusion_outputs.
confusion_outputs = {}
confusion_outputs[ "population_positive" ] = ground_truth_positive_count
confusion_outputs[ "predicted_positive" ] = predicted_positive_count
confusion_outputs[ "population_negative" ] = ground_truth_negative_count
confusion_outputs[ "predicted_negative" ] = predicted_negative_count
confusion_outputs[ "true_positive" ] = true_positive_count
confusion_outputs[ "false_positive" ] = false_positive_count
confusion_outputs[ "true_negative" ] = true_negative_count
confusion_outputs[ "false_negative" ] = false_negative_count
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [10]:
# declare variables
precision = None
# ==> Positive predictive value (PPV), Precision
try:
precision = ( true_positive_count / predicted_positive_count )
except:
# error - None
precision = None
#-- END check to see if Exception. --#
confusion_outputs[ "precision" ] = precision
confusion_outputs[ "PPV" ] = precision
print( "==> Positive predictive value (PPV), Precision = " + str( precision ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [11]:
# declare variables
recall = None
# ==> True positive rate (TPR), Recall, Sensitivity, probability of detection
try:
recall = ( true_positive_count / ground_truth_positive_count )
except:
# error - None
recall = None
#-- END check to see if Exception. --#
confusion_outputs[ "recall" ] = recall
confusion_outputs[ "TPR" ] = recall
print( "==> True positive rate (TPR), Recall = " + str( recall ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [12]:
# declare variables
false_negative_rate = None
# ==> False negative rate (FNR), Miss rate
try:
false_negative_rate = ( false_negative_count / ground_truth_positive_count )
except:
# error - None
false_negative_rate = None
#-- END check to see if Exception. --#
confusion_outputs[ "false_negative_rate" ] = false_negative_rate
confusion_outputs[ "FNR" ] = false_negative_rate
print( "==> false negative rate (FNR) = " + str( false_negative_rate ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [13]:
# declare variables
false_positive_rate = None
# ==> False positive rate (FPR), Fall-out
try:
false_positive_rate = ( false_positive_count / ground_truth_negative_count )
except:
# error - None
false_positive_rate = None
#-- END check to see if Exception. --#
confusion_outputs[ "false_positive_rate" ] = false_positive_rate
confusion_outputs[ "FPR" ] = false_positive_rate
print( "==> False positive rate (FPR), Fall-out = " + str( false_positive_rate ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [14]:
# declare variables
true_negative_rate = None
# ==> True negative rate (TNR), Specificity (SPC)
try:
true_negative_rate = ( true_negative_count / ground_truth_negative_count )
except:
# error - None
true_negative_rate = None
#-- END check to see if Exception. --#
confusion_outputs[ "true_negative_rate" ] = true_negative_rate
confusion_outputs[ "TNR" ] = true_negative_rate
confusion_outputs[ "specificity" ] = true_negative_rate
confusion_outputs[ "SPC" ] = true_negative_rate
print( "==> True negative rate (TNR), Specificity (SPC) = " + str( true_negative_rate ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [15]:
# declare variables
false_omission_rate = None
# ==> False omission rate (FOR) = Σ False negative/Σ Predicted condition negative
try:
false_omission_rate = ( false_negative_count / predicted_negative_count )
except:
# error - None
false_omission_rate = None
#-- END check to see if Exception. --#
confusion_outputs[ "false_omission_rate" ] = false_omission_rate
confusion_outputs[ "FOR" ] = false_omission_rate
print( "==> False omission rate (FOR) = " + str( false_omission_rate ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [16]:
# declare variables
positive_likelihood_ratio = None
tpr = None
fpr = None
# ==> Positive likelihood ratio (LR+) = TPR/FPR
tpr = confusion_outputs.get( "TPR", None )
fpr = confusion_outputs.get( "FPR", None )
try:
positive_likelihood_ratio = ( tpr / fpr )
except:
# error - None
positive_likelihood_ratio = None
#-- END check to see if Exception. --#
confusion_outputs[ "positive_likelihood_ratio" ] = positive_likelihood_ratio
confusion_outputs[ "LR+" ] = positive_likelihood_ratio
print( "==> Positive likelihood ratio (LR+) = " + str( positive_likelihood_ratio ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [17]:
# declare variables
negative_likelihood_ratio = None
fnr = None
tnr = None
# ==> Negative likelihood ratio (LR-) = FNR/TNR
fnr = confusion_outputs.get( "FNR", None )
tnr = confusion_outputs.get( "TNR", None )
try:
negative_likelihood_ratio = ( fnr / tnr )
except:
# error - None
negative_likelihood_ratio = None
#-- END check to see if Exception. --#
confusion_outputs[ "negative_likelihood_ratio" ] = negative_likelihood_ratio
confusion_outputs[ "LR-" ] = negative_likelihood_ratio
print( "==> Negative likelihood ratio (LR-) = " + str( negative_likelihood_ratio ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [18]:
# declare variables
accuracy = None
total_population = None
# ==> Accuracy (ACC) = Σ True positive + Σ True negative/Σ Total population
total_population = true_positive_count + true_negative_count + false_positive_count + false_negative_count
try:
accuracy = ( ( true_positive_count + true_negative_count ) / total_population )
except:
# error - None
accuracy = None
#-- END check to see if Exception. --#
confusion_outputs[ "accuracy" ] = accuracy
confusion_outputs[ "ACC" ] = accuracy
confusion_outputs[ "total_population" ] = total_population
print( "==> Accuracy (ACC) = " + str( accuracy ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [19]:
# declare variables
false_discovery_rate = None
# ==> False discovery rate (FDR), probability of false alarm = Σ False positive/Σ Predicted condition positive
try:
false_discovery_rate = ( false_positive_count / predicted_positive_count )
except:
# error - None
false_discovery_rate = None
#-- END check to see if Exception. --#
confusion_outputs[ "false_discovery_rate" ] = false_discovery_rate
confusion_outputs[ "FDR" ] = false_discovery_rate
print( "==> False discovery rate (FDR), probability of false alarm = " + str( false_discovery_rate ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [20]:
# declare variables
negative_predictive_value = None
# ==> Negative predictive value (NPV) = Σ True negative/Σ Predicted condition negative
try:
negative_predictive_value = ( true_negative_count / predicted_negative_count )
except:
# error - None
negative_predictive_value = None
#-- END check to see if Exception. --#
confusion_outputs[ "negative_predictive_value" ] = negative_predictive_value
confusion_outputs[ "NPV" ] = negative_predictive_value
print( "==> Negative predictive value (NPV) = " + str( negative_predictive_value ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [21]:
# declare variables
diagnostic_odds_ratio = None
lr_plus = None
lr_minus = None
# ==> Diagnostic odds ratio (DOR) = LR+/LR−
lr_plus = confusion_outputs.get( "LR+", None )
lr_minus = confusion_outputs.get( "LR-", None )
try:
diagnostic_odds_ratio = ( lr_plus / lr_minus )
except:
# error - None
diagnostic_odds_ratio = None
#-- END check to see if Exception. --#
confusion_outputs[ "diagnostic_odds_ratio" ] = diagnostic_odds_ratio
confusion_outputs[ "DOR" ] = diagnostic_odds_ratio
print( "==> Diagnostic odds ratio (DOR) = " + str( diagnostic_odds_ratio ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [22]:
# declare variables
f1_score = None
recall = None
precision = None
# ==> F1 score = 2 / ( ( 1 / Recall ) + ( 1 / Precision ) )
recall = confusion_outputs.get( "recall", None )
precision = confusion_outputs.get( "precision", None )
try:
f1_score = ( 2 / ( ( 1 / recall ) + ( 1 / precision ) ) )
except:
# error - None
f1_score = None
#-- END check to see if Exception. --#
confusion_outputs[ "f1_score" ] = f1_score
print( "==> F1 score = " + str( f1_score ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [23]:
# declare variables
matthews_correlation_coefficient = None
numerator = None
temp_math = None
denominator = None
# ==> Matthews correlation coefficient (MCC) = ( ( T P × T N ) − ( F P × F N ) ) / sqrt( ( T P + F P ) * ( T P + F N ) * ( T N + F P ) * ( T N + F N ) )
numerator = ( ( true_positive_count * true_negative_count ) - ( false_positive_count * false_negative_count ) )
temp_math = ( ( true_positive_count + false_positive_count ) * ( true_positive_count + false_negative_count ) * ( true_negative_count + false_positive_count ) * ( true_negative_count + false_negative_count ) )
denominator = math.sqrt( temp_math )
try:
matthews_correlation_coefficient = numerator / denominator
except:
# error - None
matthews_correlation_coefficient = None
#-- END check to see if Exception. --#
confusion_outputs[ "matthews_correlation_coefficient" ] = matthews_correlation_coefficient
confusion_outputs[ "MCC" ] = matthews_correlation_coefficient
print( "==> Matthews correlation coefficient (MCC) = " + str( matthews_correlation_coefficient ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [24]:
# declare variables
informedness = None
tpr = None
tnr = None
# ==> Informedness or Bookmaker Informedness (BM) = TPR + TNR − 1
tpr = confusion_outputs.get( "TPR", None )
tnr = confusion_outputs.get( "TNR", None )
try:
informedness = tpr + tnr - 1
except:
# error - None
informedness = None
#-- END check to see if Exception. --#
confusion_outputs[ "informedness" ] = informedness
confusion_outputs[ "BM" ] = informedness
print( "==> Informedness or Bookmaker Informedness (BM) = " + str( informedness ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs )
In [25]:
# declare variables
markedness = None
ppv = None
npv = None
# ==> Markedness (MK) = PPV + NPV − 1
ppv = confusion_outputs.get( "PPV", None )
npv = confusion_outputs.get( "NPV", None )
try:
markedness = ppv + npv - 1
except:
# error - None
markedness = None
#-- END check to see if Exception. --#
confusion_outputs[ "markedness" ] = markedness
confusion_outputs[ "MK" ] = markedness
print( "==> Markedness (MK) = " + str( markedness ) )
print( "==> Confusion outputs:" )
DictHelper.print_dict( confusion_outputs,
prefix_IN = "EXPECTED_OUTPUT_MAP[ \"",
separator_IN = "\" ] = ",
suffix_IN = None )
In [27]:
print( "==> population values: " + str( len( ground_truth_list ) ) )
list_name = "ACTUAL_VALUE_LIST"
string_list = map( str, ground_truth_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
print( "==> predicted values count: " + str( len( predicted_list ) ) )
list_name = "PREDICTED_VALUE_LIST"
string_list = map( str, predicted_list )
list_values = ", ".join( string_list )
print( list_name + " = [ " + list_values + " ]" )
In [26]:
confusion_helper = ConfusionMatrixHelper.populate_confusion_matrix( ground_truth_list, predicted_list )
print( str( confusion_helper ) )