Coefficient of Determination (R2)


In [3]:
%matplotlib inline

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn
seaborn.set_style('whitegrid')

In [2]:
def r2(actual, predicted):
    # accept plain lists as well as numpy arrays
    if isinstance(actual, list):
        actual = np.array(actual)
    if isinstance(predicted, list):
        predicted = np.array(predicted)

    mean = actual.mean()

    # plot predicted vs. actual, the perfect-fit line, and the mean baseline
    plt.scatter(actual, predicted)
    plt.plot(actual, actual, 'r', alpha=0.5)
    plt.scatter(actual, actual, facecolors='none', edgecolor='r', linestyle='-')
    plt.axhline(y=mean, ls='dashed')

    # R2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_total = ((actual - mean) ** 2).sum()
    ss_residual = ((actual - predicted) ** 2).sum()
    return 1 - ss_residual / ss_total
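
As a sanity check, scikit-learn's r2_score implements the same 1 - ss_residual / ss_total definition (minus the plotting), assuming scikit-learn is installed; it isn't imported anywhere else in this notebook:

from sklearn.metrics import r2_score  # assumes scikit-learn is available
r2_score([1, 2, 3, 4], [1, 2, 3, 5])  # 0.8, the same value r2([1, 2, 3, 4], [1, 2, 3, 5]) returns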

We'll get a perfect R2 = 1 if the actual and predicted values are identical: ss_residual is 0, so R2 = 1 - 0 / ss_total = 1.


In [4]:
actual = list(range(10))
actual


Out[4]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [80]:
predicted = list(range(10))
r2(actual, predicted)


Out[80]:
1.0

Now if our predictions are slightly off, R2 will still be fairly high, but no longer a perfect 1.


In [5]:
predicted = list(range(8)) + [9, 10]
predicted


Out[5]:
[0, 1, 2, 3, 4, 5, 6, 7, 9, 10]

In [81]:
r2(actual, predicted)


Out[81]:
0.97575757575757571

Intuitively it makes sense that R2 gets worse as the predictions drift farther from the actual values.


In [6]:
predicted = list(range(8)) + [13, 15]
predicted


Out[6]:
[0, 1, 2, 3, 4, 5, 6, 7, 13, 15]

In [82]:
r2(actual, predicted)


Out[82]:
0.26060606060606062

Now here's the interesting part: R2 can be negative.


In [7]:
predicted = list(range(8)) + [100, 150]
predicted


Out[7]:
[0, 1, 2, 3, 4, 5, 6, 7, 100, 150]

This is because R2 is measured relative to a hypothetical baseline model that always predicts the mean of the actual values. We obviously don't have access to that mean at prediction time, but we do at evaluation time.

So looking at the end of the r2 function in In [2] you can see that:

mean = actual.mean()
ss_total = ((actual - mean) ** 2).sum()

and then R2 is:

ss_residual = ((actual - predicted) ** 2).sum()
R2 = 1 - ss_residual / ss_total

So if our predictions are bad enough, we do worse than this hypothetical mean-only model and R2 drops below zero.
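
To see how lopsided the two sums are in this example, the same arithmetic can be unrolled outside the function (the *_arr names are just local to this check, reusing the actual and predicted lists defined above):

actual_arr = np.array(actual)
predicted_arr = np.array(predicted)
ss_total = ((actual_arr - actual_arr.mean()) ** 2).sum()   # 82.5
ss_residual = ((actual_arr - predicted_arr) ** 2).sum()    # 92**2 + 141**2 = 28345
1 - ss_residual / ss_total                                 # roughly -342.58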


In [83]:
r2(actual, predicted)


Out[83]:
-342.57575757575756

Special cases


In [84]:
actual = [1] * 10

In [85]:
r2(actual, [1] * 10)


/Users/amir.ziai/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:15: RuntimeWarning: invalid value encountered in true_divide
Out[85]:
nan

In [86]:
r2(actual, [-100] * 10)


/Users/amir.ziai/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:15: RuntimeWarning: divide by zero encountered in true_divide
Out[86]:
-inf
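
Both warnings come from dividing by ss_total = 0: when every actual value is identical there is no variance to explain, so 0/0 gives nan and a positive ss_residual over zero gives -inf. A defensive variant could handle that case explicitly instead of leaning on NumPy's warning; a minimal sketch (the r2_safe name is mine, not part of the notebook):

def r2_safe(actual, predicted):
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_total = ((actual - actual.mean()) ** 2).sum()
    ss_residual = ((actual - predicted) ** 2).sum()
    if ss_total == 0:
        # no variance in the actuals: R2 is undefined, so return nan explicitly
        return float('nan')
    return 1 - ss_residual / ss_total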

In [87]:
actual = list(range(10))

In [88]:
r2(actual, [1] * 10)


Out[88]:
-1.4848484848484849

Predicting the mean of the actual values (4.5 here) every time gives us R2 = 0.


In [89]:
r2(actual, [4.5] * 10)


Out[89]:
0.0
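
That 0.0 falls straight out of the definition: when every prediction equals the mean (4.5), ss_residual is the same sum as ss_total, so the ratio is 1 and R2 = 1 - 1 = 0. Spelled out with the same numbers as a quick check:

actual_arr = np.array(actual)                              # 0 through 9, mean 4.5
ss_total = ((actual_arr - actual_arr.mean()) ** 2).sum()   # 82.5
ss_residual = ((actual_arr - 4.5) ** 2).sum()              # also 82.5
1 - ss_residual / ss_total                                 # 0.0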