Day 7: Pearson Correlation Coefficient I

https://www.hackerrank.com/challenges/s10-pearson-correlation-coefficient

Objective

In this challenge, we practice calculating the Pearson correlation coefficient.

Task

Given two n-element data sets, X and Y, calculate the value of the Pearson correlation coefficient.

Input Format

The first line contains an integer, n, denoting the size of data sets X and Y.

  • The second line contains n space-separated real numbers (scaled to at most one decimal place), defining data set X.
  • The third line contains n space-separated real numbers (scaled to at most one decimal place), defining data set Y.

Output Format

Print the value of the Pearson correlation coefficient, rounded to a scale of 3 decimal places.
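
As a small illustration of the output format, the sketch below formats a coefficient to three decimal places. The value of r is the one the notebook reports for the sample data; the variable names are only illustrative.

```python
# Coefficient reported by the notebook for the sample data.
r = 0.6124721937208478

# A format spec always emits exactly three decimals,
# whereas round(0.5, 3) would print as 0.5 rather than 0.500.
formatted = f"{r:.3f}"
print(formatted)  # prints 0.612
```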

Sample Input

10
10 9.8 8 7.8 7.7 7 6 5 4 2 
200 44 32 24 22 17 15 12 8 4

Sample Output

0.612
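
For reference, the coefficient asked for in the Task can be written in terms of the population covariance and standard deviations, which is the form the variance and covariance helpers in this notebook follow (both divide by n). The symbols \mu and \sigma are notation introduced here, not part of the problem statement.

```latex
\rho_{X,Y}
  = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}
  = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_X)(y_i - \mu_Y)}
         {\sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_X)^2}\;
          \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \mu_Y)^2}}
```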

In [44]:
import math

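def input_floats():
    # split() with no argument tolerates trailing or repeated spaces,
    # such as the trailing space in the sample's second line
    return [float(i) for i in input().split()]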

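def variance(x, m):
    # Population variance of x around its mean m (divides by n, not n - 1)
    return sum((i - m) ** 2 for i in x) / len(x)

def covariance(x, y, xm, ym):
    # Population covariance of x and y around their means xm and ym
    return sum((i - xm) * (j - ym) for i, j in zip(x, y)) / len(x)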

In [45]:
# n = int(input())  # the first input line holds the size and must be read first
# x = input_floats()
# y = input_floats()
x = [10, 9.8, 8, 7.8, 7.7, 7, 6, 5, 4, 2]
y = [200, 44, 32, 24, 22, 17, 15, 12, 8, 4]
N = len(x)

x_mean = sum(x) / N
y_mean = sum(y) / N

x_var = variance(x, x_mean)
y_var = variance(y, y_mean)

covxy = covariance(x, y, x_mean, y_mean)

# Pearson coefficient: covariance divided by the product of the standard deviations
pearson = covxy / (math.sqrt(x_var) * math.sqrt(y_var))

In [46]:
x_mean, x_var, math.sqrt(x_var)


Out[46]:
(6.730000000000001, 5.724100000000002, 2.392509143138225)

In [47]:
y_mean, y_var, math.sqrt(y_var)


Out[47]:
(37.8, 3046.959999999999, 55.199275357562435)

In [49]:
covxy


Out[49]:
80.886

In [48]:
pearson


Out[48]:
0.6124721937208478
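
The cells above can be condensed into one end-to-end sketch that parses the three input lines and produces the formatted answer. The names pearson and solve are hypothetical helpers introduced here, and the input is passed as a string rather than read from stdin so the sketch runs anywhere.

```python
import math

def pearson(x, y):
    """Population Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the variances cancel,
    # so sums of deviations are enough.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def solve(text):
    # Hypothetical helper: parse "n / X / Y" input and format the result.
    lines = text.strip().splitlines()
    n = int(lines[0])
    x = [float(t) for t in lines[1].split()]  # split() ignores the trailing space
    y = [float(t) for t in lines[2].split()]
    assert len(x) == n and len(y) == n
    return f"{pearson(x, y):.3f}"

sample = "10\n10 9.8 8 7.8 7.7 7 6 5 4 2 \n200 44 32 24 22 17 15 12 8 4\n"
result = solve(sample)
print(result)  # prints 0.612
```

Working with sums of deviations avoids computing the two square roots separately, and the result agrees with the value in Out[48] up to floating-point rounding.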