In this challenge, we practice calculating correlation. Check out the Resources tab to learn more!
You are provided the popularity scores for a set of juices (the higher, the better): [10, 9.8, 8, 7.8, 7.7, 7, 6, 5, 4, 2]
These are the respective prices for the juices: [200, 44, 32, 24, 22, 17, 15, 12, 8, 4]
Write a program to compute (or calculate by hand) the Pearson correlation coefficient and Spearman's rank correlation coefficient between these values.
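For the "calculate by hand" route, here is a minimal sketch in plain Python, independent of the scipy cells that follow. The helper names `pearson`, `ranks` and `spearman` are made up for this illustration. The ranking helper assumes there are no tied values, which holds for this data; with ties, average ranks would be needed instead.

```python
from math import sqrt

popularity = [10, 9.8, 8, 7.8, 7.7, 7, 6, 5, 4, 2]
price = [200, 44, 32, 24, 22, 17, 15, 12, 8, 4]


def pearson(xs, ys):
    """Pearson r: co-deviation sum divided by the root of the product of squared-deviation sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank each value from 1 (smallest) to n. Assumes no ties (hypothetical helper)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman rho computed as the Pearson r of the two rank lists."""
    return pearson(ranks(xs), ranks(ys))


pearson_r = pearson(popularity, price)
spearman_rho = spearman(popularity, price)
print(round(pearson_r, 2), round(spearman_rho, 2))
```

Both lists decrease together item by item, so their ranks match exactly and the Spearman coefficient is 1. The Pearson coefficient is noticeably lower (about 0.61) because the single price of 200 makes the relationship far from linear.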
In [5]:
# Import libraries
import scipy
from scipy import stats
# Data
arr_popularity = [10, 9.8, 8, 7.8, 7.7, 7, 6, 5, 4, 2]
arr_price = [200, 44, 32, 24, 22, 17, 15, 12, 8, 4]
In [8]:
scipy.stats.pearsonr(arr_popularity, arr_price)
Out[8]:
In [9]:
scipy.stats.spearmanr(arr_popularity, arr_price)
Out[9]:
In this challenge, we practice using linear regression techniques. Check out the Resources tab to learn more!
You are given the Math aptitude test (x) scores for a set of students, as well as their respective scores for a Statistics course (y). The students enrolled in Statistics immediately after taking the math aptitude test.
The scores (x, y) for each student are:
(95,85)
(85,95)
(80,70)
(70,65)
(60,70)
If a student scored an 80 on the Math aptitude test, what score would we expect her to achieve in Statistics?
Determine the equation of the best-fit line using the least squares method, and then compute the value of y when x=80.
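The least squares method named above has a closed form for a single predictor: the slope is the sum of co-deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means. A minimal sketch in plain Python, with variable names chosen only for this illustration:

```python
# (math aptitude score, statistics score) for each student, from the problem statement
points = [(95, 85), (85, 95), (80, 70), (70, 65), (60, 70)]

xs = [x for x, _ in points]
ys = [y for _, y in points]
n = len(points)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx

# The best-fit line passes through (mean_x, mean_y)
intercept = mean_y - slope * mean_x

# Expected statistics score for a math aptitude score of 80
predicted = slope * 80 + intercept
print(round(slope, 4), round(intercept, 4), round(predicted, 1))
```

Here the means are 78 and 77, the co-deviation sum is 470 and the squared-deviation sum is 730, so the slope is 470/730 (about 0.644) and the prediction at x = 80 rounds to 78.3.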
In [60]:
# Import libraries (linregress lives in scipy.stats)
from scipy import stats
In [54]:
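# (x, y) = (math aptitude score, statistics score) for each student
arr_data = [(95, 85), (85, 95), (80, 70), (70, 65), (60, 70)]
arr_x = [x for x, _ in arr_data]
arr_y = [y for _, y in arr_data]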
In [57]:
stats.linregress(arr_x, arr_y)
Out[57]:
In [58]:
m, c, r_val, p_val, err = stats.linregress(arr_x, arr_y)
In [61]:
# y = m*x + c, evaluated at x = 80
m*80 + c
Out[61]:
Answer: 78.3