Assignment 1

Abhijith Ravikumar Puthussery

Computing regression parameters using
1) Closed Form (Using linear algebra) 2) Gradient Descent
Consider the following 5 point synthetic data set:
X: 0,1,2,3,4
Y: 1,3,7,13,21

In [1]:
import pandas as pd
import numpy as np
from sklearn import datasets, linear_model
import matplotlib.pyplot as plt
%matplotlib inline

Defining the Data


In [2]:
data = {'X' : [0,1,2,3,4],'Y' : [1,3,7,13,21]}
data_frame = pd.DataFrame(data)
print(data_frame)


   X   Y
0  0   1
1  1   3
2  2   7
3  3  13
4  4  21

[5 rows x 2 columns]

In [3]:
x1 = data_frame.X[:, np.newaxis]
y1 = data_frame.Y[:, np.newaxis]

Closed Form Linear Regression


In [4]:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x1,y1)


Out[4]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

Printing the variables


In [5]:
print('Coefficients =', lr.coef_)
print('Residual sum of squares = %0.2f' % np.mean((lr.predict(x1)-y1) ** 2))
print('Variance = %0.2f' % lr.score(x1,y1))


Coefficients = [[ 5.]]
Residual sum of squares = 2.80
Variance = 0.95

Closed form regression line


In [6]:
plt.scatter(x1,y1,color='red')
plt.plot(x1,lr.predict(x1),color='blue')
plt.xlabel('X')
plt.ylabel('Y')


Out[6]:
<matplotlib.text.Text at 0x7f90449b91d0>
/usr/lib/python3/dist-packages/matplotlib/collections.py:549: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if self._edgecolors == 'face':

Gradient Descent Function


In [7]:
def gradient(x1, y1, theta, itr, alpha):
    theta_i = []
    cost_i = []
    k = float(x1.shape[0])
    for i in range(1,itr):
        theta = theta - (alpha/k)*x1.T*(x1*theta-y1)
        theta_i.append(theta)
        cost_i.append(costfunc(x1,y1,theta))
        return (theta, theta_i, cost_i)

#Cost Function
def costfunc(x1, y1, theta):
    k = float(x1.shape[0])
    cost = (1./(2*k))*(x1*theta-y1).T*(x1*theta-y1)
    return cost.flat[0]

In [8]:
theta = np.matrix([[0]]) #Theta
alpha = 0.2 #Learning Rate
itr = 100 #Number of Iterations

theta, theta_i, cost_i = gradient(x1, y1, theta, itr, alpha)
result = x1*theta
plt.plot(x1, result)
plt.scatter(x1, y1, color='black')
plt.xlabel('X')
plt.ylabel('Y')


Out[8]:
<matplotlib.text.Text at 0x7f90427d8400>
/usr/lib/python3/dist-packages/matplotlib/collections.py:549: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if self._edgecolors == 'face':

In [ ]: