Linear Regression - Part 2

In this tutorial, we shall see where linear regression reaches its limitations.

Imports


In [ ]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

%matplotlib inline

Data


In [ ]:
n_samples = 30

# True underlying function: a non-linear cosine curve
true_fun = lambda X: np.cos(1.5 * np.pi * X)
X = np.sort(np.random.rand(n_samples)).reshape(-1, 1)
# Per-sample Gaussian noise; randn(n_samples, 1) matches the shape of X,
# whereas a bare randn() would add the same single offset to every sample
Y = true_fun(X) + np.random.randn(n_samples, 1) * 0.1
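
A quick shape check confirms that each sample in X has a matching noisy response in Y:

In [ ]:
X.shape, Y.shape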

In [ ]:
X[:5]

In [ ]:
Y[:5]

Modelling


In [ ]:
regr = linear_model.LinearRegression()
regr.fit(X, Y)
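
Since the model is a straight line, its learned slope and intercept can be inspected directly via the standard LinearRegression attributes:

In [ ]:
# Slope (coefficient) and intercept of the fitted line
regr.coef_, regr.intercept_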

Evaluation


In [ ]:
from sklearn.metrics import mean_squared_error
mean_squared_error(Y, regr.predict(X))
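
To put this error in context, here is a minimal sketch comparing against a trivial baseline that always predicts the mean of Y, using sklearn's DummyRegressor:

In [ ]:
from sklearn.dummy import DummyRegressor

# A constant-mean predictor; a useful model should beat this MSE
baseline = DummyRegressor(strategy='mean').fit(X, Y)
mean_squared_error(Y, baseline.predict(X))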

Visualisation


In [ ]:
pred_Y = regr.predict(X)

plt.figure(figsize=(15, 5))
plt.scatter(X, Y, color='b', alpha=0.4, label='Actual data')
plt.plot(X, pred_Y, color='r', alpha=0.4, label='Predicted line')
plt.legend()
plt.xlabel('X - Input values')
plt.ylabel('Y - Response values')

Notes:

  • Linear models are great, but they have their limitations.
    • Example: as above, they cannot describe non-linear complexity well.
  • Polynomial models generally do a great job at fitting complex relationships, but they too have their limits (see the sketch below).
    • Fitting can take more time.
    • The number of features explodes as the number of input dimensions increases.
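
As a minimal sketch of the polynomial alternative, the same data can be fitted with sklearn's PolynomialFeatures combined with LinearRegression; the choice of degree=4 here is an assumption for illustration:

In [ ]:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Expand X into powers up to degree 4 (an assumed degree for illustration),
# then fit an ordinary linear model on the expanded features
poly_regr = make_pipeline(PolynomialFeatures(degree=4), linear_model.LinearRegression())
poly_regr.fit(X, Y)
mean_squared_error(Y, poly_regr.predict(X))

The feature explosion noted above can also be seen directly: for 10 input dimensions, a degree-3 expansion already produces 286 features:

In [ ]:
# The number of polynomial features grows combinatorially with input
# dimension: for 10 inputs and degree 3 there are C(13, 3) = 286 columns
PolynomialFeatures(degree=3).fit_transform(np.zeros((1, 10))).shape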