CSAL4243: Introduction to Machine Learning

Muhammad Mudassir Khan (mudasssir.khan@ucp.edu.pk)

Lecture 2: Linear Regression

Overview



What is Machine Learning?

  • Machine Learning is making computers/machines learn from data.
  • Learning improves over time with more data.

Definition

Mitchell (1997) defines Machine Learning as: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Example: playing checkers.

T = the task of playing checkers.

E = the experience of playing many games of checkers

P = the probability that the program will win the next game.



The three different types of machine learning

<img style="float: left;" src="images/01_01.png", width=500>



Supervised Learning

<img style="float: left;" src="images/01_02.png", width=500>



Regression for predicting continuous outcomes

<img style="float: left;" src="images/01_04.png", width=300> <img style="float: right;" src="images/01_11.png", width=500>

Classification for predicting class labels

<img style="float: left;" src="images/01_03.png", width=300> <img style="float: right;" src="images/01_12.png", width=500>



Unsupervised Learning



Reinforcement Learning

<img style="float: left;" src="images/01_05.png", width=300>



Machine Learning pipeline

  • x denotes the input variables, also called input features.

  • y denotes the output or target variable, sometimes also known as the label.

  • h is called the hypothesis or model.

  • A pair ($x^{(i)}$, $y^{(i)}$) is called a sample or training example.

  • The dataset of all training examples is called the training set.

  • m is the number of samples in a dataset.

  • n is the number of features in a dataset, excluding the label.
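
As a minimal sketch of this notation, assume a tiny made-up dataset with a single feature. The values below are hypothetical and are not taken from the lecture's figures. Python indexes from 0, so the i-th training example sits at index i - 1.

```python
# Hypothetical toy training set: each row is one sample (size, price).
training_set = [
    (1000, 200000),
    (1500, 280000),
    (2000, 350000),
]

# Split every sample into its input features x and its target y.
x = [[size] for size, _ in training_set]   # one feature per sample
y = [price for _, price in training_set]

m = len(x)        # number of samples
n = len(x[0])     # number of features, excluding the label

# The i-th training example (1-based in the notes) sits at index i - 1.
i = 3
sample_i = (x[i - 1], y[i - 1])

print(m, n, sample_i)
```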

<img style="float: left;" src="images/02_02.png", width=400> <img style="float: right;" src="images/02_03.png", width=400>

Question

  • What are $x^{(2)}$ and $y^{(2)}$?



Goal of Machine Learning algorithm

  • The goal is for the algorithm to perform well on unseen data, not just on the training set.
  • This ability is called generalization.
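
One common way to estimate generalization is to hold some samples back and measure the error on them separately. The sketch below assumes made-up data and a hypothetical, already-trained model `h`; it only illustrates the idea of comparing error on seen and unseen samples.

```python
# Hypothetical (x, y) samples; the values are made up for illustration.
samples = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.1), (6, 12.9)]

# Keep the last two samples aside as "unseen" data.
train, test = samples[:4], samples[4:]

def h(x, theta0=1.0, theta1=2.0):
    """A hypothetical linear model, assumed to have been trained already."""
    return theta0 + theta1 * x

def mean_squared_error(data):
    """Average squared difference between predictions and targets."""
    return sum((h(x) - y) ** 2 for x, y in data) / len(data)

train_error = mean_squared_error(train)   # error on data the model has seen
test_error = mean_squared_error(test)     # error on unseen data

print(train_error, test_error)
```

A model generalizes well when the error on the held-out samples stays close to the error on the training samples.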



Linear Regression with one variable

Model Representation

  • The model is represented by $h_\theta(x)$ or simply $h(x)$.

  • For linear regression with one input variable, $h(x) = \theta_0 + \theta_1 x$.

  • $\theta_0$ and $\theta_1$ are called weights or parameters.
  • We need to find the $\theta_0$ and $\theta_1$ that maximize the performance of the model.
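
The hypothesis above can be written as a one-line function. This is a minimal sketch; the parameter values in the call are hypothetical and chosen only for illustration.

```python
def h(x, theta0, theta1):
    """Hypothesis for linear regression with one input variable."""
    return theta0 + theta1 * x

# Hypothetical parameter values: intercept 1.0 and slope 0.5.
prediction = h(2.0, theta0=1.0, theta1=0.5)   # 1.0 + 0.5 * 2.0
print(prediction)
```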

Question

<img style="float: left;" src="images/02_15.jpg", width=600>




Cost Function

<img style="float: left;" src="images/02_14.png", width=700>

Let $\hat{y} = h(x) = \theta_0 + \theta_1 x$

Error in a single sample $(x, y)$ = $\hat{y} - y = h(x) - y$

Cumulative squared error over all m samples = $\sum_{i=1}^{m} (h(x^{(i)}) - y^{(i)})^2$

Finally, the cost function (half the mean squared error) = $J(\theta) = \frac{1}{2m}\sum_{i=1}^{m} (h(x^{(i)}) - y^{(i)})^2$
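
The cost function translates almost directly into code. The following is a minimal sketch on hypothetical toy data; the function name `cost` and the data values are assumptions for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta) = 1/(2m) * sum of squared errors over all m samples."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + theta1 * x) - y   # h(x) - y for one sample
        total += error ** 2
    return total / (2 * m)

# Hypothetical toy data lying exactly on the line y = x.
xs = [1, 2, 3]
ys = [1, 2, 3]

perfect_fit = cost(0.0, 1.0, xs, ys)   # the line y = x passes through every point
worse_fit = cost(0.0, 0.5, xs, ys)     # a flatter line misses every point
print(perfect_fit, worse_fit)
```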



Simple case when $\theta_0$ = 0

<img style="float: center;" src="images/02_06.png", width=700>
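
With $\theta_0$ fixed at 0, the cost depends on $\theta_1$ alone, so it can be tabulated for a handful of candidate slopes. The sketch below uses hypothetical toy data on the line y = x; it is meant to show the bowl shape of $J(\theta_1)$, not to reproduce the figure exactly.

```python
# Hypothetical toy data lying exactly on the line y = x.
xs = [1, 2, 3]
ys = [1, 2, 3]
m = len(xs)

def cost_theta1(theta1):
    """J(theta1) with theta0 fixed at 0, so h(x) = theta1 * x."""
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Evaluate the cost for a few candidate slopes.
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
costs = {t: cost_theta1(t) for t in candidates}
for t, j in costs.items():
    print(f"theta1 = {t:.1f}  J = {j:.3f}")

# The candidate with the smallest cost is the best slope among those tried.
best_theta1 = min(costs, key=costs.get)
```

The cost falls to zero at the slope that fits the data exactly and rises symmetrically on either side of it.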


Question

<img style="float: center;" src="images/02_15.png", width=700>


<img style="float: center;" src="images/02_07.png", width=700>

<img style="float: center;" src="images/02_08.png", width=700>

<img style="float: center;" src="images/02_09.png", width=700>

When both $\theta_0$ and $\theta_1$ can vary


<img style="float: center;" src="images/02_10.png", width=700>

<img style="float: center;" src="images/02_11.png", width=700>

<img style="float: center;" src="images/02_12.png", width=700>

<img style="float: center;" src="images/02_13.png", width=700>
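
When both parameters vary, the cost becomes a surface over ($\theta_0$, $\theta_1$) pairs. As a rough sketch, the surface can be sampled on a coarse grid; the toy data and the grid spacing below are assumptions made for illustration.

```python
# Hypothetical toy data generated from y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
m = len(xs)

def cost(theta0, theta1):
    """J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Evaluate J on a coarse grid of (theta0, theta1) pairs.
grid = [i * 0.5 for i in range(7)]   # 0.0, 0.5, ..., 3.0
surface = {(t0, t1): cost(t0, t1) for t0 in grid for t1 in grid}

# The grid point with the lowest cost approximates the minimum of the surface.
best = min(surface, key=surface.get)
print(best, surface[best])
```

Each value in `surface` corresponds to one point on a contour or 3D plot of the cost.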



So what is the price of the house?

Read data


In [1]:
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt

# read data into a pandas DataFrame
dataframe = pd.read_csv('datasets/house_dataset1.csv')

# assign x and y
x_feature = dataframe[['Size']]
y_labels = dataframe[['Price']]

In [2]:
# check data by printing first few rows
dataframe.head()


Out[2]:
Size Price
0 2104 399900
1 1600 329900
2 2400 369000
3 1416 232000
4 3000 539900

Plot data


In [3]:
# visualize the data
plt.scatter(x_feature, y_labels)
plt.show()



In [14]:
y_labels.shape


Out[14]:
(47, 1)

Train model


In [4]:
#train model on data
body_reg = linear_model.LinearRegression()
body_reg.fit(x_feature, y_labels)

print ('theta0 = ',body_reg.intercept_)
print ('theta1 = ',body_reg.coef_)


theta0 =  [ 71270.49244873]
theta1 =  [[ 134.52528772]]
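
With these two parameters, the price of a house follows directly from $h(x) = \theta_0 + \theta_1 x$. The sketch below copies the printed values by hand; the helper name `predict_price` is hypothetical and not part of scikit-learn.

```python
# Parameters copied from the printed output of the fitted model.
theta0 = 71270.49244873
theta1 = 134.52528772

def predict_price(size):
    """h(x) = theta0 + theta1 * x for a house of the given size."""
    return theta0 + theta1 * size

# 2104 is the size in the first row of the dataset (listed price 399900).
predicted = predict_price(2104)
print(round(predicted, 2))
```

The prediction differs from the listed price because the fitted line summarizes the trend across all 47 samples rather than passing through each point.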

Predict output using trained model


In [5]:
hx = body_reg.predict(x_feature)

Plot results


In [6]:
plt.scatter(x_feature, y_labels)
plt.plot(x_feature, hx)
plt.show()


Do it yourself


In [18]:
theta0 = 0
theta1 = 0
inc = 1.0

# Brute-force search: try every theta1 from 0 up to (but not including) 1000
# in steps of inc, with theta0 fixed at 0, and keep the value with minimum cost.
m = x_feature.shape[0]  # number of samples

# optimal values to be determined
minCost = float('inf')
optimal_theta = 0

while theta1 < 1000:
    cost = 0
    for indx in range(m):
        hx = theta1*x_feature.values[indx,0] + theta0
        cost += pow((hx - y_labels.values[indx,0]),2)

    cost = cost/(2*m)

    if cost < minCost:
        minCost = cost
        optimal_theta = theta1
    theta1 += inc

print ('theta0 = ', theta0)
print ('theta1 = ',optimal_theta)


theta0 =  0
theta1 =  165.0
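
The slope found here differs from the scikit-learn result because $\theta_0$ is fixed at 0 in this search, whereas `LinearRegression` also fits an intercept by default. The same brute-force idea can be sketched without explicit Python loops using NumPy broadcasting; the toy data and candidate range below are hypothetical.

```python
import numpy as np

# Hypothetical toy data lying on y = 2x, so the best slope is 2.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
m = x.shape[0]

# Candidate slopes; theta0 stays fixed at 0.
thetas = np.arange(0.0, 5.0, 0.5)

# Row k holds the predictions of candidate thetas[k] for every sample.
predictions = thetas[:, np.newaxis] * x

# One cost value per candidate slope.
costs = ((predictions - y) ** 2).sum(axis=1) / (2 * m)

optimal_theta = thetas[np.argmin(costs)]
print(optimal_theta)
```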

Predict labels using the model and plot them


In [19]:
hx = optimal_theta*x_feature

In [20]:
plt.scatter(x_feature, y_labels)
plt.plot(x_feature, hx)
plt.show()


Credits

Raschka, Sebastian. Python machine learning. Birmingham, UK: Packt Publishing, 2015. Print.

Andrew Ng, Machine Learning, Coursera