Mitchell (1997) defines Machine Learning as follows: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Example: playing checkers.
T = the task of playing checkers.
E = the experience of playing many games of checkers.
P = the probability that the program will win the next game.
<img style="float: left;" src="images/01_01.png", width=500>
<img style="float: left;" src="images/01_02.png", width=500>
<img style="float: left;" src="images/01_04.png", width=300> <img style="float: right;" src="images/01_11.png", width=500>
<img style="float: left;" src="images/01_03.png", width=300> <img style="float: right;" src="images/01_12.png", width=500>
<img style="float: left;" src="images/01_06.png", width=300>
<img style="float: left;" src="images/01_05.png", width=300>
<img style="float: left;" src="images/model.png", width=500>
x denotes the input variables, also called input features.
y denotes the output or target variable, also known as the label.
h denotes the hypothesis or model.
A pair $(x^{(i)}, y^{(i)})$ is called a sample or training example.
The dataset of all training examples is called the training set.
m is the number of samples in the dataset.
n is the number of features in the dataset, excluding the label.
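As a minimal sketch of this notation (with made-up numbers, not the course dataset):

import numpy as np

# hypothetical toy training set: each row of X is one sample x^(i),
# and y holds the corresponding labels y^(i)
X = np.array([[2104.0], [1416.0], [1534.0], [852.0]])  # feature: house size
y = np.array([460.0, 232.0, 315.0, 178.0])             # label: house price

m = X.shape[0]  # number of samples -> 4
n = X.shape[1]  # number of features, label excluded -> 1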
<img style="float: left;" src="images/02_02.png", width=400> <img style="float: right;" src="images/02_03.png", width=400>
The model is represented by $h_\theta(x)$, or simply $h(x)$.
For linear regression with one input variable, $h(x) = \theta_0 + \theta_1 x$.
<img style="float: left;" src="images/02_04.png" width="500">
Let $\hat{y} = h(x) = \theta_0 + \theta_1 x$.
Error in a single sample $(x, y)$: $\hat{y} - y = h(x) - y$
Cumulative squared error over all m samples: $\sum_{i=1}^{m} (h(x^{(i)}) - y^{(i)})^2$
Finally, the mean error, or cost function: $J(\theta) = \frac{1}{2m}\sum_{i=1}^{m} (h(x^{(i)}) - y^{(i)})^2$
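As a quick sketch, the hypothesis and cost function translate directly into Python (toy numbers for illustration only):

import numpy as np

def h(x, theta0, theta1):
    # hypothesis: h(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

def cost_J(x, y, theta0, theta1):
    # J(theta) = (1 / 2m) * sum of squared errors
    m = len(x)
    return np.sum((h(x, theta0, theta1) - y) ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(cost_J(x, y, 0.0, 2.0))  # perfect fit, so the cost is 0.0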
<img style="float: center;" src="images/02_06.png", width=700>
<img style="float: center;" src="images/02_07.png", width=700>
<img style="float: center;" src="images/02_08.png", width=700>
<img style="float: center;" src="images/02_09.png", width=700>
<img style="float: center;" src="images/02_10.png", width=700>
<img style="float: center;" src="images/02_11.png", width=700>
<img style="float: center;" src="images/02_12.png", width=700>
<img style="float: center;" src="images/02_13.png", width=700>
In [1]:
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt
# read data into a pandas DataFrame
dataframe = pd.read_csv('datasets/house_dataset1.csv')
# assign x (feature) and y (label) columns
x_feature = dataframe[['Size']]
y_labels = dataframe[['Price']]
In [90]:
# check data by printing first few rows
dataframe.head()
Out[90]:
In [91]:
# visualize the raw data as a scatter plot
plt.scatter(x_feature, y_labels)
plt.show()
In [92]:
y_labels.shape
Out[92]:
In [93]:
# train a linear regression model on the data
body_reg = linear_model.LinearRegression()
body_reg.fit(x_feature, y_labels)
Out[93]:
In [94]:
# h(x): model predictions for each training sample
hx = body_reg.predict(x_feature)
In [95]:
plt.scatter(x_feature, y_labels)
plt.plot(x_feature, hx)
plt.show()
In [99]:
# learned slope theta1; coef_ is an attribute, not a method
body_reg.coef_
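The learned intercept $\theta_0$ is exposed as a separate attribute, so both parameters can be inspected like this:

# theta0 and theta1 learned by sklearn
print(body_reg.intercept_)  # theta0
print(body_reg.coef_)       # theta1 (one coefficient per feature)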
In [86]:
theta0 = 0
theta1 = 0
inc = 1.0
# loop over values of theta1 from 0 up to 1000 with an increment of inc and compute the cost;
# the theta1 with the minimum cost is the answer (theta0 is fixed at 0 here)
m = x_feature.shape[0]
n = x_feature.shape[1]
# optimal values to be determined
minCost = float('inf')
optimal_theta = 0
while theta1 < 1000:
    cost = 0
    for indx in range(m):
        hx = theta1 * x_feature.values[indx, 0] + theta0
        cost += pow(hx - y_labels.values[indx, 0], 2)
    cost = cost / (2 * m)
    if cost < minCost:
        minCost = cost
        optimal_theta = theta1
    theta1 += inc
print(optimal_theta)
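The same exhaustive search can be sketched more compactly with numpy, computing the cost for every candidate $\theta_1$ at once (same logic, vectorized):

import numpy as np

thetas = np.arange(0, 1000, inc)        # candidate theta1 values
x_vals = x_feature.values[:, 0]
y_vals = y_labels.values[:, 0]
# J(theta1) for all candidates, with theta0 fixed at 0
costs = ((np.outer(thetas, x_vals) - y_vals) ** 2).sum(axis=1) / (2 * m)
print(thetas[costs.argmin()])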
In [88]:
# predictions using the brute-force slope (theta0 = 0)
pred = optimal_theta*x_feature
In [84]:
pred.shape
Out[84]:
In [89]:
plt.scatter(x_feature, y_labels)
plt.plot(x_feature, pred)
plt.show()
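Since the brute-force search fixed $\theta_0 = 0$, a fair comparison is against a no-intercept sklearn model (a quick check, not part of the original notebook):

# fit without an intercept so both approaches solve the same problem
no_intercept_reg = linear_model.LinearRegression(fit_intercept=False)
no_intercept_reg.fit(x_feature, y_labels)
print(no_intercept_reg.coef_)  # should be close to optimal_theta above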