Regression & Classification Intuition

Credit: Deep Learning A-Z™: Hands-On Artificial Neural Networks

Simple Linear Regression Intuition

y: Dependent variable (DV), the variable you are trying to explain.
x: Independent variable (IV), the variable assumed to cause the dependent variable to change, or at least to be associated with it.
$b_1$: Coefficient, how much y changes for a one-unit change in x.
$b_0$: Constant term, the value of y when x is 0.
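
These four symbols fit together in the standard simple linear regression equation, written here with the same notation as the definitions above:

$$
y = b_0 + b_1 x
$$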

Simple Linear Regression tries to draw the line that best fits the data.

An example below: Salary based on years of experience

$b_0$: intercept (constant). In this example, it means that as soon as somebody graduates (zero years of experience), that person has a starting salary of 30k.
$b_1$: slope of the line. The steeper the line, the more money you get for each extra year of experience.

The best-fit line is the line with the minimum sum of squared errors (ordinary least squares).

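The error for each data point is the difference between the actual value $y_i$ and the predicted value $\hat y_i$. Each error is squared, and the squares are summed: $\sum_i (y_i - \hat y_i)^2$.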
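
To make the least-squares idea concrete, here is a minimal sketch in Python with NumPy. The data values and the function names `fit_simple_linear_regression` and `sum_of_squared_errors` are made up for illustration; they are not from the course. It uses the standard closed-form solution for the slope and intercept.

```python
import numpy as np

# Hypothetical data: years of experience and salary (in thousands).
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = np.array([39.0, 52.0, 59.0, 71.0, 79.0])


def fit_simple_linear_regression(x, y):
    """Return (b0, b1) that minimise the sum of squared errors."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: how x and y vary together, divided by how much x varies.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the line through the point (x_mean, y_mean).
    b0 = y_mean - b1 * x_mean
    return b0, b1


def sum_of_squared_errors(x, y, b0, b1):
    """Sum of squared differences between actual and predicted y."""
    y_hat = b0 + b1 * x
    return np.sum((y - y_hat) ** 2)


b0, b1 = fit_simple_linear_regression(years, salary)
sse = sum_of_squared_errors(years, salary, b0, b1)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {sse:.2f}")
```

Any other choice of `b0` and `b1` gives a larger sum of squared errors on the same data, which is what "best fit" means here.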

Multiple Linear Regression Intuition

Similar to Simple Linear Regression but with multiple independent variables.
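
With $n$ independent variables, the equation becomes $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$. Below is a small sketch of fitting such a model with NumPy's `np.linalg.lstsq`. The two features (years of experience and years of education) and the numbers are assumptions chosen so the result is easy to check; they are not from the course.

```python
import numpy as np

# Hypothetical data: each row is one person, the columns are
# [years of experience, years of education].
X = np.array([
    [1.0, 12.0],
    [2.0, 16.0],
    [3.0, 12.0],
    [4.0, 16.0],
    [5.0, 18.0],
])

# Salaries generated exactly from y = 10 + 5*x1 + 2*x2,
# so the fitted coefficients should recover 10, 5 and 2.
y = 10.0 + 5.0 * X[:, 0] + 2.0 * X[:, 1]

# Prepend a column of ones so the first coefficient plays the role of b0.
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of X_design @ coefficients ≈ y.
coefficients, *_ = np.linalg.lstsq(X_design, y, rcond=None)
b0, b1, b2 = coefficients
print(b0, b1, b2)
```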

Logistic Regression Intuition

When the dependent variable is a boolean outcome (yes/no), linear regression is not a good approach, because a straight line can produce values below 0 or above 1. Instead of predicting exactly whether the person will take the action or not, we predict the probability (likelihood) that the person will take it.

The probability ranges from 0 to 1. In this graph, we can clearly see the middle part between the two straight lines at 0 and 1. In that part, the probability increases as we move to the right, that is, as age increases. This means that the older a person is, the more likely that person is to take the action.

Mathematically speaking, if we apply the sigmoid function on top of the linear regression equation, we get logistic regression. Similar to linear regression, logistic regression finds the best-fitting curve for the dataset.
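
Written out for a single independent variable $x$, and reusing the $b_0$ and $b_1$ notation from the linear regression section, the sigmoid applied to the linear equation gives the predicted probability $\hat p$:

$$
\hat p = \frac{1}{1 + e^{-(b_0 + b_1 x)}}
$$

Rearranging shows that the linear part is still there, just on the log-odds scale:

$$
\ln\left(\frac{\hat p}{1 - \hat p}\right) = b_0 + b_1 x
$$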

Logistic regression is used to predict the probability $\hat p$. Let's do an example:

We have 4 values of the independent variable (age): 20, 30, 40, 50.
How can we find the probability?
- First, we project those 4 values onto the curve. Those are called the fitted values.
- Then we project those values onto the y-axis to find the probability between 0 and 1.
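
The two projection steps above amount to evaluating the sigmoid curve at each age. Here is a rough sketch, assuming an already fitted model. The coefficients `b0 = -7.0` and `b1 = 0.2` are invented for illustration; in practice they would come from fitting the model to data.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical coefficients of an already fitted model (assumed values).
b0, b1 = -7.0, 0.2

ages = np.array([20.0, 30.0, 40.0, 50.0])

# Evaluate the curve at each age to get the predicted probability.
p_hat = sigmoid(b0 + b1 * ages)

for age, p in zip(ages, p_hat):
    print(f"age {age:.0f}: probability {p:.3f}")
```

With a positive `b1`, the probability rises with age, which matches the shape of the curve described earlier.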

How do we get the predicted value from the probability?

We set a threshold, for example 0.5.
If the predicted probability $\hat p$ < threshold, we project the value downward onto the zero-line: $\hat y = 0$
If the predicted probability $\hat p$ >= threshold, we project the value upward onto the one-line: $\hat y = 1$

Depending on your use case, you can choose your own threshold.
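
The thresholding rule can be sketched in a few lines of Python. The helper name `predict_labels` and the sample probabilities are hypothetical, chosen only to show how the predicted labels change with the threshold.

```python
import numpy as np


def predict_labels(p_hat, threshold=0.5):
    """Return 1 where the probability is >= threshold, otherwise 0."""
    return (np.asarray(p_hat) >= threshold).astype(int)


# Hypothetical predicted probabilities.
p_hat = np.array([0.05, 0.27, 0.50, 0.73, 0.95])

labels_default = predict_labels(p_hat)                # threshold 0.5
labels_strict = predict_labels(p_hat, threshold=0.8)  # stricter threshold

print(labels_default)
print(labels_strict)
```

Raising the threshold makes the model more conservative about predicting 1, which may be preferable when a false positive is more costly than a false negative.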