Title: Logistic Regression
Slug: logistic_regression
Summary: How to train a logistic regression in scikit-learn.
Date: 2017-09-21 12:00
Category: Machine Learning
Tags: Logistic Regression
Authors: Chris Albon
Despite having "regression" in its name, logistic regression is actually a widely used binary classifier (i.e. the target vector can take only two values). In a logistic regression, a linear model (e.g. $\beta_{0}+\beta_{1}x$) is passed through the logistic (also called sigmoid) function, ${\frac {1}{1+e^{-z}}}$, such that:
$$P(y_i=1 \mid X)={\frac {1}{1+e^{-(\beta_{0}+\beta_{1}x)}}}$$

where $P(y_i=1 \mid X)$ is the probability that the $i$th observation's target value, $y_i$, is class 1, $X$ is the training data, $\beta_0$ and $\beta_1$ are the parameters to be learned, and $e$ is Euler's number.
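To make the mapping concrete, here is a minimal sketch (not part of the original notebook) that evaluates the sigmoid at a few values of the linear term $z=\beta_{0}+\beta_{1}x$: large negative values of $z$ map to probabilities near 0, and large positive values map to probabilities near 1.

# Sigmoid maps any real-valued z to a probability in (0, 1)
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

sigmoid(np.array([-4., 0., 4.]))  # approximately [0.018, 0.5, 0.982]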
In [1]:
    
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
    
In [2]:
    
# Load data with only two classes
iris = datasets.load_iris()
X = iris.data[:100,:]
y = iris.target[:100]
    
In [3]:
    
# Standardize features
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
    
In [4]:
    
# Create logistic regression object
clf = LogisticRegression(random_state=0)
    
In [5]:
    
# Train model
model = clf.fit(X_std, y)
    
In [6]:
    
# Create new observation
new_observation = [[.5, .5, .5, .5]]
    
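Note that the model was trained on standardized features, so these values are assumed to already be on the standardized scale. A new observation in the original measurement units would first need to be transformed with the scaler fitted above, for example (the raw values here are hypothetical):

# Transform a raw (unstandardized) observation with the already-fitted scaler
raw_observation = [[5.0, 3.5, 1.5, 0.25]]  # hypothetical raw measurements
new_observation_std = scaler.transform(raw_observation)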
In [7]:
    
# Predict class
model.predict(new_observation)
    
    Out[7]:
In [8]:
    
# View predicted probabilities
model.predict_proba(new_observation)
    
    Out[8]:
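As a quick sanity check (not part of the original notebook), the class-1 probability returned by `predict_proba` for this binary model can be recovered by applying the sigmoid to `decision_function`, which returns the fitted linear term:

# The sigmoid of the decision function should match predict_proba's class-1 column
import numpy as np

scores = model.decision_function(new_observation)  # fitted linear term for each observation
1 / (1 + np.exp(-scores))  # should equal model.predict_proba(new_observation)[:, 1]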