Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a simple yet very efficient approach to discriminative learning of linear classifiers under convex loss functions, such as (linear) Support Vector Machines and Logistic Regression.
In [10]:
from sklearn.linear_model import SGDClassifier
X = [[0, 0], [1, 1]]
y = [0, 1]
clf = SGDClassifier(loss='hinge', penalty='l2')
clf.fit(X, y)
Out[10]:
In [2]:
clf.predict([[2., 2.]])
Out[2]:
In [3]:
clf.coef_
Out[3]:
In [4]:
# To get the signed distance to the hyperplane
clf.decision_function([[2., 2.]])
Out[4]:
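As a quick sanity check (a sketch, reusing clf, coef_ and intercept_ from the cells above), the value returned by decision_function is just the linear score $w^T x + b$, which can be recomputed by hand from the fitted coefficients:
In [ ]:
import numpy as np

# Recompute the signed distance w^T x + b from the fitted coefficients;
# it should match the value returned by decision_function above.
x_new = np.array([2., 2.])
manual_score = np.dot(clf.coef_.ravel(), x_new) + clf.intercept_[0]
print(manual_score)
print(clf.decision_function([x_new])[0])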
The concrete loss function can be set via the loss parameter. Using loss='log' or loss='modified_huber' enables the predict_proba method, which gives a vector of probability estimates $P(y \rvert x)$ per sample $x$:
In [8]:
clf = SGDClassifier(loss='log').fit(X, y)
clf.predict_proba([[1., 1.]])
Out[8]:
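A minimal check (a sketch, assuming the log-loss model fit just above): the columns of predict_proba follow the order of clf.classes_, and each row sums to 1:
In [ ]:
# The columns of predict_proba correspond to clf.classes_;
# each row is a probability distribution over the classes.
proba = clf.predict_proba([[1., 1.]])
print(clf.classes_)         # class labels, here [0, 1]
print(proba.sum(axis=1))    # each row sums to 1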
SGD is sensitive to feature scaling, so it is highly recommended to scale the data, e.g. to zero mean and unit variance. The scaler must be fit on the training data only and then applied to both the training and the test set:
In [13]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)                  # fit on the training data only
X_train = scaler.transform(X_train)  # apply the same transformation
X_test = scaler.transform(X_test)    # to both training and test data
Given a set of training examples $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ where $x_i \in \mathbf{R}^m$ and $y_i \in \{-1, 1\}$, our goal is to learn a linear scoring function $f(x) = w^T x + b$ by minimizing the regularized training error $$E(w, b) = \frac{1}{n}\sum_{i=1}^{n} L(y_i, f(x_i)) + \alpha R(w)$$ where $L$ is a loss function measuring model fit, $R$ is a regularization term penalizing model complexity, and $\alpha > 0$ controls the regularization strength.
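As an illustrative sketch (not part of the original notebook), the objective above can be evaluated directly on the toy data from the first cell, assuming the hinge loss $L(y, f(x)) = \max(0, 1 - y f(x))$, the L2 penalty $R(w) = \frac{1}{2}\lVert w \rVert_2^2$, and an illustrative value for $\alpha$:
In [ ]:
import numpy as np
from sklearn.linear_model import SGDClassifier

# Refit a hinge-loss model on the toy data so that w and b are well defined
alpha = 1e-4                                         # illustrative regularization strength
clf_hinge = SGDClassifier(loss='hinge', penalty='l2', alpha=alpha).fit(X, y)
w, b = clf_hinge.coef_.ravel(), clf_hinge.intercept_[0]

# Evaluate E(w, b) = mean hinge loss + alpha * R(w)
X_arr = np.asarray(X, dtype=float)
y_signed = np.where(np.asarray(y) == 1, 1.0, -1.0)   # map labels {0, 1} to {-1, +1}
scores = X_arr @ w + b                               # f(x_i) = w^T x_i + b
hinge = np.maximum(0.0, 1.0 - y_signed * scores)     # L(y_i, f(x_i))
E = hinge.mean() + alpha * 0.5 * np.dot(w, w)        # regularized training error
print(E)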