Ridge regression addresses some of the problems of Ordinary Least Squares by imposing a penalty on the size of the coefficients. The ridge coefficients minimize a penalized residual sum of squares:

$$\min_{w} ||Xw - y||_2^2 + \alpha ||w||_2^2$$

Here $\alpha \gt 0$ is a complexity parameter that controls the amount of shrinkage: the larger the value of $\alpha$, the greater the amount of shrinkage and thus the more robust the coefficients are to collinearity.
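
For a fixed $\alpha$ the minimizer has a closed form (written here without the intercept term, which scikit-learn fits separately), and this form makes the shrinkage explicit:

$$\hat{w} = (X^T X + \alpha I)^{-1} X^T y$$

As $\alpha$ grows, the $\alpha I$ term dominates $X^T X$ and the estimated coefficients are pulled toward zero.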

Ridge takes arrays X, y in its fit method and stores the coefficients $w$ of the linear model in its coef_ member:


In [1]:
from sklearn import linear_model
clf = linear_model.Ridge(alpha = .5)
clf.fit([[0,0],[0,0],[1,1]], [0, .1, 1])


Out[1]:
Ridge(alpha=0.5, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)

In [2]:
clf.coef_


Out[2]:
array([ 0.34545455,  0.34545455])

In [3]:
clf.intercept_


Out[3]:
0.13636363636363641
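
To make the effect of $\alpha$ concrete, here is a minimal sketch that refits the same toy data for a few values of alpha; larger values shrink the coefficients further:

from sklearn import linear_model
for a in [0.01, 0.5, 10.0]:
    clf = linear_model.Ridge(alpha=a)
    clf.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1])
    print(a, clf.coef_)  # the coefficient values decrease as alpha increases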

Setting the regularization parameter: Generalized Cross-Validation

RidgeCV implements ridge regression with built-in cross-validation of the alpha parameter. The object works in the same way as GridSearchCV except that it defaults to Generalized Cross-Validation (GCV), an efficient form of leave-one-out cross-validation:


In [5]:
from sklearn import linear_model
clf = linear_model.RidgeCV(alphas=[0.1, 1.0, 10.0])
clf.fit([[0,0],[0,0],[1,1]],[0,.1,1])


Out[5]:
RidgeCV(alphas=[0.1, 1.0, 10.0], cv=None, fit_intercept=True, gcv_mode=None,
    normalize=False, scoring=None, store_cv_values=False)

In [6]:
clf.alpha_


Out[6]:
0.10000000000000001
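
As with Ridge, the fitted RidgeCV object exposes coef_, intercept_ and predict, so it can be used directly once the best alpha has been selected; a short usage sketch on the same toy data:

clf.coef_               # coefficients at the selected alpha
clf.predict([[1, 1]])   # predictions from the model refit at that alpha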