Title: Recursive Feature Elimination
Slug: recursive_feature_elimination
Summary: How to do recursive feature elimination for machine learning in Python.
Date: 2017-09-14 12:00
Category: Machine Learning
Tags: Feature Selection
Authors: Chris Albon
In [20]:
# Load libraries
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn import linear_model
import warnings
# Suppress an annoying but harmless warning
warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")
In [21]:
# Generate features matrix and target vector
X, y = make_regression(n_samples = 10000,
                       n_features = 100,
                       n_informative = 2,
                       random_state = 1)
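As a quick sanity check (a sketch, not part of the original notebook), the generated data can be inspected directly; the `X_demo` / `y_demo` names below are illustrative:

```python
# Sketch: confirm the shapes of the generated data
# (repeats the make_regression call above; X_demo / y_demo are illustrative names)
from sklearn.datasets import make_regression

X_demo, y_demo = make_regression(n_samples=10000,
                                 n_features=100,
                                 n_informative=2,
                                 random_state=1)

print(X_demo.shape)  # (10000, 100)
print(y_demo.shape)  # (10000,)
```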
In [22]:
# Create a linear regression
ols = linear_model.LinearRegression()
In [23]:
# Create recursive feature eliminator that scores features by mean squared error
rfecv = RFECV(estimator=ols, step=1, scoring='neg_mean_squared_error')
# Fit recursive feature eliminator
rfecv.fit(X, y)
# Transform the feature matrix, keeping only the selected features
rfecv.transform(X)
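Besides `transform`, a fitted `RFECV` exposes `support_` (a boolean mask of the kept columns) and `ranking_` (rank 1 means kept; higher ranks were eliminated earlier). A minimal sketch, using an illustrative 10-feature problem rather than the notebook's 100-feature one:

```python
# Sketch: inspect which features RFECV kept
# (small illustrative data, not the original 10,000 x 100 problem)
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

X_small, y_small = make_regression(n_samples=200, n_features=10,
                                   n_informative=2, random_state=1)

selector = RFECV(estimator=LinearRegression(), step=1,
                 scoring='neg_mean_squared_error')
selector.fit(X_small, y_small)

print(selector.support_)   # boolean mask: True for kept features
print(selector.ranking_)   # rank 1 = kept; higher = eliminated earlier
```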
In [24]:
# Number of features selected
rfecv.n_features_
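Because `transform` drops the eliminated columns, the reduced matrix has exactly `n_features_` columns. A minimal sketch on illustrative small data:

```python
# Sketch: the reduced matrix has exactly n_features_ columns
# (illustrative small data, not the notebook's 100-feature problem)
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

X_small, y_small = make_regression(n_samples=200, n_features=10,
                                   n_informative=2, random_state=1)

selector = RFECV(estimator=LinearRegression(), step=1,
                 scoring='neg_mean_squared_error')
X_reduced = selector.fit_transform(X_small, y_small)

print(X_reduced.shape[1] == selector.n_features_)  # True
```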