Title: Variance Thresholding For Feature Selection
Slug: variance_thresholding_for_feature_selection
Summary: How to remove low-variance features for machine learning using variance thresholding in Python.
Date: 2017-09-14 12:00
Category: Machine Learning
Tags: Feature Selection
Authors: Chris Albon
In [1]:
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold
In [2]:
# Load iris data
iris = datasets.load_iris()
# Create features and target
X = iris.data
y = iris.target
In [3]:
# Create a VarianceThreshold object with a variance threshold of 0.5
thresholder = VarianceThreshold(threshold=.5)
# Conduct variance thresholding
X_high_variance = thresholder.fit_transform(X)
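To see which columns survived the threshold, it can help to inspect the fitted object. The sketch below is a self-contained illustration, not part of the original notebook. It uses scikit-learn's `variances_` attribute and `get_support()` method, and the variable names `variances`, `kept_mask`, and `kept_features` are chosen here purely for illustration.

```python
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold

# Reload the data so this sketch runs on its own
iris = datasets.load_iris()
X = iris.data

# Same thresholder as above
thresholder = VarianceThreshold(threshold=.5)
X_high_variance = thresholder.fit_transform(X)

# Variance of each feature, computed during fit
variances = thresholder.variances_

# Boolean mask: True for each feature whose variance exceeds the threshold
kept_mask = thresholder.get_support()

# Names of the features that were kept (illustrative helper list)
kept_features = [
    name for name, keep in zip(iris.feature_names, kept_mask) if keep
]

# Print one line per feature
for name, variance, keep in zip(iris.feature_names, variances, kept_mask):
    print(f"{name}: variance={variance:.3f}, kept={keep}")
```

Because variance depends on a feature's units, the same threshold can treat features on different scales very differently, so checking these values before settling on a threshold is usually worthwhile.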
In [4]:
# View the first five rows of the features whose variance is above the threshold
X_high_variance[0:5]
Out[4]: