Standardization, or mean removal and variance scaling


scale function


In [1]:
from sklearn import preprocessing
import numpy as np

X_train = np.array([[ 1., -1.,  2.],
                    [ 2.,  0.,  0.],
                    [ 0.,  1., -1.]])
X_scaled = preprocessing.scale(X_train)

X_scaled


Out[1]:
array([[ 0.        , -1.22474487,  1.33630621],
       [ 1.22474487,  0.        , -0.26726124],
       [-1.22474487,  1.22474487, -1.06904497]])

In [3]:
X_scaled.mean(axis=0), X_scaled.std(axis=0)


Out[3]:
(array([ 0.,  0.,  0.]), array([ 1.,  1.,  1.]))
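
The transformation is simply z = (x - mean) / std, computed per column. As a sanity check (this cell is illustrative, not part of the original run):

In [ ]:
# Recompute the standardization by hand: subtract each column's mean
# and divide by its (population) standard deviation.
Z = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)
np.allclose(Z, X_scaled)   # should print True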

StandardScaler


In [12]:
scaler = preprocessing.StandardScaler().fit(X_train)
scaler


Out[12]:
StandardScaler(copy=True, with_mean=True, with_std=True)

In [13]:
scaler.mean_, scaler.scale_, scaler.transform(X_train)


Out[13]:
(array([ 1.        ,  0.        ,  0.33333333]),
 array([ 0.81649658,  0.81649658,  1.24721913]),
 array([[ 0.        , -1.22474487,  1.33630621],
        [ 1.22474487,  0.        , -0.26726124],
        [-1.22474487,  1.22474487, -1.06904497]]))

Once a scaler instance has been fitted, it can perform the same transformation on new data:


In [5]:
X_test = [[-1., 1., 0.]]
scaler.transform(X_test)


Out[5]:
array([[-2.44948974,  1.22474487, -0.26726124]])
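
A fitted scaler can also undo its transformation via inverse_transform; a short illustrative check (not part of the original run):

In [ ]:
# Map standardized values back to the original feature units.
np.allclose(scaler.inverse_transform(X_scaled), X_train)   # should print True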

Scaling features to a range


MinMaxScaler


In [6]:
X_train = np.array([[ 1., -1.,  2.],
                    [ 2.,  0.,  0.],
                    [ 0.,  1., -1.]])

min_max_scaler = preprocessing.MinMaxScaler()
X_train_minmax = min_max_scaler.fit_transform(X_train)
X_train_minmax


Out[6]:
array([[ 0.5       ,  0.        ,  1.        ],
       [ 1.        ,  0.5       ,  0.33333333],
       [ 0.        ,  1.        ,  0.        ]])

In [7]:
X_test = np.array([[ -3., -1.,  4.]])
X_test_minmax = min_max_scaler.transform(X_test)
X_test_minmax


Out[7]:
array([[-1.5       ,  0.        ,  1.66666667]])

In [8]:
min_max_scaler.scale_, min_max_scaler.min_


Out[8]:
(array([ 0.5       ,  0.5       ,  0.33333333]),
 array([ 0.        ,  0.5       ,  0.33333333]))
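
Under the hood the transform is the affine map X * scale_ + min_; a quick verification (added for illustration):

In [ ]:
# MinMaxScaler applies a per-column affine map: X * scale_ + min_,
# where scale_ = 1 / (data_max_ - data_min_) for the default (0, 1) range.
np.allclose(X_train * min_max_scaler.scale_ + min_max_scaler.min_,
            X_train_minmax)   # should print True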

MaxAbsScaler


In [9]:
X_train = np.array([[ 1., -1.,  2.],
                    [ 2.,  0.,  0.],
                    [ 0.,  1., -1.]])

max_abs_scaler = preprocessing.MaxAbsScaler()
X_train_maxabs = max_abs_scaler.fit_transform(X_train)
X_train_maxabs


Out[9]:
array([[ 0.5, -1. ,  1. ],
       [ 1. ,  0. ,  0. ],
       [ 0. ,  1. , -0.5]])

In [10]:
X_test = np.array([[ -3., -1.,  4.]])
X_test_maxabs = max_abs_scaler.transform(X_test)
X_test_maxabs


Out[10]:
array([[-1.5, -1. ,  2. ]])

In [11]:
max_abs_scaler.scale_


Out[11]:
array([ 2.,  1.,  2.])
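
Here scale_ is just the per-column maximum absolute value, so the transform is a plain division; a quick check (added for illustration):

In [ ]:
# MaxAbsScaler divides each column by its maximum absolute value,
# mapping training data into [-1, 1] while leaving zeros untouched.
np.allclose(X_train / np.abs(X_train).max(axis=0), X_train_maxabs)   # should print True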

minmax_scale function


In [14]:
preprocessing.minmax_scale(X_train)


Out[14]:
array([[ 0.5       ,  0.        ,  1.        ],
       [ 1.        ,  0.5       ,  0.33333333],
       [ 0.        ,  1.        ,  0.        ]])

maxabs_scale function


In [15]:
preprocessing.maxabs_scale(X_train)


Out[15]:
array([[ 0.5, -1. ,  1. ],
       [ 1. ,  0. ,  0. ],
       [ 0. ,  1. , -0.5]])

Scaling sparse data

Centering sparse data would destroy its sparsity structure, so it is rarely a sensible thing to do. It can make sense to scale sparse inputs, however, especially when features are on different scales. MaxAbsScaler is designed for exactly this situation.
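
A minimal sketch of scaling a SciPy sparse matrix (the CSR input below is illustrative):

In [ ]:
from scipy import sparse

# MaxAbsScaler accepts CSR/CSC matrices and returns a sparse result,
# since dividing by the per-column max-abs never fills in zero entries.
X_sparse = sparse.csr_matrix(X_train)
X_sparse_scaled = preprocessing.MaxAbsScaler().fit_transform(X_sparse)
X_sparse_scaled.toarray()

scale and StandardScaler also accept sparse input, provided with_mean=False is passed so that no centering is attempted.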

Non-linear transformation


QuantileTransformer


In [18]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

np.percentile(X_train[:, 0], [0, 25, 50, 75, 100])


Out[18]:
array([ 4.3,  5.1,  5.8,  6.5,  7.9])

In [19]:
quantile_transformer = preprocessing.QuantileTransformer(random_state=0)

X_train_trans = quantile_transformer.fit_transform(X_train)

X_test_trans = quantile_transformer.transform(X_test)

In [20]:
np.percentile(X_train_trans[:, 0], [0, 25, 50, 75, 100])


Out[20]:
array([  9.99999998e-08,   2.38738739e-01,   5.09009009e-01,
         7.43243243e-01,   9.99999900e-01])

In [21]:
(np.percentile(X_test[:, 0], [0, 25, 50, 75, 100]),
 np.percentile(X_test_trans[:, 0], [0, 25, 50, 75, 100]))


Out[21]:
(array([ 4.4  ,  5.125,  5.75 ,  6.175,  7.3  ]),
 array([ 0.01351351,  0.25012513,  0.47972973,  0.6021021 ,  0.94144144]))

Passing output_distribution='normal' maps the data to a normal distribution instead of a uniform one:


In [22]:
quantile_transformer = preprocessing.QuantileTransformer(
    output_distribution='normal', random_state=0)

X_trans = quantile_transformer.fit_transform(X)

quantile_transformer.quantiles_


Out[22]:
array([[ 4.3       ,  2.        ,  1.        ,  0.1       ],
       [ 4.31491491,  2.02982983,  1.01491491,  0.1       ],
       [ 4.32982983,  2.05965966,  1.02982983,  0.1       ],
       ..., 
       [ 7.84034034,  4.34034034,  6.84034034,  2.5       ],
       [ 7.87017017,  4.37017017,  6.87017017,  2.5       ],
       [ 7.9       ,  4.4       ,  6.9       ,  2.5       ]])
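
With output_distribution='normal', the transformed quartiles should land near those of a standard normal; a quick comparison against scipy.stats.norm.ppf (this cell is illustrative):

In [ ]:
from scipy import stats

# The 25th/50th/75th percentiles of the transformed feature should be
# close to the standard-normal quantiles at those probabilities.
np.percentile(X_trans[:, 0], [25, 50, 75]), stats.norm.ppf([0.25, 0.5, 0.75])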
