Title: Normalize A Column In Pandas
Slug: pandas_normalize_column
Summary: Normalize A Column In Pandas
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon
Based on: Sandman via StackOverflow.
In [1]:
# Import required modules
import pandas as pd
from sklearn import preprocessing
# Display charts inline in the notebook
%matplotlib inline
In [2]:
# Create an example dataframe with a column of unnormalized data
data = {'score': [234,24,14,27,-74,46,73,-18,59,160]}
df = pd.DataFrame(data)
df
Out[2]:
In [3]:
# View the unnormalized data
df['score'].plot(kind='bar')
Out[3]:
In [4]:
# Create x, a 2D array holding the 'score' column's values as floats
# (scikit-learn scalers expect input of shape (n_samples, n_features))
x = df[['score']].values.astype(float)
# Create a min-max scaler object
min_max_scaler = preprocessing.MinMaxScaler()
# Fit the scaler to x and rescale the values to the range [0, 1]
x_scaled = min_max_scaler.fit_transform(x)
# Store the normalized values in a new dataframe
df_normalized = pd.DataFrame(x_scaled)
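For comparison, the same min-max rescaling can be sketched with pandas alone, which makes the underlying formula (x - min) / (max - min) explicit. This is a minimal sketch, not the approach used above: the column name `score_normalized` is a hypothetical name chosen for illustration, and the code assumes the column contains at least two distinct values, since otherwise the denominator is zero.

```python
import pandas as pd

# Recreate the example dataframe from above
df = pd.DataFrame({'score': [234, 24, 14, 27, -74, 46, 73, -18, 59, 160]})

# Work with the column as floats
score = df['score'].astype(float)

# Find the smallest and largest values in the column
score_min = score.min()
score_max = score.max()

# Apply the min-max formula: (x - min) / (max - min)
# 'score_normalized' is a hypothetical column name used for illustration
df['score_normalized'] = (score - score_min) / (score_max - score_min)
```

Storing the result as a new column keeps the original scores and the dataframe's index alongside the rescaled values.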
In [5]:
# View the dataframe
df_normalized
Out[5]:
In [6]:
# Plot the dataframe
df_normalized.plot(kind='bar')
Out[6]: