Title: Handling Missing Values In Time Series
Slug: handling_missing_values_in_time_series
Summary: How to handle missing values in time series data with pandas for machine learning in Python.
Date: 2017-09-11 12:00
Category: Machine Learning
Tags: Preprocessing Dates And Times
Authors: Chris Albon
In [1]:
# Load libraries
import pandas as pd
import numpy as np
In [2]:
# Create a date index of five month-end dates
# (pandas 2.2+ deprecates the 'M' alias in favor of 'ME')
time_index = pd.date_range('01/01/2010', periods=5, freq='M')
# Create data frame, set index
df = pd.DataFrame(index=time_index)
# Create feature with a gap of missing values
df['Sales'] = [1.0, 2.0, np.nan, np.nan, 5.0]
In [3]:
# Fill missing values by linear interpolation (the default method)
df.interpolate()
Out[3]:
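By default, `interpolate()` fills gaps linearly and treats the rows as equally spaced, so the two missing values become 3.0 and 4.0. The sketch below rebuilds the same frame with the month-end dates written out explicitly (an assumption, made so it does not depend on the `freq` alias) and compares the default with `method='time'`, which weights by the actual number of days between index entries.

```python
import numpy as np
import pandas as pd

# Same five month-end dates as above, written out explicitly
time_index = pd.to_datetime(['2010-01-31', '2010-02-28', '2010-03-31',
                             '2010-04-30', '2010-05-31'])
df = pd.DataFrame({'Sales': [1.0, 2.0, np.nan, np.nan, 5.0]}, index=time_index)

# Default linear interpolation: rows are treated as equally spaced
linear = df.interpolate()

# Time-weighted interpolation: uses the day gaps in the DatetimeIndex
by_time = df.interpolate(method='time')

print(linear['Sales'].tolist())   # [1.0, 2.0, 3.0, 4.0, 5.0]
print(by_time['Sales'].tolist())  # close to, but not exactly, 3.0 and 4.0
```

Because months differ in length, the time-weighted values differ slightly from the evenly spaced ones. For regularly spaced data the two methods agree.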
In [4]:
# Forward-fill: replace each missing value with the last known value
df.ffill()
Out[4]:
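Forward-filling repeats the last observed value across the whole gap. If a long gap should not be filled entirely with a stale value, `ffill` accepts a `limit` argument. A minimal sketch on the same values, using a plain Series for brevity:

```python
import numpy as np
import pandas as pd

sales = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0])

# Carry the last observation forward across the whole gap
filled = sales.ffill()

# Carry it forward by at most one row; the second NaN stays missing
filled_once = sales.ffill(limit=1)

print(filled.tolist())       # [1.0, 2.0, 2.0, 2.0, 5.0]
print(filled_once.tolist())  # [1.0, 2.0, 2.0, nan, 5.0]
```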
In [5]:
# Back-fill: replace each missing value with the next known value
df.bfill()
Out[5]:
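Forward-fill cannot fill a gap at the start of a series, and back-fill cannot fill one at the end, because neither has a value to copy from. The sketch below uses a hypothetical series (not the `Sales` data above) with missing values at both ends to show the difference, and chains the two fills as one possible way to cover both cases.

```python
import numpy as np
import pandas as pd

# Hypothetical values with gaps at the start, middle, and end
sales = pd.Series([np.nan, 2.0, np.nan, 4.0, np.nan])

forward = sales.ffill()           # leading NaN remains
backward = sales.bfill()          # trailing NaN remains
combined = sales.ffill().bfill()  # forward-fill first, then back-fill the rest

print(forward.tolist())   # [nan, 2.0, 2.0, 4.0, 4.0]
print(backward.tolist())  # [2.0, 2.0, 4.0, 4.0, nan]
print(combined.tolist())  # [2.0, 2.0, 2.0, 4.0, 4.0]
```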
In [6]:
# Interpolate at most one consecutive missing value, working forward
df.interpolate(limit=1, limit_direction='forward')
Out[6]:
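With `limit=1`, only one value per gap is filled, and `limit_direction` decides which side of the gap that value is taken from. A short sketch comparing the three accepted directions on the same values, using a plain Series for brevity:

```python
import numpy as np
import pandas as pd

sales = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0])

# Fill one value starting from the last observation before the gap
forward = sales.interpolate(limit=1, limit_direction='forward')

# Fill one value starting from the first observation after the gap
backward = sales.interpolate(limit=1, limit_direction='backward')

# Fill one value from each side; a two-row gap is closed completely
both = sales.interpolate(limit=1, limit_direction='both')

print(forward.tolist())   # [1.0, 2.0, 3.0, nan, 5.0]
print(backward.tolist())  # [1.0, 2.0, nan, 4.0, 5.0]
print(both.tolist())      # [1.0, 2.0, 3.0, 4.0, 5.0]
```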