In [5]:
import pandas as pd
import numpy as np
from fbprophet import Prophet  # from version 1.0 onward the package is named `prophet`
DATA_HOME_DIR = '/data/airline'
The input to Prophet is always a dataframe with two columns: ds and y. The ds (datestamp) column must contain a date or datetime (either is fine). The y column must be numeric and represents the measurement we wish to forecast.
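As a minimal sketch of that input format, the frame below is built by hand. The dates and values are made up for illustration and are not taken from the airline dataset; the name `toy` is hypothetical.

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the expected two-column layout.
toy = pd.DataFrame({
    "ds": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
    "y": [10.0, 12.5, 11.0],
})

# Sanity checks that mirror the two requirements stated above:
# ds holds datetimes, y holds numbers.
ds_is_datetime = pd.api.types.is_datetime64_any_dtype(toy["ds"])
y_is_numeric = pd.api.types.is_numeric_dtype(toy["y"])
print(toy.dtypes)
print(ds_is_datetime, y_is_numeric)
```

Running the same two checks on a freshly loaded dataframe is a cheap way to catch a date column that was read as plain strings.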
In [60]:
df = pd.read_csv(DATA_HOME_DIR+'/international-airline-passengers.csv',
                 sep=';',
                 names=['ds', 'y'],
                 header=0,
                 parse_dates=[0],
                 nrows=144,
                 )
df.head(3)
df.info()
Out[60]:
It looks like the data has an exponential growth trend, so we take the log of y to make it suitable for a linear trend fit.
In [63]:
df['y'] = np.log(df['y'])
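To see why the log helps, here is a small sketch on a synthetic series that grows by a fixed percentage per step. The series is an assumption chosen for illustration, not the airline data: after taking the log, consecutive differences are constant, which is exactly what a linear trend looks like.

```python
import numpy as np

# Hypothetical series growing by 10% per step (not the airline data).
y = 100.0 * 1.10 ** np.arange(6)

log_y = np.log(y)
steps = np.diff(log_y)      # constant step size means linear in log space
print(steps)                # every step equals log(1.10)

restored = np.exp(log_y)    # np.exp undoes the transform
print(restored)
```

Because the model is fitted on log(y), its predictions are on the log scale as well, and `np.exp` is the inverse needed to get back to the original units.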
We only have monthly data, so there can be no weekly seasonality in the data. Forecasting must also take this into account by choosing the right frequency for the future dates.
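One way to confirm the sampling frequency before fitting is `pandas.infer_freq`. The sketch below uses month-start dates generated on the spot as a stand-in for `df['ds']`; that the real file parses to first-of-month dates is an assumption.

```python
import pandas as pd

# Stand-in for df['ds']: twelve month-start dates (an assumption about how
# the monthly dates are parsed, not read from the actual file).
ds = pd.Series(pd.date_range("1949-01-01", periods=12, freq="MS"))

freq = pd.infer_freq(ds)    # pandas offset alias for the detected spacing
print(freq)                 # 'MS' is the month-start alias
```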
In [70]:
m = Prophet(weekly_seasonality=False)
m.fit(df)
Out[70]:
Predictions are then made on a dataframe with a column ds containing the dates for which a prediction is to be made. You can get a suitable dataframe that extends into the future by a specified number of periods (days by default, or whatever freq specifies) using the helper method Prophet.make_future_dataframe. By default it will also include the dates from the history, so we will see the model fit as well.
In [71]:
future = m.make_future_dataframe(periods=36, freq='M')
future.tail()
Out[71]:
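A detail worth checking in the tail of `future` is where the new dates fall within each month. In pandas, the alias 'M' means month end (newer pandas versions spell it 'ME'), while 'MS' means month start. The sketch below only uses pandas to contrast the two; the last observed date is an assumption, and the snippet does not reproduce Prophet's internals.

```python
import pandas as pd

last_observed = pd.Timestamp("1960-12-01")  # assumed last date in the history

# Month-end dates, which is what the 'M' alias stands for in pandas.
month_end = pd.date_range(start=last_observed, periods=3,
                          freq=pd.offsets.MonthEnd())

# Month-start dates ('MS'), beginning with the month after the last observation.
month_start = pd.date_range(start=last_observed + pd.offsets.MonthBegin(1),
                            periods=3, freq=pd.offsets.MonthBegin())

print(month_end)
print(month_start)
```

If the history is stamped on the first of each month, passing freq='MS' keeps the future dates aligned with it; whether the roughly one-month offset matters depends on how strongly the yearly seasonality varies within a month.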
The predict method will assign each row in future a predicted value which it names yhat. If you pass in historical dates, it will provide an in-sample fit. The forecast object here is a new dataframe that includes a column yhat with the forecast, as well as columns for components and uncertainty intervals.
In [72]:
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
Out[72]:
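Since the model was fitted on log(y), the values in yhat, yhat_lower and yhat_upper are on the log scale. A minimal sketch of converting them back to passenger counts follows; `forecast_log` is a hypothetical stand-in with made-up numbers, so that the snippet runs without a fitted model.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a slice of `forecast`; values are on the log scale.
forecast_log = pd.DataFrame({
    "ds": pd.to_datetime(["1961-01-01", "1961-02-01"]),
    "yhat": [6.10, 6.05],
    "yhat_lower": [6.00, 5.95],
    "yhat_upper": [6.20, 6.15],
})

cols = ["yhat", "yhat_lower", "yhat_upper"]
forecast_orig = forecast_log.copy()
forecast_orig[cols] = np.exp(forecast_log[cols])   # inverse of np.log
print(forecast_orig)
```

Because exp is monotonic, the lower and upper bounds keep their order after the transform, although the interval is no longer symmetric around yhat.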
In [73]:
m.plot(forecast)
Out[73]:
If you want to see the forecast components, you can use the Prophet.plot_components method. By default you’ll see the trend, yearly seasonality, and weekly seasonality of the time series; because weekly seasonality was disabled above, only the trend and yearly seasonality appear here. If you include holidays, you’ll see those here, too.
In [74]:
m.plot_components(forecast)
Out[74]: