In [5]:
import pandas as pd
import numpy as np
from fbprophet import Prophet

DATA_HOME_DIR = '/data/airline'

The input to Prophet is always a dataframe with two columns: ds and y. The ds (datestamp) column must contain a date or datetime (either is fine). The y column must be numeric, and represents the measurement we wish to forecast.
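A minimal sketch of bringing an arbitrary table into that two-column shape, using pandas only. The source column names `month` and `passengers` are hypothetical and chosen purely for illustration; the three values are the first rows of the dataset used below.

```python
import pandas as pd

# Hypothetical raw table; the column names "month" and "passengers" are illustrative only.
raw = pd.DataFrame({
    "month": ["1949-01", "1949-02", "1949-03"],
    "passengers": ["112", "118", "132"],
})

# Rename to the two column names Prophet expects, then enforce the dtypes.
prophet_df = raw.rename(columns={"month": "ds", "passengers": "y"})
prophet_df["ds"] = pd.to_datetime(prophet_df["ds"], format="%Y-%m")  # datestamp column
prophet_df["y"] = pd.to_numeric(prophet_df["y"])                     # numeric target

print(prophet_df.dtypes)
```

Checking the dtypes before fitting is a cheap way to catch dates that were read in as plain strings.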


In [60]:
df = pd.read_csv(DATA_HOME_DIR+'/international-airline-passengers.csv',
                 sep=';',
                 names=['ds', 'y'],
                 header=0,
                 parse_dates=[0],
                 nrows=144,
                )
df.head(3)
df.info()


Out[60]:
ds y
0 1949-01-01 112
1 1949-02-01 118
2 1949-03-01 132
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 144 entries, 0 to 143
Data columns (total 2 columns):
ds    144 non-null datetime64[ns]
y     144 non-null int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 2.3 KB

The data appear to follow an exponential growth trend. Since Prophet fits a linear trend by default, we take the log of y so that the trend becomes approximately linear.


In [63]:
df['y'] = np.log(df['y'])
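
Why the log helps can be sketched on a synthetic series. The growth rate and starting level below are made-up values, not estimates from the airline data: a series that grows by a constant percentage per step becomes an exact straight line after `np.log`, and `np.exp` undoes the transform.

```python
import numpy as np

# Synthetic series (illustrative values only): 100 units growing 1% per step.
t = np.arange(144)
y = 100.0 * 1.01 ** t

log_y = np.log(y)

# On the log scale the series is a straight line with slope log(1.01)
# and intercept log(100), so a degree-1 polynomial fit recovers both.
slope, intercept = np.polyfit(t, log_y, 1)
print(slope, np.log(1.01))

# np.exp inverts the transform and returns the original values.
recovered = np.exp(log_y)
print(np.allclose(recovered, y))
```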

We only have monthly data, so there can be no weekly seasonality in the data. The forecast must take this into account as well, by using a monthly frequency for the future dates.


In [70]:
m = Prophet(weekly_seasonality=False)

m.fit(df)


Out[70]:
<fbprophet.forecaster.Prophet at 0x7fef10349b00>

Predictions are then made on a dataframe with a column ds containing the dates for which a prediction is to be made. You can get a suitable dataframe that extends into the future a specified number of periods using the helper method Prophet.make_future_dataframe. The period length is set by freq (daily by default; monthly here). By default it will also include the dates from the history, so we will see the model fit as well.


In [71]:
future = m.make_future_dataframe(periods=36, freq='M')
future.tail()


Out[71]:
ds
176 1963-08-31
177 1963-09-30
178 1963-10-31
179 1963-11-30
180 1963-12-31
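
Note that the future dates above fall on month ends, while the history is stamped on the first of each month. The sketch below uses pandas alone (it does not call Prophet) to show the two conventions side by side. It assumes the history is the 144 month-start dates beginning 1949-01-01, as in the loaded data; whether month-start stamps are preferable for the forecast is a judgment call, and if they are, `freq='MS'` is the pandas alias that produces them.

```python
import pandas as pd

# Rebuild the history's datestamps: 144 month-start dates from 1949-01-01.
history = pd.date_range(start="1949-01-01", periods=144, freq="MS")
last_observed = history[-1]

# Month-start continuation: same day-of-month convention as the history.
# The range starts at the last observed date, so drop that first element.
month_starts = pd.date_range(start=last_observed, periods=4, freq="MS")[1:]

# Month-end stamps for the same months, for comparison.
# MonthEnd(0) rolls each date forward to the end of its own month.
month_ends = month_starts + pd.offsets.MonthEnd(0)

for start, end in zip(month_starts, month_ends):
    print(start.date(), end.date())
```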

The predict method will assign each row in future a predicted value which it names yhat. If you pass in historical dates, it will provide an in-sample fit. The forecast object here is a new dataframe that includes a column yhat with the forecast, as well as columns for components and uncertainty intervals.


In [72]:
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()


Out[72]:
ds yhat yhat_lower yhat_upper
176 1963-08-31 6.597007 6.467691 6.725999
177 1963-09-30 6.423107 6.283125 6.561592
178 1963-10-31 6.291777 6.144299 6.434977
179 1963-11-30 6.443954 6.289768 6.594750
180 1963-12-31 6.473974 6.313121 6.625843
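
Because the model was fit to log(y), these values are on the log scale. A minimal sketch of mapping them back to the original scale with `np.exp`, using two rows copied from the output above; the names `forecast_log` and `forecast_orig` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Two rows copied from the forecast output above (log scale).
forecast_log = pd.DataFrame({
    "ds": pd.to_datetime(["1963-11-30", "1963-12-31"]),
    "yhat": [6.443954, 6.473974],
    "yhat_lower": [6.289768, 6.313121],
    "yhat_upper": [6.594750, 6.625843],
})

# Exponentiate the point forecast and both interval bounds.
# exp is monotonic, so lower < yhat < upper still holds afterwards.
value_cols = ["yhat", "yhat_lower", "yhat_upper"]
forecast_orig = forecast_log.copy()
forecast_orig[value_cols] = np.exp(forecast_log[value_cols])

print(forecast_orig)
```

The same two lines can be applied to the full `forecast` dataframe; the interval is no longer symmetric around `yhat` after the back-transform.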

In [73]:
m.plot(forecast)


Out[73]:

If you want to see the forecast components, you can use the Prophet.plot_components method. By default you’ll see the trend, yearly seasonality, and weekly seasonality of the time series. Because weekly seasonality was disabled for this model, only the trend and yearly seasonality appear here. If you include holidays, you’ll see those here, too.
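
Since y was log-transformed, the components are on the log scale. Assuming the additive form yhat = trend + seasonal terms, a seasonal effect of s on the log scale corresponds to a multiplicative factor exp(s) on the original scale. The numbers below are hypothetical and not read from the fitted model.

```python
import numpy as np

# Hypothetical component values on the log scale (not taken from the fitted model).
trend = 6.20
yearly = np.array([-0.10, 0.00, 0.25])

# Additive combination on the log scale.
yhat_log = trend + yearly

# On the original scale the sum becomes a product:
# exp(trend + yearly) == exp(trend) * exp(yearly).
seasonal_factor = np.exp(yearly)
yhat_orig = np.exp(trend) * seasonal_factor

print(seasonal_factor)
print(np.allclose(yhat_orig, np.exp(yhat_log)))
```

A yearly effect of 0 therefore leaves the trend unchanged, and a positive effect scales it up by the corresponding factor.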


In [74]:
m.plot_components(forecast)


Out[74]:
