In [38]:
import pandas as pd
import numpy as np
from fbprophet import Prophet
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize']=(20,10)
plt.style.use('ggplot')
Let's load our data to analyze. For this example, I'm going to use some stock market data to be able to show some clear trend changes. This data can be downloaded from FRED (https://fred.stlouisfed.org/series/SP500) or just grab it from the examples directory.
In [68]:
market_df = pd.read_csv('../examples/SP500.csv', index_col='DATE', parse_dates=True)
In [69]:
market_df.head()
Out[69]:
In [70]:
df = market_df.reset_index().rename(columns={'DATE':'ds', 'SP500':'y'})
df['y'] = np.log(df['y'])
In [71]:
df.head()
Out[71]:
In [72]:
df.set_index('ds').y.plot()
Out[72]:
As before, let's instantiate Prophet, fit it to our data, and build a future dataframe to forecast on. Take a look at http://pythondata.com/forecasting-time-series-data-prophet-jupyter-notebook/ for more information on the basics of Prophet.
In [73]:
model = Prophet()
model.fit(df);
future = model.make_future_dataframe(periods=366)
forecast = model.predict(future)
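Because y was log-transformed before fitting, the forecast columns come back on the log scale. The following is a minimal sketch of converting them back to index levels with np.exp. The helper name to_price_scale and the toy dataframe are made up for illustration; yhat, yhat_lower and yhat_upper are the columns Prophet returns from predict.

```python
import numpy as np
import pandas as pd

def to_price_scale(forecast, columns=('yhat', 'yhat_lower', 'yhat_upper')):
    """Return a copy of `forecast` with the log-scale columns exponentiated."""
    out = forecast.copy()
    for col in columns:
        out[col] = np.exp(out[col])
    return out

# Toy stand-in for a Prophet forecast (values are invented, not real output)
toy = pd.DataFrame({
    'ds': pd.to_datetime(['2017-01-03', '2017-01-04']),
    'yhat': np.log([2250.0, 2260.0]),
    'yhat_lower': np.log([2200.0, 2210.0]),
    'yhat_upper': np.log([2300.0, 2310.0]),
})
print(to_price_scale(toy))
```

With the real notebook objects, the call would be to_price_scale(forecast). The original forecast is left untouched so it can still be passed to model.plot.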
Prophet creates changepoints for us by default and stores them in .changepoints. You can see below what the possible changepoints are (they are just shown as dates). By default, Prophet adds 25 changepoints into the initial 80% of the dataset. The number of changepoints can be set by using the n_changepoints parameter when initializing Prophet (e.g., model=Prophet(n_changepoints=30)).
In [74]:
print(model.changepoints)
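To make the "25 changepoints in the initial 80%" rule concrete, here is a rough sketch of how such a grid can be built with plain pandas and numpy. It is an approximation of the idea, not Prophet's actual source. The function name default_changepoints and the synthetic daily dates are assumptions for illustration, and the 80% cutoff is exposed here as an argument called changepoint_range.

```python
import numpy as np
import pandas as pd

def default_changepoints(ds, n_changepoints=25, changepoint_range=0.8):
    """Approximate an evenly spaced changepoint grid over the early history."""
    ds = pd.to_datetime(pd.Series(ds)).sort_values().reset_index(drop=True)
    # Only the first `changepoint_range` fraction of the rows is eligible.
    hist_size = int(np.floor(len(ds) * changepoint_range))
    # Never ask for more changepoints than there are eligible rows.
    n = min(n_changepoints, hist_size - 1)
    # Evenly spaced row positions; drop position 0 so the first changepoint
    # falls after the first observation.
    idx = np.linspace(0, hist_size - 1, n + 1).round().astype(int)[1:]
    return ds.iloc[idx].reset_index(drop=True)

# Synthetic daily history standing in for the S&P 500 dates
dates = pd.Series(pd.date_range('2009-01-01', periods=1000, freq='D'))
cps = default_changepoints(dates)
print(cps.head())
```

Comparing the output of a sketch like this against model.changepoints is a quick way to convince yourself that the defaults depend only on the dates, not on the values of y.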
We can view the possible changepoints by plotting the forecast and changepoints using the following code:
In [95]:
figure = model.plot(forecast)
for changepoint in model.changepoints:
    plt.axvline(changepoint, ls='--', lw=1)
Taking a look at the possible changepoints (drawn in orange/red) in the above chart, we can see they fit pretty well with some of the highs and lows.
Prophet will also let us take a look at the magnitudes of these possible changepoints. You can look at this visualization with the following code (edited from the fbprophet example here -> https://github.com/facebookincubator/prophet/blob/master/notebooks/trend_changepoints.ipynb)
In [97]:
deltas = model.params['delta'].mean(0)
fig = plt.figure(facecolor='w')
ax = fig.add_subplot(111)
ax.bar(range(len(deltas)), deltas)
ax.grid(True, which='major', c='gray', ls='-', lw=1, alpha=0.2)
ax.set_ylabel('Rate change')
ax.set_xlabel('Potential changepoint')
fig.tight_layout()
We can see from the above chart that quite a few of these changepoints (found between 10 and 20 on the chart) are very small in magnitude and are most likely to be ignored by Prophet during forecasting.
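The bar chart shows the magnitudes by position only. A small sketch like the one below pairs each rate change with its date and keeps only the ones above a cutoff. The helper name rank_changepoints, the 0.01 threshold, and the example values are all assumptions chosen for illustration, not output from the model above.

```python
import numpy as np
import pandas as pd

def rank_changepoints(changepoints, deltas, threshold=0.01):
    """Return changepoints whose absolute rate change is at least
    `threshold`, largest first."""
    table = pd.DataFrame({
        'ds': pd.to_datetime(pd.Series(changepoints)).values,
        'delta': np.asarray(deltas, dtype=float),
    })
    table['abs_delta'] = table['delta'].abs()
    keep = table[table['abs_delta'] >= threshold]
    return keep.sort_values('abs_delta', ascending=False).reset_index(drop=True)

# Made-up values for illustration only; with a fitted model you would pass
# model.changepoints and model.params['delta'].mean(0) instead.
example_dates = ['2010-01-04', '2010-06-01', '2011-02-01', '2011-09-01']
example_deltas = [0.30, -0.002, 0.0005, -0.12]
print(rank_changepoints(example_dates, example_deltas))
```

The threshold is a judgment call; a different cutoff may suit other datasets better.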
Now, if we know where trends changed in the past, we can add these known changepoints into our dataframe for use by Prophet.
For this data, I'm going to use the FRED website to find some of the low points and high points to use as trend changepoints. Note: In actuality, just because there is a low or high doesn't mean it's a real changepoint or trend change, but let's assume it is.
In [104]:
m = Prophet(changepoints=['2009-03-09', '2010-07-02', '2011-09-26', '2012-03-20', '2010-04-06'])
forecast = m.fit(df).predict(future)
m.plot(forecast);
From the above, you can easily see that identifying just a few changepoints ourselves has drastically changed the forecast of this model. There's a significant difference between this model and the original with the default Prophet changepoints. Unless you are very sure about your trend changepoints in the past, it's probably good to keep the defaults that Prophet provides.
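If you do supply your own changepoints, it can help to sanity-check them first. The sketch below sorts the dates and confirms they fall inside the history, since changepoints outside the training data are not meaningful to the model. The helper name check_changepoints and the date range used for the history are assumptions for illustration; the real range is whatever df['ds'] covers.

```python
import pandas as pd

def check_changepoints(changepoints, history_dates):
    """Sort hand-picked changepoints and verify they lie within the history."""
    cps = pd.to_datetime(pd.Series(changepoints)).sort_values().reset_index(drop=True)
    history = pd.to_datetime(pd.Series(history_dates))
    outside = cps[(cps < history.min()) | (cps > history.max())]
    if len(outside) > 0:
        bad = outside.dt.strftime('%Y-%m-%d').tolist()
        raise ValueError('Changepoints outside the history: {}'.format(bad))
    return cps

# Hypothetical history range; in the notebook you would pass df['ds'] instead
history = pd.date_range('2009-01-01', '2017-12-31', freq='D')
picked = ['2009-03-09', '2010-07-02', '2011-09-26', '2012-03-20', '2010-04-06']
print(check_changepoints(picked, history))
```

The returned dates are in chronological order, which also makes the list easier to read than the hand-typed one.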
Prophet's use (and accessibility) of trend changepoints is wonderful, especially for those signals / datasets that have significant changes in trend over their lifetime.
Lastly - please don't think that because prophet does an OK job of forecasting the SP500 chart in this example that you should use it to 'predict' the markets. The markets are awfully tough to forecast...I used this market data because I knew there were some very clear changepoints in the data.