This notebook covers using metrics to analyze the 'accuracy' of Prophet models, extending the previous example (http://pythondata.com/forecasting-time-series-data-prophet-part-3/).
In [1]:
import pandas as pd
import numpy as np
from fbprophet import Prophet
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
%matplotlib inline
plt.rcParams['figure.figsize']=(20,10)
plt.style.use('ggplot')
In [2]:
sales_df = pd.read_csv('../examples/retail_sales.csv', index_col='date', parse_dates=True)
In [3]:
sales_df.head()
Out[3]:
In [4]:
df = sales_df.reset_index()
In [5]:
df.head()
Out[5]:
Let's rename the columns as required by fbprophet. Additionally, fbprophet doesn't like the index to be a datetime...it wants to see 'ds' as a non-index column, so we won't set an index other than the default integer index.
In [6]:
df=df.rename(columns={'date':'ds', 'sales':'y'})
In [7]:
df.head()
Out[7]:
Now's a good time to take a look at your data. Plot the data using pandas' plot function.
In [8]:
df.set_index('ds').y.plot()
Out[8]:
Now, let's set Prophet up to begin modeling our data.
Note: Since we are using monthly data, you'll see a message from Prophet saying "Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this." This is OK since we are working with monthly data, but you can override it by passing weekly_seasonality=True when instantiating Prophet, which is what we do below.
In [9]:
model = Prophet(weekly_seasonality=True)
model.fit(df);
We've instantiated the model; now we need to build some future dates to forecast into.
In [10]:
future = model.make_future_dataframe(periods=24, freq='M')
future.tail()
Out[10]:
To forecast this future data, we need to run it through Prophet's model.
In [11]:
forecast = model.predict(future)
The resulting forecast dataframe contains quite a bit of data, but we really only care about a few columns. First, let's look at the full dataframe:
In [12]:
forecast.tail()
Out[12]:
We really only want to look at yhat, yhat_lower and yhat_upper, so we can do that with:
In [13]:
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
Out[13]:
In [14]:
model.plot(forecast);
Personally, I'm not a fan of this visualization but I'm not going to build my own...you can see how I do that here: https://github.com/urgedata/pythondata/blob/master/fbprophet/fbprophet_part_one.ipynb.
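If you just want a quick look at how the forecast lines up with the actuals, a plain matplotlib overlay is enough. This is only a minimal sketch (not the custom chart from the linked notebook), reusing the df and forecast dataframes from above:
In [ ]:
# Minimal sketch: overlay actuals, forecast, and the uncertainty interval on one chart
plt.plot(df['ds'], df['y'], label='actual')
plt.plot(forecast['ds'], forecast['yhat'], label='forecast')
plt.fill_between(forecast['ds'].values, forecast['yhat_lower'], forecast['yhat_upper'], alpha=0.2)
plt.legend()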
Additionally, Prophet lets us take a look at the components of our model. This component plot is important because it lets you see the pieces of your model, including the trend and the seasonality (identified in the yearly pane).
In [15]:
model.plot_components(forecast);
Now that we have our model, let's take a look at how it compares to our actual values using a few different metrics: R-squared, Mean Squared Error (MSE), and Mean Absolute Error (MAE).
To do this, we need to build a combined dataframe with yhat from the forecasts and the original 'y' values from the data.
In [26]:
metric_df = forecast.set_index('ds')[['yhat']].join(df.set_index('ds').y).reset_index()
In [25]:
metric_df.tail()
Out[25]:
You can see from the above that the last part of the dataframe has "NaN" for 'y'...that's fine because we are only concerned with checking the forecast values against the actual values, so we can drop these "NaN" rows.
In [27]:
metric_df.dropna(inplace=True)
In [28]:
metric_df.tail()
Out[28]:
Now let's take a look at our R-squared value.
In [30]:
r2_score(metric_df.y, metric_df.yhat)
Out[30]:
An R-squared value of 0.99 is amazing (and probably too good to be true, which tells me this model is most likely overfit to the data).
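As a quick sanity check on what that number actually is, R-squared is just 1 minus the ratio of the residual sum of squares to the total sum of squares, so we can reproduce sklearn's value by hand from the metric_df we just built (a minimal sketch):
In [ ]:
# R-squared by hand: 1 - SS_res / SS_tot (should match r2_score above)
ss_res = ((metric_df.y - metric_df.yhat) ** 2).sum()
ss_tot = ((metric_df.y - metric_df.y.mean()) ** 2).sum()
1 - ss_res / ss_tot
With that sanity check out of the way, let's look at the Mean Squared Error.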
In [31]:
mean_squared_error(metric_df.y, metric_df.yhat)
Out[31]:
That's a large MSE value...and it confirms my suspicion that this model is overfit and isn't likely to hold up well into the future. Remember...for MSE, closer to zero is better.
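Keep in mind that MSE is the average of the squared errors, so it's expressed in squared sales units, which makes the raw number hard to judge on its own. Taking the square root (RMSE) puts the error back into the same units as the sales figures; a quick sketch:
In [ ]:
# MSE by hand, plus RMSE to get the error back into the original sales units
mse = ((metric_df.y - metric_df.yhat) ** 2).mean()
mse, np.sqrt(mse)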
Now...let's see what the Mean Absolute Error (MAE) looks like.
In [32]:
mean_absolute_error(metric_df.y, metric_df.yhat)
Out[32]:
Not good. Not good at all. BUT...the purpose of this particular post is to show some usage of R-squared, MAE and MSE as metrics, and I think we've done that.
I can tell you from experience that part of the problem with this particular data is that it's monthly and there aren't that many data points to start with (only 72 data points...not ideal for modeling).
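If you wanted a less flattering (and more honest) read on these metrics, one option is to hold out the last year of data, fit on the rest, and score the model only on the months it never saw. Below is a rough sketch of that idea, reusing the imports and the df dataframe from above; the 12-month holdout size is just an illustrative choice, not something from the original post:
In [ ]:
# Rough holdout sketch: fit on everything except the last 12 months, score on those 12
train_df = df[:-12]
holdout_df = df[-12:]
holdout_model = Prophet(weekly_seasonality=True)
holdout_model.fit(train_df)
holdout_forecast = holdout_model.predict(holdout_df[['ds']])
print(r2_score(holdout_df.y, holdout_forecast.yhat))
print(mean_absolute_error(holdout_df.y, holdout_forecast.yhat))
Out-of-sample scores like these are usually a much better guide to how the forecast will hold up than the in-sample numbers above.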
In [ ]: