This tutorial is based on "Time Series Forecasting with the Long Short-Term Memory Network in Python" by Jason Brownlee.
Before we get into the example, let's look at some visitor data from Yellowstone National Park.
In [1]:
# load and plot dataset
from pandas import read_csv
from matplotlib import pyplot
# load dataset; parse_dates handles the ISO (YYYY-MM-DD) dates in the first column
series = read_csv('../data/yellowstone-visitors.csv', header=0, parse_dates=[0], index_col=0).squeeze('columns')
# summarize first few rows
print(series.head())
# line plot
series.plot()
pyplot.show()
The park's recreational visits are highly seasonal, with the peak season in July. The park tracks monthly averages from the last four years on its website. A simple approach to predicting the next year's visitors is to use these averages.
In [2]:
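# the four years before the final year (48 months) form the baseline
prev_4_years = series[-60:-12]
# the final 12 months are held out as the year to predict
last_year = series[-12:]
# average each calendar month across the baseline years
pred = prev_4_years.groupby(by=prev_4_years.index.month).mean()
act = last_year.groupby(by=last_year.index.month).mean()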
In [3]:
from math import sqrt
from sklearn.metrics import mean_squared_error
rmse = sqrt(mean_squared_error(act, pred))
print('Test RMSE: %.3f' % rmse)
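To check that the monthly-average baseline behaves as expected, it can help to run it on data where the answer is known in advance. The sketch below is only an illustration: the helper `seasonal_average_forecast` and the synthetic series `demo` are made up for this example and are not part of the Yellowstone data or the original tutorial. The RMSE is computed by hand with pandas so the cell runs without scikit-learn.

```python
from math import sqrt

import numpy as np
import pandas as pd


def seasonal_average_forecast(history, n_years=4):
    """Forecast each calendar month as its mean over the last n_years of history.

    Hypothetical helper for illustration; assumes a monthly Series with a DatetimeIndex.
    """
    window = history[-12 * n_years:]
    return window.groupby(window.index.month).mean()


# Synthetic monthly series (made-up numbers, not the Yellowstone data):
# six years with a summer peak, where the overall level rises by 100 each year.
index = pd.date_range('2010-01-01', periods=72, freq='MS')
month_shape = np.array([1, 1, 1, 2, 5, 9, 10, 9, 6, 3, 1, 1]) * 1000.0
values = np.tile(month_shape, 6) + np.repeat(np.arange(6), 12) * 100.0
demo = pd.Series(values, index=index)

# Hold out the final year and forecast it from the four years before it.
train, test = demo[:-12], demo[-12:]
forecast = seasonal_average_forecast(train)     # indexed by month number 1..12
actual = test.groupby(test.index.month).mean()  # same index, so values align

demo_rmse = sqrt(((actual - forecast) ** 2).mean())
print('Demo RMSE: %.3f' % demo_rmse)
```

Because the synthetic level rises by 100 per year, the four-year average trails the held-out year by a constant 250 in every month, so the demo RMSE is 250. This shows the main weakness of the approach: it captures the seasonal shape but not a trend.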