Gold Price Prediction - Baseline Model

Team 8

1. Description

Baseline Models

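  • 0.1 - Price remains constant: the forecast for each month is the previous month's price.
  • 0.2 - Price is calculated using a second-order autoregressive model whose coefficients are taken from a paper that determines them with the Particle Swarm Optimization (PSO) algorithm.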
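
Both baselines can be written as small functions over a price series. The following is a minimal sketch, not the notebook's own code: the function names `persistence_forecast` and `ar2_forecast` are hypothetical, the input is assumed to be in chronological order (oldest first), and the default coefficients are the ones used later in this notebook.

```python
import numpy as np

def persistence_forecast(prices):
    """Model 0.1 sketch: predict each month as the previous month's price."""
    prices = np.asarray(prices, dtype=float)
    pred = np.full_like(prices, np.nan)   # first month has no history
    pred[1:] = prices[:-1]
    return pred

def ar2_forecast(prices, a1=0.976, a2=0.01373, c=0.1157):
    """Model 0.2 sketch: a1 * lag-1 price + a2 * lag-2 price + c."""
    prices = np.asarray(prices, dtype=float)
    pred = np.full_like(prices, np.nan)   # first two months have no history
    pred[2:] = a1 * prices[1:-1] + a2 * prices[:-2] + c
    return pred

# Illustrative values only, not taken from the dataset
sample = [100.0, 110.0, 121.0, 120.0]
print(persistence_forecast(sample))
print(ar2_forecast(sample))
```

Months without enough history are left as NaN here, so they can be excluded from any error statistic rather than counted as perfect predictions.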

2. Data Initialization


In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import prettyplotlib as ppl
import datetime

In [2]:
gold_data = pd.read_csv('WGC-GOLD_DAILY_USD_Monthly_197001_201409.csv')
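
The later cells treat `y[0]` as the most recent observation, which suggests the file is ordered newest first. If a chronological view is preferred, one option is sketched below. The rows are made up for illustration, and only the `Date` and `Value` column names and the month/day/year format are taken from this notebook.

```python
import io
import pandas as pd

# Made-up rows for illustration; the real file is not reproduced here
csv_text = """Date,Value
03/31/2000,121.0
02/29/2000,110.0
01/31/2000,100.0
"""

df = pd.read_csv(io.StringIO(csv_text))
# Parse month/day/year strings into timestamps
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')
# Sort oldest first and renumber the rows
chronological = df.sort_values('Date').reset_index(drop=True)
print(chronological)
```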

3. Plots

3.1 Plot the gold prices and both model predictions for the 44-year period


In [3]:
#Setup Plotting Frame
fig1, ax1 = plt.subplots(1, figsize=(24,8))
n = len(gold_data.index)
x = gold_data['Date']
y = gold_data['Value']

#Parse the date strings (month/day/year) into datetime objects
date = [datetime.datetime.strptime(d, '%m/%d/%Y') for d in x]

#Plot the actual gold price
ppl.plot(ax1, date, y, label=str("Actual Trend"), linewidth=1.25, c='blue')
ax1.set_title("Gold Price - Currency USD\n")
ax1.set_ylabel('Price')
ax1.set_xlabel('Time')

#Start the 2 prediction models
#Rows are ordered newest first, so y[i+1] is the month before y[i]
pred_y1 = np.copy(y)
pred_y2 = np.copy(y)

#The oldest k rows are not predicted and keep the actual price
k = 3

#Model 0.1: predict each month as the previous month's price
for i in range(0, n, 1):
    if(i < n-k):
        pred_y1[i] = y[i+1]
    else:
        pred_y1[i] = y[i]
        
#Model 0.2: autoregressive model on the two preceding months
for i in range(0, n, 1):
    if (i < n-k):
        pred_y2[i] = 0.976 * y[i+1] + 0.01373 * y[i+2] + 0.1157
        #Coefficients are given by the PSO paper
    else:
        pred_y2[i] = y[i]

#Plot the 2 model predictions
ppl.plot(ax1, date, pred_y1, label=str("Model 0.1"), linewidth=0.75, c='orange')
ppl.plot(ax1, date, pred_y2, label=str("Model 0.2"), linewidth=0.75, c='green')
ppl.legend(ax1, loc='upper left')


Out[3]:
<matplotlib.legend.Legend at 0x7f89a3fe1b90>

3.2 Plot the errors as a percentage of the gold price for the 44-year period


In [4]:
fig2, ax2 = plt.subplots(1, figsize=(12,5))
#errVar_y0 = 100 * (np.absolute(pred_y0 - y) / y)
#ppl.plot(axes1, x, errVar_y0, label=str("% Error - Model 0.0"), linewidth=0.75, c='yellow')
ax2.set_title("Percentage Error \n ")
ax2.set_ylabel('Percentage')
ax2.set_xlabel('Time')

err_y1 = 100 * ((np.absolute(pred_y1-y)) / y)
err_y2 = 100 * ((np.absolute(pred_y2-y)) / y)

ppl.plot(ax2, date, err_y1, label=str("% Error - Model 0.1"), linewidth=0.75, c='orange')
ppl.plot(ax2, date, err_y2, label=str("% Error - Model 0.2"), linewidth=0.75, c='green')

ppl.legend(ax2, loc='upper left')


Out[4]:
<matplotlib.legend.Legend at 0x7f89a3f5bb90>

3.3 Error Statistics
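
The error summarized in this section is the absolute difference between predicted and actual price, expressed as a percentage of the actual price. A minimal sketch of that metric follows; the function name `absolute_percentage_error` is hypothetical and the inputs are illustrative numbers, not dataset values.

```python
import numpy as np

def absolute_percentage_error(actual, predicted):
    """Return 100 * |predicted - actual| / actual, element by element."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.abs(predicted - actual) / actual

# Illustrative numbers only
errors = absolute_percentage_error([100.0, 200.0], [110.0, 190.0])
print(errors)  # [10.  5.]
```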


In [5]:
err_df = pd.DataFrame(data=None, columns=['Model','Minimum % Error','Maximum % Error', 'Mean % Error'])
err_df['Model'] = ('Model 0.1', 'Model 0.2')
err_df['Minimum % Error'] = (min(err_y1), min(err_y2))
err_df['Maximum % Error'] = (max(err_y1), max(err_y2))
err_df['Mean % Error']= (np.mean(err_y1), np.mean(err_y2))
err_df


Out[5]:
       Model  Minimum % Error  Maximum % Error  Mean % Error
0  Model 0.1                0        36.638955      4.153630
1  Model 0.2                0        37.251134      4.284224
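
In cell In [3], the last k = 3 rows of each prediction are copies of the actual price, so their error is exactly zero. This probably contributes to the 0 minimum and pulls the mean down slightly. The sketch below shows one way to leave such placeholder rows out of the summary; `error_summary` is a hypothetical helper and the input values are illustrative.

```python
import numpy as np

def error_summary(errors, k=0):
    """Min, max and mean of `errors`, ignoring the last `k` entries."""
    errors = np.asarray(errors, dtype=float)
    scored = errors[:len(errors) - k] if k else errors
    return {'min': scored.min(), 'max': scored.max(), 'mean': scored.mean()}

# Illustrative error values; the trailing zeros mimic the k copied rows
demo = [2.0, 4.0, 6.0, 0.0, 0.0, 0.0]
print(error_summary(demo))        # includes the zero placeholders
print(error_summary(demo, k=3))   # scores only genuine predictions
```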

4. Forecast Results


In [6]:
forecast_df = pd.DataFrame(data=None, columns=['Model','October 2014','November 2014','December 2014'])
forecast_df['Model'] = ('Model 0.1', 'Model 0.2')

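#y[0] is the latest observation (September 2014) and y[1] the month before
oct_forecast = []
oct_forecast.append(y[0])    #Model 0.1: carry the last price forward
oct_forecast.append(0.976 * y[0] + 0.01373 * y[1] + 0.1157)    #Model 0.2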

nov_forecast = []
nov_forecast.append(oct_forecast[0])
nov_forecast.append(0.976 * oct_forecast[1] + 0.01373 * y[0] + 0.1157)

dec_forecast = []
dec_forecast.append(nov_forecast[0])
dec_forecast.append(0.976 * nov_forecast[1] + 0.01373 * oct_forecast[1] + 0.1157)

forecast_df['October 2014'] = oct_forecast
forecast_df['November 2014'] = nov_forecast
forecast_df['December 2014'] = dec_forecast

forecast_df


Out[6]:
       Model  October 2014  November 2014  December 2014
0  Model 0.1   1216.500000    1216.500000    1216.500000
1  Model 0.2   1205.073734    1192.970209    1181.000287
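
The three Model 0.2 forecasts above are produced by feeding each forecast back in as the newest lag. That recursion can be sketched as a loop. The function name `ar2_multistep` is hypothetical, the default coefficients are the ones used in this notebook, and the demo call uses simple made-up numbers so the arithmetic is easy to check by hand.

```python
def ar2_multistep(last, prev, steps=3, a1=0.976, a2=0.01373, c=0.1157):
    """Roll an AR(2) model forward, reusing each forecast as the next lag."""
    forecasts = []
    for _ in range(steps):
        nxt = a1 * last + a2 * prev + c
        forecasts.append(nxt)
        # The newest forecast becomes lag 1; the old lag 1 becomes lag 2
        prev, last = last, nxt
    return forecasts

# Made-up inputs and coefficients chosen for easy mental arithmetic
print(ar2_multistep(last=8.0, prev=4.0, a1=0.5, a2=0.25, c=1.0))  # [6.0, 6.0, 5.5]
```

With the notebook's data, `last` would be `y[0]` and `prev` would be `y[1]`.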
