Calvin Ku September 27, 2016
The problem with trading is that you never know the best time to buy or sell a stock, since you never know whether the price will go up or down in the future. This simple trading bot is an attempt to tackle that problem.
Given the historical data of a stock, our chimp will tell you whether you should buy, sell, or hold a particular stock today (in our case, JPM).
The only data used in this project is the JPM historical data collected from Yahoo Finance, ranging from December 30, 1983 to September 27, 2016. We don't use the S&P 500 ETF because ETFs are generally arbitrageable, which can render the techniques we use in this project (namely, VPA) useless.
This project is about building a trading robot, which we will call the Chimp. The Chimp is built to give the common user suggestions on whether to buy, sell, or hold a particular stock on a particular trading day. The goal of this project is to build a trading robot that can beat a random monkey bot. Inspired by Princeton University professor Burton Malkiel's famous 1973 claim that "a blindfolded monkey throwing darts at a newspaper's financial pages could select a portfolio that would do just as well as one carefully selected by experts" and the Forbes article Any Monkey Can Beat the Market, we set our battlefield on JPM instead of competing on a portfolio basis.
We will use JPM as an example in this project, but the same method can be applied to any stock. In the end we will evaluate our method by giving the monkey bot (which picks its actions uniformly at random) and our Chimp 1,000 dollars each and seeing how they perform on JPM from September 26, 2011 to September 27, 2016.
In this project we use the total assets, i.e. the cash in hand plus the portfolio value (number of shares in hand times the market price), as the metric. We define the reward function as the relative change in total assets from one state to the next, i.e.: $$ R(s_i) = \frac{Cash(s_{i + 1}) + PV(s_{i + 1}) - Cash(s_i) - PV(s_i)}{Cash(s_i) + PV(s_i)} $$
This simple metric is in line with what we want the trading bot to achieve: our ultimate goal is to make as much profit as possible given what we have put into the market, and it doesn't matter whether it's held in cash or in shares.
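As a minimal sketch of this metric (with made-up cash and share numbers, not the Chimp's actual bookkeeping), the one-step reward can be computed like this:
def step_reward(prev_cash, prev_pv, cash, pv):
    # Relative change in total assets (cash + portfolio value) between two states
    prev_assets = prev_cash + prev_pv
    return ((cash + pv) - prev_assets) / prev_assets

# e.g. holding 27 shares while the trade price moves from 36.61 to a hypothetical 37.00
print(step_reward(11.53, 27 * 36.61, 11.53, 27 * 37.00))  # ~0.0105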
In [57]:
from __future__ import division
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import time
import random
from collections import defaultdict
from sklearn.ensemble import RandomForestRegressor
from copy import copy, deepcopy
from numba import jit
pd.set_option('display.max_columns', 50)
dfSPY = pd.read_csv('allSPY.csv', index_col='Date', parse_dates=True, na_values = ['nan'])
dfJPM = pd.read_csv('JPM.csv', index_col='Date', parse_dates=True, na_values = ['nan'])
# del dfSPY.index.name
del dfJPM.index.name
# display(dfSPY)
start_date = '1983-12-30'
end_date = '2016-09-27'
dates = pd.date_range(start_date, end_date)
dfMain = pd.DataFrame(index=dates)
# dfMain = dfMain.join(dfSPY)
dfMain = dfMain.join(dfJPM)
dfMain.dropna(inplace=True)
print("Start date: {}".format(dfMain.index[0]))
print("End date: {}\n".format(dfMain.index[-1]))
print(dfMain.describe())
In [58]:
print("\nInspect missing values:")
display(dfMain.isnull().sum())
Since we won't be using data prior to 1993 for training, we can use SPY (S&P 500 ETF) to get trading days and see if we have any data missing for JPM.
In [61]:
spy_dates = pd.date_range('1993-01-29', end_date)
dfSPY = dfSPY.ix[spy_dates, :]
dfSPY.dropna(inplace=True)
print("Number of trading days: {}".format(len(dfSPY)))
print("Number of days where JPM are traded: {}".format(len(dfMain.ix[dfSPY.index, :])))
Looks good. Let's look at the first few lines:
In [3]:
display(dfMain.head())
We can see that we have six columns: Open, High, Low, Close, Volume, Adj Close. The Adj Close is the closing price of that day adjusted for "future" dividend payouts and splits. For our usage, we will need to adjust the rest of the columns as well.
In [4]:
dfMain['Adj Close'].plot(figsize=(14, 7))
Out[4]:
From the beginning, the stock price shows a generally upward trend, with a rough stretch from 2001 to 2003 and the crash at the end of 2008.
Now we can take a look at the correlations between the variables:
In [5]:
g = sns.PairGrid(dfMain)
g.map_upper(plt.scatter, alpha=0.3)
g.map_lower(plt.scatter, alpha=0.3)
g.map_diag(sns.kdeplot, lw=3, legend=False)
plt.subplots_adjust(top=0.95)
g.fig.suptitle('Feature Pair Grid')
Out[5]:
We can clearly see several distinct lines in the Adj Close rows and columns. This is because the Adj Close variable is adjusted for splits and dividend payouts. From here we know we'll need to adjust the other variables to match it.
The trading bot (from now on referred to as the Chimp, coined with our wishful expectation that she will be smarter than the Monkey Trader) consists of two parts. In the first part we implement Q-learning and run it through the historical data for a number of iterations to construct the Q-table. The Chimp can then follow the optimal policy by taking, in each state, the action whose state-action pair has the maximum Q value. However, since the state space is vast and the coverage of the Q-table is very small, in the second part we use supervised learning to train on the Q-table and make predictions for unseen states.
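Here is a minimal sketch of that second part, using a toy Q-table instead of the real one; the regressor mirrors the RandomForestRegressor used later in the project:
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy Q-table: each row is a discretized state plus an encoded action (last column)
X_visited = np.array([[3, 1, -2, 1],   # state features..., action = Buy (1)
                      [3, 1, -2, 2],   # same state, action = Sell (2)
                      [4, 2, -1, 1]])
q_values = np.array([0.012, -0.004, 0.020])

reg = RandomForestRegressor(n_estimators=128, random_state=0).fit(X_visited, q_values)
print(reg.predict([[4, 2, -1, 2]]))   # Q estimate for an unseen state-action pair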
The core idea of reinforcement learning is simple: an agent interacts with its environment, collects rewards, and learns to choose the actions that maximize its long-run reward.
One way to implement this idea is a method called Q-learning. At the core of Q-learning is the Bellman equation.
In each iteration we use the Bellman equation to update a cell of our Q-table: $$ Q(s, a) \longleftarrow (1 - \alpha) \, Q(s, a) + \alpha \, \left( R(s) + \gamma \max_{a'} Q(s', a') \right) $$
where $(s, a)$ is the state-action pair, $\alpha$ the learning rate, $R(s)$ the reward function, and $\gamma$ the discount factor. The Chimp then follows the policy: $$ \pi(s) = \operatorname*{argmax}_{a} Q(s, a) $$
Although we don't have the Q value of any state-action pair to begin with, the reward function contains the "real" information, and throughout the iterations that information slowly propagates to each state-action pair. At some point the Q values will (hopefully) converge to a practical level, and we end up with a Q-table in pseudo-equilibrium.
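A minimal sketch of the tabular update, assuming a fixed learning rate alpha for simplicity (the Chimp's own q_update, shown later, replaces it with a visit-count rate of 1/(t+1)):
from collections import defaultdict

alpha, gamma = 0.2, 0.75    # hypothetical fixed learning rate; gamma as in the project
Q = defaultdict(float)      # (state, action) -> Q value, defaults to 0

def q_learning_update(state, action, reward, next_state, actions):
    # One Bellman backup on the tabular Q estimate
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] = (1 - alpha) * Q[(state, action)] + alpha * (reward + gamma * best_next)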
However, there's a catch. How does the Chimp know what to do before she has learned anything?
One important concept in reinforcement learning is the exploration-exploitation dilemma. Essentially, when we take an action we have to choose between exploring new possibilities and exploiting what we currently know to be best. And of course, if we don't know much, following that limited knowledge of ours wouldn't make much sense. On the other hand, if we already know pretty much everything, there's not much left to explore, and wandering around mindlessly wouldn't make sense either.
To implement this concept, we want our Chimp to set out without bias (not that she's got much anyway), so we introduce the variable $\epsilon$, which represents the probability of the Chimp taking a random action. Initially we set $\epsilon = 1$ and gradually decrease its value as the Chimp gets to know more and more about her environment. As time goes by, the Chimp becomes wiser and wiser, spending most of her time following what's best for her and less time exploring.
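A minimal sketch of epsilon-greedy action selection with linear decay; q_lookup is a hypothetical stand-in for the Q-table, and the 0.000001 floor mirrors the one in ChimpBot.reset() later:
import random

epsilon, n_rounds = 1.0, 5000   # start fully random; one decay step per simulated round

def choose_action(state, valid_actions, q_lookup):
    if random.random() < epsilon:
        return random.choice(valid_actions)                       # explore
    return max(valid_actions, key=lambda a: q_lookup(state, a))   # exploit

# after each full pass over the data:
epsilon = max(epsilon - 1.0 / n_rounds, 0.000001)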
For the supervised learning part, we will use a random forest. The random forest doesn't expect linear features or features that interact linearly. It has the advantage of a decision tree, which can generally fit any shape of data, while its ensemble nature eliminates a single decision tree's tendency to overfit; the random and ensemble nature of the algorithm makes it very unlikely to overfit on the training data. Since we are combining supervised learning with reinforcement learning, the problem gets more complicated and we have more parameters to tune, so it's good to have an algorithm that works "out of the box" most of the time. Random forests also handle high dimensionality very well, which makes feature engineering a bit easier: we don't need to worry too much about whether a poor result comes from unrepresentative newly engineered features or from the algorithm failing to handle the enlarged dimensionality. In addition, a random forest is very easy to tune; we can simply grid search over the number of features considered at each split and the number of trees. The ensemble nature of the algorithm also makes it scalable when we need to (1) train on more data, (2) build more trees, or (3) take more stocks into account. Overall, random forests generally give good results, and it has been recognized in the Kaggle community over the years that ensemble algorithms like the random forest outperform many traditional algorithms, which has made them quite popular.
The random forest is an ensemble method, which "ensembles" a bunch of decision trees. Each decision tree is generated by creating "nodes" with features in our dataset.
In the training stage, a data point comes down through the nodes of the decision tree. Each node examines one feature of the data point and sends it to the next node accordingly. Say, for example, we are classifying people to determine whether their annual income is above or below average, and one feature of our data is gender, with values like male/female/other. If this data point is a female, it gets sent down the corresponding path to the next node. The nodes at the bottom of the decision tree are sometimes referred to as leaves. Our data point will end up in one of the leaves, and the label of the data point marks that leaf; for example, if our data point is above the income average, we mark that leaf as "above count +1". At the end of training, all leaves are marked with the labels above or below.
In the predicting stage, we run data down the decision tree and see which leaf each data point ends up in, then assign it the label of that leaf accordingly.
We now know how each decision tree is constructed and have a forest of decision trees. The last step is to get these decision trees to vote. If 10 trees say this person is above and 5 say below, we predict the person as above.
As said earlier, the decision trees are constructed from the features of our dataset. However, not every tree sees all of the data. This is where the random part of the algorithm comes in. Most implementations employ a method called bagging, which generates $m$ sub-datasets from the original dataset by sampling with replacement, where the size of each sub-dataset is $n'$ relative to the size $n$ of the original dataset. Each bagged sample is then used to construct one decision tree; a random forest additionally considers only a random subset of the features when splitting each node.
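A minimal sketch of bagging over rows; note that the project itself relies on sklearn's RandomForestRegressor, which additionally samples candidate features at each split (max_features='sqrt'):
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_bagged_trees(X, y, m, n_prime, random_state=0):
    # Fit m trees, each on n_prime rows drawn from (X, y) with replacement
    rng = np.random.RandomState(random_state)
    trees = []
    for _ in range(m):
        idx = rng.randint(0, len(X), size=n_prime)
        trees.append(DecisionTreeRegressor(random_state=random_state).fit(X[idx], y[idx]))
    return trees

def predict_bagged(trees, X):
    # The ensemble "votes" by averaging the individual predictions
    return np.mean([t.predict(X) for t in trees], axis=0)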
We won't cover everything about random forests here, but it's worth noting some of the more important specifics that we don't cover:
We shall set three different benchmarks here. One theoretical, and two practical.
Since our goal is to make as much money as possible, the best role model we can have is a God Chimp: a Chimp that can foresee the future and trade accordingly. In our case, this is not very hard to do. We can instantiate a Chimp object and have her iterate through the entire dataset until the whole Q-table converges. With that Q-table in hand, we can extract the pseudo-perfect action series, which makes a good deal of money. We then compute the accuracy of our mortal Chimp's action series against the God Chimp's. Theoretically speaking, the higher the accuracy, the closer the mortal Chimp's behavior is to the God Chimp's, and the more money the mortal Chimp should make.
That said, the ups and downs of the stock market are not uniformly distributed. This means our mortal Chimp could have a decent accuracy against the God Chimp but, say, screw up most of the important parts and therefore not do that well. Conversely, our mortal Chimp may appear terrible at mimicking the God Chimp but still make a lot of money. So we will need some practical benchmarks that are more down to earth.
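A toy illustration of the accuracy metric (made-up action series, not actual output): the accuracy is just the fraction of days where the mortal Chimp's action matches the God Chimp's.
from sklearn.metrics import accuracy_score

god_actions    = ['Buy', 'Buy', 'Sell', 'Buy', 'Sell']
mortal_actions = ['Buy', 'Sell', 'Sell', 'Buy', 'Buy']
print(accuracy_score(god_actions, mortal_actions))   # 0.6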
We shall test our Chimp against 100,000 random Monkeys and the Patient Trader (PT) we defined earlier. Since these two naive traders are neither influenced by the media nor manipulated by the market makers, they tend to perform better than the average investor. We are happy as long as our Chimp can beat the Monkey, which means she is at least better than chance (and therefore better than the average person), and it would be great if she could also beat the PT. However, beating the PT generally means beating the market, which isn't easy to do, so we won't expect too much there.
In [6]:
display(dfMain.head())
As said earlier, we need to adjust Open, High, Low, and Volume. This can be done by computing the adjustment factor, i.e. dividing Adj Close by Close. We then multiply the prices by this factor and divide the volume by it.
In [7]:
# Adjust Open, High, Low, Volume
dfMain['Adj Factor'] = dfMain['Adj Close'] / dfMain['Close']
dfMain['Open'] = dfMain['Open'] * dfMain['Adj Factor']
dfMain['High'] = dfMain['High'] * dfMain['Adj Factor']
dfMain['Low'] = dfMain['Low'] * dfMain['Adj Factor']
dfMain['Volume'] = dfMain['Volume'] / dfMain['Adj Factor']
dfMain.drop(['Adj Factor'], axis=1, inplace=True)
display(dfMain.head())
display(dfMain.tail())
Volume price analysis (VPA) has been around for over 100 years, and many legendary traders have made themselves famous (and wealthy) using it. In addition, the basic principle behind it makes sense on its own.
But then people, especially practitioners, tend to think of it as an art rather than a science: even though you have some clues about what's going on in the market, you still don't know the best timing, and it takes practice upon practice until you "get it".
For us data scientists, everything is science, including art. If a human can stare at candlesticks and tell you when to buy or sell, so can a computer. Thus the following features are extracted from the raw dataset:
For volume:
For price:
For wick:
where -nd denotes n days in the past.
We choose 5, 10, 21, and 63 days because these are common time ranges in technical analysis: 5 days correspond to one trading week, 10 to two, 21 days to one trading month, and 63 to three. We don't want to explode our feature space, so to start with we use the most recent 5-day data plus the longer-term averages.
Spread and wicks are terms to describe the status of the candlestick chart (see below).
The spread describes the thick body of the candlestick, which shows the difference between the opening price and the closing price. The time frame (in terms of opening/closing) can range from minutes to months depending on what we want to look at (in our case, one day). The wicks are the thin lines extending from the top and the bottom, which show whether any shares traded at prices beyond the opening/closing prices during the day (or whatever time frame is of interest). As shown in the picture, candlestick charts use white or black bodies to indicate the direction of the price movement, with white meaning $\text{closing price} > \text{opening price}$ and vice versa. A candle can have an upper wick and/or a lower wick, or none at all.
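A tiny worked example with made-up OHLC numbers, mirroring the spread/wick lengths computed in the feature-engineering code below:
o, h, low, c = 36.0, 37.5, 35.5, 37.0   # hypothetical open/high/low/close
spread = abs(c - o)                     # candle body: 1.0
upper_wick = h - max(o, c)              # 0.5: trades above both open and close
lower_wick = min(o, c) - low            # 0.5: trades below both
print(spread, upper_wick, lower_wick)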
Note that to implement Q-learning we need to make the variables discrete. We divide the above features by their 100-day maximum and 100-day average bases to get relative levels. For example, a day whose volume is 0.6 of its 100-day average base falls in the [0.5, 0.75) bucket and is encoded as level 3.
We set the trading price of each trading day to be the Adjusted Close: $$ \text{Trade Price} = \text{Adj Close} $$ This information is not available to the Chimp as a state feature. The Chimp's properties get updated with it when she places an order, and the portfolio value is also updated using this price.
In [8]:
# Price Engineering
# Get opens
period_list = [1, 2, 3, 4, 5, 10, 21, 63, 100]
for x in period_list:
dfMain['-' + str(x) + 'd_Open'] = dfMain['Open'].shift(x)
# Get adjCloses
period_list = xrange(1, 5 + 1)
for x in period_list:
dfMain['-' + str(x) + 'd_adjClose'] = dfMain['Adj Close'].shift(x)
# Get highs
period_list1 = xrange(1, 5 + 1)
for x in period_list1:
dfMain['-' + str(x) + 'd_High'] = dfMain['High'].shift(x)
period_list2 = [10, 21, 63, 100]
for x in period_list2:
dfMain[str(x) + 'd_High'] = dfMain['High'].shift().rolling(window=x).max()
# Get lows
period_list1 = xrange(1, 5 + 1)
for x in period_list1:
dfMain['-' + str(x) + 'd_Low'] = dfMain['Low'].shift(x)
period_list2 = [10, 21, 63, 100]
for x in period_list2:
dfMain[str(x) + 'd_Low'] = dfMain['Low'].shift().rolling(window=x).min()
In [9]:
# Get Volume Bases
dfMain['100d_Avg_Vol'] = dfMain['Volume'].shift().rolling(window=100).mean() * 1.5
dfMain['100d_Max_Vol'] = dfMain['Volume'].shift().rolling(window=100).max()
# Get Spread Bases
dfMain['Abs_Spread'] = np.abs(dfMain['Adj Close'] - dfMain['Open'])
dfMain['Abs_Spread_Shift1'] = dfMain['Abs_Spread'].shift()
dfMain['100d_Avg_Spread'] = dfMain['Abs_Spread_Shift1'].rolling(window=100).mean() * 1.5
dfMain['100d_Max_Spread'] = dfMain['100d_High'] - dfMain['100d_Low']
dfMain.drop(['Abs_Spread_Shift1', 'Abs_Spread'], axis=1, inplace=True)
display(dfMain.tail())
display(dfMain.ix[datetime(2011, 12, 30)][['Open', 'Adj Close']])
In [10]:
@jit
def relative_transform(num):
if 0 <= num < 0.25:
return 1
elif 0.25 <= num < 0.5:
return 2
elif 0.5 <= num < 0.75:
return 3
elif 0.75 <= num < 1:
return 4
elif 1 <= num:
return 5
elif -0.25 <= num < 0:
return -1
elif -0.5 <= num < -0.25:
return -2
elif -0.75 <= num < -0.5:
return -3
elif -1 <= num < -0.75:
return -4
elif num < -1:
return -5
else:
return num
# Volume Engineering
# Get volumes
period_list = xrange(1, 5 + 1)
for x in period_list:
dfMain['-' + str(x) + 'd_Vol'] = dfMain['Volume'].shift(x)
# Get avg. volumes
period_list = [10, 21, 63]
for x in period_list:
dfMain[str(x) + 'd_Avg_Vol'] = dfMain['Volume'].shift().rolling(window=x).mean()
# Get relative volumes 1
period_list = range(1, 5 + 1)
for x in period_list:
dfMain['-' + str(x) + 'd_Vol1'] = dfMain['-' + str(x) + 'd_Vol'] / dfMain['100d_Avg_Vol']
dfMain['-' + str(x) + 'd_Vol1'] = dfMain['-' + str(x) + 'd_Vol1'].apply(relative_transform)
# Get relative avg. volumes 1
period_list = [10, 21, 63]
for x in period_list:
dfMain[str(x) + 'd_Avg_Vol1'] = dfMain[str(x) + 'd_Avg_Vol'] / dfMain['100d_Avg_Vol']
dfMain[str(x) + 'd_Avg_Vol1'] = dfMain[str(x) + 'd_Avg_Vol1'].apply(relative_transform)
# Get relative volumes 2
period_list = xrange(1, 5 + 1)
for x in period_list:
dfMain['-' + str(x) + 'd_Vol2'] = dfMain['-' + str(x) + 'd_Vol'] / dfMain['100d_Max_Vol']
dfMain['-' + str(x) + 'd_Vol2'] = dfMain['-' + str(x) + 'd_Vol2'].apply(relative_transform)
In [11]:
# Spread Engineering
# Get spread
period_list1 = xrange(1, 5 + 1)
period_list2 = [10, 21, 63]
for x in period_list1:
dfMain['-' + str(x) + 'd_Spread'] = dfMain['-' + str(x) + 'd_adjClose'] - dfMain['-' + str(x) + 'd_Open']
for x in period_list2:
dfMain[str(x) + 'd_Spread'] = dfMain['-1d_adjClose'] - dfMain['-' + str(x) + 'd_Open']
# Get relative spread
period_list1 = xrange(1, 5 + 1)
period_list2 = [10, 21, 63]
for x in period_list1:
dfMain['-' + str(x) + 'd_Spread'] = dfMain['-' + str(x) + 'd_Spread'] / dfMain['100d_Avg_Spread']
dfMain['-' + str(x) + 'd_Spread'] = dfMain['-' + str(x) + 'd_Spread'].apply(relative_transform)
for x in period_list2:
dfMain[str(x) + 'd_Spread'] = dfMain[str(x) + 'd_Spread'] / dfMain['100d_Max_Spread']
dfMain[str(x) + 'd_Spread'] = dfMain[str(x) + 'd_Spread'].apply(relative_transform)
display(dfMain[['-1d_Spread', '-2d_Spread', '-3d_Spread', '-4d_Spread', '-5d_Spread', '21d_Spread']].tail())
In [12]:
# Get wicks
@jit
def upperwick(open, adj_close, high):
# The day traded above both the open and the close
return high > open and high > adj_close
@jit
def lowerwick(open, adj_close, low):
# The day traded below both the open and the close
return low < open and low < adj_close
start_time = time.time()
period_list1 = xrange(1, 5 + 1)
period_list2 = [10, 21, 63, 100]
for x in period_list1:
dfMain.ix[:, '-' + str(x) + 'd_upperwick_bool'] = dfMain.apply(lambda row: upperwick(row['-' + str(x) + 'd_Open'], row['-' + str(x) + 'd_adjClose'], row['-' + str(x) + 'd_High']), axis=1)
dfMain.ix[:, '-' + str(x) + 'd_lowerwick_bool'] = dfMain.apply(lambda row: lowerwick(row['-' + str(x) + 'd_Open'], row['-' + str(x) + 'd_adjClose'], row['-' + str(x) + 'd_Low']), axis=1)
for x in period_list2:
dfMain.ix[:, str(x) + 'd_upperwick_bool'] = dfMain.apply(lambda row: upperwick(row['-' + str(x) + 'd_Open'], row['-1d_adjClose'], row[str(x) + 'd_High']), axis=1)
dfMain.ix[:, str(x) + 'd_lowerwick_bool'] = dfMain.apply(lambda row: lowerwick(row['-' + str(x) + 'd_Open'], row['-1d_adjClose'], row[str(x) + 'd_Low']), axis=1)
print("Getting wicks took {} seconds.".format(time.time() - start_time))
In [13]:
@jit
def get_upperwick_length(open, adj_close, high):
return high - max(open, adj_close)
@jit
def get_lowerwick_length(open, adj_close, low):
return min(open, adj_close) - low
start_time = time.time()
# Transform upper wicks
period_list1 = xrange(1, 5 + 1)
period_list2 = [10, 21, 63]
for x in period_list1:
has_upperwicks = dfMain['-' + str(x) + 'd_upperwick_bool']
has_lowerwicks = dfMain['-' + str(x) + 'd_lowerwick_bool']
dfMain.loc[has_upperwicks, '-' + str(x) + 'd_upperwick'] = dfMain.loc[has_upperwicks, :].apply(lambda row: get_upperwick_length(row['-' + str(x) + 'd_Open'], row['-' + str(x) + 'd_adjClose'], row['-' + str(x) + 'd_High']), axis=1)
dfMain.loc[has_lowerwicks, '-' + str(x) + 'd_lowerwick'] = dfMain.loc[has_lowerwicks, :].apply(lambda row: get_lowerwick_length(row['-' + str(x) + 'd_Open'], row['-' + str(x) + 'd_adjClose'], row['-' + str(x) + 'd_Low']), axis=1)
# Get relative upperwick length
dfMain.loc[dfMain['-' + str(x) + 'd_upperwick_bool'], '-' + str(x) + 'd_upperwick'] = dfMain.loc[dfMain['-' + str(x) + 'd_upperwick_bool'], '-' + str(x) + 'd_upperwick'] / dfMain.loc[dfMain['-' + str(x) + 'd_upperwick_bool'], '100d_Avg_Spread']
# Get relative lowerwick length
dfMain.loc[dfMain['-' + str(x) + 'd_lowerwick_bool'], '-' + str(x) + 'd_lowerwick'] = dfMain.loc[dfMain['-' + str(x) + 'd_lowerwick_bool'], '-' + str(x) + 'd_lowerwick'] / dfMain.loc[dfMain['-' + str(x) + 'd_lowerwick_bool'], '100d_Avg_Spread']
# Transform upperwick ratio to int
dfMain.loc[dfMain['-' + str(x) + 'd_upperwick_bool'], '-' + str(x) + 'd_upperwick'] = dfMain.loc[dfMain['-' + str(x) + 'd_upperwick_bool'], '-' + str(x) + 'd_upperwick'].apply(relative_transform)
# Transform lowerwick ratio to int
dfMain.loc[dfMain['-' + str(x) + 'd_lowerwick_bool'], '-' + str(x) + 'd_lowerwick'] = dfMain.loc[dfMain['-' + str(x) + 'd_lowerwick_bool'], '-' + str(x) + 'd_lowerwick'].apply(relative_transform)
# Assign 0 to no-upperwick days
dfMain.loc[np.logical_not(dfMain['-' + str(x) + 'd_upperwick_bool']), '-' + str(x) + 'd_upperwick'] = 0
# Assign 0 to no-lowerwick days
dfMain.loc[np.logical_not(dfMain['-' + str(x) + 'd_lowerwick_bool']), '-' + str(x) + 'd_lowerwick'] = 0
for x in period_list2:
has_upperwicks = dfMain[str(x) + 'd_upperwick_bool']
has_lowerwicks = dfMain[str(x) + 'd_lowerwick_bool']
dfMain.loc[has_upperwicks, str(x) + 'd_upperwick'] = dfMain.loc[has_upperwicks, :].apply(lambda row: get_upperwick_length(row['-' + str(x) + 'd_Open'], row['-1d_adjClose'], row[str(x) + 'd_High']), axis=1)
dfMain.loc[has_lowerwicks, str(x) + 'd_lowerwick'] = dfMain.loc[has_lowerwicks, :].apply(lambda row: get_lowerwick_length(row['-' + str(x) + 'd_Open'], row['-1d_adjClose'], row[str(x) + 'd_Low']), axis=1)
# Get relative upperwick length
dfMain.loc[dfMain[str(x) + 'd_upperwick_bool'], str(x) + 'd_upperwick'] = dfMain.loc[dfMain[str(x) + 'd_upperwick_bool'], str(x) + 'd_upperwick'] / dfMain.loc[dfMain[str(x) + 'd_upperwick_bool'], '100d_Avg_Spread']
# Get relative lowerwick length
dfMain.loc[dfMain[str(x) + 'd_lowerwick_bool'], str(x) + 'd_lowerwick'] = dfMain.loc[dfMain[str(x) + 'd_lowerwick_bool'], str(x) + 'd_lowerwick'] / dfMain.loc[dfMain[str(x) + 'd_lowerwick_bool'], '100d_Avg_Spread']
# Transform upperwick ratio to int
dfMain.loc[dfMain[str(x) + 'd_upperwick_bool'], str(x) + 'd_upperwick'] = dfMain.loc[dfMain[str(x) + 'd_upperwick_bool'], str(x) + 'd_upperwick'].apply(relative_transform)
# Transform lowerwick ratio to int
dfMain.loc[dfMain[str(x) + 'd_lowerwick_bool'], str(x) + 'd_lowerwick'] = dfMain.loc[dfMain[str(x) + 'd_lowerwick_bool'], str(x) + 'd_lowerwick'].apply(relative_transform)
# Assign 0 to no-upperwick days
dfMain.loc[np.logical_not(dfMain[str(x) + 'd_upperwick_bool']), str(x) + 'd_upperwick'] = 0
# Assign 0 to no-lowerwick days
dfMain.loc[np.logical_not(dfMain[str(x) + 'd_lowerwick_bool']), str(x) + 'd_lowerwick'] = 0
print("Transforming wicks took {} seconds.".format(time.time() - start_time))
In [14]:
display(dfMain[['-1d_lowerwick', '-2d_lowerwick', '-3d_lowerwick', '-4d_lowerwick', '-5d_lowerwick', '10d_lowerwick', '21d_lowerwick', '63d_lowerwick']].isnull().sum())
In [15]:
display(dfMain.head())
display(dfMain.tail())
In [16]:
dfMain['Trade Price'] = dfMain['Adj Close']
print(dfMain[['Trade Price', 'Open', 'Adj Close']].head())
In [17]:
display(dfMain.columns)
In [18]:
dfMain.drop(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume', \
'-1d_Vol', '-2d_Vol', '-3d_Vol', '-4d_Vol', '-5d_Vol', '10d_Avg_Vol', '21d_Avg_Vol', '63d_Avg_Vol', \
'-1d_Open', '-2d_Open', '-3d_Open', '-4d_Open', '-5d_Open', '-10d_Open', '-21d_Open', '-63d_Open', '-100d_Open', \
'-1d_adjClose', '-2d_adjClose', '-3d_adjClose', '-4d_adjClose', '-5d_adjClose', \
'-1d_High', '-2d_High', '-3d_High', '-4d_High', '-5d_High', '10d_High', '21d_High', '63d_High', '100d_High', \
'-1d_Low', '-2d_Low', '-3d_Low', '-4d_Low', '-5d_Low', '10d_Low', '21d_Low', '63d_Low', '100d_Low', \
'100d_Avg_Vol', '100d_Max_Vol', '100d_Avg_Spread', '100d_Max_Spread', \
'-1d_upperwick_bool', '-2d_upperwick_bool', '-3d_upperwick_bool', '-4d_upperwick_bool', '-5d_upperwick_bool', '10d_upperwick_bool', '21d_upperwick_bool', '63d_upperwick_bool', '100d_upperwick_bool', \
'-1d_lowerwick_bool', '-2d_lowerwick_bool', '-3d_lowerwick_bool', '-4d_lowerwick_bool', '-5d_lowerwick_bool', '10d_lowerwick_bool', '21d_lowerwick_bool', '63d_lowerwick_bool', '100d_lowerwick_bool'], \
axis=1, inplace=True)
In [19]:
display(dfMain.columns)
dfMain.dropna(inplace=True)
len(dfMain.columns)
Out[19]:
In [22]:
data_full = copy(dfMain)
In [23]:
display(data_full.head())
display(data_full.tail())
The problem with time series data, in contrast to cross-sectional data, is that we cannot rely on conventional methods such as cross-validation or the usual 4:3:3 train-cv-test framework. All of these methods assume that our data are drawn from the same population, so that a carefully selected sample (or samples) with proper modelling can say a lot about the entire population (and, of course, about the other carefully drawn samples). We are not so lucky with time series data, mostly because:
Most if not all of the time, our model isn't really the underlying model; that is, the data doesn't actually come from the model we use. So it works only for a limited range of time, and as the time series gets longer, the difference between our model fitted on the training set and the underlying model starts to show.
Essentially, the system we are looking at is time-dependent, so the underlying model itself most likely changes from time to time (unless we're talking about some grand unified model that can describe the whole universe), in which case assuming one model structure will work on the entire period of data is just wishful thinking.
That said, a lot of the time we hope that in the course of our research we can find some "invariants" (or at least pseudo-invariants) in our data that don't change as time goes by. Still, we don't know whether they are out there.
As said above, we will thus employ a training-testing framework that rolls forward in time. In our case, we keep 35 trading months of data for training (how this is determined is shown later) and use the model to predict the next 7 days; since we are probably more interested in JPM's performance in recent years, we will proceed as follows:
We start off setting our parameters as follows:
iter_number = 5000
Trading price—as mentioned earlier, we assume the trading price to be the same as the Adjusted Close.
No transaction cost—this can simplify the problem so that we can focus on modelling.
Two actions—we limit ourselves to only two actions, buy and sell. Again, since there's no transaction cost, buying when there's no cash is equivalent to holding (and similarly, selling with no shares). Limiting the size of the action space makes it easier for our Q values to converge.
As mentioned above, we will use a roll-forward training framework to build our Chimp. We will first give it a few tries and fine-tune our parameters on the validation dataset; we shall call this the validation phase. Then we will move on to the test dataset, which will be referred to as the test phase.
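A minimal sketch of this roll-forward scheme, with hypothetical row counts (the real run trains on about 35 trading months, roughly 21 rows per month, and predicts in 7-day batches):
train_rows, test_rows = 35 * 21, 7   # ~35 trading months for training, 7-day batches

def roll_forward_splits(n_rows, start):
    # Yield (train_range, test_range) index pairs, rolling forward 7 rows at a time
    for d in range(start, n_rows - test_rows + 1, test_rows):
        yield (d - train_rows, d), (d, d + test_rows)

for train_rng, test_rng in roll_forward_splits(800, 750):
    print("train rows {0}:{1} -> predict rows {2}:{3}".format(*(train_rng + test_rng)))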
We will set up our two practical benchmarks, the Monkey and the Patient Trader, for the two phases for comparison:
In [24]:
validation_start_date = datetime(2006, 9, 25)
validation_end_date = datetime(2011, 9, 27)
test_start_date = datetime(2011, 9, 26)
test_end_date = datetime(2016, 9, 27)
print("Validation phase")
print("{0} Trade Price: {1}".format(validation_start_date, data_full.ix[validation_start_date, 'Trade Price']))
print("{0} Trade Price: {1}".format(validation_end_date, data_full.ix[validation_end_date, 'Trade Price']))
validation_phase_data = data_full.ix[validation_start_date:validation_end_date, :]
print("Number of dates in validation dataset: {}\n".format(len(validation_phase_data)))
print("Test phase")
print("{0} Trade Price: {1}".format(test_start_date, data_full.ix[test_start_date, 'Trade Price']))
print("{0} Trade Price: {1}".format(test_end_date, data_full.ix[test_end_date, 'Trade Price']))
test_phase_data = data_full.ix[test_start_date:test_end_date, :]
print("Number of dates in test dataset: {}".format(len(test_phase_data)))
The Patient Trader (PT) simply buys as many shares as possible on the first day and holds them to the end:
$Cash_{init} = 1000.00$
$Share_{init} = 0$
$PV_{init} = 0$
$Trading \ Price_{init} = 36.61$
$Share_{start} = floor(\frac{1000.00}{36.61}) = 27$
$PV_{start} = 36.61 \cdot 27 = 988.47$
$Cash_{start} = Cash_{init} - PV_{start} = 1000.00 - 988.47 = 11.53$
$Total \ Assets_{start} = Cash_{start} + PV_{start} = 1000.00$
$Cash_{end} = 11.53$ $Share_{end} = 27$ $Trading \ Price_{end} = 27.42$
$PV_{end} = 27.42 \cdot 27 = 740.34$
$Total \ Assets_{end} = Cash_{end} + PV_{end} = 11.53 + 740.34 = 751.87$
We can calculate the annual ROI by solving the following equation for $r_{val}$: $$ (1 + r_{val})^{1260/252} = 0.7519 $$
$$ \Longrightarrow \boxed{r_{val} = -0.05543464 \approx -5.54\%} $$ Similarly, we will have $$ \boxed{r_{test} = 0.1912884 \approx 19.13\%} $$
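A quick check of this arithmetic (five years of roughly 252 trading days each):
# Validation phase: total assets fell to 0.7519 of the start over 1260 trading days
r_val = 0.7519 ** (252.0 / 1260.0) - 1
print(r_val)   # ~ -0.0554, i.e. about -5.54% per year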
We use a MonkeyBot class which places exactly one random order every day. We iterate it through the chosen time frame 100,000 times and get the following distributions:
In [25]:
import random
import time
from copy import deepcopy
class MonkeyBot(object):
def __init__(self, dfEnv, cash=1000, share=0, pv=0, now_yes_share=0, random_state=0):
random.seed(random_state)
self.cash = cash
self.share = share
self.pv = pv
self.pv_history_list = []
self.action_list = []
self.env = deepcopy(dfEnv)
def buy(self, stock_price):
num_affordable = int(self.cash // stock_price)
self.cash = self.cash - stock_price * num_affordable
self.share = self.share + num_affordable
self.pv = stock_price * self.share
self.action_list.append('Buy')
def sell(self, stock_price):
self.cash = self.cash + stock_price * self.share
self.pv = 0
self.share = 0
self.action_list.append('Sell')
def hold(self, stock_price):
self.pv = stock_price * self.share
self.action_list.append('Hold')
def reset(self):
self.cash = 1000
self.share = 0
self.pv = 0
def yes_share(self):
# Represent chimp asset in state_action
if self.share > 0:
return 1
else:
return 0
def make_decision(self, x):
# Note: only Buy (1) and Sell (2) are sampled, mirroring the two-action setup
# described later; with no transaction costs a redundant Buy/Sell acts as a
# Hold, so the Hold branch below is unreachable.
random_choice = random.choice([1, 2])
if random_choice == 0:
self.hold(x)
elif random_choice == 1:
self.buy(x)
else:
self.sell(x)
return self.pv # for frame-wise operation
def simulate(self, iters):
for i in range(iters):
self.env['Monkey PV'] = self.env['Trade Price'].apply(self.make_decision)
self.pv_history_list.append(self.env.ix[-1, 'Monkey PV'] + self.cash)
self.reset()
In [26]:
monkey_val = MonkeyBot(validation_phase_data, random_state=0)
start_time = time.time()
iters = 100000
monkey_val.simulate(iters)
print("{0} iterations took {1} seconds".format(iters, time.time() - start_time))
In [27]:
plt.hist(monkey_val.pv_history_list, bins=50)
Out[27]:
In [28]:
monkey_val_stats = pd.Series(monkey_val.pv_history_list)
print(monkey_val_stats.describe())
In [30]:
monkey_test = MonkeyBot(test_phase_data, random_state=0)
start_time = time.time()
iters = 100000
monkey_test.simulate(iters)
print("{0} iterations took {1} seconds".format(iters, time.time() - start_time))
In [31]:
plt.hist(monkey_test.pv_history_list, bins=50)
Out[31]:
In [32]:
monkey_test_stats = pd.Series(monkey_test.pv_history_list)
print(monkey_test_stats.describe())
Using the mean we can calculate $r_{val}$: $$ (1 + r_{val})^{1260/252} = 0.8721 $$
$$ \Longrightarrow \boxed{r_{val} = -0.02699907 \approx -2.70\%} $$ Similarly, $$ \boxed{r_{test} = 0.09056276 \approx 9.06\%} $$
In [33]:
from sklearn.ensemble import RandomForestRegressor
class ChimpBot(MonkeyBot):
"""A trading bot that learns to trade in the stock market."""
valid_actions = ['Buy', 'Sell']
epsilon = 1
gamma = 0.75
random_reward = [0]
random_counter = 0
policy_counter = 0
track_key1 = {'Sell': 0, 'Buy': 0, 'Hold': 0}
track_key2 = {'Sell': 0, 'Buy': 0, 'Hold': 0}
track_random_decision = {'Sell': 0, 'Buy': 0, 'Hold': 0}
reset_counter = 0
def __init__(self, dfEnv, iter_random_rounds, test_mode=False, cash=1000, share=0, pv=0, random_state=0):
super(ChimpBot, self).__init__(dfEnv, cash, share, pv) # MonkeyBot takes (dfEnv, cash, share, pv)
random.seed(random_state)
np.random.seed(random_state)
# sets self.cash = 1000
# sets self.share = 0
# sets self.pv = 0
# sets self.pv_history_list = []
# sets self.env = dfEnv
# implements buy(self, stock_price)
# implements sell(self, stock_price)
# implements hold(self)
self.test_mode = test_mode
self.num_features = len(dfEnv.columns) - 1
self.random_rounds = iter_random_rounds # Number of rounds where the bot chooses to go monkey
self.iter_env = self.env.iterrows()
self.now_env_index, self.now_row = self.iter_env.next()
# self.now_yes_share = 0
self.now_action = ''
# self.now_q = 0
self.prev_cash = self.cash
self.prev_share = self.share
self.prev_pv = self.pv
self.q_df_columns = list(self.env.columns)
self.q_df_columns.pop()
self.q_df_columns.extend(['Action', 'Q Value'])
self.q_df = pd.DataFrame(columns=self.q_df_columns)
self.q_dict = defaultdict(lambda: (0, 0)) # element of q_dict is (state, act): (q_value, t)
# self.q_dict_analysis preserves the datetime data and is not used by the ChimpBot
self.q_dict_analysis = defaultdict(lambda: (0, 0))
self.negative_reward = 0
self.n_reward_history = []
self.net_reward = 0
self.reset_counter = 0
def make_q_df(self):
result_dict = defaultdict(list)
for index, row in self.q_dict.iteritems():
for i in range(len(self.q_dict.keys()[0])):
column_name = 'col' + str(i + 1)
result_dict[column_name].append(index[i])
result_dict['Q'].append(self.q_dict[index][0])
self.q_df = pd.DataFrame(result_dict)
q_df_column_list = ['col' + str(x) for x in range(1, self.num_features + 1 + 1)]
q_df_column_list.append('Q')
self.q_df = self.q_df[q_df_column_list]
def transfer_action(x):
if x == 'Buy':
return 1
elif x == 'Sell':
return 2
elif x == 'Hold':
return 0
else:
raise ValueError("Wrong action!")
def str_float_int(x):
return int(float(x))
arr_int = np.vectorize(str_float_int)
self.q_df['col' + str(self.num_features + 1)] = self.q_df['col' + str(self.num_features + 1)].apply(transfer_action)
self.q_df.ix[:, :-1] = self.q_df.ix[:, :-1].apply(arr_int)
def split_q_df(self):
self.q_df_X = self.q_df.ix[:, :-1]
self.q_df_y = self.q_df.ix[:, -1]
def train_on_q_df(self):
reg = RandomForestRegressor(n_estimators=128, max_features='sqrt', n_jobs=-1, random_state=0)
self.q_reg = reg
self.q_reg = self.q_reg.fit(self.q_df_X, self.q_df_y)
def update_q_model(self):
print("Updating Q model...")
start_time = time.time()
self.make_q_df()
self.split_q_df()
self.train_on_q_df()
def from_state_action_predict_q(self, state_action):
state_action = [state_action]
pred_q = self.q_reg.predict(state_action)
return pred_q
def max_q(self, now_row):
def transfer_action(x):
if x == 'Buy':
return 1
elif x == 'Sell':
return 2
elif x == 'Hold':
return 0
else:
raise ValueError("Wrong action!")
def str_float_int(x):
return int(float(x))
now_row2 = list(now_row)
# now_row2.append(self.now_yes_share)
max_q = ''
q_compare_dict = {}
if len(now_row2) > self.num_features:
raise ValueError("Got ya bastard! @ MaxQ")
# Populate the q_dict
for act in set(self.valid_actions):
now_row2.append(act)
now_row_key = tuple(now_row2)
_ = self.q_dict[now_row_key]
try:
self.q_reg
except AttributeError:
pass
# print('No q_reg yet...going with default.')
else:
if _[1] == 0:
single_X = np.array(now_row_key)
# print(single_X)
arr_int = np.vectorize(str_float_int)
single_X[-1] = transfer_action(single_X[-1])
single_X = arr_int(single_X)
single_X = single_X.reshape(1, -1)
pred_q = self.q_reg.predict(single_X)
dreamed_q = (1 - (1 / (self.q_dict[now_row_key][1] + 1))) * self.q_dict[now_row_key][0] + (1 / (self.q_dict[now_row_key][1] + 1)) * pred_q[0]
self.q_dict[now_row_key] = (dreamed_q, self.q_dict[now_row_key][1] + 1)
q_compare_dict[now_row_key] = self.q_dict[now_row_key]
now_row2.pop()
try:
max(q_compare_dict.iteritems(), key=lambda x:x[1])
except ValueError:
print("Wrong Q Value in Q Compare Dict!")
else:
key, qAndT = max(q_compare_dict.iteritems(), key=lambda x:x[1])
# print("Action: {0}, with Q-value: {1}".format(key[-1], qAndT[0]))
return key[-1], qAndT[0], qAndT[1]
def q_update(self):
# print("Data Index: {}".format(self.now_env_index))
now_states = list(self.now_row)
# now_states = list(now_states)
now_states.pop() # disregard the Trade Price
prev_states = list(self.prev_states)
if len(prev_states) > self.num_features:
raise ValueError("Got ya bastard! @ Q_Update...something wrong with the self.prev_states!!!")
# prev_states.append(self.prev_yes_share)
prev_states.append(self.prev_action)
prev_states_key = tuple(prev_states)
if len(prev_states_key) > self.num_features + 2:
raise ValueError("Got ya bastard! @ Q_Update")
q_temp = self.q_dict[prev_states_key]
q_temp0 = (1 - (1 / (q_temp[1] + 1))) * q_temp[0] + (1 / (q_temp[1] + 1)) * (self.reward + self.gamma * self.max_q(now_states)[1])
self.q_dict[prev_states_key] = (q_temp0, q_temp[1] + 1)
# For analysis purpose
self.q_dict_analysis[prev_states_key] = (q_temp0, self.prev_env_index)
# print("Now Action: {}".format())
# print(prev_states_key)
return (self.q_dict[prev_states_key])
def policy(self, now_row):
return self.max_q(now_row)[0]
def reset(self):
# Portfolio change over iterations
self.pv_history_list.append(self.pv + self.cash)
self.iter_env = self.env.iterrows()
self.now_env_index, self.now_row = self.iter_env.next()
self.cash = 1000
self.share = 0
self.pv = 0
self.prev_cash = self.cash
self.prev_share = self.share
self.prev_pv = self.pv
if self.test_mode is True:
self.epsilon = 0
else:
if self.epsilon - 1/self.random_rounds > 1/self.random_rounds: # keep decaying epsilon until it reaches the floor
self.random_counter += 1
self.epsilon = self.epsilon - 1/self.random_rounds
else:
self.epsilon = 0.000001 # epsilon floor: act (almost) fully greedily from here on
self.policy_counter += 1
self.net_reward = 0
self.reset_counter += 1
if self.reset_counter % self.random_rounds == 0:
self.update_q_model()
if self.reset_counter != self.random_rounds:
self.action_list = []
def make_decision(self, now_row):
return self.policy(now_row)
def update(self):
# Update state
now_states = list(self.now_row)
if len(now_states) > self.num_features + 1:
print(len(now_states))
print(self.num_features)
raise ValueError("Got ya bastard! @ Q_Update...something wrong with the self.now_row!!!")
now_states.pop() # disregard the Trade Price
if len(now_states) > self.num_features:
print(now_states)
raise ValueError("Got ya bastard! @ Q_Update...something wrong with now_states after pop!!!")
# Exploration-exploitation decision
self.decision = np.random.choice(2, p = [self.epsilon, 1 - self.epsilon]) # decide to go random or with the policy
# self.decision = 0 # Force random mode
# print("Random decision: {0}, Epislon: {1}".format(self.decision, self.epsilon))
if self.decision == 0: # if zero, go random
action = random.choice(self.valid_actions)
else: # else go with the policy
action = self.make_decision(now_states)
if len(now_states) > self.num_features:
print(now_states)
raise ValueError("Got ya bastard! @ Q_Update...something wrong with now_states after make_decision!!!")
# Execute action and get reward
if action == 'Buy':
# print(self.now_row)
self.buy(self.now_row[-1])
elif action == 'Sell':
# print(self.now_row)
self.sell(self.now_row[-1])
elif action == 'Hold':
# print(self.now_row)
self.hold(self.now_row[-1])
else:
raise ValueError("Wrong action man!")
try:
self.prev_states
except AttributeError:
print("Running the first time...no prevs exist.")
else:
self.reward = ((self.cash - self.prev_cash) + (self.pv - self.prev_pv)) / (self.prev_cash + self.prev_pv)
self.q_update()
self.prev_states = now_states
if len(now_states) > self.num_features:
raise ValueError("Got ya bastard! @ Q_Update...something wrong with the now_states!!!")
self.now_action = action
self.prev_action = action
# self.prev_yes_share = self.now_yes_share
self.prev_env_index = deepcopy(self.now_env_index)
self.prev_cash = self.cash
self.prev_share = self.share
self.prev_pv = self.pv
try:
self.now_env_index, self.now_row = self.iter_env.next()
except StopIteration:
pass
# print("End of data.")
else:
pass
try:
_ = self.reward
except AttributeError:
print("No reward yet...0 assigned.")
self.reward = 0
def simulate(self):
start_time = time.time()
for i in range(self.random_rounds):
for l in range(len(self.env)):
self.update()
self.reset()
print("{0} rounds of simulation took {1} seconds".format(self.random_rounds, time.time() - start_time))
return self.pv_history_list
In [1038]:
iter_random_rounds=5000
god_chimp = ChimpBot(dfEnv=data_full, iter_random_rounds=iter_random_rounds, random_state=0)
pv_history_list = god_chimp.simulate()
print(pv_history_list[-1])
pd.Series(pv_history_list).plot()
Out[1038]:
In [1040]:
print(pd.Series(god_chimp.action_list).describe())
In [1041]:
# Convert Q-Table to Dataframe from the God Chimp (full dataset)
iter_random_rounds=5000
result_dict = defaultdict(list)
for index, row in god_chimp.q_dict_analysis.iteritems():
for i in range(len(god_chimp.q_dict_analysis.keys()[0])):
column_name = 'col' + str(i + 1)
result_dict[column_name].append(index[i])
result_dict['Q'].append(god_chimp.q_dict_analysis[index][0])
result_dict['Date'].append(god_chimp.q_dict_analysis[index][1])
god_chimp_q_df = pd.DataFrame(result_dict)
# Yes share column removed
column_list = ['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8', 'col9', 'col10', 'col11', 'col12', 'col13', 'col14', 'col15', 'col16', 'col17', 'col18', 'col19', 'col20', 'col21', 'col22', 'col23', 'col24', 'col25', 'col26', 'col27', 'col28', 'col29', 'col30', 'col31', 'col32', 'col33', 'col34', 'col35', 'col36', 'col37', 'col38', 'Date', 'Q']
god_chimp_q_df = god_chimp_q_df[column_list]
god_chimp_q_df.sort_values('Date', inplace=True)
god_chimp_q_df.reset_index(drop=True, inplace=True)
god_chimp_q_df.set_index(god_chimp_q_df['Date'], inplace=True)
del god_chimp_q_df.index.name
del god_chimp_q_df['Date']
print(len(god_chimp_q_df))
display(god_chimp_q_df.head())
In [10]:
def action_to_int(string):
if string == 'Buy':
return 1
elif string == 'Sell':
return 2
else:
return string
god_chimp_q_df.ix[:, -2] = god_chimp_q_df.ix[:, -2].apply(action_to_int)
In [1044]:
god_chimp_q_df.head()
Out[1044]:
As said earlier, one problem with time series data is finding the training window size within which the data can be seen as drawn from the same population as the data we want to predict. Then, of course, we can generalize what we have learned/modelled from the training set to the cross-validation/test dataset.
To do this we can make use of the God Chimp's Q-table we just obtained:
In [76]:
from sklearn.metrics import accuracy_score
def find_best_training_size(data_full, full_q_df, training_sizes, testing_size, target_data, random_state=0):
start_time = time.time()
accs = []
d_counter = 0
# Loop through all batches in validation dataset
(u, ) = data_full.index.get_indexer_for([target_data.index[0]])
for d in range(u, u + testing_size * (len(target_data) // testing_size), testing_size):
acc_num_train_months = []
d_counter += 1
# Dates in the batch
date_range = data_full.iloc[d:d + testing_size].index
# Loop through all sizes of training sets
for num_train_month in range(1, training_sizes + 1):
# Prepare Training/Testing Datasets
X_train = full_q_df.iloc[d - (int(21 * num_train_month)):d, :-1]
y_train = full_q_df.iloc[d - (int(21 * num_train_month)):d, -1]
X_test = full_q_df.ix[date_range, :-1]
y_test = full_q_df.ix[date_range, -1]
# Fit data and make predictions
reg = RandomForestRegressor(n_estimators=128, max_features='sqrt', oob_score=True, n_jobs=-1, random_state=random_state)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
y_fit = reg.predict(X_train)
pred_q = y_pred
actions = X_test.ix[:, -1]
data = {'Action': actions, 'Q': pred_q}
df_pred = pd.DataFrame(data=data, index=y_test.index)
pred_actions = []
for date in date_range:
max_q = [0, -1]
for i, r in df_pred.ix[date].iterrows():
if r['Q'] > max_q[1]:
max_q = [r['Action'], r['Q']]
pred_actions.append(max_q[0])
best_actions = []
for date in date_range:
max_q = [0, -1]
for i, r in full_q_df.ix[date].iterrows():
if r['Q'] > max_q[1]:
max_q = [r[-2], r['Q']]
best_actions.append(max_q[0])
acc_num_train_months.append(accuracy_score(best_actions, pred_actions))
accs.append(np.array(acc_num_train_months))
print("Batch {0} completed....{1:.2f}%".format(d_counter, d_counter / len(range(u, u + testing_size * (len(target_data) // testing_size), testing_size))))
geo_means = np.power(reduce(lambda x,y: x*y, accs), (1/len(accs)))
arithmetic_means = reduce(lambda x,y: x+y, accs) / len(accs)
print("Geometric Means Max: {}".format((np.argmax(geo_means) + 1, np.max(geo_means))))
print("Arithemtic Means Max: {}".format((np.argmax(arithmetic_means) + 1, np.max(arithmetic_means))))
print("Grid search best num_train_year took {} seconds:".format(time.time() - start_time))
return (geo_means, arithmetic_means)
In [1046]:
means = find_best_training_size(data_full=data_full, full_q_df=god_chimp_q_df, training_sizes=120, testing_size=7, target_data=validation_phase_data, random_state=0)
geo_means = means[0]
arithmetic_means = means[1]
In [1048]:
print(geo_means)
print(sorted(range(len(geo_means)), key=lambda k: geo_means[k], reverse=True))
print(arithmetic_means)
print(sorted(range(len(arithmetic_means)), key=lambda k: arithmetic_means[k], reverse=True))
validation_phase_data['Trade Price'].plot()
plt.figure()
plt.plot(geo_means)
plt.figure()
plt.plot(arithmetic_means)
Out[1048]:
We can see a trend of accuracies going up and then down again. Here we choose 35 months of data to build our model.
In [142]:
import sys
def chimp_simulate(data_full, target_data, num_iter, train_size, batch_size, random_state=0):
pv_history_list = []