In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import datetime as dt
import time
from buy_and_hold import BuyAndHoldStrategy
from scipy.stats.mstats import gmean
In [2]:
data = pd.read_csv("buy_and_hold_prices.csv", index_col = "Date")
#database of NASDAQ100 stocks from yahoo (Adj Close)
data.index = pd.to_datetime(data.index)
In this strategy we buy stock yearly, the default annual investment is set to 10000 USD.
We buy 5 stocks which were "the best" and hold them. Every year we add new stocks to our portfolio. "The best" mean that the stock daily returns were most often higher than $q$-th percentile of all daily returns for the last $w$ trading days.
In [3]:
strategy = BuyAndHoldStrategy(data)
strategy.plot()
In strategy.summary
you can see the number of bought stocks every year. The whole portfolio is cumulating those stocks.
In [4]:
strategy.summary
Out[4]:
In [5]:
strategy.annual_returns
Out[5]:
In this part we search for "best" parameters (window $w$ and quantile $q$).
We start with $w \in \{50,60,70,80,90,100,110,120\}$ and $q \in \{0.75,0.8,0.85,0.9,0.95\}$.
In [6]:
windows = np.array([50,60,70,80,90,100,110,120])
quantiles = np.array([0.75,0.8,0.85,0.9,0.95])
In [7]:
#optimization on 1/2006-6/2017 = ALL
returns_all = {}
start_time = time.time()
for w in range(len(windows)):
ret_row = []
for q in range(len(quantiles)):
s = BuyAndHoldStrategy(data, window=windows[w], quantile=quantiles[q],
spx = False, maxdrawdown = False)
ret_row.append(gmean(s.annual_returns+1)-1)
returns_all[windows[w]] = ret_row
print("Computation time: " + str(np.round(time.time() - start_time,2)))
In [8]:
returns_all = pd.DataFrame(returns_all, index=quantiles)
returns_all.columns = windows
returns_all = returns_all.astype(np.float64)
Using the whole database - prices from 1/2006 to 6/2017 (market crash included). According to the following table and previous analysis (not in this document) we will do robust optimization for $w\in\{100,110,120\}$ and $q\in\{0.85,0.9\}$
In [9]:
returns_all
Out[9]:
In [10]:
window_int = [100,110,120]
quantile_int = [0.85,0.9]
In robust optimization we start trading at random dates before market crash, to see the progress of the Annual Buy & Hold Strategy. We generate 50 random starts from interval (1/2006,12/2007). The database will be from 1/2006 to 6/2009.
In [11]:
start_dates = np.random.randint(0,500,50)
In [12]:
start_dates = np.sort(data.index[start_dates])
In [13]:
#grid search on 1/2006-6/2009 = WORST CASE random starts ... ROBUST
start_time = time.time()
ret = []
for start in start_dates:
returns = {}
for w in range(len(window_int)):
ret_row = []
for q in range(len(quantile_int)):
s = BuyAndHoldStrategy(data[start:dt.date(2009,6,30)], window=windows[w],
quantile=quantiles[q], spx = False,
maxdrawdown = False)
ret_row.append(gmean(s.annual_returns+1)-1)
returns[windows[w]] = ret_row
returns = pd.DataFrame(returns, index=quantile_int)
returns.columns = window_int
returns = returns.astype(np.float64)
ret.append(returns)
print("Computation time: " + str(np.round(time.time() - start_time,2)))
In [14]:
arr = []
for i in range(len(start_dates)):
arr.append(np.array(ret[i]))
In [15]:
arr = np.array(arr)
In [16]:
output = pd.Panel(arr, items=start_dates,
major_axis=quantile_int,
minor_axis=window_int)
In [17]:
print("Mean of 50 average returns:")
pd.DataFrame(np.mean(output, axis = 0), index=quantile_int,
columns=window_int)
Out[17]:
In [18]:
print("Window length:")
pd.Series(np.mean(np.mean(output, axis = 0), axis = 0),
index=window_int)
Out[18]:
In [19]:
print("Quantile:")
pd.Series(np.mean(np.mean(output, axis = 0), axis = 1),
index = quantile_int)
Out[19]:
From worst case optimization (financial crisis) we can see better performance for $w = 100$ and $q = 0.85$. The same robust optimization for the whole database.
In [20]:
#grid search on 1/2006-6/2017 = random starts ... ALL TIME ROBUST
start_time = time.time()
ret2 = []
for start in start_dates:
returns = {}
for w in range(len(window_int)):
ret_row = []
for q in range(len(quantile_int)):
s = BuyAndHoldStrategy(data[start:], window=windows[w],
quantile=quantiles[q], spx = False,
maxdrawdown = False)
ret_row.append(gmean(s.annual_returns+1)-1)
returns[windows[w]] = ret_row
returns = pd.DataFrame(returns, index=quantile_int)
returns.columns = window_int
returns = returns.astype(np.float64)
ret2.append(returns)
print("Computation time: " + str(np.round(time.time() - start_time,2)))
In [21]:
arr = []
for i in range(len(start_dates)):
arr.append(np.array(ret2[i]))
arr = np.array(arr)
output2 = pd.Panel(arr, items=start_dates, major_axis=quantile_int,
minor_axis=window_int)
In [22]:
print("Mean of 50 average returns:")
pd.DataFrame(np.mean(output2, axis = 0), index=quantile_int,
columns=window_int)
Out[22]:
In [23]:
print("Window length:")
pd.Series(np.mean(np.mean(output2, axis = 0), axis = 0),
index=window_int )
Out[23]:
In [24]:
print("Quantile:")
pd.Series(np.mean(np.mean(output2, axis = 0), axis = 1),
index = quantile_int)
Out[24]:
Differences in robust all time optimization are very small. We choose $w=100$ because of better performence in worst case and $q=0.9$ because of better performance in all time.