Portfolio Simulation Modeling

Download Data

Download S&P 500 index data from Yahoo Finance



In [78]:

    
import pandas as pd
import pandas.io.data as web
import matplotlib.pyplot as plt
import datetime as datetime
from numpy import *
%matplotlib inline

Set start date, end date and data source ('Yahoo Finance', 'Google Finance', etc.). Download S&P 500 index data from Yahoo Finance.



In [79]:

    
start_date = datetime.date(1976,1,1)
end_date = datetime.date(2017,1,1)
# Download S&P 500 index data
try:
    SnP500_Ddata = web.DataReader('^GSPC','yahoo',start_date,end_date)
except:
    SnP500_Ddata = pd.read_csv("http://analytics.romanko.ca/data/SP500_hist.csv")
    SnP500_Ddata.index = pd.to_datetime(SnP500_Ddata.Date)
SnP500_Ddata.head()

Transform daily data into annual data:



In [80]:

    
# Create a time-series of annual data points from daily data
SnP500_Adata = SnP500_Ddata.resample('A').last()
SnP500_Adata[['Volume','Adj Close']].tail()









    Out[80]:






  
    
      
      Volume
      Adj Close
    
    
      Date
      
      
    
  
  
    
      2012-12-31
      3204330000
      1426.189941
    
    
      2013-12-31
      2312840000
      1848.359985
    
    
      2014-12-31
      2606070000
      2058.899902
    
    
      2015-12-31
      2655330000
      2043.939941
    
    
      2016-12-31
      2670900000
      2238.830078

Compute annual return of S&P 500 index:



In [81]:

    
SnP500_Adata[['Adj Close']] = SnP500_Adata[['Adj Close']].apply(pd.to_numeric, errors='ignore')
SnP500_Adata['returns'] = SnP500_Adata['Adj Close'] / SnP500_Adata['Adj Close'].shift(1) -1
SnP500_Adata = SnP500_Adata.dropna()
print SnP500_Adata['returns']









    



Date
1977-12-31   -0.115020
1978-12-31    0.010620
1979-12-31    0.123088
1980-12-31    0.257736
1981-12-31   -0.097304
1982-12-31    0.147613
1983-12-31    0.172710
1984-12-31    0.014006
1985-12-31    0.263334
1986-12-31    0.146204
1987-12-31    0.020275
1988-12-31    0.124008
1989-12-31    0.272505
1990-12-31   -0.065591
1991-12-31    0.263067
1992-12-31    0.044643
1993-12-31    0.070552
1994-12-31   -0.015393
1995-12-31    0.341107
1996-12-31    0.202637
1997-12-31    0.310082
1998-12-31    0.266686
1999-12-31    0.195260
2000-12-31   -0.101392
2001-12-31   -0.130427
2002-12-31   -0.233660
2003-12-31    0.263804
2004-12-31    0.089935
2005-12-31    0.030010
2006-12-31    0.136194
2007-12-31    0.035296
2008-12-31   -0.384858
2009-12-31    0.234542
2010-12-31    0.127827
2011-12-31   -0.000032
2012-12-31    0.134057
2013-12-31    0.296012
2014-12-31    0.113906
2015-12-31   -0.007266
2016-12-31    0.095350
Freq: A-DEC, Name: returns, dtype: float64

Compute average annual return and standard deviation of return for S&P 500 index:



In [82]:

    
SnP500_mean_ret = float(SnP500_Adata[['returns']].mean())
SnP500_std_ret = float(SnP500_Adata[['returns']].std())
print ("S&P 500 average return = %g%%, st. dev  = %g%%") % (100*SnP500_mean_ret, 100*SnP500_std_ret)









    



S&P 500 average return = 9.13031%, st. dev  = 15.8073%

Simulation Example 1

We want to invest \$1000 in the US stock market for 1 year: $v_0 = 1000$



In [83]:

    
v0 = 1000 # Initial capital

In our example we assume that the return of the market over the next year follow Normal distribution.

Between 1977 and 2014, S&P 500 returned 9.38% per year on average with a standard deviation of 16.15%.

Generate 100 scenarios for the market return over the next year (draw 100 random numbers from a Normal distribution with mean 9.38% and standard deviation of 16.15%):



In [84]:

    
Ns = 100 # Number of scenarios
r01 = random.normal(SnP500_mean_ret, SnP500_std_ret, Ns)
r01









    Out[84]:





array([-0.01238517,  0.26551319,  0.12050935, -0.14274842, -0.04177976,
        0.39895354, -0.03824214, -0.22761098,  0.11755653,  0.08573296,
        0.19937247, -0.00171866,  0.04072682, -0.01189156,  0.3362564 ,
       -0.21669386, -0.38799112,  0.18485995,  0.04754443,  0.11693826,
        0.07614133, -0.1062223 , -0.07852525,  0.14853333,  0.33711094,
        0.10422203,  0.14578936, -0.14485968, -0.00211176,  0.39213514,
        0.12256484, -0.08522198,  0.16713441,  0.18741692, -0.00061099,
        0.15956528, -0.11670832, -0.02210576,  0.11262091,  0.00328012,
        0.06576472,  0.13523109,  0.29724132, -0.12922588, -0.01354248,
        0.17589712, -0.04148897, -0.23096135,  0.16538584, -0.3160094 ,
        0.33628253, -0.10127971,  0.08717314, -0.20843334,  0.24726468,
        0.01839252,  0.13707443, -0.08166534,  0.13717431,  0.34053443,
       -0.06250905, -0.08706011,  0.19035904,  0.25749463,  0.07955175,
        0.19434696,  0.06719778, -0.00841847,  0.15456418, -0.12648744,
        0.08077283,  0.33980726, -0.05637986,  0.20951367,  0.30166443,
        0.032435  ,  0.22411126,  0.16199499, -0.03617882,  0.16548387,
        0.09362067, -0.1357435 ,  0.16220185,  0.1383636 , -0.02506051,
        0.36854717, -0.05416744,  0.13716198,  0.31453671,  0.24717374,
       -0.09648866,  0.10130499,  0.27721024, -0.10564058, -0.01471735,
        0.23465512, -0.02991651,  0.01113203,  0.00540451, -0.17196068])

Value of investment at the end of year 1: $v_1 = v_0 + r_{0,1}\cdot v_0 = (1 + r_{0,1})\cdot v_0$



In [85]:

    
# Distribution of value at the end of year 1
v1 = (r01 + 1) * v0
v1









    Out[85]:





array([  987.61482528,  1265.51318878,  1120.50934834,   857.25157777,
         958.22024362,  1398.95353975,   961.75785649,   772.38901654,
        1117.55653085,  1085.73295589,  1199.37246505,   998.28134396,
        1040.72682365,   988.10844466,  1336.25639969,   783.30614099,
         612.00887912,  1184.85995012,  1047.54443287,  1116.93826371,
        1076.14132752,   893.77770311,   921.47475209,  1148.53333485,
        1337.11093821,  1104.22202826,  1145.78936276,   855.14032225,
         997.88824334,  1392.13513616,  1122.56484447,   914.77802253,
        1167.13441442,  1187.41691736,   999.38901008,  1159.56528341,
         883.29167977,   977.89424069,  1112.62090557,  1003.28011909,
        1065.76472251,  1135.23108898,  1297.24132031,   870.77412167,
         986.45751643,  1175.89711637,   958.5110295 ,   769.03865222,
        1165.38583865,   683.99060244,  1336.28253363,   898.72028598,
        1087.17313836,   791.56666208,  1247.26468432,  1018.39251755,
        1137.07442819,   918.33465719,  1137.1743069 ,  1340.534432  ,
         937.49094981,   912.93989015,  1190.35904151,  1257.4946254 ,
        1079.55174622,  1194.34696311,  1067.19778023,   991.58153197,
        1154.5641834 ,   873.51256292,  1080.77282554,  1339.80726245,
         943.62014017,  1209.51367391,  1301.66442695,  1032.43500447,
        1224.11125943,  1161.99499245,   963.82118088,  1165.48386619,
        1093.62067447,   864.25649811,  1162.20184993,  1138.36360412,
         974.93949394,  1368.54716733,   945.83255726,  1137.1619761 ,
        1314.53671406,  1247.1737412 ,   903.51133691,  1101.30499341,
        1277.21023968,   894.35941561,   985.28264519,  1234.65511713,
         970.08349118,  1011.13203092,  1005.40450597,   828.03931612])

Mean:



In [86]:

    
mean(v1)









    Out[86]:





1067.9374174412631

Standard deviation:



In [87]:

    
std(v1)









    Out[87]:





163.56702482311968

Minimum, maximum:



In [88]:

    
min(v1), max(v1)









    Out[88]:





(612.00887912029145, 1398.9535397528282)

Persentiles

5th percentile, median, 95th percentile:



In [89]:

    
percentile(v1, [5, 50,95])









    Out[89]:





array([  791.15363603,  1080.16228588,  1337.24575442])

Alternative way to compute percentiles

5th percentile, median, 95th percentile:



In [90]:

    
sortedScen = sorted(v1)   # Sort scenarios
sortedScen[5-1], sortedScen[50-1], sortedScen[95-1]









    Out[90]:





(783.30614098874196, 1079.5517462159646, 1337.1109382129714)

Plot a histogram of the distribution of outcomes for v1:



In [91]:

    
hist, bins = histogram(v1)
positions = (bins[:-1] + bins[1:]) / 2
plt.bar(positions, hist, width=60)
plt.xlabel('portfolio value after 1 year')
plt.ylabel('frequency')
plt.show()

Simulated paths over time:



In [92]:

    
# Plot simulated paths over time
for res in v1:
    plt.plot((0,1), (v0, res))
plt.xlabel('time step')
plt.ylabel('portfolio value')
plt.show()

Simulation Example 2

You are planning for retirement and decide to invest in the market for the next 30 years (instead of only the next year as in example 1).

Assume that every year your investment returns from investing into the S&P 500 will follow a Normal distribution with the mean and standard deviation as in example 1.

Your initial capital is still \$1000



In [93]:

    
v0 = 1000 # Initial capital

Between 1977 and 2014, S&P 500 returned 9.38% per year on average with a standard deviation of 16.15% Simulate 30 columns of 100 observations each of single period returns:



In [94]:

    
r_speriod30  = random.normal(SnP500_mean_ret, SnP500_std_ret, (Ns, 30))
r_speriod30









    Out[94]:





array([[-0.27285485, -0.006935  , -0.0101025 , ...,  0.03304577,
         0.187185  ,  0.22997199],
       [ 0.13763737,  0.01706496,  0.38933684, ...,  0.40079969,
         0.08947382,  0.02410985],
       [ 0.06776747,  0.26355395,  0.21173347, ...,  0.03464522,
         0.06159513,  0.04127322],
       ..., 
       [ 0.47489008, -0.18459152,  0.14817538, ...,  0.14354497,
         0.07564415,  0.24085406],
       [-0.02221353,  0.3444806 ,  0.07884434, ...,  0.25827233,
        -0.06371246,  0.09849279],
       [-0.04462351,  0.05422174,  0.23578598, ...,  0.33485102,
         0.1285475 ,  0.1702835 ]])

Compute and plot $v_{30}$



In [95]:

    
v30 = prod(1 + r_speriod30 , 1) * v0

hist, bins = histogram(v30)
positions = (bins[:-1] + bins[1:]) / 2
width = (bins[1] - bins[0]) * 0.8
plt.bar(positions, hist, width=width)
plt.xlabel('portfolio value after 30 years')
plt.ylabel('frequency')
plt.show()

Simulated paths over time:



In [96]:

    
for scenario in r_speriod30:
    y = [prod(1 + scenario[0:i]) * v0 for i in range(0,31)]
    plt.plot(range(0,31), y)
plt.xlabel('time step')
plt.ylabel('portfolio value')
plt.show()

Simulation Example 3

Download US Treasury bill data from Federal Reserve:



In [97]:

    
# Download 3-month T-bill rates from Federal Reserve
start_date_b = datetime.date(1977,1,1)
end_date_b = datetime.date(2017,1,1)
TBill_Ddata = web.DataReader('DTB3','fred',start_date_b,end_date_b)
TBill_Ddata.head()

Transform daily data into annual data:



In [98]:

    
# Create a time-series of annual data points from daily data
TBill_Adata = TBill_Ddata.resample('A').last()
TBill_Adata[['DTB3']].tail()

Compute annual return for bonds:



In [99]:

    
TBill_Adata['returns'] = TBill_Adata['DTB3'] / 100
TBill_Adata = TBill_Adata.dropna()
print TBill_Adata['returns']









    



DATE
1977-12-31    0.0613
1978-12-31    0.0926
1979-12-31    0.1204
1980-12-31    0.1430
1981-12-31    0.1108
1982-12-31    0.0792
1983-12-31    0.0897
1984-12-31    0.0785
1985-12-31    0.0705
1986-12-31    0.0567
1987-12-31    0.0568
1988-12-31    0.0810
1989-12-31    0.0755
1990-12-31    0.0644
1991-12-31    0.0388
1992-12-31    0.0308
1993-12-31    0.0301
1994-12-31    0.0553
1995-12-31    0.0496
1996-12-31    0.0507
1997-12-31    0.0522
1998-12-31    0.0437
1999-12-31    0.0517
2000-12-31    0.0573
2001-12-31    0.0171
2002-12-31    0.0120
2003-12-31    0.0093
2004-12-31    0.0218
2005-12-31    0.0399
2006-12-31    0.0489
2007-12-31    0.0329
2008-12-31    0.0011
2009-12-31    0.0006
2010-12-31    0.0012
2011-12-31    0.0002
2012-12-31    0.0005
2013-12-31    0.0007
2014-12-31    0.0004
2015-12-31    0.0016
2016-12-31    0.0050
Freq: A-DEC, Name: returns, dtype: float64

Compute average annual return and standard deviation of return for bonds:



In [100]:

    
TBill_mean_ret = float(TBill_Adata[['returns']].mean())
TBill_std_ret = float(TBill_Adata[['returns']].std())
print ("T-bill average return = %g%%, st. dev  = %g%%") % (100*TBill_mean_ret, 100*TBill_std_ret)









    



T-bill average return = 4.5845%, st. dev  = 3.64866%

Compute covariance matrix:



In [101]:

    
covMat = cov(array(SnP500_Adata[['returns']]),array(TBill_Adata[['returns']]),rowvar=0)
covMat









    Out[101]:





array([[ 0.024987  ,  0.00080272],
       [ 0.00080272,  0.00133127]])

Simulate portfolio:



In [102]:

    
v0 = 1000 # Initial capital
Ns = 5000 # Number of scenarios



In [103]:

    
mu = [SnP500_mean_ret, TBill_mean_ret] # Expected return
mu









    Out[103]:





[0.09130311413356892, 0.045845000000000004]



In [104]:

    
stockRet = ones(Ns)
bondsRet = ones(Ns)



In [105]:

    
scenarios = random.multivariate_normal(mu, covMat, Ns)
for year in range(1, 31):
    scenarios = random.multivariate_normal(mu, covMat, Ns)
    stockRet *= (1 + scenarios[:,0])
    bondsRet *= (1 + scenarios[:,1])



In [106]:

    
v30 = 0.5 * v0 * stockRet + 0.5 * v0 * bondsRet



In [107]:

    
hist, bins = histogram(v30, bins = 100)
positions = (bins[:-1] + bins[1:]) / 2
width = (bins[1] - bins[0]) * 0.8
plt.bar(positions, hist, width=width)
plt.xlabel('portfolio value after 30 years')
plt.ylabel('frequency')
plt.show()

Simulation Example 4

Compare two portfolios



In [108]:

    
# Compute portfolios by iterating through different combinations of weights
v30comp = []
for w in arange(0.2, 1.01, 0.2):
    v30comp += [w * v0 * stockRet + (1 - w) * v0 * bondsRet]



In [109]:

    
# Plot a histogram of the distribution of
# differences in outcomes for v30
# (Stratery 4 - Strategy 2)
v30d = v30comp[3] - v30comp[1]



In [110]:

    
hist, bins = histogram(v30d, bins = 50)
positions = (bins[:-1] + bins[1:]) / 2
width = (bins[1] - bins[0]) * 0.8
plt.bar(positions, hist, width=width)
plt.show()



In [111]:

    
# Compute number of elements in v30d that are > 0 and < 0 and compare
pos_count = (v30d > 0).sum()
neg_count = (v30d <= 0).sum()

print u"""Strategy 4 was better in %d cases 
Strategy 2 was better in %d cases
Difference = %d""" % (pos_count, neg_count, pos_count - neg_count)









    



Strategy 4 was better in 4406 cases 
Strategy 2 was better in 594 cases
Difference = 3812

	Date	Open	High	Low	Close	Adj Close	Volume
Date
1976-01-02	1976-01-02	90.190002	91.180000	89.809998	90.900002	90.900002	10300000
1976-01-05	1976-01-05	90.900002	92.839996	90.849998	92.580002	92.580002	21960000
1976-01-06	1976-01-06	92.580002	94.180000	92.370003	93.529999	93.529999	31270000
1976-01-07	1976-01-07	93.529999	95.150002	92.910004	93.949997	93.949997	33170000
1976-01-08	1976-01-08	93.949997	95.470001	93.410004	94.580002	94.580002	29030000

	DTB3
DATE
1977-01-03	4.39
1977-01-04	4.49
1977-01-05	4.47
1977-01-06	4.50
1977-01-07	4.62

	Volume	Adj Close
Date
2012-12-31	3204330000	1426.189941
2013-12-31	2312840000	1848.359985
2014-12-31	2606070000	2058.899902
2015-12-31	2655330000	2043.939941
2016-12-31	2670900000	2238.830078

	DTB3
DATE
2012-12-31	0.05
2013-12-31	0.07
2014-12-31	0.04
2015-12-31	0.16
2016-12-31	0.50