Stock Analysis

This notebook analyzes some securities -- stocks, bonds, commodities -- computes their correlations, and optimizes risk vs. reward for a set of portfolio scenarios.

Select "Run All" to update this notebook with the latest stock quotes.

Caveat: This analysis covers a date range starting around 2007 (if run without modification). Obviously, the analysis can vary dramatically if you choose another date range.

Initialization



In [1]:

    
from sklearn import cluster  # machine learning
import pandas             # data manipulation and computation
import ystockquote as ys  # stock prices



In [2]:

    
rcParams['figure.figsize'] = (13, 6)  # default figure size

1. Data Collection

Choose a few securities.



In [3]:

    
symbols = array([
           'blv',  # bond: long-term bond
           'bnd',  # bond: total bond market
           'bsv',  # bond: short-term bond
           'tip',  # bond: treasury inflation protected
           'gld',  # commodity: gold (SPDR)
           'iau',  # commodity: gold (iShares)
           'slv',  # commodity: silver
           'vti',  # stock: US total stock market
           'vbk',  # stock: US small-cap stock
           'aapl', # stock: Apple
           'ua',   # stock: Under Armour
           'cmg',  # stock: Chipotle
          ])

Get daily close prices.



In [4]:

    
prices = pandas.DataFrame({
             symbol: {
                 date: quote['Adj Close'] 
                 for date, quote in ys.get_historical_prices(symbol, '2000-01-01', '2015-01-01').iteritems()
             }
             for symbol in symbols
         }, columns=symbols, dtype=float)
prices.tail()



In [5]:

    
prices.head()









    Out[5]:






  
    
      
      blv
      bnd
      bsv
      tip
      gld
      iau
      slv
      vti
      vbk
      aapl
      ua
      cmg
    
  
  
    
      2000-01-03
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
       3.79
      NaN
      NaN
    
    
      2000-01-04
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
       3.47
      NaN
      NaN
    
    
      2000-01-05
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
       3.52
      NaN
      NaN
    
    
      2000-01-06
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
       3.21
      NaN
      NaN
    
    
      2000-01-07
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
       3.37
      NaN
      NaN

Only preserve dates for which all stocks exist.



In [6]:

    
prices.dropna(inplace=True)
prices.head()

Optional: choose a custom start date for analysis.



In [7]:

    
prices = prices.ix['2013-01-01':]
prices.head()

Plot the daily close prices.



In [8]:

    
prices[['vti', 'bnd', 'iau']].plot()  # plot a few, or...
#prices.plot()  # plot all
ylabel('Daily Close')









    Out[8]:





<matplotlib.text.Text at 0x10b782890>

Plot prices normalized from Day 1.



In [9]:

    
prices_norm = prices.copy()
for symbol in symbols:
    prices_norm[symbol] = prices[symbol]/prices[symbol][0]
prices_norm[['vti', 'bnd', 'iau']].plot()









    Out[9]:





<matplotlib.axes.AxesSubplot at 0x107559310>

Plot the ratio VTI/IAU.

In 2013-14, stocks are increasing faster than gold.



In [10]:

    
(prices_norm.vti/prices_norm.iau).plot(logy=True)
ylim(0.3, 1.5)









    Out[10]:





(0.3, 1.5)

2. Data Analysis

Compute the daily percent change.



In [11]:

    
prices.pct_change().tail()

Get basic statistics for the daily percent change.



In [12]:

    
prices.pct_change().describe()

Compute the correlation between each pair of ETFs.



In [13]:

    
C = prices.pct_change().corr()
C

Cluster the stocks based upon their pairwise correlations.

Not surprisingly, the securities cluster by stocks, bonds, and commodities! And that's without predefining the number of clusters.



In [14]:

    
_, labels = cluster.affinity_propagation(C)
for label in range(max(labels)+1):
    print label, symbols[labels==label]









    



0 ['blv' 'bnd' 'bsv' 'tip']
1 ['gld' 'iau' 'slv']
2 ['vti' 'vbk' 'aapl' 'ua' 'cmg']

Plot histograms of the daily percent change.

We find that the daily percent change is roughly Gaussian, which implies that the daily change follows a log-normal distribution.



In [15]:

    
prices.pct_change().hist(column=['vti', 'iau', 'bnd', 'cmg'], bins=40, sharex=True, sharey=True)









    Out[15]:





array([[<matplotlib.axes.AxesSubplot object at 0x10b9f0a50>,
        <matplotlib.axes.AxesSubplot object at 0x10b966290>],
       [<matplotlib.axes.AxesSubplot object at 0x10ba96310>,
        <matplotlib.axes.AxesSubplot object at 0x10b894150>]], dtype=object)

Scatter plot the daily percent change between BND and VTI.

There is a slight inverse correlation between BND and VTI.

The plot also suggests that the joint daily change is jointly Gaussian. If two jointly Gaussian variables are uncorrelated, then they are statistically independent.

That is a good property to have when choosing securities for your portfolio. If two securities are perfectly correlated, well, then you should just own more of one. If two securities are perfectly inversely correlated, then their movements cancel out, so it's as if you're investing less. By choosing statistically independent securities, we have a broader range of risk vs. reward combinations.



In [16]:

    
scatter(prices.bnd.pct_change(), prices.vti.pct_change())
xlabel('BND')
ylabel('VTI')









    Out[16]:





<matplotlib.text.Text at 0x10ba66450>

3. Portfolio Assignment

As shown earlier, the securities cluster into three categories: stocks, bonds, and commodities. So let's choose one from each.

Get statistics for a portfolio with positions in VTI, BND, and IAU.



In [17]:

    
# assuming you use these proportions on day 1
portfolio = 0.5*prices_norm.vti + 0.3*prices_norm.bnd + 0.2*prices_norm.iau
portfolio.pct_change().describe()









    Out[17]:





count    467.000000
mean       0.000343
std        0.004556
min       -0.027941
25%       -0.001741
50%        0.000672
75%        0.002995
max        0.015430
dtype: float64

Get statistics for a portfolio that is partially allocated in VTI and partially unallocated.

Compared to the previous portfolio, this portfolio has similar reward but far more risk.



In [18]:

    
(0.8*prices_norm.vti + 0.2).pct_change().describe()









    Out[18]:





count    467.000000
mean       0.000664
std        0.005887
min       -0.020276
25%       -0.002300
50%        0.001026
75%        0.004367
max        0.018488
dtype: float64

Plot the risk and reward of a portfolio with varying positions in VTI and IAU.



In [19]:

    
pct_vti = linspace(0, 1, 100, endpoint=False)
vti_iau = pandas.DataFrame({
    'reward': pandas.Series([(r*prices_norm.vti + (1-r)*prices_norm.iau).pct_change().mean() for r in pct_vti], index=pct_vti),
    'risk': pandas.Series([(r*prices_norm.vti + (1-r)*prices_norm.iau).pct_change().std() for r in pct_vti], index=pct_vti)
})
ax = vti_iau.plot(secondary_y='risk')
ax.set_xlabel('Percent of portfolio in VTI')
ax.set_ylabel('Mean of daily percent change')
ax.right_ax.set_ylabel('Standard deviation of daily percent change')
title('VTI vs. IAU: Reward and Risk')









    Out[19]:





<matplotlib.text.Text at 0x10c451990>

Plot risk vs. reward.



In [20]:

    
reward_over_risk = vti_iau.reward/vti_iau.risk
reward_over_risk.plot()
xlabel('Percent of portfolio in VTI')
ylabel('Reward/Risk')









    Out[20]:





<matplotlib.text.Text at 0x10c9749d0>

Compute optimal allocation between VTI and IAU.

Reward/risk is maximized when the portfolio is approximately 53% VTI and 47% IAU.



In [21]:

    
reward_over_risk.argmax()









    Out[21]:





0.98999999999999999

Repeat for VTI and BND.



In [22]:

    
vti_bnd = pandas.DataFrame({
    'reward': pandas.Series([(r*prices_norm.vti + (1-r)*prices_norm.bnd).pct_change().mean() for r in pct_vti], index=pct_vti),
    'risk': pandas.Series([(r*prices_norm.vti + (1-r)*prices_norm.bnd).pct_change().std() for r in pct_vti], index=pct_vti)
})
ax = vti_bnd.plot(secondary_y='risk')
ax.set_xlabel('Percent of portfolio in VTI')
ax.set_ylabel('Mean of daily percent change')
ax.right_ax.set_ylabel('Standard deviation of daily percent change')
title('VTI vs. BND: Reward and Risk')









    Out[22]:





<matplotlib.text.Text at 0x10cc528d0>



In [23]:

    
reward_over_risk = vti_bnd.reward/vti_bnd.risk
reward_over_risk.plot()
xlabel('Percent of portfolio in VTI')
ylabel('Reward/Risk')









    Out[23]:





<matplotlib.text.Text at 0x10cc72690>

Reward/risk is mazimized at 14% VTI and 86% BND.



In [24]:

    
reward_over_risk.argmax()









    Out[24]:





0.41000000000000003

Find the maximum reward given a risk constraint.

Suppose that we require the standard deviation of the daily percent change to be less than 0.01.

Given this risk constraint, reward is maximized at 77% VTI and 23% BND.



In [25]:

    
vti_bnd[vti_bnd.risk < 0.01].iloc[-1]









    Out[25]:





reward    0.000800
risk      0.007029
Name: 0.99, dtype: float64

Plot the trajectory of three hypothetical portfolios.



In [26]:

    
portfolios = pandas.DataFrame({
    'bnd heavy': 0.15*prices_norm.vti + 0.15*prices_norm.iau + 0.70*prices_norm.bnd,
    'iau heavy': 0.15*prices_norm.vti + 0.70*prices_norm.iau + 0.15*prices_norm.bnd,
    'vti heavy': 0.70*prices_norm.vti + 0.15*prices_norm.iau + 0.15*prices_norm.bnd,
    'mixed':     0.45*prices_norm.vti + 0.43*prices_norm.iau + 0.12*prices_norm.bnd,
})
portfolios.plot()









    Out[26]:





<matplotlib.axes.AxesSubplot at 0x10c44c790>

Get the statistics of these portfolios.



In [27]:

    
portfolios.pct_change().describe()

So what does this all mean? I have no idea.

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
2014-11-03	91.10	82.15	80.17	113.08	112.15	11.29	15.48	104.08	124.94	108.93	65.99	638.53
2014-11-04	91.18	82.13	80.17	113.04	112.22	11.30	15.36	103.70	124.12	108.13	64.79	647.68
2014-11-05	91.36	82.23	80.12	113.15	109.79	11.06	14.66	104.19	123.87	108.39	64.25	642.59
2014-11-06	91.01	82.04	80.10	112.96	109.88	11.07	14.82	104.70	124.75	108.70	66.28	647.12
2014-11-07	91.60	82.32	80.21	113.62	112.97	11.37	15.10	104.78	125.07	109.01	66.98	649.03

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
2007-04-10	50.97	57.73	62.42	78.08	67.16	6.71	13.85	62.02	66.83	12.75	13.26	64.67
2007-04-11	50.90	57.58	62.38	77.96	67.08	6.71	13.77	61.62	66.59	12.53	13.17	63.78
2007-04-12	50.96	57.57	62.53	77.96	66.99	6.71	13.81	62.03	67.16	12.47	13.16	64.16
2007-04-13	50.86	57.48	62.35	77.77	67.84	6.79	13.99	62.24	67.51	12.21	12.97	64.07
2007-04-16	51.03	57.53	62.43	78.01	68.40	6.84	13.98	62.86	68.31	12.37	13.13	65.20

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
2013-01-02	85.86	79.92	79.00	118.16	163.17	16.40	29.92	72.75	90.81	74.93	24.17	301.06
2013-01-03	84.98	79.68	78.97	117.35	161.20	16.19	29.18	72.64	90.74	73.99	24.83	300.95
2013-01-04	85.37	79.81	78.98	117.30	160.44	16.11	29.24	73.01	91.32	71.93	24.67	300.18
2013-01-07	85.37	79.75	78.96	117.70	159.43	16.02	29.18	72.81	91.19	71.50	24.22	299.59
2013-01-08	85.73	79.83	79.00	117.86	160.56	16.12	29.38	72.62	90.92	71.70	24.32	297.76

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
2014-11-03	-0.002955	-0.000487	-0.000623	0.000000	-0.004527	-0.005286	-0.001290	0.000577	-0.000720	0.013020	0.006252	0.000831
2014-11-04	0.000878	-0.000243	0.000000	-0.000354	0.000624	0.000886	-0.007752	-0.003651	-0.006563	-0.007344	-0.018185	0.014330
2014-11-05	0.001974	0.001218	-0.000624	0.000973	-0.021654	-0.021239	-0.045573	0.004725	-0.002014	0.002405	-0.008335	-0.007859
2014-11-06	-0.003831	-0.002311	-0.000250	-0.001679	0.000820	0.000904	0.010914	0.004895	0.007104	0.002860	0.031595	0.007050
2014-11-07	0.006483	0.003413	0.001373	0.005843	0.028122	0.027100	0.018893	0.000764	0.002565	0.002852	0.010561	0.002952

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
count	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000	467.000000
mean	0.000153	0.000065	0.000033	-0.000078	-0.000721	-0.000718	-0.001314	0.000807	0.000730	0.000933	0.002431	0.001813
std	0.005318	0.001986	0.000802	0.003354	0.011482	0.011484	0.017203	0.007089	0.009358	0.016040	0.022617	0.018602
min	-0.025536	-0.011150	-0.003264	-0.013636	-0.087808	-0.088459	-0.126187	-0.024777	-0.035620	-0.123450	-0.073851	-0.069645
25%	-0.003047	-0.001057	-0.000380	-0.001910	-0.005575	-0.005242	-0.009114	-0.002752	-0.004452	-0.006993	-0.008279	-0.006315
50%	0.000365	0.000125	0.000000	0.000000	-0.000241	0.000000	-0.000585	0.001216	0.001855	0.001070	0.002025	0.000433
75%	0.003590	0.001261	0.000504	0.001917	0.005494	0.005135	0.005532	0.005202	0.006979	0.009821	0.011570	0.008570
max	0.014883	0.007994	0.003556	0.013260	0.043557	0.044025	0.064408	0.022464	0.024459	0.081912	0.229289	0.160954

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
2000-01-03	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3.79	NaN	NaN
2000-01-04	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3.47	NaN	NaN
2000-01-05	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3.52	NaN	NaN
2000-01-06	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3.21	NaN	NaN
2000-01-07	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3.37	NaN	NaN

	blv	bnd	bsv	tip	gld	iau	slv	vti	vbk	aapl	ua	cmg
blv	1.000000	0.883149	0.668414	0.832665	0.235849	0.233196	0.138741	-0.170632	-0.175536	-0.118826	-0.117368	-0.003331
bnd	0.883149	1.000000	0.827002	0.865648	0.332163	0.330167	0.227901	-0.088770	-0.101778	-0.090745	-0.103985	0.001950
bsv	0.668414	0.827002	1.000000	0.748721	0.324600	0.326571	0.238990	-0.053424	-0.049828	-0.052260	-0.057686	-0.031347
tip	0.832665	0.865648	0.748721	1.000000	0.272348	0.273133	0.181317	-0.121027	-0.112926	-0.110870	-0.078758	-0.001065
gld	0.235849	0.332163	0.324600	0.272348	1.000000	0.998278	0.883991	0.031921	0.052756	0.105621	-0.116838	-0.018545
iau	0.233196	0.330167	0.326571	0.273133	0.998278	1.000000	0.883531	0.033171	0.052788	0.104115	-0.112072	-0.018756
slv	0.138741	0.227901	0.238990	0.181317	0.883991	0.883531	1.000000	0.099791	0.112499	0.143244	-0.091253	-0.004312
vti	-0.170632	-0.088770	-0.053424	-0.121027	0.031921	0.033171	0.099791	1.000000	0.924305	0.277488	0.425141	0.370118
vbk	-0.175536	-0.101778	-0.049828	-0.112926	0.052756	0.052788	0.112499	0.924305	1.000000	0.226838	0.486547	0.393794
aapl	-0.118826	-0.090745	-0.052260	-0.110870	0.105621	0.104115	0.143244	0.277488	0.226838	1.000000	0.068858	0.031359
ua	-0.117368	-0.103985	-0.057686	-0.078758	-0.116838	-0.112072	-0.091253	0.425141	0.486547	0.068858	1.000000	0.301409
cmg	-0.003331	0.001950	-0.031347	-0.001065	-0.018545	-0.018756	-0.004312	0.370118	0.393794	0.031359	0.301409	1.000000