Using CAPE ratios as market timing indicators

CAPE ratios are a valuation metric for the U.S. S&P 500 equity indexes, invented by Benjamin Graham and David Dodd and popularized by the behavioral economist and Nobel laureate Robert Shiller. It seeks to improve on the standard price-to-earnings ratio by using the moving ten-year inflation-adjusted earnings average instead, and thereby reducing the impacts of short-term market volatility. Higher CAPE ratios imply lower real returns over the long term.

When the CAPE ratios climb, many people [get concerned about unsupportable valuations and the increased potential for stomach-churning drops] (http://www.cnbc.com/2015/09/03/risk-of-big-stock-drops-grows-robert-shiller.html). As I write this, we're in the mid-twenties, significantly above the long-term average of 17. (Though not so out of line with most of the post-2000 CAPE ratios, with the notable exception of the 2008 meltdown.) Still, Professor Shiller is careful to say that CAPE ratios aren't a short-term market timing tool.

But I am the kind of person who, when told, "do not use CAPE ratios as short-term market timing tool!" immediately wonders, "Huh. I wonder if I can use the CAPE ratios as a short-term market timing tool?"

Wouldn't it be great if there were some strategy where you sell when the CAPE ratio gets over X and buy back in when it goes below Y and beat the buy-and-hold performance? Fortunately, Professor Shiller makes his data available, so it's easy to find out.

Download the data and load it into a DataFrame

We're using Python, and its very useful data analysis library Pandas. Its DataFrame structure is a natural for time series processing like this. We'll do some plotting, too; for that, we'll use the matplotlib library.



In [36]:

    
%matplotlib inline
import IPython.html.widgets as widgets
import IPython.display as display
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
import pandas as pd

pd.set_option('display.float_format', lambda x: '%.4f' % x)
pylab.rcParams['figure.figsize'] = 14, 8
pd.set_option('display.width', 400)

# load the CAPE data and clean it up a bit.
# source: http://www.econ.yale.edu/~shiller/data.htm
cape_data = pd.read_csv('CAPE data', skiprows=range(0,8), skip_blank_lines=True, \
                        names=['date','s&p comp price','s&p comp div','s&p comp earnings','CPI',\
                        'date fraction','int rate GS10','real price','real div','real earnings','CAPE','na'])
cape_data = cape_data[0:1737]
cape_data.sort('date',inplace=True)
cape_data.drop('na', axis=1,inplace=True)
cape_data.set_index('date')
cape_data[['s&p comp price', 'CPI', 'int rate GS10']] = cape_data[['s&p comp price', 'CPI','int rate GS10']].astype(float)

cape_data = cape_data.iloc[948:] # look at sample data since 1950

Let's see how the dataset is structured.



In [37]:

    
cape_data.head()









    Out[37]:






  
    
      
      date
      s&p comp price
      s&p comp div
      s&p comp earnings
      CPI
      date fraction
      int rate GS10
      real price
      real div
      real earnings
      CAPE
    
  
  
    
      948
      1950.0100
      16.8800
      1.1500
      2.3400
      23.5000
      1950.0400
      2.3200
      171.4300
      11.6800
      23.7300
      10.7500
    
    
      949
      1950.0200
      17.2100
      1.1600
      2.3500
      23.5000
      1950.1200
      2.3400
      174.7800
      11.7800
      23.9000
      10.9100
    
    
      950
      1950.0300
      17.3500
      1.1700
      2.3700
      23.6000
      1950.2100
      2.3600
      175.4600
      11.8300
      23.9700
      10.9100
    
    
      951
      1950.0400
      17.8400
      1.1800
      2.4300
      23.6000
      1950.2900
      2.3800
      180.4200
      11.9300
      24.5400
      11.1800
    
    
      952
      1950.0500
      18.4400
      1.1900
      2.4800
      23.7000
      1950.3700
      2.4000
      185.7000
      11.9800
      25.0100
      11.4600

Now let's plot the price, CAPE, and 1YPE data. We'll normalize the price data, adjusting for 2015 dollars and scaling it to make it readable next to the earnings ratios.



In [38]:

    
temp_df = cape_data.copy()

# create a column that holds the price in 2015 dollars
CPI2015 = temp_df.iloc[-1]['CPI']
temp_df['2015 price'] = temp_df['s&p comp price'] * ( CPI2015 / temp_df['CPI'] ) * .06
# ... and a column that holds the 1 year PE
temp_df['1PE'] = temp_df['s&p comp price'] / temp_df['s&p comp earnings'] 

# plot the price, CAPE, and 1Y PE
ax = temp_df.plot( x='date', y='2015 price', color='Black')
temp_df.plot( x='date',y='CAPE', color='Red', ax = ax )
temp_df.plot( x='date', y='1PE', color='Blue', ax=ax)
ax.legend(['S&P comp. price, 2015 dollars (scaled)','CAPE (10PE)','1YPE'], loc='upper left')









    Out[38]:





<matplotlib.legend.Legend at 0x10cfb2150>

OK, time to write some code to test our sell- and buy-trigger hypothesis.

Implement the core functions we'll be using

We'll need a split up the input data into two groups. The first group will hold all the periods where we have not hit our sell trigger and are invested in the market. The second group holds all the periods where we hit the sell trigger but have not yet hit our next buy trigger, and thus are not invested in the market.



In [39]:

    
def split_periods( sell_threshold, buy_threshold, df, debug=False ):
    """
    Given an input dataframe, return two lists of subset dataframes. In the first
    list are all the investable periods, and in the second list are the noninvestable
    periods. Investability is determined by CAPE ratio trigger values.
    
    Inputs:
    
        sell_threshold - the CAPE ratio point at which to end the investable period
        buy_threshold - the CAPE ratio at which to start the investable period
        df - the initial dataset with the CAPE ratio column
        debug - optional boolean for heavy debugging output
        
    Outputs:
    
        Returns two lists. The first is a list of subset dataframes that meet the investable
        period criteria, and second is a list of subset dataframes that don't meet the investable
        period critera.
    """
    
    invested_state = True
    invested_periods = list()
    noninvested_periods = list()
    current_period = list()
    # add an 'is invested' column
    df['is_invested'] = None
    
    for index, row in df.iterrows():
        if row['CAPE'] > sell_threshold and invested_state == True:
            # cape too high, cashing out. add the previous invested period to the invested periods list
            invested_state = False
            if debug:
                print "sell threshold, appending " + str( df.loc[ df['date'].isin( current_period) ] )
            invested_periods.append( df.loc[ df['date'].isin( current_period) ] )
            current_period = list()
        if row['CAPE'] < buy_threshold and invested_state == False:
            # cape low enough to buy back in. add the previous noninvested period to the noninvested periods list
            invested_state = True
            if debug:
                print "buy threshold, appending " + str( df.loc[ df['date'].isin( current_period) ] )
            noninvested_periods.append( df.loc[ df['date'].isin( current_period) ] )
            current_period = list()
        # set this row's 'is invested' state
        current_period.append( row['date'] )
        df.loc[index,'is_invested'] = invested_state
        
    # don't forget rows at the end of the dataset
    if invested_state == True:
        if debug:
            print "end of df: appending to invested: " + str( df.loc[ df['date'].isin( current_period) ] )
        invested_periods.append( df.loc[ df['date'].isin( current_period) ] )
    else:
        if debug:
            print "end of df: appending to noninvested: " + str( df.loc[ df['date'].isin( current_period) ] )
        noninvested_periods.append( df.loc[ df['date'].isin( current_period) ] )
    return (invested_periods,noninvested_periods)

For each investable period, we'll need to figure out the price gain or loss over that period. We'll be collecting dividends, so we'll add that into the total.



In [40]:

    
def compute_gain_or_loss( df ):
    '''
    compute the difference from the price in the first row to the price in the last row,
    adding in dividends for the total.
    '''
    if len(df) == 0:
        return 0.0
    divs = df['s&p comp div'] / 12   # dividends are annualized; break it down to months.
    divs_sum = divs.sum()
    if len(df) == 1:
        return divs_sum
    return df.iloc[-1]['s&p comp price'] - df.iloc[0]['s&p comp price'] + divs_sum

For each noninvestable period, we'll be parking our money in a safe short-term interest bearing account. We'll write a function to calculate that amount.



In [41]:

    
def compute_interest( df ):
    '''
    for the time period in the dataframe, compute the interest that would be accrued
    '''
    intr = df['int rate GS10'] / 12         # to get monthly gs10 interest rate
    intr = intr / 10                        # to approximate short term interest rate
    return intr.sum()

Compute_passive_gains() calculates the gains for the baseline buy-and-hold case. It returns the change in price plus the dividends over the entire time period.

Compute_cape_strategy_gains() splits the input data into investable and noninvestable periods, based on its sell and buy thresholds given as parameters. For each of the investable periods, it computes the gain or loss plus the dividends. For each of the noninvestable periods, it computes the interest accrued. It sums these together and returns this as the total gain for the CAPE strategy.



In [42]:

    
def compute_passive_gains( df ):
    '''
    compute the total gains or losses for the dataframe
    '''
    gains_list = map( compute_gain_or_loss, [ df, ] )
    return sum(gains_list)

def compute_cape_strategy_gains( sell_threshold, buy_threshold, df, debug = False ):
    '''
    compute the total gains and losses for the dataframe, using the sell and buy thresholds
    to time the investable periods
    '''
    # split the dataframe into invested and noninvested periods, using the thresholds
    (invested_periods,noninvested_periods) = split_periods(sell_threshold, buy_threshold, df, debug ) 
    if debug:
        print "\n\n===cape invested periods" + str(invested_periods) + "\n===="
        print "\n\n===cape noninvested periods " + str(noninvested_periods) + "\n===="
    # compute the gain or loss for each invested period
    gain_list = map( compute_gain_or_loss, invested_periods )
    # compute the interest accrued for each noninvested period
    int_list = map( compute_interest, noninvested_periods )
    return sum(gain_list) + sum(int_list)

Do the calculations

We calculate the baseline buy-and-hold case, and then loop through a range of possible sell and buy threshold values for the CAPE strategy. We store the difference between the baseline gain and the CAPE strategy in a DataFrame. A positive number means you do better with buy-and-hold than with the given CAPE strategy; a negative number means the CAPE strategy wins.



In [43]:

    
sample_df = cape_data.copy()
results = []

# compute the buy and hold results
baseline_gains = compute_passive_gains( sample_df )

# compute the CAPE strategy results for various sell and buy thresholds
for s in range(20,50):
    for b in range( 10, 30 ):
        if b > s: continue
        cape_gains = compute_cape_strategy_gains( s, b, sample_df )
        results.append( [ baseline_gains - cape_gains, s, b ] )
        
results_df = pd.DataFrame( data=results, columns=['baseline - cape return', 'sell threshold', 'buy threshold'])

Plot the results

Now we plot the results.

The axes indicate the sell and buy threshold parameters tested with the cape strategy. The dots tell you how the test came out: blue means buy-and-hold wins, and red means the CAPE strategy wins. The size of the dot gives you a relative indicator of how much winning is going on; the bigger the dot, the bigger the win.



In [44]:

    
baseline_df = results_df[ results_df['baseline - cape return'] > 0 ]
ax = baseline_df.plot(kind='scatter', x='sell threshold',y='buy threshold', \
                color='Blue', s = baseline_df['baseline - cape return'] * .05)

cape_df = results_df[ results_df['baseline - cape return'] <= 0 ]
cape_df.plot(kind='scatter', x='sell threshold',y='buy threshold', \
                color='Red', s = cape_df['baseline - cape return'] * -.05 , ax = ax )









    Out[44]:





<matplotlib.axes._subplots.AxesSubplot at 0x10cfeeb90>

What this tells me is that you can use the CAPE ratio to beat buy-and-hold -- but only when it's very, very high. Which isn't terribly often: the CAPE ratio has been over 35 4.3% of the 1950-2015 timeframe, all of it during the dotcom boom of 1998-2001.

Looking at other stock market debacles, Black Tuesday's 1929 CAPE ratio was around 30, and Black Monday's 1987 CAPE ratio was in the high teens. When we look at the plot, we see that buy-and-hold easily beats a CAPE strategy that uses either of those numbers as sell triggers.

You see where I'm going with this, right? CAPE ratios are not, in fact, useful market timing tools. Also, it's really, really, really hard to beat buy-and-hold. Really. Effin. Hard.

Appendix: further reading

Professor Shiller's most well-known popular book is Irrational Exuberance, where he describes his theory of behavioral economics and asset bubbles in particular.

Professor Shiller was a fascinating interviewee on Barry Ritholtz's Masters in Business podcast.

Appendix: tests



In [45]:

    
# import and options
%matplotlib inline
import IPython.html.widgets as widgets
import IPython.display as display
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
import pandas as pd

pd.set_option('display.float_format', lambda x: '%.4f' % x)
pylab.rcParams['figure.figsize'] = 14, 8
pd.set_option('display.width', 400)

# read in test file for tests 1 and 2
test_data = pd.read_csv('testdata.csv', skiprows=range(0,8), skip_blank_lines=True, \
                        names=['date','s&p comp price','s&p comp div','s&p comp earnings','CPI',\
                        'date fraction','int rate GS10','real price','real div','real earnings','CAPE','na'])
test_data.drop('na', axis=1,inplace=True)
test_data.sort('date', inplace=True)
test_data.set_index('date')

#############################
#
# test 1
#
#############################
print "\n\ntest 1: baseline and cape strategies are the same"
sample_df = test_data.copy()
results = []

baseline_gains = compute_passive_gains( sample_df )
print "baseline gains " + str(baseline_gains)
assert baseline_gains == 1012.0

print "doing cape computation now"
sell_threshold = 25.0
buy_threshold = 20.0
cape_gains = compute_cape_strategy_gains( sell_threshold, buy_threshold, sample_df )
print "cape_gains " + str(cape_gains)
assert cape_gains == 1012.0
results.append( [ baseline_gains - cape_gains, sell_threshold, buy_threshold ] )
        
results_df = pd.DataFrame( data=results, columns=['baseline - cape return', 'sell threshold', 'buy threshold'])
assert len(results_df) == 1
assert results_df.iloc[0]['baseline - cape return'] == 0.0
assert results_df.iloc[0]['sell threshold'] == sell_threshold
assert results_df.iloc[0]['buy threshold'] == buy_threshold
print "test 1 passed"

#############################
#
# test 2
#
#############################
print "\n\ntest 2: cape strategy always in cash"
sample_df = test_data.copy()
results = []

baseline_gains = compute_passive_gains( sample_df )
print "baseline gains " + str(baseline_gains)
assert baseline_gains == 1012.0

print "doing cape computation now"
sell_threshold = 15.0
buy_threshold = 10.0
cape_gains = compute_cape_strategy_gains( sell_threshold, buy_threshold, sample_df )
print "cape_gains " + str(cape_gains)
assert cape_gains == 0.6
results.append( [ baseline_gains - cape_gains, sell_threshold, buy_threshold ] )
        
results_df = pd.DataFrame( data=results, columns=['baseline - cape return', 'sell threshold', 'buy threshold'])
assert len(results_df) == 1
assert results_df.iloc[0]['baseline - cape return'] == 1011.4
assert results_df.iloc[0]['sell threshold'] == sell_threshold
assert results_df.iloc[0]['buy threshold'] == buy_threshold
print "test 2 passed"

# read in test file for test 3
test_data = pd.read_csv('testdata2.csv', skiprows=range(0,8), skip_blank_lines=True, \
                        names=['date','s&p comp price','s&p comp div','s&p comp earnings','CPI',\
                        'date fraction','int rate GS10','real price','real div','real earnings','CAPE','na'])
test_data.drop('na', axis=1,inplace=True)
test_data.sort('date', inplace=True)
test_data.set_index('date')

#############################
#
# test 3
#
#############################
print "\n\ntest 3: cape strategy sometimes in cash"
sample_df = test_data.copy()
results = []

baseline_gains = compute_passive_gains( sample_df )
print "baseline gains " + str(baseline_gains)
assert baseline_gains == 12.0

print "doing cape computation now"
sell_threshold = 22.0
buy_threshold = 21.0
cape_gains = compute_cape_strategy_gains( sell_threshold, buy_threshold, sample_df, debug = True )
print "cape_gains " + str(cape_gains)
assert cape_gains == 1008.20
results.append( [ baseline_gains - cape_gains, sell_threshold, buy_threshold ] )
        
results_df = pd.DataFrame( data=results, columns=['baseline - cape return', 'sell threshold', 'buy threshold'])
assert len(results_df) == 1
assert results_df.iloc[0]['baseline - cape return'] == -996.2
assert results_df.iloc[0]['sell threshold'] == sell_threshold
assert results_df.iloc[0]['buy threshold'] == buy_threshold
print "test 3 passed"









    




test 1: baseline and cape strategies are the same
baseline gains 1012.0
doing cape computation now
cape_gains 1012.0
test 1 passed


test 2: cape strategy always in cash
baseline gains 1012.0
doing cape computation now
cape_gains 0.6
test 2 passed


test 3: cape strategy sometimes in cash
baseline gains 12.0
doing cape computation now
sell threshold, appending Empty DataFrame
Columns: [date, s&p comp price, s&p comp div, s&p comp earnings, CPI, date fraction, int rate GS10, real price, real div, real earnings, CAPE, is_invested]
Index: []
buy threshold, appending        date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
3 2009.0100       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False
4 2009.0200       2000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False
sell threshold, appending        date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
5 2009.0300       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
6 2009.0400       2000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
buy threshold, appending        date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
7 2009.0500       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False
8 2009.0600       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False
end of df: appending to invested:         date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
9  2009.0700       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
10 2009.0800       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
11 2009.0900       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
0  2009.1000       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
1  2009.1100       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
2  2009.1200       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True


===cape invested periods[Empty DataFrame
Columns: [date, s&p comp price, s&p comp div, s&p comp earnings, CPI, date fraction, int rate GS10, real price, real div, real earnings, CAPE, is_invested]
Index: [],        date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
5 2009.0300       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
6 2009.0400       2000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True,         date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
9  2009.0700       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
10 2009.0800       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
11 2009.0900       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
0  2009.1000       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
1  2009.1100       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True
2  2009.1200       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 20.0000        True]
====


===cape noninvested periods [       date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
3 2009.0100       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False
4 2009.0200       2000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False,        date  s&p comp price  s&p comp div  s&p comp earnings      CPI  date fraction  int rate GS10  real price  real div  real earnings    CAPE is_invested
7 2009.0500       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False
8 2009.0600       1000.0000       12.0000            25.3500 216.1770      2009.7900         6.0000   1178.7300   25.8400        27.9900 25.0000       False]
====
cape_gains 1008.2
test 3 passed



In [ ]:

	date	s&p comp price	s&p comp div	s&p comp earnings	CPI	date fraction	int rate GS10	real price	real div	real earnings	CAPE
948	1950.0100	16.8800	1.1500	2.3400	23.5000	1950.0400	2.3200	171.4300	11.6800	23.7300	10.7500
949	1950.0200	17.2100	1.1600	2.3500	23.5000	1950.1200	2.3400	174.7800	11.7800	23.9000	10.9100
950	1950.0300	17.3500	1.1700	2.3700	23.6000	1950.2100	2.3600	175.4600	11.8300	23.9700	10.9100
951	1950.0400	17.8400	1.1800	2.4300	23.6000	1950.2900	2.3800	180.4200	11.9300	24.5400	11.1800
952	1950.0500	18.4400	1.1900	2.4800	23.7000	1950.3700	2.4000	185.7000	11.9800	25.0100	11.4600