By Evgenia "Jenny" Nitishinskaya, Delaney Granizo-Mackenzie, and Maxwell Margenot.
Part of the Quantopian Lecture Series.
Notebook released under the Creative Commons Attribution 4.0 License.
Fundamentals are data having to do with the asset's issuer, such as the sector, size, and expenses of the company. We can use this data to build a linear factor model, expressing the returns on any asset as
$$R_t = a_t + b_{t1} F_1 + b_{t2} F_2 + \ldots + b_{tK} F_K + \epsilon_t$$

There are two different approaches to computing the factors $F_j$, which represent the returns associated with some fundamental characteristics, and the factor sensitivities $b_{tj}$.
In the first, we start by representing each characteristic of interest by a portfolio: we sort all assets by that characteristic, then build the portfolio by going long the top quantile of assets and short the bottom quantile. The factor corresponding to this characteristic is the return on this portfolio. Then the sensitivities $b_{ij}$ are estimated for each asset $i$ by regressing its historical returns $R_i$ on the historical factor returns.
We'll use the canonical Fama-French factors for this example, which are the returns of portfolios constructed based on fundamental characteristics.
We start by getting the fundamentals data for all assets and constructing the portfolios for each characteristic:
Import some libraries.
In [1]:
import numpy as np
import pandas as pd
from quantopian.pipeline.data import morningstar
import statsmodels.api as sm
from statsmodels import regression
import matplotlib.pyplot as plt
import scipy.stats
Set the date range for which we want data.
In [2]:
start_date = '2011-1-1'
end_date = '2012-1-1'
The Pipeline API is a very useful tool for factor analysis. We use it here to get the data for our analysis: specifically, the daily values of book-to-price ratio and market cap for every security. We also perform several other useful filtering steps, which are detailed in the code comments.
In [3]:
import numpy as np
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import CustomFactor, Returns
# Here's the raw data we need, everything else is derivative.
class MarketCap(CustomFactor):
    # Here's the data we need for this factor
    inputs = [morningstar.valuation.shares_outstanding, USEquityPricing.close]
    # Only need the most recent values for both series
    window_length = 1

    def compute(self, today, assets, out, shares, close_price):
        # Shares * price/share = total price = market cap
        out[:] = shares * close_price

class BookToPrice(CustomFactor):
    # pb = price to book, we'll need to take the reciprocal later
    inputs = [morningstar.valuation_ratios.pb_ratio]
    window_length = 1

    def compute(self, today, assets, out, pb):
        out[:] = 1 / pb

def make_pipeline():
    """
    Create and return our pipeline.

    We break this piece of logic out into its own function to make it easier to
    test and modify in isolation. In particular, this function can be
    copy/pasted into research and run by itself.
    """
    pipe = Pipeline()

    # Add our factors to the pipeline
    market_cap = MarketCap()
    # Raw market cap and book to price data gets fed in here
    pipe.add(market_cap, "market_cap")
    book_to_price = BookToPrice()
    pipe.add(book_to_price, "book_to_price")

    # We also get daily returns
    returns = Returns(inputs=[USEquityPricing.close], window_length=2)
    pipe.add(returns, "returns")

    # We compute a daily rank of both factors, this is used in the next step,
    # which is computing portfolio membership.
    market_cap_rank = market_cap.rank()
    pipe.add(market_cap_rank, 'market_cap_rank')
    book_to_price_rank = book_to_price.rank()
    pipe.add(book_to_price_rank, 'book_to_price_rank')

    # Build Filters representing the top and bottom 1000 stocks by each ranking.
    biggest = market_cap_rank.top(1000)
    smallest = market_cap_rank.bottom(1000)
    highpb = book_to_price_rank.top(1000)
    lowpb = book_to_price_rank.bottom(1000)

    # Don't return anything not in this set, as we don't need it.
    pipe.set_screen(biggest | smallest | highpb | lowpb)

    # Add the boolean flags we computed to the output data
    pipe.add(biggest, 'biggest')
    pipe.add(smallest, 'smallest')
    pipe.add(highpb, 'highpb')
    pipe.add(lowpb, 'lowpb')

    return pipe
Now we initialize the pipeline.
In [4]:
pipe = make_pipeline()
We can visualize the dependency graph of our data computations here.
In [5]:
pipe.show_graph('png')
Out[5]:
This function will allow us to run the pipeline.
In [6]:
from quantopian.research import run_pipeline
Now let's actually run it and check out our results.
In [7]:
# This takes a few minutes.
results = run_pipeline(pipe, start_date, end_date)
results
Out[7]:
Great, we have all the data. Now we need to compute the returns of our portfolios over time. We have the daily returns for each equity, plus whether or not that equity was included in any given portfolio on any given day. We can combine that information in the following way to yield daily portfolio returns.
Step 1: Subset our results into only data belonging to our 'biggest' portfolio.
In [8]:
results[results.biggest]
Out[8]:
Step 2: Get returns.
In [9]:
results[results.biggest]['returns']
Out[9]:
Step 3: Group by day and take the mean. This is pretty deep into pandas logic, so if you don't understand it on a first pass, check out the pandas documentation for the functions used, especially groupby, which is very useful. Keep in mind that the index in our results is a MultiIndex rather than a regular Index, which can complicate things.
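To make the grouping concrete, here is a toy example, with made-up dates, tickers, and returns, of taking the level-0 mean of a MultiIndexed Series; this is exactly the operation we apply to the pipeline output below.

idx = pd.MultiIndex.from_tuples([('2011-01-03', 'A'), ('2011-01-03', 'B'),
                                 ('2011-01-04', 'A'), ('2011-01-04', 'B')],
                                names=['date', 'equity'])
toy_returns = pd.Series([0.01, -0.02, 0.03, 0.00], index=idx)  # fabricated values
# Mean return across all equities on each date
print toy_returns.groupby(level=0).mean()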
In [10]:
results[results.biggest]['returns'].groupby(level=0).mean()
Out[10]:
Now we run through this computation for each portfolio and construct the classic Fama-French factors: SMB (the return of small caps over big caps) and HML (the return of high book-to-price over low book-to-price).
In [11]:
R_biggest = results[results.biggest]['returns'].groupby(level=0).mean()
R_smallest = results[results.smallest]['returns'].groupby(level=0).mean()
R_highpb = results[results.highpb]['returns'].groupby(level=0).mean()
R_lowpb = results[results.lowpb]['returns'].groupby(level=0).mean()
SMB = R_smallest - R_biggest
HML = R_highpb - R_lowpb
What were the daily returns?
In [12]:
plt.plot(SMB.index, SMB.values)
plt.ylabel('Daily Percent Return')
plt.legend(['SMB Portfolio Returns']);
In [13]:
plt.plot(HML.index, HML.values)
plt.ylabel('Daily Percent Return')
plt.legend(['HML Portfolio Returns']);
And what would it look like to hold these portfolios over time?
In [14]:
plt.plot(SMB.index, np.cumprod(SMB.values+1))
plt.ylabel('Cumulative Return')
plt.legend(['SMB Portfolio Returns']);
The last data we need are the daily returns on the broad market.
In [15]:
M = get_pricing('SPY', start_date=start_date, end_date=end_date, fields='price').pct_change()[1:]
In [16]:
plt.plot(M.index, M.values)
plt.ylabel('Daily Percent Return')
plt.legend(['Market Portfolio Returns']);
In [17]:
# Get returns data for our portfolio
portfolio = get_pricing(['MSFT', 'AAPL', 'YHOO', 'FB', 'TSLA'],
                        fields='price', start_date=start_date, end_date=end_date).pct_change()[1:]
R = np.mean(portfolio, axis=1)
Put all the data into one dataframe for convenience.
In [18]:
# Define a constant to compute intercept
constant = pd.TimeSeries(np.ones(len(R.index)), index=R.index)

df = pd.DataFrame({'R': R,
                   'M': M,
                   'SMB': SMB,
                   'HML': HML,
                   'Constant': constant})
df = df.dropna()
Perform the regression. You'll notice that these are the sensitivities over an entire year. It can be valuable to look at the rolling sensitivities as well to determine how stable they are.
In [19]:
# Perform linear regression to get the coefficients in the model;
# include the constant column so the model is fit with an intercept
b1, b2, b3, a = regression.linear_model.OLS(df['R'], df[['M', 'SMB', 'HML', 'Constant']]).fit().params

# Print the coefficients from the linear regression
print 'Historical Sensitivities of portfolio returns to factors:\nMarket: %f\nMarket cap: %f\nB/P: %f' % (b1, b2, b3)
Let's perform a rolling regression to look at how the estimated sensitivities change over time.
In [20]:
model = pd.stats.ols.MovingOLS(y=df['R'], x=df[['M', 'SMB', 'HML']],
                               window_type='rolling',
                               window=100)
rolling_parameter_estimates = model.beta

rolling_parameter_estimates.plot();
plt.title('Computed Betas');
plt.legend(['Market Beta', 'SMB Beta', 'HML Beta', 'Intercept']);
Another approach is to normalize the factor values within each bar and see how predictive of that bar's returns they were. We do this by computing a normalized factor value $b_{aj}$ for each asset $a$ in the following way:
$$b_{aj} = \frac{F_{aj} - \mu_{F_j}}{\sigma_{F_j}}$$

$F_{aj}$ is the value of factor $j$ for asset $a$ during this bar, $\mu_{F_j}$ is the mean factor value across all assets, and $\sigma_{F_j}$ is the standard deviation of factor values over all assets. Notice that we are just computing a z-score to make asset-specific factor values comparable across different factors.
The exceptions to this formula are indicator variables, which are set to 1 for true and 0 for false. One example is industry membership: the exposure simply records whether or not the asset belongs to the industry.
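For an industry-membership indicator, the exposure is just

$$b_{aj} = \begin{cases} 1 & \text{if asset } a \text{ is in industry } j \\ 0 & \text{otherwise} \end{cases}$$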
After we calculate all of the normalized scores during bar $t$, we can estimate factor $j$'s return $F_{jt}$ using a cross-sectional regression (i.e. at each time step, we perform a regression using the equations for all of the assets). Specifically, once we have the returns $R_{at}$ and normalized factor exposures $b_{aj}$ for each asset, we construct the following model and estimate the $F_{jt}$s and $a_t$:
$$R_{at} = a_t + b_{a1}F_{1t} + b_{a2}F_{2t} + \dots + b_{aK}F_{Kt} + \epsilon_{at}$$

You can think of this as slicing through the other direction from the first analysis: now the factor returns are the unknowns to be solved for, whereas originally the sensitivities were. Another way to think about it is that you're determining how predictive of returns the factor was on that day, and therefore how much return you could have squeezed out of that factor.
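In matrix form, the regression at a single bar $t$ stacks the equations for all $N$ assets:

$$\begin{pmatrix} R_{1t} \\ \vdots \\ R_{Nt} \end{pmatrix} = \begin{pmatrix} 1 & b_{11} & \cdots & b_{1K} \\ \vdots & \vdots & & \vdots \\ 1 & b_{N1} & \cdots & b_{NK} \end{pmatrix} \begin{pmatrix} a_t \\ F_{1t} \\ \vdots \\ F_{Kt} \end{pmatrix} + \begin{pmatrix} \epsilon_{1t} \\ \vdots \\ \epsilon_{Nt} \end{pmatrix}$$

so each day's regression has $N$ observations and $K + 1$ unknowns.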
Following this procedure, we'll get the cross-sectional returns on 2011-01-03, and compute the coefficients for all assets:
In [21]:
BTP = results['book_to_price']['2011-1-3']
zscore = (BTP - np.mean(BTP)) / np.std(BTP)
zscore.dropna(inplace=True)
plt.hist(zscore)
plt.xlabel('Z-Score')
plt.ylabel('Frequency');
Notice the big outliers in the dataset, which cause the z-scores to lose a lot of information: the presence of a few huge book-to-price datapoints squeezes the rest of the data into a relatively small range of z-scores. We need to get around this issue using some data cleaning technique; here we use winsorization.
Winsorization takes the top $n\%$ of a dataset and sets each of those values equal to the least extreme value in the top $n\%$ (and likewise for the bottom $n\%$). For example, if your dataset ranged from 0-10, plus a few crazy outliers, those outliers would be set to 0 or 10 depending on their direction. Here is an example.
In [22]:
# Get some random data
X = np.random.normal(0, 1, 100)
# Put in some outliers
X[0] = 1000
X[1] = -1000
# Perform winsorization
print 'Before winsorization', np.min(X), np.max(X)
scipy.stats.mstats.winsorize(X, inplace=True, limits=0.01)
print 'After winsorization', np.min(X), np.max(X)
This looks good; let's see how our book-to-price data looks when winsorized.
In [23]:
BTP = results['book_to_price']['2011-1-3']
scipy.stats.mstats.winsorize(BTP, inplace=True, limits=0.01)
BTP_z = (BTP - np.mean(BTP)) / np.std(BTP)
BTP_z.dropna(inplace=True)
plt.hist(BTP_z)
plt.xlabel('Z-Score')
plt.ylabel('Frequency');
We need the returns for that day as well.
In [24]:
R_day = results['returns']['2011-1-3']
Now set up our data and estimate $F_j$ using linear regression.
In [25]:
constant = pd.TimeSeries(np.ones(len(R_day.index)), index=R_day.index)

df_day = pd.DataFrame({'R': R_day,
                       'BTP_z': BTP_z,
                       'Constant': constant})
df_day = df_day.dropna()

# Perform linear regression to get the coefficients in the model;
# include the constant column so we also estimate the intercept a_t
F1, a = regression.linear_model.OLS(df_day['R'], df_day[['BTP_z', 'Constant']]).fit().params
print F1
Finally, let's add another factor so you can see how the code changes.
In [26]:
MKT = results['market_cap']['2011-1-3']
scipy.stats.mstats.winsorize(MKT, inplace=True, limits=0.01)
MKT_z = (MKT - np.mean(MKT)) / np.std(MKT)

constant = pd.TimeSeries(np.ones(len(R_day.index)), index=R_day.index)

df_day = pd.DataFrame({'R': R_day,
                       'BTP_z': BTP_z,
                       'MKT_z': MKT_z,
                       'Constant': constant})
df_day = df_day.dropna()

# Perform linear regression to get the coefficients in the model;
# again including the constant column for the intercept
F1, F2, a = regression.linear_model.OLS(df_day['R'], df_day[['BTP_z', 'MKT_z', 'Constant']]).fit().params
print F1, F2
To expand this analysis, you would simply loop through days, running this regression every day to get a time series of estimated factor returns.
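As a rough illustration, here is a sketch of that loop, assuming the `results` DataFrame from the pipeline run above; it just repeats the single-day steps for every date in the sample and is illustrative rather than optimized.

factor_returns = {}
for day in results.index.levels[0]:
    R_day = results['returns'][day]
    BTP = results['book_to_price'][day]
    scipy.stats.mstats.winsorize(BTP, inplace=True, limits=0.01)
    BTP_z = (BTP - np.mean(BTP)) / np.std(BTP)
    MKT = results['market_cap'][day]
    scipy.stats.mstats.winsorize(MKT, inplace=True, limits=0.01)
    MKT_z = (MKT - np.mean(MKT)) / np.std(MKT)
    constant = pd.TimeSeries(np.ones(len(R_day.index)), index=R_day.index)
    df_day = pd.DataFrame({'R': R_day,
                           'BTP_z': BTP_z,
                           'MKT_z': MKT_z,
                           'Constant': constant}).dropna()
    # Each day's regression coefficients are that day's estimated factor
    # returns (plus the intercept)
    factor_returns[day] = regression.linear_model.OLS(
        df_day['R'], df_day[['BTP_z', 'MKT_z', 'Constant']]).fit().params

factor_returns = pd.DataFrame(factor_returns).T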
As discussed in the Arbitrage Pricing Theory lecture, factor modeling can be used to predict future returns based on current fundamental factors, or to determine when an asset may be mispriced. Modeling future returns is accomplished by offsetting the returns in the regression, so that rather than predicting current returns, you are predicting future returns. Once you have a predictive model, the most canonical way to create a strategy is to attempt a long-short equity approach.
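Before turning to strategy construction, here is a minimal sketch of that offset, assuming the factor DataFrame `df` built earlier in this notebook; we shift the return column back one bar so each row's factor values are regressed against the following day's return.

df_pred = df.copy()
df_pred['R_fwd'] = df_pred['R'].shift(-1)  # tomorrow's return on today's row
df_pred = df_pred.dropna()

predictive_model = regression.linear_model.OLS(
    df_pred['R_fwd'], df_pred[['M', 'SMB', 'HML', 'Constant']]).fit()
print predictive_model.params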
There is a full lecture describing long-short equity, but the general idea is that you rank equities based on their predicted future returns, then long the top $p\%$ and short the bottom $p\%$ while keeping the dollar amounts on each side equal (dollar neutral). If the assets at the top of the ranking tend to make $5\%$ more per year than the market on average, and assets at the bottom tend to make $5\%$ less, then you will make $(M + 0.05) - (M - 0.05) = 0.10$, or $10\%$, per year, where $M$ is the market return that gets canceled out.
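Here is a minimal sketch of forming such a portfolio from a ranking; the predicted returns below are randomly generated placeholders, not output from the model above.

predicted = pd.Series(np.random.normal(0, 0.01, 100),
                      index=['equity_%d' % i for i in range(100)])
p = 0.2                                # fraction of the universe to long/short
n = int(len(predicted) * p)
ranked = predicted.rank(ascending=False)

weights = pd.Series(0.0, index=predicted.index)
weights[ranked <= n] = 1.0 / n                   # long the top p%
weights[ranked > len(predicted) - n] = -1.0 / n  # short the bottom p%

# Equal dollars long and short, so the market return M cancels out
print 'Net exposure:', weights.sum()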
Once we've determined that we are exposed to a factor, we may want to avoid depending on the performance of that factor by taking out a hedge. This is discussed in the Beta Hedging lecture and also in the Risk Factor Exposure notebook.
This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian, Inc. has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian, Inc. at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.