Exercises: Universe Selection - Answer Key

This exercise notebook accompanies the Universe Selection lecture linked below. Please refer to the lecture for explanations and sample code.

https://www.quantopian.com/lectures#Universe-Selection

Part of the Quantopian Lecture Series.


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import QTradableStocksUS, AtLeastN
from quantopian.research import run_pipeline
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume

Helper Functions


In [2]:
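def calculate_daily_turnover(unstacked):
    """Count, per universe column, the assets whose membership changed each day.

    `unstacked` is a boolean frame indexed by date with (column, asset) columns.
    """
    return (unstacked
            .diff()        # True/False showing where values changed from the previous day.
            .iloc[1:]      # Drop the first row, which is meaningless after diff().
            .astype(bool)  # diff() coerces bool -> object; undo that.
            .groupby(axis=1, level=0)  # Group asset columns by pipeline column name.
            .sum())        # Number of membership changes per day.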
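
To see what the helper measures without access to the Quantopian research environment, here is a minimal pandas-only sketch. The three-asset membership table and the name `toy_turnover` are made up for illustration, and the function counts membership changes with an integer diff rather than reproducing the helper line for line.

```python
import pandas as pd

# Hypothetical universe membership: rows are days, columns are assets,
# True means the asset is in the universe on that day.
membership = pd.DataFrame(
    {"A": [True, True, True],
     "B": [True, False, False],
     "C": [False, True, True]},
    index=pd.to_datetime(["2016-01-04", "2016-01-05", "2016-01-06"]),
)

def toy_turnover(membership):
    """Count assets that entered or left the universe on each day."""
    # Cast to 0/1 so diff() yields -1, 0, or +1; abs() turns exits and
    # entries alike into a count of 1.
    changes = membership.astype(int).diff().iloc[1:].abs()
    return changes.sum(axis=1).astype(int)

turnover = toy_turnover(membership)
print(turnover.tolist())
```

On the second day B leaves and C enters, which counts as two changes; nothing changes on the third day.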

Exercise 1: Examining the QTradableStocksUS Universe

a. Initializing the Universe

Set the QTradableStocksUS as your universe by using the QTradableStocksUS() function.


In [3]:
# Your code goes here

universe = QTradableStocksUS()

b. Finding Asset Composition

Use the pipeline API with the QTradableStocksUS as a screen to find and print the list of equities included in the QTradableStocksUS on 2016-07-01.


In [4]:
# Your code goes here

pipe = Pipeline(
    columns={
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

pd.DataFrame(result.index.levels[1])


Out[4]:
0
0 Equity(2 [ARNC])
1 Equity(24 [AAPL])
2 Equity(31 [ABAX])
3 Equity(39 [DDC])
4 Equity(41 [ARCB])
5 Equity(52 [ABM])
6 Equity(53 [ABMD])
7 Equity(62 [ABT])
8 Equity(64 [ABX])
9 Equity(67 [ADSK])
10 Equity(76 [TAP])
11 Equity(84 [ACET])
12 Equity(110 [ACXM])
13 Equity(114 [ADBE])
14 Equity(122 [ADI])
15 Equity(128 [ADM])
16 Equity(154 [AEM])
17 Equity(161 [AEP])
18 Equity(166 [AES])
19 Equity(168 [AET])
20 Equity(185 [AFL])
21 Equity(197 [AGCO])
22 Equity(216 [HES])
23 Equity(239 [AIG])
24 Equity(247 [AIN])
25 Equity(253 [AIR])
26 Equity(266 [AJG])
27 Equity(270 [AKRX])
28 Equity(289 [MATX])
29 Equity(300 [ALK])
... ...
2057 Equity(49203 [GCI])
2058 Equity(49204 [CABO])
2059 Equity(49207 [BLD])
2060 Equity(49208 [BW])
2061 Equity(49210 [CC])
2062 Equity(49213 [ENR])
2063 Equity(49222 [TDOC])
2064 Equity(49229 [KHC])
2065 Equity(49238 [SRG])
2066 Equity(49242 [PYPL])
2067 Equity(49271 [OLLI])
2068 Equity(49275 [RPD])
2069 Equity(49279 [BUFF])
2070 Equity(49288 [LITE])
2071 Equity(49298 [NK])
2072 Equity(49312 [GLBL])
2073 Equity(49315 [Z])
2074 Equity(49318 [BETR])
2075 Equity(49321 [RUN])
2076 Equity(49322 [PLNT])
2077 Equity(49325 [CCP])
2078 Equity(49335 [GBT])
2079 Equity(49413 [PEN])
2080 Equity(49434 [FLOW])
2081 Equity(49448 [PJT])
2082 Equity(49455 [PFGC])
2083 Equity(49456 [SGRY])
2084 Equity(49458 [MSG])
2085 Equity(49460 [NVCR])
2086 Equity(49464 [PSTG])

2087 rows × 1 columns

c. Sector Exposure

Use the pipeline API with the QTradableStocksUS as a screen to find and print the sector composition of the universe on 2016-07-01.


In [5]:
SECTOR_CODE_NAMES = {
    Sector.BASIC_MATERIALS: 'Basic Materials',
    Sector.CONSUMER_CYCLICAL: 'Consumer Cyclical',
    Sector.FINANCIAL_SERVICES: 'Financial Services',
    Sector.REAL_ESTATE: 'Real Estate',
    Sector.CONSUMER_DEFENSIVE: 'Consumer Defensive',
    Sector.HEALTHCARE: 'Healthcare',
    Sector.UTILITIES: 'Utilities',
    Sector.COMMUNICATION_SERVICES: 'Communication Services',
    Sector.ENERGY: 'Energy',
    Sector.INDUSTRIALS: 'Industrials',
    Sector.TECHNOLOGY: 'Technology',
    -1 : 'Misc'
}

# Your code goes here

pipe = Pipeline(
    columns={'Sector': Sector()
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

sectors = result.groupby('Sector').size()
sectors.index = sectors.index.map(lambda code: SECTOR_CODE_NAMES[code])
sectors


Out[5]:
Misc                        1
Basic Materials           132
Consumer Cyclical         308
Financial Services        270
Real Estate               170
Consumer Defensive        100
Healthcare                266
Utilities                  63
Communication Services     40
Energy                    121
Industrials               309
Technology                307
dtype: int64
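
The grouping step can be tried outside the platform with plain pandas. The sketch below is only an illustration: the tickers are invented, and the numeric sector codes are stand-ins for whatever `Sector()` returns, so treat them as an assumption rather than the platform's definition.

```python
import pandas as pd

# Illustrative sector codes standing in for the values of Sector();
# the exact numbers are an assumption made for this example.
sector_names = {101: "Basic Materials", 206: "Healthcare",
                311: "Technology", -1: "Misc"}

# Hypothetical pipeline-style output: one row per asset with its sector code.
result = pd.DataFrame(
    {"Sector": [311, 311, 206, 101, 311, -1]},
    index=["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
)

# Count assets per sector code, then swap the codes for readable names.
counts = result.groupby("Sector").size()
counts.index = counts.index.map(lambda code: sector_names[code])
print(counts)
```

Because `groupby` sorts by the numeric code, the -1 bucket appears first, which matches the position of the Misc row in the output above.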

d. Turnover Rate

Use the pipeline API with the QTradableStocksUS as a screen and the calculate_daily_turnover helper function to find and plot the turnover of the universe during 2016.


In [6]:
# Your code goes here

pipe = Pipeline(
    columns={'QTradableStocksUS' : universe
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-01-01', '2017-01-01')

result = result.unstack().fillna(False)

turnover = calculate_daily_turnover(result)

turnover.plot(figsize=(14, 8));

turnover.describe()


Out[6]:
QTradableStocksUS
count 252.000000
mean 3.896825
std 2.042689
min 0.000000
25% 3.000000
50% 4.000000
75% 5.000000
max 11.000000
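
The reshaping step matters because a screened pipeline result only has rows for assets that passed the screen on a given day. The sketch below shows the idea on a made-up two-day index; it uses `unstack(fill_value=False)` instead of `unstack().fillna(False)` so that the toy frame stays boolean, which is a substitution on my part rather than what the cell above does.

```python
import pandas as pd

# Hypothetical screened output: a row exists only when the asset passed
# the screen on that day, so asset "B" has no row on the second day.
index = pd.MultiIndex.from_tuples(
    [("2016-01-04", "A"), ("2016-01-04", "B"), ("2016-01-05", "A")],
    names=["date", "asset"],
)
in_universe = pd.Series(True, index=index)

# Pivot assets into columns; missing (date, asset) pairs become False.
wide = in_universe.unstack(fill_value=False).astype(bool)
print(wide)
```

Without the fill, the missing pair would be NaN and the exit of B on the second day would not register as a membership change.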

Exercise 2: Examining Tradability

a. NetIncome 1500

Create a universe consisting of the top 1500 equities by net income, then find and print the list of equities included in the universe on 2016-07-01.


In [7]:
# Your code goes here

universe = morningstar.income_statement.net_income.latest.top(1500)

pipe = Pipeline(
    columns={
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

pd.DataFrame(result.index.levels[1])


Out[7]:
0
0 Equity(24 [AAPL])
1 Equity(62 [ABT])
2 Equity(66 [AB])
3 Equity(76 [TAP])
4 Equity(114 [ADBE])
5 Equity(122 [ADI])
6 Equity(128 [ADM])
7 Equity(154 [AEM])
8 Equity(161 [AEP])
9 Equity(166 [AES])
10 Equity(168 [AET])
11 Equity(185 [AFL])
12 Equity(266 [AJG])
13 Equity(270 [AKRX])
14 Equity(300 [ALK])
15 Equity(332 [ALX])
16 Equity(337 [AMAT])
17 Equity(353 [AME])
18 Equity(357 [TWX])
19 Equity(368 [AMGN])
20 Equity(410 [AN])
21 Equity(412 [ANAT])
22 Equity(438 [AON])
23 Equity(465 [APH])
24 Equity(523 [AAN])
25 Equity(538 [ARW])
26 Equity(547 [ASB])
27 Equity(559 [ASH])
28 Equity(595 [GAS])
29 Equity(600 [OA])
... ...
1470 Equity(49723 [PSA_PRB])
1471 Equity(49734 [BAC_PRC])
1472 Equity(49746 [C_PRS])
1473 Equity(49750 [FRC_PRG])
1474 Equity(49758 [OSB])
1475 Equity(49781 [GS_PRN])
1476 Equity(49786 [SCHW_PRD])
1477 Equity(49805 [BBT_PRH])
1478 Equity(49820 [AFSI_PRE])
1479 Equity(49831 [HBAN_O])
1480 Equity(49870 [STT_PRG])
1481 Equity(49876 [BATS])
1482 Equity(49877 [PNK])
1483 Equity(49878 [BATR_A])
1484 Equity(49879 [BATR_K])
1485 Equity(49880 [LSXM_B])
1486 Equity(49881 [LSXM_A])
1487 Equity(49883 [LSXM_K])
1488 Equity(49885 [FWON_A])
1489 Equity(49908 [RRR])
1490 Equity(49909 [BAC_PRA])
1491 Equity(49941 [PRE_PRG])
1492 Equity(49942 [PRE_PRI])
1493 Equity(49943 [PRE_PRH])
1494 Equity(49976 [DFT_PRC])
1495 Equity(49977 [PSA_PRC])
1496 Equity(49987 [IBKC_O])
1497 Equity(49998 [LHO_PRJ])
1498 Equity(50059 [VR_PRA])
1499 Equity(50061 [WFC_PRX])

1500 rows × 1 columns

b. Measuring Tradability

Find the mean 200-day average dollar volume of the NetIncome 1500 universe using the built-in AverageDollarVolume factor, and compare it to that of the QTradableStocksUS.


In [8]:
# Your code goes here

pipe = Pipeline(
    columns={'Average Dollar Volume' : AverageDollarVolume(window_length = 200)
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

print "NetIncome 1500 ADV:", np.mean(result['Average Dollar Volume'])

pipe = Pipeline(
    columns={'Average Dollar Volume' : AverageDollarVolume(window_length = 30)
    },
    screen=QTradableStocksUS()
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

print "QTradableStocksUS ADV:", np.mean(result['Average Dollar Volume'])


NetIncome 1500 ADV: 74816585.5336
QTradableStocksUS ADV: 61220949.8328

Using average dollar volume as a stand-in liquidity metric, the printed means actually favor the NetIncome 1500 (about 74.8 million versus 61.2 million). The comparison is not like-for-like, however: the NetIncome 1500 figure uses a 200-day window while the QTradableStocksUS figure uses a 30-day window, and a mean can be dominated by a handful of very heavily traded names.

We used ADV as a liquidity metric here because it is an important indicator and simple to calculate. However, average dollar volume is not a perfect measure of liquidity: liquidity is determined by several factors, and volume is only one of them. For more information on liquidity and its effects on algorithm performance, refer to the lecture on Volume, Slippage, and Liquidity.
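
As a rough illustration of what an average dollar volume figure represents, the sketch below assumes it is the mean of close price times share volume over a trailing window. That reading of the factor is my assumption, the prices and volumes are invented, and `average_dollar_volume` is a hypothetical helper, not the platform's implementation.

```python
import pandas as pd

# Hypothetical daily closes and share volumes for one asset.
close = pd.Series([10.0, 11.0, 12.0, 11.0, 10.0])
volume = pd.Series([1000, 2000, 1500, 1000, 500])

dollar_volume = close * volume  # value traded each day

def average_dollar_volume(dollar_volume, window_length):
    """Mean dollar volume over the most recent `window_length` days."""
    return dollar_volume.iloc[-window_length:].mean()

adv_short = average_dollar_volume(dollar_volume, 2)
adv_long = average_dollar_volume(dollar_volume, 5)
print(adv_short, adv_long)
```

The same asset gets a different figure under each window, which is why two universes are best compared using the same `window_length`.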

Exercise 3: Sector Balance

a. Dividend 1500

Create a universe consisting of the top 1500 equities by dividend yield, then find and print the list of equities included in this universe on 2016-07-01.


In [9]:
# Your code goes here

universe = morningstar.valuation_ratios.dividend_yield.latest.top(1500)

pipe = Pipeline(
    columns={
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

pd.DataFrame(result.index.levels[1])

b. Dividend 1500 Sector Composition

Find and print the sector composition of the universe on 2016-07-01.


In [10]:
SECTOR_CODE_NAMES = {
    Sector.BASIC_MATERIALS: 'Basic Materials',
    Sector.CONSUMER_CYCLICAL: 'Consumer Cyclical',
    Sector.FINANCIAL_SERVICES: 'Financial Services',
    Sector.REAL_ESTATE: 'Real Estate',
    Sector.CONSUMER_DEFENSIVE: 'Consumer Defensive',
    Sector.HEALTHCARE: 'Healthcare',
    Sector.UTILITIES: 'Utilities',
    Sector.COMMUNICATION_SERVICES: 'Communication Services',
    Sector.ENERGY: 'Energy',
    Sector.INDUSTRIALS: 'Industrials',
    Sector.TECHNOLOGY: 'Technology',
    -1 : 'Misc'
}

# Your code goes here

universe = morningstar.valuation_ratios.dividend_yield.latest.top(1500)


pipe = Pipeline(
    columns={'Sector': Sector()
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

sectors = 100*result.groupby('Sector').size()/1500
sectors.index = sectors.index.map(lambda code: SECTOR_CODE_NAMES[code])
sectors


Out[10]:
Basic Materials            4.600000
Consumer Cyclical          6.066667
Financial Services        33.866667
Real Estate               21.200000
Consumer Defensive         2.133333
Healthcare                 1.800000
Utilities                  4.000000
Communication Services     2.400000
Energy                    11.800000
Industrials                8.200000
Technology                 3.933333
dtype: float64

The Dividend 1500 has a very unbalanced sector composition: about 34% of its equities are in the financial services sector and another 21% are in real estate. This concentration exposes the universe to sector risk and makes it vulnerable to volatility in those sectors. The QTradableStocksUS avoids this by capping the number of equities from any single sector at 30%.
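
A quick way to check a universe against a per-sector limit is to filter the sector weights directly. The sketch below copies the percentages from Out[10], rounded to two decimals, and uses the 30% figure quoted in the text; the variable names `cap` and `over_cap` are hypothetical.

```python
import pandas as pd

# Sector weights (percent) copied from Out[10], rounded to two decimals.
weights = pd.Series({
    "Basic Materials": 4.60, "Consumer Cyclical": 6.07,
    "Financial Services": 33.87, "Real Estate": 21.20,
    "Consumer Defensive": 2.13, "Healthcare": 1.80,
    "Utilities": 4.00, "Communication Services": 2.40,
    "Energy": 11.80, "Industrials": 8.20, "Technology": 3.93,
})

cap = 30.0  # percent; the per-sector limit quoted in the text
over_cap = weights[weights > cap]  # sectors that exceed the limit
print(over_cap)
```

Only financial services exceeds the limit here, although real estate is also far larger than any of the remaining sectors.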

Exercise 4: Turnover Smoothing

a. PE 1500

Create a universe consisting of the top 1500 equities by price to earnings ratio, then find and print the list of equities included in this universe on 2016-07-01.


In [11]:
# Your code goes here

universe = morningstar.valuation_ratios.pe_ratio.latest.top(1500)

pipe = Pipeline(
    columns={
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-07-01', '2016-07-01')

pd.DataFrame(result.index.levels[1])


Out[11]:
0
0 Equity(31 [ABAX])
1 Equity(39 [DDC])
2 Equity(52 [ABM])
3 Equity(53 [ABMD])
4 Equity(67 [ADSK])
5 Equity(69 [ACAT])
6 Equity(76 [TAP])
7 Equity(100 [IEP])
8 Equity(110 [ACXM])
9 Equity(114 [ADBE])
10 Equity(153 [AE])
11 Equity(154 [AEM])
12 Equity(225 [AHPI])
13 Equity(239 [AIG])
14 Equity(301 [ALKS])
15 Equity(311 [ALOG])
16 Equity(351 [AMD])
17 Equity(366 [AVD])
18 Equity(371 [TVTY])
19 Equity(392 [AMS])
20 Equity(447 [AP])
21 Equity(450 [CLFD])
22 Equity(455 [APC])
23 Equity(484 [ATU])
24 Equity(553 [ASEI])
25 Equity(559 [ASH])
26 Equity(579 [ASTE])
27 Equity(600 [OA])
28 Equity(610 [ATNI])
29 Equity(629 [AU])
... ...
1470 Equity(49463 [KLDX])
1471 Equity(49501 [LIVN])
1472 Equity(49503 [AFCO])
1473 Equity(49516 [MPSX])
1474 Equity(49523 [TLGT])
1475 Equity(49576 [AC])
1476 Equity(49606 [MIME])
1477 Equity(49608 [MTCH])
1478 Equity(49627 [RMR])
1479 Equity(49630 [CSRA])
1480 Equity(49655 [TEAM])
1481 Equity(49666 [AGR])
1482 Equity(49668 [CCRC])
1483 Equity(49697 [CIFC])
1484 Equity(49700 [FCE_A])
1485 Equity(49701 [FCE_B])
1486 Equity(49703 [PBBI])
1487 Equity(49722 [SPI])
1488 Equity(49727 [GCP])
1489 Equity(49758 [OSB])
1490 Equity(49817 [HCM])
1491 Equity(49830 [UA])
1492 Equity(49880 [LSXM_B])
1493 Equity(49881 [LSXM_A])
1494 Equity(49883 [LSXM_K])
1495 Equity(49894 [ARA])
1496 Equity(49908 [RRR])
1497 Equity(49959 [SITE])
1498 Equity(50002 [COTV])
1499 Equity(50005 [ZDGE])

1500 rows × 1 columns

b. PE 1500 Turnover

Use the calculate_daily_turnover helper function to find and plot the turnover of the PE 1500 universe during 2016. Compare the average to that of the QTradableStocksUS.


In [12]:
# Your code goes here

pipe = Pipeline(
    columns={'Price to Earnings Ratio 1500' : universe
    },
    screen=universe
)

result = run_pipeline(pipe, '2016-01-01', '2017-01-01')

result = result.unstack().fillna(False)

turnover = calculate_daily_turnover(result)

turnover.plot(figsize=(14, 8));

print turnover.describe().loc['mean']


Price to Earnings Ratio 1500    23.269841
Name: mean, dtype: float64

The mean turnover (about 23.3 changes per day) was roughly six times as high as in the QTradableStocksUS (about 3.9 changes per day), which has built-in smoothing features to prevent equities near the threshold from entering and exiting frequently.

c. Smoothing the PE 1500

Using AtLeastN, apply a smoothing function to the PE 1500 to reduce turnover noise and find the new mean turnover.


In [13]:
# Your code goes here

universe_smoothed = AtLeastN(inputs=[universe],
                             window_length=21,
                             N=16)

pipe = Pipeline(
    columns={'Smoothed PE 1500' : universe_smoothed
    },
    screen=universe_smoothed
)

result = run_pipeline(pipe, '2016-01-01', '2017-01-01')

result = result.unstack().fillna(False)

turnover = calculate_daily_turnover(result)

print turnover.describe().loc['mean']


Smoothed PE 1500    8.849206
Name: mean, dtype: float64

The mean turnover of the smoothed universe (about 8.8 changes per day) is less than half of the unsmoothed figure (about 23.3). Requiring membership on at least 16 of the trailing 21 days filters out the noise from small movements near the threshold and leaves mostly the meaningful turnover events.
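
The smoothing idea can be mimicked in plain pandas with a rolling count. This is a sketch of the "at least N of the last window_length days" rule on an invented membership series, using toy values of 5 and 3 in place of the 21 and 16 from the exercise; it is not the `AtLeastN` implementation itself, and `count_changes` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical daily membership for one asset hovering near the cutoff:
# it drops out for a single day twice.
raw = pd.Series([True, True, True, False, True, True, False, True, True, True])

window_length, n = 5, 3  # toy values; the exercise uses 21 and 16

# True when the asset was a member on at least `n` of the last
# `window_length` days (days without a full window come out False).
smoothed = raw.astype(int).rolling(window_length).sum() >= n

def count_changes(flags):
    """Number of day-to-day flips in a boolean series."""
    return int(flags.astype(int).diff().abs().sum())

# Compare only the days that have a full window behind them.
raw_changes = count_changes(raw.iloc[window_length - 1:])
smoothed_changes = count_changes(smoothed.iloc[window_length - 1:])
print(raw_changes, smoothed_changes)
```

The one-day dropout inside the comparison period produces two flips in the raw series and none in the smoothed one, which is the effect the exercise measures at scale.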


Congratulations on completing the Universe Selection answer key!

As you learn more about writing trading models and the Quantopian platform, enter a daily Quantopian Contest. Your strategy will be evaluated for a cash prize every day.

Start by going through the Writing a Contest Algorithm tutorial.

This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian, Inc. has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian, Inc. at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.