This exercise notebook accompanies the Universe Selection lecture. Please use the lecture for explanations and sample code.
https://www.quantopian.com/lectures#Universe-Selection
Part of the Quantopian Lecture Series.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import QTradableStocksUS, AtLeastN
from quantopian.research import run_pipeline
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume
In [2]:
def calculate_daily_turnover(unstacked):
    return (unstacked
            .diff()        # Get True/False showing where values changed from previous day.
            .iloc[1:]      # Drop first row, which is meaningless after diff().
            .astype(bool)  # diff() coerces from bool -> object :(. Undo that.
            .groupby(axis=1, level=0)
            .sum())
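As a rough illustration of what this helper measures, the sketch below counts daily turnover on a toy universe using plain Python sets instead of a pipeline result. The function name `daily_turnover` and the tickers are hypothetical; the assumption is that turnover on a given day means the number of assets whose membership changed relative to the previous day.

```python
def daily_turnover(membership_by_day):
    """Count assets that entered or left the universe between consecutive days."""
    days = sorted(membership_by_day)
    turnover = {}
    for prev_day, day in zip(days, days[1:]):
        # Symmetric difference: assets present on exactly one of the two days.
        turnover[day] = len(membership_by_day[prev_day] ^ membership_by_day[day])
    return turnover


# Hypothetical three-day universe; CCC leaves and DDD enters on the second day.
toy = {
    '2016-01-04': {'AAA', 'BBB', 'CCC'},
    '2016-01-05': {'AAA', 'BBB', 'DDD'},
    '2016-01-06': {'AAA', 'BBB', 'DDD'},
}
toy_turnover = daily_turnover(toy)
```

The first day has no entry, which mirrors the `.iloc[1:]` step above: there is no previous day to compare against.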
In [3]:
# Your code goes here
universe = QTradableStocksUS()
In [4]:
# Your code goes here
pipe = Pipeline(
    columns={
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
pd.DataFrame(result.index.levels[1])
Out[4]:
In [5]:
SECTOR_CODE_NAMES = {
    Sector.BASIC_MATERIALS: 'Basic Materials',
    Sector.CONSUMER_CYCLICAL: 'Consumer Cyclical',
    Sector.FINANCIAL_SERVICES: 'Financial Services',
    Sector.REAL_ESTATE: 'Real Estate',
    Sector.CONSUMER_DEFENSIVE: 'Consumer Defensive',
    Sector.HEALTHCARE: 'Healthcare',
    Sector.UTILITIES: 'Utilities',
    Sector.COMMUNICATION_SERVICES: 'Communication Services',
    Sector.ENERGY: 'Energy',
    Sector.INDUSTRIALS: 'Industrials',
    Sector.TECHNOLOGY: 'Technology',
    -1: 'Misc'
}
# Your code goes here
pipe = Pipeline(
    columns={
        'Sector': Sector()
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
sectors = result.groupby('Sector').size()
sectors.index = sectors.index.map(lambda code: SECTOR_CODE_NAMES[code])
sectors
Out[5]:
In [6]:
# Your code goes here
pipe = Pipeline(
    columns={
        'QTradableStocksUS': universe
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-01-01', '2017-01-01')
result = result.unstack().fillna(False)
turnover = calculate_daily_turnover(result)
turnover.plot(figsize=(14, 8));
turnover.describe()
Out[6]:
In [7]:
# Your code goes here
universe = morningstar.income_statement.net_income.latest.top(1500)
pipe = Pipeline(
    columns={
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
pd.DataFrame(result.index.levels[1])
Out[7]:
In [8]:
# Your code goes here
pipe = Pipeline(
    columns={
        'Average Dollar Volume': AverageDollarVolume(window_length=200)
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
print "NetIncome 1500 ADV:", np.mean(result['Average Dollar Volume'])
pipe = Pipeline(
    columns={
        # Same 200-day window as above so the two means are comparable.
        'Average Dollar Volume': AverageDollarVolume(window_length=200)
    },
    screen=QTradableStocksUS()
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
print "QTradableStocksUS ADV:", np.mean(result['Average Dollar Volume'])
Using average dollar volume (ADV) as a stand-in liquidity metric, the QTradableStocksUS is more liquid than the NetIncome 1500.
We used ADV as a liquidity metric here because it is an important indicator and is simple to calculate. However, ADV is not a perfect measure of liquidity: liquidity is determined by several factors, and volume is only one of them. For more information on liquidity and its effects on algorithm performance, refer to the lecture on Volume, Slippage, and Liquidity.
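To make the metric concrete, here is a minimal sketch of an average dollar volume calculation for a single asset, assuming ADV is the mean of daily close price times daily share volume over the trailing window. The function name `average_dollar_volume` and the price and volume figures are hypothetical, and the built-in factor may treat missing data differently.

```python
def average_dollar_volume(closes, volumes, window_length):
    """Mean of close * volume over the last `window_length` days."""
    recent = list(zip(closes, volumes))[-window_length:]
    return sum(close * volume for close, volume in recent) / float(len(recent))


# Hypothetical daily closes and share volumes for one asset.
closes = [10.0, 11.0, 12.0]
volumes = [1000, 2000, 1500]

adv_2_day = average_dollar_volume(closes, volumes, window_length=2)
adv_3_day = average_dollar_volume(closes, volumes, window_length=3)
```

The two values differ for the same asset, which is why a comparison between universes is only meaningful when both sides use the same window length.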
In [9]:
# Your code goes here
universe = morningstar.valuation_ratios.dividend_yield.latest.top(1500)
pipe = Pipeline(
    columns={
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
pd.DataFrame(result.index.levels[1])
Out[9]:
In [10]:
SECTOR_CODE_NAMES = {
    Sector.BASIC_MATERIALS: 'Basic Materials',
    Sector.CONSUMER_CYCLICAL: 'Consumer Cyclical',
    Sector.FINANCIAL_SERVICES: 'Financial Services',
    Sector.REAL_ESTATE: 'Real Estate',
    Sector.CONSUMER_DEFENSIVE: 'Consumer Defensive',
    Sector.HEALTHCARE: 'Healthcare',
    Sector.UTILITIES: 'Utilities',
    Sector.COMMUNICATION_SERVICES: 'Communication Services',
    Sector.ENERGY: 'Energy',
    Sector.INDUSTRIALS: 'Industrials',
    Sector.TECHNOLOGY: 'Technology',
    -1: 'Misc'
}
# Your code goes here
universe = morningstar.valuation_ratios.dividend_yield.latest.top(1500)
pipe = Pipeline(
    columns={
        'Sector': Sector()
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
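# Percent of the universe in each sector; divide by the actual row count rather than a hard-coded 1500.
sectors = 100.0 * result.groupby('Sector').size() / len(result)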
sectors.index = sectors.index.map(lambda code: SECTOR_CODE_NAMES[code])
sectors
Out[10]:
The Dividend 1500 has a very unbalanced sector composition, with 34% of equities being from the financial services sector. This exposes the universe to sector risk and makes it vulnerable to volatility in the financial services sector. The QTradableStocksUS avoids this by capping the number of equities from any single sector at 30%.
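The sector-balance check described above can be sketched without pipeline data. The example below computes each sector's percentage share of a toy universe and flags any sector above a cap. The 30% figure is the cap mentioned in the text; the function name `sector_shares` and the toy sector labels are hypothetical.

```python
from collections import Counter


def sector_shares(sectors):
    """Return each sector's share of the universe, in percent."""
    counts = Counter(sectors)
    total = float(len(sectors))
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical ten-asset universe, one sector label per asset.
toy_sectors = ['Financial Services'] * 4 + ['Energy'] * 3 + ['Technology'] * 3

shares = sector_shares(toy_sectors)
SECTOR_CAP_PERCENT = 30.0  # cap taken from the text above
over_cap = sorted(name for name, pct in shares.items() if pct > SECTOR_CAP_PERCENT)
```

Here only Financial Services exceeds the cap, at 40% of the toy universe.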
In [11]:
# Your code goes here
universe = morningstar.valuation_ratios.pe_ratio.latest.top(1500)
pipe = Pipeline(
    columns={
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-07-01', '2016-07-01')
pd.DataFrame(result.index.levels[1])
Out[11]:
In [12]:
# Your code goes here
pipe = Pipeline(
    columns={
        'Price to Earnings Ratio 1500': universe
    },
    screen=universe
)
result = run_pipeline(pipe, '2016-01-01', '2017-01-01')
result = result.unstack().fillna(False)
turnover = calculate_daily_turnover(result)
turnover.plot(figsize=(14, 8));
print turnover.describe().loc['mean']
The mean turnover was almost twice as high as in the QTradableStocksUS, which has built-in smoothing features to prevent equities near the threshold from entering and exiting frequently.
In [13]:
# Your code goes here
universe_smoothed = AtLeastN(
    inputs=[universe],
    window_length=21,
    N=16,
)
pipe = Pipeline(
    columns={
        'Smoothed PE 1500': universe_smoothed
    },
    screen=universe_smoothed
)
result = run_pipeline(pipe, '2016-01-01', '2017-01-01')
result = result.unstack().fillna(False)
turnover = calculate_daily_turnover(result)
print turnover.describe().loc['mean']
The mean turnover of the smoothed universe is less than half of what it was before the smoothing. This action reduced the noise from small movements near the threshold and left only the meaningful turnover events.
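The smoothing rule can be illustrated for a single asset with a short sketch, assuming the filter keeps an asset on a given day only if it passed the raw screen on at least N of the last `window_length` days. The helper names `at_least_n` and `count_changes` are hypothetical, and the toy example uses a 5-day window with N=3 rather than the 21-day window and N=16 used above.

```python
def at_least_n(raw, window_length, n):
    """Smoothed membership: True when the raw flag was True on at least
    `n` of the trailing `window_length` days (the current day included)."""
    smoothed = []
    for t in range(window_length - 1, len(raw)):
        window = raw[t - window_length + 1:t + 1]
        smoothed.append(sum(window) >= n)
    return smoothed


def count_changes(flags):
    """Number of day-to-day membership flips."""
    return sum(a != b for a, b in zip(flags, flags[1:]))


# Hypothetical asset that briefly drops below the threshold twice.
raw = [True, True, False, True, True, True, False, True, True, True]
smoothed = at_least_n(raw, window_length=5, n=3)

raw_changes = count_changes(raw)
smoothed_changes = count_changes(smoothed)
```

In this toy case the raw membership flips four times while the smoothed membership never changes, which is the same effect the smoothed universe shows in aggregate.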
Congratulations on completing the Universe Selection answer key!
This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian, Inc. has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian, Inc. at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.