In [1]:
    
from quantopian.pipeline import Pipeline
from quantopian.research import run_pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage, AverageDollarVolume
    
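Sometimes we want to ignore certain assets when computing pipeline expressions. There are two common cases where ignoring assets is useful:
1. We want to compute an expression that is computationally expensive, and we only care about the results for certain assets. A common example is a Factor computing the coefficients of a regression (RollingLinearRegressionOfReturns).
2. We want to compute an expression that compares assets against each other, but only within a subset of all assets. For example, we might use the Factor method top to compute the top 200 assets by earnings yield, ignoring assets that don't meet some liquidity constraint.
To support these two use-cases, all Factors and many Factor methods can accept a mask argument, which must be a Filter indicating which assets to consider when computing.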
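To see why masking a ranking differs from ranking first and filtering afterwards, here is a minimal plain-Python sketch. It does not use the Pipeline API: the asset names, the numbers, and the top helper are all hypothetical stand-ins that only mimic the idea of a masked top.

```python
# Hypothetical data for five made-up assets; none of this is Quantopian data.
earnings_yield = {"AAA": 0.12, "BBB": 0.10, "CCC": 0.08, "DDD": 0.06, "EEE": 0.04}
is_liquid = {"AAA": False, "BBB": True, "CCC": False, "DDD": True, "EEE": True}


def top(values, n, mask=None):
    """Return the n highest-valued assets, considering only masked-in assets.

    This is an illustrative helper, not the Pipeline Factor.top method.
    """
    candidates = [a for a in values if mask is None or mask[a]]
    ranked = sorted(candidates, key=lambda a: values[a], reverse=True)
    return set(ranked[:n])


# Rank everything, then drop illiquid assets: illiquid names use up slots.
top_then_filter = {a for a in top(earnings_yield, 2) if is_liquid[a]}

# Rank only the liquid assets: every slot goes to an asset that passed the mask.
masked_top = top(earnings_yield, 2, mask=is_liquid)

print(sorted(top_then_filter))  # ['BBB']
print(sorted(masked_top))       # ['BBB', 'DDD']
```

Filtering after the ranking leaves only one asset, because an illiquid asset occupied one of the two slots. Ranking within the mask fills both slots with liquid assets.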
Let's say we want our pipeline to output securities with a high or low percent difference, but we only want to consider securities with a 30-day average dollar volume above $10,000,000. To do this, let's rearrange our make_pipeline function so that we first create the high_dollar_volume filter. We can then use this filter as a mask for the moving average factors by passing high_dollar_volume as the mask argument to SimpleMovingAverage.
In [2]:
    
# Dollar volume factor
dollar_volume = AverageDollarVolume(window_length=30)
# High dollar volume filter
high_dollar_volume = (dollar_volume > 10000000)
# Average close price factors
mean_close_10 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=10, mask=high_dollar_volume)
mean_close_30 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30, mask=high_dollar_volume)
# Relative difference factor
percent_difference = (mean_close_10 - mean_close_30) / mean_close_30
    
Applying the mask to SimpleMovingAverage restricts the average close price factors to a computation over the ~2000 securities passing the high_dollar_volume filter, as opposed to ~8000 without a mask. When we combine mean_close_10 and mean_close_30 to form percent_difference, the computation is performed on the same ~2000 securities.
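The effect of a mask on a factor can be sketched in plain Python. This is a simplified illustration rather than how Pipeline is implemented: the prices, the dollar volumes, and the simple_moving_average helper are hypothetical, and the sketch assumes that masked-out assets are reported as NaN.

```python
import math

# Hypothetical closing prices (oldest first) and dollar volumes for made-up assets.
closes = {
    "AAA": [10.0, 11.0, 12.0, 13.0],
    "BBB": [50.0, 49.0, 48.0, 47.0],
    "CCC": [5.0, 5.0, 5.0, 5.0],
}
avg_dollar_volume = {"AAA": 25e6, "BBB": 40e6, "CCC": 2e6}

# Filter: True for assets above the $10,000,000 threshold.
high_dollar_volume = {a: v > 10e6 for a, v in avg_dollar_volume.items()}


def simple_moving_average(prices, window_length, mask=None):
    """Mean of the last window_length prices; NaN for masked-out assets.

    Illustrative helper only; the NaN convention is an assumption of this sketch.
    """
    out = {}
    for asset, series in prices.items():
        if mask is not None and not mask[asset]:
            out[asset] = math.nan  # skipped: no averaging work is done here
            continue
        window = series[-window_length:]
        out[asset] = sum(window) / len(window)
    return out


mean_close_2 = simple_moving_average(closes, 2, mask=high_dollar_volume)
mean_close_4 = simple_moving_average(closes, 4, mask=high_dollar_volume)

# Combining masked factors: NaN propagates, so masked-out assets stay excluded.
percent_difference = {
    a: (mean_close_2[a] - mean_close_4[a]) / mean_close_4[a] for a in closes
}
print(percent_difference)
```

Only AAA and BBB get a real value here. CCC fails the dollar volume filter, so it is never averaged and stays NaN in the combined result.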
Masks can also be applied to methods that return filters, like top, bottom, and percentile_between.
Masks are most useful when we want to apply a filter in the earlier steps of a combined computation. For example, suppose we want to get the 50 securities with the highest open price that are also in the top 10% of dollar volume. Suppose that we then want the 90th-100th percentile of these securities by close price. We can do this with the following:
In [3]:
    
# Dollar volume factor
dollar_volume = AverageDollarVolume(window_length=30)
# High dollar volume filter
high_dollar_volume = dollar_volume.percentile_between(90, 100)
# Top open price filter (high dollar volume securities)
top_open_price = USEquityPricing.open.latest.top(50, mask=high_dollar_volume)
# Top percentile close price filter (high dollar volume, top 50 open price)
high_close_price = USEquityPricing.close.latest.percentile_between(90, 100, mask=top_open_price)
    
Let's put this into make_pipeline and output an empty pipeline screened with our high_close_price filter.
In [4]:
    
def make_pipeline():
    # Dollar volume factor
    dollar_volume = AverageDollarVolume(window_length=30)
    # High dollar volume filter
    high_dollar_volume = dollar_volume.percentile_between(90, 100)
    # Top open price filter (high dollar volume securities)
    top_open_price = USEquityPricing.open.latest.top(50, mask=high_dollar_volume)
    # Top percentile close price filter (high dollar volume, top 50 open price)
    high_close_price = USEquityPricing.close.latest.percentile_between(90, 100, mask=top_open_price)
    return Pipeline(
        screen=high_close_price
    )
    
Running this pipeline outputs 5 securities on May 5th, 2015.
In [5]:
    
result = run_pipeline(make_pipeline(), '2015-05-05', '2015-05-05')
# Parenthesized print works in both Python 2 and Python 3
print('Number of securities that passed the filter: %d' % len(result))
    
    
Note that applying masks in layers as we did above can be thought of as an "asset funnel".
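The funnel can be sketched in plain Python by tracking how many assets survive each layer. Everything here is hypothetical: the 1,000-asset universe is random, and the top and top_fraction helpers are simplified stand-ins for the Pipeline methods (top_fraction approximates percentile_between(90, 100) by keeping the highest 10% by count).

```python
import random

random.seed(0)  # fixed seed so the hypothetical data is reproducible

# Hypothetical universe of 1,000 made-up assets with random values.
assets = ["asset_%04d" % i for i in range(1000)]
dollar_volume = {a: random.uniform(1e5, 1e8) for a in assets}
open_price = {a: random.uniform(1.0, 500.0) for a in assets}
close_price = {a: random.uniform(1.0, 500.0) for a in assets}


def top(values, n, mask):
    """Keep the n highest-valued assets among those in mask."""
    return set(sorted(mask, key=lambda a: values[a], reverse=True)[:n])


def top_fraction(values, fraction, mask):
    """Keep roughly the top `fraction` of the assets in mask (simplified percentile)."""
    n = max(1, int(round(len(mask) * fraction)))
    return top(values, n, mask)


stage_1 = top_fraction(dollar_volume, 0.10, set(assets))  # top 10% by dollar volume
stage_2 = top(open_price, 50, stage_1)                    # top 50 by open price
stage_3 = top_fraction(close_price, 0.10, stage_2)        # top 10% by close price

print(len(assets), len(stage_1), len(stage_2), len(stage_3))  # 1000 100 50 5
```

Each layer only sees the assets that passed the previous one, so the universe narrows from 1,000 to 100 to 50 to 5. The last step keeps 10% of 50 assets, which is consistent with the five securities the pipeline returned.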
In the next lesson, we'll look at classifiers.