Welcome to Quantopian. In this tutorial, we introduce Quantopian, the problems it aims to solve, and the tools it provides to help you solve those problems. At the end of this lesson, you should have a high level understanding of what you can do with Quantopian.
The focus of this tutorial is to get you started, not to make you an expert Quantopian user. If you already feel comfortable with the basics of Quantopian, other resources are available to help you learn more about Quantopian's tools.
All you need to get started on this tutorial is some basic Python programming skills.
Note: You are currently viewing this tutorial lesson in the Quantopian Research environment. Research is a hosted Jupyter notebook environment that allows you to interactively run Python code. Research comes with a mix of proprietary and open-source Python libraries pre-installed. To learn more about Research, see the documentation. You can follow along with the code in this notebook by cloning it. Each cell of code (grey boxes) can be run by pressing Shift + Enter. This tutorial notebook is read-only. If you want to make changes to the notebook, create a new notebook and copy the code from this tutorial.
Quantopian is a cloud-based software platform that allows you to research cross-sectional factors in developed and emerging equity markets around the world using Python. Quantopian makes it easy to iterate on ideas by supplying a fast, uniform API on top of all sorts of financial data. Additionally, Quantopian provides tools to help you upload your own financial datasets, analyze the efficacy of your factors, and download your work into a local environment so that you can integrate it with other systems.
Typically, researching cross-sectional equity factors involves the following steps:
1. Define a universe of equities.
2. Define a factor over that universe.
3. Test the predictiveness of the factor.
4. Download the factor data so it can be used in another system.
On Quantopian, steps 1 and 2 are achieved using the Pipeline API, step 3 is done using a tool called Alphalens, and step 4 is done using a tool called Aqueduct. The rest of this tutorial will give a brief walkthrough of an end-to-end factor research workflow on Quantopian.
The code in this tutorial can be run in Quantopian's Research environment (this notebook is currently running in Research). Press Shift + Enter to run each cell of code (grey boxes).
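For example, the cell below is a trivial illustration (not part of the factor research workflow): it imports two of the pre-installed open-source libraries and prints their versions, which is a quick way to confirm that the environment is working.
In [ ]:
# numpy and pandas are among the open-source libraries pre-installed in Research.
import numpy as np
import pandas as pd

print(np.__version__, pd.__version__)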
The first step to researching a cross-sectional equity factor is to select a “universe” of equities over which our factor will be defined. In this context, a universe represents the set of equities we want to consider when performing computations later. On Quantopian, defining a universe is done using the Pipeline API. Later on, we will use the same API to compute factors over the equities in this universe.
The Pipeline API provides a uniform interface to several built-in datasets, as well as any custom datasets that we upload to our account. Pipeline makes it easy to define computations or expressions using built-in and custom data. For example, the following code snippet imports two built-in datasets, FactSet Fundamentals and FactSet Equity Metadata, and uses them to define an equity universe.
In [1]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.factset import Fundamentals, EquityMetadata

# Limit to primary-issue common shares.
is_share = EquityMetadata.security_type.latest.eq('SHARE')
is_primary = EquityMetadata.is_primary.latest
primary_shares = (is_share & is_primary)

# Rank primary shares by the latest market cap and keep the top 1000.
market_cap = Fundamentals.mkt_val.latest
universe = market_cap.top(1000, mask=primary_shares)
The above example defines a universe to be the top 1000 primary-issue common stocks ranked by market cap. Universes can be defined using any of the data available on Quantopian. Additionally, you can upload your own data, such as index constituents or another custom universe, to the platform using the Self-Serve Data tool. To learn more about uploading a custom dataset, see the Self-Serve Data documentation. For now, we will stick with the universe definition above.
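To give a sense of that flexibility, the sketch below defines a hypothetical alternative universe based on trading activity rather than market cap, using the built-in AverageDollarVolume factor. It is only an illustration and is not used elsewhere in this tutorial.
In [ ]:
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import AverageDollarVolume

# Hypothetical alternative universe: the 500 most actively traded primary
# shares, ranked by 3-month (63 trading day) average dollar volume.
avg_dollar_volume = AverageDollarVolume(
    inputs=[EquityPricing.close, EquityPricing.volume],
    window_length=63,
)
liquidity_universe = avg_dollar_volume.top(500, mask=primary_shares)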
After defining a universe, the next step is to define a factor for testing. On Quantopian, a factor is a computation that produces numerical values at a regular frequency for all assets in a universe. As in step 1, we will use the Pipeline API to define factors. In addition to providing a fast, uniform API on top of pre-integrated and custom datasets, Pipeline also provides a set of built-in classes and methods that can be used to quickly define factors. For example, the following code snippet defines a momentum factor using fast and slow moving average computations.
In [2]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage
# 1-month (21 trading day) moving average factor.
fast_ma = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=21)
# 6-month (126 trading day) moving average factor.
slow_ma = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=126)
# Divide fast_ma by slow_ma to get momentum factor and z-score.
momentum = fast_ma / slow_ma
momentum_factor = momentum.zscore()
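Pipeline factors can be combined and transformed in the same way to express many other ideas. As a purely illustrative sketch (not used in the rest of this tutorial), the cell below defines a hypothetical short-term reversal factor from trailing 5-day returns.
In [ ]:
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import Returns

# Hypothetical example: a short-term reversal factor. Stocks that fell over
# the last 5 trading days score high; stocks that rose score low.
recent_returns = Returns(inputs=[EquityPricing.close], window_length=5)
reversal_factor = -1 * recent_returns.zscore()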
Now that we have defined a universe and a factor, we can choose a market and time period and simulate the factor. One of the defining features of the Pipeline API is that it allows us to define universes and factors in high level terms, without having to worry about common data engineering problems like adjustments, point-in-time data, symbol mapping, delistings, and data alignment. Pipeline does all of that work behind the scenes, letting us focus our time on building and testing factors.
The code below creates a Pipeline instance that adds our momentum factor as a column and screens down to the top half of our universe by momentum. The pipeline is then run over the US equities market from 2016 to 2019.
In [3]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.data.factset import Fundamentals, EquityMetadata
from quantopian.pipeline.domain import US_EQUITIES, ES_EQUITIES
from quantopian.pipeline.factors import SimpleMovingAverage
# Universe: top 1000 primary-issue common shares by market cap (as in step 1).
is_share = EquityMetadata.security_type.latest.eq('SHARE')
is_primary = EquityMetadata.is_primary.latest
primary_shares = (is_share & is_primary)
market_cap = Fundamentals.mkt_val.latest
universe = market_cap.top(1000, mask=primary_shares)
# 1-month moving average factor.
fast_ma = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=21)
# 6-month moving average factor.
slow_ma = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=126)
# Divide fast_ma by slow_ma to get momentum factor and z-score.
momentum = fast_ma / slow_ma
momentum_factor = momentum.zscore()
# Create a US equities pipeline with our momentum factor as a column,
# screening down to the top half of our universe by momentum.
pipe = Pipeline(
    columns={
        'momentum_factor': momentum_factor,
    },
    screen=momentum_factor.percentile_between(50, 100, mask=universe),
    domain=US_EQUITIES,
)
# Run the pipeline from 2016 to 2019 and display the first few rows of output.
from quantopian.research import run_pipeline
factor_data = run_pipeline(pipe, '2016-01-01', '2019-01-01')
print("Result contains {} rows of output.".format(len(factor_data)))
factor_data.head()
Out[3]:
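Since run_pipeline returns an ordinary pandas DataFrame indexed by date and asset, you can inspect the result with standard pandas operations. For example, the optional snippet below counts how many equities passed our screen on each day; it is just an illustration and is not required by the workflow.
In [ ]:
# factor_data is indexed by (date, asset). Count the number of equities
# that passed the screen on each day of the simulation.
equities_per_day = factor_data.groupby(level=0).size()
equities_per_day.describe()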
The next step is to test the predictiveness of the factor we defined in step 2. To determine whether our factor is predictive, we calculate the 1-day forward returns for the factor's assets over the factor's dates, and then pass the factor and the forward returns into Alphalens. The following code cell shows how to get this returns data and format it for Alphalens.
In [4]:
from quantopian.research import get_forward_returns
import alphalens as al

# Get the 1-day forward returns for the assets and dates in the factor.
returns_df = get_forward_returns(
    factor_data['momentum_factor'],
    [1],
    US_EQUITIES,
)

# Format the factor and returns data so that we can run it through Alphalens.
al_data = al.utils.get_clean_factor(
    factor_data['momentum_factor'],
    returns_df,
    quantiles=5,
    bins=None,
)
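As an optional sanity check, you can peek at the DataFrame returned by get_clean_factor. It is indexed by date and asset and contains the factor values, factor quantiles, and forward returns that Alphalens expects.
In [ ]:
# Peek at the merged factor/returns data that Alphalens will consume.
al_data.head()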
Then, we can create a factor tearsheet to analyze our momentum factor.
In [5]:
from alphalens.tears import create_full_tear_sheet
create_full_tear_sheet(al_data)
The Alphalens tearsheet offers insight into the predictive ability of a factor.
To learn more about Alphalens, check out the documentation.
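If you do not need the full report, Alphalens also provides narrower tear sheets and individual performance and plotting functions. For example, the optional cell below generates only the returns-focused portion of the analysis.
In [ ]:
from alphalens.tears import create_returns_tear_sheet

# Generate only the returns analysis for our momentum factor.
create_returns_tear_sheet(al_data)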
When we have a factor that we like, the next step is often to download the factor data so we can integrate it with another system. On Quantopian, downloading pipeline results to a local environment is done using Aqueduct. Aqueduct is an HTTP API that enables remote execution of pipelines, and makes it possible to download results to a local environment.
Quantopian accounts do not have access to Aqueduct by default. It is an additional feature to which you will need to request access. If you would like to learn more about adding Aqueduct to your Quantopian account, please contact us at feedback@quantopian.com.
In this tutorial, we introduced Quantopian and walked through an example factor research workflow using Pipeline, Alphalens, and Aqueduct. Quantopian has a rich set of documentation and tutorials on these tools and others. We recommend starting with the tutorials or the User Guide section of the documentation if you would like to grow your understanding of Quantopian.
If you would like to learn more about Quantopian's enterprise offering, please contact us at enterprise@quantopian.com.