bigtempo

Getting Started

We’ll start with a quick, non-comprehensive overview of the fundamental concepts to get you started.

This document was created using IPython Notebook, a great tool for prototyping. Highly recommended.

IMPORTANT: The engine instance

In this example, everything will be kept in a single module, but for larger projects you will want to create a dedicated module for the engine instance. This instance is the entry point for everything you do with bigtempo, such as creating datasources, using selections and evaluating data, so other modules must be able to import it.



In [2]:

    
import bigtempo.core

engine = bigtempo.core.DatasourceEngine()

Saving the code above as instances.py, for example, will enable you to reach the same engine instance through an import:



In [3]:

    
from instances import engine

engine

Out[3]:

<bigtempo.core.DatasourceEngine at 0xa0f994c>

Creating a simple datasource

The default datasource must follow this contract:



In [11]:

    
import numpy
import pandas


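# Register the class under the reference 'RANDOM' and tag it as 'RAW'.
@engine.datasource('RANDOM', tags=['RAW'])
class RawRandom(object):

    def evaluate(self, context, symbol, start=None, end=None):
        # Use the requested symbol as the name of the single data column.
        column_name = symbol
        # 100 random values on a daily index starting at 2000-01-01.
        data = numpy.random.randn(100)
        index = pandas.date_range('1/1/2000', periods=100)

        return pandas.DataFrame(data, index=index, columns=[column_name])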

Using the engine instance, we can then verify that the datasource was registered successfully:



In [14]:

    
engine.select().all()

Out[14]:

<selection 187552588 currently-with="[
    "RANDOM"
]">

Selections make it easy to pick groups of datasources according to their tag definitions.

A selection can be iterated, and it also provides a getter method that returns a specific datasource processor:



In [17]:

    
selection = engine.select().all()
result = selection.get(0).process('data_variant_name')
result.plot()

Out[17]:

<matplotlib.axes.AxesSubplot at 0xb2ebc2c>
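
To make the idea of tag-based selection more concrete, here is a minimal, self-contained sketch of how a registry might map references to tags and pick groups by tag. It is a toy illustration only and not bigtempo's implementation: ToyRegistry, select_by_tag, the 'SMOOTHED' reference and the 'DERIVED' tag are hypothetical names made up for this example.

```python
# Toy illustration only: NOT bigtempo's implementation or API.
class ToyRegistry(object):

    def __init__(self):
        # Maps each registered reference to its set of tags.
        self._tags = {}

    def datasource(self, reference, tags=None):
        # Decorator factory: records the reference and tags, returns the class unchanged.
        def decorator(cls):
            self._tags[reference] = set(tags or [])
            return cls
        return decorator

    def select_by_tag(self, tag):
        # Returns the references carrying the given tag, in alphabetical order.
        return sorted(ref for ref, tags in self._tags.items() if tag in tags)


registry = ToyRegistry()


@registry.datasource('RANDOM', tags=['RAW'])
class RawRandom(object):
    pass


# 'SMOOTHED' and 'DERIVED' are hypothetical names used only for contrast.
@registry.datasource('SMOOTHED', tags=['DERIVED'])
class Smoothed(object):
    pass


raw_references = registry.select_by_tag('RAW')
for reference in raw_references:
    print(reference)
```

The point of the sketch is only that tags are declared once, at registration time, and can later be used to address a whole group of datasources without naming each one.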

Now, let's create another datasource that uses the first one as a dependency:



In [ ]:

    
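@engine.datasource('WEEKLY',
                   dependencies=['RANDOM'],
                   lookback=6,
                   tags=['WEEKLY', 'TAG'])
class Weekly(object):

    def evaluate(self, context, symbol, start=None, end=None):
        # Data produced by the 'RANDOM' dependency.
        data = context.dependencies('RANDOM')
        # Assumption: the original snippet used undefined helpers here
        # (dateutils, df_norm, df_index); a weekly pandas index is used instead,
        # falling back to the data's own range when start or end is not given.
        index = pandas.date_range(start or data.index[0],
                                  end or data.index[-1],
                                  freq='W')

        # Keep only the rows of the dependency that fall on the weekly index.
        result = data.reindex(index)

        return result.dropna()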
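
The source does not describe how the engine handles declared dependencies. As a rough mental model, assuming that a dependency such as 'RANDOM' has to be evaluated before a datasource that consumes it such as 'WEEKLY', the ordering can be sketched as below. This is a toy illustration, not bigtempo code; evaluation_order is a hypothetical helper.

```python
# Toy illustration only: NOT bigtempo's implementation or API.
def evaluation_order(dependencies):
    """Return references ordered so that each one comes after its dependencies."""
    ordered = []
    visiting = set()

    def visit(reference):
        if reference in ordered:
            return
        if reference in visiting:
            raise ValueError('circular dependency at %r' % reference)
        visiting.add(reference)
        # Place every dependency before the reference that needs it.
        for dependency in dependencies.get(reference, []):
            visit(dependency)
        visiting.discard(reference)
        ordered.append(reference)

    for reference in sorted(dependencies):
        visit(reference)
    return ordered


# Same declarations as in the datasources above: WEEKLY depends on RANDOM.
declared = {'RANDOM': [], 'WEEKLY': ['RANDOM']}
order = evaluation_order(declared)
print(order)
```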