We’ll start with a quick, non-comprehensive overview of the fundamental concepts to get you started.
This document was created using IPython Notebook, a great tool for prototyping. Highly recommended.
In this example, everything will be kept in a single module, but for larger projects you will want to create a dedicated module for the engine instance. Since this instance is the entry point for everything you do with bigtempo, like creating datasources, using selections and evaluating data, other modules must be able to import it.
In [2]:
import bigtempo.core
engine = bigtempo.core.DatasourceEngine()
Saving the code above as instances.py, for instance, will enable you to reach the same engine instance through an import:
In [3]:
from instances import engine
engine
Out[3]:
In [11]:
import numpy
import pandas
@engine.datasource('RANDOM', tags=['RAW'])
class RawRandom(object):

    def evaluate(self, context, symbol, start=None, end=None):
        # Use the requested symbol as the column name.
        column_name = symbol
        # 100 normally distributed samples on a daily index starting 2000-01-01.
        data = numpy.random.randn(100)
        index = pandas.date_range('1/1/2000', periods=100)
        return pandas.DataFrame(data, index=index, columns=[column_name])
Using the engine instance, we can then verify that the datasource was registered successfully:
In [14]:
engine.select().all()
Out[14]:
Selections make it easy to pick groups of datasources using their tag definitions.
A selection can be iterated, and it also provides a getter method that returns a specific datasource processor:
In [17]:
selection = engine.select().all()
result = selection.get(0).process('data_variant_name')
result.plot()
Out[17]:
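To make the registration and selection steps more concrete, here is a minimal standalone sketch of a decorator-based registry with tag lookup. It is not bigtempo's implementation: `ToyEngine`, `select_by_tag`, the `'CONSTANT'` datasource and the internal dictionary are hypothetical names, used only to illustrate the idea.

```python
class ToyEngine(object):
    """Hypothetical stand-in for a datasource engine (not bigtempo's code)."""

    def __init__(self):
        # Maps a reference name to (datasource class, set of tags).
        self._registry = {}

    def datasource(self, reference, tags=None):
        # Returns a class decorator that records the class under `reference`.
        def decorator(cls):
            self._registry[reference] = (cls, set(tags or []))
            return cls
        return decorator

    def select_by_tag(self, tag):
        # Returns the sorted references whose tag set contains `tag`.
        return sorted(ref for ref, (_, tags) in self._registry.items()
                      if tag in tags)


toy_engine = ToyEngine()


@toy_engine.datasource('RANDOM', tags=['RAW'])
class ToyRandom(object):
    def evaluate(self):
        return [1, 2, 3]


@toy_engine.datasource('CONSTANT', tags=['RAW', 'DEMO'])
class ToyConstant(object):
    def evaluate(self):
        return [7, 7, 7]


raw_references = toy_engine.select_by_tag('RAW')
demo_references = toy_engine.select_by_tag('DEMO')
print(raw_references)
print(demo_references)
```

The decorator returns the class unchanged, so registering a datasource has no effect on how the class itself behaves; the engine only keeps a lookup table on the side.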
Now, let's create another datasource that uses the first one as a dependency:
In [ ]:
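@engine.datasource('WEEKLY',
                   dependencies=['RANDOM'],
                   lookback=6,
                   tags=['WEEKLY', 'TAG'])
class Weekly(object):

    def evaluate(self, context, symbol, start=None, end=None):
        # Output of the 'RANDOM' dependency, assumed to be the DataFrame
        # that RawRandom returns.
        data = context.dependencies('RANDOM')
        # Week-end dates spanning the data. pandas.date_range stands in for
        # the dateutils.week_range helper, which is not defined in this document.
        index = pandas.date_range(start or data.index[0],
                                  end or data.index[-1], freq='W')
        # Keep only the rows that fall on the weekly index.
        result = data.reindex(index)
        return result.dropna()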
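The dependency mechanism can be pictured with a small standalone sketch: before a datasource is evaluated, each of its dependencies is evaluated and its result handed over. This uses only the standard library and is not bigtempo's implementation. `evaluate`, `DATASOURCES`, `daily_values` and `weekly_values` are hypothetical names, and the weekly rule (keep the last value of each ISO week) is an assumption chosen for illustration.

```python
import datetime


def daily_values(deps):
    # Ten consecutive days starting on Monday 2000-01-03, with values 0..9.
    first = datetime.date(2000, 1, 3)
    return [(first + datetime.timedelta(days=i), i) for i in range(10)]


def weekly_values(deps):
    # Keep the last observation of each ISO (year, week) from the dependency.
    last_per_week = {}
    for day, value in deps['DAILY']:
        year, week = day.isocalendar()[:2]
        last_per_week[(year, week)] = (day, value)
    return [last_per_week[key] for key in sorted(last_per_week)]


# Reference name -> (function, list of dependency names).
DATASOURCES = {
    'DAILY': (daily_values, []),
    'WEEKLY': (weekly_values, ['DAILY']),
}


def evaluate(reference):
    # Evaluate the dependencies first, then pass their results along.
    function, dependency_names = DATASOURCES[reference]
    deps = {name: evaluate(name) for name in dependency_names}
    return function(deps)


weekly = evaluate('WEEKLY')
print(weekly)
```

Because `evaluate` recurses through the dependency names, a datasource only needs to declare what it depends on; it never has to know how those inputs are produced.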