Example using the data manager classes

This notebook shows how to use the data manager framework for simpler API usage and for caching capabilities.

Please note that to request Bloomberg fields using property access, the field name must be upper case (sid.PX_LAST, not sid.px_last).


In [1]:
import pandas as pd
import tia.bbg.datamgr as dm

Single Security Accessor


In [2]:
# create a DataManager for simpler api access
mgr = dm.BbgDataManager()
# retrieve a single security accessor from the manager
msft = mgr['MSFT US EQUITY']

In [3]:
# Can now access any Bloomberg field (as long as it is upper case)
msft.PX_LAST, msft.PX_OPEN


Out[3]:
(47.590000000000003, 47.229999999999997)
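The upper-case rule can be illustrated with a small stand-alone sketch. This is not tia's implementation; `FieldAccessor`, `fake_fetch`, and `FAKE_PRICES` are hypothetical names, and the prices are the values shown in the output above. It only shows one plausible way an accessor could route upper-case attribute names to field requests while leaving ordinary attribute errors alone.

```python
class FieldAccessor:
    """Hypothetical stand-in for a single-security accessor (not tia's class)."""

    def __init__(self, sid, fetch):
        self.sid = sid
        self._fetch = fetch  # callable(sid, field) -> value

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails.
        # Upper-case names are treated as field mnemonics; anything else
        # is reported as a missing attribute.
        if name.isupper():
            return self._fetch(self.sid, name)
        raise AttributeError(name)

    def __getitem__(self, fields):
        # A single string returns one value; a tuple or list returns a list.
        if isinstance(fields, str):
            return self._fetch(self.sid, fields)
        return [self._fetch(self.sid, f) for f in fields]


# Assumed sample data standing in for a live Bloomberg request
FAKE_PRICES = {"PX_LAST": 47.59, "PX_OPEN": 47.23}


def fake_fetch(sid, field):
    return FAKE_PRICES[field]


msft = FieldAccessor("MSFT US EQUITY", fake_fetch)
print(msft.PX_LAST, msft["PX_LAST", "PX_OPEN"])
```

In this sketch a lower-case name such as `msft.px_last` raises `AttributeError` instead of being sent as a field request.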

In [4]:
# Access multiple fields at the same time
msft['PX_LAST', 'PX_OPEN']


Out[4]:
[47.59, 47.23]

In [5]:
# or pass a list
msft[['PX_LAST', 'PX_OPEN']]


Out[5]:
[47.59, 47.23]

In [6]:
# Have the manager default to returning a frame instead of values
mgr.sid_result_mode = 'frame'
msft.PX_LAST


Out[6]:
PX_LAST
MSFT US EQUITY 47.59

In [7]:
# multiple fields returned as data frame
msft[['PX_LAST', 'PX_OPEN']]


Out[7]:
PX_LAST PX_OPEN
MSFT US EQUITY 47.585 47.23

In [8]:
# Retrieve historical data
msft.get_historical(['PX_OPEN', 'PX_HIGH', 'PX_LOW', 'PX_LAST'], '1/1/2014', '1/12/2014').head()


Out[8]:
PX_OPEN PX_HIGH PX_LOW PX_LAST
date
2014-01-02 37.350 37.40 37.10 37.16
2014-01-03 37.200 37.22 36.60 36.91
2014-01-06 36.850 36.89 36.11 36.13
2014-01-07 36.325 36.49 36.21 36.41
2014-01-08 36.000 36.14 35.58 35.76

Multi-Security Accessor


In [9]:
sids = mgr['MSFT US EQUITY', 'IBM US EQUITY', 'CSCO US EQUITY']
sids.PX_LAST


Out[9]:
PX_LAST
CSCO US EQUITY 28.89
IBM US EQUITY 170.97
MSFT US EQUITY 47.58

In [10]:
sids.get_historical('PX_LAST', '1/1/2014', '11/12/2014').head()


Out[10]:
IBM US EQUITY CSCO US EQUITY MSFT US EQUITY
date
2014-01-02 185.53 22.000 37.16
2014-01-03 186.64 21.980 36.91
2014-01-06 186.00 22.010 36.13
2014-01-07 189.71 22.310 36.41
2014-01-08 187.97 22.293 35.76

In [11]:
sids.get_historical(['PX_OPEN', 'PX_LAST'], '1/1/2014', '11/12/2014').head()


Out[11]:
IBM US EQUITY CSCO US EQUITY MSFT US EQUITY
PX_OPEN PX_LAST PX_OPEN PX_LAST PX_OPEN PX_LAST
date
2014-01-02 187.21 185.53 22.17 22.000 37.350 37.16
2014-01-03 185.83 186.64 22.09 21.980 37.200 36.91
2014-01-06 187.15 186.00 21.96 22.010 36.850 36.13
2014-01-07 186.39 189.71 22.26 22.310 36.325 36.41
2014-01-08 189.33 187.97 22.29 22.293 36.000 35.76

Caching


In [12]:
#
# Cache requests in memory or in an HDF5 (h5) file
#
ms = dm.MemoryStorage()
cmgr = dm.CachedDataManager(mgr, ms, pd.Timestamp.now())

In [13]:
cmsft = cmgr['MSFT US EQUITY']
cmsft.PX_LAST


Out[13]:
PX_LAST
MSFT US EQUITY 47.585

In [14]:
%timeit msft.PX_LAST


1 loops, best of 3: 277 ms per loop

In [15]:
%timeit cmsft.PX_LAST


1000 loops, best of 3: 1.66 ms per loop
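The gap between the two timings comes from skipping the round trip on repeated requests. The sketch below is a simplified illustration of that idea, not the code behind `CachedDataManager`; `CachedFetcher`, `slow_fetch`, and the dict-based store are hypothetical, and the returned price is the value shown in the output above.

```python
class CachedFetcher:
    """Hypothetical wrapper that remembers results per (security, field)."""

    def __init__(self, fetch, store=None):
        self._fetch = fetch  # the expensive call, e.g. a network request
        self._store = {} if store is None else store

    def get(self, sid, field):
        key = (sid, field)
        if key not in self._store:
            # Cache miss: pay for the slow call once and remember the result
            self._store[key] = self._fetch(sid, field)
        return self._store[key]


calls = []  # records every time the slow path is taken


def slow_fetch(sid, field):
    calls.append((sid, field))
    return 47.585  # assumed sample value


cached = CachedFetcher(slow_fetch)
first = cached.get("MSFT US EQUITY", "PX_LAST")
second = cached.get("MSFT US EQUITY", "PX_LAST")
print(first, second, len(calls))
```

Only the first call reaches `slow_fetch`; the second is answered from the in-memory store.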

In [16]:
csids = cmgr['MSFT US EQUITY', 'IBM US EQUITY']
sids = mgr['MSFT US EQUITY', 'IBM US EQUITY']

In [17]:
%timeit sids.get_historical('PX_LAST', start='1/3/2000', end='1/3/2014').head()


1 loops, best of 3: 987 ms per loop

In [18]:
%timeit csids.get_historical('PX_LAST', start='1/3/2000', end='1/3/2014').head()


The slowest run took 371.09 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 3: 5.17 ms per loop
C:\Anaconda\lib\site-packages\pandas\core\index.py:1196: FutureWarning: using '-' to provide set differences with Indexes is deprecated, use .difference()
  "use .difference()",FutureWarning)

In [19]:
#
# HDF Storage
# - note the warning from the HDF API after executing. I decided to leave the blanks in the
#   security names instead of replacing them
#

import tempfile
fh, fp = tempfile.mkstemp()

h5storage = dm.HDFStorage(fp)  # Can set compression level for smaller files
h5mgr = dm.CachedDataManager(mgr, h5storage, pd.Timestamp.now())
h5msft = h5mgr['MSFT US EQUITY']
h5msft.PX_LAST


C:\Anaconda\lib\site-packages\tables\path.py:100: NaturalNameWarning: object name is not a valid Python identifier: 'MSFT US EQUITY'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though
  NaturalNameWarning)
Out[19]:
PX_LAST
MSFT US EQUITY 47.59
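The NaturalNameWarning above is triggered because the security name contains blanks. As a small illustration, the pattern quoted in the warning can be checked directly with the standard `re` module. The helper `is_natural_name` and the underscore replacement are assumptions made for this sketch; the notebook itself keeps the blanks.

```python
import re

# Pattern copied from the NaturalNameWarning text
NATURAL_NAME = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")


def is_natural_name(name):
    """Return True if name matches the identifier pattern from the warning."""
    return NATURAL_NAME.match(name) is not None


key = "MSFT US EQUITY"
sanitized = key.replace(" ", "_")  # one possible replacement, not used by the notebook

print(is_natural_name(key), is_natural_name(sanitized))
```

Keeping the original key only costs attribute-style access to the stored node; as the warning says, `getattr()` still works.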

In [20]:
# Notice no warning as it is taken from cache
h5msft.PX_LAST


Out[20]:
PX_LAST
MSFT US EQUITY 47.59

In [21]:
h5msft.get_historical('PX_LAST', start='1/2/2000', end='1/2/2014').head()


Out[21]:
PX_LAST
date
2000-01-03 58.2813
2000-01-04 56.3125
2000-01-05 56.9063
2000-01-06 55.0000
2000-01-07 55.7188

In [22]:
%timeit h5msft.get_historical('PX_LAST', start='1/2/2000', end='1/2/2014')


100 loops, best of 3: 6.18 ms per loop

In [23]:
# Notice that only IBM triggers the warning: MSFT is already cached, so only IBM data is retrieved
h5sids = h5mgr['MSFT US EQUITY', 'IBM US EQUITY']
h5sids.get_historical('PX_LAST', start='1/3/2000', end='1/2/2014').tail()


C:\Anaconda\lib\site-packages\tables\path.py:100: NaturalNameWarning: object name is not a valid Python identifier: 'IBM US EQUITY'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though
  NaturalNameWarning)
Out[23]:
IBM US EQUITY MSFT US EQUITY
date
2013-12-26 185.35 37.44
2013-12-27 185.08 37.29
2013-12-30 186.41 37.29
2013-12-31 187.57 37.41
2014-01-02 185.53 37.16

In [24]:
# Not perfect, as it retrieves each security separately and then concatenates the results,
# but it beats a round trip to Bloomberg, and consistency comes for free
%timeit h5sids.get_historical('PX_LAST', start='1/3/2000', end='1/2/2014')


100 loops, best of 3: 14.8 ms per loop
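The per-security behaviour described in the last two cells (fetch only what is missing, then combine) can be sketched without Bloomberg or HDF5. This is an illustration under stated assumptions rather than tia's code: `get_historical`, `fetch_one`, and the dict-based cache are hypothetical, and the sample prices are taken from the output of cell 23.

```python
def get_historical(sids, cache, fetch_one):
    """Return {date: {sid: price}}, fetching only securities missing from cache."""
    for sid in sids:
        if sid not in cache:
            cache[sid] = fetch_one(sid)  # {date: price} for one security
    # Combine the per-security series, aligned on the union of their dates
    dates = sorted(set().union(*(cache[sid] for sid in sids)))
    return {d: {sid: cache[sid].get(d) for sid in sids} for d in dates}


# Sample values copied from the notebook output above
SAMPLE = {
    "MSFT US EQUITY": {"2013-12-31": 37.41, "2014-01-02": 37.16},
    "IBM US EQUITY": {"2013-12-31": 187.57, "2014-01-02": 185.53},
}
fetched = []  # records which securities needed a real request


def fetch_one(sid):
    fetched.append(sid)
    return dict(SAMPLE[sid])


cache = {}
get_historical(["MSFT US EQUITY"], cache, fetch_one)  # caches MSFT
table = get_historical(["MSFT US EQUITY", "IBM US EQUITY"], cache, fetch_one)
print(fetched)
print(table["2014-01-02"])
```

The second request only fetches IBM, mirroring the single warning seen in cell 23, and the combine step corresponds to the concatenation cost noted in cell 24.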