enterprise Data Structures

This guide will give an introduction to the unique data structures used in enterprise. These are all designed with the goal of making this code as user-friendly as possible, both for the end user and the developer.


In [2]:
% matplotlib inline
%config InlineBackend.figure_format = 'retina'

from __future__ import division

import numpy as np
import matplotlib.pyplot as plt

from enterprise.pulsar import Pulsar
import enterprise.signals.parameter as parameter
from enterprise.signals import utils
from enterprise.signals import signal_base
from enterprise.signals import selections
from enterprise.signals.selections import Selection
#from tests.enterprise_test_data import datadir
import tests

In [3]:
tests


Out[3]:
<module 'tests' from '/Users/staylor/miniconda2/lib/python2.7/site-packages/IPython/extensions/tests/__init__.pyc'>

Class Factories

The enterprise code makes heavy use of so-called class factories. Class factories are functions that return classes (not objects of class instances). A simple example is as follows:


In [2]:
def A(farg1, farg2):
    class A(object):
        
        def __init__(self, iarg):
            self.iarg = iarg
        
        def print_info(self):
            print('Object instance {}\nInstance argument: {}\nFunction args: {} {}\n'.format(
                self, self.iarg, farg1, farg2))
    return A

In [3]:
# define class A with arguments that can be seen within the class
a = A('arg1', 'arg2')

# instantiate 2 instances of class A with different arguments
a1 = a('iarg1')
a2 = a('iarg2')

# call print_info method
a1.print_info()
a2.print_info()


Object instance <__main__.A object at 0x10e356810>
Instance argument: iarg1
Function args: arg1 arg2

Object instance <__main__.A object at 0x1116f4f10>
Instance argument: iarg2
Function args: arg1 arg2

In the example above we see that the arguments arg1 and arg2 are seen by both instances a1 and a2; however these instances were intantiated with different input arguments iarg1 and iarg2. So we see that class-factories are great when we want to give "global" parameters to a class without having to pass them on initialization. This also allows us to mix and match classes, as we will do in enterprise before we instantiate them.

The Pulsar class

The Pulsar class is a simple data structure that stores all of the important information about a pulsar that is obtained from a timing package such as the TOAs, residuals, error-bars, flags, design matrix, etc.

This class is instantiated with a par and a tim file. Full documentation on this class can be found here.


In [4]:
psr = Pulsar(datadir+'/B1855+09_NANOGrav_9yv1.gls.par', datadir+'/B1855+09_NANOGrav_9yv1.tim')

This Pulsar object is then passed to other enterprise data structures in a loosley coupled way in order to interact with the pulsar data.

The Parameter class

In enterprise signal parameters are set by specifying a prior distribution (i.e., Uniform, Normal, etc.). These Parameters are how enterprise builds signals. Below we will give an example of this functionality.


In [5]:
# lets define an efac parameter with a uniform prior from [0.5, 5]
efac = parameter.Uniform(0.5, 5)
print(efac)


<class 'enterprise.signals.parameter.Uniform'>

Uniform is a class factory that returns a class. The parameter is then intialized via a name. This way, a single parameter class can be initialized for multiple signal parameters with different names (i.e. EFAC per observing backend, etc). Once the parameter is initialized then you then have access to many useful methods.


In [6]:
# initialize efac parameter with name "efac_1"
efac1 = efac('efac_1')
print(efac1)

# return parameter name
print(efac1.name)

# get pdf at a point (log pdf is access)
print(efac1.get_pdf(1.3), efac1.get_logpdf(1.3))

# return 5 samples from this prior distribution
print(efac1.sample(size=5))


"efac_1":Uniform(0.5,5)
efac_1
(0.22222222222222221, -1.5040773967762742)
[ 2.82288791  3.47338006  1.68693806  4.34250608  4.79228485]

The Function structure

In enterprise we have defined a special data structure called Function. This data structure provides the user with a way to use and combine several different enterprise components in a user friendly way. More explicitly, it converts and standard function into an enterprise Function which can extract information from the Pulsar object and can also interact with enterprise Parameters.

[put reference to docstring here]

For example, consider the function:


In [7]:
@signal_base.function
def sine_wave(toas, log10_A=-7, log10_f=-8):
    return 10**log10_A * np.sin(2*np.pi*toas*10**log10_f)

Notice that the first positional argument of the function is toas, which happens to be a name of an attribute in the Pulsar class and the keyword arguments specify the default parameters for this function.

The decorator converts this standard function to a Function which can be used in two ways: the first way is to treat it like any other function.


In [8]:
# treat it just as a standard function with a vector input
sw = sine_wave(np.array([1,2,3]), log10_A=-8, log10_f=-7.5)
print(sw)


[  1.98691765e-15   3.97383531e-15   5.96075296e-15]

the second way is to use it as a Function:


In [9]:
# or use it as an enterprise function
sw_function = sine_wave(log10_A=parameter.Uniform(-10,-5), log10_f=parameter.Uniform(-9, -7))
print(sw_function)


<class 'enterprise.signals.signal_base.Function'>

Here we see that Function is actually a class factory, that is, when initialized with enterprise Parameters it returns a class that is initialized with a name and a Pulsar object as follows:


In [10]:
sw2 = sw_function('sine_wave', psr=psr)
print(sw2)


<enterprise.signals.signal_base.Function object at 0x10e3567d0>

Now this Function object carries around instances of the Parameter classes given above for this particular function and Pulsar


In [11]:
print(sw2.params)


["sine_wave_log10_A":Uniform(-10,-5), "sine_wave_log10_f":Uniform(-9,-7)]

Most importantly it can be called in three different ways: If given without parameters it will fall back on the defaults given in the original function definition


In [12]:
print(sw2())


[  5.97588901e-08   5.97588901e-08   5.97588901e-08 ...,  -5.80521219e-08
  -5.80521219e-08  -5.80521219e-08]

or we can give it new fixed parameters


In [13]:
print(sw2(log10_A=-8, log10_f=-6.5))


[ -7.23515356e-09  -7.23515356e-09  -7.23515356e-09 ...,   5.93768399e-09
   5.93768399e-09   5.93768399e-09]

or most importantly we can give it a parameter dictionary with the Parameter names as keys. This is how Functions are use internally inside enterprise.


In [14]:
params = {'sine_wave_log10_A':-8, 'sine_wave_log10_f':-6.5}
print(sw2(params=params))


[ -7.23515356e-09  -7.23515356e-09  -7.23515356e-09 ...,   5.93768399e-09
   5.93768399e-09   5.93768399e-09]

Notice that the last two methods give the same answer since we gave it the same values just in different ways. So you may be thinking: "Why did we pass the Pulsar object on initialization?" or "Wait. How does it know about the toas?!". Well the first question answers the second. By passing the pulsar object it grabs the toas attribute internally. This feature, combined with the ability to recognize Parameters and the ability to call the original function as we always would are the main strengths of Function, which is used heavily in enterprise.

Note that if we define a function without the decorator then we can still obtain a Function via:


In [15]:
def sine_wave(toas, log10_A=-7, log10_f=-8):
    return 10**log10_A * np.sin(2*np.pi*toas*10**log10_f)

sw3 = signal_base.Function(sine_wave, log10_A=parameter.Uniform(-10,-5), 
                           log10_f=parameter.Uniform(-9, -7))

print(sw3)


<class 'enterprise.signals.signal_base.Function'>

Make your own Function

To define your own Function all you have to do is to define a function with these rules in mind.

  1. If you want to use Pulsar attributes, define them as positional arguments with the same name as used in the Pulsar class (see here for more information.
  2. Any arguments that you may use as Parameters must be keyword arguments (although you can have others that aren't Parameters)
  3. Add the @function decorator.

And thats it! You can now define your own Functions with minimal overhead and use them in enterprise or for tests and simulations or whatever you want!

The Selection structure

In the course of our analysis it is useful to split different signals into pieces. The most common flavor of this is to split the white noise parameters (i.e., EFAC, EQUAD, and ECORR) by observing backend system. The Selection structure is here to make this as smooth and versatile as possible.

The Selection structure is also a class-factory that returns a specific selection dictionary with keys and Boolean arrays as values.

This will become more clear with an example. Lets say that you want to split our parameters between the first and second half of the dataset, then we can define the following function:


In [16]:
def cut_half(toas):
    midpoint = (toas.max() + toas.min()) / 2
    return dict(zip(['t1', 't2'], [toas <= midpoint, toas > midpoint]))

This function will return a dictionary with keys (i.e. the names of the different subsections) t1 and t2 and boolean arrays corresponding to the first and second halves of the data span, respectively. So for a simple input we have:


In [17]:
toas = np.array([1,2,3,4])
print(cut_half(toas))


{'t2': array([False, False,  True,  True], dtype=bool), 't1': array([ True,  True, False, False], dtype=bool)}

To pass this to enterprise we turn it into a Selection via:


In [18]:
ch = Selection(cut_half)
print(ch)


<class 'enterprise.signals.selections.Selection'>

As we have stated, this is class factory that will be initialized inside enterprise signals with a Pulsar object in a very similar way to Functions.


In [19]:
ch1 = ch(psr)
print(ch1)
print(ch1.masks)


<enterprise.signals.selections.Selection object at 0x111be3450>
{'t2': array([False, False, False, ...,  True,  True,  True], dtype=bool), 't1': array([ True,  True,  True, ..., False, False, False], dtype=bool)}

The Selection object has a method masks that uses the Pulsar object to evaluate the arguments of cut_half (these can be any number of Pulsar attributes, not just toas). The Selection object can also be called to return initialized Parameters with the split names as follows:


In [20]:
# make efac class factory
efac = parameter.Uniform(0.1, 5.0)

# now give it to selection
params, masks = ch1('efac', efac)

# named parameters
print(params)

# named masks
print(masks)


{u't1_efac': "B1855+09_t1_efac":Uniform(0.1,5.0), u't2_efac': "B1855+09_t2_efac":Uniform(0.1,5.0)}
{u't1_efac': array([ True,  True,  True, ..., False, False, False], dtype=bool), u't2_efac': array([False, False, False, ...,  True,  True,  True], dtype=bool)}

Make your own Selection

To define your own Selection all you have to do is to define a function with these rules in mind.

  1. If you want to use Pulsar attributes, define them as positional arguments with the same name as used in the Pulsar class (see here for more information.
  2. Make sure the return value is a dictionary with the names you want for the different segments and values as boolean arrays specifying which points to apply the split to.
  3. A selection does not have to apply to all points. You can make it apply to only single points or single segments if you wish.

And thats it! You can now define your own Selections with minimal overhead and use them in enterprise or for tests and simulations or whatever you want!

Signals, SignalCollections, and PTAs oh my!

  • The data (residuals) are modeled as the sum of Signal components which have their own Parameters.
  • The sum of all Signal components is a SignalCollection.

  • Each pulsar's model is a SignalCollection that are combined to form a PTA.
  • Common Signals are shared across pulsars
  • Likelihoods act on PTAs.

Anatomy of an enterprise Signal

  • $\delta\tau = \sum_{i} X(\phi_{\rm basis})_{(i)}w_{(i)} + s(\phi_{\rm det}) + n(\phi_{\rm white})$
  • $w_{(i)} | K_{(i)} = \mathrm{Normal}(0, K(\phi_{\rm gp})_{(i)})$
class Signal(object):
    """Base class for Signal objects."""

    def get_ndiag(self, params):
        """Returns the diagonal of the white noise vector `N`.
        This method also supports block diagaonal sparse matrices.
        """
        return None

    def get_delay(self, params):
        """Returns the waveform of a deterministic signal."""
        return 0

    def get_basis(self, params=None):
        """Returns the basis array of shape N_toa x N_basis."""
        return None

    def get_phi(self, params):
        """Returns a diagonal or full rank covaraince matrix 
        of the basis amplitudes."""
        return None

    def get_phiinv(self, params):
        """Returns inverse of the covaraince of basis amplitudes."""
        return None

In [ ]: