The LightCurve class

Background

What kind of physical data are we representing and what do these quantities mean?

Astrophysical variable sources including transients like supernovae are sources that are (roughly) fixed in position but change appreciably at human time scales, such that their measurements at different points of time in a survey are expected to be different. The key science of such sources has to do with interpreting such data.

In a photometric or imaging survey, the raw data associated with such measurements is in the form of a time series of images, along with metadata. This is often extracted into a catalog summarizing the inferred brightness of the source in the filter band and the uncertainties of the inferred. The uncertainties in this process are dominated by the sky noise which is Poisson distributed to a good approximation. Hence, the format of this recorded data is usually of the form

timestamp, flux, flux uncertainty, zero point, magnitude system

Note that the flux is really the band flux obtained in a filter band

The zero point, and magnitude system are methods of specifying the units of the reported flux, and flux uncertainty. flux may be reported so that the following equation with bandflux and std_bandflux in the same physical units holds. The std_bandflux depends on the transmissions and the magnitude system used.

bandmag := -2.5 log10(bandflux / std_bandflux) = -2.5 log10(flux) + zero point

so that a zero point of zero, implies that flux is bandflux in units of std_bandflux

Desired Functionality

the LightCurve class is meant to be a standard internal representation of this kind of catalog data in our software. This means that

  • given a different format/model of data, we will first convert it into an instance of this class.
  • All our calculations will be defined on instances of this class
  • We should have write methods to convert the instances of this class to useful formats for serialization.

Structure


In [56]:
import sncosmo
import analyzeSN as ans
import numpy as np

In [2]:
from analyzeSN import LightCurve

In [50]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()


/usr/local/miniconda/lib/python2.7/site-packages/IPython/html.py:14: ShimWarning: The `IPython.html` package has been deprecated. You should import from `notebook` instead. `IPython.html.widgets` has moved to `ipywidgets`.
  "`IPython.html.widgets` has moved to `ipywidgets`.", ShimWarning)

Instantiation

The class is instantiated using a constructor that has a mandatory input of a dataframe. and optional inputs. The dataframe must contain the light curve information, and the minimum set of columns required are: mjd, band, flux, flux_err, zp, zpsys or recognized aliases. Here, we load some data and demonstrate this


In [3]:
ex_data = sncosmo.load_example_data().to_pandas()
ex_data.head()


Out[3]:
time band flux fluxerr zp zpsys
0 55070.000000 sdssg 0.363512 0.672844 25.0 ab
1 55072.051282 sdssr -0.200801 0.672844 25.0 ab
2 55074.102564 sdssi 0.307494 0.672844 25.0 ab
3 55076.153846 sdssz 1.087761 0.672844 25.0 ab
4 55078.205128 sdssg -0.436679 0.672844 25.0 ab

This was a SNCosmo example dataset, loaded into pandas.DataFrame. Note that the column representing temporal information is time. Now, we will use this to instantiate our class.


In [4]:
lc = LightCurve(ex_data)

We get the representation in our convention by accessing the attribute lightCurve, which is a dataFrame with columns of the standard name. It could have more columns, but the minimal columns have to be there. If we supplied a dataFrame without having the minimal number of columns, it will raise a valueError, as described later. From the lightCurve attribute, we note that time has changed to our standard name mjd


In [5]:
lc.lightCurve.head()


Out[5]:
mjd band flux fluxerr zp zpsys
0 55070.000000 sdssg 0.363512 0.672844 25.0 ab
1 55072.051282 sdssr -0.200801 0.672844 25.0 ab
2 55074.102564 sdssi 0.307494 0.672844 25.0 ab
3 55076.153846 sdssz 1.087761 0.672844 25.0 ab
4 55078.205128 sdssg -0.436679 0.672844 25.0 ab

What would have happened if an essential column was missing? to check let us do the following:

  • We create a dataframe like the above, but remove the essential column band
  • We attempt to create an instance of the LightCurve class

In [6]:
lc_tmp = ex_data.copy()
del lc_tmp['band']
lc_tmp.head()


Out[6]:
time flux fluxerr zp zpsys
0 55070.000000 0.363512 0.672844 25.0 ab
1 55072.051282 -0.200801 0.672844 25.0 ab
2 55074.102564 0.307494 0.672844 25.0 ab
3 55076.153846 1.087761 0.672844 25.0 ab
4 55078.205128 -0.436679 0.672844 25.0 ab

In [7]:
lc_bad = LightCurve(lc_tmp)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-0768a2d4af41> in <module>()
----> 1 lc_bad = LightCurve(lc_tmp)

/Users/rbiswas/.local/lib/python2.7/site-packages/analyzeSN/lightcurve.pyc in __init__(self, lcdf, bandNameDict, ignore_case)
    114         self._lightCurve  = lcdf
    115         self.ignore_case = ignore_case
--> 116         _ = self.lightCurve
    117 
    118 

/Users/rbiswas/.local/lib/python2.7/site-packages/analyzeSN/lightcurve.pyc in lightCurve(self)
    156         if len(missingColumns) > 0:
    157             raise ValueError('light curve data has missing columns',
--> 158                              missingColumns)
    159         else:
    160             _lc.band = _lc.band.apply(lambda x: x.strip())

ValueError: ('light curve data has missing columns', set(['band']))

This raises a ValueError pointing out that the column band was missing. However, it correctly recognizes that while mjd was missing, it had the aliased column time instead and did not report it missing.

Light Curves in SNCosmo Format

SNCosmo has a data format using astropy.table.Table. Since, we use SNCosmo to plot, or fit light curves, it is essential that we can regenerate the SNCosmo format data easily. We do this by the following method


In [8]:
SNCosmoLC = lc.snCosmoLC()

In [9]:
SNCosmoLC


Out[9]:
<Table length=40>
timebandfluxfluxerrzpzpsys
float64str5float64float64float64str2
55070.0sdssg0.363511535970.67284384754125.0ab
55072.0512821sdssr-0.2008012958640.67284384754125.0ab
55074.1025641sdssi0.3074942329810.67284384754125.0ab
55076.1538462sdssz1.087761036560.67284384754125.0ab
55078.2051282sdssg-0.436678956450.67284384754125.0ab
55080.2564103sdssr1.097809667790.67284384754125.0ab
55082.3076923sdssi3.75626856270.67284384754125.0ab
55084.3589744sdssz5.348588949660.67284384754125.0ab
55086.4102564sdssg2.826141872690.67284384754125.0ab
55088.4615385sdssr7.565470450540.67284384754125.0ab
..................
55131.5384615sdssi3.995204040210.67284384754125.0ab
55133.5897436sdssz5.739894580940.67284384754125.0ab
55135.6410256sdssg0.3307022831070.67284384754125.0ab
55137.6923077sdssr0.5652867265790.67284384754125.0ab
55139.7435897sdssi3.043183467950.67284384754125.0ab
55141.7948718sdssz5.626926863840.67284384754125.0ab
55143.8461538sdssg-0.7226547890130.67284384754125.0ab
55145.8974359sdssr1.120917642620.67284384754125.0ab
55147.9487179sdssi2.12466952640.67284384754125.0ab
55150.0sdssz5.34821756450.67284384754125.0ab

We can now plot the data using the usual SNCosmo plotting method, showing the data in g, r, and z bands.


In [10]:
sncosmo.plot_lc(SNCosmoLC)


Out[10]:

SNANA simulations


In [11]:
from analyzeSN import SNANASims

In [12]:
from astropy.table import Table

In [13]:
megacamBandNames = 'ugriz'
megacamRegisteredNames = tuple('mega_' + band for band in megacamBandNames)

In [14]:
snana_eg = SNANASims.fromSNANAfileroot(snanafileroot='snana_fits',
                                       location='../analyzeSN/example_data/',
                                       coerce_inds2int=False,
                                       SNANABandNames=megacamBandNames,
                                       registeredBandNames=megacamRegisteredNames)

In [15]:
Table.read(snana_eg.photFile)[45:50]


Out[15]:
<Table length=5>
MJDFLTFIELDTELESCOPEPHOTFLAGPHOTPROBFLUXCALFLUXCALERRMAGMAGERRPSF_SIG1PSF_SIG2PSF_RATIOSKY_SIGSKY_SIG_TRDNOISEZEROPTZEROPT_ERRGAIN
float64str2str12str20int32float32float32float32float32float32float32float32float32float32float32float32float32float32float32
53023.23rNULLALL00.00.821.75627.7170.00.00.00.00.00.00.00.00.00.0
53026.27iNULLALL00.02.542.87726.4870.00.00.00.00.00.00.00.00.00.0
53026.34rNULLALL00.0-3.11.816666.00.00.00.00.00.00.00.00.00.00.0
-777.0-XXXXXXXX00.0-777.0-777.00.00.00.00.00.00.00.00.00.00.00.0
52881.5iNULLALL00.0-3.92.704666.00.00.00.00.00.00.00.00.00.00.0

Let us look at what the array in the fits file looks like:

  • It has fields that are not of the standard name (FLUXCAL)
  • It has things like "XXX" in he fields

In [16]:
snana_eg.bandNames


Out[16]:
'ugriz'

In [17]:
snana_eg.newbandNames


Out[17]:
('mega_u', 'mega_g', 'mega_r', 'mega_i', 'mega_z')

In [18]:
lcInstance = snana_eg.get_SNANA_photometry(snid='03D1aw')

In [19]:
lcInstance.lightCurve[['mjd', 'band', 'flux', 'fluxerr', 'zp', 'zpsys']]


Out[19]:
mjd band flux fluxerr zp zpsys
0 52881.500 mega_i 13.440000 2.846000 27.5 ab
1 52881.539 mega_r 7.740000 1.701000 27.5 ab
2 52881.559 mega_z 19.719999 6.539000 27.5 ab
3 52886.602 mega_i 44.520000 2.909000 27.5 ab
4 52900.531 mega_i 86.570000 2.914000 27.5 ab
5 52900.559 mega_r 62.169998 2.324000 27.5 ab
6 52900.590 mega_z 84.410004 8.752000 27.5 ab
7 52904.531 mega_i 80.720001 2.710000 27.5 ab
8 52904.570 mega_r 55.090000 2.009000 27.5 ab
9 52908.559 mega_r 54.500000 1.930000 27.5 ab
10 52908.602 mega_i 76.160004 2.932000 27.5 ab
11 52909.609 mega_z 81.099998 7.752000 27.5 ab
12 52912.488 mega_i 70.440002 2.866000 27.5 ab
13 52914.570 mega_r 42.810001 2.690000 27.5 ab
14 52915.621 mega_i 62.090000 3.366000 27.5 ab
15 52916.461 mega_r 38.060001 1.845000 27.5 ab
16 52934.512 mega_r 8.750000 3.558000 27.5 ab
17 52937.512 mega_i 23.530001 5.018000 27.5 ab
18 52937.578 mega_r -0.060000 5.241000 27.5 ab
19 52938.520 mega_z 78.430000 24.221001 27.5 ab
20 52942.488 mega_i 19.780001 8.521000 27.5 ab
21 52944.441 mega_r 7.440000 3.887000 27.5 ab
22 52944.469 mega_z 10.470000 13.880000 27.5 ab
23 52961.379 mega_i 9.140000 3.233000 27.5 ab
24 52961.480 mega_r 4.840000 2.418000 27.5 ab
25 52962.309 mega_z 20.190001 5.934000 27.5 ab
26 52964.320 mega_i 9.600000 2.981000 27.5 ab
27 52964.398 mega_r 1.800000 2.409000 27.5 ab
28 52964.441 mega_z 6.420000 15.410000 27.5 ab
29 52968.371 mega_r 3.590000 2.250000 27.5 ab
30 52972.270 mega_i -1.910000 3.710000 27.5 ab
31 52990.262 mega_i 7.550000 2.904000 27.5 ab
32 52990.340 mega_r 0.500000 1.987000 27.5 ab
33 52992.250 mega_i 4.020000 3.062000 27.5 ab
34 52992.289 mega_r -2.430000 2.229000 27.5 ab
35 52995.352 mega_i 3.080000 3.076000 27.5 ab
36 52995.391 mega_r 0.450000 1.911000 27.5 ab
37 52999.238 mega_i 0.830000 2.948000 27.5 ab
38 52999.281 mega_r 0.720000 1.837000 27.5 ab
39 52999.359 mega_z 20.700001 11.553000 27.5 ab
40 53000.238 mega_z 8.660000 6.506000 27.5 ab
41 53018.262 mega_i 3.550000 2.804000 27.5 ab
42 53018.309 mega_r -3.700000 2.257000 27.5 ab
43 53022.289 mega_i 2.410000 3.066000 27.5 ab
44 53022.320 mega_r -0.010000 4.057000 27.5 ab
45 53023.230 mega_r 0.820000 1.756000 27.5 ab
46 53026.270 mega_i 2.540000 2.877000 27.5 ab
47 53026.340 mega_r -3.100000 1.816000 27.5 ab

In [20]:
lcInstance.bandNameDict


Out[20]:
{'g': 'mega_g', 'i': 'mega_i', 'r': 'mega_r', 'u': 'mega_u', 'z': 'mega_z'}

In [21]:
snana_eg.headData['REDSHIFT_FINAL']


Out[21]:
SNID
03d1aw    0.5817
03d1ax    0.4960
Name: REDSHIFT_FINAL, dtype: float32

Fitting


In [22]:
import sncosmo

To infer the light curve model parameters, we need a model. We will use the SALT model from SNCosmo


In [35]:
dust = sncosmo.CCM89Dust()
model = sncosmo.Model(source='salt2', effects=[dust],
                     effect_names=['host', 'mw'], effect_frames=['rest', 'obs'])

We will not try to infer the redshift first


In [36]:
model.set(z=0.5817)

We have two main methods of trying to infer the model parameters: the first one is a maximum likelihood estimate


In [37]:
resfit = sncosmo.fit_lc(lcInstance.snCosmoLC(), model, vparam_names=['t0', 'x0', 'x1', 'c'], modelcov=False)

In [38]:
res = sncosmo.mcmc_lc(lcInstance.snCosmoLC(), model, vparam_names=['t0', 'x0', 'x1', 'c'], modelcov=False)

In [39]:
from analyzeSN import ResChar

In [40]:
reschar = ResChar.fromSNCosmoRes(res)

In [33]:
reschar.salt_samples()['t0'].mean()


Out[33]:
52902.329679348411

In [42]:
reschar.salt_samples()['c'].mean()


Out[42]:
0.27463205033466986

In [44]:
resfit[0].parameters[2]


Out[44]:
7.7076051685948634e-06

In [43]:
resfit[0].parameters[2]


Out[43]:
7.7076051685948634e-06

In [49]:
reschar.salt_samples()['c'].std()


Out[49]:
0.038004173905723491

In [70]:
fig, ax  = plt.subplots(1, 4)
reschar.salt_samples()['c'].hist(bins=np.arange(-0.6, 0.6, 0.03), lw=2., histtype='step', normed=1, ax=ax[0])
reschar.salt_samples()['x1'].hist(bins=np.arange(-4., 4., 0.2), lw=2., histtype='step', normed=1, ax=ax[1])
reschar.salt_samples()['t0'].hist(bins=10, lw=2., histtype='step', normed=1, ax=ax[2])
reschar.salt_samples()['mu'].hist(bins=10, lw=2., histtype='step', normed=1, ax=ax[3])


Out[70]:
<matplotlib.axes._subplots.AxesSubplot at 0x11013b090>

In [80]:
sns.distplot(reschar.salt_samples()['t0'], rug=False, color='r', hist_kws=dict(histtype='step', lw=2., alpha=1., color='k'))


Out[80]:
<matplotlib.axes._subplots.AxesSubplot at 0x1106a25d0>

In [47]:
resfit[0].errors['c']


Out[47]:
0.029235316112965083

Scratch


In [ ]:
10**(-0.4 * 0.27)

In [ ]:
-2.5 * np.log10(f)

In [ ]:
10.0**(0.27 / 2.5 )

In [ ]: