In [1]:
# Basic setup
%load_ext autoreload
%autoreload 2
%matplotlib osx
import numpy as np
import astropy.io.fits as pf
In [2]:
# Ok, now let's open an event data file and take a look
# !wget -O data/test_events_0000.fits http://www.slac.stanford.edu/~echarles/temp_stats/test_events_0000.fits
f_evt = pf.open("data/test_events_0000.fits")
f_evt.info()
That file is a simulation of 1 day of data from a fairly small region of the sky. It isn't particularly interesting, but it is useful to see the data structures. Note that hdu[0] is an empty image (required by the FITS standard), hdu[1] is the event data table, and hdu[2] is a set of Good Time Intervals (actually just a single interval in this case).
In [3]:
f_evt[1].columns
Out[3]:
The first several columns are the reconstructed energy of the photon, the reconstructed coordinates in several reference frames (J2000, Galactic, instrument frame, Earth-centered frame), and the arrival time of the photon.
The last two columns are the "Monte Carlo Truth" i.e., what we actually generated in the simulations. Obviously these wouldn't be there if we were looking at flight data.
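Just to get a feel for the table, we can pull a couple of columns into numpy arrays (ENERGY and TIME are standard columns in LAT event files):
In [ ]:
# Pull a few columns of the event table into numpy arrays
evt_data = f_evt[1].data
energies = evt_data["ENERGY"]   # reconstructed energy, in MeV
times = evt_data["TIME"]        # arrival time, in Mission Elapsed Time (s)
print("%i events, %.1f MeV to %.1f MeV" % (len(energies), energies.min(), energies.max()))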
In [4]:
# Ok, now let's take a glance at the spacecraft pointing history
# !wget -O data/test_scData_0000.fits http://www.slac.stanford.edu/~echarles/temp_stats/test_scData_0000.fits
f_ptg = pf.open("data/test_scData_0000.fits")
f_ptg.info()
Ok, in this case we just have one table. This is the pointing information from the spacecraft.
In [6]:
f_ptg[1].columns
Out[6]:
The pointing information is given in 30 s intervals. Since the LAT doesn't slew very quickly, this is generally good enough to calculate the exposure and instrument response. (On the instrument the pointing information is updated at ~5 Hz, and that is what is used for the direction assignment of the individual photons.)
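We can check the interval spacing directly from the table (START and STOP are standard columns in LAT pointing-history files):
In [ ]:
# Interval lengths in seconds; expect ~30 s
ptg_data = f_ptg[1].data
intervals = ptg_data["STOP"] - ptg_data["START"]
print("median interval: %.1f s" % np.median(intervals))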
If we were actually running the analysis chain, at this point we would run the gtbin tool to make a binned counts map of a small region of the sky around Draco. (The region is actually 10x10 degrees; for the LAT that is "small", for HST that would be "absolutely gigantic".) The output of gtbin is the "draco_ccube.fits" file in the remote data area.
In [7]:
#!wget -O data/draco_ccube.fits http://www.slac.stanford.edu/~echarles/temp_stats/draco_ccube.fits
f_ccube = pf.open("data/draco_ccube.fits")
f_ccube.info()
So, this is a series of 24 images, showing the counts maps for 1/8-decade energy bins running from 500 MeV to 500 GeV. Here are examples of the counts maps at 500 MeV, 5 GeV and 50 GeV. The images are smoothed to 0.3 degrees and the Fermi catalog sources in the region have been overlaid. At the higher energies we can see the individual photons.
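As a quick check on the binning: 500 MeV to 500 GeV spans 3 decades, so 24 bins of 1/8 decade each means 25 logarithmically spaced bin edges:
In [ ]:
# Energy bin edges in MeV: 25 edges spanning 3 decades
energy_edges = np.logspace(np.log10(500.), np.log10(5.0e5), 25)
print(energy_edges)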
Recall that the LAT is a very wide field-of-view instrument (we see gamma-rays out to 80 degrees away from the LAT boresight). The instrument response changes significantly across the field of view, so we have to keep track of how much time any given direction in the sky spends at a particular direction in the instrument frame. We refer to this as the "observing profile" ($t_{obs}$); as an example, here is the observing profile for Draco.
In fact, we want to keep track of the observing profile for every direction in the sky. We call this a "livetime cube", and there is a LAT analysis tool, gtltcube, that builds it for us. The output of gtltcube is the "ltcube_6year.fits" file in the remote data area.
In [8]:
# Let's take a peek at the livetime cube file:
#! wget -O data/ltcube_6year.fits http://www.slac.stanford.edu/~echarles/temp_stats/ltcube_6year.fits
f_ltcube = pf.open("data/ltcube_6year.fits")
f_ltcube.info()
The main things here are the "EXPOSURE" and "WEIGHTED_EXPOSURE" tables. They are more or less the same (I'm not going to talk about the differences). What these tables contain is the observing time as a function of off-axis angle (i.e., the observing time in 40 different bins of $\cos \theta$) for every direction in a HEALPix pixelization of the sky (i.e., each of the 49152 rows in the table).
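We can check those dimensions directly (the COSBINS column name here follows standard gtltcube output; adjust if your file differs):
In [ ]:
lt_data = f_ltcube["EXPOSURE"].data
print("%i HEALPix pixels" % len(lt_data))   # 49152 = 12 * 64**2, i.e., nside = 64
print(lt_data["COSBINS"].shape)             # 40 cos(theta) bins per pixel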
The next step of the analysis is to combine the livetime cube with the tabulated parameterizations of the instrument response as a function of the off-axis angle.
Here is the effective collection area as a function of energy and off-axis angle.
Here is a plot of the 68% and 95% containment radii of the LAT point-spread function as a function of energy. At low energies (< 1 GeV) the PSF is dominated by multiple Coulomb scattering of the converted electron and positron. At high energies the PSF is dominated by the pitch of the strips in the LAT tracker.
Finally, here is a plot of the 68% containment width of the energy dispersion function. The LAT does best around 1-10 GeV where the majority of the gamma-ray energy is deposited in the LAT calorimeter.
Note that the energy resolution is smaller than the bin size we are using (1/8 decade ≈ 30%) over the entire energy range of our analysis (500 MeV to 500 GeV). In fact, we will be ignoring the energy dispersion in our analysis and assuming that the reconstructed energy is the true energy of the incident gamma-ray.
We calculate the exposure by integrating the product of the effective area, as a function of energy and off-axis angle, and the observing profile over the LAT field of view:
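In the notation used below (writing $\mathcal{E}$ for the exposure and $A_{\rm eff}$ for the effective area), the standard form of this integral is:

$$\mathcal{E}(E, \hat{p}; s) = \int_{\rm FOV} A_{\rm eff}(E, \hat{v}; s)\, t_{obs}(\hat{p}, \hat{v})\, d\hat{v}$$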
Here $E$ is the photon energy, $\hat{p}$ and $\hat{v}$ are the photon direction in celestial coordinates and in the instrument frame, respectively. The variable $s$ indicates the particular event selection that we are using. The LAT team makes several different photon data sets with increasingly stringent selection criteria. The different data sets are optimized for different types of analysis. For example, when analyzing a short duration transient such as a Gamma-Ray Burst, we use much looser event selection criteria than when analyzing the large-scale diffuse gamma-ray backgrounds.
Normally we would use the tool gtexpcube2 to calculate the exposure as a function of direction and energy. You can find the "draco_bexmap.fits" file in the remote data directory.
In [9]:
# Let's peek inside the exposure file:
#! wget -O data/draco_bexmap.fits http://www.slac.stanford.edu/~echarles/temp_stats/draco_bexmap.fits
f_expmap = pf.open("data/draco_bexmap.fits")
f_expmap.info()
The first HDU in the file is an image; note that it is very different from the counts cube image. It is actually an image of the entire sky (360 x 180 1x1 degree pixels), and it is calculated at the edges of the energy bins (so there are 25 layers instead of 24).
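We can confirm that layout from the primary HDU (assuming the usual FITS-to-numpy axis ordering of energy, latitude, longitude):
In [ ]:
print(f_expmap[0].data.shape)   # expect (25, 180, 360)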
Here is an example of the all-sky binned exposure map at 5 GeV; the green box is the Draco ROI.
If we focus on the Draco ROI we see that the exposure varies very little across the ROI.
The gradations in color are about 1% variations in the exposure.
We will be fitting for gamma-ray emission from Draco by hypothesis testing; in particular we will test the likelihood of a model that includes emission from Draco against the likelihood of a model that does not.
What we actually calculate are the predicted counts from each source in our model, in each pixel, for each energy bin. We get these by convolving a flux model of the source with the instrument response:
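Neglecting energy dispersion (as discussed above), and writing $n_{ij}$ for the predicted counts in pixel $i$ and energy bin $j$, $S(E, \hat{p})$ for the source flux model, and $P$ for the PSF, this convolution takes the standard form:

$$n_{ij} = \int_{\Delta E_j} dE \int_{\Delta\Omega_i} d\hat{p} \int d\hat{p}'\; P(\hat{p} \mid \hat{p}'; E)\; \mathcal{E}(E, \hat{p}')\; S(E, \hat{p}')$$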
Since we will be fitting the spectral parameters of sources for which we have spatial templates, it makes sense to separate the model into spectral and spatial components. In fact, we can compute a spectrum-independent prefactor:
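Factoring the model as $S(E, \hat{p}) = S(E)\, \tilde{S}(\hat{p})$, the spectrum-independent prefactor, evaluated at each tabulated energy $E_j$, is:

$$d_{ij} = \int_{\Delta\Omega_i} d\hat{p} \int d\hat{p}'\; P(\hat{p} \mid \hat{p}'; E_j)\; \mathcal{E}(E_j, \hat{p}')\; \tilde{S}(\hat{p}')$$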
For point sources the templates are simply exposure-weighted images of the PSF. For extended sources the templates are the exposure-weighted convolution of the true intensity, $\tilde{S}(\hat{p})$, with the PSF.
By design, in both cases we can construct the expected counts maps for each source by multiplying the templates by the source spectrum:
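In the same notation, schematically (the actual tools do this integral numerically, using the template planes tabulated at the energy-bin edges, e.g., with a trapezoidal rule):

$$n_{ij} = \int_{\Delta E_j} S(E)\, d_{i}(E)\, dE \;\approx\; \frac{1}{2}\left[ S(E_j)\, d_{ij} + S(E_{j+1})\, d_{i,j+1} \right] \left( E_{j+1} - E_j \right)$$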
We refer to the templates as "source maps" and we have a tool gtsrcmaps to compute them.
Here are some examples of the templates (i.e., the $d_{j}$ for different sources in our model):
Source | 500 MeV | 5 GeV
---|---|---
Galactic Diffuse | |
Isotropic | |
Point Source | |
Draco | |
I've put the point source on a log scale so that you can see the long tails of the PSF.
In [10]:
# Let's look inside a source map file
#!wget -O data/draco_srcmap.fits http://www.slac.stanford.edu/~echarles/temp_stats/draco_srcmap.fits
f_srcmap = pf.open("data/draco_srcmap.fits")
f_srcmap.info()
Note that this file includes the binned counts data, and that there is one set of image templates for each source near the Draco ROI. Note also that the source maps are computed at the edges of the energy bins (i.e., there are 25 layers instead of 24); this is done to improve the accuracy of the numerical integration when making the model maps.
Simply put, this file contains everything needed to do the likelihood fitting except for the spectra of the various sources.
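As a sketch of what that computation looks like, here is how a source-map template could be combined with a spectrum to get predicted counts (the HDU name "DRACO" and the power-law spectrum are illustrative assumptions, not taken from the file):
In [ ]:
# Combine a source-map template (25 energy-edge layers) with a toy spectrum
# to get predicted counts in 24 energy bins, via trapezoidal integration.
energy_edges = np.logspace(np.log10(500.), np.log10(5.0e5), 25)   # MeV
spectrum = 1.0e-11 * (energy_edges / 1000.) ** -2.0               # toy dN/dE power law
template = f_srcmap["DRACO"].data                                 # assumed HDU name; shape (25, ny, nx)
integrand = spectrum[:, np.newaxis, np.newaxis] * template
d_e = np.diff(energy_edges)[:, np.newaxis, np.newaxis]
model_counts = 0.5 * (integrand[:-1] + integrand[1:]) * d_e       # shape (24, ny, nx)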
There are 9 point sources in the Draco ROI, and several others that are outside the ROI but close enough that they might contribute some photons to it, because the 95% containment radius of the LAT PSF is several degrees at 500 MeV.
We are not going to fit every source in the ROI. Rather, we are going to fix all of the point sources to their values in the 3FGL catalog (which largely overlaps with the time range of our analysis). Also, in the higher energy bins we don't really have the statistics needed to distinguish the Galactic diffuse emission from the isotropic emission, and they become degenerate. So we are only going to fit for the fluxes of Draco and the Galactic diffuse emission. Finally, to simplify things, we are only going to fit the normalizations of the fluxes, not the other spectral parameters.
In fact, we've already fit the parameters of all the sources in the region to establish a baseline model. You can find the results of that fitting in three places:
- An XML file describing the sources in the ROI.
- A text file with a python dictionary giving the parameters of all the sources in the ROI.
- A FITS file giving the number of counts attributed to each source in each energy bin.
Finally, you can find a FITS file giving the best-fit model counts in each pixel / energy bin in the remote data area.
In [ ]:
#!wget -O data/draco_mcube.fits http://www.slac.stanford.edu/~echarles/temp_stats/draco_mcube.fits
As described above, we will fix all of the point sources to their values in the baseline fit (mainly taken from the 3FGL catalog) and fit only the normalizations of the Draco and Galactic diffuse components.
To set us up to do this, I've made the file draco_srcTemplates.fits
in the data directory.
In [11]:
f_tmpl = pf.open("data/draco_srcTemplates.fits")
f_tmpl.info()
This file is similar to the source map file, but I've merged all of the fixed sources into a single template. Also, I've actually done the integration over the energy bins, so the templates have 24 energy layers instead of 25. This is fine because we will only be fitting the normalizations of the Draco and GLL_IEM_V06 (i.e., Galactic diffuse) templates.
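With that, the full fit model is just a linear combination (a minimal sketch; the HDU names "FIXED", "DRACO", and "GLL_IEM_V06" are assumptions for illustration, check f_tmpl.info() for the actual names):
In [ ]:
# Total model counts = fixed sources + free normalizations x free templates
norm_draco, norm_gll = 1.0, 1.0   # the two free parameters of our fit
model = (f_tmpl["FIXED"].data
         + norm_draco * f_tmpl["DRACO"].data
         + norm_gll * f_tmpl["GLL_IEM_V06"].data)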
Let's move on to the likelihood-fitting example now.