The Python likelihood tools are a powerful set of analysis tools that expand upon the command-line tools provided with the Fermi Science Tools package. Not only can you perform all of the same likelihood analyses with the Python tools that you can with the standard command-line tools, but you can also directly access all of the model parameters and more easily script a standard analysis. A few features, such as the calculation of upper limits, are available only from Python.
There are many user-contributed packages built upon the Python backbone of the Science Tools, and this thread will highlight the use of the fermipy package.
This sample analysis is based on the PG 1553+113 analysis performed by the LAT team and described in Abdo, A. A. et al. 2010, ApJ, 708, 1310, and closely follows the Likelihood Analysis with Python thread. This tutorial assumes you have the most recent Science Tools installed, with fermipy installed on top of them. For instructions on installing fermipy and the Fermi Science Tools, consult the fermipy Installation Instructions. We will also make significant use of Python, so you may want to familiarize yourself with the language, including matplotlib and other libraries (there's a beginner's guide at http://wiki.python.org/moin/BeginnersGuide). This tutorial also assumes that you've gone through the non-Python-based Unbinned Likelihood Tutorial.
For this thread the original data were extracted from the LAT data server with the following selections (these selections are similar to those in the paper):
Once you execute the query you can download the data and put it in your working directory. You'll need to change the names of the files to match the file names that the data server gave you. Alternatively, you can run with the following tarball, which already contains the downloaded files as well as all of the ancillary files that will be generated during the analysis. In this example the output of the analysis will go into a subdirectory called pg1553.
In [1]:
import os

if os.path.isfile('../data/pg1553.tar.gz'):
    !tar xzf ../data/pg1553.tar.gz
else:
    !curl -OL https://raw.githubusercontent.com/fermiPy/fermipy-extras/master/data/pg1553.tar.gz
    !tar xzf pg1553.tar.gz
If you would like to download the raw data files used in this analysis (FT1 and FT2) you can do so by executing the following cell.
In [2]:
if os.path.isdir('../data'):
    os.environ['DATADIR'] = '../data'
else:
    !mkdir pg1553/data
    !mkdir pg1553/data/ft1
    !mkdir pg1553/data/ft2
    os.environ['DATADIR'] = 'pg1553/data'
    !curl -o pg1553/data/ft1/L1504241622054B65347F25_PH00.fits -OL https://raw.githubusercontent.com/fermiPy/fermipy-extras/master/data/ft1/L1504241622054B65347F25_PH00.fits
    !curl -o pg1553/data/ft1/L1504241622054B65347F25_PH01.fits -OL https://raw.githubusercontent.com/fermiPy/fermipy-extras/master/data/ft1/L1504241622054B65347F25_PH01.fits

#!curl -o pg1553/L1504241622054B65347F25_PH00.fits http://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/pyLikelihood/L1504241622054B65347F25_PH00.fits
#!curl -o pg1553/L1504241622054B65347F25_PH01.fits http://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/pyLikelihood/L1504241622054B65347F25_PH01.fits
#!curl -o pg1553/L1504241622054B65347F25_SC00.fits http://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/pyLikelihood/L1504241622054B65347F25_SC00.fits
In [3]:
!ls -1 $DATADIR/ft1/*PH*.fits > pg1553/ft1.txt
fermipy bases its analysis on a configuration file (in YAML format). We're just going to use a really simple config file for a standard analysis. There are many more options, which you can set here or modify after the fact within the analysis chain.
Make a config file named 'config.yaml' like the following. For more details on the config file see config.html. You will probably need to customize this a bit, since your files might not be in the same place or have the same names. The Galactic and isotropic diffuse models will need to be located on your system (they are included in the Science Tools or can be downloaded from the FSSC). In the following example we set the path to these files with the environment variable FERMI_DIFFUSE_DIR. If FERMI_DIFFUSE_DIR is not defined, fermipy will look for these files within the FSSC Science Tools distribution.
data:
  evfile : ft1.txt
  scfile : L1511031432171D651E7F63_SC00.fits

binning:
  roiwidth   : 10.0
  binsz      : 0.1
  binsperdec : 8

selection:
  emin    : 100
  emax    : 300000
  zmax    : 90
  evclass : 128
  evtype  : 3
  target  : '3FGL J1555.7+1111'

gtlike:
  edisp         : True
  irfs          : 'P8R2_SOURCE_V6'
  edisp_disable : ['isodiff', 'galdiff']

model:
  src_roiwidth : 15.0
  galdiff      : '$FERMI_DIFFUSE_DIR/gll_iem_v06.fits'
  isodiff      : '$FERMI_DIFFUSE_DIR/iso_P8R2_SOURCE_V6_v06.txt'
  catalogs     : ['3FGL']
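Before handing this file to fermipy it can be useful to check that it parses as valid YAML and that the diffuse-model paths resolve. A minimal sketch using PyYAML, assuming the file has been saved as pg1553/config.yaml and that FERMI_DIFFUSE_DIR is set in your environment:
In [ ]:
import os
import yaml

# Parse the config file and sanity-check a few entries
with open('pg1553/config.yaml') as f:
    config = yaml.safe_load(f)

print(config['selection']['target'])
# Expand the environment variable to verify the Galactic diffuse model path
print(os.path.expandvars(config['model']['galdiff']))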
Next, you create an analysis script and run the setup steps, which include applying the data selections and generating the exposure maps. This will take a bit.
This is where the magic happens. fermipy will load the point-source model, create your XML file for you, decide on all of the appropriate cuts and binnings, and just go. All of this is configurable from Python or from the config file. And if you need to rerun things, it's smart enough not to overwrite files when it doesn't need to.
In [4]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
You start by importing the module and then creating an instance of the analysis object from our config file. When instantiating the analysis object we can override any options defined in the configuration file by passing keyword arguments to the object constructor. Here we explicitly set the verbosity parameter to 3 (INFO), which suppresses DEBUG output. When we create the object, it prints a summary of all of the parameters that were used. You can see there are many more options than the ones we chose.
In [5]:
from fermipy.gtanalysis import GTAnalysis
gta = GTAnalysis('pg1553/config.yaml',logging={'verbosity': 3})
matplotlib.interactive(True)
This gets everything ready for the likelihood analysis, including instantiating the pyLikelihood object. Note that fermipy will skip generating any ancillary files that already exist in the working directory. In the sample tarball these files have already been produced in order to speed up this stage of the analysis. If you want to see the behavior of fermipy when running from an empty working directory, you can delete one or more of these files before running setup.
In [6]:
gta.setup()
Before proceeding with the analysis we'll have a quick look at the files that are produced by the setup function.
In [7]:
ls pg1553/*fits
Here is a brief explanation of the contents of each file and its role in the analysis: ft1_00.fits is the filtered event list produced by the data selections; ccube_00.fits is the counts cube, which bins the selected events in energy and sky position; bexpmap_00.fits is the binned exposure map, needed to convert model fluxes into predicted counts; and srcmap_00.fits contains the source maps, i.e. the spatial model of each source convolved with the instrument response. Note that all of these files have a numerical suffix '00'. This is the analysis component index. In a multi-component analysis there would be instances of all of the above files for each analysis component. The files with no component index are co-added maps that are provided for visualization purposes.
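To see this naming convention in practice, you can list the per-component files directly; a quick check, assuming the pg1553 working directory used above:
In [ ]:
from glob import glob

# Files with the '00' suffix belong to the first (and here only) analysis
# component; files without a suffix are the co-added maps
sorted(glob('pg1553/*_00.fits'))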
To see an example of one of these files, we can open and plot the counts cube. This is a 3-D cube that contains the distribution of events as a function of energy and two spatial coordinates. In the example below we sum over the energy dimension of the cube to make a 2-D sky image.
In [8]:
import astropy.io.fits as pyfits

# Open the co-added counts cube and inspect its structure
h = pyfits.open('pg1553/ccube.fits')
h.info()
counts = h[0].data
counts.shape

# Sum over the energy axis (axis 0) to make a 2-D counts map
plt.figure()
plt.imshow(np.sum(counts, axis=0), interpolation='nearest', origin='lower')
Out[8]:
We can now inspect the state of the ROI prior to the fit with the print_roi() method.
In [9]:
gta.print_roi()
Additional details about an individual source can be retrieved by printing the corresponding source object. Here we use the bracket operator to return the properties of PG1553.
In [10]:
print(gta.roi['3FGL J1555.7+1111'])
In [11]:
# Free the normalization of all sources within 3 deg of the ROI center
gta.free_sources(distance=3.0, pars='norm')

# Free all parameters of the isotropic and Galactic diffuse components
gta.free_source('galdiff')
gta.free_source('isodiff')
In this simple analysis we are leaving the spectral shapes of the other sources fixed, but we're going to free the spectral shape of the source we care about.
In [12]:
gta.free_source('3FGL J1555.7+1111')
Now, actually do the fit. The software does its best to get the fit to converge by running it several times.
In [13]:
fit_results = gta.fit()
The dictionary returned by the fit method contains a variety of diagnostic information about the fit, including the fit quality, the relative improvement in the likelihood, and the correlations among the fit parameters. We can inspect the results of the fit by printing the source object for PG1553.
In [14]:
print('Fit Quality: ',fit_results['fit_quality'])
print(gta.roi['3FGL J1555.7+1111'])
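Beyond the fit quality, the returned dictionary carries the other diagnostics mentioned above; the exact set of keys depends on the fermipy version, so it's easiest to list them directly:
In [ ]:
# List every diagnostic quantity returned by fit()
sorted(fit_results.keys())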
You can then save the state of the ROI to an output file for later reference with the write_roi function. The first argument is a string that will be prepended to the names of the output files generated by this method.
In [15]:
gta.write_roi('fit0',make_plots=True)
A number of diagnostic plots are also saved at the same time.
In [16]:
ls -l pg1553/*.png
In [17]:
from IPython.display import Image, display
from glob import glob
In [18]:
pngs = glob('pg1553/*.png')
In [19]:
for png in pngs:
    my_image = Image(png)
    display(my_image)
In [20]:
# Load the saved ROI dictionary (allow_pickle is required for dict payloads
# in recent versions of numpy)
c = np.load('pg1553/fit0.npy', allow_pickle=True).flat[0]
The sources dictionary has an entry for each source in the model:
In [21]:
sorted(c['sources'].keys())
Out[21]:
Let's take a look at the flux, spectral parameters, and TS.
In [22]:
c['sources']['3FGL J1555.7+1111']['flux']
Out[22]:
In [23]:
print(c['sources']['3FGL J1555.7+1111']['param_names'][:4])
print(c['sources']['3FGL J1555.7+1111']['param_values'][:4])
In [24]:
c['sources']['3FGL J1555.7+1111']['ts']
Out[24]:
The SED is in there as well. We can plot it.
In [25]:
E = np.array(c['sources']['3FGL J1555.7+1111']['model_flux']['energies'])
dnde = np.array(c['sources']['3FGL J1555.7+1111']['model_flux']['dnde'])
dnde_hi = np.array(c['sources']['3FGL J1555.7+1111']['model_flux']['dnde_hi'])
dnde_lo = np.array(c['sources']['3FGL J1555.7+1111']['model_flux']['dnde_lo'])
In [26]:
plt.loglog(E, (E**2)*dnde, 'k--')
plt.loglog(E, (E**2)*dnde_hi, 'k')
plt.loglog(E, (E**2)*dnde_lo, 'k')
plt.xlabel('E [MeV]')
plt.ylabel(r'E$^2$ dN/dE [MeV cm$^{-2}$ s$^{-1}$]')
plt.show()
If you want SED points, there's a function for that. It has many options, which you can set in the config file or from keyword arguments of the function itself.
In [27]:
sed = gta.sed('3FGL J1555.7+1111')
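For example, the energy binning and plotting behavior can be controlled from the function call. A sketch using the loge_bins and make_plots options of this method; the coarse binning chosen here is purely illustrative, and the result is stored under a separate name so the default-binning result above is preserved:
In [ ]:
# Recompute the SED with custom log10(E/MeV) bin edges and diagnostic plots
sed_coarse = gta.sed('3FGL J1555.7+1111',
                     loge_bins=np.linspace(2.0, 5.5, 8),
                     make_plots=True)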
You can save the results to a yaml file, or you can just access them directly. The bracket operator is also the way to get the dictionary for any individual source.
In [28]:
src = gta.roi['3FGL J1555.7+1111']
In [29]:
plt.loglog(E, (E**2)*dnde, 'k--')
plt.loglog(E, (E**2)*dnde_hi, 'k')
plt.loglog(E, (E**2)*dnde_lo, 'k')
plt.errorbar(np.array(sed['e_ctr']),
             sed['e2dnde'],
             yerr=sed['e2dnde_err'], fmt='o')
plt.xlabel('E [MeV]')
plt.ylabel(r'E$^{2}$ dN/dE [MeV cm$^{-2}$ s$^{-1}$]')
plt.show()
Looks like those last two points should be upper limits. Let's plot those instead.
In [30]:
plt.loglog(E, (E**2)*dnde, 'k--')
plt.loglog(E, (E**2)*dnde_hi, 'k')
plt.loglog(E, (E**2)*dnde_lo, 'k')
plt.errorbar(sed['e_ctr'][:-2],
             sed['e2dnde'][:-2],
             yerr=sed['e2dnde_err'][:-2], fmt='o')
plt.errorbar(np.array(sed['e_ctr'][-2:]),
             sed['e2dnde_ul95'][-2:], yerr=0.2*sed['e2dnde_ul95'][-2:],
             fmt='o', uplims=True)
plt.xlabel('E [MeV]')
plt.ylabel(r'E$^{2}$ dN/dE [MeV cm$^{-2}$ s$^{-1}$]')
plt.show()