XID+ Example Run Script

(This is based on a Jupyter notebook, available in the XID+ package and can be interactively run and edited)

XID+ is a probababilistic deblender for confusion dominated maps. It is designed to:

  1. Use a MCMC based approach to get FULL posterior probability distribution on flux
  2. Provide a natural framework to introduce additional prior information
  3. Allows more representative estimation of source flux density uncertainties
  4. Provides a platform for doing science with the maps (e.g XID+ Hierarchical stacking, Luminosity function from the map etc)

Cross-identification tends to be done with catalogues, then science with the matched catalogues.

XID+ takes a different philosophy. Catalogues are a form of data compression. OK in some cases, not so much in others, i.e. confused images: catalogue compression loses correlation information. Ideally, science should be done without compression.

XID+ provides a framework to cross identify galaxies we know about in different maps, with the idea that it can be extended to do science with the maps!!

Philosophy:

  • build a probabilistic generative model for the SPIRE maps
  • Infer model on SPIRE maps

Bayes Theorem

$p(\mathbf{f}|\mathbf{d}) \propto p(\mathbf{d}|\mathbf{f}) \times p(\mathbf{f})$

In order to carry out Bayesian inference, we need a model to carry out inference on.

For the SPIRE maps, our model is quite simple, with likelihood defined as: $L = p(\mathbf{d}|\mathbf{f}) \propto |\mathbf{N_d}|^{-1/2} \exp\big\{ -\frac{1}{2}(\mathbf{d}-\mathbf{Af})^T\mathbf{N_d}^{-1}(\mathbf{d}-\mathbf{Af})\big\}$

where: $\mathbf{N_{d,ii}} =\sigma_{inst.,ii}^2+\sigma_{conf.}^2$

Simplest model for XID+ assumes following:

  • All sources are known and have positive flux (fi)
  • A global background (B) contributes to all pixels
  • PRF is fixed and known
  • Confusion noise is constant and not correlated across pixels

Because we are getting the joint probability distribution, our model is generative i.e. given parameters, we generate data and vica-versa

Compared to discriminative model (i.e. neural network), which only obtains conditional probability distribution i.e. Neural network, give inputs, get output. Can't go other way'

Generative model is full probabilistic model. Allows more complex relationships between observed and target variables

XID+ SPIRE

XID+ applied to GALFORM simulation of COSMOS field

  • SAM simulation (with dust) ran through SMAP pipeline_ similar depth and size as COSMOS
  • Use galaxies with an observed 100 micron flux of gt. $50\mathbf{\mu Jy}$. Gives 64823 sources
  • Uninformative prior: uniform $0 - 10{^3} \mathbf{mJy}$

Import required modules


In [1]:
from astropy.io import ascii, fits
import pylab as plt
%matplotlib inline
from astropy import wcs


import numpy as np
import xidplus
from xidplus import moc_routines
import pickle

Set image and catalogue filenames


In [2]:
xidplus.__path__[0]


Out[2]:
'/Users/pdh21/Work/Astro/XID_plus/xidplus'

In [3]:
#Folder containing maps
imfolder=xidplus.__path__[0]+'/../test_files/'

pswfits=imfolder+'cosmos_itermap_lacey_07012015_simulated_observation_w_noise_PSW_hipe.fits.gz'#SPIRE 250 map
pmwfits=imfolder+'cosmos_itermap_lacey_07012015_simulated_observation_w_noise_PMW_hipe.fits.gz'#SPIRE 350 map
plwfits=imfolder+'cosmos_itermap_lacey_07012015_simulated_observation_w_noise_PLW_hipe.fits.gz'#SPIRE 500 map


#Folder containing prior input catalogue
catfolder=xidplus.__path__[0]+'/../test_files/'
#prior catalogue
prior_cat='lacey_07012015_MillGas.ALLVOLS_cat_PSW_COSMOS_test.fits'


#output folder
output_folder='./'

Load in images, noise maps, header info and WCS information


In [4]:
#-----250-------------
hdulist = fits.open(pswfits)
im250phdu=hdulist[0].header
im250hdu=hdulist[1].header

im250=hdulist[1].data*1.0E3 #convert to mJy
nim250=hdulist[2].data*1.0E3 #convert to mJy
w_250 = wcs.WCS(hdulist[1].header)
pixsize250=3600.0*w_250.wcs.cd[1,1] #pixel size (in arcseconds)
hdulist.close()
#-----350-------------
hdulist = fits.open(pmwfits)
im350phdu=hdulist[0].header
im350hdu=hdulist[1].header

im350=hdulist[1].data*1.0E3 #convert to mJy
nim350=hdulist[2].data*1.0E3 #convert to mJy
w_350 = wcs.WCS(hdulist[1].header)
pixsize350=3600.0*w_350.wcs.cd[1,1] #pixel size (in arcseconds)
hdulist.close()
#-----500-------------
hdulist = fits.open(plwfits)
im500phdu=hdulist[0].header
im500hdu=hdulist[1].header 
im500=hdulist[1].data*1.0E3 #convert to mJy
nim500=hdulist[2].data*1.0E3 #convert to mJy
w_500 = wcs.WCS(hdulist[1].header)
pixsize500=3600.0*w_500.wcs.cd[1,1] #pixel size (in arcseconds)
hdulist.close()

Load in catalogue you want to fit (and make any cuts)


In [5]:
hdulist = fits.open(catfolder+prior_cat)
fcat=hdulist[1].data
hdulist.close()
inra=fcat['RA']
indec=fcat['DEC']
# select only sources with 100micron flux greater than 50 microJy
sgood=fcat['S100']>0.050
inra=inra[sgood]
indec=indec[sgood]

XID+ uses Multi Order Coverage (MOC) maps for cutting down maps and catalogues so they cover the same area. It can also take in MOCs as selection functions to carry out additional cuts. Lets use the python module pymoc to create a MOC, centered on a specific position we are interested in. We will use a HEALPix order of 15 (the resolution: higher order means higher resolution), have a radius of 100 arcseconds centered around an R.A. of 150.74 degrees and Declination of 2.03 degrees.


In [6]:
from astropy.coordinates import SkyCoord
from astropy import units as u
c = SkyCoord(ra=[150.74]*u.degree, dec=[2.03]*u.degree)  
import pymoc
moc=pymoc.util.catalog.catalog_to_moc(c,100,15)

XID+ is built around two python classes. A prior and posterior class. There should be a prior class for each map being fitted. It is initiated with a map, noise map, primary header and map header and can be set with a MOC. It also requires an input prior catalogue and point spread function.


In [7]:
#---prior250--------
prior250=xidplus.prior(im250,nim250,im250phdu,im250hdu, moc=moc)#Initialise with map, uncertianty map, wcs info and primary header
prior250.prior_cat(inra,indec,prior_cat)#Set input catalogue
prior250.prior_bkg(-5.0,5)#Set prior on background (assumes Gaussian pdf with mu and sigma)
#---prior350--------
prior350=xidplus.prior(im350,nim350,im350phdu,im350hdu, moc=moc)
prior350.prior_cat(inra,indec,prior_cat)
prior350.prior_bkg(-5.0,5)

#---prior500--------
prior500=xidplus.prior(im500,nim500,im500phdu,im500hdu, moc=moc)
prior500.prior_cat(inra,indec,prior_cat)
prior500.prior_bkg(-5.0,5)


WARNING: AstropyDeprecationWarning: 
Private attributes "_naxis1" and "_naxis2" have been deprecated since v3.1.
Instead use the "pixel_shape" property which returns a list of NAXISj keyword values.
 [astropy.wcs.wcs]
WARNING: AstropyDeprecationWarning: 
Private attributes "_naxis1" and "_naxis2" have been deprecated since v3.1.
Instead use the "pixel_shape" property which returns a list of NAXISj keyword values.
 [astropy.wcs.wcs]

Set PRF. For SPIRE, the PRF can be assumed to be Gaussian with a FWHM of 18.15, 25.15, 36.3 '' for 250, 350 and 500 $\mathrm{\mu m}$ respectively. Lets use the astropy module to construct a Gaussian PRF and assign it to the three XID+ prior classes.


In [8]:
#pixsize array (size of pixels in arcseconds)
pixsize=np.array([pixsize250,pixsize350,pixsize500])
#point response function for the three bands
prfsize=np.array([18.15,25.15,36.3])
#use Gaussian2DKernel to create prf (requires stddev rather than fwhm hence pfwhm/2.355)
from astropy.convolution import Gaussian2DKernel

##---------fit using Gaussian beam-----------------------
prf250=Gaussian2DKernel(prfsize[0]/2.355,x_size=101,y_size=101)
prf250.normalize(mode='peak')
prf350=Gaussian2DKernel(prfsize[1]/2.355,x_size=101,y_size=101)
prf350.normalize(mode='peak')
prf500=Gaussian2DKernel(prfsize[2]/2.355,x_size=101,y_size=101)
prf500.normalize(mode='peak')

pind250=np.arange(0,101,1)*1.0/pixsize[0] #get 250 scale in terms of pixel scale of map
pind350=np.arange(0,101,1)*1.0/pixsize[1] #get 350 scale in terms of pixel scale of map
pind500=np.arange(0,101,1)*1.0/pixsize[2] #get 500 scale in terms of pixel scale of map

prior250.set_prf(prf250.array,pind250,pind250)#requires PRF as 2d grid, and x and y bins for grid (in pixel scale)
prior350.set_prf(prf350.array,pind350,pind350)
prior500.set_prf(prf500.array,pind500,pind500)

In [9]:
print('fitting '+ str(prior250.nsrc)+' sources \n')
print('using ' +  str(prior250.snpix)+', '+ str(prior250.snpix)+' and '+ str(prior500.snpix)+' pixels')


fitting 51 sources 

using 870, 870 and 219 pixels

Before fitting, the prior classes need to take the PRF and calculate how much each source contributes to each pixel. This process provides what we call a pointing matrix. Lets calculate the pointing matrix for each prior class


In [10]:
prior250.get_pointing_matrix()
prior350.get_pointing_matrix()
prior500.get_pointing_matrix()

Default prior on flux is a uniform distribution, with a minimum and maximum of 0.00 and 1000.0 $\mathrm{mJy}$ respectively for each source. running the function upper_lim _map resets the upper limit to the maximum flux value (plus a 5 sigma Background value) found in the map in which the source makes a contribution to.


In [11]:
prior250.upper_lim_map()
prior350.upper_lim_map()
prior500.upper_lim_map()

Now fit using the XID+ interface to pystan


In [12]:
from xidplus.stan_fit import SPIRE
fit=SPIRE.all_bands(prior250,prior350,prior500,iter=1000)


/XID+SPIRE found. Reusing

Initialise the posterior class with the fit object from pystan, and save alongside the prior classes


In [13]:
posterior=xidplus.posterior_stan(fit,[prior250,prior350,prior500])
xidplus.save([prior250,prior350,prior500],posterior,'test')

Alternatively, you can fit with the pyro backend.


In [20]:
%%time
from xidplus.pyro_fit import SPIRE
fit_pyro=SPIRE.all_bands([prior250,prior350,prior500],n_steps=10000,lr=0.001,sub=0.1)


ELBO loss: 1443524.1566124088
ELBO loss: 910844.4371568048
ELBO loss: 853514.7015728
ELBO loss: 958341.7346083529
ELBO loss: 942454.5074825891
ELBO loss: 774617.8222908534
ELBO loss: 650073.5492952218
ELBO loss: 519843.6504241303
ELBO loss: 672951.3874382508
ELBO loss: 530477.743343812
ELBO loss: 446401.50917499466
ELBO loss: 443207.7221103374
ELBO loss: 475020.51042302605
ELBO loss: 449308.8457927352
ELBO loss: 424223.396891532
ELBO loss: 361269.51020405284
ELBO loss: 371495.4485347897
ELBO loss: 328492.157357247
ELBO loss: 350314.74147871917
ELBO loss: 337240.6088554669
ELBO loss: 355712.2277569153
ELBO loss: 339152.8206738192
ELBO loss: 350310.74018785724
ELBO loss: 323537.846016178
ELBO loss: 324591.3325955458
ELBO loss: 335031.66303323145
ELBO loss: 324931.4920203859
ELBO loss: 320556.1040378223
ELBO loss: 311213.0589962671
ELBO loss: 318993.10559884284
ELBO loss: 303162.7707374462
ELBO loss: 306014.70833055786
ELBO loss: 310616.8348440269
ELBO loss: 321497.3606092724
ELBO loss: 299567.18033542583
ELBO loss: 304569.7697317204
ELBO loss: 314194.7938925505
ELBO loss: 296744.9675352595
ELBO loss: 301772.8370850471
ELBO loss: 306359.2243131147
ELBO loss: 298016.90828388697
ELBO loss: 292658.2475540696
ELBO loss: 295279.049614446
ELBO loss: 291817.73943772545
ELBO loss: 302292.42562063405
ELBO loss: 289392.6286525327
ELBO loss: 286766.81012443814
ELBO loss: 294621.0769002719
ELBO loss: 290040.7563088562
ELBO loss: 297774.0433324089
ELBO loss: 293400.23119536275
ELBO loss: 284750.31063640135
ELBO loss: 292837.90864221164
ELBO loss: 280227.2228185104
ELBO loss: 284509.1174854344
ELBO loss: 295757.24787565775
ELBO loss: 275677.86982623884
ELBO loss: 279346.2010262294
ELBO loss: 287955.7393150086
ELBO loss: 282268.86795325595
ELBO loss: 271330.56835386495
ELBO loss: 282094.66336443584
ELBO loss: 277081.42727055395
ELBO loss: 274209.7925725182
ELBO loss: 273972.16419982724
ELBO loss: 278165.7184229222
ELBO loss: 282543.2833907572
ELBO loss: 279595.4262836536
ELBO loss: 271890.0996185938
ELBO loss: 274840.94998768397
ELBO loss: 276284.9288065788
ELBO loss: 273281.1586626993
ELBO loss: 266612.3436653682
ELBO loss: 267854.5754898314
ELBO loss: 275000.62426360924
ELBO loss: 271225.61342700786
ELBO loss: 264406.83165800216
ELBO loss: 274922.5473209161
ELBO loss: 275326.14097076975
ELBO loss: 273080.06089750736
ELBO loss: 262040.9827019241
ELBO loss: 267040.3990704805
ELBO loss: 271459.3912361185
ELBO loss: 275945.8373264552
ELBO loss: 271047.27244301076
ELBO loss: 270287.6894715075
ELBO loss: 275070.81142739346
ELBO loss: 270338.9430335168
ELBO loss: 271157.1674629421
ELBO loss: 268489.72055991186
ELBO loss: 274697.85961187596
ELBO loss: 272597.9830145637
ELBO loss: 272122.92528689664
ELBO loss: 264597.12308714405
ELBO loss: 264368.4161852362
ELBO loss: 268640.20582452737
ELBO loss: 273767.18329628045
ELBO loss: 271817.74977537146
ELBO loss: 281918.39333307045
ELBO loss: 264147.3673830001
ELBO loss: 268966.61114906374
CPU times: user 7min 13s, sys: 5.66 s, total: 7min 18s
Wall time: 2min 43s

In [21]:
posterior_pyro=xidplus.posterior_pyro(fit_pyro,[prior250,prior350,prior500])
xidplus.save([prior250,prior350,prior500],posterior_pyro,'test_pyro')

In [22]:
plt.semilogy(posterior_pyro.loss_history)


Out[22]:
[<matplotlib.lines.Line2D at 0x13530d588>]