(This is based on a Jupyter notebook, available in the XID+ package, which can be run and edited interactively.)
This notebook provides some example code for basic analysis of the XID+ outputs.
Import required modules
In [1]:
import pylab as plt
%matplotlib inline
import numpy as np
import xidplus
from xidplus import moc_routines
output_folder='./'
Load up posterior output from XID+
In [2]:
priors,posterior=xidplus.load('test.pkl')
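Before going further, it can be useful to check what has been loaded. The cell below is a minimal sketch; it assumes the same attribute and sample names used later in this notebook (priors[0].sra for the source positions and posterior.samples['src_f'] for the flux samples).
In [ ]:
# Quick sanity check on the loaded prior and posterior objects
print(len(priors), 'bands fitted')
print(len(priors[0].sra), 'sources in the prior')
print(posterior.samples['src_f'].shape, 'flux samples: (samples, bands, sources)')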
In order to assess how good our fit is, it is often useful to look at the original map. There is a routine within XID+ that makes the original FITS map from the data stored within the prior class. Let's use it to make the SPIRE maps for the region we have fit.
In [3]:
figs,fig=xidplus.plot_map(priors)
We can take each sample from the posterior and use it to make a replicated map, simulating both the instrumental noise and the estimated confusion noise. You can think of these maps as all the possible maps that are allowed by the data.
NOTE: You will require the FFmpeg library to be installed to run the movie.
In [4]:
xidplus.replicated_map_movie(priors,posterior,50)
Out[4]:
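If FFmpeg is not available, a single replicated map can still be inspected as a static image. The cell below is a minimal sketch: it assumes rep_maps[0] is indexed as [pixel, sample] and that make_fits_image accepts a 1D array of pixel values, in the same way it is used with priors[0].sim later in this notebook.
In [ ]:
# View one replicated 250 micron map as a static image (a minimal sketch)
import aplpy
from xidplus import posterior_maps as postmaps
rep_maps=postmaps.replicated_maps(priors,posterior)
rep_250=aplpy.FITSFigure(postmaps.make_fits_image(priors[0],rep_maps[0][:,0])[1])
rep_250.show_colorscale()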
In [5]:
#Select the two sources for which to plot the joint distribution
s1=2
s2=18
Sources 2 and 18 are close together, so let's look at their joint posterior probability distribution. The code below is an example of how to plot the joint posterior probability density function for the $250 \mathrm{\mu m}$ flux, with an inset of the real map.
In [6]:
import aplpy
import seaborn as sns
sns.set(color_codes=True)
import pandas as pd
sns.set_style("white")
import xidplus.posterior_maps as postmaps
# Joint posterior PDF of the 250 micron fluxes for the two selected sources
labels=[r'Source 1 $250\mathrm{\mu m}$ flux (mJy)',r'Source 2 $250\mathrm{\mu m}$ flux (mJy)']
df = pd.DataFrame(posterior.samples['src_f'][:,0,[s1,s2]],columns=labels)
g = sns.PairGrid(df,size=5)
g.map_diag(sns.kdeplot,c='Red')
g.map_lower(sns.kdeplot, cmap="Reds",alpha=0.8,n_levels=10,normed=True, shade=True,shade_lowest=False)
g.set(ylim=(0,40))
g.set(xlim=(0,40))
g.axes[0,1].spines['bottom'].set_color('white')
g.axes[0,1].spines['left'].set_color('white')
# Inset: the real 250 micron map, with the two selected sources labelled
cmap=sns.cubehelix_palette(8, start=.5, rot=-.75,as_cmap=True)
real_250 = aplpy.FITSFigure(postmaps.make_fits_image(priors[0],priors[0].sim)[1],figure=g.fig,subplot=(2,2,2))
real_250.show_colorscale(cmap=cmap)
real_250.show_markers(priors[0].sra, priors[0].sdec, edgecolor='black', facecolor='black',
marker='o', s=40, alpha=0.5)
real_250.recenter(priors[0].sra[s1], priors[0].sdec[s1], radius=0.01)
real_250.add_label(priors[0].sra[s1], priors[0].sdec[s1]+0.0005, 1, relative=False,size=20,color='white')
real_250.add_label(priors[0].sra[s2], priors[0].sdec[s2]-0.0010, 2, relative=False,size=20,color='white')
real_250.tick_labels.set_xformat('dd.dd')
real_250.tick_labels.set_yformat('dd.dd')
real_250.add_colorbar(axis_label_text=r'$250\mathrm{\mu m}$ flux (mJy)')
When examining goodness of fit, the typical method is to look at the residuals, i.e. $\frac{\mathrm{data} - \mathrm{model}}{\sigma}$. Because we have a distribution of replicated maps, $y^{rep}$, we can do this in a more probabilistic way using posterior predictive checks. For more information on posterior predictive checks, Gelman et al. 1996 is a good starting point.
In our case, the best way to carry out posterior predictive checks is to think about one pixel. We can look at where the real flux value for that pixel lies in relation to the distribution from $y^{rep}$.
In [12]:
from xidplus import posterior_maps as postmaps
rep_maps=postmaps.replicated_maps(priors,posterior)
import matplotlib as mpl
sns.set_style("white")
fig=plt.figure(figsize=(5,5))
# This is the colormap I'd like to use.
cm = sns.diverging_palette(220, 20, as_cmap=True)
# Get the histogram of replicated values for this pixel
Y,X = np.histogram(rep_maps[0][20,:], 25, density=True)
#C = [cm(((x-X.min())/x_span)) for x in X]
C = [cm(((((x-np.mean(rep_maps[0][20,:]))/np.std(rep_maps[0][20,:]))+6)/12.0)) for x in X]
plt.bar(X[:-1],Y,color=C,width=X[1]-X[0])
plt.xlabel('Pixel Flux mJy')
plt.ylabel('p(pixel flux)')
plt.axvline(5, linestyle='--')
plt.axvline(-10,linestyle=':')
plt.annotate('flux in map that \n model cannot explain',xy=(5, 0.01), xycoords='data',
xytext=(5, 0.1), textcoords='data',rotation='vertical',size='small')
plt.annotate('too much flux in model \n compared to map',xy=(-10, 0.01), xycoords='data',
xytext=(-10, 0.1), textcoords='data',rotation='vertical',size='small')
#ax1 = fig.add_axes([0.05, 0.80, 0.9, 0.15])
ax1 = fig.add_axes([0.95, 0.1, 0.05, 0.8])
norm = mpl.colors.Normalize(vmin=-6, vmax=6)
cb1 = mpl.colorbar.ColorbarBase(ax1, cmap=cm,
norm=norm,
orientation='vertical')
cb1.set_label(r'$\sigma$')
We can calculate the fraction of $y^{rep}$ samples above and below the real map value. This is often referred to as the Bayesian p-value, and it tells us the probability of drawing the real pixel value from our model, which has been inferred from the data. In other words, it tells us whether the model is inconsistent with the data, given the uncertainties in the parameters and the data.
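As an illustration, the same calculation can be done by hand for a single pixel. The cell below is a minimal sketch: it assumes rep_maps from the cell above is indexed as [pixel, sample], and that priors[0].sim holds the real map values for the fitted pixels, as it does when making the FITS image earlier in this notebook.
In [ ]:
# Hand-rolled Bayesian p-value for one pixel (a minimal sketch)
from scipy.stats import norm
ipix=20  # same pixel as in the histogram above
real_value=priors[0].sim[ipix]
# fraction of replicated maps with a lower pixel value than the real map
p_value=np.mean(rep_maps[0][ipix,:] < real_value)
# equivalent Gaussian 'sigma' level (clipped to avoid infinities)
sigma_level=norm.ppf(np.clip(p_value,1e-6,1.0-1e-6))
print(p_value, sigma_level)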
We can convert this to a typical '$\sigma$' level and create map versions of these Bayesian p-values:
In [13]:
figs, fig=xidplus.plot_Bayes_pval_map(priors, posterior)
Red indicates the flux value in the real map is higher than our model thinks is possible. This could indicate there is a source there that is not in our model. Blue indicates the flux in the real map is lower than in our model, which suggests either a very low density region or that too much flux has been assigned to one of the sources.
Finally, we can convert the posterior into a SPIRE catalogue and write it out as a FITS table.
In [14]:
import xidplus.catalogue as cat
In [16]:
SPIRE_cat=cat.create_SPIRE_cat(posterior,priors[0],priors[1],priors[2])
SPIRE_cat.writeto('test.fits',overwrite=True)
In [20]:
from astropy.table import Table
In [21]:
catalogue=Table.read('test.fits')
In [22]:
catalogue
Out[22]:
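The exact columns in the output depend on the XID+ version, so a quick way to see what the catalogue contains is to inspect the column names:
In [ ]:
# Inspect the catalogue contents (column names depend on the XID+ version)
print(catalogue.colnames)
print(len(catalogue), 'sources')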