Week 2 Tutorial

Simulating X-ray Image Data

In this tutorial we will:

  • Learn about the properties of images taken with X-ray telescopes
  • Generate a mock X-ray image dataset

For homework, you will:

  • Carry out a simple inference of some model parameters from some X-ray image data.

Reminder!

After pulling down the tutorial notebook, immediately make a copy. Then do not modify the original. Do your work in the copy. This will prevent the possibility of git conflicts should the version-controlled file change at any point in the future. (The same exhortation applies to homeworks.)

Import things


In [ ]:
import astropy.io.fits as pyfits
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from astropy.visualization import LogStretch
logstretch = LogStretch()
from io import StringIO   # StringIO behaves like a file object
import scipy.stats

class SolutionMissingError(Exception):
    def __init__(self):
        Exception.__init__(self,"You need to complete the solution for this code to work!")
def REPLACE_WITH_YOUR_SOLUTION():
    raise SolutionMissingError
def REMOVE_THIS_LINE():
    raise SolutionMissingError

Introduction to an X-ray CCD Observation

This part introduces a data set that we'll use this week in the Tutorial and Homework, and that we'll see again in the future. There's a fair bit of domain information here, but it's useful stuff to see if you haven't before (even if you're not an X-ray observer).

Modern X-ray CCDs are technologically similar to the CCDs used in optical astronomy. The main practical difference is that X-ray photons are rarer and their energies much higher.

This means that:

  • If we read out the CCD every few seconds, only for exceptionally bright sources will we ever have $>1$ photon hit a given pixel.
  • We no longer get 1 electron count per photon. Instead the number of counts is related to the photon's energy, which means that these imaging devices are actually imaging spectrometers.
  • When we say "counts" in this context, we mean "pixel activation events" rather than number of electrons trapped, so that (as in optical astronomy) we're referring to the number of photons detected (or other events that look like photons to the detector).

Let's look at some processed data from XMM-Newton for galaxy cluster Abell 1835.

Here the raw "event list" has been processed to form an image, so the spectral information has been discarded.

XMM actually has 3 CCD cameras, but we'll just work with 1 for simplicity, and with just one of the available observations.

We'll still need 2 files:

  • The image, in units of counts
  • The exposure map (units of seconds), which accounts for the exposure time and the variation in effective collecting area across the field due to vignetting
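As a toy illustration of how the two files relate (all numbers here are made up): dividing the counts image by the exposure map gives a count-rate image, taking care where the exposure is zero (e.g. off-chip pixels).

```python
import numpy as np

# Toy illustration (hypothetical numbers): the exposure map converts
# between counts and count rate per pixel.
counts = np.array([[12, 0], [3, 30]])                 # counts in each pixel
expmap = np.array([[6000.0, 0.0], [3000.0, 6000.0]])  # seconds; 0 = no exposure

# Count rate (counts/s), avoiding division by zero where the exposure is zero:
rate = np.divide(counts, expmap, out=np.zeros_like(expmap), where=expmap > 0)
```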

There's a nice web interface that allows one to search the public archives for data, but to save time, just download the particular image and exposure map here:

Details: this is an image produced from 1-3 keV events captured by the MOS2 camera in XMM's first observation of A1835, way back in 2001.


In [ ]:
datadir = './' # or whatever - path to where you put the downloaded files
imagefile = datadir + 'P0098010101M2U009IMAGE_3000.FTZ'
expmapfile = datadir + 'P0098010101M2U009EXPMAP3000.FTZ'

imfits = pyfits.open(imagefile)
exfits = pyfits.open(expmapfile)

Let's see what we've got:


In [ ]:
imfits.info()
exfits.info()

imfits is a FITS object, containing multiple data structures. The image itself is an array of integer type, and size 648x648 pixels, stored in the primary "header data unit" or HDU. exfits contains only the exposure map.


In [ ]:
im = imfits[0].data
ex = exfits[0].data
print(im.shape, im.dtype)
print(ex.shape, ex.dtype)

Note: If we needed im to be floating point for some reason, we would have to cast it, as in im = imfits[0].data.astype(np.float32).

Let's have a look at the image and exposure map. It's often helpful to stretch images on a logarithmic scale because some sources are much brighter than others. The exposure map varies much less, so a linear scale works better in that case.


In [ ]:
plt.rcParams['figure.figsize'] = (10.0, 10.0)
plt.imshow(logstretch(im), cmap='gray', origin='lower');

In [ ]:
plt.imshow(ex, cmap='gray', origin='lower');

Some details: FITS images (and the arrays we read from them) are indexed according to the ancient Fortran convention, with the first index corresponding to y (line) and the second index corresponding to x (sample). pyplot knows about this, although we still need to use origin='lower' to display the image the right way up.
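A tiny (hypothetical) example of that indexing convention: the first numpy index selects the line (y), the second the sample (x).

```python
import numpy as np

# A 2x3 toy array: numpy indexes it as arr[y, x] (line, then sample),
# matching the convention described above.
arr = np.arange(6).reshape(2, 3)   # shape is (ny, nx) = (2, 3)
ny, nx = arr.shape

# The pixel at x=2, y=1 (zero-based array indices):
pixel = arr[1, 2]
```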

In the image, we can see:

  1. Galaxy cluster Abell 1835 (the big blob in the center)
  2. Various other sources (smaller blobs). These are point-like sources - most likely AGN - that have been smeared out by the point spread function
  3. A roughly uniform background, consisting of unresolved AGN, diffuse X-rays from the Galactic halo and local bubble, and events due to particles (solar wind protons and cosmic rays) interacting with the CCD

Whether we're interested in the cluster or the other sources, it's good to have a catalog of what's where. Below, we define an array of circular (x y radius) regions roughly covering the neighborhoods of the non-cluster sources. We could use this, e.g., to mask them when studying the cluster (and will do so in a couple of weeks).

Positions are given in image coordinates, i.e. pixels counted from 1 (not from 0). There are far more sophisticated ways of handling this type of region information, but we're lazily re-using some old code in this case.


In [ ]:
regions_string = \
u"""232 399 9.6640149
188 418 6.6794089
362 474 7.8936824
336 417 5.8364482
381 359 3.8626665
391 418 5.5885783
398 294 3.538914
417 209 5.2474726
271 216 5.3269609
300 212 6.0974003
286 162 3.7078355
345 153 4.8141418
168 361 5.6344116
197 248 4.6760734
277 234 5.0308471
241 212 4.1267598
251 379 4.4363759
310 413 2.5081459
460 287 5.9048854
442 353 4.6259039
288 268 4.4204645
148 317 4.7704631
151 286 7.9281045
223 239 5.561259
490 406 4.0450217
481 318 4.7402437
"""
regions = np.loadtxt(StringIO(regions_string))
regions
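To see how one of these regions could be used as a mask, here is a self-contained sketch on a toy grid (the region values below are hypothetical, not taken from the table above). Note the subtraction of 1 to go from image coordinates to zero-based array indices.

```python
import numpy as np

# Sketch (toy grid): mask pixels inside one circular region.
# Region positions are in image coordinates (counted from 1), so we
# subtract 1 to convert to zero-based array indices.
x0_img, y0_img, radius = 4.0, 3.0, 1.5    # hypothetical (x y radius) entry

ny, nx = 6, 8
y_idx, x_idx = np.mgrid[0:ny, 0:nx]       # zero-based pixel index grids
dist2 = (x_idx - (x0_img - 1))**2 + (y_idx - (y0_img - 1))**2
mask = dist2 <= radius**2                 # True for pixels inside the region
```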

It's often useful to define arrays of the x and y (image) coordinates of each pixel, so that calculations later (e.g. for distance from a given point) can be done using numpy's array operations. In other words, the x array would just look like a horizontal gradient, and the y array would be a vertical gradient. We'll let you fill in the details.


In [ ]:
try:
    exec(open('Solution/xyramps.py').read())
except IOError:
    ny, nx = im.shape  # remember: the first array index is y
    x = REPLACE_WITH_YOUR_SOLUTION()
    y = REPLACE_WITH_YOUR_SOLUTION()

# plot the x and y "images" as a sanity check
fig, ax = plt.subplots(1,2)
ax[0].imshow(x, cmap='gray', origin='lower')
ax[1].imshow(y, cmap='gray', origin='lower');
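If it helps to see the idea on a small scale, here is what such coordinate "ramps" look like on a toy 3x4 grid (np.meshgrid is one way to build them; the choice for the full image is left to you):

```python
import numpy as np

# Toy illustration of coordinate ramps on a 3x4 grid (ny=3, nx=4).
# Note the resulting arrays have shape (ny, nx), i.e. (y, x) ordering.
ny, nx = 3, 4
x, y = np.meshgrid(np.arange(nx), np.arange(ny))

# Each row of x runs 0..nx-1 (horizontal gradient);
# each column of y runs 0..ny-1 (vertical gradient).
```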

Generating a Mock X-ray Dataset

Suppose we want to model the X-ray image data introduced above. A good first step is to try and make a simulated dataset that has the same statistical properties; we can then use that to verify that our inferences are accurate in the case where our model is correct.

To keep things simple for this tutorial, we'll focus on a single AGN in the field - the one in regions[n] (your choice of n).

Let's start by extracting a small "postage stamp" image centered on our target AGN, and a corresponding exposure map. Feel free to tweak the center or size of the postage stamp, or even choose a different source if you want to. For completeness, also extract postage stamp x and y arrays.

Don't forget the differences between array indices and image coordinates (see above):

  • zero- vs one-based indexing
  • order of x and y axes
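As a sketch of how those two conversions might look in practice (all numbers here are hypothetical, and the stamp half-width is an arbitrary choice):

```python
# Sketch: converting an (x, y) image-coordinate position (counted from 1)
# into array indices for slicing out a postage stamp.
x_img, y_img = 232.0, 399.0    # e.g. a regions[n] position (hypothetical)
half = 16                      # half-width of the stamp, in pixels

ix = int(round(x_img)) - 1     # zero-based column (x) index
iy = int(round(y_img)) - 1     # zero-based row (y) index

# Array slicing is [y, x] order, e.g.:
# stamp = im[iy-half:iy+half+1, ix-half:ix+half+1]
```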

In [ ]:
try:
    exec(open('Solution/get_stamps.py').read())
except IOError:
    small_im = REPLACE_WITH_YOUR_SOLUTION()
    small_ex = REPLACE_WITH_YOUR_SOLUTION()
    small_x = REPLACE_WITH_YOUR_SOLUTION()
    small_y = REPLACE_WITH_YOUR_SOLUTION()

In [ ]:
# take a look
plt.rcParams['figure.figsize'] = (10.0, 10.0)
fig, ax = plt.subplots(1,2)
ax[0].imshow(small_im, cmap='gray', origin='lower')
ax[1].imshow(small_ex, cmap='gray', origin='lower');

Compare this with the complete image above as a sanity check. If you don't see a clear source, something is wrong!

Modeling the data

In your postage stamp, there's a real point-like source whose image has been smeared out by the telescope's point spread function (PSF), and a roughly uniform background. We'll assume that the PSF is a symmetric 2D Gaussian for simplicity (in real life, it isn't).

The position and flux of the AGN, and the background surface brightness, are all unknown parameters of this model. The width of the PSF could potentially be another (the PSF varies quite a bit over the focal plane, and I wouldn't want to guess what it is at the particular location of your source). Let's ignore the effective area of the telescope for this problem (essentially just a re-definition of what we mean by "flux").

Draw a PGM that describes this model for the data set. Recall that we looked at a similar (but simpler) scenario when covering Bayes' Theorem.

Each circular node in your PGM represents a PDF that we need to draw samples from when generating mock data. Write down the expression for your PGM's factorization of the joint PDF for all parameters and all data.

Generate a mock dataset and visualize it

You should now have all the ingredients to write a function that returns a mock X-ray postage stamp image (complete with realistic noise), given an exposure map and your assumed model, for a given set of model parameter values.

Our suggested solution is divided into two functions: one to draw parameter values, and one to generate mock data given those parameters. That way you have the option of specifying parameter values by hand, to see whether you can produce a realistic image manually.

Note that, in general, we would need to convolve our model of the sky with the PSF in order to produce a mock observation. Since our model consists of only a uniform component and a point-like component, the convolution operation is analytic, so there's no need to code it explicitly.
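To make the analytic-convolution point concrete, here is a self-contained toy sketch (all parameter values are made up, and this is not meant to be the tutorial's solution): a point source seen through a symmetric Gaussian PSF is just a Gaussian profile, and a uniform background convolved with any normalized PSF stays uniform.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy sketch with hypothetical parameter values.
ny, nx = 32, 32
x, y = np.meshgrid(np.arange(nx), np.arange(ny))

x0, y0 = 15.0, 16.0      # source position (pixels)
F0 = 0.05                # source flux (counts/s)
b = 1e-4                 # background surface brightness (counts/s/pixel)
sigma = 2.0              # PSF width (pixels)

# Model flux per pixel: uniform background plus the PSF-convolved point
# source, which is just a normalized 2D Gaussian profile.
psf = np.exp(-0.5 * ((x - x0)**2 + (y - y0)**2) / sigma**2) / (2 * np.pi * sigma**2)
F = b + F0 * psf

# Predicted counts for a (here uniform) exposure time, then Poisson noise:
exptime = 6000.0
mu = F * exptime
N = rng.poisson(mu)
```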


In [ ]:
try:
    exec(open('Solution/mock_image.py').read())
except IOError:
    REMOVE_THIS_LINE()
    def draw_parameters():
        """
        Returns
        -------
        pars: dict
            Dictionary of model parameter values
        """
        pars = {}
        
        # Specify PDF for each parameter and draw a random sample from it:
        pars['x_pos'] = REPLACE_WITH_YOUR_SOLUTION()
        # ... and the rest
        
        return pars
    
    def generate_mock_image(exptime, xs, ys, pars=None):
        """
        Parameters
        ----------
        exptime: array, float 
            Exposure map, giving exposure time (in seconds) for each pixel in the data image
        xs: array, int 
            Array holding the x coordinate of each pixel in the image
        ys: array, int 
            Array holding the y coordinate of each pixel in the image
        pars: dict, float
            Parameters describing the AGN brightness and PSF
        
        Returns
        -------
        N: array, int
            Simulated data image, in counts per pixel
        """
        # Set up a mock data image with the same shape as the exposure map, but with zero counts:
        N = 0 * exptime.astype(int)
    
        # Draw parameters, if they are not passed in:
        if pars is None:
            pars = draw_parameters()
    
        # Use the parameters to realize an image of the model flux F, with the same shape as N.
        F = REPLACE_WITH_YOUR_SOLUTION()
    
        # Convert to an image of predicted counts, again with the same shape:
        mu = REPLACE_WITH_YOUR_SOLUTION()
    
        # Draw pixel counts from the sampling distribution:
        N = REPLACE_WITH_YOUR_SOLUTION()
        
        return N

Here's some (complete) code to display your mock data alongside the observation:


In [ ]:
def compare_real_and_simulated(data, mock):
    """
    Parameters
    ----------
    data: array, int 
        Observed data image
    mock: array, int
        Simulated data image
        
    Returns
    -------
    fig: matplotlib figure
    """
    # Make sure images are shown on the same brightness scale:
    vmin, vmax = 0, np.max(data)
    
    # Start figure and plot images:
    plt.rcParams['figure.figsize'] = (10.0, 10.0)
    fig, ax = plt.subplots(1,3)
    ax[0].imshow(data, cmap='gray', origin='lower', vmin=vmin, vmax=vmax)
    ax[0].set_title("Observed")
    ax[1].imshow(mock, cmap='gray', origin='lower', vmin=vmin, vmax=vmax)
    ax[1].set_title("Simulated")
    ax[2].imshow(mock-data, cmap='gray', origin='lower', vmin=-vmax, vmax=vmax)
    ax[2].set_title("Difference")
    
    return fig

Play

Get the code above working, then mess around with setting parameter values by hand until you can produce mock data that look vaguely like the real thing. (This is a pale imitation of the inference algorithms we will study soon.)

Then see what mocks produced by generating parameter values from the distributions you provided look like.


In [ ]:
# call generate_mock_image with specified parameter values
try:
    exec(open('Solution/try_some_parameters.py').read())
except IOError:
    my_pars = REPLACE_WITH_YOUR_SOLUTION()
    
mock = generate_mock_image(small_ex, small_x, small_y, my_pars)
compare_real_and_simulated(small_im, mock);

In [ ]:
# call generate_mock_image without specified parameter values
mock = generate_mock_image(small_ex, small_x, small_y)
compare_real_and_simulated(small_im, mock);