Matched wavelength photometry

Version 0.1

For today's problem, we will perform matched-aperture photometry in 3 bands on multiple galaxies within a rich galaxy cluster. Ultimately, we will be looking for trends in galaxy colors and other properties as a function of cluster radius.

Note - we will use astropy for these tasks, though the use of Source Extractor is more standard within the galaxy community.


By M Alpaslan (NYU) & AA Miller (CIERA/Northwestern & Adler)

Problem 0) Install photutils

If you have not already done so, install the photutils package from the astropy conda channel within your DSFP environment. You will also need the scikit-image package.

conda install -c astropy photutils
conda install scikit-image

In [ ]:
import numpy as np
import pandas as pd
import astropy
from astropy.io import fits
from photutils.aperture import CircularAperture, CircularAnnulus, EllipticalAperture, EllipticalAnnulus
from photutils.segmentation import detect_sources, source_properties
from photutils.detection import detect_threshold
from photutils.centroids import centroid_com
from photutils import aperture_photometry
from photutils.utils import calc_total_error
import matplotlib.pyplot as plt

%matplotlib notebook

Problem 1) Download and Examine the Data

The images for this exercise can be downloaded from here: https://northwestern.box.com/s/x6nzuqtdys3jo1nufvswkx62o44ifa11. Be sure to place the images in the same directory as this notebook (but do not add them to your git repo!).

Before we dive in, here is some background information on the images we will be analyzing: the imaging data and the group information all come from the Galaxy And Mass Assembly (GAMA) survey; and more specifically, its panchromatic data release.

Many of the difficult steps associated with multiband galaxy photometry have already been done for you: GAMA constructs large mosaics of co-added FITS images in 20 bands to measure photometry. The images we will use today are from the g, r, and i mosaics that I (MA) built $\sim$7 years ago. They are built from SDSS observations in those bands, and have all been convolved to a seeing of approximately 2”, background subtracted, and renormalized to a common zeropoint of 30 magnitudes. The group catalogue was done by Aaron Robotham (see https://arxiv.org/abs/1106.1994).

In the downloaded directory there are g, r, and i images of 36 galaxies that all belong to the same cluster. These image cutouts have been centered on the galaxy position, are $\sim$80.7" on a side, and have a pixel scale of 0.339"/pix.

To begin we will focus on a single galaxy, before eventually working on the entire cluster.

Problem 1a

Display the $r$-band image of the galaxy 85698. Use a asinh stretch.


In [ ]:
r_filename = "galaxy_images/85698_sdss_r.fits"
r_data = fits.getdata( # complete

plt.imshow( # complete

plt.colorbar()
plt.tight_layout()

Problem 1b

Roughly how many sources are present in the image?

Hint - an exact count is not required here.

Solution 1b

Write your answer here

Problem 2) Source Detection

Prior to measuring any properties of sources in the image, we must first determine the number of sources present in the image. Source detection is challenging, and there are many different thresholding approaches.

Today, we will streamline this step in order to spend more time focusing on the issues associated with matching photometric measurements across different images. We will use the detect_sources function in photutils to identify objects in our image.

The simplest model assumes that the background is constant over the entire image. Once the background is determined, it can be subtracted from the image to determine high significance "peaks" corresponding to sources. After this week, we have learned that the background isn't so simple, nevertheless we will use the detect_threshold convenience function to estimate a constant background for our images. detect_threshold produces a "detection image" that can be used to estimate the significance of the flux detected in any individual pixel.

Problem 2a

Create a detection threshold image using the detect_threshold function, set the snr parameter to 3.


In [ ]:
threshold = detect_threshold( # complete

Problem 2b

Develop better intuition for the detection image by plotting it side-by-side with the actual image of the field.

Do you notice anything interesting about the threshold image?


In [ ]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(7,4))

ax1.imshow( # complete
ax2.imshow( # complete
fig.tight_layout()

Following this measurement of the background, we can find sources using the detect_sources function. Briefly, this function uses image segmentation to define and assign pixels to sources, which are defined as objects with $N$ connected pixels that are $s$ times brighter than the background (we already set $s = 3$). Read the docs for further details.

Problem 2c

Generate a segmentation image using detect_sources. Keep only sources with $N = 7$ pixels, which is keyword arg npixels in detect_sources.

If you have extra time Come back to this problem and see how changing $N$ affects your results.


In [ ]:
segm = detect_sources( # complete

Problem 2d

Plot the segmentation image side-by-side with the actual image of the field.

Are you concerned or happy with the results?

Hint - no stretch should be applied to the segmentation image.


In [ ]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(7,4))

ax1.imshow(# complete
ax2.imshow(# complete
fig.tight_layout()

Problem 3) Source Centroids and Shapes

Now that we have defined all of the sources in the image, we must determine the centroid for each source (in order to ultimately make some form of photometric measurement). As Dora mentioned earlier in the week, there are many ways to determine the centroid of a given source (e.g., fitting a model, finding the max of the marginalized 1-d distribution, etc). Today we will use the centroid_com function, which calculates the "center of mass" of the 2d image moments to determine the source centroids.

To measure the centroid we want to isolate the source in question, thus we have generated a convenience function to return the extent of each source from its corresponding segmentation image.


In [ ]:
def get_source_extent(segm_data, source_num):
    """
    Determine extent of sources for centroid measurements
    
    Parameters
    ----------
    segm_data : array-like
        Segementation image produced by photutils.segmentation.detect_sources
    
    source_num : int
        The source number from the segmentation image
        
    Returns
    -------
    source_extent : list
        The minimum y, maximum y, minimum x, and maximum x pixel values 
        over which a source is detected
    """
    source_pix = np.where(segm_data == source_num)
    source_extent = [np.min(source_pix[0]), np.max(source_pix[0]), 
                     np.min(source_pix[1]), np.max(source_pix[1])]

    return source_extent

Problem 3a

Measure the centroid for each source detected in the image using the centroid_com function.

Hint - you'll want to start with a subset of pixels containing the source.

Hint 2 - centroids are measured relative to the provided data, you'll need to convert back to "global" pixel values.


In [ ]:
xcentroid = np.zeros_like(np.unique(segm.data)[1:], dtype="float")
ycentroid = np.zeros_like(np.unique(segm.data)[1:], dtype="float")

for source_num in np.unique(segm.data)[1:]:
    source_extent = get_source_extent( # complete
    xc, yc = centroid_com( # complete
    xcentroid[source_num-1], ycentroid[source_num-1] = # complete

Problem 3b

Overplot the derived centroids on the image data as a sanity check for your methodology.


In [ ]:
fig, ax1 = plt.subplots()

ax1.imshow( # complete
ax1.plot( # complete
fig.tight_layout()

With an estimate of the centroid of every source in hand, we now need to determine the ellipse that best describes the galaxies in order to measure their flux. Fortunately, this can be done using the source_properties function within photutils.morphology package.

Briefly, source_properties takes both the data array, and the segmentation image as inputs, and then calculates properties for every source. The list of properties is long (see the attributes list), and for now we only care about the semi-major and semi-minor axes as well as the orientation of the source, all of which are needed to measure the flux in an elliptical aperture [this is a lot easier than trying to fit concentric ellipses, no?].

Problem 3c

Using source_properties to determine $a$, $b$, and the orientation of each source.


In [ ]:
cat = source_properties( # complete
tbl = cat.to_table(columns=['id', 'semimajor_axis_sigma','semiminor_axis_sigma', 'orientation'])

Problem 4) Photometry

We now have all the necessary information to measure the flux in elliptical apertures. The EllipticalAperture function in photutils defines apertures on an image based on input centroids, $a$, $b$, and orientation values.

Problem 4a

Define apertures for the sources that are detected in the image.

Note - the semimajor_axis_sigma reported by source_properties() is the "The 1-sigma standard deviation along the semimajor axis of the 2D Gaussian function that has the same second-order central moments as the source" according to the docs. Thus, be sure to multiple $a$ and $b$ by a factor of 3 in order to capture $\sim$3$\sigma$ of the source flux.

Note to the note - this isn't well motivated, but for the sake of argument assume that this adjustment creates a reasonable aperture.


In [ ]:
positions = # complete

apertures = [EllipticalAperture( # complete
             # complete
             # complete
             # complete

Problem 4b

Overplot your apertures on the sources that have been detected.

Hint - each aperture object has a plot() attribute that can be used to show the aperture for each source.


In [ ]:
fig, ax1 = plt.subplots()

ax1.imshow( # complete
# complete
# complete
fig.tight_layout()

With apertures now defined, we can finally measure the flux of each source. The aperture_photometry function returns the flux (actually counts) in an image for the provided apertures. It takes the image, apertures, and bakground image as arguments.

Note - the background has already been subtracted from these images so we currently do not have an estimate of the full background for these sources.

We will create a background image that is approximately correct (we know this because we know the properties of the SDSS survey and detector). In this case what we are doing is not only incorrect, it's entirely made up and should not be repeated in your own work. Nevertheless, this (bad) approximation is necessary to produce uncertainty estimates.

Execute the cell below to create an uncertainty image to use with the aperture_photometry function.


In [ ]:
bkg = np.random.normal(100, 35, r_data.shape)
uncertainty_img = calc_total_error(r_data, bkg - np.mean(bkg), 1)

Problem 4c

Measure the counts and uncertainty detected from each source within the apertures defined in 4a.

Hint - you will need to loop over each aperture as aperture_photometry does not take multiple apertures of different shapes as a single argument.


In [ ]:
source_cnts = # complete
source_cnts_unc = # complete
for source_num, ap in enumerate(apertures):
    phot = # complete
    source_cnts[source_num] = # complete
    source_cnts_unc[source_num] = # complete

The images have been normalized to a zero point of 30. Thus, we can convert from counts to magnitudes via the following equation:

$$m = 30 - 2.5 \log (\mathrm{counts}).$$

Recall from Dora's talk that the uncertainty of the magnitude measurements can be calculated as:

$$\frac{2.5}{\ln(10)} \frac{\sigma_\mathrm{counts}}{\mathrm{counts}}.$$

Problem 4d

Calculate the magnitude of each source in the image.


In [ ]:
source_mag = # complete
source_mag_unc = # complete

for source_num, (mag, mag_unc) in enumerate(zip(source_mag, source_mag_unc)):
    print("Source {:d} has m = {:.3f} +/- {:.3f} mag".format( # complete

That's it! You've measured the magnitude for every source in the image.

As previously noted, the images provided for this dataset are centered are galaxies within a cluster, and ultimately, these galaxies are all that we care about. For this first image, that means we care about the galaxy centered at $(x,y) \approx (118, 118)$.

Problem 4e

What is the magnitude of the galaxy we care about for this image? [We will need this moving forward]


In [ ]:
# complete

Problem 5) Multiwavelength Photometry

Ultimately we want to measure colors for these galaxies. We now know the $r$-band magnitude for galaxy 85698, we need to measure the $g$ and $i$ band magnitudes as well.

Problem 5a Using the various pieces described above, write a function to measure the magnitude of the galaxy at the center of the image. You should create a new background image for every field.

Hint - creating an actual function is essential as we will eventually run this on every image.

Hint 2 - source_properties directly measures source centroids, use this it will be faster.


In [ ]:
def cluster_galaxy_photometry(data):
    '''
    Determine the magnitude of the galaxy at the center of the image
    
    Parameters
    ----------
    data : array-like
        Background subtracted 2D image centered on the galaxy
        of interest
    
    Returns
    -------
    mag : float
        Magnitude of the galaxy
    
    mag_unc : float
        Uncertainty of the magnitude measurement
    '''

    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    # complete
    
    
    return mag, mag_unc

Problem 5b

Confirm that the function calculates the same $r$-band mag that was calculated in Problem 4.


In [ ]:
# complete

print("""Previously, we found m = {:.3f} mag. 
This new function finds m = {:.3f} mag.""".format( # complete

Problem 5c

Use this new function to calculate the galaxy magnitude in the $g$ and the $i$ band, and determine the $g - r$ and $r - i$ colors of the galaxy.


In [ ]:
g_data = fits.getdata( # complete
i_data = fits.getdata( # complete

# complete
# complete
# complete
print("""The g-r color = {:.3f} +/- {:.3f} mag.
The r-i color = {:.3f} +/- {:.3f} mag""".format(g_mag - r_mag, np.hypot(g_mag_unc, r_mag_unc), 
                                                r_mag - i_mag, np.hypot(r_mag_unc, i_mag_unc)))

But wait!

Problem 5d

Was this calculation "fair"?

Hint - this is a relatively red galaxy.

Solution 5d

This calculation was not "fair" because identical apertures were not used in all 3 filters.

Problem 5e

[Assuming your calculation was not fair] Calculate the $g - r$ and $r - i$ colors of the galaxy in a consistent fashion.

Hint - split your initial function into two functions, one to determine an aperture and another to measure photometry. Use the $r$-band image (where the signal-to-noise ratio of the data is highest) to define the aperture for all 3 images.


In [ ]:
def cluster_galaxy_aperture(data):
# complete
# complete
# complete
# complete
# complete
# complete
# complete
# complete
# complete
# complete
# complete
    return aperture

def cluster_galaxy_phot(data, aperture):
# complete
# complete
# complete
# complete
# complete
    return mag, mag_unc

In [ ]:
r_ap = # complete

# complete
# complete
# complete

print("""The g-r color = {:.3f} +/- {:.3f} mag.
The r-i color = {:.3f} +/- {:.3f} mag""".format(g_mag - r_mag, np.hypot(g_mag_unc, r_mag_unc), 
                                                r_mag - i_mag, np.hypot(r_mag_unc, i_mag_unc)))

Challenge Problem) Colors as a Function of Radius

Each of the provided FITS images corresponds to a single galaxy in the galaxy cluster. Measure the colors for each galaxy, and plot these colors as a function of cluster radius.

Hint - the file galsAngSep.txt has the galaxy names and separation from the center of the cluster.


In [ ]: