We recently found that the use of aperture flux measurements, when compared to PSF flux measurements, provides excellent discrimination between stars and galaxies. The catch, however, is that due to a limited query the aperture fluxes were weighted using the PSF flux uncertainties. When the aperture flux measurements are instead weighted with the aperture flux uncertainties, this discriminating power vanishes.
This notebook investigates the effects of different weighting schemes and the discriminating power each provides [only photometry from the Stack catalog is examined - the assumption is that these results generalize to the Mean and Forced photometry catalogs, but that is not proven].
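For reference, the weighted white flux computed by calc_wwFlux below can be written (with notation introduced here for clarity: $f_i$ is the flux in filter $i$ and $\sigma_i$ the uncertainty column used for the weights, summing only over detected filters) as

$$
\bar{f}_{\rm ww} \;=\; \frac{\sum_i f_i^3/\sigma_i^2}{\sum_i f_i^2/\sigma_i^2} \;=\; \frac{\sum_i w_i\,f_i}{\sum_i w_i},
\qquad w_i = \frac{f_i^2}{\sigma_i^2}.
$$

The weights only reduce to the squared SNR when the flux and uncertainty columns correspond; the different weighting schemes explored below amount to different choices of $\sigma_i$.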
In [164]:
import numpy as np
import pandas as pd
from astropy.io import fits
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import describe, iqr
%matplotlib notebook
In [122]:
hst_data = fits.getdata("HST_COSMOS_features_adamamiller.fit")
star = np.where(hst_data['MU_CLASS'] == 2)
gal = np.where(hst_data['MU_CLASS'] == 1)
To determine the weighted white flux for a source, an individual filter can only be included if the source is detected in that filter, where detection is defined as:
PSFFlux > 0
ApFlux > 0
KronFlux > 0
[As a brief side note - the same test for Mean and Forced photometry from PS1 needs to include FluxErr as well.]
In [98]:
for filt in ['g','r','i','z','y']:
    exec("""det_{0} = np.logical_and.reduce((hst_data['{0}PSFFlux'] > 0,
                                             hst_data['{0}ApFlux'] > 0,
                                             hst_data['{0}KronFlux'] > 0))""".format(filt))
det_arr = np.vstack((det_g, det_r, det_i, det_z, det_y))
In [136]:
def calc_wwFlux(flux_type, weight_type, flux_table, det_arr):
    """Weighted white flux: sum(flux**3/unc**2) / sum(flux**2/unc**2) over detected filters.

    flux_type sets the flux columns (e.g. 'PSFFlux'); weight_type sets the
    uncertainty columns used to build the weights (e.g. 'PSFFluxErr'), which
    need not correspond to flux_type. Sources with no detections return -999.
    """
    flux_arr = np.zeros_like(det_arr).astype(float)
    flux_unc_arr = np.zeros_like(det_arr).astype(float)
    for filtnum, filt in enumerate(['g','r','i','z','y']):
        flux_col = filt + flux_type
        unc_col = filt + weight_type
        flux_arr[filtnum] = np.divide(flux_table[flux_col]**3,
                                      flux_table[unc_col]**2,
                                      where=det_arr[filtnum],
                                      out=np.zeros_like(flux_table[flux_col]))
        flux_unc_arr[filtnum] = np.divide(flux_table[flux_col]**2,
                                          flux_table[unc_col]**2,
                                          where=det_arr[filtnum],
                                          out=np.zeros_like(flux_table[flux_col]))
    return np.divide(np.sum(flux_arr, axis=0),
                     np.sum(flux_unc_arr, axis=0),
                     where=np.sum(flux_unc_arr, axis=0) > 0,
                     out=-999*np.ones_like(flux_unc_arr[0]))
In [200]:
cmap_star = sns.cubehelix_palette(rot=0.5, light=0.7,dark=0.3,as_cmap=True)
cmap_gal = sns.cubehelix_palette(start=0.3,rot=-0.5,light=0.7,dark=0.3,as_cmap=True)
We want to compare the difference in flux ratios (PSF/Kron and PSF/Ap) for stars and galaxies using different weighting schemes, as follows: (i) using the corresponding uncertainties, (ii) using the PSF uncertainties, (iii) using the Kron uncertainties, and (iv) using the Ap uncertainties.
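Each of the four cells below repeats an identical plotting and summary-statistics block. Purely as a consolidation sketch (the cells as written do not use it, and the name summarize_ratios is hypothetical), that block could be wrapped in a helper like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import describe, iqr

def summarize_ratios(psf_kron, psf_ap, star_det, gal_det, star_mean_det, gal_mean_det):
    # paired histograms of the PSF/Kron and PSF/Ap white-flux ratios
    with sns.axes_style('whitegrid'):
        fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, figsize=(10, 5))
        for subset, col, ht in zip([gal_det, gal_mean_det, star_det, star_mean_det],
                                   ["C0", "C0", "C1", "C1"],
                                   ["step", "stepfilled", "step", "stepfilled"]):
            ax1.hist(psf_kron[subset], histtype=ht, lw=2, color=col,
                     bins=np.arange(0, 2.1, 0.025), alpha=0.7)
            ax2.hist(psf_ap[subset], histtype=ht, lw=2, color=col,
                     bins=np.arange(0, 2.1, 0.025), alpha=0.7)
        ax2.set_xticks(np.linspace(0, 2, 11))
        fig.tight_layout()
    # mini-stats table for the star and galaxy subsets
    ww_properties = pd.DataFrame(columns=['name', 'IQR', 'Median', 'Mean', 'Std', 'Skew'])
    ww_properties['name'] = ['PSFKron_star', 'PSFKron_gal', 'PSFAp_star', 'PSFAp_gal']
    ww_properties.set_index('name', inplace=True)
    for pd_idx, flux_ratio in zip(ww_properties.index,
                                  [psf_kron[star_det], psf_kron[gal_det],
                                   psf_ap[star_det], psf_ap[gal_det]]):
        describe_ratio = describe(flux_ratio)
        ww_properties.loc[pd_idx, 'IQR'] = iqr(flux_ratio)
        ww_properties.loc[pd_idx, 'Median'] = np.median(flux_ratio)
        ww_properties.loc[pd_idx, 'Mean'] = describe_ratio.mean
        ww_properties.loc[pd_idx, 'Std'] = describe_ratio.variance**0.5
        ww_properties.loc[pd_idx, 'Skew'] = describe_ratio.skewness
    return ww_properties

With this helper, each cell would reduce to three calc_wwFlux calls plus a single summarize_ratios(...) call; the cells below keep the original, fully written-out form.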
In [210]:
psf_psf_ww = calc_wwFlux('PSFFlux', 'PSFFluxErr', hst_data, det_arr)
kron_kron_ww = calc_wwFlux('KronFlux', 'KronFluxErr', hst_data, det_arr)
ap_ap_ww = calc_wwFlux('ApFlux', 'ApFluxErr', hst_data, det_arr)
psf_kron = psf_psf_ww/kron_kron_ww
psf_ap = psf_psf_ww/ap_ap_ww
star_det = np.logical_and(hst_data['MU_CLASS'] == 2, psf_psf_ww > 0)
gal_det = np.logical_and(hst_data['MU_CLASS'] == 1, psf_psf_ww > 0)
star_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 2,
                                       psf_psf_ww > 0,
                                       hst_data['nDetections'] > 0))
gal_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 1,
                                      psf_psf_ww > 0,
                                      hst_data['nDetections'] > 0))
with sns.axes_style('whitegrid'):
    fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, figsize=(10,5))
    for subset, col, ht in zip([gal_det, gal_mean_det, star_det, star_mean_det],
                               ["C0", "C0", "C1", "C1"],
                               ["step", "stepfilled", "step", "stepfilled"]):
        ax1.hist(psf_kron[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
        ax2.hist(psf_ap[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
    ax2.set_xticks(np.linspace(0,2,11))
    fig.tight_layout()
# calculate a mini-stats table
ww_properties = pd.DataFrame(columns=['name', 'IQR', 'Median', 'Mean', 'Std', 'Skew'])
ww_properties['name'] = ['PSFKron_star', 'PSFKron_gal', 'PSFAp_star', 'PSFAp_gal']
ww_properties.set_index('name', inplace=True)
for pd_idx, flux_ratio in zip(ww_properties.index,
                              [psf_kron[star_det], psf_kron[gal_det],
                               psf_ap[star_det], psf_ap[gal_det]]):
    ww_properties["IQR"][pd_idx] = iqr(flux_ratio)
    ww_properties["Median"][pd_idx] = np.median(flux_ratio)
    describe_ratio = describe(flux_ratio)
    ww_properties["Mean"][pd_idx] = describe_ratio.mean
    ww_properties["Std"][pd_idx] = describe_ratio.variance**0.5
    ww_properties["Skew"][pd_idx] = describe_ratio.skewness
ww_properties
Out[210]:
When weighting each white flux measurement by its corresponding uncertainties, it is clear that the PSF/Kron ratio does a much better job of separating stars and galaxies than the PSF/Ap ratio. My best guess for the reason is that the aperture flux uncertainties are significantly underestimated and therefore unreliable.
Focusing on the PSFKron distribution, the means are separated by $\sim$0.25, and the IQR is relatively narrow.
In [211]:
psf_psf_ww = calc_wwFlux('PSFFlux', 'PSFFluxErr', hst_data, det_arr)
kron_psf_ww = calc_wwFlux('KronFlux', 'PSFFluxErr', hst_data, det_arr)
ap_psf_ww = calc_wwFlux('ApFlux', 'PSFFluxErr', hst_data, det_arr)
psf_kron = psf_psf_ww/kron_psf_ww
psf_ap = psf_psf_ww/ap_psf_ww
star_det = np.logical_and(hst_data['MU_CLASS'] == 2, psf_psf_ww > 0)
gal_det = np.logical_and(hst_data['MU_CLASS'] == 1, psf_psf_ww > 0)
star_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 2,
                                       psf_psf_ww > 0,
                                       hst_data['nDetections'] > 0))
gal_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 1,
                                      psf_psf_ww > 0,
                                      hst_data['nDetections'] > 0))
with sns.axes_style('whitegrid'):
    fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, figsize=(10,5))
    for subset, col, ht in zip([gal_det, gal_mean_det, star_det, star_mean_det],
                               ["C0", "C0", "C1", "C1"],
                               ["step", "stepfilled", "step", "stepfilled"]):
        ax1.hist(psf_kron[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
        ax2.hist(psf_ap[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
    ax2.set_xticks(np.linspace(0,2,11))
    fig.tight_layout()
# calculate a mini-stats table
ww_properties = pd.DataFrame(columns=['name', 'IQR', 'Median', 'Mean', 'Std', 'Skew'])
ww_properties['name'] = ['PSFKron_star', 'PSFKron_gal', 'PSFAp_star', 'PSFAp_gal']
ww_properties.set_index('name', inplace=True)
for pd_idx, flux_ratio in zip(ww_properties.index,
                              [psf_kron[star_det], psf_kron[gal_det],
                               psf_ap[star_det], psf_ap[gal_det]]):
    ww_properties["IQR"][pd_idx] = iqr(flux_ratio)
    ww_properties["Median"][pd_idx] = np.median(flux_ratio)
    describe_ratio = describe(flux_ratio)
    ww_properties["Mean"][pd_idx] = describe_ratio.mean
    ww_properties["Std"][pd_idx] = describe_ratio.variance**0.5
    ww_properties["Skew"][pd_idx] = describe_ratio.skewness
ww_properties
Out[211]:
The picture changes dramatically when weighting all of the flux measurements by the PSF flux uncertainties. For PSFKron the statistics remain similar, but the major change is for the PSFAp ratios, which now show tighter distributions than PSFKron [as traced by the IQR]. The contrast in the mean is smaller, however, and the use of aperture photometry leads to more heavily skewed distributions.
In [212]:
psf_kron_ww = calc_wwFlux('PSFFlux', 'KronFluxErr', hst_data, det_arr)
kron_kron_ww = calc_wwFlux('KronFlux', 'KronFluxErr', hst_data, det_arr)
ap_kron_ww = calc_wwFlux('ApFlux', 'KronFluxErr', hst_data, det_arr)
psf_kron = psf_kron_ww/kron_kron_ww
psf_ap = psf_kron_ww/ap_kron_ww
star_det = np.logical_and(hst_data['MU_CLASS'] == 2, psf_kron_ww > 0)
gal_det = np.logical_and(hst_data['MU_CLASS'] == 1, psf_kron_ww > 0)
star_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 2,
                                       psf_psf_ww > 0,
                                       hst_data['nDetections'] > 0))
gal_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 1,
                                      psf_psf_ww > 0,
                                      hst_data['nDetections'] > 0))
with sns.axes_style('whitegrid'):
    fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, figsize=(10,5))
    for subset, col, ht in zip([gal_det, gal_mean_det, star_det, star_mean_det],
                               ["C0", "C0", "C1", "C1"],
                               ["step", "stepfilled", "step", "stepfilled"]):
        ax1.hist(psf_kron[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
        ax2.hist(psf_ap[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
    ax2.set_xticks(np.linspace(0,2,11))
    fig.tight_layout()
# calculate a mini-stats table
ww_properties = pd.DataFrame(columns=['name', 'IQR', 'Median', 'Mean', 'Std', 'Skew'])
ww_properties['name'] = ['PSFKron_star', 'PSFKron_gal', 'PSFAp_star', 'PSFAp_gal']
ww_properties.set_index('name', inplace=True)
for pd_idx, flux_ratio in zip(ww_properties.index,
                              [psf_kron[star_det], psf_kron[gal_det],
                               psf_ap[star_det], psf_ap[gal_det]]):
    ww_properties["IQR"][pd_idx] = iqr(flux_ratio)
    ww_properties["Median"][pd_idx] = np.median(flux_ratio)
    describe_ratio = describe(flux_ratio)
    ww_properties["Mean"][pd_idx] = describe_ratio.mean
    ww_properties["Std"][pd_idx] = describe_ratio.variance**0.5
    ww_properties["Skew"][pd_idx] = describe_ratio.skewness
ww_properties
Out[212]:
The results when weighting with the Kron flux uncertainties are similar to those obtained with the PSF flux uncertainties. The skew and IQR are slightly smaller with Kron weighting, but the differences are small.
In [213]:
psf_ap_ww = calc_wwFlux('PSFFlux', 'ApFluxErr', hst_data, det_arr)
kron_ap_ww = calc_wwFlux('KronFlux', 'ApFluxErr', hst_data, det_arr)
ap_ap_ww = calc_wwFlux('ApFlux', 'ApFluxErr', hst_data, det_arr)
psf_kron = psf_ap_ww/kron_ap_ww
psf_ap = psf_ap_ww/ap_ap_ww
star_det = np.logical_and(hst_data['MU_CLASS'] == 2, psf_ap_ww > 0)
gal_det = np.logical_and(hst_data['MU_CLASS'] == 1, psf_ap_ww > 0)
star_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 2,
                                       psf_psf_ww > 0,
                                       hst_data['nDetections'] > 0))
gal_mean_det = np.logical_and.reduce((hst_data['MU_CLASS'] == 1,
                                      psf_psf_ww > 0,
                                      hst_data['nDetections'] > 0))
with sns.axes_style('whitegrid'):
    fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, figsize=(10,5))
    for subset, col, ht in zip([gal_det, gal_mean_det, star_det, star_mean_det],
                               ["C0", "C0", "C1", "C1"],
                               ["step", "stepfilled", "step", "stepfilled"]):
        ax1.hist(psf_kron[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
        ax2.hist(psf_ap[subset], histtype=ht, lw=2, color=col,
                 bins=np.arange(0,2.1,0.025), alpha=0.7)
    ax2.set_xticks(np.linspace(0,2,11))
    fig.tight_layout()
# calculate a mini-stats table
ww_properties = pd.DataFrame(columns=['name', 'IQR', 'Median', 'Mean', 'Std', 'Skew'])
ww_properties['name'] = ['PSFKron_star', 'PSFKron_gal', 'PSFAp_star', 'PSFAp_gal']
ww_properties.set_index('name', inplace=True)
for pd_idx, flux_ratio in zip(ww_properties.index,
                              [psf_kron[star_det], psf_kron[gal_det],
                               psf_ap[star_det], psf_ap[gal_det]]):
    ww_properties["IQR"][pd_idx] = iqr(flux_ratio)
    ww_properties["Median"][pd_idx] = np.median(flux_ratio)
    describe_ratio = describe(flux_ratio)
    ww_properties["Mean"][pd_idx] = describe_ratio.mean
    ww_properties["Std"][pd_idx] = describe_ratio.variance**0.5
    ww_properties["Skew"][pd_idx] = describe_ratio.skewness
ww_properties
Out[213]:
The use of aperture flux uncertainty weights noticeably reduces the skew of the ratio distributions, but the contrast between stars and galaxies is reduced relative to the other schemes. As a result, weighting with the aperture flux uncertainties should be avoided.
Upshot

The goal is to provide the maximum contrast between stars and galaxies. When comparing PSF fluxes to Kron fluxes, the contrast is maximized by weighting each by its respective uncertainties; when comparing PSF and aperture fluxes, the contrast is strongest when weighting both by either the PSF uncertainties or the Kron uncertainties.
Ultimately, we elect to weight the Aperture fluxes by the PSF flux SNR.
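As a minimal sketch of that choice, assuming "weight the Aperture fluxes by the PSF flux SNR" means per-filter weights of (PSFFlux/PSFFluxErr)**2 - which is close to, but not identical to, the In [211] cell above, where calc_wwFlux builds weights from ApFlux**2/PSFFluxErr**2 - the adopted quantity could look like the hypothetical helper below (the name ww_apflux_psf_snr is not part of the notebook):

import numpy as np

def ww_apflux_psf_snr(flux_table, det_arr, filts=('g', 'r', 'i', 'z', 'y')):
    # weighted white aperture flux with squared PSF-SNR weights (assumed interpretation)
    num = np.zeros(len(flux_table))
    den = np.zeros(len(flux_table))
    for filtnum, filt in enumerate(filts):
        # squared PSF SNR, zeroed for filters without a detection
        snr2 = np.divide(flux_table[filt + 'PSFFlux']**2,
                         flux_table[filt + 'PSFFluxErr']**2,
                         where=det_arr[filtnum],
                         out=np.zeros(len(flux_table)))
        num += snr2 * np.where(det_arr[filtnum], flux_table[filt + 'ApFlux'], 0.0)
        den += snr2
    # -999 sentinel for sources with no detections, matching calc_wwFlux
    return np.divide(num, den, where=den > 0, out=-999.0*np.ones(len(flux_table)))

# possible usage, pairing it with the PSF white flux from calc_wwFlux:
# psf_ap = calc_wwFlux('PSFFlux', 'PSFFluxErr', hst_data, det_arr) / ww_apflux_psf_snr(hst_data, det_arr)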