Compare PS1 Mean, Forced, Stacked Photometry

This notebook utilizes HST imaging to the COSMOS field, which provides reliable morphological classifications to a magnitude limit of $\sim{26}\,\mathrm{mag}$, to compare the results from a "simple" star-galaxy separation model for the 3 different photometry types extracted from PanSTARRS-1: Mean single-epoch photometry, Forced single-epoch photometry, and Stacked photometry.

In brief, the "simple model" combines the SNR$^2$ weighted-mean of the flux measurements for each filter where the source is detected to create a "white" flux measurement. Classifications are then performed by measuring the distance between individual sources and a slope $\approx 1$ line in the white PSF flux-white Kron flux (or white Ap flux) plane.


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from astropy.io import fits
from astropy.table import Table
from mpl_toolkits.axes_grid1.inset_locator import inset_axes,zoomed_inset_axes
import seaborn as sns

from statsmodels.nonparametric.kernel_density import KDEMultivariate
import statsmodels.nonparametric.api as smnp

from sklearn.model_selection import KFold

%matplotlib notebook

In [2]:
ps1_white_phot = fits.getdata("COSMOS_compare_white_phot_adamamiller.fit")

In [45]:
print("Of the {:d} unique sources in the COSMOS cross-match:".format(len(ps1_white_phot)))
print("\tThere are {:d} sources with Mean white phot,".format(sum(~np.isnan(ps1_white_phot["wwMeanPSFFlux"]))))
print("\tThere are {:d} sources with Forced white phot,".format(sum(~np.isnan(ps1_white_phot["wwFPSFFlux"]))))
print("\tThere are {:d} sources with Stacked white phot.".format(sum(~np.isnan(ps1_white_phot["wwPSFFlux"]))))


Of the 80974 unique sources in the COSMOS cross-match:
	There are 42687 sources with Mean white phot,
	There are 80865 sources with Forced white phot,
	There are 75927 sources with Stacked white phot.

For a fair, apples-to-apples comparison, we test the different photometric catalogs using only sources that are common to all 3 methods (mfs below). We also compare the results on fainter sources by comparing sources in the Stacked and Forced tables (fs below).


In [3]:
wwMeanPSFKronRatio = ps1_white_phot["wwMeanPSFFlux"]/ps1_white_phot["wwMeanKronFlux"]
wwFPSFKronRatio = ps1_white_phot["wwFPSFFlux"]/ps1_white_phot["wwFKronFlux"]
wwPSFKronRatio = ps1_white_phot["wwPSFFlux"]/ps1_white_phot["wwKronFlux"]

det_mfs = np.where(np.isfinite(wwMeanPSFKronRatio) & 
                   np.isfinite(wwFPSFKronRatio) & 
                   np.isfinite(wwPSFKronRatio))
det_mfs_star = np.where(np.isfinite(wwMeanPSFKronRatio) & 
                        np.isfinite(wwFPSFKronRatio) & 
                        np.isfinite(wwPSFKronRatio) & 
                        (ps1_white_phot["mu_class"] == 2) &
                        (wwMeanPSFKronRatio <= 2.0))
det_mfs_gal = np.where(np.isfinite(wwMeanPSFKronRatio) & 
                        np.isfinite(wwFPSFKronRatio) & 
                        np.isfinite(wwPSFKronRatio) & 
                        (ps1_white_phot["mu_class"] == 1) &
                        (wwMeanPSFKronRatio <= 2.0))

det_fs = np.where(np.isfinite(wwFPSFKronRatio) & 
                  np.isfinite(wwPSFKronRatio))
det_fs_star = np.where(np.isfinite(wwFPSFKronRatio) & 
                       np.isfinite(wwPSFKronRatio) & 
                       (ps1_white_phot["mu_class"] == 2) )
det_fs_gal = np.where(np.isfinite(wwFPSFKronRatio) & 
                      np.isfinite(wwPSFKronRatio) & 
                      (ps1_white_phot["mu_class"] == 1) )


/Users/adamamiller/miniconda3/envs/py36/lib/python3.6/site-packages/ipykernel/__main__.py:12: RuntimeWarning: invalid value encountered in less_equal
/Users/adamamiller/miniconda3/envs/py36/lib/python3.6/site-packages/ipykernel/__main__.py:17: RuntimeWarning: invalid value encountered in less_equal

Examine the PSF to Aperture ratio


In [4]:
wwMeanPSFApRatio = ps1_white_phot["wwMeanPSFFlux"]/ps1_white_phot["wwMeanApFlux"]
wwFPSFApRatio = ps1_white_phot["wwFPSFFlux"]/ps1_white_phot["wwFApFlux"]
wwPSFApRatio = ps1_white_phot["wwPSFFlux"]/ps1_white_phot["wwApFlux"]

psfap_mfs = np.where(np.isfinite(wwMeanPSFApRatio) & 
                   np.isfinite(wwFPSFApRatio) & 
                   np.isfinite(wwPSFApRatio))
psfap_mfs_star = np.where(np.isfinite(wwMeanPSFApRatio) & 
                        np.isfinite(wwFPSFApRatio) & 
                        np.isfinite(wwPSFApRatio) & 
                        (ps1_white_phot["mu_class"] == 2) &
                        (wwMeanPSFApRatio <= 2.0))
psfap_mfs_gal = np.where(np.isfinite(wwMeanPSFApRatio) & 
                        np.isfinite(wwFPSFApRatio) & 
                        np.isfinite(wwPSFApRatio) & 
                        (ps1_white_phot["mu_class"] == 1) &
                        (wwMeanPSFApRatio <= 2.0))

psfap_fs = np.where(np.isfinite(wwFPSFApRatio) & 
                  np.isfinite(wwPSFApRatio))
psfap_fs_star = np.where(np.isfinite(wwFPSFApRatio) & 
                       np.isfinite(wwPSFApRatio) & 
                       (ps1_white_phot["mu_class"] == 2) )
psfap_fs_gal = np.where(np.isfinite(wwFPSFApRatio) & 
                      np.isfinite(wwPSFApRatio) & 
                      (ps1_white_phot["mu_class"] == 1) )


/Users/adamamiller/miniconda3/envs/py36/lib/python3.6/site-packages/ipykernel/__main__.py:12: RuntimeWarning: invalid value encountered in less_equal
/Users/adamamiller/miniconda3/envs/py36/lib/python3.6/site-packages/ipykernel/__main__.py:17: RuntimeWarning: invalid value encountered in less_equal

Part 1 - Bright Sources Detected in Single Exposures

Plot distance vs. Kron mag for 3 different phot measures

Show the distribution of these sources in their respective PSF vs. Kron planes.


In [5]:
ps1_df = Table.read("COSMOS_compare_white_phot_adamamiller.fit").to_pandas()

ps1_df["MeanDist"] = (ps1_white_phot["wwMeanPSFFlux"] - ps1_white_phot["wwMeanKronFlux"])/np.sqrt(2)
ps1_df["FDist"] = (ps1_white_phot["wwFPSFFlux"] - ps1_white_phot["wwFKronFlux"])/np.sqrt(2)
ps1_df["Dist"] = (ps1_white_phot["wwPSFFlux"] - ps1_white_phot["wwKronFlux"])/np.sqrt(2)

ps1_df["psfapMeanDist"] = (ps1_white_phot["wwMeanPSFFlux"] - ps1_white_phot["wwMeanApFlux"])/np.sqrt(2)
ps1_df["psfapFDist"] = (ps1_white_phot["wwFPSFFlux"] - ps1_white_phot["wwFApFlux"])/np.sqrt(2)
ps1_df["psfapDist"] = (ps1_white_phot["wwPSFFlux"] - ps1_white_phot["wwApFlux"])/np.sqrt(2)

ps1_df["MeanKronMag"] = -2.5*np.log10(ps1_white_phot["wwMeanKronFlux"]/3631)
ps1_df["FKronMag"] = -2.5*np.log10(ps1_white_phot["wwFKronFlux"]/3631)
ps1_df["KronMag"] = -2.5*np.log10(ps1_white_phot["wwKronFlux"]/3631)

In [130]:
def kde_contour_dat(x, y, extent = (0,0,0,0), bw_type = "silverman", grid_bins = 100):
    """Determine normalized KDE PDF to draw contours"""
    
    if isinstance(x, pd.Series):
        x = x.values
    if isinstance(y, pd.Series):
        y = y.values
    
    if extent == (0,0,0,0):
        extent = (x.min(), x.max(), y.min(), y.max())

    if bw_type == "silverman":
        bw = np.array([smnp.bandwidths.bw_silverman(x), smnp.bandwidths.bw_silverman(y)])
    elif bw_type == "scott":
        bw = np.array([smnp.bandwidths.bw_scott(x), smnp.bandwidths.bw_scott(y)])

    kde = KDEMultivariate([x,y], var_type='cc', bw = bw)

    xi, yi = np.mgrid[extent[0]:extent[1]:grid_bins*1j,extent[2]:extent[3]:grid_bins*1j]

    kde_prob = kde.pdf(np.vstack([xi.flatten(), yi.flatten()]))

    zi = (kde_prob-kde_prob.min())/(kde_prob.max() - kde_prob.min())
    zi = zi.reshape(xi.shape)

    return xi, yi, zi

Get KDEs for PSF-Kron phot


In [131]:
gal_dist_mean = ps1_df.ix[det_mfs_gal]["MeanDist"]
gal_mag_mean = ps1_df.ix[det_mfs_gal]["MeanKronMag"]

star_dist_mean = ps1_df.ix[det_mfs_star]["MeanDist"]
star_mag_mean = ps1_df.ix[det_mfs_star]["MeanKronMag"]

xgal_mean, ygal_mean, zgal_mean = kde_contour_dat(gal_mag_mean, gal_dist_mean, extent = (16,24,-2e-5,2e-5))
xstar_mean, ystar_mean, zstar_mean = kde_contour_dat(star_mag_mean, star_dist_mean, extent = (16,24,-2e-5,2e-5))

gal_dist_forced = ps1_df.ix[det_mfs_gal]["FDist"]
gal_mag_forced = ps1_df.ix[det_mfs_gal]["FKronMag"]

star_dist_forced = ps1_df.ix[det_mfs_star]["FDist"]
star_mag_forced = ps1_df.ix[det_mfs_star]["FKronMag"]

xgal_forced, ygal_forced, zgal_forced = kde_contour_dat(gal_mag_forced, gal_dist_forced, extent = (16,24,-2e-5,2e-5))
xstar_forced, ystar_forced, zstar_forced = kde_contour_dat(star_mag_forced, star_dist_forced, extent = (16,24,-2e-5,2e-5))

gal_dist_stacked = ps1_df.ix[det_mfs_gal]["Dist"]
gal_mag_stacked = ps1_df.ix[det_mfs_gal]["KronMag"]

star_dist_stacked = ps1_df.ix[det_mfs_star]["Dist"]
star_mag_stacked = ps1_df.ix[det_mfs_star]["KronMag"]

xgal_stacked, ygal_stacked, zgal_stacked = kde_contour_dat(gal_mag_stacked, gal_dist_stacked, extent = (16,24,-2e-5,2e-5))
xstar_stacked, ystar_stacked, zstar_stacked = kde_contour_dat(star_mag_stacked, star_dist_stacked, extent = (16,24,-2e-5,2e-5))

Get KDEs for PSF-Ap phot


In [132]:
gal_dist_mean_psfap = ps1_df.ix[psfap_mfs_gal]["psfapMeanDist"]
gal_mag_mean_psfap = ps1_df.ix[psfap_mfs_gal]["MeanKronMag"]

star_dist_mean_psfap = ps1_df.ix[psfap_mfs_star]["psfapMeanDist"]
star_mag_mean_psfap = ps1_df.ix[psfap_mfs_star]["MeanKronMag"]

xgal_mean_psfap, ygal_mean_psfap, zgal_mean_psfap = kde_contour_dat(gal_mag_mean_psfap, gal_dist_mean_psfap, extent = (16,24,-2e-5,2e-5))
xstar_mean_psfap, ystar_mean_psfap, zstar_mean_psfap = kde_contour_dat(star_mag_mean_psfap, star_dist_mean_psfap, extent = (16,24,-2e-5,2e-5))

gal_dist_forced_psfap = ps1_df.ix[psfap_mfs_gal]["psfapFDist"]
gal_mag_forced_psfap = ps1_df.ix[psfap_mfs_gal]["FKronMag"]

star_dist_forced_psfap = ps1_df.ix[psfap_mfs_star]["psfapFDist"]
star_mag_forced_psfap = ps1_df.ix[psfap_mfs_star]["FKronMag"]

xgal_forced_psfap, ygal_forced_psfap, zgal_forced_psfap = kde_contour_dat(gal_mag_forced_psfap, gal_dist_forced_psfap, extent = (16,24,-2e-5,2e-5))
xstar_forced_psfap, ystar_forced_psfap, zstar_forced_psfap = kde_contour_dat(star_mag_forced_psfap, star_dist_forced_psfap, extent = (16,24,-2e-5,2e-5))

gal_dist_stacked_psfap = ps1_df.ix[psfap_mfs_gal]["psfapDist"]
gal_mag_stacked_psfap = ps1_df.ix[psfap_mfs_gal]["KronMag"]

star_dist_stacked_psfap = ps1_df.ix[psfap_mfs_star]["psfapDist"]
star_mag_stacked_psfap = ps1_df.ix[psfap_mfs_star]["KronMag"]

xgal_stacked_psfap, ygal_stacked_psfap, zgal_stacked_psfap = kde_contour_dat(gal_mag_stacked_psfap, gal_dist_stacked_psfap, extent = (16,24,-2e-5,2e-5))
xstar_stacked_psfap, ystar_stacked_psfap, zstar_stacked_psfap = kde_contour_dat(star_mag_stacked_psfap, star_dist_stacked_psfap, extent = (16,24,-2e-5,2e-5))

In [142]:
origin = 'lower'
levels = [0.1, 0.25, 0.5, 0.75, 0.9,1]
cmap_star = sns.cubehelix_palette(rot=0.5, light=0.7,dark=0.3,as_cmap=True)
cmap_gal = sns.cubehelix_palette(start=0.3,rot=-0.5,light=0.7,dark=0.3,as_cmap=True)

with sns.axes_style("whitegrid"):
    fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize = (10, 4))
    for axnum, (xstar, ystar, zstar, xgal, ygal, zgal) in enumerate([(xstar_mean, ystar_mean, zstar_mean, xgal_mean, ygal_mean, zgal_mean), 
                                                                     (xstar_forced, ystar_forced, zstar_forced, xgal_forced, ygal_forced, zgal_forced), 
                                                                     (xstar_stacked, ystar_stacked, zstar_stacked,xgal_stacked, ygal_stacked, zgal_stacked)]):
        axes[axnum].contourf(xstar, ystar, zstar, levels = levels,
                             origin = origin,
                             cmap = cmap_star, alpha = 0.8)
        axes[axnum].contour(xstar, ystar, zstar, levels = levels,
                            linewidths=(0.3,), origin = origin,
                            colors = ("w",), alpha = 0.5, zorder = 11)
        axes[axnum].contourf(xgal, ygal, zgal, levels = levels,
                             origin = origin, 
                             cmap = cmap_gal, alpha = 0.8, zorder = 10)
        axes[axnum].contour(xgal, ygal, zgal, levels = levels,
                            linewidths=(0.5,), origin = origin, 
                            colors = ("w",), alpha = 0.5)

    axes[axnum].set_xlim(18.5, 24)
    axes[axnum].set_ylim(-1e-5, 1e-5)
    axes[1].set_xlabel(r"$m_\mathrm{Kron,white}\;(\mathrm{mag})$")
    axes[0].set_ylabel(r"$\mathrm{distance}$")
    fig.tight_layout()



In [138]:
cmap_star = sns.cubehelix_palette(rot=0.5, light=0.7,dark=0.3,as_cmap=True)
cmap_gal = sns.cubehelix_palette(start=0.3,rot=-0.5,light=0.7,dark=0.3,as_cmap=True)

with sns.axes_style("whitegrid"):
    fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize = (10, 4))
    for axnum, flux_ratio in enumerate([wwMeanPSFKronRatio, wwFPSFKronRatio, wwPSFKronRatio]):
        axes[axnum].hist(flux_ratio[det_mfs_gal], 
                         alpha=0.8, bins=np.arange(0,2.1,0.025), 
                         color = cmap_gal(0.6))
        axes[axnum].hist(flux_ratio[det_mfs_star], 
                         alpha=0.5, bins=np.arange(0,2.1,0.025), 
                         color = cmap_star(0.6))
    axes[0].set_xticks(np.linspace(0,2,9))
    fig.tight_layout()



In [141]:
origin = 'lower'
levels = [0.1, 0.25, 0.5, 0.75, 0.9,1]
cmap_star = sns.cubehelix_palette(rot=0.5, light=0.7,dark=0.3,as_cmap=True)
cmap_gal = sns.cubehelix_palette(start=0.3,rot=-0.5,light=0.7,dark=0.3,as_cmap=True)

with sns.axes_style("whitegrid"):
    fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize = (10, 4))
    for axnum, (xstar, ystar, zstar, xgal, ygal, zgal) in enumerate([(xstar_mean_psfap, ystar_mean_psfap, zstar_mean_psfap, xgal_mean_psfap, ygal_mean_psfap, zgal_mean_psfap), 
                                                                     (xstar_forced_psfap, ystar_forced_psfap, zstar_forced_psfap, xgal_forced_psfap, ygal_forced_psfap, zgal_forced_psfap), 
                                                                     (xstar_stacked_psfap, ystar_stacked_psfap, zstar_stacked_psfap, xgal_stacked_psfap, ygal_stacked_psfap, zgal_stacked_psfap)]):
        axes[axnum].contourf(xstar, ystar, zstar, levels = levels,
                             origin = origin,
                             cmap = cmap_star, alpha = 0.8)
        axes[axnum].contour(xstar, ystar, zstar, levels = levels,
                            linewidths=(0.3,), origin = origin,
                            colors = ("w",), alpha = 0.5, zorder = 11)
        axes[axnum].contourf(xgal, ygal, zgal, levels = levels,
                             origin = origin, 
                             cmap = cmap_gal, alpha = 0.8, zorder = 10)
        axes[axnum].contour(xgal, ygal, zgal, levels = levels,
                            linewidths=(0.5,), origin = origin, 
                            colors = ("w",), alpha = 0.5)

    axes[axnum].set_xlim(18.5, 24)
    axes[axnum].set_ylim(-1e-5, 1e-5)
    axes[1].set_xlabel(r"$m_\mathrm{Kron,white}\;(\mathrm{mag})$")
    axes[0].set_ylabel(r"$\mathrm{distance}$")
    fig.tight_layout()



In [143]:
cmap_star = sns.cubehelix_palette(rot=0.5, light=0.7,dark=0.3,as_cmap=True)
cmap_gal = sns.cubehelix_palette(start=0.3,rot=-0.5,light=0.7,dark=0.3,as_cmap=True)

with sns.axes_style("whitegrid"):
    fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize = (10, 4))
    for axnum, flux_ratio in enumerate([wwMeanPSFApRatio, wwFPSFApRatio, wwPSFApRatio]):
        axes[axnum].hist(flux_ratio[det_mfs_gal], 
                         alpha=0.8, bins=np.arange(0,2.1,0.025), 
                         color = cmap_gal(0.6))        
        axes[axnum].hist(flux_ratio[det_mfs_star], 
                         alpha=0.8, bins=np.arange(0,2.1,0.025), 
                         color = cmap_star(0.6))
    axes[0].set_xticks(np.linspace(0,2,9))

    fig.tight_layout()


Determine the combo of data + slope to maximize TPR at FPR = 0.005

There are 4 numerical quantities that we care about, though ultimately we hope to maximize the TPR at FPR = 0.005.


In [6]:
from sklearn.metrics import accuracy_score, roc_curve, roc_auc_score

def calc_distance(a, x, y): # model: y = ax
    a = np.array(a)
    model = (a*x).astype(float)
    wd = (y-model)/np.sqrt(1 + a**2)
    return np.array(wd)

def calc_accuracy(a, flux1, flux2, true_class):
    a = np.array(a)
    delta = calc_distance(a, flux1, flux2)
    pred_class = np.array((np.sign(delta)+1)/2, dtype = int)    # psf = kron >> gal
    acc = accuracy_score(true_class, pred_class)
    return acc

def calc_roc_auc(a, flux1, flux2, true_class): 
    a = np.array(a)
    delta = calc_distance(a, flux1, flux2)
    auc = roc_auc_score(true_class, delta)
    return auc

def calc_informedness_and_tpr(a, flux1, flux2, true_class): 
    a = np.array(a)
    delta = calc_distance(a, flux1, flux2)
    fpr, tpr, thre = roc_curve(true_class, delta)
    fom = np.interp(0.005, fpr, tpr)
    
    return np.max(tpr-fpr), fom

def calc_roc_curve(a, flux1, flux2, true_class): 
    a = np.array(a)
    delta = calc_distance(a, flux1, flux2)
    fpr, tpr, thre = roc_curve(true_class, delta)    
    return fpr, tpr, thre

In [146]:
a_grid = np.linspace(0, 2, 101)

for phot, wwphot in zip(["mean", "forced", "stacked"],
                        ["wwMean", "wwF", "ww"]):

    exec("acc_{:s} = np.empty(len(a_grid))".format(phot))
    exec("auc_{:s} = np.empty(len(a_grid))".format(phot))
    exec("inform_{:s} = np.empty(len(a_grid))".format(phot))
    exec("fom_{:s} = np.empty(len(a_grid))".format(phot))

    for i_a, a in enumerate(a_grid):
        exec("""acc_{:s}[i_a] = calc_accuracy(a, ps1_df.ix[det_mfs]['{:s}KronFlux'], 
                                      ps1_df.ix[det_mfs]['{:s}PSFFlux'], 
                                      np.array(ps1_df.ix[det_mfs]['mu_class'], dtype = int)-1)""".format(phot, wwphot, wwphot))
        exec("""auc_{:s}[i_a] = calc_roc_auc(a, ps1_df.ix[det_mfs]["{:s}KronFlux"], 
                                      ps1_df.ix[det_mfs]["{:s}PSFFlux"], 
                                      np.array(ps1_df.ix[det_mfs]["mu_class"], dtype = int)-1)""".format(phot, wwphot, wwphot))
        exec("""inform_{:s}[i_a], fom_{:s}[i_a] = calc_informedness_and_tpr(a, ps1_df.ix[det_mfs]['{:s}KronFlux'], 
                                      ps1_df.ix[det_mfs]['{:s}PSFFlux'], 
                                      np.array(ps1_df.ix[det_mfs]['mu_class'], dtype = int)-1)""".format(phot, phot, wwphot, wwphot))

In [151]:
with sns.axes_style("whitegrid"):
    fig, axes = plt.subplots(1, 4, sharex=True, figsize=(10,4))
    for phot in ["mean", "forced", "stacked"]:
        exec("axes[0].plot(a_grid, acc_{0}, label='{0}')".format(phot))
        exec("axes[1].plot(a_grid, auc_{0}, label='{0}')".format(phot))
        exec("axes[2].plot(a_grid, inform_{0}, label='{0}')".format(phot))
        exec("axes[3].plot(a_grid, fom_{0}, label='{0}')".format(phot))

    axes[3].set_xlim(0.5,1.25)

    for ax, ylims, ylabel in zip(axes, 
                                 [(0.35,0.9),(0.4,1),(0,0.8),(0,0.7)],
                                 [r"$\mathrm{Accuracy}$", r"$\mathrm{ROC \; AUC}$",
                                  r"$\mathrm{Informedness}$", r"$\mathrm{FoM}$"]):
        ax.set_ylim(ylims)
        ax.set_ylabel(ylabel)
        ax.set_xlabel(r"$a$")
        ax.legend(frameon=True, framealpha=0.95)
    axes[0].plot([0,2], [0.73, 0.73], '--', 
                 color = '0.5', zorder = -4)
    fig.tight_layout()

print("Max Acc = {:.4f} at a = {:.4f}".format(max(acc_stacked), 
                                              a_grid[np.argmax(acc_stacked)]))
print("Max AUC = {:.4f} at a = {:.4f}".format(max(auc_stacked), 
                                              a_grid[np.argmax(auc_stacked)]))
print("Max TPR-FPR = {:.4f} at a = {:.4f}".format(max(inform_stacked), 
                                                  a_grid[np.argmax(inform_stacked)]))
print("Max FoM = {:.4f} at a = {:.4f}".format(max(fom_stacked), 
                                              a_grid[np.argmax(fom_stacked)]))