In [1]:
%pylab inline
from pandas import read_csv
d = read_csv('NGC6181.txt', index_col=0) # made with SQL_NGC6181.txt; index is the SDSS objID
In [2]:
# How good were the photo-z's for the objects identified
# as satellites of NGC 6181? I.e. how many sigma's away
# from z_host (where "sigma" is the photoZ error)?
objid_sat = [1237662698115433445, 1237662661610767569,
             1237662698115432544, 1237662662147571761]
z_true = 0.007922 # z-spec of NGC 6181
# There are two flavors of SDSS photoz's: one based on a
# nearest neighbor algorithm (NN), and one based on a
# random forest (RF) algorithm.
nsigma_nn = np.abs((d.zNN-z_true)/d.zNN_err)
nsigma_rf = np.abs((d.zRF-z_true)/d.zRF_err)
print('N_sigma_photozNN:')
print(nsigma_nn.loc[objid_sat])
print()
print('N_sigma_photozRF:')
print(nsigma_rf.loc[objid_sat])
The takeaway is that the SDSS photo-z's were all within 1.8$\sigma$ of the host redshift. By what factor could we reduce the number of targets if we require that both flavors of photo-z be within some number of sigma of the host redshift?
In [3]:
nsigma = 3.0 # maximum (photoz-zHost)/photoz_error
r_max = 20.5 # maximum r mag
gr_max = 1.3 # maximum g-r color
ri_max = 0.7 # maximum r-i color
# Baseline cuts are (r mag, g-r color, r-i color)
wh_baseline = np.where((d.r<r_max) &
                       ((d.g-d.r)<gr_max) &
                       ((d.r-d.i)<ri_max))[0]
# New cuts are baseline + the photo-z N_sigma cut.
wh_target = np.where((d.r<r_max) &
                     ((d.g-d.r)<gr_max) &
                     ((d.r-d.i)<ri_max) &
                     (nsigma_nn<nsigma) &
                     (nsigma_rf<nsigma))[0]
n_baseline = len(wh_baseline)
n_target = len(wh_target)
print('%i objects in baseline sample.' % n_baseline)
print('%i objects in new sample.' % n_target)
print('Reduction factor of %0.2f.' % (1.*n_baseline/n_target))
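A quick sanity check on what these cuts cost us among the objects we already know are satellites: count how many of the four confirmed satellites survive the baseline and the baseline + photo-z cuts. This is a minimal sketch using only quantities defined above (d.index holds the SDSS objIDs, since the table was read with index_col=0).
In [ ]:
# Sketch: completeness check for the four confirmed satellites.
baseline_ids = d.index[wh_baseline]   # objIDs passing the baseline cuts
target_ids = d.index[wh_target]       # objIDs passing baseline + photo-z cuts
n_sat_baseline = np.isin(objid_sat, baseline_ids).sum()
n_sat_target = np.isin(objid_sat, target_ids).sum()
print('%i of %i confirmed satellites pass the baseline cuts.'
      % (n_sat_baseline, len(objid_sat)))
print('%i of %i confirmed satellites pass the baseline + photo-z cuts.'
      % (n_sat_target, len(objid_sat)))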
The reduction factor is even better if you only keep objects within 2.0$\sigma$ of the host redshift, and the reduction factors don't seem to depend much on the maximum $r$ magnitude of the sample:
| $r_{\rm{max}}$ | Max photo-z $N_\sigma$ | Reduction factor |
|---|---|---|
| 19.5 | 2.0 | 6.6 |
| 20.5 | 2.0 | 6.9 |
| 19.5 | 3.0 | 3.4 |
| 20.5 | 3.0 | 3.2 |
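A sweep like the one sketched below, reusing `d`, `nsigma_nn`, and `nsigma_rf` from above, produces this kind of grid; holding the color cuts at the values used earlier is an assumption about how the table was made.
In [ ]:
# Sketch: reduction factor as a function of r_max and the photo-z N_sigma cut,
# with the g-r and r-i color cuts held fixed.
def reduction_factor(r_max, nsigma, gr_max=1.3, ri_max=0.7):
    baseline = ((d.r < r_max) &
                ((d.g - d.r) < gr_max) &
                ((d.r - d.i) < ri_max))
    target = baseline & (nsigma_nn < nsigma) & (nsigma_rf < nsigma)
    return 1. * baseline.sum() / target.sum()

for rmax in (19.5, 20.5):
    for nsig in (2.0, 3.0):
        print('r_max = %.1f, N_sigma = %.1f: reduction factor %.1f'
              % (rmax, nsig, reduction_factor(rmax, nsig)))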
There's a tradeoff here, of course. If you cut on the photo-z $N_\sigma$, you'll reduce the number of targets by a factor of 3-7 (and perhaps by another factor of 2 if you also cut on SDSS's star/galaxy classification), but at the expense of completeness in the final satellite galaxy sample.
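The extra factor of ~2 from star/galaxy separation would come from requiring SDSS's morphological classification to be GALAXY (photoObj type = 3). Whether that column is in this particular query output is an assumption; if it were selected as a type column, the cut would look something like this:
In [ ]:
# Sketch: add a star/galaxy cut on top of the photo-z cuts. Assumes the SQL
# query also selected the photoObj 'type' column (3 = GALAXY, 6 = STAR).
wh_target_gal = np.where((d.r < r_max) &
                         ((d.g - d.r) < gr_max) &
                         ((d.r - d.i) < ri_max) &
                         (nsigma_nn < nsigma) &
                         (nsigma_rf < nsigma) &
                         (d['type'] == 3))[0]
print('%i objects after also requiring type == GALAXY.' % len(wh_target_gal))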