This notebook will calculate a simple error model for the Twinkles data as a function of the seeing and depth.
You will need the DESC Monitor and its dependencies. You will also need the truth database and the OpSim database for the desired Twinkles run. This example uses the Run 1.1 settings, so we use kraken_1042 as our OpSim database; further investigation will be done when the Run 1.3 results are ready, at which point we will move to the minion_1016 database. LSST OpSim databases can be downloaded here.
The Error Model will provide a way to estimate the uncertainties in the Twinkles data from the properties of a visit and an object. There are several motivations for building such a model; so far, we have focused on modeling the relationships between flux uncertainties and the observed seeing and 5-sigma depth of a visit.
NOTE: As mentioned above, the current analysis in this notebook uses Twinkles Run 1.1. This notebook will be updated with our latest findings when Run 1.3 is ready.
In [1]:
import os
import desc.monitor
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from lsst.sims.photUtils import calcNeff
%matplotlib inline
%load_ext autoreload
%autoreload 2
The Monitor has the ability to load the true fluxes and the observed fluxes (stored in MySQL databases at NERSC) for us. If you are not running this on jupyter-dev, you will have to open an ssh tunnel as described in the Monitor package setup.
Let's start by specifying the location of our truth database and setting up a truth database connection.
Since we are using the same field and chip in Runs 1.1 and 1.3, we can use the Run 1.1 stars database without issues when 1.3 is ready.
In [2]:
star_db_name = '../../twinkles_run1.1.db'
truth_dbConn = desc.monitor.TruthDBInterface(database=star_db_name, driver='sqlite')
Then we'll establish a database connection to the NERSC MySQL database for the observed data from Twinkles.
In [3]:
twinkles_dbConn = desc.monitor.DBInterface(database='DESC_Twinkles_Level_2',
                                           #host='127.0.0.1', port='3307', ## if not running on jupyter-dev
                                           host='scidb1.nersc.gov', port=3306,
                                           driver='mysql', project='Twinkles Run1.1')
And finally we'll establish the connection to the OpSim database.
In [4]:
opsim_dbConn = desc.monitor.OpsimDBInterface('../../kraken_1042_sqlite.db') ##Run 1.1 OpSim database
#opsim_dbConn = desc.monitor.OpsimDBInterface('../../minion_1016_sqlite.db')
In [5]:
dm_visit_info = twinkles_dbConn.get_all_visit_info()
In [6]:
opsim_info = opsim_dbConn.get_summary_depth_info_for_visits(1427) #1427 is the Twinkles field ID
In [7]:
# Find entries with the proper obsHistIDs in the OpSim data to match the Twinkles run
obs_list = []
for visit_id in dm_visit_info['visit_id']:
    obs_list.append(np.where(opsim_info['obsHistID'] == visit_id)[0][0])
opsim_info = opsim_info[obs_list]
In [8]:
worker = desc.monitor.Monitor(twinkles_dbConn, truth_dbConn=truth_dbConn)
In [9]:
depth_curve = worker.measure_depth_curve()
In [10]:
seeing_curve = worker.measure_seeing_curve()
We use values from the CcdVisit table in our Twinkles pserv database along with the following relations, adapted from LSE-40 (http://www.astro.washington.edu/users/ivezic/Teaching/Astr511/LSST_SNRdoc.pdf):

- seeing is a value taken directly from the CcdVisit table.
- $\sigma_{tot}$ comes from the sky_noise values in the CcdVisit table.
- $zero\_point$ below is the zero_point value in the CcdVisit table.
We have that for a single Gaussian profile and in the $\textbf{background dominated limit}$:
$n_{eff} = 2.266*(\frac{seeing}{pixel\_scale})^{2}$
$SNR = \frac{Signal}{Noise}$
$SNR = \frac{Source\_Counts}{\sigma_{tot}*\sqrt{n_{eff}}}$
Setting $SNR = 5$ gives
$Source\_Counts_{5\sigma} = 5 * \sigma_{tot} * \sqrt{n_{eff}}$
and to convert to flux in maggies we scale by the zeropoint and then multiply by 1e9 to get nanomaggies.
$F_{5} = \frac{10^{9}* Source\_Counts_{5\sigma}}{zero\_point}$
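To make the calculation concrete, here is a minimal plain-numpy sketch of the background-dominated case. The visit values in the example are invented, the 0.2 arcsec pixel scale is the nominal LSST value, and the $n_{eff}$ line stands in for the calcNeff function imported above.

import numpy as np

PIXEL_SCALE = 0.2  # arcsec/pixel; nominal LSST value (assumed for this sketch)

def five_sigma_flux_bg_limited(seeing, sigma_tot, zero_point):
    """5-sigma limiting flux in nanomaggies in the background-dominated limit.

    seeing     -- PSF FWHM in arcsec (CcdVisit seeing)
    sigma_tot  -- per-pixel noise in counts (CcdVisit sky_noise)
    zero_point -- counts corresponding to one maggie (CcdVisit zero_point)
    """
    n_eff = 2.266 * (seeing / PIXEL_SCALE)**2        # effective number of pixels
    counts_5sigma = 5. * sigma_tot * np.sqrt(n_eff)  # source counts at SNR = 5
    return 1e9 * counts_5sigma / zero_point          # counts -> nanomaggies

# Invented example values for a single visit:
print(five_sigma_flux_bg_limited(seeing=0.7, sigma_tot=25., zero_point=6.0e10))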
However, the version we use in the Monitor is modified a bit to account for the Poisson noise of the source, which yields a general form valid outside of the background-dominated limit.
$SNR = \frac{Signal}{\sqrt{Signal + Noise^2}}$
$SNR = 5 = \frac{Source\_Counts_{5\sigma}}{\sqrt{Source\_Counts_{5\sigma} + (\sigma_{tot}*\sqrt{n_{eff}})^2}}$
We solve this as a quadratic equation and take the positive root.
$Source\_Counts_{5\sigma} = \frac{25 + 25*\sqrt{1 + \frac{4}{25}*(\sigma_{tot}*\sqrt{n_{eff}})^2}}{2}$
and once again convert to flux in nanomaggies using the zeropoint and 1e9 conversion factor.
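A sketch of the general form (self-contained, with the same invented inputs as the previous sketch) just swaps in the positive root of the quadratic:

import numpy as np

PIXEL_SCALE = 0.2  # arcsec/pixel; nominal LSST value (assumed for this sketch)

def five_sigma_flux(seeing, sigma_tot, zero_point):
    """5-sigma limiting flux in nanomaggies, including source Poisson noise."""
    n_eff = 2.266 * (seeing / PIXEL_SCALE)**2
    noise_sq = n_eff * sigma_tot**2  # (sigma_tot * sqrt(n_eff))**2
    # Positive root of S^2 - 25*S - 25*noise_sq = 0
    counts_5sigma = (25. + 25. * np.sqrt(1. + (4. / 25.) * noise_sq)) / 2.
    return 1e9 * counts_5sigma / zero_point

# Slightly brighter (i.e. shallower) than the background-dominated estimate, as expected:
print(five_sigma_flux(seeing=0.7, sigma_tot=25., zero_point=6.0e10))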
These results are the $\color{red}{red\ line}$ in the plot below.
For the OpSim side of the comparison, we use the fiveSigmaDepth and FWHMeff values from the Summary table in the OpSim database.

Now we will plot and compare the two values we get for the 5-sigma depth.
In [11]:
fig = plt.figure()
bins = 15
n, bins, p = plt.hist(depth_curve.lightcurve['mag'], histtype='step', lw=4,
                      bins=bins, label='From CcdVisit', range=(21.5, 26.5))
plt.hist(opsim_info['fiveSigmaDepth'], histtype='step', bins=bins, lw=4, label='Opsim Values')
plt.legend()
plt.xlabel('5 sigma depth (mags)')
plt.ylabel('Number of Visits')
plt.title('Twinkles 1.1 5-sigma Depth')
#plt.ylim(0, 6500)
Out[11]:
And here we plot by filter.
In [12]:
fig = plt.figure(figsize=(18, 12))
fig_num = 1
for filter_val in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, fig_num)
    n, bins, p = plt.hist(depth_curve.lightcurve['mag'][depth_curve.lightcurve['bandpass'] == str('lsst'+filter_val)],
                          histtype='step', lw=4, bins=15, label='CcdVisit', range=(21.5, 26.5))
    plt.hist(opsim_info['fiveSigmaDepth'][opsim_info['filter'] == filter_val],
             histtype='step', bins=bins, lw=4, label='Opsim Values')
    if fig_num == 1:
        plt.legend()
    plt.xlabel('5 sigma depth (mags)')
    plt.ylabel('Number of Visits')
    #if fig_num == 1:
    #    plt.ylim(0, 2800)
    plt.title(filter_val)
    fig_num += 1
plt.suptitle('Twinkles 1.1 5-sigma Depth by filter')
plt.subplots_adjust(top=0.93)
There are discrepancies between the values measured from the PhoSim images and the OpSim values, and we need to look into this. The DM-calculated depths from the PhoSim images are consistently deeper than the OpSim depths, and the effect is worst in the u and y filters. This indicates that there are differences between the PhoSim sky and the OpSim sky that need to be explained.
In [13]:
plt.hist(opsim_info['FWHMeff'] - seeing_curve.seeing_curve['seeing'], range=(-0.2, 0.4), bins=20)
plt.title(r'Opsim $FWHM_{\bf{eff}}$ - DM seeing')
plt.xlabel(r'Opsim $FWHM_{\bf{eff}}$ - DM seeing (arcsec)')
plt.ylabel('# of visits')
Out[13]:
In [14]:
fig = plt.figure(figsize=(8,6))
plt.scatter(opsim_info['FWHMeff'], opsim_info['FWHMeff'] - seeing_curve.seeing_curve['seeing'])
l1, = plt.plot(np.arange(0, 1.8, 0.01), np.zeros(len(np.arange(0, 1.8, 0.01))), c='r', label='DM seeing = Opsim seeing')
plt.xlim(0, 1.8)
#plt.ylim(0, 1.8)
plt.xlabel(r'Opsim $FWHM_{\bf{eff}}$ (arcsec)')
plt.ylabel(r'Opsim $FWHM_{\bf{eff}}$ - DM seeing (arcsec)')
plt.legend([l1], ['PhoSim+DM seeing = Opsim seeing'], loc=2)
plt.title('Twinkles 1.1 Seeing Comparison')
Out[14]:
There are discrepancies here as well: the PhoSim+DM values are consistently under the OpSim values.
The large outliers were a result of cosmic-ray repair not being turned on in the processing, as discussed in this issue. We therefore do not expect to see these particular outliers in the 1.3 data.
In the rest of this notebook we compare the flux values from our PhoSim images to the "true" values that we know from the input catalogs to the simulation. We start with a position-matching method that finds matches between the objects of the observed catalogs and the truth catalogs based upon angular distance (the idea is sketched below).
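For illustration only, the core of such a position match can be written with astropy's nearest-neighbor sky matching. The coordinates below are invented stand-ins for the observed and truth catalogs; this is not the Monitor's actual implementation.

import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord

# Invented observed and truth positions (degrees), standing in for the
# catalogs the Monitor loads from the databases above.
obs_coords = SkyCoord(ra=np.array([53.010, 53.020])*u.deg,
                      dec=np.array([-27.440, -27.450])*u.deg)
truth_coords = SkyCoord(ra=np.array([53.0101, 53.0199])*u.deg,
                        dec=np.array([-27.4401, -27.4501])*u.deg)

# Nearest truth object for each observed object, with angular separation.
idx, sep2d, _ = obs_coords.match_to_catalog_sky(truth_coords)

# Keep matches within 1 arcsec, mirroring within_radius=1./3600. degrees below.
good = sep2d < 1.0*u.arcsec
print(idx[good], sep2d[good].arcsec)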
Then the worker calculates the flux residuals (observed - truth) and finds the mean and variance of this value among the objects in each visit.
Note: The following is currently Run 1.1 data since we have not finished the Run 1.3 processing. As soon as it is available we will update these plots with 1.3 data.
In [15]:
distances = worker.match_catalogs(return_distance=True, within_radius=1./3600.)
In [16]:
worker.calc_flux_residuals(depth_curve, seeing_curve)
In [17]:
fig = plt.figure(figsize=(18, 12))
i = 1
for band in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, i)
    worker.plot_bias_map(with_bins=12, in_band=band, use_existing_fig=fig)
    i += 1
plt.tight_layout()
In [18]:
fig = plt.figure(figsize=(18, 12))
i = 1
for band in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, i)
    worker.plot_bias_map(with_bins=12, in_band=band, use_existing_fig=fig,
                         normalize=True)
    i += 1
plt.tight_layout()
In [19]:
fig = plt.figure(figsize=(18, 12))
i = 1
for band in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, i)
    worker.plot_bias_scatter(in_band=band, use_existing_fig=fig)
    i += 1
    plt.gca().set_axis_bgcolor('k')  # black background for each panel
plt.tight_layout()
In [20]:
fig = plt.figure(figsize=(18, 12))
i = 1
for band in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, i)
    worker.plot_bias_scatter(in_band=band, use_existing_fig=fig, normalize=True)
    i += 1
    plt.gca().set_axis_bgcolor('k')  # black background for each panel
plt.tight_layout()
In [21]:
fig = plt.figure(figsize=(18, 12))
i = 1
for band in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, i)
    worker.plot_variance_map(with_bins=12, in_band=band, use_existing_fig=fig)
    i += 1
plt.tight_layout()
In [22]:
fig = plt.figure(figsize=(18, 12))
i = 1
for band in ['u', 'g', 'r', 'i', 'z', 'y']:
    fig.add_subplot(2, 3, i)
    worker.plot_variance_scatter(in_band=band, use_existing_fig=fig)
    i += 1
plt.tight_layout()
The bias and variance maps show a per-band effect similar to the depth and seeing comparisons above, and confirm that our priority task right now is to figure out why the depth and seeing disparities are occurring.
There are two possible sources of discrepancy, either 1) the PhoSim realization of the OpSim observation, or 2) the DM processing of the PhoSim realization. If we simply want to predict DM measurements given OpSim inputs, the error model shown here (and its descendants) will enable that. However, when looking to develop an accurate inference scheme for real data, we will need to isolate step 2) by comparing with PhoSim 'inputs'. This kind of investigation will also be enabled by the Monitor functionality demonstrated here.