This is a notebook exploring OpSim outputs in different ways, mostly useful for supernova analysis. We will look at the OpSim output called Enigma_1189.


In [1]:
import numpy as np
%matplotlib inline 
import matplotlib.pyplot as plt

In [2]:
# Required packages: sqlalchemy and pandas (both are part of the Anaconda distribution, or can be installed with a python installer)
# One step below requires the LSST stack; it can be skipped if all you care about is the particular OpSim database used here
import OpSimSummary.summarize_opsim as so
from sqlalchemy import create_engine
import pandas as pd
print so.__file__


/Users/rbiswas/.local/lib/python2.7/site-packages/OpSimSummary-0.0.1.dev0-py2.7.egg/OpSimSummary/summarize_opsim.py

In [3]:
# This step requires the LSST SIMS package MAF. Its goal is to obtain the integer propID keys that
# label an observation as Deep Drilling (DD) or Wide Fast Deep (WFD).
# If you want to skip this step, uncomment the next cell and comment out this one, provided all you
# care about is the database used in this example. But there is no guarantee that the numbers in the
# next cell will work on other versions of OpSim database outputs.

#from lsst.sims.maf import db
#from lsst.sims.maf.utils import opsimUtils

In [4]:
# DD = 366
# WFD = 364

Read in the OpSim output (modern versions use the sqlite format)

Descriptions of OpSim outputs are available at https://confluence.lsstcorp.org/display/SIM/OpSim+Datasets+for+Cadence+Workshop+LSST2015 and http://tusken.astro.washington.edu:8080. Here we will use the OpSim output http://ops2.tuc.noao.edu/runs/enigma_1189/data/enigma_1189_sqlite.db.gz. I have downloaded this database, unzipped it, and use the variable dbname to point to its location.


In [5]:
# Change dbname to point at your own location of the opsim output
dbname = '/Users/rbiswas/data/LSST/OpSimData/enigma_1189_sqlite.db'
#opsdb = db.OpsimDatabase(dbname)
#propID, propTags = opsdb.fetchPropInfo()
#DD = propTags['DD'][0]
#WFD = propTags['WFD'][0]

Read the OpSim database into a pandas DataFrame


In [6]:
engine = create_engine('sqlite:///' + dbname)

The OpSim database is a large file (approx. 4.0 GB), but it is still possible to read it into memory on newer computers. You usually only need the Summary table, which is about 900 MB. If you are only interested in the Deep Drilling fields, you can use pd.read_sql_query to select only the rows pertaining to Deep Drilling observations; this has a memory footprint of about 40 MB. Obviously, you can reduce this further by narrowing down to the columns of interest. For the entire Summary table, this step takes a few minutes on my computer.
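As a sketch of the memory-saving route, the snippet below builds a tiny stand-in Summary table in an in-memory sqlite database (all IDs and values hypothetical) and pulls out only the Deep Drilling rows and columns with pd.read_sql_query:

```python
import pandas as pd
from sqlalchemy import create_engine

# Tiny stand-in for the OpSim Summary table in an in-memory sqlite
# database (the real table has ~48 columns and millions of rows).
demo_engine = create_engine('sqlite://')
toy = pd.DataFrame({'obsHistID': [1, 2, 3, 4],
                    'propID': [364, 364, 366, 366],
                    'fieldID': [310, 311, 290, 290],
                    'expMJD': [49353.1, 49353.2, 49531.2, 49531.3]})
toy.to_sql('Summary', demo_engine, index=False)

# Selecting only the Deep Drilling rows (propID = 366 here) and the
# columns of interest keeps the memory footprint small compared to
# pd.read_sql_table on the full table.
deep = pd.read_sql_query('SELECT obsHistID, expMJD FROM Summary '
                         'WHERE propID = 366', demo_engine,
                         index_col='obsHistID')
print(len(deep))
```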

If you are going to repeat the read-from-disk step often, you can further reduce the time by storing the table on disk as an HDF5 file and reading that into memory.
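The caching pattern looks like the sketch below (toy table; to_pickle/read_pickle is used here as a dependency-free stand-in for the notebook's to_hdf('storage.h5', 'table') / pd.read_hdf('storage.h5', 'table'), which require the PyTables package):

```python
import os
import tempfile
import pandas as pd

# Pay the slow SQL read once, then reload a fast binary file on later runs.
cache = os.path.join(tempfile.mkdtemp(), 'summary_cache.pkl')

if os.path.exists(cache):
    summary = pd.read_pickle(cache)                 # fast path on reruns
else:
    summary = pd.DataFrame({'propID': [364, 366]})  # stand-in for the slow SQL read
    summary.to_pickle(cache)                        # write the cache for next time
```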

We will look at three different summaries of the OpSim run:

  1. Deep Drilling fields: the observations whose propID matches the variable DD above, restricted to a handful of fields
  2. WFD (Main) Survey: the observations whose propID matches the variable WFD
  3. Combined Survey: the observations combining DD and WFD. Inside the Deep Drilling fields this leads to duplicate observations, which must be subsequently dropped.
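The duplicate-dropping step in item 3 can be sketched on a toy table (all IDs hypothetical): a visit inside a Deep Drilling field can be credited to both proposals, appearing once per propID under the same obsHistID, so we keep one copy of each index value:

```python
import pandas as pd

# Toy Summary: obsHistID 3 is credited to both WFD (364) and DD (366).
summary = pd.DataFrame({'obsHistID': [1, 2, 3, 3],
                        'propID': [364, 364, 364, 366],
                        'fieldID': [310, 311, 290, 290]}).set_index('obsHistID')

# Select both proposals, then drop the duplicated visits by index.
combined = summary.query('propID in [364, 366]')
deduped = combined[~combined.index.duplicated(keep='first')]
print(len(combined), len(deduped))
```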

In [7]:
# Load to a dataframe
# Summary = pd.read_hdf('storage.h5', 'table')
Summary = pd.read_sql_table('Summary', engine, index_col='obsHistID')
# EnigmaDeep  = pd.read_sql_query('SELECT * FROM SUMMARY WHERE PROPID is 366', engine)

In [8]:
EnigmaCombined = Summary.query('propID == [364, 366]')

In [9]:
EnigmaCombined.propID.unique()


Out[9]:
array([364, 366])

Some properties of the OpSim Outputs


In [10]:
EnigmaCombined.fieldID.unique().size


Out[10]:
2295

Construct our Summary


In [11]:
Full = so.SummaryOpsim(EnigmaCombined)

In [12]:
fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(111, projection='mollweide');
fig = Full.showFields(ax=fig.axes[0], marker='o', s=1)


/usr/local/manual/anaconda/lib/python2.7/site-packages/matplotlib/collections.py:590: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if self._edgecolors == str('face'):

First Season

We can visualize the cadence during the first season using the cadence plot for a particular field. The following plot shows how many visits we have in each filter on each night:
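The matrix behind such a cadence plot is just a visit count per (filter, night) pair. A minimal sketch with a toy visit table (hypothetical values):

```python
import pandas as pd

# Toy set of visits to one field over three nights.
visits = pd.DataFrame({'night': [0, 0, 0, 1, 2, 2],
                       'filter': ['g', 'r', 'r', 'i', 'r', 'z']})

# Count visits per filter (rows) and night (columns); missing
# combinations come out as 0.
cadence = pd.crosstab(visits['filter'], visits['night'])
print(cadence.loc['r', 0])
```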


In [13]:
fieldList = Full.fieldIds

In [14]:
len(fieldList)


Out[14]:
2295

Example: obtain the observations of a field during a 100-day period

First find the fieldID with a center closest to your coordinates. The fields have a radial size of about 1.75 degrees. I would suggest just working with a fieldID, as you probably don't care about the exact coordinates. Then the following query gets this done. Alternatively, you could achieve some of these goals using SQL queries on the OpSim database.
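A minimal sketch of that closest-field search (toy field table, hypothetical IDs and positions), using the haversine formula on fieldRA/fieldDec, which OpSim stores in radians:

```python
import numpy as np

def angsep(ra1, dec1, ra2, dec2):
    """Angular separation in radians between points on the sphere
    (haversine formula); all inputs in radians."""
    dra, ddec = ra2 - ra1, dec2 - dec1
    a = np.sin(ddec / 2.)**2 + np.cos(dec1) * np.cos(dec2) * np.sin(dra / 2.)**2
    return 2. * np.arcsin(np.sqrt(a))

# Toy table of field centers; in practice take the unique
# (fieldID, fieldRA, fieldDec) rows from the Summary table.
fieldID = np.array([290, 744, 1427])
fieldRA = np.array([6.098, 1.500, 3.100])
fieldDec = np.array([-1.105, 0.200, -0.500])

# Field whose center is closest to a target position (radians):
targetRA, targetDec = 6.09, -1.10
closest = fieldID[np.argmin(angsep(fieldRA, fieldDec, targetRA, targetDec))]
print(closest)
```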


In [24]:
selected = Full.df.query('fieldID == 290 and expMJD > 49490 and expMJD < 49590')

In [26]:
selected.head()


Out[26]:
sessionID propID fieldID fieldRA fieldDec filter expDate expMJD night visitTime ... humidity slewDist slewTime fiveSigmaDepth ditheredRA ditheredDec MJDay simLibPsf simLibZPTAVG simLibSkySig
obsHistID
140297 1189 366 290 6.097944 -1.10516 r 15402156 49531.265704 178 34 ... 0 0.93277 83.261838 23.364830 6.076685 -1.088628 49531 2.457219 31.682579 48.425003
140298 1189 366 290 6.097944 -1.10516 r 15402193 49531.266133 178 34 ... 0 0.00000 3.000000 23.294772 6.076685 -1.088628 49531 2.625730 31.682781 48.364798
140299 1189 366 290 6.097944 -1.10516 r 15402229 49531.266549 178 34 ... 0 0.00000 2.000000 23.296010 6.076685 -1.088628 49531 2.624079 31.683107 48.354382
140300 1189 366 290 6.097944 -1.10516 r 15402266 49531.266978 178 34 ... 0 0.00000 3.000000 23.297206 6.076685 -1.088628 49531 2.622481 31.683424 48.344477
140301 1189 366 290 6.097944 -1.10516 r 15402303 49531.267406 178 34 ... 0 0.00000 3.000000 23.298434 6.076685 -1.088628 49531 2.620845 31.683747 48.334124

5 rows × 48 columns


In [27]:
# write to disk in ascii file
selected.to_csv('selected_obs.csv', index_label='obsHistID')

In [34]:
# write to disk in ascii file with selected columns
selected[['expMJD', 'night', 'filter', 'fiveSigmaDepth', 'filtSkyBrightness', 'finSeeing']].to_csv('selected_cols.csv', index_label='obsHistID')

Plots


In [28]:
fig_firstSeason, firstSeasonCadence = Full.cadence_plot(fieldList[0], observedOnly=False, sql_query='night < 366')


This is a DDF.

  • Many observations per night
  • Often about 20-25 per filter per night

In [14]:
fig_firstSeason_1, firstSeasonCadence_1 = Full.cadence_plot(fieldList[0], observedOnly=True, sql_query='night < 366')


WFD field


In [15]:
fig_firstSeason_main, firstSeasonCadence_main = Full.cadence_plot(fieldList[1], observedOnly=False, sql_query='night < 366')



In [16]:
fig_long, figCadence_long  = Full.cadence_plot(fieldList[0], observedOnly=False, sql_query='night < 3655', nightMax=3655)



In [18]:
fig_2, figCadence_2  = Full.cadence_plot(fieldList[0], observedOnly=False, 
                                         sql_query='night < 720', nightMax=720, nightMin=365)



In [20]:
fig_SN, SN_matrix = Full.cadence_plot(fieldList[0], observedOnly=False, mjd_center=49540., mjd_range=[-30., 50.])