This is a notebook to explore OpSim outputs in different ways, mostly useful for supernova analysis. We will look at the OpSim output called minion_1016. Other relevant OpSim outputs are listed on a confluence page. All of the newer outputs are in the format of sqlite databases (zipped up to save space), while older OpSim versions are in ASCII files. The database table which I think has all of the information we will need is the Summary table. The quantities in the columns of this table are described [here](https://confluence.lsstcorp.org/display/SIM/Summary+Table+Column+Descriptions). The OpSim outputs are official products that the LSST project provides; this notebook demonstrates one way of exploring them (which is by no means official or project supplied).
Note: The Summary table is only about a GB in size and has on the order of a million rows. For newish laptops this is a small table and can easily be read into memory. For more memory-constrained systems, or computations that themselves require a lot of memory, this may be a bad thing to do.

Gotcha: The column obsHistID is a unique identifier of OpSim observations. Very often you might end up with a table with multiple rows sharing the same obsHistID but with other columns (like propID) having different values. These are not different observations, and an SN light curve corresponding to these observations should include only one of them.
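A quick way to check for and drop such repeats with pandas is to test the index for duplicates. This is a sketch on a toy frame; only the column names obsHistID and propID are taken from the Summary table, the values are made up:

```python
import pandas as pd

# Toy stand-in for the Summary table: obsHistID 102 appears under two
# different proposals (propID), but it is a single observation.
summary = pd.DataFrame({
    'obsHistID': [101, 102, 102, 103],
    'propID':    [54, 54, 56, 54],
    'expMJD':    [59580.1, 59580.2, 59580.2, 59580.3],
}).set_index('obsHistID')

# How many rows repeat an obsHistID already seen?
print(summary.index.duplicated().sum())  # 1

# Keep only one row per observation
unique_obs = summary[~summary.index.duplicated(keep='first')]
print(len(unique_obs))  # 3
```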
The LSST observations cover roughly half of the sky (~20000 sq degrees, or ~2π steradians). This is covered by overlapping pointings of the telescope, each covering ~10 sq degrees. These pointings are currently thought of as being located on a grid, along with dithers. The grid locations of the pointings are assigned a unique integer called fieldID, which is associated with its location (fieldRA, fieldDec). The actual locations of the pointings (including dithers) are in (ditheredRA, ditheredDec). There are a number of columns that hold similar quantities, and the exact definition of each quantity is provided in the description. For most purposes, filtSkyBrightness and FWHMeff, rather than the other similar-looking quantities, are probably recommended. propID (also described there) refers to the proposal under which the observation was taken. For basic SN purposes, we will want the proposals Wide Fast Deep (WFD) and Deep Drilling Fields (DDF). The propIDs for these proposals differ from one OpSim output to another. To find out what they are, you can use OpSim utils as in this notebook, or look at the PROPOSAL table (5 lines) like this, and pick the integers corresponding to DDcosmology1.conf and Universal ...
SELECT * FROM PROPOSAL;
362|../conf/survey/GalacticPlaneProp.conf|WL|27692624|enigma|1189
363|../conf/survey/SouthCelestialPole-18.conf|WL|27692368|enigma|1189
364|../conf/survey/Universal-18-0824B.conf|WLTSS|27692112|enigma|1189
365|../conf/survey/NorthEclipticSpur-18c.conf|WLTSS|27692240|enigma|1189
366|../conf/survey/DDcosmology1.conf|WLTSS|29065424|enigma|1189
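The same lookup can be scripted with the standard sqlite3 module. This is a sketch on an in-memory database seeded with two of the rows above (the column names propID and propConf are assumed from the dump; the real PROPOSAL table has more columns); point the connection at your OpSim sqlite file instead:

```python
import sqlite3

# In-memory stand-in for an OpSim database's PROPOSAL table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE PROPOSAL (propID INT, propConf TEXT, propName TEXT)')
conn.executemany('INSERT INTO PROPOSAL VALUES (?, ?, ?)', [
    (364, '../conf/survey/Universal-18-0824B.conf', 'WLTSS'),
    (366, '../conf/survey/DDcosmology1.conf', 'WLTSS'),
])

# Pick out the propIDs for the WFD (Universal) and DDF (DDcosmology) proposals
rows = conn.execute(
    "SELECT propID, propConf FROM PROPOSAL "
    "WHERE propConf LIKE '%Universal%' OR propConf LIKE '%DDcosmology%'"
).fetchall()
print(sorted(r[0] for r in rows))  # [364, 366]
```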
In [1]:
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
In [2]:
# Required packages: sqlalchemy, pandas (both are part of the anaconda distribution, or can be installed with a python installer)
# One step requires the LSST stack, and can be skipped for the particular OpSim database used here
import opsimsummary as oss
import opsimsummary.summarize_opsim as so
from sqlalchemy import create_engine
import pandas as pd
print(so.__file__)
In [3]:
# This step requires the LSST SIMS package MAF. The main goal of this step is to set DD and WFD to the integer keys
# that label an observation as Deep Drilling or Wide Fast Deep.
# If all you care about is the database used in this example, you can skip this step by uncommenting the next cell
# and commenting out this one. But there is no guarantee that the numbers in the cell below will work
# on other versions of opsim database outputs
from lsst.sims.maf import db
from lsst.sims.maf.utils import opsimUtils
In [4]:
# DD = 56
# WFD = 54
Here we will use the OpSim output minion_1016. I have downloaded this database, unzipped it, and use the variable dbname to point to its location.
In [5]:
# Change dbname to point at your own location of the opsim output
dbname = '/Users/rbiswas/data/LSST/OpSimData/minion_1016_sqlite.db'
opsdb = db.OpsimDatabase(dbname)
propID, propTags = opsdb.fetchPropInfo()
DD = propTags['DD'][0]
WFD = propTags['WFD'][0]
In [6]:
print("The propID for the Deep Drilling Field {0:2d}".format(DD))
print("The propID for the Wide Fast Deep Field {0:2d}".format(WFD))
Here we will read the opsim database into a pandas.DataFrame
In [7]:
engine = create_engine('sqlite:///' + dbname)
The opsim database is a large file (approx 4.0 GB), but it is still possible to read it into memory on new computers. You usually only need the Summary table, which is about 900 MB. If you are only interested in the Deep Drilling Fields, you can use read_sql_query to select only the information pertaining to Deep Drilling observations; this has a memory footprint of about 40 MB. Obviously, you can reduce this further by narrowing down the columns to only those of interest. For the entire Summary table, this step takes a few minutes on my computer.
If you are going to do the read-from-disk step very often, you can further reduce the time by storing the output on disk as an HDF5 file and reading that into memory.
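The cache-on-disk pattern looks like the following. The helper name load_cached is hypothetical; it is sketched with to_pickle/read_pickle to stay dependency-free, but to_hdf/read_hdf (which require the PyTables package) can be swapped in for the HDF5 version:

```python
import os
import pandas as pd

def load_cached(cache_path, compute):
    """Return the DataFrame from cache_path if it exists; otherwise call
    compute() (e.g. a slow pd.read_sql_query), cache the result, return it."""
    if os.path.exists(cache_path):
        return pd.read_pickle(cache_path)   # fast path: read the cache
    df = compute()                          # slow path: runs only once
    df.to_pickle(cache_path)                # or df.to_hdf(cache_path, 'table')
    return df
```

Usage would then be something like `load_cached('summary.pkl', lambda: pd.read_sql_query(query, engine))`; the first call pays the SQL cost, later calls read the cache.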
We will look at three different summaries of OpSim runs.
In [8]:
# Load to a dataframe
# Summary = pd.read_hdf('storage.h5', 'table') # This loads the table from a hdf file which I store as intermediate result, this is extremely quick
# Summary = pd.read_sql_table('Summary', engine, index_col='obsHistID') # loads all of the summary table
# EnigmaDeep = pd.read_sql_query('SELECT * FROM SUMMARY WHERE PROPID is 366', engine) # loads only the DDF
If we knew the proposal IDs ahead of time, we could have done this more quickly using the following query:
In [9]:
OpSim_combined = pd.read_sql_query('SELECT * FROM SUMMARY WHERE PROPID in ({0}, {1})'.format(DD, WFD), engine, index_col='obsHistID')
In [10]:
OpSim_combined.head()
Out[10]:
This could have duplicates, unlike the OpSim Deep case, because there are now two proposal IDs, both of which may correspond to the same observation. We can check whether this is indeed the case by:
In [11]:
len(OpSim_combined) == len(OpSim_combined.index.unique())
Out[11]:
Dropping the duplicate pointings can be done in the following way: reset_index() makes 'obsHistID' an ordinary column rather than the index, and drop_duplicates drops rows that share a value of that column, keeping the first.
In [12]:
# Note: drop_duplicates(inplace=True) on the temporary returned by reset_index() would be lost;
# assign the result back and restore the index instead.
OpSim_combined = OpSim_combined.reset_index().drop_duplicates(subset='obsHistID').set_index('obsHistID')
In [13]:
OpSim_combined.head()
Out[13]:
In [14]:
len(OpSim_combined) == len(OpSim_combined.index.unique())
Out[14]:
In [15]:
OpSim_Deep = pd.read_sql_query('SELECT * FROM SUMMARY WHERE PROPID is ' + str(DD), engine, index_col='obsHistID')
We can also sub-select this from the all-encompassing Summary table. This can be done in two ways:
In [16]:
OpSim_combined.propID.unique()
Out[16]:
In [17]:
OpSim_Deep.propID.unique()
Out[17]:
In [18]:
# Get fieldID closest to ra, dec
fieldIDFromRADec = oss.fieldID(OpSim_combined, np.radians(190.), np.radians(-83.0))
print(fieldIDFromRADec)
In [19]:
OpSimDeepSummary = so.SummaryOpsim(OpSim_Deep)
OpSimCombinedSummary = so.SummaryOpsim(OpSim_combined)
In [20]:
fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(111, projection='mollweide');
fig = OpSimDeepSummary.showFields(ax=fig.axes[0], marker='o', s=40)
In [21]:
DDF_fieldIDs = OpSim_Deep.fieldID.unique()
DDF_fieldIDs
Out[21]:
In [22]:
grouped = OpSim_Deep[['night', 'fieldID']].groupby(['night'])
In [23]:
fig, ax = plt.subplots()
grouped.agg({'fieldID': lambda x: x.unique().size}).hist(ax=ax)
Out[23]:
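The groupby/agg step above can be illustrated on a toy table (made-up nights and fieldIDs): for each night, count how many distinct fields were visited. `nunique` is the idiomatic equivalent of the `lambda x: x.unique().size` used above:

```python
import pandas as pd

# Toy visit table: several visits per night, some to the same field
visits = pd.DataFrame({
    'night':   [1, 1, 1, 2, 2, 3],
    'fieldID': [744, 744, 1427, 744, 2786, 1427],
})

# Number of distinct fields visited on each night
fields_per_night = visits.groupby('night')['fieldID'].nunique()
print(fields_per_night.tolist())  # [2, 2, 1]
```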
In [24]:
fig.savefig('DeepOnly_uniqueDDFFields.png')
What about if we count WFD visits to these fields ?
In [25]:
OpSimCombined_DDF_Fields = OpSim_combined.query('fieldID in @DDF_fieldIDs')
In [26]:
len(OpSimCombined_DDF_Fields)
Out[26]:
In [27]:
OpSimCombined_DDF_Fields.fieldID.unique()
Out[27]:
In [28]:
combinedGrouped = OpSimCombined_DDF_Fields.groupby('night')
In [29]:
fig, ax = plt.subplots()
combinedGrouped.agg({'fieldID': lambda x: x.unique().size}).hist(ax=ax)
Out[29]:
In [30]:
fig.savefig('Combined_DDFvisits.png')
In [31]:
fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(111, projection='mollweide');
opsimFieldSummary = so.SummaryOpsim(OpSim_combined.query('fieldID==347'))
fig = opsimFieldSummary.showFields(ax=fig.axes[0], marker='o', s=40)
In [32]:
OpSimCombinedSummary.showFields(ax=ax, marker='o', color='r', s=8)
Out[32]:
In [33]:
fieldIDFromRADec = oss.fieldID(OpSim_Deep, np.radians(53.), np.radians(-28.))
print(fieldIDFromRADec)
In [34]:
fieldIDFromRADec = oss.fieldID(OpSim_Deep, np.radians(0.), np.radians(-45.))
print(fieldIDFromRADec)
In [35]:
fieldIDFromRADec = oss.fieldID(OpSim_combined, np.radians(53.), np.radians(-28.))
print(fieldIDFromRADec)
In [36]:
fieldIDFromRADec = oss.fieldID(OpSim_combined, np.radians(85.7), np.radians(-14.4))
print(fieldIDFromRADec)
In [57]:
fieldList = OpSimCombinedSummary.fieldIds
In [38]:
OpSim_combined.columns
Out[38]:
In [39]:
DDF_fieldIDs
Out[39]:
In [40]:
selectedFields = OpSim_combined.query('fieldID in @DDF_fieldIDs')[['expMJD', 'propID', 'filter', 'fieldRA', 'fieldDec', 'fieldID', 'fiveSigmaDepth']]#.head()
In [41]:
selectedFields.head()
Out[41]:
In [42]:
selectedFields.to_hdf('DDF_Fields_Info.hdf', 'summaryInfo')
In [43]:
# CHECK
read_table = pd.read_hdf('DDF_Fields_Info.hdf', 'summaryInfo')
In [44]:
read_table.head()
Out[44]:
In [45]:
from pandas.util.testing import assert_frame_equal
In [46]:
assert_frame_equal(read_table, selectedFields)
We can visualize the cadence during the first season using the cadence plot for a particular field. The following plot shows how many visits we have in each filter on a given night:
In [47]:
firstSeasonDeep = OpSimDeepSummary.cadence_plot(fieldID=1427, observedOnly=False, sql_query='night < 366')
In [48]:
firstSeasonCombined = OpSimCombinedSummary.cadence_plot(fieldID=1427, observedOnly=False, sql_query='night < 366')
In [80]:
firstSeasonCombined[0].savefig('minion_1427.pdf')
In [49]:
firstSeason = OpSimDeepSummary.cadence_plot(fieldID=744, observedOnly=False, sql_query='night < 732',
nightMin=0, nightMax=732)
In [58]:
tenCadence = OpSimCombinedSummary.cadence_plot(fieldID=fieldList[2000], observedOnly=False, sql_query='night < 3500', nightMax=3500)
Suppose we have a supernova that peaks around a particular MJD (59640 in the example below), and we want to see what observations happened around it:
In [51]:
SN = OpSimDeepSummary.cadence_plot(summarydf=OpSimDeepSummary.df, fieldID=1427, #racol='fieldRA', deccol='fieldDec',
observedOnly=False, mjd_center=59640., mjd_range=[-30., 50.])
# ax = plt.gca()
# ax.axvline(49540, color='r', lw=2.)
# ax.xaxis.get_major_formatter().set_useOffset(False)
In [52]:
SN[0].savefig('SN_observation.pdf')
In [41]:
SN_matrix[SN_matrix > 0.5] = 1
In [52]:
SN_matrix.sum().sum()
Out[52]:
In [55]:
len(SN_matrix.sum(axis=1).dropna())
Out[55]:
In [56]:
nightlySN_matrix = SN_matrix.copy(deep=True)
In [57]:
nightlySN_matrix[SN_matrix > 0.5] =1
In [64]:
nightlySN_matrix.sum(axis=1).dropna().sum()
Out[64]:
In [69]:
nightlySN_matrix.sum(axis=1).dropna().size
Out[69]: