O. Smirnov <o.smirnov@ru.ac.za>, January 2018
Radiopadre is a framework, built on the Jupyter notebook, for browsing and visualizing data reduction products. It is particularly useful for visualizing data products on remote servers, where connection latencies and/or a lack of installed software limit the usual visualization options. It includes integration with the JS9 browser-based FITS viewer (with CARTA integration coming soon).
The general use case for Radiopadre is "here I am sitting with a slow ssh connection into a remote cluster node, my pipeline has produced 500 plots/logs/FITS images, how do I make sense of this mess?" More specifically, there are three (somewhat overlapping) scenarios that Radiopadre is designed for:
Just browsing: interactively exploring the aforementioned 500 files using a notebook.
Automated reporting: customized Radiopadre notebooks that automatically generate a report composed of a pipeline's outputs and intermediate products. Since your pipeline's output is (hopefully!) structured, i.e. in terms of filename conventions etc., you can write a notebook to exploit that structure and make a corresponding report automatically.
Sharing notebooks: fiddle with a notebook until everything is visualized just right, insert explanatory text in Markdown cells in between, and voilà, you have an instant report you can share with colleagues.
Refer to README.md on the github repository: https://github.com/ratt-ru/radiopadre
Data files for this tutorial are available here: https://www.dropbox.com/sh/be4pc23rsavj67s/AAB2Ejv8cLsVT8wj60DiqS8Ya?dl=0
Download the tutorial and untar it somewhere. Then run Radiopadre (locally or remotely, if you unpacked the tutorial on a remote node) in the resulting directory. A Jupyter console will pop up in your browser. Click on radiopadre-tutorial.ipynb to open it in a separate window, then click the "Run all" button on the toolbar (or use "Cell|Run all" in the menu, which is the same thing). Wait for the notebook to run through and render, then carry on reading.
In [ ]:
from radiopadre import ls, settings
dd = ls() # calls radiopadre.ls() to get a directory listing, assigns this to dd
dd # standard notebook feature: the result of the last expression on the cell is rendered in HTML
In [ ]:
dd.show()
print "Calling .show() on an object renders it in HTML anyway, same as if it was the last statement in the cell"
So what can you see from the above? dd is a directory object that can render itself -- you get a directory listing. Clearly, Radiopadre can recognize certain types of files -- you can see an images/ subdirectory above, a measurement set, a couple of FITS files, some PNG images, etc. Clicking on a file will either download it or display it in a new tab (this works well for PNG or text files -- don't click on FITS files unless you mean to download a whole copy!) FITS files have a "JS9" button next to them that invokes the JS9 viewer either below the cell, or in a new browser tab. Try it!
Now let's get some objects from the directory listing and get them to render.
In [ ]:
images_subdir = dd[0]
demo_ms = dd[1]
fits_image = dd[2]
log_file = dd[-1] # last file in directory... consistent with Python list syntax
images_subdir.show()
demo_ms.show(_=(32,0)) # _ selects channels/correlations... more detail later
fits_image.show()
log_file.show()
# be prepared for a lot of output below... scroll through it
In [ ]:
images_subdir[5:10]
Since a directory is a list of files, it makes sense that the Python slice syntax [5:10] returns an object that is also a list of files. There are other list-like objects in radiopadre. For example, an MS can be considered a list of rows. So...
In [ ]:
sub_ms = demo_ms[5:10] # gives us a table containing rows 5 through 9 of the MS
sub_ms.show(_=(32,0)) # _ selects channels/correlations... more detail later
And a text file is really just a list of lines, so:
In [ ]:
log_file[-10:] # extract last ten lines and show them
NB: FITS images and PNG images are not lists in any sense, so this syntax doesn't work on them. (In the future I'll consider supporting numpy-like slicing, e.g. [100:200,100:200], to transparently extract subsections of images, but for now this is not implemented.)
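The "slice of a list-like object is another list-like object" behaviour can be illustrated with a minimal, self-contained sketch. The LineList class below is hypothetical and written only for this illustration; it is not radiopadre's actual implementation.

```python
class LineList(object):
    """Hypothetical stand-in for a list-like object (not a radiopadre class)."""

    def __init__(self, lines):
        self.lines = list(lines)

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, index):
        # A slice returns another LineList, so the result can be sliced again
        if isinstance(index, slice):
            return LineList(self.lines[index])
        # An integer index returns the single element
        return self.lines[index]


log = LineList("line %d" % i for i in range(100))
last_ten = log[-10:]           # still a LineList, holding the last ten lines
first_of_those = last_ten[0]   # a plain string
```

Because the slice keeps the container type, anything the container knows how to do (rendering, further slicing) remains available on the result.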
In [ ]:
png_files = dd("*.png") # on directories, () works like a shell pattern
png_files
In [ ]:
log_file("Gain plots") # on text files, () works like grep
In [ ]:
demo_ms("ANTENNA1==1").show(_=(32,0)) # on tables, () does a TaQL query
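For readers unfamiliar with the two kinds of matching used by () above, here is a rough standard-library analogy. It only mimics the idea on plain Python lists; the file names and log lines are made up, and whether radiopadre treats the text-file argument as a regular expression or a plain substring is not stated here (the sketch assumes a regular expression search).

```python
import fnmatch
import re

# Made-up file names, for illustration only
filenames = ["a.png", "b.png", "image.fits", "log-pipeline.txt"]
# Shell-style pattern match, comparable in spirit to dd("*.png")
png_names = fnmatch.filter(filenames, "*.png")

# Made-up log lines, for illustration only
log_lines = ["Starting calibration", "Gain plots written to plots/", "Done"]
# grep-style search, comparable in spirit to log_file("Gain plots")
gain_lines = [line for line in log_lines if re.search("Gain plots", line)]
```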
In [ ]:
png_files.thumbs() # for PNG images, these are nice and clickable!
And calling .images on a directory returns a list of images, for which we can, of course, render thumbnails:
In [ ]:
images_subdir.images.thumbs()
Other such "list of files by type" attributes are .fits, .tables, and .dirs:
In [ ]:
dd.fits.show()
dd.tables.show()
dd.dirs.show()
In [ ]:
dd.fits.thumbs(vmin=-1e-4, vmax=0.01) # and FITS files also know how to make themselves a thumbnail
# note that thumbs() takes optional arguments just like show()
And the show_all() method will call show() on every file object in the list. This is useful if you want to render a bunch of objects with the same parameters:
In [ ]:
# note the difference: dd.fits selects all files of type FITS, dd("*fits") selects all files matching "*fits".
# In our case this happens to be one and the same thing, but it doesn't have to be
dd("*fits").show_all(vmin=0, vmax=1e-2, colormap='hot')
# show_all() passes all its arguments to the show() method of each file.
In [ ]:
dirties = dd("j0839-5417_2-MFS-dirty.fits")
print "This is a list:", type(dirties), len(dirties) # this is a list even though we only specified one file
print "This is a single file:", type(dirties[0]) # so we have to use [0] to get at the FITS file itself
# Note that the summary attribute returns a short summary of any radiopadre object (as text or HTML).
# You can show() or print it
print "This is a summary of the list:",dirties.summary
print "And now in HTML:"
dirties.summary.show()
print "This is a summary of the file:",dirties[0].summary
print "And now in HTML:"
dirties[0].summary.show()
If you want to get at one specific file, using dd(name_or_pattern)[0] becomes a hassle. Filelists therefore support a direct [name_or_pattern] operation which always returns a single file object. If name_or_pattern matches multiple files, only the first one is returned (but radiopadre will show you a transient warning message).
In [ ]:
dirty_image = dd["*fits"] # matches 2 files. if you re-execute this with Ctrl+Enter, you'll see a warning
print type(dirty_image)
In [ ]:
dirty_image = dd["*dirty*fits"] # this will match just the one file
dirty_image.show()
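The difference between (pattern), which returns a list, and [pattern], which returns a single item, can be sketched as follows. The FileList class and the file names are hypothetical, and raising KeyError when nothing matches is an assumption of this sketch rather than documented radiopadre behaviour.

```python
import fnmatch
import warnings


class FileList(object):
    """Hypothetical file list; illustrates the lookup rule only."""

    def __init__(self, names):
        self.names = list(names)

    def __call__(self, pattern):
        # () returns a (possibly empty) list of every match
        return FileList(fnmatch.filter(self.names, pattern))

    def __getitem__(self, key):
        if not isinstance(key, str):
            return self.names[key]        # ordinary integer or slice indexing
        matches = fnmatch.filter(self.names, key)
        if not matches:
            raise KeyError(key)           # assumption: no match is an error
        if len(matches) > 1:
            warnings.warn("%r matches %d files, returning the first" % (key, len(matches)))
        return matches[0]


dd_sketch = FileList(["x-dirty.fits", "x-image.fits", "log.txt"])
single = dd_sketch["*dirty*fits"]         # exactly one match, no warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = dd_sketch["*fits"]            # two matches: first returned, warning recorded
```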
In [ ]:
log_file
In [ ]:
log_file.head(5) # same as log_file.show(head=5). Number is optional -- default is 10
In [ ]:
log_file.tail(5) # same as log_file.show(tail=5)
In [ ]:
log_file.full() # same as log_file.show(full=True). Use the scrollbar next to the cell output.
In [ ]:
log_file("Gain") # same as log_file.grep("Gain") or log_file.show(grep="Gain")
In [ ]:
# and of course all objects are just "lists of lines", so the normal list slicing syntax works
log_file("Gain")[10:20].show()
log_file("Gain")[-1]
If you're still running a reduction and want to keep an eye on a log file that's being updated, use the .watch() method. This works exactly like .show() and takes the same arguments, but adds a "refresh" button at the top right corner of the cell, which re-executes the cell every time you click it.
In [ ]:
log_file.watch(head=0, tail=10)
In [ ]:
dd.sh("df -h")
In [ ]:
dd.sh("df -h")("/boot")
In [ ]:
dirty_image.summary.show()
dirty_image.js9()
With multiple FITS files, it's possible to load all of them into JS9, and use the "<" and ">" keys to switch between images. Use the "JS9 all" button to do this:
In [ ]:
dd("*fits")
There's a shortcut for doing this directly -- just call .js9() on a list of FITS files. Note that "collective" functions such as .thumbs() and .js9() will only work on homogeneous filelists, e.g. lists of FITS files. Don't try calling them on a list containing a mix of files -- it won't work!
In [ ]:
# If you're wondering how to tell JS9 to start with specific scale settings, use the "with settings" trick
# shown here. It will be explained below...
with settings.fits(vmin=-1e-4, vmax=0.01):
    dd("*fits").js9()
The .header attribute of a FITS file object returns the FITS header, in the same kind of object (list-of-lines) as a text file. So all the tricks we did on text files above still apply:
In [ ]:
dirty_image.header
In [ ]:
dirty_image.header("CDELT*")
In [ ]:
dirty_image.header.full()
If you want to read in data from the FITS file, the .fitsobj attribute returns a PrimaryHDU object, just like astropy.io.fits.open(filename)[0] would:
In [ ]:
dirty_image.fitsobj
In [ ]:
demo_ms
With optional arguments to .show(), you can render just a subset of rows (given as start_row, nrows), and a subset of columns, taking a slice through an array column. The below tells radiopadre to render the first 10 rows, taking the column TIME in its entirety, and taking a [32:34,:] slice through the DATA column.
In [ ]:
demo_ms.show(0,10,TIME=(),DATA=(slice(32,34),None))
If you want to render all columns with a common slice, use the optional _ argument (we saw this above). The given slice will be applied to all columns as much as possible (or at least to those that match the shape):
In [ ]:
demo_ms.show(0, 10, _=(32,0)) # selects channel 32, correlation 0 from all 2D array columns. Doesn't apply to
# other types of columns
The .table attribute returns a casacore table object with which you can do all the normal casacore table operations:
In [ ]:
print type(demo_ms.table)
But if you want to quickly read data from a table, radiopadre provides some fancier methods. For example, subtables of the table are available as a .SUBTABLE_NAME attribute. This gives another table object, with all the functions above available:
In [ ]:
demo_ms.ANTENNA
...while columns of the table can be read via a .COLUMN_NAME() function (with optional start_row, nrows, step arguments):
In [ ]:
demo_ms.UVW()
So combining the above, here's how to compute the UVW in wavelengths of all baselines to antenna 1, and make a uv-coverage plot of that subset of baselines:
In [ ]:
import numpy as np
freqs = demo_ms.SPECTRAL_WINDOW.CHAN_FREQ(0, 1) # read frequencies for spw 0
print freqs
subset = demo_ms("ANTENNA1 == 1")
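# UVW is stored in metres; dividing by the wavelength c/freq (i.e. multiplying by freq/c) gives wavelengths
uvw_lambda = subset.UVW()[np.newaxis,:,:]*freqs[0,:,np.newaxis,np.newaxis]/3e+8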
print uvw_lambda.shape
import pylab
pylab.plot(uvw_lambda[:,:,0].flatten(), uvw_lambda[:,:,1].flatten(), '.')
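The broadcasting used in this kind of calculation can be checked on small synthetic arrays, without a measurement set. The sketch below assumes the same array layout as the cell above (frequencies of shape (1, nchan), UVW of shape (nrow, 3)); all numbers are made up. A coordinate in metres is converted to wavelengths by dividing by the wavelength c/freq.

```python
import numpy as np

C = 299792458.0                              # speed of light in m/s
freqs = np.array([[1.0e9, 1.5e9]])           # shape (1, nchan=2), made-up values in Hz
uvw_m = np.array([[100.0, 200.0, 0.0],
                  [-50.0, 25.0, 10.0]])      # shape (nrow=2, 3), made-up values in metres

# (1, nrow, 3) * (nchan, 1, 1) broadcasts to (nchan, nrow, 3)
uvw_lambda = uvw_m[np.newaxis, :, :] * freqs[0, :, np.newaxis, np.newaxis] / C
```

The leading axis of the result is the channel axis, so uvw_lambda[:, :, 0] and uvw_lambda[:, :, 1] hold u and v for every channel and row, ready to be flattened for a uv-coverage plot.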
In [ ]:
ls("*txt -rt") # give *txt files in reverse order of modification time
In [ ]:
logs = ls("*txt -rt") # of course this just returns a list-of-files object
logs
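For comparison, here is roughly what a "-rt" ordering means in terms of the standard library, assuming the switch follows the shell's ls -rt convention (oldest file first). The temporary directory, file names and timestamps are made up for the illustration; this is not how radiopadre implements it.

```python
import glob
import os
import tempfile

workdir = tempfile.mkdtemp()
# Create three files with explicit, increasing modification times
for age, name in enumerate(["c.txt", "a.txt", "b.txt"]):
    path = os.path.join(workdir, name)
    with open(path, "w") as f:
        f.write("demo\n")
    os.utime(path, (1000 + age, 1000 + age))   # (atime, mtime) in seconds

# Sort by modification time, oldest first, as the shell's "ls -rt" does
by_time = sorted(glob.glob(os.path.join(workdir, "*txt")), key=os.path.getmtime)
names = [os.path.basename(p) for p in by_time]
```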
You can also use the "R" switch for a recursive directory listing:
In [ ]:
ls("*png -R")
Or give a filename to get an object representing that one file:
In [ ]:
image = ls("1525170187-1_meqtrees-gjones_plots-chan.png")
image
On the same principle, give a subdirectory name to get a directory object:
In [ ]:
images_dir = ls("images")
images_dir
One thing to note is that ls() (i.e. with no patterns) doesn't necessarily list all files. The files included by default are governed by radiopadre settings. Below we'll see how to change those.
In [ ]:
settings # same as settings.show(), if it's the last expression in the cell
In [ ]:
# and the various sections will also render themselves
settings.files
In [ ]:
# changing settings is as easy as
settings.files.include = "*png"
# the new settings apply from that point onwards, so you probably want to do this at the top of a notebook
ls()
In [ ]:
# from now on, only "*png" files will be listed. Unless you override this by an explicit pattern to ls(),
# e.g. in this case "*" overrides settings.files.include:
ls("*")
In [ ]:
settings.fits
Here's how we can render FITS images with different settings, without changing the global settings. Whatever we set in with only applies in the body of the with statement. In this case it is particularly useful, as it will also apply to the JS9 displays by default:
In [ ]:
with settings.fits(vmin=1e-6, vmax=1, colormap='hot', scale='log'):
    ls("*fits").show() # this shows a list of FITS files
    ls("*fits").show_all() # and this calls show() on every FITS file
In [ ]:
# observe that the global settings haven't changed:
settings.fits
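The "temporary settings inside with" pattern is a general Python idiom, and one way to build it is with contextlib. The sketch below is a guess at the general mechanism only: the SettingsSection class, its override() method and the default values are all hypothetical, and radiopadre's own settings objects may work differently.

```python
from contextlib import contextmanager


class SettingsSection(object):
    """Hypothetical settings section; not radiopadre's actual class."""

    def __init__(self, **defaults):
        self.__dict__.update(defaults)

    @contextmanager
    def override(self, **kwargs):
        # Remember the current values of the settings about to be changed
        saved = {key: getattr(self, key) for key in kwargs}
        self.__dict__.update(kwargs)         # apply temporary values
        try:
            yield self
        finally:
            self.__dict__.update(saved)      # restore, even if the body raised


# Made-up defaults, for illustration only
fits_settings = SettingsSection(vmin=None, vmax=None, colormap="cubehelix")
with fits_settings.override(vmin=1e-6, vmax=1, colormap="hot"):
    inside = (fits_settings.vmin, fits_settings.colormap)
after = (fits_settings.vmin, fits_settings.colormap)
```

Restoring in a finally clause is what guarantees the global values come back even when rendering inside the with body fails.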