Radiopadre Tutorial

                                                            O. Smirnov <o.smirnov@ru.ac.za>, January 2018

Radiopadre is a framework, built on the Jupyter notebook, for browsing and visualizing data reduction products. It is particularly useful for visualizing data products on remote servers, where connection latencies and/or lack of software etc. limits the usual visualization options. It includes integration with the JS9 browser-based FITS viewer (with CARTA integration coming soon).

The general use case for Radiopadre is "here I am sitting with a slow ssh connection into a remote cluster node, my pipeline has produced 500 plots/logs/FITS images, how do I make sense of this mess?" More specifically, there are three (somewhat overlapping) scenarios that Radiopadre is designed for:

  • Just browsing: interactively exploring the aforementioned 500 files using a notebook.

  • Automated reporting: customized Radiopadre notebooks that automatically generate a report composed of a pipeline's outputs and intermediate products. Since your pipeline's output is (hopefully!) structured, i.e. in terms of filename conventions etc., you can write a notebook to exploit that structure and make a corresponding report automatically.

  • Sharing notebooks: fiddle with a notebook until everything is visualized just right, insert explanatory text in mardkdown cells in between, voila, you have an instant report you can share with colleagues.

Installing Radiopadre

Refer to README.md on the github repository: https://github.com/ratt-ru/radiopadre

Running this tutorial

Data files for this tutorial are available here: https://www.dropbox.com/sh/be4pc23rsavj67s/AAB2Ejv8cLsVT8wj60DiqS8Ya?dl=0

Download the tutorial and untar itsomewhere. Then run Radiopadre (locally or remotely, if you unpacked the tutorial on a remote node) in the resulting directory. A Jupyter console will pop up in your browser. Click on radiopadre-tutorial.ipynb to open it in a separate window, then click the "Run all" button on the toolbar (or use "Cell|Run all" in the menu, which is the same thing.) Wait for the notebook to run through and render, then carry on reading.

Every Radiopadre notebook starts with this


In [ ]:
from radiopadre import ls, settings
dd = ls()         # calls radiopadre.ls() to get a directory listing, assigns this to dd
dd                # standard notebook feature: the result of the last expression on the cell is rendered in HTML

In [ ]:
dd.show()
print "Calling .show() on an object renders it in HTML anyway, same as if it was the last statement in the cell"

Most objects knows how to show() themselves

So what can you see from the above? dd is a directory object than can render itself -- you get a directory listing. Clearly, Radiopadre can recognize certain types of files -- you can see an images/ subdirectory above, a measurement set, a couple of FITS files, some PNG images, etc. Clicking on a file will either download it or display it in a new tab (this works well for PNG or text files -- don't click on FITS files unless you mean to download a whole copy!) FITS files have a "JS9" button next to them that invokes the JS9 viewer either below the cell, or in a new browser tab. Try it!

Now let's get some objects from the directory listing and get them to render.


In [ ]:
images_subdir = dd[0]
demo_ms = dd[1]
fits_image = dd[2]
log_file = dd[-1]   # last file in directory... consistent with Python list syntax
images_subdir.show()
demo_ms.show(_=(32,0))  # _ selects channels/correlations... more detail later
fits_image.show()
log_file.show()
# be prepared for a lot of output below... scroll through it

Most things are list-like

What you see above is that different object types know how to show themselves intelligently. You also see that a directory object acts like a Python list -- dd[n] gets the n-th object from the directory. What about a slice?


In [ ]:
images_subdir[5:10]

Since a directory is a list of files, it makes sence that the Python slice syntax [5:10] returns an object that is also a list of files. There are other list-like objects in radiopadre. For example, an MS can be considered a list of rows. So...


In [ ]:
sub_ms = demo_ms[5:10]   # gives us a table containing rows 5 through 9 of the MS
sub_ms.show(_=(32,0))    # _ selects channels/correlations... more detail later

And a text file is really just a list of lines, so:


In [ ]:
log_file[-10:]   # extract last ten lines and show them

NB: FITS images and PNG images are not lists in any sense, so this syntax doesn't work on them. (In the future I'll consider supporting numpy-like slicing, e.g. [100:200,100:200], to transparently extract subsections of images, but for now this is not implemented.)

And list-like things can be searched with ()

Radiopadre's list-like objects (directories/file lists, text files, CASA tables) also support a "search" function, invoked by calling them like a function. This returns an object that is subset of the original object. Three examples:


In [ ]:
png_files = dd("*.png")   # on directories, () works like a shell pattern
png_files

In [ ]:
log_file("Gain plots")   # on text files, () works like grep

In [ ]:
demo_ms("ANTENNA1==1").show(_=(32,0))    # on tables, () does a TaQL query

Other useful things to do with directories/lists of files

If you have a list of image or FITS files, you can ask for thumbnails by calling .thumbs().


In [ ]:
png_files.thumbs()    # for PNG images, these are nice and clickable!

And calling .images on a directory returns a list of images. For which we can, of course, render thumbnails:


In [ ]:
images_subdir.images.thumbs()

Other such "list of files by type" attributes are .fits, .tables, and .dirs:


In [ ]:
dd.fits.show()
dd.tables.show()
dd.dirs.show()

In [ ]:
dd.fits.thumbs(vmin=-1e-4, vmax=0.01)   # and FITS files also know how to make themselves a thumbnail
# note that thumbs() takes optional arguments just like show()

And the show_all() method will call show() on every file object in the list. This is useful if you want to render a bunch of objects with the same parameters:


In [ ]:
# note the difference: dd.fits selects all files of type FITS, dd("*fits") selects all files matching "*fits".
# In our case this happens to be one and the same thing, but it doesn't have to be
dd("*fits").show_all(vmin=0, vmax=1e-2, colormap='hot')
# show_all() passes all its arguments to the show() method of each file.

Accessing a single file by name

The (pattern) operation applied to a directory always returns a filelist (possibly an empty one), even if the pattern is not relly a pattern and selects only one file:


In [ ]:
dirties = dd("j0839-5417_2-MFS-dirty.fits")
print "This is a list:", type(dirties), len(dirties)   # this is a list even though we only specified one file
print "This is a single file:", type(dirties[0]) # so we have to use [0] to get at the FITS file itself
# Note that the summary attribute returns a short summary of any radiopadre object (as text or HTML). 
# You can show() or print it
print "This is a summary of the list:",dirties.summary
print "And now in HTML:"
dirties.summary.show()
print "This is a summary of the file:",dirties[0].summary
print "And now in HTML:"
dirties[0].summary.show()

If you want to get at one specific file, using dd(name_or_pattern)[0] becomes a hassle. Filelists therefore support a direct [name_or_pattern] operation which always returns a single file object. If name_or_pattern matches multiple files, only the first one is returned (but radiopadre will show you a transient warning message).


In [ ]:
dirty_image = dd["*fits"]   # matches 2 files. if you re-execute this with Ctrl+Enter, you'll see a warning
print type(dirty_image)

In [ ]:
dirty_image = dd["*dirty*fits"]  # this will match just the one file
dirty_image.show()

Working with text files

By default, radiopadre renders the beginning and end of a text file. But you can also explicitly render just the head, or just the tail, or the full file.


In [ ]:
log_file

In [ ]:
log_file.head(5)   # same as log_file.show(head=5). Number is optional -- default is 10

In [ ]:
log_file.tail(5)  # same as log_file.show(tail=5)

In [ ]:
log_file.full()   # same as log_file.show(full=True). Use the scrollbar next to the cell output.

In [ ]:
log_file("Gain")     # same as log_file.grep("Gain") or log_file.show(grep="Gain")

In [ ]:
# and of course all objects are just "lists of lines", so the normal list slicing syntax works
log_file("Gain")[10:20].show()
log_file("Gain")[-1]

"Watching" text files

If you're still running a reduction and want to keep an eye on a log file that's being updated, use the .watch() method. This works exactly like .show() and takes the same arguments, but adds a "refresh" button at the top right corner of the cell, which re-executes the cell every time you click it.


In [ ]:
log_file.watch(head=0, tail=10)

Running shell commands

Use .sh("command") on a directory object to quickly run a shell command in that directory. The result is output as a list-of-lines, so all the usual display tricks work.


In [ ]:
dd.sh("df -h")

In [ ]:
dd.sh("df -h")("/boot")

Working with FITS files

As you saw above, FITS files can be rendered with show(), or viewed via the JS9 buttons. There's also an explicit .js9() method which invokes the viewer in a cell:


In [ ]:
dirty_image.summary.show()
dirty_image.js9()

With multiple FITS files, it's possible to load all of them into JS9, and use the "<" and ">" keys to switch between images. Use the "JS9 all" button to do this:


In [ ]:
dd("*fits")

There's a shortcut for doing this directly -- just call .js9() on a list of FITS files (note that "collective" functions such as .thumbs() and .js9() will only work on homogenuous filelists, i.e. lists of FITS files. Don't try calling them on a list contaning a mix of files -- it won't work!)


In [ ]:
# If you're wondering how to tell JS9 to start with specific scale settings, use the "with settings" trick 
# shown here. It will be explained below...
with settings.fits(vmin=-1e-4, vmax=0.01):  
    dd("*fits").js9()

The .header attribute of a FITS file object returns the FITS header, in the same kind of object (list-of-lines) as a text file. So all the tricks we did on text files above still apply:


In [ ]:
dirty_image.header

In [ ]:
dirty_image.header("CDELT*")

In [ ]:
dirty_image.header.full()

If you want to read in data from the FITS file, the .fitsobj attribute returns a PrimaryHDU object, just like astropy.io.fits.open(filename) would:


In [ ]:
dirty_image.fitsobj

Working with CASA tables

As you saw above, a CASA table object knows how to render itself as a table. Default is to render rows 0 to 100. With array columns, the default display becomes a little unwieldy:


In [ ]:
demo_ms

With optional arguments to .show(), you can render just a subset of rows (given as start_row, nrows), and a subset of columns, taking a slice through an array column. The below tells radiopadre to render the first 10 rows, taking the column TIME in its entirety, and taking a [32:34,:] slice through the DATA column.


In [ ]:
demo_ms.show(0,10,TIME=(),DATA=(slice(32,34),None))

If you want to render all columns with a common slice, use the optional _ argument (we saw this above). The given slice will be applied to all columns as much as possible (or at least to those that match the shape):


In [ ]:
demo_ms.show(0, 10, _=(32,0))  # selects channel 32, correlation 0 from all 2D array columns. Doesn't apply to
# other types of columns

The .table attribute returns a casacore table object with which you can do all the normal casacore table operations:


In [ ]:
print type(demo_ms.table)

But if you want to quickly read data from a table, radiopadre provides some fancier methods. For example, subtables of the table are available as a .SUBTABLE_NAME attribute. This gives another table object, with all the functions above available:


In [ ]:
demo_ms.ANTENNA

...while columns of the table can be read via a .COLUMN_NAME() function (with optional start_row, nrows, step arguments):


In [ ]:
demo_ms.UVW()

So combining the above, here's how to compute the UVW in wavelengths of all baselines to antenna 1, and make a uv-coverage plot of that subset of baselines:


In [ ]:
import numpy as np
freqs = demo_ms.SPECTRAL_WINDOW.CHAN_FREQ(0, 1)  # read frequencies for spw 0
print freqs
subset = demo_ms("ANTENNA1 == 1")
uvw_lambda = subset.UVW()[np.newaxis,:,:]*3e+8/freqs[0,:,np.newaxis,np.newaxis]
print uvw_lambda.shape
import pylab
pylab.plot(uvw_lambda[:,:,0].flatten(), uvw_lambda[:,:,1].flatten(), '.')

The ls() function

...is where it all begins. As you saw, ls() gives you the current directory. You can also use ls with filename patterns, and also specify a sort order:


In [ ]:
ls("*txt -rt")   # give *txt files in reverse order of modification time

In [ ]:
logs = ls("*txt -rt")   # of course this just returns a list-of-files object
logs

You can also use the "R" switch for a recursive directory listing:


In [ ]:
ls("*png -R")

Or give a filename to get an object representing that one file:


In [ ]:
image = ls("1525170187-1_meqtrees-gjones_plots-chan.png")
image

Om the same principle, give a subdirectory name to get a directory object:


In [ ]:
images_dir = ls("images")
images_dir

One thing to note is that ls() (i.e. with no patterns) doesn't necessarily list all files. The files included by default are governed by radiopadre settings. Below we'll see how to change those.

Using and changing settings

The settings object we imported above can be used to set various defaults of Radiopadre. Like most other objects, it knows how to render itself:


In [ ]:
settings   # same as settings.show(), if it's the last expression in the cell

In [ ]:
# and the various sections will also render themselves
settings.files

In [ ]:
# changing settings is as easy as
settings.files.include = "*png"
# the new settings apply from that point onwards, so you probably want to do this at the top of a notebook
ls()

In [ ]:
# from now on, only "*png" files will be listed. Unless you override this by an explicit pattern to ls(),
# e.g. in this case "*" overrides settings.files.include:
ls("*")

Using "with" to change settings temporarily

Python's with statement works with radiopadre settings to change settings temporarily. For example, the default FITS rendering settings look like this:


In [ ]:
settings.fits

Here's how we can render FITS images with different settings, without changing the global settings. Whatever we set in with only applies in the body of the with statement. In this case it is particularly useful, as it will also apply to the JS9 displays by default:


In [ ]:
with settings.fits(vmin=1e-6, vmax=1, colormap='hot', scale='log'):
    ls("*fits").show()        # this shows a list of FITS files
    ls("*fits").show_all()    # and this calls show() on every FITS file

In [ ]:
# observe that the global settings haven't changed:
settings.fits