Spectral Line Data Cubes in Astronomy - Part 1

In this notebook we will introduce spectral line data cubes in astronomy. They are a convenient way to store many spectra at points in the sky. Much like having a spectrum at every pixel in a CCD. In this Part 1 we will keep it as much "pure python", and not use astronomical units and just work in "pixel" or "voxel" space. In Part2 we will repeat the process with a more astronomy rich set of modules that you will have to install.

They normally are presented as a FITS file, with two sky coordinates (often Right Ascension and Declination) and one spectral coordinate (either an observing frequency or wavelength, and when there is a known spectral line, you can reference using this line with a velocity using the doppler effect). For radio data, such as ALMA and the VLA, we often use GHz or MHz. For optical data we often use the Angstrom (the visible range is around 4000 - 8000 Angstrom).

Outline

Main Goal: To introduce the concepts of spectral line data cubes

Definition of image cube
Data representation image cube
Introduction to galaxy rotating disks

A reminder on ipython notebooks: each cell needs to be executed in succession by using "SHIFT ENTER", not just "ENTER", in the cell.



In [1]:

    
%matplotlib inline









    



/astromake/opt/python/anaconda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')

This first line of code is actually not real python code, but a magic ipython command, to make sure that the standard plotting commands are going to be displayed within the browser. You will see that happen below. The cube figure about is just a static PNG file.

As we will progress learning about the data and how to explore it further, you will notice this decision making process throughout this notebook.

Reading the data



In [2]:

    
import numpy as np
from astropy.io import fits

The astropy package has an I/O package to simplify reading and writing a number of popular formats common in astronomy.



In [3]:

    
hdu = fits.open('../data/ngc6503.cube.fits')
print len(hdu)

A FITS file consists of a series of Header-Data-Units (HDU). Usually there is only one, representing the image. But this file has two. For now, we're going to ignore the second, which is a special table and in this example happens to be empty anyways. Each HDU has a header, and data. The data in this case is a numpy array, and represents the image (cube):



In [4]:

    
h = hdu[0].header
d = hdu[0].data
print d.shape, d.min(), d.max(), d.mean(), np.median(d), d.std()
print "Signal/Noise  (S/N):",d.max()/d.std()









    



(1, 89, 251, 371) -0.00315721 0.0169835 4.73436e-05 1.33753e-05 0.000740354
Signal/Noise  (S/N): 22.9397

From the shape (1,89,251,371) we can see this image is actually 4 dimensional, although the 4th dimension is dummy. There are 371 pixels along X, 251 along Y, and 89 slices or spectral channels. It looks like the noise is around 0.00073 and a peak value 0.017, thus a signal to noise of a little over 23, so quite strong.

In python you can remove that dummy 4th axis, since we are not going to use it any further.



In [5]:

    
d = d.squeeze()
print d.shape









    



(89, 251, 371)

In case you were wondering about that 4th redundant axis. In astronomy we sometimes observe more than one type of radiation. Since waves are polarized, we can have up to 4 so called Stokes parameters, describing the waves as e.g. linear or circular polarized radiation. We will ignore that here, but they are sometimes stored in that 4th dimension. Sometimes they are stored as separate cubes.

Plotting some basics



In [6]:

    
import matplotlib.pyplot as plt



In [7]:

    
z = 0
im = d[z,:,:]                              #   im = d[z]     also works
plt.imshow(im,origin=['Lower'])
plt.colorbar()









    Out[7]:





<matplotlib.colorbar.Colorbar at 0x7f1bc1693390>

There are 89 channels (slices) in this cube, numbered 0 through 88 in the usual python sense. Pick a few other slices by changing the value in z= and notice that the first few and last few appear to be just noise and that the V-shaped signal changes shape through the channels. Perhaps you should not be surprised that these are referred to as butterfly diagrams.



In [8]:

    
# look at a histogram of all the data (histogram needs a 1D array)
d1 = d.ravel()                 # ravel() doesn't make a new copy of the array, saving memory
print d1.shape
(n,b,p) = plt.hist(d1, bins=100)

Notice that the histogram is on the left in the plot, and we already saw the maximum data point is 0.0169835.

So let us plot the vertical axis logarithmically, so we can better see what is going on.



In [9]:

    
(n,b,p) = plt.hist(d1,bins=100,log=True)



In [10]:

    
# pick a slice and make a histogram and print the mean and standard deviation of the signal in that slice
z=0
imz = d[z,:,:].flatten()
(n,b,p) = plt.hist(imz,bins=100)
print imz.mean(), imz.std()









    



-1.2134e-06 0.000558292

Exercise : observe by picking some values of z that the noise seems to vary a little bit from one end of the band to the other. Store the noise in channel 0 and 88 in variables sigma0 and sigma88:



In [11]:

    
xpeak = 175
ypeak = 250
channel = np.arange(d.shape[0])
spectrum = d[:,ypeak,xpeak]
zero = spectrum * 0.0
plt.plot(channel,spectrum)
plt.plot(channel,zero)









    Out[11]:





[<matplotlib.lines.Line2D at 0x7f1bbc380810>]



In [12]:

    
sigma0 = 0.00056
sigma88 = 0.00059



In [13]:

    
import scipy.signal
import scipy.ndimage.filters as filters

Smoothing a cube to enhance the signal to noise



In [14]:

    
z = 35
sigma = 2.0
ds1 = filters.gaussian_filter(d[z],sigma)                    # ds1 is a single smoothed slice
print ds1.shape, ds1.mean(), ds1.std()
plt.imshow(ds1,origin=['Lower'])
plt.colorbar()









    



(251, 371) 6.14612e-05 0.000519966






    Out[14]:





<matplotlib.colorbar.Colorbar at 0x7f1ba99973d0>

Notice that the noise is indeed lower than your earlier value of sigma0. We only smoothed one single slice, but we actually need to smooth the whole cube. Each slice with sigma, but we can optionally also smooth in the spectral dimension a little bit.



In [15]:

    
ds = filters.gaussian_filter(d,[1.0,sigma,sigma])              # ds is a smoothed cube
plt.imshow(ds[z],origin=['Lower'])
plt.colorbar()
print ds.max(),ds.max()/ds1.std()









    



0.010926 21.0128

Notice that, although the peak value was lowered a bit due to the smoothing, the signal to noise has increased from the original cube. So, the signal should stand out a lot better.

Exercise : Observe a subtle difference in the last two plots. Can you see what happened here?

Masking



In [16]:

    
import numpy.ma as ma



In [17]:

    
#  sigma0 is the noise in the original cube
nsigma = 4.0
dm = ma.masked_inside(d,-nsigma*sigma0,nsigma*sigma0)
print dm.count()



In [18]:

    
mom0 = dm.sum(axis=0)
plt.imshow(mom0,origin=['Lower'])
plt.colorbar()
#
(ypeak,xpeak) = np.unravel_index(mom0.argmax(),mom0.shape)
print "PEAK at location:",xpeak,ypeak,mom0.argmax()









    



PEAK at location: 148 113 42071



In [19]:

    
spectrum2 = ds[:,ypeak,xpeak]
plt.plot(channel,spectrum2)
plt.plot(channel,zero)









    Out[19]:





[<matplotlib.lines.Line2D at 0x7f1ba969fe50>]



In [20]:

    
mom0s = ds.sum(axis=0)
plt.imshow(mom0s,origin=['Lower'])
plt.colorbar()









    Out[20]:





<matplotlib.colorbar.Colorbar at 0x7f1ba95061d0>



In [ ]:

Velocity fields

The mean velocity is defined a the first moment

$$ <V> = {\sum{(v.I)} \over \sum{(I)} } $$



In [21]:

    
nz = d.shape[0]
vchan = np.arange(nz).reshape(nz,1,1)
vsum = vchan * d
vmean = vsum.sum(axis=0)/d.sum(axis=0)
print "MINMAX",vmean.min(),vmean.max()
plt.imshow(vmean,origin=['Lower'],vmin=0,vmax=88)
plt.colorbar()









    



MINMAX -925832.096555 302361.32862






    Out[21]:





<matplotlib.colorbar.Colorbar at 0x7f1ba96ab490>

Although we can recognize an area of coherent motions (the red and blue shifted sides of the galaxy), there is a lot of noise in this image. Looking at the math, we are dividing two numbers, both of which can be noise, so the outcome can be "anything". If anything, it should be a value between 0 and 88, so we could mask for that and see how that looks.

Let us first try to see how the smoothed cube looked.



In [22]:

    
nz = ds.shape[0]
vchan = np.arange(nz).reshape(nz,1,1)
vsum = vchan * ds
vmean = vsum.sum(axis=0)/ds.sum(axis=0)
print vmean.shape,vmean.min(),vmean.max()
plt.imshow(vmean,origin=['Lower'],vmin=0,vmax=89)
plt.colorbar()









    



(251, 371) -379739.299007 1023678.63765






    Out[22]:





<matplotlib.colorbar.Colorbar at 0x7f1ba989e710>

Although more coherent, there are still bogus values outside the image of the galaxy. So we are looking for a hybrid of the two methods. In the smooth cube we saw the signal to noise is a lot better defined, so we will define areas in the cube where the signal to noise is high enough and use those in the original high resolution cube.



In [23]:

    
# this is all messy , we need a better solution, a hybrid of the two:
noise = ds[0:5].flatten()
(n,b,p) = plt.hist(noise,bins=100)
print noise.mean(), noise.std()









    



2.96841e-07 0.000105553



In [24]:

    
sigma0 = noise.std()
nsigma = 5.0
cutoff = sigma0*nsigma
dm = ma.masked_inside(ds,-cutoff,cutoff)
print cutoff,dm.count()









    



0.000527763331775 186888



In [25]:

    
dm2=ma.masked_where(ma.getmask(dm),d)



In [26]:

    
vsum = vchan * dm2
vmean = vsum.sum(axis=0)/dm2.sum(axis=0)
print vmean.min(),vmean.max()
plt.imshow(vmean,origin=['Lower'],vmin=0,vmax=89)
plt.colorbar()
# some guess of where the major axis of the galaxy is
s = [50,55,300,205]
plt.plot([s[0],s[2]],[s[1],s[3]], 'k-', lw=2)









    



-9.47214823274 91.7744508962






    Out[26]:





[<matplotlib.lines.Line2D at 0x7f1ba910c650>]

And voila, now this looks a lot better, although only velocies between 0 and 88 are possible. Any other values are no doubt due to a noisy division of two numbers. Experiment with a different value of nsigma here.

Rotation Curves

The simplest way perhaps is to measure the velocities along the major axis. Here's a crummy way to extract it along the whole major axis, just to see how it looks.



In [27]:

    
import scipy.ndimage

s = [20,25,300,205]
x0, y0 = s[0],s[1]
x1, y1 = s[2],s[3]
num = 200
x, y = np.linspace(x0, x1, num), np.linspace(y0, y1, num)
z = scipy.ndimage.map_coordinates(np.transpose(vmean), np.vstack((x,y)))
plt.imshow(vmean,origin=['Lower'],vmin=0,vmax=89)
plt.plot([x0, x1], [y0, y1], 'ro-')









    Out[27]:





[<matplotlib.lines.Line2D at 0x7f1ba906ac50>]



In [28]:

    
plt.plot(z)









    Out[28]:





[<matplotlib.lines.Line2D at 0x7f1ba8ef6c10>]

Receding and Approaching rotation curve

And here the version of a rotation curve along the receding and approaching sides. This is generally a better idea, not only to get an idea if the two sides are compatible.



In [29]:

    
import scipy.ndimage

center = [163.5,123.2]
length = 150.0
pa     = 120.0
cosp = math.cos(pa*math.pi/180.0) 
sinp = math.sin(pa*math.pi/180.0) 
x0,y0 = center
#
x1 = x0 - length * sinp         # redshifted side (receding)
y1 = y0 + length * cosp
x2 = x0 + length * sinp         # blue shifted side (approaching)
y2 = y0 - length * cosp
num = 200
x1s, y1s = np.linspace(x0, x1, num), np.linspace(y0, y1, num)
x2s, y2s = np.linspace(x0, x2, num), np.linspace(y0, y2, num)
z1 = scipy.ndimage.map_coordinates(np.transpose(vmean), np.vstack((x1s,y1s)))
z2 = scipy.ndimage.map_coordinates(np.transpose(vmean), np.vstack((x2s,y2s)))
plt.imshow(vmean,origin=['Lower'],vmin=0,vmax=89)
plt.plot([x0, x1], [y0, y1], 'ro-')
plt.plot([x0, x2], [y0, y2], 'bo-')









    Out[29]:





[<matplotlib.lines.Line2D at 0x7f1bbc0a4950>]



In [30]:

    
plt.plot(z1)
plt.plot(z2)









    Out[30]:





[<matplotlib.lines.Line2D at 0x7f1ba8f0d810>]



In [31]:

    
v1 = z1 - z1[0]
v2 = z2[0]-z2
print("Offsets: v1,v2=",z1[0],z2[0])
plt.plot(v1)
plt.plot(v2)









    



('Offsets: v1,v2=', 42.222278342024374, 42.222278342024374)






    Out[31]:





[<matplotlib.lines.Line2D at 0x7f1ba98c4790>]

Compare these two rotation curves with Figure 12 in the Greisen et al. (2009) paper:

Epilogue

Some of the pure python constructs that we discussed here, notably masking and smoothing, become cumbersome. Also the absence of astronomical units (instead of pixels) can be cumbersome. In the advanced case we will use some community developed code that makes working with such spectral line image cubes a lot easier. For the particular problem of rotation curves there are also non-python solutions available, see the ASCL based list below.

ASCL

The is an online registry, ASCL, where you can search for astrophysics codes. For example, to analyse velocity fields to extract rotation curves there are several:

NEMO look for the rotcur program
GIPSY look for the rotcur program
AIPS look for the CUBIT and GAL programs
TiRiFiC uses GIPSY, 3D fit
FAT TiRiFiC was deemed hard to use, so this is a frontend to make it automated
3D-Barolo 3D fit
velfit fitting a velocity field
diskfit fitting a velocity field as well as mass (light) distribution



In [ ]:

ipython widgets are interactive widgets that you can add to your script and with a helper function dynamically control output, e.g. a figure.



In [ ]:

    
import networkx as nx
from ipywidgets import interact



In [ ]:



In [ ]:



In [ ]: