Spectral Line Data Cubes in Astronomy - Part 1

In this notebook we will introduce spectral line data cubes in astronomy. They are a convenient way to store many spectra at points in the sky, much like having a spectrum at every pixel in a CCD. In this Part 1 we will keep things as much "pure python" as possible: no astronomical units, just "pixel" (or "voxel") space. In Part 2 we will repeat the process with a more astronomy-rich set of modules that you will have to install.

They are normally presented as a FITS file, with two sky coordinates (often Right Ascension and Declination) and one spectral coordinate (an observing frequency or wavelength; when there is a known spectral line, the spectral axis can also be expressed as a Doppler velocity relative to that line). For radio data, such as from ALMA and the VLA, we often use GHz or MHz. For optical data we often use the Angstrom (the visible range is roughly 4000 - 8000 Angstrom, or 400 - 800 nm).

Outline

Main Goal: To introduce the concepts of spectral line data cubes

  • Definition of an image cube
  • Data representation of an image cube
  • Introduction to rotating galaxy disks

In [ ]:
%matplotlib inline
import matplotlib.pyplot as plt

This first line of code is actually not real python code, but a magic IPython command, to make sure that the standard plotting commands will be displayed within the browser. You will see that happen below. The cube figure above is just a static PNG file.

As we progress through the data and explore it further, you will notice this decision-making process throughout the notebook.

Reading the data


In [ ]:
import numpy as np
from astropy.io import fits

The astropy package has an I/O module to simplify reading and writing a number of popular formats common in astronomy.


In [ ]:
hdu = fits.open('../data/ngc6503.cube.fits')
print(len(hdu))
print(hdu[0])
print(hdu[1])

A FITS file consists of a series of Header-Data Units (HDUs). Usually there is only one, representing the image, but this file has two. For now we will ignore the second, which is a special table and in this example happens to be empty anyway. Each HDU has a header and data. The data in this case is a numpy array, and represents the image (cube):


In [ ]:
h = hdu[0].header
d = hdu[0].data
print(d.shape, d.min(), d.max(), d.mean(), np.median(d), d.std())
print("Signal/Noise  (S/N):",d.max()/d.std())

From the shape (1,89,251,371) we can see this image is actually 4-dimensional, although the 4th dimension is a dummy. There are 371 pixels along X, 251 along Y, and 89 slices or spectral channels. It looks like the noise is around 0.00073 and the peak value 0.017, giving a signal to noise of a little over 23: quite strong. There is probably something interesting in this cube!

In python you can remove that dummy 4th axis, since we are not going to use it any further. Keeping it would otherwise just mean typing more indices.


In [ ]:
# what's a python dictionary?

In [ ]:
# printing out the header keywords (the header behaves much like a python dictionary)
print(list(h.keys()))

In [ ]:
d = d.squeeze()
print(d.shape)
# nz=d.shape[0]

In case you were wondering about that redundant 4th axis: in astronomy we sometimes observe more than one type of radiation. Since waves are polarized, we can have up to 4 so-called Stokes parameters, describing the waves as e.g. linearly or circularly polarized radiation. We will ignore that here, but they are sometimes stored in that 4th dimension; sometimes they are stored as separate cubes.

Plotting some basics


In [ ]:
z = 38
# z = 45         # the mystery blob
im = d[z,:,:]  #   im = d[z]     also works
#im = d[z, 50:110, 210:270]
#im = d[z, 100:150, 140:180]
plt.imshow(im,origin='lower')
plt.colorbar()
print(im.shape)

There are 89 channels (slices) in this cube, numbered 0 through 88 in the usual python sense. Pick a few other slices by changing the value in z= and notice that the first few and last few appear to be just noise and that the V-shaped signal changes shape through the channels. Perhaps you should not be surprised that these are referred to as butterfly diagrams.
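
Rather than editing z by hand each time, here is a quick sketch to eyeball several channels at once (the channel picks below are arbitrary):


In [ ]:
# a sketch: show a few arbitrary channels side by side
fig, axes = plt.subplots(1, 4, figsize=(16,4))
for ax, z in zip(axes, [10, 30, 45, 60]):
    ax.imshow(d[z], origin='lower')
    ax.set_title("channel %d" % z)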


In [ ]:
# look at a histogram of all the data (histogram needs a 1D array)
# d1 = d.flatten()          # flatten() makes a copy of the array!
d1 = d.ravel()              # ravel() doesn't make a new copy !!
print(d1.shape)
(n,b,p) = plt.hist(d1, bins=100)

Notice that the bulk of the data sits in a narrow peak at the left of the plot, and we already saw the maximum data point is 0.0169835.

Let us therefore plot the vertical axis logarithmically, so we can better see what is going on. If you can, use ds9 to try and bring up similar plots.


In [ ]:
(n,b,p) = plt.hist(d1,bins=100,log=True)

In [ ]:
# pick a slice and make a histogram and print the mean 
# and standard deviation of the signal in that slice
z=0
imz = d[z,:,:].flatten()
(n,b,p) = plt.hist(imz,bins=100)
print(imz.mean(), imz.std())

Exercise : observe, by picking some values of z, that the noise seems to vary a little bit from one end of the band to the other. Store the noise in channels 0 and 88 in the variables sigma0 and sigma88:
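
One possible solution to the exercise (a sketch; any of the noise estimates used above would do):


In [ ]:
# one possible solution: noise in the first and last channel
sigma0  = d[0].std()
sigma88 = d[88].std()
print(sigma0, sigma88)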

Question : we would like to know if this distribution is gaussian. How could we do this?
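
One simple approach (a sketch, reusing imz from the cell above): overlay a gaussian with the measured mean and RMS on a normalized histogram, and optionally apply a formal test such as scipy.stats.normaltest. With roughly 90,000 samples even tiny deviations will register as statistically significant, so the visual comparison is often the more informative one.


In [ ]:
# a sketch: compare the channel histogram with a gaussian of the same mean/std
import scipy.stats
(n,b,p) = plt.hist(imz, bins=100, density=True)
x = 0.5*(b[1:]+b[:-1])                               # bin centers
plt.plot(x, scipy.stats.norm.pdf(x, imz.mean(), imz.std()), 'r-')
print(scipy.stats.normaltest(imz))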

Now that we have computed the RMS in a channel, we might as well compute them for all channels!

We also compare this channel-based RMS with the single cube-wide RMS that we determined earlier.


In [ ]:
nchan = d.shape[0]            #  a.k.a.  nz
channel = np.arange(nchan)    # channel numbers 0..nz-1 (88)
rms = np.zeros(nchan)         # placeholder for the RMS
peak = np.zeros(nchan)        # placeholder for the PEAK value 

cuberms = np.zeros(nchan) + d.std()
for z in range(nchan):
    imz = d[z,:,:].flatten()
    rms[z] = imz.std()
    peak[z] = imz.max()
plt.plot(channel,rms,label='chan_rms')
#plt.plot(channel,peak,label='peak')
plt.plot(channel,cuberms,label='cube_rms',color='red')
plt.legend(loc='best')
plt.xlabel("Channel")
plt.ylabel("RMS");

Question: can you think of a better way to compute the RMS as a function of channel (the blue line), one that does not depend so much on where there is signal?


In [ ]:
# helper function for slice statistics
import numpy.ma as ma
def robust(d, method=0, ns=4.0, rf=1.5):
    """ estimate the RMS noise in d, more or less robustly against signal """
    if method==0:
        # plain standard deviation (biased high where there is signal)
        return d.std()
    elif method==1:
        # clip everything more than ns sigma from the mean, then re-measure
        m = d.mean()
        s = d.std()
        d1 = ma.masked_outside(d,m-ns*s,m+ns*s)
        return d1.std()
    elif method==2:
        # assume the mean is close enough to zero and there is no absorption,
        # so the minimum defines a symmetric clipping interval
        m = d.min()
        d1 = ma.masked_outside(d,m,-m)
        return d1.std()
    elif method==3:
        # clip outside the inter-quartile range, widened by a factor rf
        # (note: d.sort() sorts the array in place)
        n = len(d)
        d.sort()
        q1 = d[n//4]
        q3 = d[(3*n)//4]
        D = q3-q1
        d1 = ma.masked_outside(d,q1-rf*D,q3+rf*D)
        return d1.std()
    else:
        return d.std()

In [ ]:
nchan = d.shape[0]
channel = np.arange(nchan)
rms0 = np.zeros(nchan)
rms1 = np.zeros(nchan)
rms2 = np.zeros(nchan)
rms3 = np.zeros(nchan)
rms4 = np.zeros(nchan)
rms5 = np.zeros(nchan)
peak = np.zeros(nchan)
cuberms = np.zeros(nchan) + d.std()
for z in range(nchan):
    imz = d[z,:,:].flatten()
    imz4 = d[z,0:80,280:355].flatten()
    imz5 = d[z,170:250,0:120].flatten()
    rms0[z] = robust(imz,0)
    rms1[z] = robust(imz,1,ns=4.0)
    rms2[z] = robust(imz,2)
    rms3[z] = robust(imz,3,rf=1.5)
    rms4[z] = robust(imz4,0)
    rms5[z] = robust(imz5,0)
    peak[z] = imz.max()
plt.plot(channel,rms0,label='chan_rms0')
plt.plot(channel,rms1,label='chan_rms1')
plt.plot(channel,rms2,label='chan_rms2')
plt.plot(channel,rms3,label='chan_rms3')
plt.plot(channel,rms4,label='chan_rms_lr')
plt.plot(channel,rms5,label='chan_rms_tl')
# plt.plot(channel,peak,label='peak')
plt.plot(channel,cuberms,label='cube_rms',color='black')
plt.legend(loc='best',fontsize='small')
plt.xlabel("Channel")
plt.ylabel("RMS")
plt.savefig("n6503_rms.png")

Next we are interested in the Signal/Noise per channel where there is no signal. This is clear in the first few and last few channels. Recall that in the absence of real signal the peak will always be a few times sigma, purely based on the error-function behavior of the tail of a gaussian noise distribution. In our case something like $4\sigma$; for small maps more like $3\sigma$, for really big maps or cubes $5\sigma$.


In [ ]:
rms0 = rms[0:15].mean()
rms1 = rms[88-13:88].mean()
cuberms = np.zeros(nchan) + 0.5*(rms0+rms1)
sn0 = peak/rms0
sn1 = peak/rms1
plt.plot(channel,sn0,label='S/N(low)')
plt.plot(channel,sn1,label='S/N(high)')
plt.plot(channel[0:15],np.zeros(15)+1,color='black',label='edge')
plt.plot(channel[88-13:88],np.zeros(13)+1,color='black')
plt.legend(loc='best')
print(rms0,rms1)

What is sloppy about the above code fragment?
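
Several things: it silently reuses the names rms0 and rms1 (which were arrays a few cells earlier) for scalars, it hardcodes the number of channels and the edge widths (15 and 13), and a slice like rms[88-13:88] excludes the last channel. A cleaner sketch:


In [ ]:
# a cleaner sketch: derive the edge ranges from nchan instead of hardcoding them
nedge = 15
rms_lo = rms[:nedge].mean()            # noise in the first nedge channels
rms_hi = rms[nchan-nedge:].mean()      # noise in the last nedge channels (inclusive)
print(rms_lo, rms_hi)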


In [ ]:
s1=peak[0:15]/rms[0:15]
s2=peak[75:88]/rms[75:88]
print("First few channels:",s1.mean(),s1.std())
print("Last  few channels:",s2.mean(),s2.std())

The gaussian noise probability distribution is given by $$ P(x) = { 1 \over {\sigma \sqrt{2\pi}}} \, e^{- { x^2 \over {2 \sigma^2}}} $$ where the mean is 0 and the RMS is $\sigma$. This function is normalized: integrating it over all x gives 1.
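
A quick numerical sanity check of that normalization (a sketch):


In [ ]:
# a sketch: numerically verify that P(x) integrates to 1
sig = 1.0
x = np.linspace(-10, 10, 1001)
P = np.exp(-x**2/(2*sig**2)) / (sig*np.sqrt(2*np.pi))
print(np.trapz(P, x))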

Let's do a simulation to see if we can understand the S/N in this plot. We will need the error function to compute the chance of landing in the tail of the gaussian. The error function is defined as: $$ \mathrm{erf}(x) = { {2}\over{\sqrt{\pi}}} \int_0^x e^{-t^2} dt $$


In [ ]:
import math
def pnoise(n):
    """ chance measuring noise of n sigma"""
    return 0.5*math.erfc(n/math.sqrt(2.0))

nsample = 10000
g = np.random.normal(size=nsample)
sn = g.max()/g.std()
print("S/N: ",sn)
print("1/P(S/N)=",1/pnoise(sn))

In [ ]:
# 1/chance for a +1,2,3 sigma detection
print(1/pnoise(1.0))
print(1/pnoise(2.0))
print(1/pnoise(3.0))
print(1/pnoise(4.0))
print(1/pnoise(5.0))
nxy = d.shape[1]*d.shape[2]
print("Number of pixels in a map:",nxy)

In [ ]:
peakpos = (175,125)     # some strong point in the disk of the galaxy
peakpos = (231,80)     # the mystery blob?
#peakpos = (310,50)      # no signal
spectrum = d[:,peakpos[1],peakpos[0]]
sns = spectrum.max()/rms[0:15].mean()
zero = spectrum * 0.0
plt.plot(channel,spectrum,'o-',markersize=2)
plt.plot(channel,zero)
plt.plot(channel,cuberms,'r--',label=r'1$\sigma$')
plt.plot(channel,-cuberms,'r--')
plt.title("Spectrum at position %s  S/N=%.3g" % (str(peakpos),sns))
plt.legend();

Is the noise correlated? Hanning smoothing is often used to increase the S/N. Test this by taking the differences between neighboring channels and computing the RMS of this difference "signal". If the noise is normal and uncorrelated, the ratio of this RMS to the original RMS should be $\sqrt{2}$. Pick a point where there is no obvious signal, such as the (310,50) position; a sketch follows.
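
A sketch of this difference test at the (310,50) position (remember numpy indexes the cube as [z,y,x], and math was imported earlier):


In [ ]:
# a sketch: difference test on a (presumably) signal-free spectrum
sp0 = d[:,50,310]              # position (x,y) = (310,50), indexed as [z,y,x]
dsp = np.diff(sp0)
print(dsp.std()/sp0.std(), " should be close to sqrt(2) =", math.sqrt(2))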


In [ ]:
cdelt3 = h['CDELT3']
crval3 = h['CRVAL3']
crpix3 = h['CRPIX3']
restfreq=h['RESTFREQ']
freq = (channel-crpix3+1)*cdelt3 + crval3      # at the reference pixel we get the reference value
c = 299792.458
channelv = (1.0-freq/restfreq) * c          # convert to doppler velocity in km/s
print("min/max/dv:",channelv[0],channelv[nchan-1],channelv[0]-channelv[1])
plt.plot(channelv,spectrum,'o-',markersize=2)
plt.plot(channelv,zero)
plt.plot(channelv,cuberms,'r--',label=r'1$\sigma$')
plt.plot(channelv,-cuberms,'r--')
plt.title("Spectrum at positon %s  S/N=%.3g" % (str(peakpos),sns))
plt.legend()
plt.xlabel("velocity (km/s)");

In [ ]:
# saving a descriptive spectrum using pickle
try:
    import cPickle as pickle     # python 2
except:
    import pickle                # python 3
   
# construct a descriptive spectrum 
sp = {}
sp['z'] = channelv
sp['i'] = spectrum
sp['zunit'] = 'km/s'
sp['iunit'] = h['BUNIT'] 
sp['xpos']  = peakpos[0]
sp['ypos']  = peakpos[1]
    
# write it
pfile = "n6503-sp.p" 
pickle.dump(sp,open(pfile,"wb"))
print("Wrote spectrum",pfile)

In [ ]:
dspectrum = spectrum[1:] - spectrum[:-1]
# dspectrum = np.diff(spectrum)     # this also works (but look up docs!)
rms1 = dspectrum.std()
rms0 = spectrum.std()
print(rms1,"/",rms0,"=",rms1/rms0)

The ratio of the noise you see here should be $\sqrt{2}$, since the difference of two independent gaussian variables has twice the variance of either one. Let's check, for a typical normal distribution, how close we get to $\sqrt{2}$:


In [ ]:
%%time 
nsample = 100000
g = np.random.normal(10.0,5.0,nsample)
delta = np.diff(g)
gh=plt.hist([g,delta],32)
print(g.std(),delta.std(),delta.std()/g.std())

Smoothing a cube to enhance the signal to noise


In [ ]:
import scipy.signal
import scipy.ndimage as filters     # gaussian_filter lives in scipy.ndimage in modern scipy

In [ ]:
z = 0
print("old rms",rms[z])
sigma = 2.0
ds1 = filters.gaussian_filter(d[z],sigma)      # ds1 = smoothed slice
print("new:",ds1.shape, ds1.mean(), ds1.std())
plt.imshow(ds1,origin='lower')
plt.colorbar()

Notice that the noise is indeed lower than your earlier value of sigma0. We only smoothed one single slice, but we actually need to smooth the whole cube: each slice with this sigma, and optionally also a little bit in the spectral dimension.


In [ ]:
ds = filters.gaussian_filter(d,[1.0,sigma,sigma])  # ds is a smoothed cube 
plt.imshow(ds[z],origin='lower')
plt.colorbar()
print(ds[z].std())
print(ds.max(),ds.max()/ds1.std())

Notice that, although the peak value was lowered a bit due to the smoothing, the signal to noise has increased from the original cube. So, the signal should stand out a lot better.

Exercise : Observe a subtle difference in the last two plots. Can you see what happened here?

Masking


In [ ]:
import numpy.ma as ma

In [ ]:
#  sigma0 is the noise in the original cube
sigma0 = rms0
nsigma = 5.0
dm = ma.masked_inside(d,-nsigma*sigma0,nsigma*sigma0)
print(dm.count())

In [ ]:
mom0 = dm.sum(axis=0)
plt.imshow(mom0,origin='lower')
plt.colorbar()
#
(ypeak,xpeak) = np.unravel_index(mom0.argmax(),mom0.shape)
print("PEAK at location:",xpeak,ypeak,mom0.argmax())

In [ ]:
spectrum2 = ds[:,ypeak,xpeak]
plt.plot(channel,spectrum2)
plt.plot(channel,zero);

In [ ]:
mom0s = ds.sum(axis=0)
plt.imshow(mom0s,origin='lower')
plt.colorbar();


Velocity fields

The mean velocity is defined as the first moment

$$ \langle V \rangle = {\Sigma{(v \, I)} \over \Sigma{(I)} } $$

In [ ]:
nz = d.shape[0]
vchan = np.arange(nz).reshape(nz,1,1)
vsum = vchan * d
vmean = vsum.sum(axis=0)/d.sum(axis=0)
print("MINMAX",vmean.min(),vmean.max())
plt.imshow(vmean,origin='lower',vmin=0,vmax=88)
#plt.imshow(vmean,origin='lower')
plt.colorbar();


Although we can recognize an area of coherent motion (the red- and blue-shifted sides of the galaxy), there is a lot of noise in this image. Looking at the math, we are dividing two numbers, both of which can be pure noise, so the outcome can be anything. At the very least the value should lie between 0 and 88, so we could mask on that and see how it looks:
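
A sketch of that masking, blanking everything outside the valid channel range:


In [ ]:
# a sketch: mask velocity values outside the physically valid 0..nz-1 range
vmean_m = ma.masked_outside(vmean, 0, nz-1)
plt.imshow(vmean_m, origin='lower', vmin=0, vmax=88)
plt.colorbar();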

Next, let us see how the smoothed cube looks.


In [ ]:
nz = ds.shape[0]
vchan = np.arange(nz).reshape(nz,1,1)
vsum = vchan * ds
vmean = vsum.sum(axis=0)/ds.sum(axis=0)
print(vmean.shape,vmean.min(),vmean.max())
plt.imshow(vmean,origin='lower',vmin=0,vmax=88)
plt.colorbar();

Although more coherent, there are still bogus values outside the image of the galaxy. So we are looking for a hybrid of the two methods: in the smoothed cube the signal to noise is much better defined, so we will find the areas in the smoothed cube where the signal to noise is high enough, and use those in the original high resolution cube.


In [ ]:
# this is all messy , we need a better solution, a hybrid of the two:
noise = ds[0:5].flatten()
(n,b,p) = plt.hist(noise,bins=100)
print(noise.mean(), noise.std())

In [ ]:
sigma0 = noise.std()
nsigma = 5.0
cutoff = sigma0*nsigma
dm = ma.masked_inside(ds,-cutoff,cutoff)    # assumes mean is close to 0
print(cutoff,dm.count())

In [ ]:
dm2=ma.masked_where(ma.getmask(dm),d)

In [ ]:
%%time 
vsum = vchan * dm2
%time vmean = vsum.sum(axis=0)/dm2.sum(axis=0)
print(vmean.min(),vmean.max())
%time plt.imshow(vmean,origin='lower',vmin=0,vmax=88)
plt.colorbar()
print(vmean.shape)

And voila, now this looks a lot better.

Saving your output

This result is now stored in the vmean numpy array. But how do we make this information persistent? We could use the python pickle technique, but this would not be very user friendly for those who do not use python.

The answer is again the FITS format. Whereas the fits.open() function retrieves a Header and Data (or a series of them), we now need to construct a Header to go with this Data and write it using fits.writeto().


In [ ]:
# the old hdu[0] is still available, but points to a 3D cube, 
# so lets just try and make it 2D
hv = h.copy()
hv['NAXIS'] = 2
#   cannot write yet: complains about illegal axes
hv.remove('NAXIS3')
hv.remove('NAXIS4')
print(type(vmean))
print(vmean.shape)
print(h['BITPIX'])
#   cannot write yet: complains about masking
vmean0 = ma.filled(vmean,0.0)
#   finally write it successfully
fits.writeto('n6503-vmean.fits',vmean0,hv,overwrite=True)

What size, in bytes, would you roughly expect for this file? What size did you find? Use ds9 to inspect this velocity field.
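
A back-of-the-envelope estimate (a sketch; FITS files are written in blocks of 2880 bytes, and the header we copied from the cube may occupy more than one block):


In [ ]:
# a rough size estimate: data bytes padded up to 2880-byte FITS blocks
import os
ny, nx = vmean0.shape
data_bytes = ny * nx * vmean0.dtype.itemsize   # bytes per pixel from the numpy dtype
nblocks = 1 + (data_bytes + 2879)//2880        # >= 1 header block + padded data blocks
print("expect at least", nblocks*2880, "bytes")
print("actual:", os.path.getsize('n6503-vmean.fits'), "bytes")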

Finally, a minimal example of writing a small FITS file from scratch:


In [ ]:
num = np.arange(50.0).reshape(5,10)
hdu = fits.PrimaryHDU(num)
hdu.writeto('num100.fits',overwrite=True)
plt.matshow(num,origin='lower')
plt.colorbar();

Papers

The data cube we have used in this notebook was provided by Eric Greisen (NRAO); his 2009 paper discusses the results in detail: http://adsabs.harvard.edu/abs/2009AJ....137.4718G

Data are also available on Greisen's ftp

Epilogue

Some of the pure python constructs that we discussed here, notably masking and smoothing, become cumbersome. In the advanced case we will use some community-developed code that makes working with such spectral line image cubes a lot easier. Things that come to mind are:

  • WCS (astronomical coordinate systems)
  • units (the flux unit in radio astronomy is Jy/beam)
  • arbitrary slices through the cube