Spectral Line Data Cubes in Astronomy - Part 2 (unfinished)

We are doing a complete re-analysis of the same NGC 6503 case, but now with more tools that the community has developed. For this you will need to modify your python environment.

See also http://adsabs.harvard.edu/abs/2015ASPC..499..363G for the paper describing this project.


In [ ]:
%matplotlib inline

In [ ]:
# python 2-3 compatibility
from __future__ import print_function

Reading the data

Two new community modules are introduced here: spectral_cube and radio_beam, both from the https://github.com/radio-astro-tools project.

Manual

git clone https://github.com/radio-astro-tools/spectral_cube
cd spectral_cube
python setup.py install

git clone https://github.com/radio-astro-tools/radio_beam
cd radio_beam
python setup.py install

Automated

1) For spectral-cube you can simply use the pip command from the dos/unix shell to install it:

pip install spectral_cube

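Note the underscore and dash convention: the package is listed as spectral-cube (dash) on PyPI, while the module you import in python is spectral_cube (underscore). pip treats the two spellings as the same name.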

2) For radio_beam the developers have not submitted this to the python community (https://pypi.python.org/pypi), and thus you will need to install this manually. Go to the developers page on github: https://github.com/radio-astro-tools/radio_beam and near the top right corner you will see a button Download ZIP. Extract this somewhere (often in your Downloads directory) and use your python to install this. For example, again from your dos/unix shell you would type:

 cd ~/Downloads/radio_beam-master
 python setup.py install
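
After either install route, it can help to check that python actually finds both modules before importing them. The following is a minimal sketch using only the standard library; the helper name is_installed is made up for this illustration and is not part of either package.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None

# Module names use the underscore spelling, as in the import statements.
for name in ("spectral_cube", "radio_beam"):
    print(name, "installed" if is_installed(name) else "NOT installed")
```

If one of them is reported as not installed, the most likely cause is that pip or setup.py ran under a different python than the one running this notebook.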

In [ ]:
import numpy as np
import astropy.units as u
from spectral_cube import SpectralCube
import radio_beam

In [ ]:
cube = SpectralCube.read('../data/ngc6503.cube.fits')
print(cube)

A FITS file consists of a series of Header-Data-Units (HDUs). Usually there is only one, representing the image, but this file has two. For now, we're going to ignore the second, which is a special table and in this example happens to be empty anyway. Each HDU has a header and data. The data in this case is a numpy array, and represents the image (cube):


In [ ]:
h = cube.header
d = cube.unmasked_data[:,:,:]
print(d.shape, d.min(), d.max(), d.mean(), np.median(d), d.std())
print("Signal/Noise  (S/N):",d.max()/d.std())

From the shape (89,251,371) we can see this image is already 3 dimensional, the dummy 4th dimension we saw in the previous example is not present here. There are 371 pixels along X, 251 along Y, and 89 slices or spectral channels.
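
The (z, y, x) axis order is easy to get backwards, so here is a small sketch on a stand-in array. The array toy is an assumption for illustration only: it has the same shape as the cube but is filled with zeros, and the marked voxel is arbitrary.

```python
import numpy as np

# Hypothetical stand-in for the real cube: same shape, all zeros.
nz, ny, nx = 89, 251, 371
toy = np.zeros((nz, ny, nx), dtype=np.float32)

# Mark one voxel at channel z=35, row y=125, column x=175.
toy[35, 125, 175] = 1.0

channel_map = toy[35]          # one channel: a 2D image of shape (ny, nx)
spectrum = toy[:, 125, 175]    # one sky position: a 1D spectrum of length nz

print(toy.shape, channel_map.shape, spectrum.shape)
```

The first index selects the channel, so indexing with a single integer gives an image, while fixing the last two indices gives a spectrum.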

In case you were wondering about that missing redundant 4th axis: it has possibly been left out of the spectral_cube data model as a simplification, at least at the moment.

The material below this should be the same as the previous "case1" notebook (even though the data array is not exactly just a numpy array, but a Quantity!)

Plotting some basics


In [ ]:
import matplotlib.pyplot as plt

In [ ]:
z = 35
im = d[z,:,:]                              #   im = d[z]     also works
plt.imshow(im,origin='lower')
plt.colorbar()

There are 89 channels (slices) in this cube, numbered 0 through 88 in the usual python sense. Pick a few other slices by changing the value in z= and notice that the first few and last few appear to be just noise and that the V-shaped signal changes shape through the channels. Perhaps you should not be surprised that these are referred to as butterfly diagrams.


In [ ]:
# look at a histogram of all the data (histogram needs a 1D array)
d1 = d.ravel()                 # ravel() returns a view when possible instead of a copy, saving memory
print(d1.shape)
(n,b,p) = plt.hist(d1, bins=100)

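Notice that the histogram is bunched up on the left side of the plot, even though we already saw that the maximum data point is 0.0169835. Most pixels are noise close to zero, and the few bright pixels are too rare to show up on a linear scale.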

So let us plot the vertical axis logarithmically, so we can better see what is going on.


In [ ]:
(n,b,p) = plt.hist(d1,bins=100,log=True)

In [ ]:
# pick a slice and make a histogram and print the mean and standard deviation of the signal in that slice
z=0
imz = d[z,:,:].flatten()
(n,b,p) = plt.hist(imz,bins=100)
print(imz.mean(), imz.std())

Exercise : observe by picking some values of z that the noise seems to vary a little bit from one end of the band to the other. Store the noise in channel 0 and 88 in variables sigma0 and sigma88:


In [ ]:
xpeak = 175
ypeak = 125
channel = np.arange(d.shape[0])
spectrum = d[:,ypeak,xpeak]
zero = spectrum * 0.0
plt.plot(channel,spectrum)
plt.plot(channel,zero)

In [ ]:
sigma0 = 0.00056
sigma88 = 0.00059
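
The two values above were read off by hand. As a sketch of how the noise in every channel could be computed in one go, the example below takes the standard deviation over the two spatial axes. It runs on a synthetic noise-only cube (an assumption, not the NGC 6503 data), with the noise made to rise slightly across the band; the names toy, true_sigma and sigma_per_channel are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic noise-only cube (assumption: not the NGC 6503 data), with the
# noise level rising slightly from the first to the last channel.
nz, ny, nx = 8, 64, 64
true_sigma = np.linspace(0.00056, 0.00059, nz)
toy = rng.normal(size=(nz, ny, nx)) * true_sigma[:, None, None]

# One standard deviation per channel: collapse the two spatial axes.
sigma_per_channel = toy.std(axis=(1, 2))

sigma_first = sigma_per_channel[0]
sigma_last = sigma_per_channel[-1]
print(sigma_first, sigma_last)
```

On real data this only measures the noise in channels that are free of line emission; in channels containing signal the standard deviation is inflated by the galaxy itself.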

In [ ]:
import scipy.signal
import scipy.ndimage as filters          # the scipy.ndimage.filters namespace is deprecated in recent scipy

Smoothing a cube to enhance the signal to noise


In [ ]:
z = 0
sigma = 2.0
ds1 = filters.gaussian_filter(d[z],sigma)                    # ds1 is a single smoothed slice
print(ds1.shape, ds1.mean(), ds1.std())
plt.imshow(ds1,origin='lower')
plt.colorbar()

Notice that the noise is indeed lower than your earlier value of sigma0. We only smoothed one single slice, but we actually need to smooth the whole cube: each slice spatially with sigma, and optionally also a little bit in the spectral dimension.
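
To see why smoothing lowers the noise, and by roughly how much, here is a minimal numpy-only sketch on pure white noise. It is not the scipy routine used in this notebook: the helpers gaussian_kernel and smooth2d are made up for this illustration, and they pad the edges with zeros, whereas scipy's gaussian_filter reflects the image at the edges by default. For uncorrelated noise and a Gaussian of width sigma pixels, the noise is expected to drop by about a factor 1/(2 sqrt(pi) sigma).

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """Normalized 1D Gaussian kernel, cut off at truncate*sigma."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth2d(image, sigma):
    """Separable Gaussian smoothing: convolve along rows, then columns."""
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(np.convolve, 1, image, k, mode="same")
    out = np.apply_along_axis(np.convolve, 0, out, k, mode="same")
    return out

rng = np.random.default_rng(2)
noise = rng.normal(scale=1.0, size=(128, 128))   # white noise, sigma = 1

sigma = 2.0
smoothed = smooth2d(noise, sigma)

# For white noise, the expected reduction is roughly 1 / (2 sqrt(pi) sigma).
print(noise.std(), smoothed.std(), 1.0 / (2.0 * np.sqrt(np.pi) * sigma))
```

Real radio maps already have correlated noise on the scale of the beam, so the gain on actual data will be smaller than this idealized estimate.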


In [ ]:
ds = filters.gaussian_filter(d,[1.0,sigma,sigma])              # ds is a smoothed cube
plt.imshow(ds[z],origin='lower')
plt.colorbar()
print(ds.max(),ds.max()/ds1.std())

Notice that, although the peak value was lowered a bit due to the smoothing, the signal to noise has increased from the original cube. So, the signal should stand out a lot better.

Exercise : Observe a subtle difference in the last two plots. Can you see what happened here?

Masking


In [ ]:
import numpy.ma as ma

In [ ]:
#  sigma0 is the noise in the original cube
nsigma = 0.0
dm = ma.masked_inside(d,-nsigma*sigma0,nsigma*sigma0)
print(dm.count())
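
As a small illustration of what ma.masked_inside does in the cell above, the sketch below applies it to a handful of made-up values. The names values, sigma_toy, clipped and only_zeros are hypothetical and exist only for this example.

```python
import numpy as np
import numpy.ma as ma

values = np.array([-3.0, -0.5, 0.0, 0.4, 2.0, 5.0])
sigma_toy = 1.0      # hypothetical noise level for this toy array
nsigma = 1.0

# Mask everything with |value| <= nsigma * sigma_toy (the interval is inclusive).
clipped = ma.masked_inside(values, -nsigma * sigma_toy, nsigma * sigma_toy)

print(clipped)
print(clipped.count())   # number of unmasked elements
print(clipped.sum())     # masked elements are skipped in the sum

# With nsigma = 0 the interval collapses to [0, 0]: only exact zeros are masked.
only_zeros = ma.masked_inside(values, 0.0, 0.0)
print(only_zeros.count())
```

So with nsigma = 0.0 essentially nothing is masked, and raising nsigma removes progressively more of the values near zero, whether they are noise or faint signal.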

In [ ]:
mom0 = dm.sum(axis=0)
plt.imshow(mom0,origin='lower')
plt.colorbar()
#
(ypeak,xpeak) = np.unravel_index(mom0.argmax(),mom0.shape)
print("PEAK at location:",xpeak,ypeak,mom0.argmax())
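
The argmax and unravel_index pair in the cell above can be confusing, because argmax returns a single index into the flattened array. A toy sketch, with a made-up 4 by 6 map and names ending in _toy so they do not clash with the notebook variables:

```python
import numpy as np

toy_map = np.zeros((4, 6))      # shape is (ny, nx)
toy_map[2, 5] = 7.0             # brightest pixel at row y=2, column x=5

flat_index = toy_map.argmax()   # index into the flattened array
ypeak_toy, xpeak_toy = np.unravel_index(flat_index, toy_map.shape)

print(flat_index, xpeak_toy, ypeak_toy)
```

Note that unravel_index returns the indices in array order, so y comes before x.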

In [ ]:
spectrum2 = ds[:,ypeak,xpeak]
plt.plot(channel,spectrum2)
plt.plot(channel,zero)

In [ ]:
mom0s = ds.sum(axis=0)
plt.imshow(mom0s,origin='lower')
plt.colorbar()

Velocity fields

The mean velocity is defined as the first moment

$$ \langle V \rangle = { \sum (v \, I) \over \sum I } $$

In [ ]:
nz = d.shape[0]
vchan = np.arange(nz).reshape(nz,1,1)
vsum = vchan * d
vmean = vsum.sum(axis=0)/d.sum(axis=0)
print("MINMAX",vmean.min(),vmean.max())
plt.imshow(vmean,origin='lower',vmin=0,vmax=88)
plt.colorbar()

Although we can recognize an area of coherent motions (the red and blue shifted sides of the galaxy), there is a lot of noise in this image. Looking at the math, we are dividing two numbers, both of which can be noise, so the outcome can be "anything". If anything, it should be a value between 0 and 88, so we could mask for that and see how that looks.
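
The claim that the outcome can be "anything" is easy to check numerically. The sketch below is a toy experiment on synthetic data (an assumption, not the NGC 6503 cube): it computes the same first moment once for strictly positive intensities and once for zero-mean noise. The helper first_moment and the array names are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
nz, ny, nx = 89, 64, 64
vchan = np.arange(nz).reshape(nz, 1, 1)      # channel number, broadcast over y and x

def first_moment(cube):
    """Intensity-weighted mean channel along axis 0."""
    return (vchan * cube).sum(axis=0) / cube.sum(axis=0)

# Case 1: strictly positive intensities -> a proper weighted average.
positive = rng.uniform(0.1, 1.0, size=(nz, ny, nx))
v_pos = first_moment(positive)

# Case 2: zero-mean noise -> ratio of two noisy sums, essentially unbounded.
noise = rng.normal(size=(nz, ny, nx))
v_noise = first_moment(noise)

outside = (v_noise < 0) | (v_noise > nz - 1)
print(v_pos.min(), v_pos.max())
print(v_noise.min(), v_noise.max(), outside.mean())
```

With positive weights the result always stays between channel 0 and channel 88, while for pure noise a sizeable fraction of the pixels ends up outside that range.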

Let us first see how this looks for the smoothed cube.


In [ ]:
nz = ds.shape[0]
vchan = np.arange(nz).reshape(nz,1,1)
vsum = vchan * ds
vmean = vsum.sum(axis=0)/ds.sum(axis=0)
print(vmean.shape,vmean.min(),vmean.max())
plt.imshow(vmean,origin='lower',vmin=0,vmax=89)
plt.colorbar()

Although more coherent, there are still bogus values outside the image of the galaxy. So we are looking for a hybrid of the two methods. In the smooth cube we saw the signal to noise is a lot better defined, so we will define areas in the cube where the signal to noise is high enough and use those in the original high resolution cube.
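
The hybrid idea can be sketched on a tiny 1D example before applying it to the cube: decide where the signal is using the smoothed data, then apply that mask to the original data. All values and names here (original, smooth, cutoff, hybrid) are made up for illustration.

```python
import numpy as np
import numpy.ma as ma

# Toy 1D "spectra" (hypothetical values): a noisy original and a smoothed copy.
original = np.array([0.3, -0.4, 2.5, 3.1, 2.2, -0.2, 0.5])
smooth = np.array([0.1, 0.6, 2.0, 2.6, 1.9, 0.7, 0.1])
cutoff = 1.0

# 1) Decide where there is signal using the smoothed data.
smooth_masked = ma.masked_inside(smooth, -cutoff, cutoff)

# 2) Apply that same mask to the original, unsmoothed data.
hybrid = ma.masked_where(ma.getmask(smooth_masked), original)

print(hybrid)
print(hybrid.count(), hybrid.sum())
```

Only the three central values survive, and they keep their original, unsmoothed amplitudes, which is what we want for the moment calculation.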


In [ ]:
# this is all messy , we need a better solution, a hybrid of the two:
noise = ds[0:5].flatten()
(n,b,p) = plt.hist(noise,bins=100)
print(noise.mean(), noise.std())

In [ ]:
sigma0 = noise.std()
nsigma = 5.0
cutoff = sigma0*nsigma
dm = ma.masked_inside(ds,-cutoff,cutoff)
print(cutoff,dm.count())

In [ ]:
dm2=ma.masked_where(ma.getmask(dm),d)

In [ ]:
vsum = vchan * dm2
vmean = vsum.sum(axis=0)/dm2.sum(axis=0)
print(vmean.min(),vmean.max())
plt.imshow(vmean,origin='lower',vmin=0,vmax=89)
plt.colorbar()

And voilà, now this looks a lot better.