I've been examining the SHFI baselines from the CMZ survey (tuned to 218.9 or 219.0 GHz). Baseline issues have been hampering the analysis of our data.

I've been talking with a few folks about how to deal with this problem. Fitting the baselines didn't work out very well. Flagging may be an option. Below, I investigate a few approaches for identifying bad spectra. None of them provides a complete answer, so I'd like to hear further ideas if you have any.

Note that the work below is all done in Python, but it could just as easily be done with CLASS.

Baselines are a significant problem in our spectra. This is an example spectrum with a two-component fit superposed. The residual is dominated by large-scale baseline ripples. This spectrum ("CoolSpot") is an average over a few pixels and includes ~1 minute of integration time (give or take an order of magnitude).


In [1]:
from IPython.display import Image,HTML
Image('../tex/figures/simple/CoolSpot_fit_4_lines_simple.png')


Out[1]:

I'm going to show a few methods I've tried to identify 'bad' baseline spectra. Start by loading the spectra for the "low" channel:


In [2]:
from pyspeckit.spectrum.readers import read_class
cl = read_class.ClassObject('/Users/adam/work/gc/apex/april2014/E-093.C-0144A.2014APR02/E-093.C-0144A-2014-2014-04-01.apex')
cl


Out[2]:
ClassObject(4430438736) with 5890 entries
sources: set(['RAFGL1922   ', 'EP-AQR      ', 'MAP_001     '])
tels: set(['AP-H201-X202', 'AP-H201-X201'])
lines: set(['CO(2-1)     ', 'shfi219GHz  '])
scans: set([8645, 8646, 8648, 8649, 8651, 8652])

In [3]:
# Select only the science data for just AP-H201-X202
spectra = cl.get_spectra(sourcere='^MAP_001', telescope=list(cl.tels)[0])

In [4]:
# Get the data as an array ("spectra" is a list of (spectrum, header) pairs)
import numpy as np
data = np.array([s for s, h in spectra])
data.shape


Out[4]:
(2869, 32768)

In [5]:
# Show the mean and median spectra
import pylab as pl
import numpy as np
pl.plot(data.mean(axis=0), ',')
pl.plot(np.median(data,axis=0), ',')


Out[5]:
[<matplotlib.lines.Line2D at 0x108132090>]

There is probably a detected signal around pixel 17,000. There are baseline artifacts above pixel ~30,000, and the baseline is junky below pixel ~5,000.

Next, I'll point out some 'bad' or at least 'extreme' spectra. In the plot below, there are some spectra around #10 and #100 that have much higher noise than the others.


In [6]:
err = data.std(axis=1)
pl.plot(err)
pl.xlabel("Integration #")
pl.ylabel("RMS")


Out[6]:
<matplotlib.text.Text at 0x13e610390>

In [7]:
# identify the "bad" spectra within the first 500 integrations
np.where(err[:500] > 1.3)


Out[7]:
(array([145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158]),)
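Rather than the hard-coded cut above, a percentile cut on the RMS would adapt to whatever dataset is loaded. A minimal sketch on synthetic data (the array shapes and the 97% cut are illustrative, not the survey's actual values):

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic stand-in for the real data: 300 spectra, 10 of them noisy
data = rng.standard_normal((300, 1024))
data[100:110] *= 5  # inject "bad" high-RMS integrations

err = data.std(axis=1)
# Flag everything above the 97th percentile of the RMS distribution
bad = err > np.percentile(err, 97)
good_data = data[~bad, :]
print(bad.sum(), good_data.shape)
```

The flagged integrations can then simply be excluded before averaging.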

In [8]:
# show a 'good' and a 'bad' spectrum:
xax = read_class.make_axis(spectra[0][1]) # create the frequency axis
pl.plot(xax, data[150,:],',', label='bad')
pl.plot(xax, data[500,:],',', label='good')
pl.legend(loc='best')
pl.xlim(xax.min(), xax.max())


Out[8]:
(SpectroscopicAxis([array(216900.3204320392)], units='MHz', refX=218900.0, refX_units='MHz', frame='rest', redshift=None, xtype='frequency', velocity convention='radio'),
 SpectroscopicAxis([array(219399.91992923827)], units='MHz', refX=218900.0, refX_units='MHz', frame='rest', redshift=None, xtype='frequency', velocity convention='radio'))

It's clear that the blue 'bad' spectrum has both a gradient and some ripples. However, this is probably a high Tsys pointing.


In [9]:
# show two more normal-ish spectra
pl.plot(xax, data[2000,:],',', alpha=0.5)
pl.plot(xax, data[500,:],',', alpha=0.5)
pl.ylim(-15,5)
pl.xlim(xax.min(), xax.max())


Out[9]:
(SpectroscopicAxis([array(216900.3204320392)], units='MHz', refX=218900.0, refX_units='MHz', frame='rest', redshift=None, xtype='frequency', velocity convention='radio'),
 SpectroscopicAxis([array(219399.91992923827)], units='MHz', refX=218900.0, refX_units='MHz', frame='rest', redshift=None, xtype='frequency', velocity convention='radio'))

Both of these show baseline issues, but the blue is much worse than the green. You can also see hints of a pattern in the middle section: higher-frequency baseline ripples.


In [10]:
# zoom in on the same two spectra, adding thin lines to trace the ripples
pl.plot(xax, data[2000,:], alpha=0.5, linewidth=0.5)
pl.plot(xax, data[500,:], alpha=0.5, linewidth=0.5)
pl.plot(xax, data[2000,:], 'b,', alpha=0.5)
pl.plot(xax, data[500,:], 'g,', alpha=0.5)
pl.ylim(-15,5)
pl.xlim(xax[16000],xax[18000])


Out[10]:
(218120.86524710269, 218273.43334898562)

These 'local' baseline ripples are much worse than the 'global' ripples. They are ~20 MHz wide, or...


In [11]:
from astropy import units as u
from astropy import constants
((20*u.MHz/(218.1*u.GHz))*constants.c).to(u.km/u.s)


Out[11]:
$27.4913 \; \mathrm{\frac{km}{s}}$

... about 30 km/s. That's comparable to the expected observed line width.

So, now I'll test a few measurements of the baseline quality.

First, the approach of measuring variance, downsampling, then measuring variance again, ad nauseam. Zhiyu Zhang suggested this approach; it's relatively easy to understand: downsample the data by averaging adjacent channels by different amounts, and watch whether the measured RMS follows the theoretical expectation.
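`image_tools` below is a local module; for anyone without it, a minimal stand-in for `downsample_1d` (plain block averaging, truncating leftover channels; the real implementation may differ in edge handling) could be:

```python
import numpy as np

def downsample_1d(arr, factor):
    """Block-average arr by factor, dropping trailing channels that
    don't fill a complete block."""
    n = (len(arr) // factor) * factor
    return arr[:n].reshape(-1, factor).mean(axis=1)

print(downsample_1d(np.arange(10.), 3))  # → [1. 4. 7.]
```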


In [12]:
from image_tools import downsample_axis, downsample_1d
spec1 = data[500,:]
spec2 = data[2000,:]
# Downsample from a resolution 0.07 MHz to 40 MHz, then go coarser
dsfactors = list(range(1,512)) + [512+8*ii for ii in range(64)]
def dsvar(arr):
    return [downsample_1d(arr, int(factor)).var() for factor in dsfactors]
spec1dsvar = dsvar(spec1)
spec2dsvar = dsvar(spec2)

Theoretically, for normally distributed noise, the variance should go as 1/n, i.e., 1/downsample_factor, so I divide the measured variance by this theoretical value. I'll demonstrate on synthetic data that this works, too:
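As a quick sanity check on the 1/n expectation, block-average unit-variance Gaussian noise and compare the measured variance to var/factor (the seed and factors here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
noise = rng.standard_normal(32768)
for factor in (2, 8, 64, 512):
    n = len(noise) - len(noise) % factor
    ds = noise[:n].reshape(-1, factor).mean(axis=1)
    # this ratio should hover near 1 if the 1/n law holds
    print(factor, round(ds.var() / (noise.var() / factor), 3))
```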


In [13]:
# Synthetic spectrum: noise amplitude ~1
synthnoise = np.random.randn(32768)
# Add a gaussian line: sigma = 100 channels (~10 km/s), amplitude 2
synthspec = synthnoise + 2*np.exp(-(np.arange(32768)-18000.)**2/(2*100**2))
synthnoise_var = dsvar(synthnoise)
synthspec_var = dsvar(synthspec)

In [14]:
# Show the synthetic spectrum with its bright, ~10 km/s wide line
pl.plot(xax, synthspec, 'k,')
pl.xlim(xax.min(), xax.max())
pl.xlabel("Frequency (MHz)")
# This is probably typical of a Sgr A spectrum, for example


Out[14]:
<matplotlib.text.Text at 0x156133550>

In [15]:
pl.loglog(dsfactors, synthnoise_var/(synthnoise.var()/dsfactors), 'r')
pl.loglog(dsfactors, synthspec_var/(synthspec.var()/dsfactors), 'k', alpha=0.5)
pl.loglog(dsfactors, spec1dsvar/(spec1.var()/dsfactors), 'g')
pl.loglog(dsfactors, spec2dsvar/(spec2.var()/dsfactors), 'b')
pl.ylabel("Variance/Expected Variance")
pl.xlabel("Downsampling Factor")
pl.xlim(1,np.max(dsfactors))


Out[15]:
(1, 1016)

If you go to larger downsampling factors, the measurement itself breaks down: if you downsampled by a factor of 10,000 with only 32,000 pixels, you'd be measuring a variance from 3.2 points.
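This sample-size caveat can be quantified: the fractional 1-sigma uncertainty of a variance estimated from n independent Gaussian samples is roughly sqrt(2/(n-1)). A quick illustration (the formula is the standard Gaussian result; the factors are generic, not tied to the survey data):

```python
import numpy as np

def variance_frac_uncertainty(n):
    """Approximate fractional (1-sigma) uncertainty of a sample
    variance estimated from n independent Gaussian samples."""
    return np.sqrt(2.0 / (n - 1))

npix = 32768
for factor in (64, 1024, 8192):
    nsamp = npix // factor
    # at factor 1024, only 32 samples survive -> ~25% uncertainty
    print(factor, nsamp, round(variance_frac_uncertainty(nsamp), 2))
```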

Otherwise, though, the red (gaussian noise) stays almost entirely flat up to a downsampling factor ~1000. Good! The green and blue real data have a rising variance/expected variance indicative of "signal" in the data. The signal could be real or could be baselines.

The bad news is that the grey line - the synthetic spectrum with purely gaussian noise plus a single 10 km/s wide (23.5 km/s FWHM) line - looks identical to the real data. From this statistic alone, there's no way to tell whether a spectrum contains a real line or bad baselines. I'll show how this looks for a few different synthetic spectra:


In [16]:
# 1000 channels = 104 km/s
synthspec1 = synthnoise + 2*np.exp(-(np.arange(32768)-13000.)**2/(2*1000**2))
# 10 channels = 1.04 km/s
synthspec2 = synthnoise + 2*np.exp(-(np.arange(32768)-16000.)**2/(2*10**2))
synthspec1_var = dsvar(synthspec1)
synthspec2_var = dsvar(synthspec2)

In [17]:
pl.loglog(dsfactors, synthnoise_var/(synthnoise.var()/dsfactors), 'r')
pl.loglog(dsfactors, synthspec_var/(synthspec.var()/dsfactors), 'k', alpha=0.5)
pl.loglog(dsfactors, synthspec1_var/(synthspec1.var()/dsfactors), 'g')
pl.loglog(dsfactors, synthspec2_var/(synthspec2.var()/dsfactors), 'b')
pl.ylabel("Variance/Expected Variance")
pl.xlabel("Downsampling Factor")
pl.xlim(1,np.max(dsfactors))


Out[17]:
(1, 1016)

So, narrow lines are indistinguishable from noise (fine), but wide lines - insanely wide lines, not believable even for the CMZ - look rather terrible. This is actually somewhat useful. We can look at, for example, the overall distribution of the variance of the data downsampled by a factor of $2^{10} = 1024$:


In [18]:
datavar = data.var(axis=1)
datadsvar = downsample_axis(data, 1024, 1).var(axis=1)

In [19]:
var1024div1 = datadsvar / datavar
p,l,h = pl.hist(var1024div1, bins=100, log=True)


It is plausible that the highest-variance spectra have the worst baselines. We can inspect some: I'll plot the spectra with a variance ratio > 0.25.


In [20]:
pl.plot(xax, data[var1024div1 > 0.25, :].T, ',', alpha=0.1)
pl.xlim(xax.min(), xax.max())
pl.ylim(10,30)


Out[20]:
(10, 30)

Some of the best:


In [21]:
pl.plot(xax, data[var1024div1 < 0.01, :][:10,:].T, ',', alpha=0.1)
pl.xlim(xax.min(), xax.max())
pl.ylim(-5,5)


Out[21]:
(-5, 5)

So, it seems like it may be fair to reject, say, the top 1% 'noisiest' spectra, where the noise is measured from the heavily downsampled data.
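As a sketch of what that rejection would look like in code (on synthetic data; the block-averaging helper and the 97.5% cut are stand-ins, not the survey's actual choices):

```python
import numpy as np

rng = np.random.default_rng(0)
nspec, nchan = 200, 4096
data = rng.standard_normal((nspec, nchan))
# Give the first five spectra a slow, two-cycle baseline ripple
data[:5] += 2 * np.sin(np.linspace(0, 4 * np.pi, nchan))

def block_var(arr2d, factor):
    """Variance of each spectrum after block-averaging by factor."""
    n = (arr2d.shape[1] // factor) * factor
    blocks = arr2d[:, :n].reshape(arr2d.shape[0], -1, factor)
    return blocks.mean(axis=2).var(axis=1)

# Ratio of heavily-downsampled variance to raw variance: near 1/512
# for pure noise, much larger when smooth structure survives averaging
ratio = block_var(data, 512) / data.var(axis=1)
bad = ratio > np.percentile(ratio, 97.5)  # flag the top 2.5%
print(np.where(bad)[0])
```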

Unfortunately, those high-frequency ripples are clearly still present.


In [22]:
pl.plot(xax, data[var1024div1 < 0.01, :][:25,:].T, ',', alpha=0.1)
pl.xlim(217.8e3, 218.2e3)
pl.ylim(-5,5)


Out[22]:
(-5, 5)

Fourier-Based Flagging

I also explored the Fourier domain, looking for power on the appropriate scales. The baselines are quasi-periodic, so we might expect to see peaks in the power spectrum at the ripple frequencies. We don't, though.


In [23]:
dft = np.fft.fft(data, axis=1)
frq = np.fft.fftfreq(data.shape[1])

In [24]:
pl.loglog(frq, np.abs(dft[500,:]), 'b', alpha=0.5) # good
pl.loglog(frq, np.abs(dft[2000,:]), 'g', alpha=0.5) # OK
pl.loglog(frq, np.abs(dft[150,:]), 'r', alpha=0.5) # bad
pl.xlim(frq.min(), 1e-2)
pl.ylim(10,3e4)


Out[24]:
(10, 30000.0)

While there is more power on large scales in the 'bad' data, it's not obvious that the power is concentrated in any particular way. We can again make histograms, this time of the amplitude in each of the lowest Fourier modes:


In [25]:
for ii in range(1,8):
    pl.hist(np.abs(dft[:,ii]), alpha=0.5, bins=50, log=True, histtype='step', label=str(ii))
pl.legend(loc='best')


Out[25]:
<matplotlib.legend.Legend at 0x15610f490>

So a few of these may be rejectable. The power in each Fourier mode is approximately power-law distributed out to some limit, above which we might choose to reject the data.

However, it's worth noting that Fourier modes #2 and #3 are multi-modal: perhaps that means those spectra are particularly affected by a single mode? I'll investigate that below; for now, I focus on rejecting the data with high power in modes 1 & 2.


In [26]:
bad_ft_1 = np.abs(dft[:,1]) > 11000
bad_ft_2 = np.abs(dft[:,2]) > 6000

In [27]:
pl.plot(xax, data[bad_ft_1, :].T, ',', alpha=0.1)
pl.plot(xax, data[bad_ft_2, :].T, ',', alpha=0.1)
pl.xlim(xax.min(), xax.max())
pl.ylim(10,30)


Out[27]:
(10, 30)

And some of the best:


In [28]:
pl.plot(xax, data[np.abs(dft[:,1]) < 1000, :][:100:10,:].T, ',', alpha=0.1)
pl.plot(xax, data[np.abs(dft[:,2]) < 1000, :][:100:10,:].T, ',', alpha=0.1)
pl.xlim(xax.min(), xax.max())
pl.ylim(-5,6)


Out[28]:
(-5, 6)

Again, this looks like a reasonable rejection criterion: perhaps dump the top 1% of spectra by Fourier mode 1 or 2 amplitude (not mode 0 - rejecting on the mean is not acceptable).
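That top-1% cut could be written as follows (synthetic spectra again; the mode indices follow the text, while the injected ripple and cut level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
nspec, nchan = 400, 2048
data = rng.standard_normal((nspec, nchan))
# Inject a one-cycle ripple (i.e., Fourier mode 1) into four spectra
data[:4] += 3 * np.cos(2 * np.pi * np.arange(nchan) / nchan)

amps = np.abs(np.fft.fft(data, axis=1))
# Flag spectra in the top 1% of mode-1 OR mode-2 amplitude
# (never mode 0, which is just the spectrum mean)
bad = np.zeros(nspec, dtype=bool)
for mode in (1, 2):
    bad |= amps[:, mode] > np.percentile(amps[:, mode], 99)
print(bad[:4], bad.sum())
```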

Still, the question remains: is there any way to reject those narrow ripples? Or mitigate them? To explore this further, I cut out a section of the spectrum around 217.7-218.3 GHz, where the small-scale ripples seem worst.


In [29]:
# select a middle chunk of spectra with "good" data by the previous criteria
mid_chunk = data[np.abs(dft[:,2]) < 2200, 10000:18000]
print("Frequency range: {0}-{1}".format(xax[10000],xax[18000]))
mid_chunk.shape


Frequency range: 217663.160941-218273.433349
Out[29]:
(2559, 8000)

Baseline ripples appear as "absorption" features in the spectrum below, at channels ~4000 and ~6400:


In [30]:
pl.plot(mid_chunk.mean(axis=0), linewidth=0.1)


Out[30]:
[<matplotlib.lines.Line2D at 0x13e761590>]

In [31]:
mid_chunk_fft = np.fft.fft(mid_chunk, axis=1)
mid_chunk_mean_fft = np.fft.fft(mid_chunk.mean(axis=0))
mid_chunk_frq = np.fft.fftfreq(mid_chunk.shape[1])

I plot the mean power-spectrum and a handful of individual power spectra:


In [32]:
pl.loglog(mid_chunk_frq, np.abs(mid_chunk_mean_fft), linewidth=2)
pl.loglog(mid_chunk_frq, np.abs(mid_chunk_fft[::100,:]).T, alpha=0.5, linewidth=0.5)
pl.ylim(1,2e3)
pl.xlim(1e-4,1e-2)


Out[32]:
(0.0001, 0.01)

There is nothing obvious in these power spectra; I would expect a bump at the preferred frequency of the fluctuation. This means the small-scale baseline ripples do not have a preferred frequency: they are not well-modeled as a collection of sine functions.
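This interpretation is easy to verify synthetically: a strictly periodic ripple concentrates essentially all of its power in one Fourier mode, while a quasi-periodic one (here a linear chirp, purely a stand-in for whatever the real baselines do) smears it across many modes:

```python
import numpy as np

nchan = 8192
x = np.linspace(0, 1, nchan, endpoint=False)
periodic = np.sin(2 * np.pi * 50 * x)              # fixed 50-cycle ripple
chirp = np.sin(2 * np.pi * (30 * x + 20 * x**2))   # frequency drifts 30 -> 70

peak_frac = {}
for name, sig in (("periodic", periodic), ("chirp", chirp)):
    power = np.abs(np.fft.rfft(sig))**2
    # fraction of the total power in the single strongest mode
    peak_frac[name] = power.max() / power.sum()
    print(name, round(peak_frac[name], 3))
```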

Now, I'll revisit Fourier Modes #2 and #3, which are vaguely multi-modal:


In [33]:
p,l,h = pl.hist(np.abs(dft[:,2]), alpha=0.5, bins=50, log=True, histtype='step', color='b')
p,l,h = pl.hist(np.abs(dft[:,3]), alpha=0.5, bins=50, log=True, histtype='step', color='r')
pl.vlines([2200,4300], 0.1,1e3,)


Out[33]:
<matplotlib.collections.LineCollection at 0x1563fb8d0>

Seeing a bump at ~2500, I split the data at 2200 and 4300:


In [34]:
low = np.abs(dft[:,3]) < 2200
mid = (np.abs(dft[:,3]) < 4300) & ~low
high = ~low & ~mid
mean_low = data[low,:].mean(axis=0)
mean_mid = data[mid,:].mean(axis=0)
mean_high = data[high,:].mean(axis=0)

Plot the low-power (blue), mid-power (green), and high-power (red) spectra below:


In [35]:
pl.plot(xax, mean_low-mean_low.mean()+0.5, 'b,', alpha=0.5)
pl.plot(xax, mean_mid-mean_mid.mean(), 'g,', alpha=0.5)
pl.plot(xax, mean_high-mean_high.mean()-1.5, 'r,', alpha=0.5)
pl.xlim(xax.min(), xax.max())
pl.ylim(-3,1)


Out[35]:
(-3, 1)

Plot the mean "low-power" spectrum, which is supposed to be the "good" data:


In [36]:
pl.plot(xax, mean_low, linewidth=0.1)
pl.xlim(217.6e3, 218.4e3)
pl.ylim(0.15,0.6)


Out[36]:
(0.15, 0.6)

So Fourier mode 3 is almost usable as a rejection criterion. Clearly the "low" component has not filtered out the H2CO line, which is good!

How many did we flag out?


In [37]:
print("Kept:{0} out of {1} ({2}%)".format(low.sum(), data.shape[0], 100*low.sum() / float(data.shape[0])))


Kept:2549 out of 2869 (88.8462879052%)

Flagging ~11% of the data is a lot - not too much, but right at the border where automated selection of the flagging criteria may be very difficult to do safely.

Does it make sense to flag based on Fourier Mode #2 too?


In [38]:
pl.plot([0,1e4],[0,1e4],'k--')
pl.plot(np.abs(dft[:,2]), np.abs(dft[:,3]), '.', )


Out[38]:
[<matplotlib.lines.Line2D at 0x1560a9210>]

No, in the range we're interested in, the two modes are very well correlated.

Key Points

  1. Flagging bad-baseline data seems like a good idea - up to ~10%? - but its effects have not been tested on the real data.
  2. Flagging can be done on the basis of RMS as a function of smoothing scale or simple Fourier power.
  3. It is still not clear what effect higher Tsys (i.e., pointing at a continuum source) will have on this.
  4. Flagging does not provide a means to get rid of the 10-30 km/s baseline ripples. Those are still unsolved and deeply problematic.
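For completeness, here is how points 1-2 might be combined into a single flagging function. This is only a sketch on synthetic data: the percentile, the downsampling factor, and the choice of modes are placeholders, and (per point 1) nothing here has been validated against the real maps:

```python
import numpy as np

def flag_bad_baselines(data, ds_factor=1024, modes=(1, 2, 3),
                       percentile=99.0):
    """Return a boolean mask of spectra to KEEP, combining the
    downsampled-variance and low-order-Fourier-power criteria.
    All cut levels are illustrative placeholders."""
    nspec, nchan = data.shape
    # Criterion 1: variance after heavy downsampling, relative to raw
    n = (nchan // ds_factor) * ds_factor
    dsvar = data[:, :n].reshape(nspec, -1, ds_factor).mean(axis=2).var(axis=1)
    ratio = dsvar / data.var(axis=1)
    bad = ratio > np.percentile(ratio, percentile)
    # Criterion 2: excess amplitude in low-order Fourier modes (not 0!)
    amps = np.abs(np.fft.fft(data, axis=1))
    for m in modes:
        bad |= amps[:, m] > np.percentile(amps[:, m], percentile)
    return ~bad

rng = np.random.default_rng(7)
data = rng.standard_normal((200, 4096))
data[:2] += 2 * np.sin(np.linspace(0, 6 * np.pi, 4096))  # rippled spectra
keep = flag_bad_baselines(data)
print(keep[:2], keep.sum())
```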