Delay compensation through FIR filtering

For LOFAR, fractional delay compensation has up to now always been implemented in the Fourier domain. This has a couple disadvantages. Firstly, the optimal size of the Fourier transform depends on the longest baseline for which delays must be compensated. Secondly, the procedure involves a pair of Fourier transforms to go back to the original time resolution before other processing is done. Thirdly, unconditioned Fourier transforms lead to all kinds of artifacts.

Another approach is to derive a FIR filter that achieves the delay by convolution with the time series. This avoids the Fourier transform pair. A major advantage is that it is possible to design a very nice frequency response, independent of the number of output channels that the observer may want to have later on.

We derive the expression for the filter response by doing an analytic Fourier transform pair. The underlying complex signal $s(t)$ that arrives at the correlator is band limited between $\nu_\mathrm{c} -\frac{1}{2}\Delta\nu$ and $\nu_\mathrm{c} +\frac{1}{2}\Delta\nu$, and is sampled at time stamps $t_{m+k} = t_m + k\Delta t$, where $\Delta t = \Delta\nu^{-1}$. The sampling function in the time domain is therefore

\begin{equation} d(t) = \sum_{m=1}^M \delta(t - t_m), \end{equation}

where $\delta$ is the dirac delta function. Thus, the sampled and observed signal is then given by \begin{equation} \tilde{s} = s(t)\sum_{m} \delta(t - t_m). \end{equation}

If a station's clock is delayed with respect to UTC at the phase centre of LOFAR, the samples are time stamped with \begin{equation} t = t_{\mathrm{UTC}} - \tau(t), \end{equation} where $\tau$ is the delay of the station at $t_{\mathrm{UTC}}$. $\tau$ is positive if the signal of the station is delayed with respect to UTC.

Delay compensation is done in two steps. Given

\begin{equation} \tau = n(t)\Delta t + \delta\tau(t), \end{equation}

where $n$ is the required integral number of time steps at labelled time $t$, and $\delta\tau$ is the remaining amount, which obeys $\|\delta \tau\| \le \frac{1}{2}\Delta t$. Thus, $\tilde{s}(t)$ contains in fact the data that belongs at \begin{equation} \tilde{s}(t_{\mathrm{UTC}}-\tau) = \tilde{s}(t+ n(t) \Delta t+ \delta\tau(t)). \end{equation}

We therefore have to perform the correction \begin{equation} s'(\tau) = \tilde{s}(t - n \Delta t - \delta\tau). \end{equation}

First, the full sample delay is compensated by simply shifting the time series by an integer number of time steps. Subsequently, the fractional delay must be compensated. The FIR filter derived here is aimed at implementing the latter shift. For the sake of simplicity we now drop the integer sample delays: \begin{equation} s'(t) = \tilde{s}(t - \delta\tau). \end{equation} Delay compensation usually deals with slowly increasing or decreasing delays. On the time ranges relevant to us, this increase or decrease of the delay can be well approximated by a linear function \begin{equation} \delta\tau(t) = \delta\tau(t=t_m) + (t-t_m)\cdot\frac{\mathrm{d}\tau}{\mathrm{d}t}, \end{equation}

In the following treatment, we take into account that the time-to-frequency Fourier transforms for the LOFAR correlators are implemented using FFTW_FORWARD, which implies a minus sign in the exponent of the transform's phase factor. This shift can be described in the frequency domain (including the sampling function) as follows: \begin{equation} S'(\nu) = \frac{\int_{-\infty}^{+\infty} \tilde{s}(t - \delta\tau(t)) \mathrm{e}^{-2\pi\mathrm{i}\nu t}\ \mathrm{d}t}{\int_{-\infty}^{+\infty} d(t)\ \mathrm{d}t}, \end{equation} which, when evaluated with input samples at $t_{m-N}$ to $t_{m+N}$ inclusive, expands to \begin{equation} S'(\nu) = \frac{\int_{-\infty}^{+\infty} \tilde{s}(t - \delta\tau(t=t_m) - (t-t_m)\cdot\frac{\mathrm{d}\tau}{\mathrm{d}t}) \mathrm{e}^{-2\pi\mathrm{i}\nu t}\ \mathrm{d}t}{2N+1}. \end{equation} Applying the shift theorem and reorganizing the argument of $\tilde{s}$ gives: \begin{equation} S'(\nu) = \frac{\mathrm{e}^{-2\pi\mathrm{i}\nu \delta\tau(t=t_m)}\int_{-\infty}^{+\infty} \tilde{s}(t(1- \frac{\mathrm{d}\tau}{\mathrm{d}t}) + t_0\frac{\mathrm{d}\tau}{\mathrm{d}t}) \mathrm{e}^{-2\pi\mathrm{i}\nu t}\ \mathrm{d}t}{2N+1}. \end{equation} Without loss of generality, we can set $t_m = 0$, and rename $\delta\tau(t=t_m)$ to $\delta\tau_m$: \begin{equation} S'(\nu) = \frac{\mathrm{e}^{-2\pi\mathrm{i}\nu \delta\tau_m}\int_{-\infty}^{+\infty} \tilde{s}(t(1- \frac{\mathrm{d}\tau}{\mathrm{d}t})) \mathrm{e}^{-2\pi\mathrm{i}\nu t}\ \mathrm{d}t}{2N+1}. \end{equation}

Now let's see how big $\frac{\mathrm{d}\tau}{\mathrm{d}t}$ can actually become. The angular velocity of the earth is


In [1]:
%pylab inline
omega_earth_rad_s = 2*pi/(23.9344696*3600)
print(omega_earth_rad_s, 'rad/s')


Populating the interactive namespace from numpy and matplotlib
(7.292115852838157e-05, 'rad/s')

The maximum distance between two points on the Earth's equator is


In [2]:
equatorial_diameter_m = 12.756280e6

The maximum velocity of a terrestrial baseline observing the equator is therefore


In [3]:
max_velocity = equatorial_diameter_m*omega_earth_rad_s
print max_velocity, 'm/s'


930.202716112 m/s
or

In [4]:
dtau_dt = max_velocity / 299792458.0
print 'dtau_dt =', dtau_dt, 's/s'


dtau_dt = 3.10282227351e-06 s/s

Given that the dynamic range in a 32 bit floating point number is of the order of $10^{-6}$ and that we intend to perform the delay compensation in single precision floating point on timeseries that are most likely of order a few ms long, we can safely set the $(1- \frac{\mathrm{d}\tau}{\mathrm{d}t})$ factor to 1: \begin{equation} S'(\nu) = \frac{\mathrm{e}^{-2\pi\mathrm{i}\nu \delta\tau_m}\int_{-\infty}^{+\infty} \tilde{s}(t) \mathrm{e}^{-2\pi\mathrm{i}\nu t}\ \mathrm{d}t}{2N+1}. \end{equation}

Evaluating the integral yields \begin{equation} S'(\nu) = (2N+1)^{-1}\mathrm{e}^{-2\pi\mathrm{i}\nu \delta\tau_m}\sum_{k=-N}^{+N}s_{m+k} \mathrm{e}^{-2\pi\mathrm{i}\nu t_{m+k}}. \end{equation} Transforming back to the time domain, we have \begin{equation} s'(t_m) = (\Delta\nu(2N+1))^{-1}\int_{\nu_\mathrm{c}-\frac{1}{2}\Delta\nu}^{\nu_\mathrm{c}+\frac{1}{2}\Delta\nu}\mathrm{e}^{-2\pi\mathrm{i}\nu \delta\tau_m}\sum_{k=-N}^{+N}s_{m+k} \mathrm{e}^{-2\pi\mathrm{i}\nu t_{m+k}}\mathrm{e}^{+2\pi\mathrm{i}\nu t_m}\ \mathrm{d}\nu, \end{equation} because each sub band only contains signals with frequencies between $\nu_\mathrm{c}-\frac{1}{2}\Delta\nu$ and $\nu_\mathrm{c}+\frac{1}{2}\Delta\nu$, where $\nu_\mathrm{c}$ is the central frequency of the sub band and $t_m$ is the time of the time domain sample that is currently being converted.

Shifting the integration range: \begin{equation} s'_m = (\Delta\nu(2N+1))^{-1}\int_{-\frac{1}{2}\Delta\nu}^{+\frac{1}{2}\Delta\nu}\mathrm{e}^{-2\pi\mathrm{i}(\nu_\mathrm{c}+\nu )\delta\tau_m}\sum_{k=-N}^{+N}s_{m+k} \mathrm{e}^{-2\pi\mathrm{i}(\nu_\mathrm{c}+\nu )t_{m+k}}\mathrm{e}^{+2\pi\mathrm{i}(\nu_\mathrm{c}+\nu) t_{m}}\ \mathrm{d}\nu, \end{equation} Because we only evaluate the integral at the time $t_m$, we can, without loss of generality, set $t_m = 0$. The latter phase factor reduces then to 1, and $t_{m+k}$ to $k\Delta t$. Implementing this, and changing the order of summation and integration, we get \begin{equation} s'_m = (2N+1)^{-1}\sum_{k=-N}^{+N}s_{m+k}(\Delta\nu)^{-1}\int_{-\frac{1}{2}\Delta\nu}^{+\frac{1}{2}\Delta\nu}\mathrm{e}^{-2\pi\mathrm{i}(\nu_\mathrm{c}+\nu )\delta\tau_m} \mathrm{e}^{-2\pi\mathrm{i}(\nu_\mathrm{c}+\nu )k\Delta t}\ \mathrm{d}\nu, \end{equation} \begin{equation} s'_m = (2N+1)^{-1}\mathrm{e}^{-2\pi\mathrm{i}\nu_\mathrm{c}\delta\tau_m}\sum_{k=-N}^{+N}s_{m+k}\mathrm{e}^{-2\pi\mathrm{i}\nu_\mathrm{c}k\Delta t}(\Delta\nu)^{-1}\int_{-\frac{1}{2}\Delta\nu}^{+\frac{1}{2}\Delta\nu} \mathrm{e}^{-2\pi\mathrm{i}\nu(k\Delta t+\delta\tau_m)}\ \mathrm{d}\nu. \end{equation}

Because $\nu_\mathrm{c}$ is always an integer multiple of $\Delta\nu = \Delta t^{-1}$, with $\Delta t = \nu_\mathrm{clk}/1024 $, the factor $\mathrm{e}^{-2\pi\mathrm{i}\nu_\mathrm{c}k\Delta t}$ is always equal to 1: \begin{equation} s'_m = (2N+1)^{-1}\mathrm{e}^{-2\pi\mathrm{i}\nu_\mathrm{c}\delta\tau_m}\sum_{k=-N}^{+N}s_{m+k}(\Delta\nu)^{-1}\int_{-\frac{1}{2}\Delta\nu}^{+\frac{1}{2}\Delta\nu} \mathrm{e}^{-2\pi\mathrm{i}\nu(k\Delta t+\delta\tau_m)}\ \mathrm{d}\nu. %\label{eq:filter-with-integral} \end{equation} The integral \begin{equation} f_{mk}(\Delta\nu)^{-1}\int_{-\frac{1}{2}\Delta\nu}^{+\frac{1}{2}\Delta\nu} \mathrm{e}^{-2\pi\mathrm{i}\nu(k\Delta t+\delta\tau_m)}\ \mathrm{d}\nu \end{equation} evaluates to \begin{equation} f_{mk} = \frac{\sin \pi \Delta\nu(k\Delta t +\delta\tau_m)}{\pi \Delta\nu(k\Delta t +\delta\tau_m)}, \end{equation} which, because $\Delta \nu = \Delta t^{-1}$, simplifies to \begin{equation} f_{mk} = \frac{\sin \pi (k + \frac{\delta\tau_m}{\Delta t})}{\pi (k + \frac{\delta\tau_m}{\Delta t})}. \end{equation}

Substituting this in Eq. (\ref{eq:filter-with-integral}) gives \begin{equation} s'_m = (2N+1)^{-1}\mathrm{e}^{-2\pi\mathrm{i}\nu_\mathrm{c}\delta\tau_m}\sum_{k=-N}^{+N}s_{m+k}f_{mk}. %\label{eq:filter-without-integral} \end{equation} The previous expression implements a quasi-convolution of the timeseries $s$ in the neighbourhood of $s_{m}$ with filter $f_{mk}$. Because $N$ is a finite number, this is in fact a finite impulse response (FIR) filter with coefficients $f_{mk}$.

The delay will change from sample to sample, implying that we need to compute the filter coefficients for every sample $m$, which would be excessively expensive. However, we only need to make sure that the delay errors for the filter $f_k(\delta\tau)$ are such that decorrelation at the edges of the linear phase response remains below a certain pre-set limit. If we set that limit to decorrelate at most one part in $10^4$ at the edges, then the maximum error in phase must remain below $1.6^\circ$. If we precompute the coefficients at this granularity in fractional sample delay, we end up at a lookup table with only $L=240$ filters, where a filter can then be selected by choosing index \begin{equation} l = \left\lfloor\frac{L}{2} + L\frac{\delta\tau}{\Delta t} + \frac{1}{2}\right \rfloor, \end{equation} where \begin{equation} -\frac{1}{2} \le \frac{\delta\tau}{\Delta t}<\frac{1}{2}. \end{equation}

We now define a function that computes the fractional sample shift coefficients $f_k\left(\frac{\delta\tau}{\Delta t}\right)$:


In [35]:
@vectorize
def fractional_sample_shift_coefficients(indices, fractional_delay, squeeze=1.0):
    r'''
    Given an array of indices, compute un-tapered fractional delay FIR filter
    coefficients.
    
    **Parameters**
    
    indices : sequence of integers
        The indices ``k'', for example arange(-N, +N+1).
        
    fractional_delay: float
        The fractional delay $\frac{\delta \tau}{\Delta t}$, which should be in the 
        range -0.5 <= fractional_delay < 0.5.
        
    **Returns*
    
    A sequence of double precision floats containing the filter coefficients.
    
    **Examples**
    
    >>> fractional_sample_shift_coefficients([-2, -1, 0, +1, +2], 0.0)
    >>> fractional_sample_shift_coefficients([-2, -1, 0, +1, +2], -0.5)
    >>> fractional_sample_shift_coefficients([-2, -1, 0, +1, +2], +0.5)
    >>> fractional_sample_shift_coefficients([-2, -1, 0, +1, +2], +1.0)
    '''
    arg = pi*(fractional_delay+indices)
    if arg == 0.0:
        return 1.0
    else:
        return sin(squeeze*pi*(fractional_delay+indices))/(squeeze*pi*(fractional_delay+indices))

Let's see what these filters look like in the inner part:


In [36]:
def hdfig():
    return figure(figsize=(8, 4.5), dpi=200)

n    = 4
k    = arange(-n, n+1)
fssc = fractional_sample_shift_coefficients

hdfig()
plot(k, fssc(k,  0.0), linewidth=2, color='blue', label = r'$\frac{\delta\tau}{\Delta t} = 0$')
plot(k, fssc(k, -0.3), linewidth=2, color='green', label = r'$\frac{\delta\tau}{\Delta t} = -\frac{3}{10}$')
plot(k, fssc(k, -0.5), linewidth=2, color='red', label = r'$\frac{\delta\tau}{\Delta t} = -\frac{1}{2}$')
plot(k, fssc(k, -0.7), linewidth=2, color='brown', label = r'$\frac{\delta\tau}{\Delta t} = -\frac{7}{10}$')
plot(k, fssc(k, -1.0), linewidth=2, color='purple', label = r'$\frac{\delta\tau}{\Delta t} = -1$')
title('Inner part of sub-sample shift FIR filters')
legend()
grid()
xlabel(r'Index $k$')
ylabel(r'$f_k$')


Out[36]:
<matplotlib.text.Text at 0x58f3750>

And in the outer part:


In [37]:
n    = 64
k    = arange(-n, n+1)
fssc = fractional_sample_shift_coefficients

hdfig()
plot(k, fssc(k,  0.0), linewidth=2, color='blue', label = r'$\frac{\delta\tau}{\Delta t} = 0$')
plot(k, fssc(k, -0.3), linewidth=2, color='green', label = r'$\frac{\delta\tau}{\Delta t} = -\frac{3}{10}$')
plot(k, fssc(k, -0.5), linewidth=2, color='red', label = r'$\frac{\delta\tau}{\Delta t} = -\frac{1}{2}$')
title('Outer part of sub-sample shift FIR filters')
legend()
grid()
xlabel(r'Index $k$')
ylabel(r'$f_k$')


Out[37]:
<matplotlib.text.Text at 0x5c6ee90>

The ringing in particularly the half sample shift is very obvious. Suddenly cutting off the filter at $+/- N$ will lead to severe ringing in the frequency response. The frequency response of a FIR filter is simply the discrete Fourier transform (DFT) of its coefficients. For channel $-\frac{P}{2}\le p \le \frac{P}{2}$ : \begin{equation} F_p = \sum_{k=-N}^{+N} f_k \mathrm{e}^{-2\pi\mathrm{i}kp/P} \end{equation}


In [82]:
def filter_frequency_response(indices, coefficients, n=None, dtype=complex128):
    r'''
    Compute the complex frequency response of a FIR filter defined by
    its coefficients at certain indices $k$. It evaluates the frequency
    response in $N$ channels across the full nyquist range.
    
    **Parameters**
    
    indices : sequence of integers
        The values $k$ at which the coefficients are defined.
    
    coefficients : sequence of numbers
        The coefficients $f_k$ of the FIR filter. These may be complex valued.
    
    n : integer or None
        The number of channels on which to evaluate the frequency response. n 
        must be >= the number of coefficients. If not specified,
        n = len(coefficients)*4
        
    dtype : numpy dtype
        The precision with which the response should be evaluated. Either
        numpy.complex64 (single precision) or numpy.complex128 (double
        precision)
        
    **Examples**
    
    >>> n = 16
    >>> k = arange(-n, n+1)
    >>> f_k = sin(pi*(k+0.2))/(pi*(k+0.1))
    >>> response = filter_frequency_response(k, f_k, n = len(f_k)*4)
    '''
    if n is None:
        n = len(coefficients)*8
    channels = arange(-n/2, -n/2 +n+2)
    f_k      = coefficients
    k        = indices
    return (array(f_k[newaxis,:], dtype=dtype)*array(exp((2.j*pi*k[newaxis, :]*channels[:, newaxis])/float(len(channels))),
                                                     dtype=dtype)).sum(axis=-1)

We can now have a look at a couple frequency responses. First we define a function that plots a frequency response.


In [99]:
def plot_freq_response(response):
    figure(figsize=(16,4.5), dpi=100)
    x = arange(len(response),dtype=float64)/(len(response)-1)
    xlabel('Fractional sub band')
    plot(x, abs(response)-1.0, linewidth=3, color='blue', label='abs')
    epsilon=20*abs((abs(response)[len(response)/8:-len(response)/8]-1.0).std())
    #ylim(-epsilon, epsilon)
    ylim(-5e-4, +5e-4)
    ylabel('Abs - 1', color='blue')
    yticks(color='blue')
    grid(color='blue')
    twinx()
    plot(x, angle(response),linewidth=3, color='red', label='phase')
    ylabel('Phase [rad]', color='red')
    yticks(color='red')
    grid(color='red')
    ylim(-pi,pi)
    xlim(0, 1.0)




N = 32
k = arange(-N, N+1)
fractional_delay = -0.7
beta =8.2
f_k = fractional_sample_shift_coefficients(k, fractional_delay)
w_k = kaiser(2*N+1, beta=beta)
f_k /= (w_k*f_k).sum()
response = filter_frequency_response(k, array(w_k*f_k, dtype=float32))
plot_freq_response(response)
title(r'Frequency response (%d point filter, delay %.2f $\Delta t$, $\beta = %.1f$)' % (2*N +1, fractional_delay, beta))


Out[99]:
<matplotlib.text.Text at 0x134f9990>

In [57]:
#from IPython.parallel import Client
#cl = Client()
#dv = cl[:]
#pmap = dv.map_sync
   
nf, nt = (517, 100000)
noise = array(random.normal(size=nf*nt)+1.0j*random.normal(size=nf*nt), dtype=complex64)
shifted=convolve(noise, f_k*w_k, mode='same')
noise = noise.reshape((nt, nf))
shifted = shifted.reshape((nt, nf))
corr = fft.fftshift(fft.fftn(noise, axes=(1,))*conj(fft.fftn(shifted, axes=(1,))), axes=1).mean(axis=0)

In [58]:
plot_freq_response(corr/median(abs(corr)[nf/10:-nf/10]))
title(r'Frequency response (%d point filter, delay %.2f $\Delta t$, $\beta = %.1f$)' % (2*N +1, fractional_delay, beta))


Out[58]:
<matplotlib.text.Text at 0xa8a5710>

A not entirely unimportant aspect of FIR filtering is the computational cost. For LOFAR, the total number of floating point operations per second that is required, is the length of the filter $n$ times 2 for one multiply and one add, times 2 for the real and imaginary parts, times 2 for the two polarizations, times $\nu_\mathrm{clk}/1024 = 195312.5$ for the number of samples per second per sub band per station for the 200 MHz clock, times 488 sub bands, times 80 antenna fields.


In [60]:
def fir_flops(filter_length, num_sub_bands=488, num_antenna_fields=80, clock_mhz=200, implementation_overhead=7):
    return filter_length*2*2*2*(clock_mhz*1e6/1024.)*num_sub_bands*num_antenna_fields*implementation_overhead

print(fir_flops(129)*1e-12, ' Tflops')


(55.083, ' Tflops')

This seems to be a bit much. Fortunately, a high update rate for the fringe stopping becomes only important for stations further away than about 100 km. We can therefore consider doing this only for the international stations, assuming we will at some point have 16 of them:


In [59]:
print(fir_flops(129, num_antenna_fields=16)*1e-12, ' Tflops')


(11.0166, ' Tflops')

In [ ]: