Import standard modules:


In [ ]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import HTML 
from IPython.display import Image, display, clear_output
from ipywidgets import HBox, Label, FloatSlider, Layout
HTML('../style/course.css') #apply general CSS

Import section specific modules:


In [ ]:
from IPython.display import Image
import track_simulator
import AA_filter

5.3 Gridding and Degridding for using the FFT

In the previous section several sampling functions were presented. There the sampling functions were already neatly discretized and displayed as images: each image was a grid of pixels, all of the same size. Fourier inversion of such regularly sampled data is done with a fast Fourier Transform (FFT) algorithm, so called because it is computationally far more efficient than the Direct Fourier Transform (DFT). To give an idea of how fast this algorithm is: if there are $N^2$ pixels in the image, the FFT takes roughly $2N^2\log(N)$ computational operations. In contrast, the complexity of the DFT also depends on the number of visibilities $M$: for each of the $N^2$ pixels we must evaluate $M$ complex exponentiations and multiplications, giving $N^2M$ steps, with $M\approx N^2$ for typical observations. Specifically, the DFT calculates the intensity of each pixel as:

\begin{equation} I(l,m) = \sum_{k=0}^{M-1}V_k e^{2\pi i (lu_k+mv_k)}\text{, where }V_k\text{ are the }M\text{ measurements taken by the telescope at coordinates }(u_k,v_k) \end{equation}

From this it should be clear that as the number of baselines or the observation time increases, the FFT approach becomes far less time-consuming than the direct approach. Unfortunately radio interferometers do not take measurements at regular intervals, so an FFT cannot be applied to the observed data directly. Instead the data has to be resampled onto a grid of regularly spaced points before taking the FFT. This resampling process (called gridding) and its inverse (called degridding) are the topic of this section. The big idea is that gridding and degridding let us create an image far faster than if we evaluated the DFT.
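To make the cost comparison concrete, below is a minimal sketch of the direct sum above (the function and argument names are illustrative, not taken from any imaging library). Every pixel requires a pass over all $M$ visibilities, which is what makes the direct approach scale as $N^2M$:


In [ ]:
def dft_image(vis, u, v, l, m):
    """Direct evaluation of the Fourier sum above: O(N^2 M) operations.
    vis, u, v: the M measured visibilities and their uv coordinates (1D arrays).
    l, m: 2D arrays of pixel coordinates.
    For a real sky sampled Hermitianly the imaginary part vanishes."""
    image = np.zeros(l.shape, dtype=complex)
    for V_k, u_k, v_k in zip(vis, u, v):
        image += V_k * np.exp(2j * np.pi * (l * u_k + m * v_k))
    return image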

As you will see later, some u,v-space image deconvolution algorithms, such as the Cotton-Schwab ➞ major-minor cycle algorithm, require that sources in image space be reconverted back into the non-regular measurement space. Here an accurate degridding operation is required to "interpolate" regularly sampled visibilities back onto the interferometer's u,v tracks.

In addition to the resampling issue, the FFT approach raises the issue of aliasing. The FFT assumes that the input signal (here the spatial frequency domain) is periodic in nature. The image constructed by resampling and inverse FFT therefore repeats at regular intervals: sources near the top of the image are aliased back into the image at the bottom, for instance. It is therefore necessary to filter the image with a filter that only passes signal falling within the field of view being reconstructed. Aliasing is a consequence of Nyquist-sampling ($\S$ 2.9 ➞) the visibilities at the rate dictated by the grid size. An example of this form of aliasing will be given later on.

The following points will be discussed in this chapter:

  1. Image resolution and pixel size
  2. Gridding and degridding, along with a discussion on the use of anti-aliasing filters
  3. Sample code for degridding a model sky, and gridding and inverting visibilities to form a dirty image.

5.3.1 Image Resolution and Pixel Size

When generating an image from visibilities, using either a direct or fast Fourier transform, two parameters need to be defined: the resolution of each pixel and the extent of the image, either as a number of pixels or as the size of the field of view (depending on the particular imager). An image is a two-dimensional array of size $N_l \times N_m$, and each pixel has a resolution of $(\Delta \theta_l, \Delta \theta_m)$ if we make the small-angle approximation for the field of view. Recall that the image extent is $l' = \sin{\theta_l}$, $m' = \sin{\theta_m}$ and the resolution is $\Delta l = \sin{\Delta \theta_l}$, $\Delta m = \sin{\Delta \theta_m}$; in the small-angle approximation $\Delta l \sim \Delta \theta_l$, $\Delta m \sim \Delta \theta_m$. Though many imagers can create images which break the small-angle approximation, the notation is retained. There are a number of techniques for representing a point in spherical coordinates on a two-dimensional plane via a non-linear transform. In radio interferometry the standard technique is SIN-projection; see AIPS Memo 27 for a detailed discussion of different coordinate projections.

Given the resolution $(\Delta \theta_l, \Delta \theta_m)$ and the desired field of view $(\theta_l, \theta_m)$, the number of pixels in the image (the image size) is

$$N_l = \frac{\theta_l}{\Delta \theta_l} \qquad N_m = \frac{\theta_m}{\Delta \theta_m}$$

Recall that the uv tracks sample spatial frequencies, so the chosen image resolution (cell size) must satisfy the Nyquist relation. For a given interferometer and image size, the image-domain resolution (grid cell size) $(\Delta \theta_l, \Delta \theta_m)$ is

$$\Delta \theta_l = \frac{1}{N_l \Delta u} = \frac{1}{2\max{(\lvert\min{u}\rvert,\max{u})}} \text{ radians} \qquad \Delta \theta_m = \frac{1}{N_m \Delta v} = \frac{1}{2\max{(\lvert\min{v}\rvert,\max{v})}} \text{ radians}$$

The number of pixels in the grid is unchanged: $N_u = N_l$, $N_v = N_m$.
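As a small worked example of these relations (the numbers below are purely illustrative):


In [ ]:
max_uv = 20000.0                  # longest |u| or |v| extent, in wavelengths
cell_rad = 1.0 / (2.0 * max_uv)   # Nyquist cell size in radians
fov_deg = 1.0                     # desired field of view in degrees
N = int(np.round(np.deg2rad(fov_deg) / cell_rad))
print("Cell size: %.2f arcsec, image size: %d by %d pixels" %
      (np.rad2deg(cell_rad) * 3600.0, N, N))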

An important note about the number of pixels: one should try to use values which are powers of 2, i.e. $N_l = 2^j$, $N_m = 2^k$ for some positive integers $j,k$. This is because of how FFTs are implemented: the run-time efficiency of an FFT is optimal when the input length is a power of 2 and worst when the input length is a prime number. For example, the time required to generate a 256 by 256 pixel ($2^8$ by $2^8$) image will be less than that for a 251 by 251 pixel image (251 is prime), even though the resulting image has more pixels. Also note that interferometric images are almost always square by convention.
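A quick (machine- and FFT-backend-dependent) timing experiment illustrates the point:


In [ ]:
import timeit
for N in (251, 256):              # prime size vs power-of-two size
    x = np.random.rand(N, N)
    t = timeit.timeit(lambda: np.fft.fft2(x), number=50)
    print("%d by %d: %.4f s for 50 FFTs" % (N, N, t))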

When using the Fast Fourier Transform for the inversion between uv and image space, the image has to be scaled to the correct size using the similarity property of the Fourier transform (here in its two-dimensional form): $$V(au, av) \rightleftharpoons \frac{1}{|a|^2}I\left(\frac{l}{a},\frac{m}{a}\right)$$

This implies that we can scale the image to $-0.5N_x\Delta\theta_l \leq l \leq 0.5N_x\Delta\theta_l$ and $-0.5N_y\Delta\theta_m \leq m \leq 0.5N_y\Delta\theta_m$ by scaling the uv tracks with the image size (in radians). This is in contrast to the approach taken when using the DFT: in the direct per-pixel evaluation of the Fourier sum the resolution can be specified directly.

5.3.2 Gridding and Degridding

As you may suspect, there are many ways to interpolate data to and from regularly-spaced coordinates. The interpolation technique most widely used in radio imaging programs, such as lwimager, is known as "convolutional resampling". In this technique each visibility is weighted and "smeared out" onto the grid points that lie within a small distance of its original coordinate.


In [ ]:
Image(filename="figures/gridding_illustration.png")

Figure: Each observed visibility is centered at some sampling coordinate in continuous u,v space and is weighted with some function $C(u,v)$, which extends only over a finite "full support" region as illustrated. The result is either binned into a regularly spaced grid (gridding) or gathered from this grid (degridding). After all of the observed visibilities have been gridded, an inverse Fast Fourier Transform is performed to create an image of the sky. The reverse operations are performed when simulating a set of visibility measurements from a model sky.

The value at each grid point, then, is a weighted accumulation of all the nearby visibilities. In one dimension this can be stated as follows (and is visually illustrated below):

\begin{equation} (\forall a \in \{1,2,\dots,N\}) \mathscr{V}(a\Delta{x}) = \sum_{\substack{ i | x_i \geq a\Delta{x}-\text{half support}, \\ x_i \leq a\Delta{x}+\text{half support}} }{\mathscr{V}(x_i) \, C(a\Delta{x}-x_i)} \end{equation}
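The sum above can be implemented directly. Below is a minimal sketch (the function and argument names are illustrative, not from any imaging library), written as a scatter over the non-regular samples; it accumulates exactly the same weighted sums as the per-grid-point gather in the equation:


In [ ]:
def grid_1d(x, vis, N, dx, C, half_sup):
    """Minimal 1D gridder: smear each sample at non-regular coordinate
    x[i] onto the regular grid points a*dx within half_sup of it,
    weighted by the (callable) convolution function C."""
    grid = np.zeros(N, dtype=complex)
    for x_i, V_i in zip(x, vis):
        a_min = max(0, int(np.ceil((x_i - half_sup) / dx)))
        a_max = min(N - 1, int(np.floor((x_i + half_sup) / dx)))
        for a in range(a_min, a_max + 1):
            grid[a] += V_i * C(a * dx - x_i)
    return grid

For instance, with the demonstration data generated in the next cell, `grid_1d(vis_x, vis, N, dx, np.sinc, conv_hsup)` would produce the regularly gridded values.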

In [ ]:
N = 30                                       # number of regular grid points
dx = 1                                       # grid spacing
a = np.arange(N)
M = 75                                       # number of non-regular samples
tap_pos = (N//2)*dx                          # grid position at which the kernel is drawn
conv_hsup = 5*dx                             # half-support of the convolution function
conv_x = np.linspace(-conv_hsup,conv_hsup,1000)
vis_x = np.sort(np.random.rand(M)*N*dx)      # non-regular sampling coordinates
vis = 1.5+np.random.rand(M)*0.7 + 0.3        # mock visibility values
plt.figure(figsize=(13,5))
ax1 = plt.axes()
ax1.axes.get_yaxis().set_visible(False)
for x in a*dx:                               # regular grid positions (black bars)
    plt.plot([x,x],[-1.5,-1.0],'k')
plt.plot(vis_x,vis,'b.')
plt.plot(vis_x,np.ones([M])*-1.25,'g.')      # sample coordinates (green dots)
plt.plot([vis_x[0],vis_x[0]],[-1.25,vis[0]],'b--')
plt.plot(tap_pos+conv_x,np.sinc(conv_x),'r') # sinc convolution function C

#draw the support region of the convolution function
plt.arrow(tap_pos-conv_hsup, -0.25, 0, 2.5+0.5, head_width=0.0, head_length=0.0, fc='m', ec='m')
plt.arrow(tap_pos+conv_hsup, -0.25, 0, 2.5+0.5, head_width=0.0, head_length=0.0, fc='m', ec='m')
plt.arrow(tap_pos-conv_hsup, -0.25, conv_hsup*2, 0, head_width=0.0, head_length=0.0, fc='m', ec='m')
plt.arrow(tap_pos-conv_hsup, 2.5+0.25, conv_hsup*2, 0, head_width=0.0, head_length=0.0, fc='m', ec='m')

plt.text(tap_pos+0.75, -0.70, "$\sum{\mathscr{V}(x_i)C(a\Delta{x}-x_i)}$", fontsize=11,color='m')
plt.text(tap_pos+conv_hsup+0.15, 0, "$C$", fontsize=13,color='r')
plt.text(0.5, 2.5, "$\mathscr{V}(x_i),x_i\in\mathbb{R}$", fontsize=16,color='b')
plt.arrow(tap_pos, -0.25, 0, -0.45, head_width=0.75, head_length=0.3, fc='m', ec='m')
plt.ylim(-1.75,3.0)
plt.xlim(-0.5*dx,N*dx)
plt.xlabel("Grid position ($a\Delta{x}$)",fontsize=15)
plt.show()

Figure: Here we have illustrated the gridding process in one dimension. Given a continuous visibility function sampled at some non-regular points in $x$, and a convolution function $C$, the values stored at regular intervals (black bars) are approximately a convolution between the visibility function and the convolution function. The coordinates at which the visibility function is sampled are plotted as green dots in-between the regularly sampled grid positions.

The weighting function $C$ can be any of a number of functions proposed in the literature. These include linear, Lagrange, sinc (including its many windowed variants), Gaussian, modified B-spline, etc.

You may have noticed that the interpolating function above is remarkably close to a discrete convolution. If the resampling were done on regularly sampled data, with the convolution function evaluated at the same regular discrete steps, then the operation would just be an ordinary discrete convolution. However, the function as it stands is not quite a convolution by the strictest definition of the word; gridding and degridding should be thought of as approximations to the discrete convolution. Nevertheless we will use the regular convolution notation in our discussion.

For those coming from a signal processing background it is useful to think of the convolutional gridding and degridding operations in terms of the ordinary upsampling and downsampling operations. In gridding, as in traditional upsampling, the spaces in-between samples are filled with zero values. The only difference is that with gridding the original measurements are not regularly spaced, as they would be with upsampling. Just as with upsampling, it is then necessary to assign values to these new zero points in-between the measured values. With gridding, each measured value is smeared out over the grid points within some area of support.

This is a very important point. During the gridding process, because the uv plane is not fully sampled, many of the grid points are assigned a value of zero! Now, there is essentially no chance that the true visibility at such a point is actually zero, but since we have not sampled that point in the visibility domain the only option is to assign the pixel some value. Zero is convenient, but not the true value. We will come back to this point in the next chapter on deconvolution ➞, which requires us to include additional knowledge to make an informed guess about what the values of these pixels could be.

With this understanding in hand we can define gridding and degridding more rigorously:

\begin{equation} \begin{split} V_\text{gridded}[u,v]&=[(\mathscr{V}(u,v) \, S(u,v))\circ C(u,v)] \, III[u,v]\\ V_\text{degridded}(u,v)&=[V_\text{gridded}[u,v]\circ C(u,v)] \, S(u,v)\\ \end{split} \end{equation}

In gridding, the sampled visibilities are convolved with a convolution function and then discretized onto regular points by the shah (bed-of-nails) function ($\S$ 2.2 ➞). In degridding the opposite is done: the regularly sampled discrete values are convolved and sampled along the sampling tracks in the u,v plane. The convolution function smears (gridding) or gathers (degridding) the visibilities over / from some area of support before discretizing the visibilities to new coordinates. Ideally this function would be computed during the gridding and degridding operations; however, considering that the processing costs of gridding and degridding both scale as $MC_\text{sup}^2$, the function can be too computationally expensive to evaluate for every visibility, and it is normally pretabulated for a given support size. Additionally, it is important to sample this tabulated function much more densely than the spacing between grid cells; interferometers take measurements in the spatial frequency domain, and any large snapping / rounding operation on the coordinates of the samples will result in a decorrelation of the structural information in the image. The figure below illustrates how values are picked from the oversampled filter.


In [ ]:
Image(filename="figures/oversampled_filter_illustration.png")

Figure: Here the indexing for a padded, oversampled filter is illustrated for a 3-cell full-support region (half support of 1 to both sides of the centre value), padded with one value on both sides. The filter is 5x oversampled, as indicated by the spaces between the asterisks. The bars represent the grid resolution ($\Delta{u}$ or $\Delta{v}$). If the measured uv coordinate falls exactly on the nearest grid cell (red dot), then values 6, 11 and 16 are selected as interpolation coefficients. If the uv value is slightly offset, for instance $\text{round}(\text{frac}(u) \times \text{oversampling factor}) = 2$ (green dot), then 8, 13 and 18 are selected as the 3 interpolation coefficients. In other words: a denser bed of nails is placed over the bed of nails of the grid, and the closest set of coefficients for the convolution is selected.
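As a concrete illustration, below is a minimal sketch of what such a pretabulated filter object might look like (the actual AA_filter module used later may be implemented differently). It exposes the half_sup, oversample and filter_taps attributes that the gridder and degridder code further down assumes:


In [ ]:
class SimpleAAFilter:
    """Minimal sketch of a pretabulated, oversampled anti-aliasing filter:
    a truncated-sinc kernel with `half_sup` cells of support on each side
    of the centre, one padding cell per side, tabulated at `oversample`
    points per grid cell."""
    def __init__(self, half_sup, oversample):
        self.half_sup = half_sup
        self.oversample = oversample
        full_sup = 2 * half_sup + 1 + 2   # support plus one padding cell per side
        no_taps = full_sup + (full_sup - 1) * (oversample - 1)
        taps = np.arange(no_taps) / float(oversample) - full_sup // 2
        self.filter_taps = np.sinc(taps)  # truncated sinc, peaking at 1 in the centre

A convolution weight is then read off at `filter_taps[conv_u * oversample + frac_u_offset]`, which is exactly the indexing scheme shown in the figure above and used in the gridder and degridder code below.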

More importantly, the alias-reduction properties of the convolution filter being used are essential to the FFT approach. By the convolution theorem the reconstructed image of the radio sky can be stated as follows:

\begin{equation} I_\text{dirty}[l,m] = ([I(l,m)\circ\text{PSF}(l,m)] \, c(l,m))\circ\mathscr{F}\{III\}[l,m] \end{equation}

The Fourier transform of the shah function, $\mathscr{F}\{III\}[l,m]$, is a series of periodic functions in the image domain. Convolution with these periodic functions replicates the field of view at a period of $N_l\Delta{\theta_l}$ and $N_m\Delta{\theta_m}$ for an $N_l\times N_m$ pixel image, and it is this aliasing effect that must be stopped. To that end one would hope that the Fourier transform of the convolution filter, $c(l,m)$, maximizes the following ratio:

\begin{equation} \frac{\int_\text{FOV} \lvert c(l,m) \rvert^2dS}{\int_{-\infty}^\infty \lvert c(l,m) \rvert^2dS} \end{equation}

Simply stated, it is desirable that the function $c$ be non-zero only over a small central region: the field of view.
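This ratio can be estimated numerically. The sketch below (with illustrative sizes and an arbitrary choice of "field of view" as the central quarter of the image line) compares the in-FOV energy fraction for a box kernel and a truncated sinc:


In [ ]:
taps = np.linspace(-7, 7, 1001)    # kernel coordinate in grid cells
kernels = {"box": np.where(np.abs(taps) <= 0.5, 1.0, 0.0),
           "truncated sinc": np.sinc(taps)}
for name, kern in kernels.items():
    c2 = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(kern))))**2
    centre = len(c2) // 2
    fov = slice(centre - len(c2) // 8, centre + len(c2) // 8)  # central quarter
    print("%s: %.3f of the energy falls inside the central region" %
          (name, c2[fov].sum() / c2.sum()))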

Both the remarks about accuracy and the anti-aliasing properties of the filter preclude using a nearest-neighbour approach to interpolating points to and from regular coordinates. Interpolation accuracy takes precedence in degridding, while alias reduction is important for gridding. Nearest-neighbour interpolation (also known as cell-summing in older literature) simply accumulates the neighbouring points that fall within a rectangular region around the new coordinate, without considering the distance of those points from the new coordinate. The Fourier transform of this box function is an infinite sinc function, which ripples out slowly towards infinity and doesn't stop much of the aliasing effect. Convolutional gridding/degridding is therefore a more attractive approach, because the distance between the grid point and the measured uv point is taken into account when selecting a set of convolution weights.

The observation about the Fourier transform of the box function leads us to a partial solution to the aliasing problem: convolving with an infinite sinc would yield an image tapered by a box function. Unfortunately this is not computationally feasible, and instead the best option is to convolve with either a truncated sinc function or some other function that has a similarly centre-heavy Fourier transform and preferably tapers off reasonably quickly. The images below illustrate the significant improvement of using a truncated sinc function instead of nearest-neighbour interpolation.


In [ ]:
Image(filename="figures/NN_interpolation_aliasing.png", width=512)

In [ ]:
Image(filename="figures/AA_kernel_alias_reduction.png", width=512)

Figure: Above are two synthesized images of a grid of point sources: the first using cell-summing (nearest-neighbour) interpolation and the second using convolutional resampling with a simple truncated sinc function. In the first, sources of this grid sky pattern that fall slightly outside the field of view are aliased back into the field of view. In the second, the aliasing energy is limited by the box response of the sinc function.

Below, the magnitudes of the sidelobes of the Fourier transforms of several functions are plotted. The sidelobes of the Fourier transform of the box function are significantly higher than those of the truncated and windowed sinc functions.


In [ ]:
half_sup = 6
oversample = 15
full_sup_wo_padding = (half_sup * 2 + 1)
full_sup = full_sup_wo_padding + 2 #+ padding
no_taps = full_sup + (full_sup - 1) * (oversample - 1)
taps = np.arange(-no_taps//2,no_taps//2 + 1)/float(oversample)

#unit box
box = np.where((taps >= -0.5) & (taps <= 0.5),
               np.ones([len(taps)]),np.zeros([len(taps)]))
fft_box = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(box))))
#truncated (boxed) sinc
sinc = np.sinc(taps)
fft_sinc = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(sinc))))
#gaussian-windowed sinc: sin(pi*x/alpha_1)/(pi*x) * exp(-(x/alpha_2)^2)
alpha_1 = 1.55
alpha_2 = 2.52
gsinc = np.sinc(taps/alpha_1)/alpha_1 * np.exp(-(taps/alpha_2)**2)
fft_gsinc = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(gsinc))))
#plot it up
plt.figure(figsize=(7, 5), dpi=80)
l = np.arange(-(no_taps)//2,(no_taps)//2+1) * (1.0/oversample)
a, = plt.plot(2*l, 10.*np.log10(fft_box))
b, = plt.plot(2*l, 10.*np.log10(fft_sinc))
c, = plt.plot(2*l, 10.*np.log10(fft_gsinc))
ax = plt.gca()
ax.set_xlim(0,no_taps//2 * (1.0/oversample))
#ax.set_yscale("log", nonposy='clip')
plt.legend([a,b,c],["Box","Sinc","Gaussian Sinc"])
plt.xlabel("$2\Delta{u}l$")
plt.ylabel("Magnitude of $c(l)$ (dB)")
plt.title("Magnitude of Fourier transforms of several convolution functions")
plt.show()

Figure: The magnitudes of the Fourier transforms of various functions. It is desirable that most of the energy of these functions falls within some central region and that the response drops off sharply at the edge of this central region.

After Fourier transformation, the effect of the convolution function on the image can be mitigated by point-wise dividing the image by the Fourier transform of the convolution function, $c(l,m)$. This flattens the response of the passband by removing the tapering towards the edges of the image, but it also raises the amplitude of any aliased sources at the edges of the image.
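A minimal sketch of this correction, assuming a separable gridding kernel that has been tabulated (or zero-padded) to the same length as the image side, might look as follows; the function name is illustrative:


In [ ]:
def taper_correct(dirty, kernel_1d):
    """Divide the dirty image point-wise by |c(l)||c(m)|, the image-plane
    response of a separable gridding kernel (kernel_1d must have the same
    length as the image side)."""
    c = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(kernel_1d))))
    return dirty / np.outer(c, c)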

In practice, prolate spheroidal functions are used in imaging programs such as lwimager, but the definition of these functions is beyond the scope of this introductory discussion; the reader is referred to the work of Donald Rhodes, On the Spheroidal Functions, for a detailed discussion of their definition and a proof of their alias-reduction properties.

It is also worth noting that the convolution functions used in gridding and degridding need not be the same. In degridding the focus falls solely on the accuracy of the predicted visibility; here it can be advantageous to minimize the difference between a direct transformation approach and the Fast Fourier Transform approach with degridding. See, for instance, the discussion by Sze Tan, Aperture-synthesis mapping and parameter estimation, for further detail.

5.3.3 Example Simulator and Imager

We conclude this section with some sample code to illustrate prediction and imaging using resampling and the FFT. To start off, let's set up the uv tracks as would be seen by the JVLA in D configuration. Recall that image resolution is set by the longest track extent in u and v, and that baselines are always measured in wavelengths. You can play around with the Nyquist rate and image sizes. Notice how sub-Nyquist sampling adversely affects the angular resolution of your image.


In [ ]:
%run jvla_d_constants
RA = 0.0
DECLINATION = 90.0
global Nx, Ny, uvw, scaled_uv, model_sky, model_regular, max_uv, cell_size_l, cell_size_m
cellsize_slider=FloatSlider(min=0.25, #subnyquist
                            max=2.5,
                            value=1.0, 
                            step=0.1, 
                            continuous_update=False)
sampling_field = HBox([Label("Nyquist rate scaling factor (1x=critical sampling)"), cellsize_slider])
imsize_slider=FloatSlider(min=1,
                          max=3,
                          value=2.0,
                          step=0.1,
                          continuous_update=False)
imsize_field = HBox([Label("Image size scaling factor (1x=1deg)"), imsize_slider])

def interact_plot(key):
    global Nx, Ny, uvw, scaled_uv, model_sky, model_regular, max_uv, cell_size_l, cell_size_m
    #User-defined scaling factors:
    clear_output(wait=True)
    sampling_tweak_factor = cellsize_slider.value
    im_size_tweak_factor = imsize_slider.value
    
    #Set up interferometer sampling pattern
    uvw = track_simulator.sim_uv(RA, DECLINATION, 1.5, 60/3600.0, ENU, ARRAY_LATITUDE)

    #Work out the required sampling rate (let's just do a square grid for simplicity)
    max_uv = np.max(np.abs(uvw[:,0:2]/CENTRE_CHANNEL))
    print("Maximum extent in uv: %f" % max_uv)
    cell_size_l = cell_size_m = np.rad2deg((1 / (2 * max_uv)) / sampling_tweak_factor)
    print("Cell sizes in l,m: (%f,%f) arcsecs" % (cell_size_l*3600.0, cell_size_m*3600.0))

    #arbitrarily choose the field size to be 1.0 square degrees
    FIELD_SIZE = 1.0 * im_size_tweak_factor

    #Then work out the number of pixels required to create a field of this size, given the Nyquist image resolution
    Nx = int(np.round(FIELD_SIZE / cell_size_l))
    Ny = int(np.round(FIELD_SIZE / cell_size_m))
    print("Image size in pixels to cover %.3f square degrees: (%d,%d)" % (FIELD_SIZE, Nx, Ny))

    #Setup a random model sky consisting of point sources:
    model_sky = np.zeros([Nx,Ny])
    for i in range(15):
        model_sky[int(np.random.rand()*Nx),int(np.random.rand()*Ny)] = np.random.rand()*5.0 
    model_regular = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(model_sky)))

    #In the DFT we can pick the sampling rate in l and m directly when we evaluate a sum for each pixel.
    #However when using the FFT we have to employ the similarity theorem: a multiplication in one domain
    #results in a division in the other domain. Thus we scale the inversion such that it ranges over
    #-0.5*N*cell_size <= pixel <= 0.5*N*cell_size
    scaled_uv = np.copy(uvw[:,0:2])
    scaled_uv[:,0] *= np.deg2rad(cell_size_l * Nx)
    scaled_uv[:,1] *= np.deg2rad(cell_size_m * Ny)
    
    #Finally plot up the results
    plt.figure(figsize=(15, 15))
    plt.subplot(131)
    plt.title("Model sky")
    plt.ticklabel_format(useOffset=False)
    plt.imshow(model_sky,cmap="gray", extent=[RA - Nx / 2 * cell_size_l, RA + Nx / 2 * cell_size_l,
                                              DECLINATION - Ny / 2 * cell_size_m, DECLINATION + Ny / 2 * cell_size_m])
    plt.xlabel("RA")
    plt.ylabel("DEC")
    plt.subplot(132)
    plt.title("Logscale amplitudes of visibilliy space")
    plt.imshow(10*np.log10(np.abs(model_regular+0.000000000001)), extent=[-max_uv/1e4, +max_uv/1e4,
                                                                          -max_uv/1e4, +max_uv/1e4])
    plt.plot(uvw[:,0]/CENTRE_CHANNEL/1e4,
             uvw[:,1]/CENTRE_CHANNEL/1e4,
             "k.",label="Baselines", markersize=1)

    plt.xlabel("uu $(k\lambda)$")
    plt.ylabel("vv $(k\lambda)$")
    plt.subplot(133)
    plt.title("Phase of visibility space")
    plt.imshow(np.angle(model_regular), extent=[-max_uv/1e4, +max_uv/1e4,
                                                -max_uv/1e4, +max_uv/1e4])
    plt.plot(uvw[:,0]/CENTRE_CHANNEL/1e4,
             uvw[:,1]/CENTRE_CHANNEL/1e4,
             "k.",label="Baselines", markersize=1)
    plt.xlabel("uu $(k\lambda)$")
    plt.ylabel("vv $(k\lambda)$")
    plt.figure(figsize=(15, 15))
    plt.subplot(121)
    plt.title("Scaled tracks over log amplitude grid")
    plt.imshow(10*np.log10(np.abs(model_regular+0.000000000001)), extent = [0,Nx,0,Ny])
    plt.plot(scaled_uv[:,0]/CENTRE_CHANNEL + Nx / 2,
             scaled_uv[:,1]/CENTRE_CHANNEL + Ny / 2,
             "k.",label="Baselines", markersize=1)
    plt.xlabel("$N_x$")
    plt.ylabel("$N_y$")
    plt.subplot(122)
    plt.title("Scaled tracks over phase grid")
    plt.imshow(np.angle(model_regular), extent = [0,Nx,0,Ny])
    plt.plot(scaled_uv[:,0]/CENTRE_CHANNEL + Nx / 2,
             scaled_uv[:,1]/CENTRE_CHANNEL + Ny / 2,
             "k.",label="Baselines", markersize=1)
    plt.xlabel("$N_x$")
    plt.ylabel("$N_y$")
    plt.show()
    

cellsize_slider.observe(interact_plot)
display(sampling_field)
imsize_slider.observe(interact_plot)
display(imsize_field)
interact_plot("")

Figure: The simulated model sky (l,m space) and its Fourier transform in visibility space (u,v space). Overlaid on top are the uv tracks for JVLA D. In the bottom plots the scaled tracks are overlaid on the grid predicted from the model. As you can see, if sub-Nyquist rates are chosen, angular resolution is lost, because the long baselines fall outside the grid and must be discarded during imaging.

To complete the prediction (also known as the "forward" step) the measurements are resampled onto the u,v tracks of the interferometer using the degridding algorithm discussed above. Measurements are gathered and weighted from the vicinity of each of the points along the sampling track in order to "predict" a value at the u,v coordinate.


In [ ]:
# %load convolutional_degridder.py
import numpy as np

def fft_degrid(model_image, uvw, ref_lda, Nx, Ny, convolution_filter):
    """
    Convolutional degridder (continuum)

    Keyword arguments:
    model_image --- Model image
    uvw --- interferometer's scaled uvw coordinates
            (Prerequisite: these uv points are already scaled by the similarity
            theorem, such that -N_x*Cell_l*0.5 <= theta_l <= N_x*Cell_l*0.5 and
            -N_y*Cell_m*0.5 <= theta_m <= N_y*Cell_m*0.5)
    ref_lda --- array of reference lambdas (size of vis channels)
    Nx,Ny --- size of image in pixels
    convolution_filter --- pre-instantiated AA_filter anti-aliasing
                           filter object
    """
    assert model_image.ndim == 3
    filter_index = \
        np.arange(-convolution_filter.half_sup,convolution_filter.half_sup+1)
    model_vis_regular = np.zeros(model_image.shape, dtype=np.complex64)
    for p in range(model_image.shape[0]):
        model_vis_regular[p, :, :] = \
            np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(model_image[p, :, :])))
    vis = \
        np.zeros([uvw.shape[0],
                  ref_lda.shape[0],
                  model_image.shape[0]],
                 dtype=np.complex128)

    for r in range(uvw.shape[0]):
        for c in range(vis.shape[1]):
            scaled_uv = uvw[r,:] / ref_lda[c]
            disc_u = int(np.round(scaled_uv[0]))
            disc_v = int(np.round(scaled_uv[1]))
            frac_u_offset = int((1 + convolution_filter.half_sup +
                                 (-scaled_uv[0] + disc_u)) *
                                convolution_filter.oversample)
            frac_v_offset = int((1 + convolution_filter.half_sup +
                                 (-scaled_uv[1] + disc_v)) *
                                convolution_filter.oversample)

            if (disc_v + Ny // 2 + convolution_filter.half_sup >= Ny or
                disc_u + Nx // 2 + convolution_filter.half_sup >= Nx or
                disc_v + Ny // 2 - convolution_filter.half_sup < 0 or
                disc_u + Nx // 2 - convolution_filter.half_sup < 0):
                continue
            for conv_v in filter_index:
                v_tap = \
                    convolution_filter.filter_taps[conv_v *
                                                   convolution_filter.oversample
                                                   + frac_v_offset]
                grid_pos_v = disc_v + conv_v + Ny // 2
                for conv_u in filter_index:
                    u_tap = \
                        convolution_filter.filter_taps[conv_u *
                                                       convolution_filter.oversample
                                                       + frac_u_offset]
                    conv_weight = v_tap * u_tap
                    grid_pos_u = disc_u + conv_u + Nx // 2
                    for p in range(vis.shape[2]):
                        vis[r, c, p] += \
                            model_vis_regular[p,
                                              grid_pos_v,
                                              grid_pos_u] * conv_weight

    return vis

In [ ]:
tabulated_filter = AA_filter.AA_filter(3,63,"sinc")
vis = fft_degrid(model_sky.reshape(1, Ny, Nx), scaled_uv, np.array([CENTRE_CHANNEL]), Nx, Ny, tabulated_filter)

Next comes a simplified imaging step using the FFT and gridding. The visibilities on the irregularly-spaced u,v tracks are resampled onto regular coordinates. This is done by weighting and smearing each measured visibility out onto the regular coordinates in the vicinity of its u,v coordinate. After resampling, the inverse FFT is used to transform the measurements in the spatial frequency domain into the spatial domain, thereby approximately reconstructing the model sky we started with.


In [ ]:
# %load convolutional_gridder.py
import numpy as np

def grid_ifft(vis, uvw, ref_lda, Nx, Ny, convolution_filter):
    """
    Convolutional gridder (continuum)

    Keyword arguments:
    vis --- Visibilities as sampled by the interferometer
    uvw --- interferometer's scaled uvw coordinates
            (Prerequisite: these uv points are already scaled by the similarity
            theorem, such that -N_x*Cell_l*0.5 <= theta_l <= N_x*Cell_l*0.5 and
            -N_y*Cell_m*0.5 <= theta_m <= N_y*Cell_m*0.5)
    ref_lda --- array of reference lambdas (size of vis channels)
    Nx,Ny --- size of image in pixels
    convolution_filter --- pre-instantiated AA_filter anti-aliasing
                           filter object
    """
    assert vis.shape[1] == ref_lda.shape[0], (vis.shape[1], ref_lda.shape[0])
    filter_index = \
        np.arange(-convolution_filter.half_sup,convolution_filter.half_sup+1)
    # one grid for the resampled visibilities per correlation:
    measurement_regular = \
        np.zeros([vis.shape[2],Ny,Nx],dtype=np.complex128)
    # for deconvolution the PSF should be 2x the size of the image (see
    # Hogbom CLEAN for details), one grid for the sampling function:
    sampling_regular = \
        np.zeros([2*Ny,2*Nx],dtype=np.complex128)
    for r in range(uvw.shape[0]):
        for c in range(vis.shape[1]):
            scaled_uv = uvw[r,:] / ref_lda[c]
            disc_u = int(np.round(scaled_uv[0]))
            disc_v = int(np.round(scaled_uv[1]))
            frac_u_offset = int((1 + convolution_filter.half_sup +
                                 (-scaled_uv[0] + disc_u)) *
                                convolution_filter.oversample)
            frac_v_offset = int((1 + convolution_filter.half_sup +
                                 (-scaled_uv[1] + disc_v)) *
                                convolution_filter.oversample)
            disc_u_psf = int(np.round(scaled_uv[0]*2))
            disc_v_psf = int(np.round(scaled_uv[1]*2))
            frac_u_offset_psf = int((1 + convolution_filter.half_sup +
                                     (-scaled_uv[0]*2 + disc_u_psf)) *
                                    convolution_filter.oversample)
            frac_v_offset_psf = int((1 + convolution_filter.half_sup +
                                     (-scaled_uv[1]*2 + disc_v_psf)) *
                                    convolution_filter.oversample)
            if (disc_v + Ny // 2 + convolution_filter.half_sup >= Ny or
                disc_u + Nx // 2 + convolution_filter.half_sup >= Nx or
                disc_v + Ny // 2 - convolution_filter.half_sup < 0 or
                disc_u + Nx // 2 - convolution_filter.half_sup < 0):
                continue
            for conv_v in filter_index:
                v_tap = \
                    convolution_filter.filter_taps[conv_v *
                                                   convolution_filter.oversample
                                                   + frac_v_offset]
                v_tap_psf = \
                    convolution_filter.filter_taps[conv_v *
                                                   convolution_filter.oversample
                                                   + frac_v_offset_psf]

                grid_pos_v = disc_v + conv_v + Ny // 2
                grid_pos_v_psf = disc_v_psf + conv_v + Ny
                for conv_u in filter_index:
                    u_tap = \
                        convolution_filter.filter_taps[conv_u *
                                                       convolution_filter.oversample
                                                       + frac_u_offset]
                    u_tap_psf = \
                        convolution_filter.filter_taps[conv_u *
                                                       convolution_filter.oversample
                                                       + frac_u_offset_psf]
                    conv_weight = v_tap * u_tap
                    conv_weight_psf = v_tap_psf * u_tap_psf
                    grid_pos_u = disc_u + conv_u + Nx // 2
                    grid_pos_u_psf = disc_u_psf + conv_u + Nx
                    for p in range(vis.shape[2]):
                        measurement_regular[p, grid_pos_v, grid_pos_u] += \
                            vis[r, c, p] * conv_weight
                    # assuming the PSF is the same for different correlations:
                    sampling_regular[grid_pos_v_psf, grid_pos_u_psf] += \
                        (1+0.0j) * conv_weight_psf

    dirty = np.zeros(measurement_regular.shape, dtype=measurement_regular.dtype)
    psf = np.zeros(sampling_regular.shape, dtype=sampling_regular.dtype)

    for p in range(vis.shape[2]):
        dirty[p,:,:] = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(measurement_regular[p,:,:])))
    psf[:,:] = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(sampling_regular[:,:])))
    return dirty,psf

In [ ]:
tabulated_filter = AA_filter.AA_filter(3,63,"sinc")
dirty_sky, psf = grid_ifft(vis, scaled_uv, np.array([CENTRE_CHANNEL]), Nx, Ny, tabulated_filter)

In [ ]:
#plot it up :-)
plt.figure(figsize=(15, 45))
plt.subplot(311)
plt.title("Model sky")
plt.imshow(model_sky,cmap="gray", extent=[RA - Nx / 2 * cell_size_l, RA + Nx / 2 * cell_size_l,
                                          DECLINATION - Ny / 2 * cell_size_m, DECLINATION + Ny / 2 * cell_size_m])
plt.xlabel("RA")
plt.ylabel("DEC")
plt.subplot(312)
plt.title("PSF")
plt.imshow(np.real(psf[:, :]),cmap="gray", extent=[RA - Nx * 2 / 2 * cell_size_l, RA + Nx * 2 / 2 * cell_size_l,
                                                   DECLINATION - Ny * 2 / 2 * cell_size_m, DECLINATION + Ny * 2 / 2 * cell_size_m])
plt.xlabel("RA")
plt.ylabel("DEC")
plt.subplot(313)
plt.title("Dirty map")
plt.imshow(np.real(dirty_sky[0, :, :]),cmap="gray", extent=[RA - Nx / 2 * cell_size_l, RA + Nx / 2 * cell_size_l,
                                                            DECLINATION - Ny / 2 * cell_size_m, DECLINATION + Ny / 2 * cell_size_m])
plt.xlabel("RA")
plt.ylabel("DEC")
plt.show()

Figure: Reconstructed image of the model sky. This "dirty" image is convolved with the PSF shown in the centre figure. Some of the fainter sources are hardly visible because of the sidelobes introduced by this convolution.

Of course this reconstruction can only ever be approximate, since the u,v plane is only partially sampled. The brightest sources are still visible, provided the observation time is long enough. Each of the sources is convolved with the PSF shown in the centre figure above; the ring-like structure of the PSF is clearly visible around the bright sources and can, in the worst case, obscure some of the fainter sources in the image. The deconvolution strategies discussed later attempt to remove the PSF structure from the images and improve the fidelity of the reconstructed images.