Import standard modules:
In [ ]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import HTML
HTML('../style/course.css') #apply general CSS
Import section specific modules:
In [ ]:
pass
In [ ]:
HTML('../style/code_toggle.html')
As explained in Chapter 7 ➞, the increased field-of-view of modern telescopes causes direction-dependent effects, such as the primary beam and pointing error, to become apparent. Therefore, we cannot rely on using direction-independent self-calibration (see $\S$ 8.3 ➞). There are, in principle, many approaches one can use to perform direction-dependent calibration (see $\S$ 8.4.1 ⤵ for more details).
In this section we will concentrate on one specific approach: differential gains (Revisiting the radio interferometer measurement equation-II. Calibration and direction-dependent effects ⤴). This approach provides a nice framework with which one can build up some intuition regarding how direction-dependent calibration differs from direction-independent calibration.
In $\S$ 7.2 ➞ the following equation was presented:
The all-sky RIME
\begin{equation}
\mathbf{V}_{pq} = \mathbf{G}_p\mathbf{X}_{pq}\mathbf{G}_q^H.
\end{equation}
This equation is known as the all-sky RIME, where $\mathbf{V}_{pq}$ is the $2\times2$ correlation matrix measured by the interferometer and $\mathbf{X}_{pq}$ is the $2\times2$ coherency matrix. Moreover, $\mathbf{G}_p$ and $\mathbf{G}_q$ are G-Jones antenna matrices. During calibration we estimate $\mathbf{G}_p$ and $\mathbf{G}_q$, which we subsequently use to correct the correlation matrix $\mathbf{V}_{pq}$. The subscripts $p$ and $q$ denote the antennas that were used to make the measurement. Furthermore, $\mathbf{X}_{pq} = \sum_s \mathbf{X}_{spq}$, where $s$ is the source index, i.e. in the all-sky RIME we assume that the error that corrupts our visibilities is independent of the sources' positions. As explained in $\S$ 7.3 ➞, this assumption is violated when we work with a larger field-of-view. As an example, the primary beam of an instrument varies significantly over a large field-of-view (and generally also in time and frequency). In the case of the primary beam, we could try to model the direction-dependent effects by adding an a-priori E-Jones matrix to our Jones chain (see Incorporation of antenna primary beam patterns in radio-interferometric data reduction to produce wide-field, high-dynamic-range images ⤴). However, if we do not have any information about the physical source that is responsible for a direction-dependent effect, then we could use the idea of differential gains instead. In addition to the direction-independent gain, we add a differential gain which can be different for each source. Mathematically, we can express this as
Adding differential gains
\begin{equation}
\mathbf{V}_{pq} = \mathbf{G}_p \left (\sum_s\Delta\mathbf{E}_{sp}\mathbf{X}_{spq} \Delta \mathbf{E}_{sq}^H \right) \mathbf{G}_q^H,
\end{equation}
where $\Delta\mathbf{E}_{sp}$ and $\Delta\mathbf{E}_{sq}^H$ are the differential gains associated with source $s$ and antennas $p$ and $q$ respectively. We now use the above equation and least squares (see $\S$ 8.1 ➞) to estimate the unknown differential and direction-independent gains. These gains can be used to correct $\mathbf{V}_{pq}$.
Another question now arises: how do we know if a source requires a differential gain factor? Fig 8.4.1 ⤵ helps to answer this question.
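The forward model above can be sketched numerically for a single baseline. This is only an illustration: the helper names (`random_jones`, `hermitian`, `predict_visibility`), the source fluxes, the phases and the gain perturbations are all made up for the example and are not taken from any calibration package.

```python
import numpy as np

def random_jones(rng, scale=0.1):
    """Return a hypothetical 2x2 Jones matrix close to the identity."""
    noise = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    return np.eye(2) + scale * noise

def hermitian(A):
    """Conjugate transpose over the last two axes."""
    return np.conj(np.swapaxes(A, -1, -2))

def predict_visibility(G_p, G_q, dE_p, dE_q, X_pq):
    """Evaluate V_pq = G_p (sum_s dE_sp X_spq dE_sq^H) G_q^H.

    dE_p, dE_q and X_pq have shape (n_sources, 2, 2).
    """
    # Apply the differential gains source by source, then sum over sources.
    per_source = dE_p @ X_pq @ hermitian(dE_q)
    return G_p @ per_source.sum(axis=0) @ hermitian(G_q)

rng = np.random.default_rng(0)
n_sources = 3

# Assumed per-source coherencies: unpolarised sources with arbitrary phases.
fluxes = np.array([1.0, 0.5, 0.2])
phases = rng.uniform(0, 2 * np.pi, n_sources)
X_pq = (fluxes * np.exp(1j * phases))[:, None, None] * np.eye(2)

# Direction-independent gains and per-source differential gains.
G_p, G_q = random_jones(rng), random_jones(rng)
dE_p = np.stack([random_jones(rng) for _ in range(n_sources)])
dE_q = np.stack([random_jones(rng) for _ in range(n_sources)])

V_pq = predict_visibility(G_p, G_q, dE_p, dE_q, X_pq)

# With identity differential gains the model reduces to the all-sky RIME,
# and applying the inverse G-Jones matrices recovers the summed coherency.
identity = np.broadcast_to(np.eye(2, dtype=complex), (n_sources, 2, 2))
V_allsky = predict_visibility(G_p, G_q, identity, identity, X_pq)
X_corrected = np.linalg.inv(G_p) @ V_allsky @ np.linalg.inv(hermitian(G_q))
```

Note that inverting $\mathbf{G}_p$ and $\mathbf{G}_q$ alone cannot undo the corruption in `V_pq`, because each source carries its own differential gain inside the sum.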
We can learn two main things from Fig 8.4.1 ⤵ :
We now present a real radio image which was created using the differential gains method. In Fig. 8.4.2 ⤵ we have a JVLA image of the 3C147 field. The large image is the final end product, i.e. the differential gains method has already been applied. The upper left inset is an enlargement of a subregion in the larger image. The bottom left inset depicts how the same subregion looked prior to applying the differential gains method. The second inset validates the comments we have already made: sources which require a differential gain factor, in addition to a direction-independent gain, are usually surrounded by imaging artefacts. In this particular example, the image artefacts were caused by the rotation of the JVLA primary beam.
Third-generation calibration (3GC) can, in general, be divided into physics-based and heuristic-only approaches. If we know the underlying physical phenomenon which is responsible for a specific direction-dependent effect, we may employ a physics-based calibration approach. This is usually accomplished by constructing a parametrized model based on the underlying physical phenomenon. The aim of this approach is to estimate the parameters of this model and use the results to correct our observed visibilities. In some cases, the direction-dependent phenomenon is known a-priori and we simply need to incorporate it correctly whilst calibrating. The following list contains some examples of physics-based approaches:
3GC: Physics-based approaches
• Pointing-selfcal: [EVLA Memo 84. Solving for the antenna based pointing errors ⤴](www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo84.ps)
• Kalman filter: [Nonlinear Kalman filters for calibration in radio interferometry ⤴](http://arxiv.org/abs/1403.6308)
• Primary beam: [Incorporation of antenna primary beam patterns in radio-interferometric data reduction to produce wide-field, high-dynamic-range images ⤴](http://ieeexplore.ieee.org/Xplore/defdeny.jsp?url=http%3A%2F%2Fieeexplore.ieee.org%2Fstamp%2Fstamp.jsp%3Ftp%3D%26arnumber%3D7297163%26userType%3Dinst&denyReason=-134&arnumber=7297163&productsMatched=null&userType=inst)
Correcting for a direction-dependent effect is non-trivial, regardless of whether it is known a-priori or from calibration. The following list contains some approaches that have been proposed to accomplish this:
3GC: Correcting for a known direction dependent effect
• Faceting: [Radio-interferometric imaging of very large fields-The problem of non-coplanar arrays ⤴](http://adsabs.harvard.edu/abs/1992A%26A...261..353C)
• AW-projection: [Correcting direction-dependent gains in the deconvolution of radio interferometric images ⤴](http://arxiv.org/abs/0805.0834)
On the other end of the spectrum we have the heuristic-only approaches. In a heuristic approach we do not know the physical source of a specific direction-dependent effect. Instead, we introduce a number of free parameters which we try to optimize based on some user-defined heuristic. Some 3GC heuristic approaches are listed below:
3GC: Heuristic-only approaches
• *Peeling*: [LOFAR calibration challenges ⤴](http://proceedings.spiedigitallibrary.org/proceeding.aspx?articleid=847375)
• Differential gains: [Revisiting the radio interferometer measurement equation-II. Calibration and direction-dependent effects ⤴](http://arxiv.org/abs/1101.1765)
• Clustered calibration: [Clustered calibration: an improvement to radio interferometric direction-dependent self-calibration ⤴](http://arxiv.org/abs/1301.0633)
Once we obtain a heuristic solution, we can try to make sense of it by fitting a physical model to it. Prime examples of this include:
3GC: Fitting a model to heuristic-only solutions
• SPAM (Source Peeling and Atmospheric Modelling): [Ionospheric calibration of low frequency radio interferometric observations using the peeling scheme-I. Method description and first results ⤴](http://arxiv.org/abs/0904.3975)
• Primary beam shapes: [Estimation of radio interferometer beam shapes using Riemannian optimization ⤴](http://arxiv.org/abs/1209.4236)
As we explained in $\S$ 8.2 ➞, least-squares solvers (see $\S$ 8.1 ➞) only became popular with the advent of self-calibration. Many improvements and alternatives to the least-squares solver have since been developed. We list the most recent developments below:
Solver Development
• Eigendecomposition: [Gain calibration methods for radio telescope arrays ⤴](http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1145704&abstractAccess=no&userType=inst)
• SAGEcal: [Radio interferometric calibration using the SAGE algorithm ⤴](http://arxiv.org/abs/1012.1722)
• Robust calibration: [Robust radio interferometric calibration using the t-distribution ⤴](http://arxiv.org/abs/1307.5040)
• StEFCal: [Fast gain calibration in radio astronomy using alternating direction implicit methods: Analysis and applications ⤴](http://arxiv.org/abs/1410.2101)
• Riemann-Manifold: [Radio interferometric calibration using a Riemannian manifold ⤴](http://arxiv.org/abs/1303.1029)
• Blind Calibration: [Blind calibration for radio interferometry using convex optimization ⤴](http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=7330285&abstractAccess=no&userType=inst)
• Complex Optimization: [Radio interferometric gain calibration as a complex optimization problem ⤴](http://arxiv.org/abs/1502.06974)
• Kalman filter: [Nonlinear Kalman filters for calibration in radio interferometry ⤴](http://arxiv.org/abs/1403.6308)
Of the methods listed above, StEFCal is arguably the most important recent development. It is discussed in more detail in $\S$ 8.4.2.1 ➞. Its importance stems, in part, from the fact that it lowers the computational complexity of calibration from O$(N^3)$ to O$(N^2)$, where $N$ is the number of antennas.
StEFCal is an alternating direction implicit method. It works by first solving for $\boldsymbol{\mathcal{G}}^H$ with $\boldsymbol{\mathcal{G}}$ held constant and then solving for $\boldsymbol{\mathcal{G}}$ with $\boldsymbol{\mathcal{G}}^H$ held constant. This is tantamount to linearising the calibration problem.
As $\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}\boldsymbol{\mathcal{M}}\boldsymbol{\mathcal{G}}^H$ is Hermitian, the two steps are equivalent and we only require the following update step:
$$\boldsymbol{\mathcal{G}}^{[i]} = \textrm{argmin}_{\boldsymbol{\mathcal{G}}}\left\|\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}^{[i-1]}\boldsymbol{\mathcal{M}}\boldsymbol{\mathcal{G}}^H\right\|.$$
We can now write:
$$\left\|\boldsymbol{\mathcal{D}} - \boldsymbol{\mathcal{Z}}^{[i-1]}\boldsymbol{\mathcal{G}}^H\right\| = \sqrt{\sum_{p=1}^N\left\|\boldsymbol{\mathcal{D}}_{:,p}-\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}g_p^*\right\|^2},$$
with $\boldsymbol{\mathcal{Z}}^{[i-1]} = \boldsymbol{\mathcal{G}}^{[i-1]}\boldsymbol{\mathcal{M}}$. We denote the $p$-th column of a matrix $\boldsymbol{\mathcal{A}}$ by $\boldsymbol{\mathcal{A}}_{:,p}$. Because $\boldsymbol{\mathcal{G}}$ is diagonal, each term in the sum depends on a single gain $g_p$, so the columns can be minimised independently. If we now apply the normal equation method ⤴ to each column we readily obtain:
StEFCal update step
\begin{equation}
g_p^{[i]} = \frac{\boldsymbol{\mathcal{D}}^H_{:,p}\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}}{\left(\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}\right)^H\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}}.
\end{equation}
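For a single column, the least-squares step behind this update can be spelled out as follows. The shorthand $\mathbf{d} = \boldsymbol{\mathcal{D}}_{:,p}$ and $\mathbf{z} = \boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}$ is introduced here purely for brevity. Treating $g_p^*$ as the single unknown in the overdetermined system $\mathbf{z}\,g_p^* \approx \mathbf{d}$, the normal equation reads

\begin{equation}
\mathbf{z}^H\mathbf{z}\,g_p^* = \mathbf{z}^H\mathbf{d}
\quad\Rightarrow\quad
g_p^* = \frac{\mathbf{z}^H\mathbf{d}}{\mathbf{z}^H\mathbf{z}}
\quad\Rightarrow\quad
g_p = \frac{\mathbf{d}^H\mathbf{z}}{\mathbf{z}^H\mathbf{z}},
\end{equation}

where the last step takes the complex conjugate and uses the fact that $\mathbf{z}^H\mathbf{z}$ is real. This is the same expression as the update step given above.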
We can use the above update step to iteratively obtain the best estimate of $g_p$. We iterate until we either reach some maximum number of iterations or satisfy our convergence criterion.
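The iteration can be sketched in a few lines of NumPy. This is a minimal illustration under several assumptions rather than a reference implementation: the gains are scalar (unpolarised), the data are noise-free, and the function name `stefcal`, the tolerance and the simulated sky are invented for the example. The averaging of every second iterate and the relative-change stopping rule are not described in the text above; they are borrowed from the StEFCal paper, where they are used to damp the oscillation of the basic iteration.

```python
import numpy as np

def stefcal(D, M, max_iter=200, tol=1e-10):
    """Estimate scalar gains g such that D ~ diag(g) M diag(g)^H."""
    n_ant = D.shape[0]
    g = np.ones(n_ant, dtype=complex)        # initial guess g^[0]
    for i in range(1, max_iter + 1):
        g_prev = g
        Z = g_prev[:, None] * M              # Z^[i-1] = G^[i-1] M
        # Update step for all columns p at once:
        # g_p = D[:, p]^H Z[:, p] / (Z[:, p]^H Z[:, p])
        g = np.sum(D.conj() * Z, axis=0) / np.sum(Z.conj() * Z, axis=0)
        if i % 2 == 0:
            # Assumed stopping rule: relative change between iterates.
            if np.linalg.norm(g - g_prev) <= tol * np.linalg.norm(g):
                return g, i
            # Average with the previous iterate to damp oscillations.
            g = 0.5 * (g + g_prev)
    return g, max_iter

# Simulated model: three point sources seen by seven antennas (made-up values).
rng = np.random.default_rng(42)
n_ant = 7
fluxes = np.array([1.0, 0.3, 0.1])
phi = rng.uniform(0, 2 * np.pi, (fluxes.size, n_ant))
M = np.einsum('s,sk,sp->kp', fluxes, np.exp(1j * phi), np.exp(-1j * phi))
np.fill_diagonal(M, 0)                       # ignore autocorrelations

# Corrupt the model with hypothetical gains to obtain the "observed" data.
g_true = 1 + 0.1 * (rng.standard_normal(n_ant) + 1j * rng.standard_normal(n_ant))
D = np.outer(g_true, g_true.conj()) * M

g_est, n_iter = stefcal(D, M)
residual = np.linalg.norm(D - np.outer(g_est, g_est.conj()) * M)
print(n_iter, residual)
```

The estimated gains are only determined up to a common phase factor, so it is safer to compare the residual or the gain amplitudes with the truth than the complex gains themselves.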