Import standard modules:


In [ ]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import HTML 
HTML('../style/course.css') #apply general CSS

Import section specific modules:


In [ ]:
pass

In [ ]:
HTML('../style/code_toggle.html')


8.4 3GC Calibration: direction-dependent self-calibration

As explained in Chapter 7 ➞, the increased field-of-view of modern telescopes causes direction-dependent effects, such as the primary beam and pointing errors, to become apparent. We can therefore no longer rely on direction-independent self-calibration alone (see $\S$ 8.3 ➞). There are, in principle, many approaches one can use to perform direction-dependent calibration (see $\S$ 8.4.1 ⤵ for more details).

In this section we will concentrate on one specific approach: differential gains (Revisiting the radio interferometer measurement equation-II. Calibration and direction-dependent effects). This approach provides a nice framework with which one can build up some intuition regarding how direction-dependent calibration differs from direction-independent calibration.

In $\S$ 7.2 ➞ the following equation was presented:

The all-sky RIME

\begin{equation} \mathbf{V}_{pq} = \mathbf{G}_p\mathbf{X}_{pq}\mathbf{G}_q^H. \end{equation}

This equation is known as the all-sky RIME, where $\mathbf{V}_{pq}$ is the $2\times2$ correlation matrix measured by the interferometer and $\mathbf{X}_{pq}$ is the $2\times2$ coherency matrix. Moreover, $\mathbf{G}_p$ and $\mathbf{G}_q$ are G-Jones antenna matrices. During calibration we estimate $\mathbf{G}_p$ and $\mathbf{G}_q$, which we subsequently use to correct the correlation matrix $\mathbf{V}_{pq}$. The subscripts $p$ and $q$ denote the antennas that were used to make the measurement. Furthermore, $\mathbf{X}_{pq} = \sum_s \mathbf{X}_{spq}$, where $s$ is the source index. In other words, the all-sky RIME assumes that the error that corrupts our visibilities is independent of the sources' positions.

As explained in [$\S$ 7.3 ➞](../7_Observing_Systems/7_3_direction_independent_and_dependent_effects.ipynb), this assumption is violated when we work with a larger field-of-view. As an example, the primary beam of an instrument varies significantly over a large field-of-view (and generally also in time and frequency). In the case of the primary beam, we could try to model the direction-dependent effects by adding an a-priori E-Jones matrix to our Jones chain (see Incorporation of antenna primary beam patterns in radio-interferometric data reduction to produce wide-field, high-dynamic-range images). However, if we do not have any information about the physical source that is responsible for a direction-dependent effect, then we could use the idea of differential gains instead. In addition to the direction-independent gain we add a differential gain, which can be different for each source. Mathematically, we can express this as

Adding differential gains

\begin{equation} \mathbf{V}_{pq} = \mathbf{G}_p \left (\sum_s\Delta\mathbf{E}_{sp}\mathbf{X}_{spq} \Delta \mathbf{E}_{sq}^H \right) \mathbf{G}_q^H, \end{equation}

where $\Delta\mathbf{E}_{sp}$ and $\Delta\mathbf{E}_{sq}^H$ are the differential gains associated with source $s$ and antennas $p$ and $q$, respectively. We now use the above equation and least squares (see $\S$ 8.1 ➞) to estimate the unknown differential and direction-independent gains. These gains can be used to correct $\mathbf{V}_{pq}$.
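The relationship between the two RIMEs above can be checked numerically. The following is a minimal sketch, assuming randomly generated Jones matrices and source coherencies that carry no physical meaning; the helper names `random_jones` and `predict_visibility`, as well as the array sizes, are illustrative and not part of the course material. It suggests that the differential-gain RIME reduces to the all-sky RIME when every $\Delta\mathbf{E}_{sp}$ is the identity matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ant, n_src = 4, 3  # illustrative sizes (assumption)

def random_jones(shape, scale=0.1):
    """Return 2x2 Jones matrices close to identity (hypothetical helper)."""
    noise = rng.normal(size=shape + (2, 2)) + 1j * rng.normal(size=shape + (2, 2))
    return np.eye(2) + scale * noise

def predict_visibility(G, dE, X, p, q):
    """Evaluate V_pq = G_p (sum_s dE_sp X_spq dE_sq^H) G_q^H."""
    inner = sum(dE[s, p] @ X[s, p, q] @ dE[s, q].conj().T
                for s in range(X.shape[0]))
    return G[p] @ inner @ G[q].conj().T

# Direction-independent gains, differential gains and per-source coherencies.
# The coherencies are random placeholders, not a realistic sky.
G = random_jones((n_ant,))
dE = random_jones((n_src, n_ant))
X = (rng.normal(size=(n_src, n_ant, n_ant, 2, 2))
     + 1j * rng.normal(size=(n_src, n_ant, n_ant, 2, 2)))

# Differential gains that are all equal to the identity matrix.
dE_identity = np.broadcast_to(np.eye(2), (n_src, n_ant, 2, 2))

p, q = 0, 1
V_dd = predict_visibility(G, dE, X, p, q)           # with differential gains
V_di = predict_visibility(G, dE_identity, X, p, q)  # identity differential gains
V_all_sky = G[p] @ X[:, p, q].sum(axis=0) @ G[q].conj().T  # all-sky RIME
```

Here `V_di` and `V_all_sky` agree to numerical precision, while `V_dd` differs because each source is corrupted by its own differential gain before the sum is taken.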

Warning: Note that we present the differential gains method using the fully polarized RIME, while we used the unpolarized RIME in [$\S$ 8.1 ➞](../8_Calibration/8_1_calibration_least_squares_problem.ipynb).

Warning: Differential gains should be used sparingly in order to avoid over-fitting.

Another question now arises: how do we know if a source requires a differential gain factor? Figure 8.4.1 ⤵ helps to answer this question.

Figure 8.4.1: Which sources require a differential gain factor?

We can learn two main things from [Fig 8.4.1 ⤵](#cal:fig:dir_dep):

  1. In practice it is easy to spot a source which requires a differential gain (in addition to the direction independent gain). The sources which require a differential gain are usually surrounded by imaging artifacts (shown by the purple regions around the black sources in the figure). The yellow sources do not require a differential gain factor (no imaging artifacts around them).
  2. The further a source is from the field center, the more likely it is to be affected by a direction dependent effect.

We now present a real radio image which was created using the differential gains method. In Fig. 8.4.2 ⤵ we have a JVLA image of the 3C147 field. The large image is the final end product, i.e. the differential gains method has already been applied. The upper left inset is an enlargement of a subregion of the larger image. The bottom left inset depicts how the same subregion looked prior to applying the differential gains method. This second inset supports the comments we have already made: sources which require a differential gain factor, in addition to a direction-independent gain, are usually surrounded by imaging artifacts. In this particular example, the image artifacts were caused by the rotation of the JVLA primary beam.

Figure 8.4.2: Differential gains method applied to the 3C147 field (courtesy of O.M. Smirnov).

8.4.1 Physics-based and heuristic-only approaches


3GC can, in general, be divided into physics-based and heuristic-only approaches. If we know the underlying physical phenomenon which is responsible for a specific direction-dependent effect, we may employ a physics-based calibration approach. This is usually accomplished by constructing a parametrized model based on the underlying physical phenomenon. The aim of this approach is to estimate the parameters of this model and use the results to correct our observed visibilities. In some cases, the direction-dependent phenomenon is known a-priori and we simply need to correctly incorporate it whilst calibrating. The following list contains some examples of physics-based approaches:

3GC: Physics-based approaches

Pointing-selfcal: [EVLA Memo 84. Solving for the antenna based pointing errors ⤴](http://www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo84.ps)
Kalman filter: [Nonlinear Kalman filters for calibration in radio interferometry ⤴](http://arxiv.org/abs/1403.6308)
Primary beam: [Incorporation of antenna primary beam patterns in radio-interferometric data reduction to produce wide-field, high-dynamic-range images ⤴](http://ieeexplore.ieee.org/Xplore/defdeny.jsp?url=http%3A%2F%2Fieeexplore.ieee.org%2Fstamp%2Fstamp.jsp%3Ftp%3D%26arnumber%3D7297163%26userType%3Dinst&denyReason=-134&arnumber=7297163&productsMatched=null&userType=inst)

Correcting for a direction-dependent effect is non-trivial, regardless of whether it is known a-priori or from calibration. The following list contains some approaches that have been proposed to accomplish this:

3GC: Correcting for a known direction dependent effect

Faceting: [Radio-interferometric imaging of very large fields-The problem of non-coplanar arrays ⤴](http://adsabs.harvard.edu/abs/1992A%26A...261..353C)
AW-projection: [Correcting direction-dependent gains in the deconvolution of radio interferometric images ⤴](http://arxiv.org/abs/0805.0834)

On the other end of the spectrum we have the heuristic-only approaches. In a heuristic approach we do not know the physical source of a specific direction-dependent effect. Instead, we introduce a number of free parameters which we try to optimize based on some user-defined heuristic. Some 3GC heuristic approaches are listed below:

3GC: Heuristic-only approaches

Peeling: [LOFAR calibration challenges ⤴](http://proceedings.spiedigitallibrary.org/proceeding.aspx?articleid=847375)
Differential gains: [Revisiting the radio interferometer measurement equation-II. Calibration and direction-dependent effects ⤴](http://arxiv.org/abs/1101.1765)
Clustered calibration: [Clustered calibration: an improvement to radio interferometric direction-dependent self-calibration ⤴](http://arxiv.org/abs/1301.0633)

Once we obtain a heuristic solution, we can try to make sense of it by fitting a physical model to it. Prime examples of this include:

3GC: Fitting a model to heuristic-only solutions

SPAM (Source Peeling and Atmospheric Modelling): [Ionospheric calibration of low frequency radio interferometric observations using the peeling scheme-I. Method description and first results ⤴](http://arxiv.org/abs/0904.3975)
Primary beam shapes: [Estimation of radio interferometer beam shapes using Riemannian optimization ⤴](http://arxiv.org/abs/1209.4236)

8.4.2 Solver Development

As we explained in $\S$ 8.2 ➞, least-squares solvers (see $\S$ 8.1 ➞) only became popular with the advent of self-calibration. Many improvements and alternatives to the least-squares solver have since been developed. We list the most recent developments below:

Solver Development

Eigendecomposition: [Gain calibration methods for radio telescope arrays ⤴](http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1145704&abstractAccess=no&userType=inst)
SAGEcal: [Radio interferometric calibration using the SAGE algorithm ⤴](http://arxiv.org/abs/1012.1722)
Robust calibration: [Robust radio interferometric calibration using the t-distribution ⤴](http://arxiv.org/abs/1307.5040)
StEFCal: [Fast gain calibration in radio astronomy using alternating direction implicit methods: Analysis and applications ⤴](http://arxiv.org/abs/1410.2101)
Riemann-Manifold: [Radio interferometric calibration using a Riemannian manifold ⤴](http://arxiv.org/abs/1303.1029)
Blind Calibration: [Blind calibration for radio interferometry using convex optimization ⤴](http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=7330285&abstractAccess=no&userType=inst)
Complex Optimization: [Radio interferometric gain calibration as a complex optimization problem ⤴](http://arxiv.org/abs/1502.06974)
Kalman filter: [Nonlinear Kalman filters for calibration in radio interferometry ⤴](http://arxiv.org/abs/1403.6308)

Of the methods listed above, StEFCal is arguably the most important recent development. It is discussed in more detail in $\S$ 8.4.2.1 ➞. Its importance stems, in part, from the fact that it lowers the computational complexity of calibration from $\mathcal{O}(N^3)$ to $\mathcal{O}(N^2)$, where $N$ is the number of antennas.

8.4.2.1 StEFCal

StEFCal is an alternating direction implicit method. It works by first solving for $\boldsymbol{\mathcal{G}}^H$ with $\boldsymbol{\mathcal{G}}$ held constant and then solving for $\boldsymbol{\mathcal{G}}$ with $\boldsymbol{\mathcal{G}}^H$ held constant. This is tantamount to linearising the calibration problem.

Warning: We have switched back to the unpolarized notation we used in [$\S$ 8.1 ➞](../8_Calibration/8_1_Calibration_Least_Squares_Problem.ipynb).

As $\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}\boldsymbol{\mathcal{M}}\boldsymbol{\mathcal{G}}^H$ is Hermitian, the two steps are equivalent and we only require the following update step:

$$\boldsymbol{\mathcal{G}}^{[i]} = \textrm{argmin}_{\boldsymbol{\mathcal{G}}}\left\|\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}^{[i-1]}\boldsymbol{\mathcal{M}}\boldsymbol{\mathcal{G}}^H\right\|.$$
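The equivalence of the two half-steps can be made explicit with a short worked argument. It is a sketch that assumes $\boldsymbol{\mathcal{D}}$ and $\boldsymbol{\mathcal{M}}$ are Hermitian and that $\|\cdot\|$ is the Frobenius norm, which is unchanged by taking the Hermitian transpose:

$$\left\|\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}^{[i-1]}\boldsymbol{\mathcal{M}}\boldsymbol{\mathcal{G}}^H\right\| = \left\|\left(\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}^{[i-1]}\boldsymbol{\mathcal{M}}\boldsymbol{\mathcal{G}}^H\right)^H\right\| = \left\|\boldsymbol{\mathcal{D}}- \boldsymbol{\mathcal{G}}\boldsymbol{\mathcal{M}}\left(\boldsymbol{\mathcal{G}}^{[i-1]}\right)^H\right\|.$$

Minimizing over the right-hand factor with the left-hand factor fixed therefore yields the same $\boldsymbol{\mathcal{G}}$ as minimizing over the left-hand factor with the right-hand factor fixed.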

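We can now write:

$$\left\|\boldsymbol{\mathcal{D}} - \boldsymbol{\mathcal{Z}}^{[i-1]}\boldsymbol{\mathcal{G}}^H\right\| = \sqrt{\sum_{p=1}^N\left\|\boldsymbol{\mathcal{D}}_{:,p}-\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}g_p^*\right\|^2},$$

with $\boldsymbol{\mathcal{Z}}^{[i-1]} = \boldsymbol{\mathcal{G}}^{[i-1]}\boldsymbol{\mathcal{M}}$, where $\|\cdot\|$ denotes the Frobenius norm on the left and the Euclidean norm inside the sum. We denote the $p$-th column of $\boldsymbol{\mathcal{A}}$ with $\boldsymbol{\mathcal{A}}_{:,p}$. Each term in the sum depends on a single gain $g_p$, so the problem separates into $N$ independent linear least-squares problems. If we now apply the normal equation method ⤴ to each of them we readily obtain: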

StEFCal update step

\begin{equation} g_p^{[i]} = \frac{\boldsymbol{\mathcal{D}}^H_{:,p}\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}}{\left(\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}\right)^H\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}}. \end{equation}
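The closed-form update can be compared against a generic least-squares solver. The sketch below assumes random complex vectors standing in for $\boldsymbol{\mathcal{D}}_{:,p}$ and $\boldsymbol{\mathcal{Z}}_{:,p}^{[i-1]}$; the variable names are illustrative only. It solves $\boldsymbol{\mathcal{Z}}_{:,p}\,x \approx \boldsymbol{\mathcal{D}}_{:,p}$ for $x = g_p^*$ with `numpy.linalg.lstsq` and conjugates the result.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ant = 6  # illustrative number of antennas (assumption)

# Random stand-ins for one column of D and one column of Z.
d_col = rng.normal(size=n_ant) + 1j * rng.normal(size=n_ant)
z_col = rng.normal(size=n_ant) + 1j * rng.normal(size=n_ant)

# Closed-form update: g_p = (D[:, p]^H Z[:, p]) / (Z[:, p]^H Z[:, p]).
# np.vdot conjugates its first argument, so vdot(a, b) = a^H b.
g_closed = np.vdot(d_col, z_col) / np.vdot(z_col, z_col)

# Generic least squares for x = conj(g_p) in  z_col * x ≈ d_col.
x_lstsq, *_ = np.linalg.lstsq(z_col[:, None], d_col, rcond=None)
g_lstsq = np.conj(x_lstsq[0])
```

Both routes give the same value of $g_p$ to numerical precision, which is what the normal equations predict for a one-parameter linear problem.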

We can use the above update step to iteratively obtain the best estimate of $g_p$. We iterate until we either reach some maximum number of iterations or satisfy our convergence criterion.

Warning: In practice we replace the gain solution of each even iteration by the average of the current gain solution and the gain solution of the previous odd iteration.
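The iteration described in this section can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `stefcal`, the defaults `max_iter` and `tol`, the relative-change convergence test and the noise-free point-source model (all entries of $\boldsymbol{\mathcal{M}}$ equal to one, autocorrelations included) are assumptions made for the example.

```python
import numpy as np

def stefcal(D, M, max_iter=200, tol=1e-8):
    """Unpolarized StEFCal sketch: find g such that D ≈ G M G^H, G = diag(g)."""
    n_ant = D.shape[0]
    g = np.ones(n_ant, dtype=complex)  # initial guess: unit gains
    for i in range(1, max_iter + 1):
        g_prev = g
        Z = g_prev[:, None] * M  # Z = G M (row p of M scaled by g_p)
        # Column-wise update: g_p = D[:, p]^H Z[:, p] / (Z[:, p]^H Z[:, p]).
        g = np.sum(np.conj(D) * Z, axis=0) / np.sum(np.abs(Z) ** 2, axis=0)
        if i % 2 == 0:
            # Even iteration: average with the previous (odd) solution.
            g = 0.5 * (g + g_prev)
            # Assumed stopping rule: small relative change in the gains.
            if np.linalg.norm(g - g_prev) <= tol * np.linalg.norm(g):
                break
    return g, i

# Noise-free toy problem with a unit point source at the phase centre.
rng = np.random.default_rng(2)
n_ant = 7
g_true = 1 + 0.1 * (rng.normal(size=n_ant) + 1j * rng.normal(size=n_ant))
M = np.ones((n_ant, n_ant), dtype=complex)
D = np.outer(g_true, np.conj(g_true)) * M  # D_pq = g_p M_pq conj(g_q)

g_est, n_iter = stefcal(D, M)

# The gains are only determined up to a global phase, so align the
# estimate to the phase of antenna 0 before comparing.
g_aligned = g_est * np.exp(-1j * np.angle(g_est[0] / g_true[0]))
```

Every operation inside the loop acts element-wise on $N \times N$ arrays, which is where the $N^2$ cost per iteration comes from. For this toy model, skipping the averaging on even iterations makes the solution bounce between two scaled copies of the true gains instead of settling.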