Import standard modules:
In [2]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import HTML
HTML('../style/course.css') #apply general CSS
*Jones notation*, or ***Jones calculus***, is at the heart of the radio interferometer measurement equation (RIME). It is a mathematical framework with which to describe the propagation of electromagnetic plane waves. A telescope does not measure a "pure" astrophysical signal, but rather one *contaminated* by various effects as the signal makes its way from the astrophysical source to the telescope - where the measurement itself can introduce contamination! Many of these contaminating effects can be modelled with Jones notation. The business of estimating and removing them is known as **calibration**, and is covered in [$\S$ 8 ➞](../8_Calibration/8_0_Introduction.ipynb). In this section, we simply introduce Jones notation and show how it can be used to describe *propagation effects* - physical processes an astrophysical signal undergoes between being emitted and being measured.
In [$\S$1.2 ➞](../1_Radio_Science/01_02_electromagnetic_radiation_and_astronomical_quantities.ipynb), we showed that radio waves were electromagnetic waves. Since astrophysical sources are distant enough that we can treat them, for all intents and purposes, as though they were infinitely distant, these waves can be treated as *plane waves*.
We thus pick a Cartesian frame of reference with basis $(x,y,z)$, where $z$ is the direction of our wave's propagation. In general, any electric field at point $(x,y,z)$ and time $t$ can be described by a complex 3-vector:
In the specific case of a planar wave, we also have the following properties:
In this case we can describe the entire plane wave (as a function of time) by a single complex 2-vector: $$\mathbf{e}(z,t) = \left[ \begin{array}{c}e_x\\e_y\end{array} \right].$$
The images below (courtesy of the Wikipedia page for plane waves ➞) show two very special kinds of coherent plane waves; respectively, fully linearly and fully circularly polarized waves. Both are monochromatic. Note that this only shows the complex amplitude of the $e_x$ and $e_y$ components.
The $\mathbf{e}$-vector of these two plane waves can be written as:
For a linearly polarized [along the $x$ axis] plane wave: $$e_x = A_0 \cos( 2\pi(z - ct)/\lambda+\phi),~~~e_y=0.$$
For a circularly polarized plane wave: $$e_x = A_0 \cos( 2\pi(z - ct)/\lambda+\phi), ~~~e_y = A_0 \sin( 2\pi(z - ct)/\lambda+\phi).$$
Here, $A_0$ is the wave's amplitude, $\lambda$ its wavelength, $\phi$ its phase shift, and $c$ its speed (i.e. the speed of light).
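As a minimal numerical sketch of the two expressions above, we can evaluate both waves along the propagation axis at a single instant. The values chosen for `A0`, `lam`, `phi` and the sampling grid are arbitrary illustrative assumptions, not quantities taken from the text.

```python
import numpy as np

c = 299792458.0   # speed of light in m/s
A0 = 1.0          # amplitude (arbitrary illustrative value)
lam = 0.21        # wavelength in metres (arbitrary illustrative value)
phi = 0.3         # phase shift in radians (arbitrary illustrative value)

z = np.linspace(0.0, 2 * lam, 201)  # positions along the propagation axis
t = 0.0                             # evaluate the waves at one instant
phase = 2 * np.pi * (z - c * t) / lam + phi

# linearly polarized along x: e_y vanishes identically
ex_lin, ey_lin = A0 * np.cos(phase), np.zeros_like(z)

# circularly polarized: e_x and e_y have equal amplitude and are in quadrature
ex_circ, ey_circ = A0 * np.cos(phase), A0 * np.sin(phase)

# the tip of the circularly polarized e-vector stays on a circle of radius A0
radius = np.hypot(ex_circ, ey_circ)
print(np.allclose(radius, A0))
```

The constant radius is what distinguishes the circular case: the vector rotates without changing length, whereas the linearly polarized vector oscillates along a fixed axis.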
The above only holds for *coherent* radiation. Radiation from astrophysical sources is, by its nature, *incoherent* and *broad-spectrum*, because it results from natural processes. In other words, it is essentially noise-like -- you can think of the $\mathbf{e}$ vector as "waving around" more or less randomly, and so the exact, analytical treatment given above does not apply in practice.
We can sidestep the broad spectrum issue - for now - by considering radiation within a narrow frequency bin $[\nu-\Delta\nu/2,\nu+\Delta\nu/2]$, centred on $\nu$, with a bandwidth of $\Delta \nu$. As we'll see elsewhere in this chapter, most instruments measure signals within this kind of narrow frequency bin; we can think of the full broad-spectrum signal as a superposition of near-monochromatic signals.
As for polarization, it is still possible to describe the signal's polarization state in a statistical sense. This is done using the Stokes parameters, which are defined in terms of the coherency of the $\mathbf{e}$-vector components (the $\langle\cdot\rangle$ operator indicates averaging over time):
$I$ is the total power (flux) in the signal. $Q$ and $U$ correspond to linearly polarized flux, while $V$ corresponds to circularly polarized flux. For a signal with $I=1, Q=0.01, U=0, V=0$ the $\mathbf{e}$-vector tends to "wave around" in the $x$ direction a little bit (or exactly 1%) more than in the $y$ direction; a signal with $Q=-0.01$ "waves around" the $y$ direction more than the $x$ direction; a signal with $Q=0,U=0.01$ tends to "wave around" the $45^\circ$ axis. The two perfectly polarized plane waves in the figure above correspond to $I=Q, U=0, V=0$ and $I=V, Q=0, U=0$ respectively, with $I \propto A_0^2$ in both cases, since the Stokes parameters measure power rather than amplitude.
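The statistical description can be sketched numerically by time-averaging products of simulated $e_x$ and $e_y$ samples. The helper names `stokes` and `complex_noise`, the sample count and the random seed are hypothetical choices made for illustration. The convention used is one common choice and is an assumption here: $I = \langle|e_x|^2\rangle + \langle|e_y|^2\rangle$, $Q = \langle|e_x|^2\rangle - \langle|e_y|^2\rangle$, $U = 2\,\mathrm{Re}\langle e_x e_y^*\rangle$, $V = -2\,\mathrm{Im}\langle e_x e_y^*\rangle$. The sign of $V$ in particular differs between references.

```python
import numpy as np

def stokes(ex, ey):
    """Estimate (I, Q, U, V) from complex e_x, e_y samples by time-averaging.

    Assumed convention (the sign of V varies between references):
        I = <|e_x|^2> + <|e_y|^2>,  Q = <|e_x|^2> - <|e_y|^2>,
        U = 2 Re<e_x e_y^*>,        V = -2 Im<e_x e_y^*>.
    """
    xx = np.mean(np.abs(ex) ** 2)
    yy = np.mean(np.abs(ey) ** 2)
    xy = np.mean(ex * np.conj(ey))
    return xx + yy, xx - yy, 2 * xy.real, -2 * xy.imag

rng = np.random.default_rng(42)
n = 1_000_000  # number of time samples (illustrative)

def complex_noise(power):
    # complex Gaussian noise with mean power <|e|^2> equal to `power`
    return np.sqrt(power / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

# noise-like signal with slightly more power along x: I = 1, Q = 0.01
ex = complex_noise(0.505)
ey = complex_noise(0.495)
I, Q, U, V = stokes(ex, ey)
print(f"I={I:.3f} Q={Q:.3f} U={U:.3f} V={V:.3f}")

# fully linearly polarized (along x) wave with complex amplitude A0:
# all of the power is in Q, so Q equals I
A0 = 2.0
I_lin, Q_lin, U_lin, V_lin = stokes(np.full(8, A0 + 0j), np.zeros(8, dtype=complex))
print(I_lin, Q_lin, U_lin, V_lin)
```

Because the noise-like estimate is an average over a finite number of samples, the recovered values scatter around $I=1, Q=0.01$ at roughly the $10^{-3}$ level for this sample count; a longer average tightens them.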
The usual approach to interferometry is to describe polarization in terms of the four Stokes parameters $IQUV$, since this is the representation that most readily lends itself to Jones calculus (see below). You may come across some other ways to represent polarization in the scientific literature. We will describe them here, for the sake of completeness.
Fractional polarization describes the polarization state in terms of the polarization fraction $p$, position angle $\psi$, and the angle $\chi$: $$\begin{eqnarray} p &=& \frac{\sqrt{Q^2+U^2+V^2}}{I}\\ \psi &=& \frac{1}{2} \tan^{-1}\frac{U}{Q}\\ \chi &=& \frac{1}{2} \tan^{-1}\frac{V}{\sqrt{Q^2+U^2}}\\ \end{eqnarray} $$
These parameters can be represented in the form of a so-called polarisation ellipse (image courtesy of the Wikipedia page for the Stokes parameters ➞).
Figure 7.1.2: Polarisation ellipse
Circular polarization ($V$) is exceedingly rare in astrophysics, so the scientific literature often describes polarization using only $p$ and $\psi$. Physically speaking, these give the fraction of linearly polarized signal and its orientation on the sky respectively. In the examples above, for $I=1, Q=0.01, U=0, V=0$, the signal is said to be 1% linearly polarized ($p=0.01$) with polarization angle $0^\circ$; a signal with $Q=-0.01$ has polarization angle $90^\circ$, and a signal with $Q=0,U=0.01$ has polarization angle $45^\circ$. This can be illustrated as (courtesy of the Wikipedia page for the Stokes parameters ➞):
Figure 7.1.3: Polarisations corresponding to different values of $I,Q,U,V$
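The fractional-polarization formulas can be checked against the three examples quoted above with a short sketch. The function name `fractional_polarization` is hypothetical. As a deliberate departure from the literal $\tan^{-1}(U/Q)$, the sketch uses `np.arctan2`, which keeps the quadrant information and copes with $Q=0$.

```python
import numpy as np

def fractional_polarization(I, Q, U, V):
    """Return (p, psi, chi), with both angles in degrees."""
    p = np.sqrt(Q**2 + U**2 + V**2) / I
    # arctan2 resolves the quadrant that a plain arctan(U/Q) would lose,
    # and avoids a division by zero when Q = 0
    psi = 0.5 * np.degrees(np.arctan2(U, Q))
    chi = 0.5 * np.degrees(np.arctan2(V, np.hypot(Q, U)))
    return p, psi, chi

# the three example signals discussed in the text
for I, Q, U, V in [(1, 0.01, 0, 0), (1, -0.01, 0, 0), (1, 0, 0.01, 0)]:
    p, psi, chi = fractional_polarization(I, Q, U, V)
    print(f"Q={Q:+.2f} U={U:+.2f}: p={p:.2f}, psi={psi:.0f} deg, chi={chi:.0f} deg")
```

With a plain arctangent, the $Q=-0.01$ case would return $0^\circ$ instead of $90^\circ$, which is why the two-argument form is the safer choice in practice.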
Poincaré sphere. Another way to represent polarization state is via the Poincaré vector $(S_1,S_2,S_3)=(Q,U,V)$. This vector lies on the so-called Poincaré sphere, whose radius is given by the total polarized flux $P=\sqrt{Q^2+U^2+V^2}$ (image courtesy of the Wikipedia page for the Stokes parameters ➞):
Propagation effects that change the polarization state (such as Faraday rotation) can often be described by rotation of the Poincaré vector.
In practice, you will seldom come across these representations in radio interferometry.
By definition of the Stokes parameters, a physical signal has the property $Q^2+U^2+V^2 \le I^2$, i.e. the polarized fraction cannot exceed 100%. This might seem obvious; however, a curious fact of radio interferometry is that it is possible to observe signals that appear to formally have polarization in excess of 100%. Recall from $\S$ 5 ➞ that an interferometer samples a limited set of spatial frequencies, bounded by its shortest and longest baselines. Therefore, if the $I$ flux varies smoothly over the sky (i.e. has very low spatial frequencies, or large spatial scales) while the polarized $QU$ flux varies on smaller scales, it is quite possible for the interferometer to "see" less (or even no) $I$ flux, and significant $QU$ flux. This is in fact the case when observing near the plane of the Milky Way: unpolarized synchrotron radiation from around the galactic plane is very smooth (on the order of degrees), and gets mostly suppressed, while the polarized foregrounds show smaller-scale structure (on the order of arcminutes) that is quite apparent to most interferometers.
Jones calculus lets us describe what happens to a signal as it interacts with some medium on its way to us. For example:
The fundamental (and indeed only!) assumption of Jones calculus is that all propagation effects are linear. In other words, if the original signal vector is $\mathbf{e}$ and the propagated vector is $\mathbf{e}'=\mathbf{f}(\mathbf{e})$, then
$$ \mathbf{f}(a\mathbf{e}_1+b\mathbf{e}_2) = a\mathbf{f}(\mathbf{e}_1)+b\mathbf{f}(\mathbf{e}_2). $$

Fortunately, almost all reasonable physics of signal propagation - with very few, exotic exceptions - is linear. From linear algebra, we know that any linear function on 2-vectors can be described by multiplication with a $2\times2$ matrix:
$$ \mathbf{e}' = \left[ \begin{array}{c}e'_x\\e'_y\end{array} \right] = \left[ \begin{array}{cc}j_{11} & j_{12} \\ j_{21} & j_{22} \end{array} \right] \left[ \begin{array}{c}e_x\\e_y\end{array} \right] = \mathbf{J} \mathbf{e}. $$

$\mathbf{J}$ is called a Jones matrix. Any (linear) propagation effect can thus be described by its own particular kind of Jones matrix.
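A quick numerical sketch makes the linearity property concrete. The Jones matrix, the two signal vectors and the coefficients below are random or arbitrary values chosen purely for illustration; they do not correspond to any particular physical effect.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_complex(shape):
    # arbitrary complex values, used only as illustrative inputs
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

J = random_complex((2, 2))                     # a hypothetical Jones matrix
e1, e2 = random_complex(2), random_complex(2)  # two input e-vectors
a, b = 0.7 - 0.2j, -1.5 + 0.4j                 # arbitrary complex coefficients

# propagating a superposition ...
lhs = J @ (a * e1 + b * e2)
# ... gives the same result as superposing the propagated signals
rhs = a * (J @ e1) + b * (J @ e2)
print(np.allclose(lhs, rhs))
```

The check holds for any $2\times2$ matrix, which is the point: once an effect is linear, a single matrix captures it completely.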
For example, we may consider the two complex voltages measured by our (dual-element) receiver as a voltage 2-vector, and therefore also treat it as a linear function of the EM vector input. The receiver itself can then be described by a Jones matrix:
$$ \mathbf{v} = \left[ \begin{array}{c}v_1\\v_2\end{array} \right] = \left[ \begin{array}{cc}j_{11} & j_{12} \\ j_{21} & j_{22} \end{array} \right] \left[ \begin{array}{c}e_x\\e_y\end{array} \right] = \mathbf{J} \mathbf{e}. $$

Here, we transform $\mathbf{e}$ into $\mathbf{v}$ because we can treat the EM radiation as being converted into the voltage that our telescope measures.
In radio interferometry, Jones matrices are conventionally designated by a capital letter, with different letters corresponding to different propagation effects. Some examples are:
where $\kappa$ is a proportionality constant.
E-Jones describes the primary beam gain. This is further discussed in $\S$ 7.5 ➞.
D-Jones describes the polarization leakage, which is caused by "cross-talk" between the two receiver elements. It typically looks like
Both the E- and D-Jones terms are examples of direction-dependent effects (see $\S$ 7.3 ➞).
Note that the above are only examples. Depending on our instrumental model, the G-, B-, E-, and D-Jones terms may take a different form, or may be rolled into a single Jones matrix.
A sequence of propagation effects is equivalent to a chain of Jones matrix multiplications. We can thus describe the voltage vector our instrument measures as follows: $$ \mathbf{v} = \mathbf{J}_n \mathbf{J}_{n-1} ... \mathbf{J}_1 \mathbf{e} = \mathbf{Je}, $$
where the individual Jones terms describe the different effects in sequence. The system then has an overall Jones matrix of $\mathbf{J}_{\textrm{sys}} = \mathbf{J}_n \mathbf{J}_{n-1} ... \mathbf{J}_1 $.
For example, a system Jones chain might be $\mathbf{J}_{\textrm{sys}} = \mathbf{G} \, \mathbf{B} \, \mathbf{D} \, \mathbf{E} \, \mathbf{K} \, \mathbf{P} \, \mathbf{Z} \, \mathbf{F}$. Note that the order of the chain is important: the original signal is operated on by the Jones chain in the order in which the propagation effects occur, so the first effect is the right-most matrix. If, for example, the signal is Faraday rotated on its way through the ionosphere, then modulated by the primary beam, before gain is applied by the amplifiers, then $\mathbf{F}$ must be applied before $\mathbf{E}$, which must be applied before $\mathbf{G}$. In general, matrix multiplication does not commute.
In real-life observations, some of the effects in the chain are known perfectly in advance (e.g. K-Jones, P-Jones), others may have a reasonable prior model (E-Jones), and some can only be estimated through calibration (e.g. G-Jones, B-Jones, see $\S$ 8 ➞).
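The ordering of a Jones chain can be illustrated with a small sketch of the Faraday rotation, primary beam and gain example. All numerical values are made up for illustration. In particular, the full $2\times2$ form used for $\mathbf{E}$ is only a placeholder assumption, since the actual form of each term depends on the instrumental model.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for an angle theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

F = rotation(np.radians(20.0))                           # Faraday rotation (made-up angle)
E = np.array([[0.9, 0.05], [0.02, 0.8]])                 # primary beam (placeholder values)
G = np.diag([1.2 * np.exp(0.3j), 0.8 * np.exp(-0.1j)])   # receiver gains (made-up values)

e = np.array([1.0 + 0.5j, -0.3 + 0.2j])                  # incoming signal (made-up values)

# apply the effects one at a time, in the order the signal encounters them
v_stepwise = G @ (E @ (F @ e))

# equivalent single system matrix: the first effect is the right-most factor
J_sys = G @ E @ F
v_chain = J_sys @ e

# writing the chain in the reverse order describes a different signal path
v_reversed = (F @ E @ G) @ e

print(np.allclose(v_stepwise, v_chain))   # same physics, two ways of writing it
print(np.allclose(v_chain, v_reversed))   # the reversed chain does not agree
```

Collapsing the chain into `J_sys` is what allows a whole signal path to be treated as one effective Jones matrix, as long as the factors are multiplied in the physical order.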
From linear algebra we know that, in general, matrix multiplication does not commute. The order in which we chain Jones matrices must therefore correspond to the physical order of the effects experienced by the signal. Some kinds of matrices do commute, however. The rules for commutation are as follows:
A scalar matrix commutes with any matrix. The K- and Z-Jones matrices can thus be moved anywhere in the chain without affecting the result.
Diagonal matrices commute among themselves. The B- and G-Jones terms may thus be swapped without changing the result.
Rotation matrices commute among themselves. The P- and F-Jones terms may thus be swapped.
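The three commutation rules above can be verified numerically with a short sketch. The helper names `rotation` and `commute` are hypothetical, and every matrix entry is an arbitrary illustrative value rather than a realistic instrumental term.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for an angle theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def commute(A, B):
    """True if A and B commute to within floating-point tolerance."""
    return np.allclose(A @ B, B @ A)

K = 0.5 * np.exp(1.1j) * np.eye(2)        # scalar matrix (like K- or Z-Jones)
G = np.diag([1.2 + 0.1j, 0.8 - 0.3j])     # diagonal matrix (like G-Jones)
B = np.diag([0.9, 1.1 + 0.2j])            # diagonal matrix (like B-Jones)
P = rotation(0.4)                         # rotation matrix (like P-Jones)
F = rotation(-1.3)                        # rotation matrix (like F-Jones)
M = np.array([[1.0, 2.0j], [0.5, -1.0]])  # arbitrary full matrix

print(commute(K, M))   # scalar matrix with any matrix
print(commute(G, B))   # diagonal with diagonal
print(commute(P, F))   # rotation with rotation
print(commute(G, P))   # diagonal with rotation
```

The first three checks succeed, while the last one fails: a diagonal matrix with unequal entries does not commute with a rotation, so a G-Jones term cannot simply be moved past a P- or F-Jones term.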
The combination of Jones chains and linear algebra is the source of the tremendous power and utility of Jones calculus. In a nutshell:
To determine the true underlying astrophysical signal, we must model all the instrumentation and propagation effects.
Jones calculus allows us to construct a notionally perfect description of signal propagation, with individual effects described by elements of the Jones chain.
We can then use our prior knowledge to fix some of the Jones terms in the chain.
We can use calibration ($\S$ 8 ➞) to estimate others.
We can use the rules of linear algebra to reorder the Jones terms where allowed, in order to simplify our calculations.
We will see all this come together in the next section on the RIME, and in the next chapter on calibration.