Import standard modules:


In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import HTML 
HTML('../style/course.css') #apply general CSS


Out[1]:

Import section specific modules:


In [3]:
HTML('../style/code_toggle.html')


Out[3]:

4.3 The 2-element Interferometer

To understand the practicalities of interferometry, we must spend some time on the simple 2-element interferometer. Here, we will treat it in 1D. Similar to $\S$ 1.9 ➞, we will derive the response of an ideal 1D interferometer and its link to the intensity distribution of a source. In $\S$ 4.3.1 ⤵, we establish the link between interferometry on an optical bench and radio interferometry. From the derivation of the cross-correlation of antenna signals, we will derive a spatial fringe pattern - the radio counterpart to the optical fringe pattern. In $\S$ 4.3.2 ⤵ we define two classical interferometers in a 1D regime: the $\sum$ and $\Pi$ interferometers. Finally, in $\S$ 4.3.3 ⤵, we build the complex interferometer and introduce the idea of a visibility, which will be further detailed in $\S$ 4.4 ➞.

4.3.1 Wave superposition theorem

Consider two point sources $S_1$ and $S_2$. Let us assume their emission is coherent and in phase: they will thus each emit a wave $s_1(r,t)$ and $s_2(r,t)$ which propagates through space at speed c. The point P in Fig. 4.3.1 ⤵ will receive the linear superposition of the two waves:

$$s(r,t)=s_1(r_1,t)+s_2(r_2,t)$$

where $r_1$ and $r_2$ are the radial distances from the two sources $S_1$ and $S_2$.

Moreover, \begin{eqnarray} s(r,t) &=& s_1(r_1,t)+s_2(r_2,t) \\ s(r,t) &=& \frac{s_{01}}{r_1} e^{\imath (\omega_1 t - k_1 r_1 +\varphi_{01})}+\frac{s_{02}}{r_2} e^{\imath (\omega_2 t - k_2 r_2 + \varphi_{02})} \end{eqnarray}

where $s_{01}$ and $s_{02}$ are the amplitudes of the two waves.

If the sources are emitting waves with the same amplitude ($s_{01}=s_{02}=s_{0}$), angular frequency ($\omega_1=\omega_2=\omega$), and phase ($\varphi_{01}=\varphi_{02}=\varphi_0$), then:

$s_1(r_1=0,t)=s_2(r_2=0,t)=s_0 e^{\imath \omega t}$

In the above equation, we have chosen the initial phase to be $\varphi_0 = 0$.

At $P$, we have that:

\begin{eqnarray} s_{1P}(r_1,t) &=& s_0 \exp{\imath \left[ \omega (t - \frac{r_1}{c}) \right]} \\ s_{2P}(r_2,t) &=& s_0 \exp{\imath \left[ \omega (t - \frac{r_2}{c}) \right]} \\ \end{eqnarray}

We can therefore rewrite the initial phase of these signals at $P$: $$\varphi_1=-\omega \frac{r_1}{c} \quad s_{1P}(r_1,t)=s_0 \exp{\imath (\omega t + \varphi_1)}$$

$$\varphi_2=-\omega \frac{r_2}{c} \quad s_{2P}(r_2,t)=s_0 \exp{\imath (\omega t + \varphi_2)}$$

The attentive reader will note that the two signals are out of phase by $\Delta \Phi=\varphi_2-\varphi_1$ at $P$. This phase difference can be associated with either a time delay or a difference in the distance travelled by the light from each source to $P$.

To derive this delay, we first need a definition of the optical path length (OPL), which is defined along a curve $\mathcal{C}$: $$ OPL = \int_\mathcal{C} n(s)ds$$ where $n$ is the optical index of the propagation medium ($n=1$ in vacuum) and $s$ the curvilinear abscissa along the path.

With this definition, we can define the optical path difference (OPD) $\Delta L$ as the difference in physical length between the path from $S_2$ to $P$ and the path from $S_1$ to $P$:

$$ \Delta L = S_2P - S_1P = r_2-r_1 \quad \Delta \Phi = \varphi_2-\varphi_1 = -2 \pi \frac{\Delta L}{\lambda}$$

The phase $\Delta \Phi$ depends on $\Delta L=r_2-r_1$, or equivalently, the position of $P$ w.r.t. the sources.

Figure 4.3.1: Interference region between two emitting sources $S_1$ and $S_2$. At position P, we receive the superposition of the two waves.

As seen previously, the incoming waves superpose at $P$. Assuming that $S_1$ and $S_2$ have the properties outlined above, the resulting signal has the same frequency $\omega$ as the constituent waves, but has amplitude $S_{0P}$ which depends on the relative amplitude and phase of the two waves at the location $P$:

$$s_P(t)=S_{0P} \cos( \omega t + \phi_{0P})$$

$$\text{with } S_{0P}=\sqrt{s_{01}^2+s_{02}^2+2 s_{01} s_{02} \cos \Delta \Phi} = \sqrt{2 s_0^2(1+ \cos\Delta \Phi)}$$

The amplitude $S_{0P}$ is modulated by a factor which depends on the location of $P$. The phase $\phi_{0P}$ depends on the phases $\varphi_{01}$ and $\varphi_{02}$.

4.3.1.1 Interfering conditions

The $\cos$ term in the previous equation will modulate the amplitude of the wave. We can define two regimes depending on the value of $S_{0P}$:

  • $S_{0P}$ is maximal when $\cos \Delta\Phi = +1$, i.e. $\Delta \Phi = 2 m \pi$ with $m \in \mathbb{Z}$, meaning that $\Delta L = m \lambda$. The two waves add in phase. This is known as constructive interference.
  • $S_{0P}$ is zero when $\cos \Delta\Phi = -1$, i.e. $\Delta \Phi = (2m+1) \pi$ with $m \in \mathbb{Z}$, meaning that $\Delta L= (m+\frac{1}{2}) \lambda$. The two waves have opposite phase and therefore cancel out. This is known as destructive interference.
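The two regimes above can be checked numerically. The following is a minimal sketch that sums two equal-amplitude waves as complex phasors; the function name `amplitude_at_p` and the numerical values of the amplitude and wavelength are assumptions chosen only for illustration.

```python
import cmath
import math

def amplitude_at_p(s0, delta_l, wavelength):
    """Amplitude of two equal-amplitude waves whose paths differ by delta_l."""
    delta_phi = 2 * math.pi * delta_l / wavelength
    # Sum the two waves as phasors; only the phase difference matters.
    return abs(s0 + s0 * cmath.exp(1j * delta_phi))

s0 = 1.0           # assumed common amplitude
wavelength = 0.21  # assumed wavelength in metres

for m in range(3):
    bright = amplitude_at_p(s0, m * wavelength, wavelength)
    dark = amplitude_at_p(s0, (m + 0.5) * wavelength, wavelength)
    print(f"m={m}: constructive {bright:.3f}, destructive {dark:.3f}")
```

For any other path difference, the same function should agree with the closed form $\sqrt{2 s_0^2(1+\cos\Delta\Phi)}$, since the cosine makes the result independent of the sign of $\Delta\Phi$.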


The amplitude of the resulting interference pattern is position-dependent. The interference pattern is also known as the fringe pattern. This pattern consists of fringes, each of which has a (spatially) constant phase. A single fringe is defined as the locus of points where $S_2 P - S_1 P = \text{const}$. In three-dimensional space, the fringes are sets of hyperboloids with axial symmetry around the axis $S_1S_2$ (see Fig. 4.3.2 ⤵). Each hyperboloid corresponds to a particular constant.

  • In the plane perpendicular to the axis $S_1S_2$ and located mid-way between the sources, $\Delta L = 0$, and we therefore have a bright fringe.
  • Conversely, in any plane containing the axis $S_1S_2$, the fringes are hyperbolas.
  • Far from the sources, in any plane perpendicular to the axis $S_1S_2$, circular fringes will be observed.
  • Far from the sources, in any plane parallel to the axis $S_1S_2$, linear fringes will be observed.

Figure 4.3.2: Interference region between two emitting sources $S_1$ and $S_2$ in three-dimensional space. The amplitude of the interference pattern takes the shape of circular hyperboloids. The axis $S_1S_2$ is a characteristic axis of symmetry in the system. In a plane perpendicular to this axis, we observe circular fringes in the far-field, and in a plane parallel to this axis, we observe linear fringes in the far-field.

Note: There is a continuous variation in fringe shape from linear to circular as the plane of observation moves from being parallel to perpendicular to the axis $S_1S_2$.
Warning: The interference fringes are clearly visible when the two waves approximately share the same physical characteristics (amplitude, frequency): see the note on temporal/spatial coherence below.

4.3.1.2 General considerations in the receiving case

In the "emitting" case, we have seen that waves originating from two different point sources can interfere. This interference can be projected onto a screen: it then gives different interference patterns depending on the location of the screen (Fig. 4.3.2 ⤵).

We will now consider the reverse scenario: in the "receiving" case, two point receivers are illuminated by a single plane wave coming from a point source located infinitely far away. Before studying the resulting interference, we must introduce the notions of temporal and spatial coherence. The waves reaching the receivers must have these two attributes.

4.3.1.3 Temporal and Spatial coherence

This section aims to provide an intuitive approach to understanding wave coherence. The coherence of a wave is the degree to which it can maintain its "shape" (i.e. the degree of phase correlation) at different locations and different times as it propagates.

A propagating wave is said to have temporal coherence if the phase difference between any two points, at an instant of time, along the direction of propagation (e.g. $P_1$ and $P'_1$ in Fig. 4.3.3a ⤵) is independent of time. In other words, a wave is temporally coherent if successive wavefronts are propagating between the two points with the same delay. Temporal coherence tells us how monochromatic a source is.

Similarly, a propagating wave has spatial coherence if the phase difference, between any two points in a plane perpendicular to the direction of propagation and at an instant of time, (here $P_2$ and $P'_2$ on Fig. 4.3.3a ⤵) is independent of time. In other words, a wave is spatially coherent if the spatial shape of the wavefront remains the same as the wave propagates. Spatial coherence gives us information on the uniformity of a wavefront's phase, and is usually associated with the spatial extent of the source.

In Fig. 4.3.3 ⤵, we illustrate the various regimes where a wave has temporal or spatial coherence. Green lines illustrate how the phase difference between random pairs of points taken along (or perpendicular to) the direction of propagation is independent of time as the wave propagates. Conversely, the red lines show cases where the phase differences are no longer time-independent.

Those coherences, if not maintained, will blur the fringe pattern: the wave would then be interfering destructively. Spectral and polarization coherences should also be taken into consideration in principle, but this is beyond the scope of this section.

Figure 4.3.3: (a) Wave with both spatial and temporal coherence (b) Spatial coherence and no temporal coherence (c) temporal coherence and no spatial coherence (d) no spatial coherence and no temporal coherence.

Hopefully, we have shown that extended sources (i.e. sources with varying wavefronts) and multi-frequency sources (whose emission has a mix of various wavelengths) have incoherent emission. For the rest of this chapter, we will assume that we have a single point source with both spatial and temporal coherence.

4.3.1.4 Foreword on array pointing, phase center and projected baseline

Before moving to the full 3D case where we will have to consider the interferometer's response - how well it detects objects on the celestial sphere - we will focus on a simple case. Consider a two-element (and therefore 1-D) interferometer observing a coherent plane wave emitted by an infinitely distant point source. We assume this emission is spatially and temporally coherent.

Each element of our interferometer is an isotropic receiver (i.e. sees the full sky) lying on the ground (Fig. 4.3.4 ⤵). They are separated by a distance $|\mathbf{b}|$. As per $\S$ 1.2 ➞, we will consider that the array is illuminated by an electromagnetic plane wave coming from a direction $\mathbf{s_0}$, at an angle $\theta$ w.r.t. the baseline.

Figure 4.3.4: The projection of baseline $\mathbf{b}$ towards the direction $\theta$ along $\mathbf{s}_0$ in a simple 1-D sky/baseline example.

The inclination of the source causes the wave to travel an extra distance $\Delta L$ to reach $R_1$ after it has reached $R_2$. The wave's optical path length is longer for $R_1$ than $R_2$. This optical path difference is defined as $\Delta L = \lvert \mathbf{b} \rvert \cos \theta$.

In order to constructively combine the signals measured by $R_1$ and $R_2$, we need to compensate for this extra delay. In interferometry, this delay correction is known as fringe (or delay) tracking. As a result of this correction, we define the direction $\mathbf{s_0}$ on the celestial sphere to be the phase center of the array.

From the perspective of the source, the array appears projected on the normal plane of the direction of propagation (coinciding with the incoming plane wave at direction $\mathbf{s_0}$). The apparent projected distance between the antennas is $|\mathbf{b}|\sin \theta$. This length is called the projected baseline. In vector form, the extra OPL is found by projecting $\mathbf{b}$ onto $\mathbf{s_0}$: $$\Delta L= \mathbf{b} \cdot \mathbf{s_0}$$
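As a small numerical illustration of the geometry above, the sketch below computes the extra path length $\mathbf{b} \cdot \mathbf{s_0}$ and the projected baseline for a 2-D configuration. The helper name `path_difference`, the baseline length and the angle are assumptions made for this example only.

```python
import math

def path_difference(b_vec, s0_vec):
    """Extra path length b . s0 for a baseline vector and a unit direction."""
    return sum(b * s for b, s in zip(b_vec, s0_vec))

baseline_length = 100.0      # assumed |b| in metres
theta = math.radians(60.0)   # assumed angle between b and s0

# 2-D sketch: baseline along x, source direction at angle theta from it.
b_vec = (baseline_length, 0.0)
s0_vec = (math.cos(theta), math.sin(theta))

delta_l = path_difference(b_vec, s0_vec)           # |b| cos(theta)
projected = baseline_length * math.sin(theta)      # |b| sin(theta)
print(delta_l, projected)
```

The path difference and the projected baseline are the two perpendicular components of $\mathbf{b}$, so their squares should sum to $|\mathbf{b}|^2$.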


As the source moves on the celestial sphere (due to Earth's rotation), the apparent baseline length will change (see $\S$ 4.5.1 ➞ for how astronomers capitalise on this). This means that keeping the source at phase centre (i.e. continuously following a source moving with the celestial sphere - effectively pointing the telescope) requires adapting our compensation for the OPL delay to the variation of the projected baseline as a function of time.

Warning: in practice, the antennas are usually steerable dishes with a directional response (or beam), which is NOT isotropic. To measure a signal coming from a direction $\mathbf{s_0}$ ([Fig. 4.3.4 ⤵](#vis:fig:434)), we therefore physically point these antennas towards the desired direction in order to maximize our antenna response. See [$\S$ 7.5 ➞](../7_Observing_Systems/7_5_primary_beam.ipynb).

That just about wraps it up for the simplest scenario: a two-element interferometer looking at a single point source. Let us now consider plane waves coming from two unrelated point sources in the directions $\mathbf{s_0}$ and $\mathbf{s}$ in the sky.

Figure 4.3.5: Two receivers receiving plane waves from directions $\mathbf{s_0}$ and $\mathbf{s}$ in the sky.

In this case, the two wavefronts are spatially and temporally incoherent. The signals from both sources will therefore not interfere amongst themselves. The emission from different parts of the sky will thus add linearly: the net signal detected by the interferometer will be the sum of the signal detected from each source.

Again, let us take $\mathbf{s_0}$ as the phase center of our interferometer. Arbitrary directions $\mathbf{s}$ can be given w.r.t. $\mathbf{s_0}$ through the difference vector $\boldsymbol{\sigma}=\mathbf{s}-\mathbf{s_0}$ (as illustrated in Fig. 4.3.5 ⤵). For small $|\boldsymbol{\sigma}|$, this vector $\boldsymbol{\sigma}$ lies approximately in the plane tangent to the celestial sphere in the direction of $\mathbf{s_0}$. This vector therefore defines the location of a source in the tangent sky plane, relative to the reference direction $\mathbf{s_0}$.

We will later use $\boldsymbol{\sigma}$ to derive the response of an interferometer at any position in that plane.

Given the directions ($\mathbf{s}, \mathbf{s_0}$), the optical path differences (OPDs) are obtained via a dot product with $\mathbf{b}$. We can compute them directly by computing the dot product between $\mathbf{b}$ and $\boldsymbol{\sigma}$.

$$\mathbf{b} \cdot \boldsymbol{\sigma} = \mathbf{b} \cdot (\mathbf{s} - \mathbf{s_0}) = \text{OPD}_{\mathbf{s}}-\text{OPD}_{\mathbf{s_0}}$$
Warning: Unlike the emitting case where we considered two spatially separated sources which emit perfectly coherent and in phase waves (see [$\S$ 4.3.1 ⤵](#vis:sec:431)), the two incoming signals received by $R_1$ (or $R_2$) DO NOT INTERFERE. The waves arriving at $R_1$ (or at $R_2$) come from different (and therefore spatially incoherent) sources of the sky. It is critical to understand that there is no equivalence between the emitting case and the receiving case.
Warning: For the moment, the vectors $\mathbf{s}$ and $\mathbf{s_0}$ are unit vectors, defined by angles only. Therefore, the length of $\boldsymbol{\sigma}$ is a projected angular distance on the sky (expressed in terms of the direction cosines ($l$, $m$, $n$) defined in [$\S$ 3.4 ➞](../3_Positional_Astronomy/3_4_direction_cosine_coordinates.ipynb)).
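The relation $\mathbf{b} \cdot \boldsymbol{\sigma} = \text{OPD}_{\mathbf{s}}-\text{OPD}_{\mathbf{s_0}}$ can be illustrated with a short 2-D sketch. The helper names `unit` and `dot`, and the chosen baseline and angles, are assumptions for this example. It also shows that, for a small offset, $\boldsymbol{\sigma}$ is nearly perpendicular to $\mathbf{s_0}$, i.e. it lies close to the tangent plane.

```python
import math

def unit(theta):
    """Unit direction vector at angle theta (radians) from the baseline."""
    return (math.cos(theta), math.sin(theta))

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

b_vec = (100.0, 0.0)                          # assumed baseline, metres
s0 = unit(math.radians(60.0))                 # assumed phase centre
s = unit(math.radians(60.5))                  # assumed nearby source
sigma = tuple(a - b for a, b in zip(s, s0))   # sigma = s - s0

opd_difference = dot(b_vec, s) - dot(b_vec, s0)
print(dot(b_vec, sigma), opd_difference)

# For small |sigma|, sigma is nearly perpendicular to s0.
print(dot(sigma, s0))
```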

4.3.2 First approach to radio interferometry

Instead of considering the interference of two linearly superposed waves at a point $P$ as we did with Fig. 4.3.1 ⤵, let us now consider the net signal measured by each antenna. In the radio regime, the receivers are composed of conductors which collect an EM wave and convert it into a voltage. This voltage has a time-varying amplitude and phase, which are related to those of the initial EM-wave (specifically, the relationship between the two is determined by the antenna's impedance).

These voltage signals are the waves that we work with as radio astronomers doing interferometry.

From the configuration presented in Fig. 4.3.5 ⤵, we now try to understand what information can be extracted from different combinations of the measured voltages. We have seen that the extra optical path travelled by the plane wave will induce a time delay between the receivers. We need to take this information into account when combining the antenna signals.

There are two ways we can combine these signals:

  • by addition, to form a sum interferometer - a $\sum$-interferometer
  • by multiplication, to form a product interferometer - a $\prod$-interferometer.


Let us define the measured voltages, assuming they are recorded at the same frequency $\omega$:

$$V_1=V_{01} \cos (\omega t + \varphi_1) \quad V_2 = V_{02} \cos (\omega t + \varphi_2)$$

Let us use $\mathbf{s_0}$ as a reference direction for the incoming wave and $R_2$ as the reference antenna. To do so, we shift our starting time $t_0$ such that $V_2$ has a phase of zero at the origin. We can then recast the equations of our signal as a function of $\Delta L$:

\begin{align} V_1&=V_{01} \cos (\omega t + \varphi_1 - \varphi_2), \quad &V_2 = V_{02} \cos (\omega t)\\ \Leftrightarrow V_1&=V_{01} \cos (\omega (t + \frac{\Delta L}{c})), \quad &V_2 = V_{02} \cos (\omega t) \end{align}

We will assume that the maximum amplitudes of our voltages are all identical: $V_{01}=V_{02}=V_0$. Let us now look at each type of interferometer in further detail.

4.3.2.1 The $\sum$-interferometer

As per $\S$ 4.3.1.1 ⤵, we can derive the result of summing both signals.

$$V_\sum = V_1 + V_2 = V_{01} \cos{(\omega t + \varphi_1)} + V_{02} \cos{(\omega t + \varphi_2)}$$

We can keep following $\S$ 4.3.1.1 ⤵ and compute the amplitude of the sum: $$A=\sqrt{(V_1 + V_2)^2}= \;... \;=\sqrt{2 V_0^2(1+ \cos\Delta \Phi)} \text{ with } \Delta\Phi = \varphi_1-\varphi_2$$

We can see that the sum's amplitude is modulated by the time delay between signals arriving at $R_1$ and $R_2$.

The problem with the $\sum$-interferometer is that the additive constant term $2 V_0^2$ is not known a priori, and cannot be easily removed. It represents the average value of the product of the voltages between $R_1$ and $R_2$.

In the $\prod$-interferometer, this problem takes care of itself: this unknown factor becomes a multiplying factor, which can be normalized with ease. This is why the $\Pi$-interferometer is more commonly used. For the remainder of this chapter, we will focus on the $\Pi$-interferometer.

The reader wanting more information on the $\sum$-interferometer is heartily encouraged to refer to Kraus $\S$6-20.

4.3.2.2 The $\prod$-interferometer with a $\cos$ correlator

The product of two signals can be achieved through correlation. A correlator is a device which multiplies two voltages. Generally, the correlator performs some averaging in time: this is done to reduce noise in the final measurement. We will assume that the time over which the signal is averaged is long enough to smooth out the fast oscillation associated with $\omega t$. This is equivalent to filtering the signals with a low-pass filter, which is the kind of filter used to remove the fast-varying component of a signal.

In other words, the correlation is the product of the voltage of two antennas averaged over a time $t$:

$$C= \langle V_1 V_2 \rangle_t$$

where $\langle \cdot \rangle_t$ is the time-averaging operator and $\tau=\frac{\Delta L}{c}$.

\begin{align} C&=\langle V_{01} V_{02} \cos{(\omega t)} \cos{[\omega (t + \frac{\Delta L}{c}) ]}\rangle_t \\ &=\langle V_{01} V_{02} \cos{(\omega t)} \cos{[\omega (t + \tau) ]} \rangle_t\\ &= V_0^2 \frac{\langle \cos(2 \omega t + \omega\tau)+\cos (\omega \tau)\rangle_t}{2} \end{align}

$$\boxed{\displaystyle C =\frac{V_0^2}{2}\cos{(\omega \tau)}=\frac{V_0^2}{2}\cos{\left(2 \pi \frac{\Delta L}{\lambda}\right)}}$$

The product-to-sum identity for cosines used in this derivation is explained in [$\S$ A.1 ➞](../0_Introduction/2_Appendix.ipynb).
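The time-averaging step of the cosine correlator can be reproduced numerically. The sketch below multiplies two sampled cosine voltages and averages over an integer number of periods; the function name `cos_correlator`, the sampling parameters and the numerical values are assumptions for illustration, not a model of a real correlator.

```python
import math

def cos_correlator(v0, omega, tau, n_periods=200, samples_per_period=64):
    """Time-average V1*V2 over an integer number of periods."""
    period = 2 * math.pi / omega
    n = n_periods * samples_per_period
    dt = period / samples_per_period
    total = 0.0
    for i in range(n):
        t = i * dt
        v1 = v0 * math.cos(omega * (t + tau))  # signal at R1, delayed by tau
        v2 = v0 * math.cos(omega * t)          # signal at the reference R2
        total += v1 * v2
    return total / n

v0 = 2.0                      # assumed voltage amplitude
omega = 2 * math.pi * 1.0e6   # assumed angular frequency (1 MHz)
tau = 0.2e-6                  # assumed geometric delay in seconds

measured = cos_correlator(v0, omega, tau)
expected = 0.5 * v0**2 * math.cos(omega * tau)
print(measured, expected)
```

The fast term at $2\omega t$ averages out, leaving only the slowly varying term that depends on the delay.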

In $\S$ 4.3.1.1 ⤵, we have shown that the interference fringe pattern depends on the observer's position. We also found a phase-dependent equation to describe an individual fringe.

Similarly, we see here that the strength of the correlation between two measured signals depends on the OPD, i.e. the time delay between the signals' arrival at each antenna. Since the time delay depends on the projected baseline, the resulting fringe pattern is also spatially dependent on the direction of the source $\mathbf{s_0}$.

From the previous equation, we can define the fringe phase: the phase of the fringe pattern at the direction $\mathbf{s_0}$. It is defined as: $\phi = \omega\tau=\frac{\omega}{c} |\mathbf{b}| \cos \theta= \frac{2\pi}{\lambda} |\mathbf{b}| \cos \theta$

As the Earth rotates, the source being observed will slowly rotate on the celestial sphere. The delay tracking system will compensate for this motion by adjusting $\Omega$. As a consequence, the projected baseline, and therefore $\tau$, will slowly change.

We can characterize the speed of the phase variation with the fringe rate, defined as the derivative of the fringe phase w.r.t. $\theta$: $\left|\frac{d\phi}{d\theta}\right|=\frac{2\pi}{\lambda} |\mathbf{b}| \left|\sin\theta\right|=\frac{2\pi}{T_f}$ where $T_f$ is the fringe period. Observing this fringe rate can contribute to the precise localisation of a source (see $\S$ 4.4.2.3 ➞).
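The fringe rate expression can be compared with a finite-difference derivative of the fringe phase. This is only a sketch: the function names `fringe_phase` and `fringe_rate`, the baseline length in wavelengths and the angle are assumed values.

```python
import math

def fringe_phase(b_over_lambda, theta):
    """Fringe phase phi = 2*pi*(|b|/lambda)*cos(theta), in radians."""
    return 2 * math.pi * b_over_lambda * math.cos(theta)

def fringe_rate(b_over_lambda, theta):
    """Analytic |d(phi)/d(theta)| = 2*pi*(|b|/lambda)*|sin(theta)|."""
    return 2 * math.pi * b_over_lambda * abs(math.sin(theta))

b_over_lambda = 500.0        # assumed baseline length in wavelengths
theta = math.radians(40.0)   # assumed angle between baseline and source
step = 1e-5                  # small angular step for the finite difference

# Central finite difference of the fringe phase with respect to theta.
numeric = abs(fringe_phase(b_over_lambda, theta + step)
              - fringe_phase(b_over_lambda, theta - step)) / (2 * step)
analytic = fringe_rate(b_over_lambda, theta)
fringe_period = 2 * math.pi / analytic   # T_f, in radians of theta

print(numeric, analytic, fringe_period)
```

With these assumed values, the fringe period is the angular separation between successive fringes, which shrinks as the projected baseline $|\mathbf{b}|\sin\theta$ grows.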

First result:

  • The correlation of a signal measured by a baseline - i.e. between two measurements - can be associated to its source's position with respect to the physical baseline.

  • If $\lambda$ is small enough compared to the projected baseline $|\mathbf{b}|\sin\theta$, the phase of the correlation can track the position of a source to a high degree of accuracy.

  • As a consequence, the correlation is sensitive to variations of the spatial period $T_f$, which means that a 2-element interferometer acts as a spatial filter for this spatial frequency.

In other words, each baseline probes a spatial frequency, or a single Fourier coefficient as described in $\S$ 4.1 ➞. This is the fundamental link to be made here: each baseline probes one spatial frequency, which is determined by its length and direction. By stacking baselines - i.e. probing more and more Fourier coefficients - we can start reconstituting the full sky!

To do so, we need to define what we mean by "the full sky". The sky consists of spatially incoherent sources (point-like and/or extended), which can be described using a continuous function $I_\nu(\boldsymbol{\mathbf{s}})$. This function is called the sky brightness distribution. The response (net measured signal) of an interferometer "looking at" a collection of incoherent sources is the sum of the responses of each individual source. This is expected: as outlined before, signals which are not coherent do not interfere, and thus add linearly. We thus define the total correlation between $R_1$ and $R_2$ on the whole sky by summing over all observable directions:

$$ C_{\cos}= \int_\Omega k(\mathbf{s}) \cos(2 \pi \frac{\Delta L(\mathbf{s})}{\lambda})d\mathbf{s}$$

with $k(\mathbf{s})$ an implicit multiplying factor depending on $I_\nu(\mathbf{s})$.

When using this definition, $C_{\cos}$ is called a cosine correlator.

However, as we know from Fourier analysis, any function can be expressed as the sum of an even function and an odd function. In practice, the cosine correlator will only be sensitive to the even component of the sky brightness function - it only probes the $a_n$ components of its Fourier series as defined in $\S$4.1 ➞. To measure the odd component of the sky brightness function, we must build a sine correlator.

Note: If the sky brightness is an odd function composed of two sources of opposite brightness, then:
$I_\nu(\mathbf{s}_+)=+1$ and $I_\nu(\mathbf{s}_-)=-1$. In these two directions (marked "+" and "-"), which are symmetric about the phase centre, we will have $\Delta L_+ = -\Delta L_-$ and $k_+=-k_-$.
If we use a *cosine correlator*, we will therefore measure: \begin{align} C_{\cos} &= k_+ \cos{2 \pi \frac{\Delta L_{+}}{\lambda}} + k_- \cos{2 \pi \frac{\Delta L_{-}}{\lambda}}\\ &=k_+ \cos{2 \pi \frac{\Delta L_{+}}{\lambda}} - k_+ \cos{2 \pi \frac{\Delta L_{+}}{\lambda}} \\ &= 0 \end{align} since the cosine is even. The *odd* components of the sky brightness will therefore not be measured if we rely only on a *cosine* correlator.

4.3.2.3 What is the coefficient $k(\mathbf{s})$ ?

If we assume that the voltage output of our antennas is a linear function of the electromagnetic (EM) wave they measure, the voltage measured by each antenna is the integral of the contribution from all directions over their respective field of view $\Omega_1$ and $\Omega_2$.

$$V_1= \int_{\Omega_1} V_{1_\Omega} d \Omega_1 \quad V_2= \int_{\Omega_2} V_{2_\Omega} d\Omega_2$$

In other words, since the individual signals of each source are incoherent, they do not interfere and add linearly: the net signal of all the sources in the field of view of each antenna is the sum of the signal from all individual sources.

$V_{i_\Omega}$ is proportional to the power received from direction $\mathbf{s}$:

$V_{i_\Omega} \propto \frac{1}{2} A_{\text{eff}}(\mathbf{s}) I_\nu(\mathbf{s})\Delta\nu d\Omega$

where $A_{\text{eff}}$ is the effective area of the antenna, $I_\nu$ the brightness distribution, $\Delta \nu$ the bandwidth of observation and $d\Omega$, an element of observing solid angle (see $\S$ 1.2 ➞ and $\S$ 4.4 ➞ for definitions). The $\frac{1}{2}$ comes from the fact that, usually, only one polarization is measured by an antenna.

Therefore, $$C_{\cos}= \langle V_1 V_2 \rangle_t= \langle \int_{\Omega_1} V_{1_\Omega} d \Omega_1 \int_{\Omega_2} V_{2_\Omega} d \Omega_2 \rangle_t$$

We assume that all the emission from the sky is spatially incoherent, meaning that the only non-zero correlation is between two signals coming from the same direction. We can therefore swap the integrals with the time averaging brackets:

$$C_{\cos}= \langle V_1 V_2 \rangle_t= \int_{\Omega} \langle V_{1_\Omega} V_{2_\Omega} \rangle_t d \Omega \propto \Delta \nu \int_\Omega A(\mathbf{s}) I_\nu(\mathbf{s}) \cos(2 \pi \nu \frac{\Delta L}{c})d \Omega$$

For the sake of convenience, we can assume - for now - that

$k(\mathbf{s}) \propto \Delta \nu A(\mathbf{s}) I_\nu (\mathbf{s}) $

4.3.2.4 The $\prod$-interferometer with a $\sin$ correlator

A straightforward way to create a $\sin$ correlator is by introducing an artificial phase delay of $\frac{\pi}{2}$ in one of the two signal paths, since

$\sin(x) = \cos(x-\pi/2)$

If we introduce this $\frac{\pi}{2}$ phase delay to the path of signal $V_2$, we obtain the following:

\begin{align} V_1 &=V_{01} \cos (\omega (t + \tau))\\ V_2 &= V_{02} \cos (\omega t + \frac{\pi}{2} )\\ C&=\langle V_{01} V_{02} \cos{(\omega t + \frac{\pi}{2})} \cos{[\omega (t + \tau) ]} \rangle_t\\ &= V_0^2 \frac{\langle \cos(2 \omega t + \omega\tau + \frac{\pi}{2})+\cos (\omega \tau - \frac{\pi}{2})\rangle_t}{2} \end{align}

$$\boxed{C_{\sin} =\frac{V_0^2}{2}\sin{(\omega \tau)}}$$

As you might expect, this correlator will only be sensitive to the odd part of the sky brightness. This can be verified just as we did for the $\cos$ correlator's sensitivity.

Note: If the sky brightness is an even function composed of two sources of same brightness:
i.e. $I_\nu(\mathbf{s}_+)=+1$ and $I_\nu(\mathbf{s}_-)=+1$. In these two directions (marked "+" and "-"), which are symmetric about the phase centre, we will have $\Delta L_+ = -\Delta L_-$ and $k_+=k_-$.
With a *sine correlator*, we will therefore have: \begin{align} C_{\sin} &= k_+ \sin{2 \pi \frac{\Delta L_{+}}{\lambda}} + k_- \sin{2 \pi \frac{\Delta L_{-}}{\lambda}}\\ &= k_+ \sin{2 \pi \frac{\Delta L_{+}}{\lambda}} - k_+ \sin{2 \pi \frac{\Delta L_{+}}{\lambda}}\\ &= 0 \end{align} since the sine is odd. The *even* part of the sky brightness will **not** be measured with a *sine* correlator.

By implementing two parallel correlators ($\cos$ & $\sin$), one can measure the correlation of both the even and odd parts of the sky brightness $I_\nu$.

$$\boxed{C_{\cos} =\frac{V_0^2}{2}\cos{\omega \tau}} \quad \boxed{C_{\sin} =\frac{V_0^2}{2}\sin{\omega \tau}}$$
Note: This $\frac{\pi}{2}$ delay can be implemented electronically, in either analog or digital manner.
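The complementary sensitivity of the two correlators can be illustrated with a pair of point sources. The sketch below assumes that the two sources sit symmetrically about the phase centre, so that their path differences have opposite signs, and works in 1-D with small offsets. The function name `correlators`, the weights and the offsets are hypothetical values for this example.

```python
import math

def correlators(sources, b_over_lambda):
    """Return (C_cos, C_sin) for point sources given as (k, sigma) pairs.

    k is the weight of a source and sigma its offset from the phase centre,
    so that Delta L / lambda = b_over_lambda * sigma (1-D, small angles).
    """
    c_cos = sum(k * math.cos(2 * math.pi * b_over_lambda * sig)
                for k, sig in sources)
    c_sin = sum(k * math.sin(2 * math.pi * b_over_lambda * sig)
                for k, sig in sources)
    return c_cos, c_sin

b_over_lambda = 5.0   # assumed baseline length in wavelengths
sigma = 0.03          # assumed offset of each source from the phase centre

even_sky = [(+1.0, +sigma), (+1.0, -sigma)]   # same brightness on both sides
odd_sky = [(+1.0, +sigma), (-1.0, -sigma)]    # opposite brightness

print("even sky:", correlators(even_sky, b_over_lambda))
print("odd sky: ", correlators(odd_sky, b_over_lambda))
```

In this sketch the sine output should vanish for the even sky and the cosine output for the odd sky, which is why both correlators are needed to recover an arbitrary brightness distribution.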

4.3.3 The complex correlator and the complex visibility

4.3.3.1 First definition of Visibility

The previous section addressed the practical implementation of an interferometer observing an arbitrary sky. As we derived two interferometers with $\cos$ and $\sin$ correlators, we can combine these operations into one complex correlator:

\begin{eqnarray} \underline{C}&=& C_{\cos} - \imath C_{\sin}\\ \underline{C}&=& \sum_{n=0}^{\infty} I_n \cos(2 \pi \frac{\Delta L_n}{\lambda})- \imath \sum_{n=0}^{\infty} I_n \sin(2 \pi \frac{\Delta L_n}{\lambda})\\ \underline{C}&=&\sum_{n=0}^{\infty} I_n \left[ \cos(2 \pi \frac{\Delta L_n}{\lambda}) - \imath \sin(2 \pi \frac{\Delta L_n}{\lambda}) \right] \\ \underline{C}&=&\sum_{n=0}^{\infty} I_n e^{-\imath 2\pi \frac{\Delta L_n}{\lambda}} \end{eqnarray}

And in the continuous case,

$$\boxed{\underline{C}=\Delta \nu \int_{\Omega} A(\mathbf{s}) I_\nu(\mathbf{s}) e^{-\imath 2\pi \frac{\mathbf{b}\cdot\mathbf{s}}{\lambda}} d\Omega}$$
Note that this is, in fact, equivalent to $\mathscr{F}\{A(\mathbf{s}) I_\nu(\mathbf{s})\}$. In other words: each correlation corresponds to the **Fourier transform** of the *sky brightness distribution* multiplied by the *instrument response* or, equivalently, to the Fourier transform of the sky brightness distribution **convolved** with that of the instrument response.
Warning: The **minus sign** in the definition of the complex correlation is a matter of **convention** for the rest of the course.
$\Omega$ is a solid angle which describes the full sky or the field of view of the receiver.
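The discrete form of the complex correlator can be checked against the separate cosine and sine sums. This is a minimal sketch for a handful of point sources; the function name `complex_correlation` and the source list are assumptions made for illustration.

```python
import cmath
import math

def complex_correlation(sources):
    """Sum I_n * exp(-2*pi*i*dl_n) over (I_n, dl_n) pairs.

    dl_n is the path difference of source n in wavelengths, Delta L_n / lambda.
    """
    return sum(i_n * cmath.exp(-2j * math.pi * dl_n) for i_n, dl_n in sources)

# Assumed (I_n, Delta L_n / lambda) pairs for three point sources.
sources = [(1.0, 0.10), (0.5, -0.35), (2.0, 1.25)]

c = complex_correlation(sources)
c_cos = sum(i_n * math.cos(2 * math.pi * dl) for i_n, dl in sources)
c_sin = sum(i_n * math.sin(2 * math.pi * dl) for i_n, dl in sources)

# With the minus-sign convention, Re(C) = C_cos and Im(C) = -C_sin.
print(c, c_cos, c_sin)
```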

$\underline{C}$ is a complex quantity associated with the measurement of $I_\nu$ with baseline $\mathbf{b}$. In the previous equations, the exponential term operates as a spatial filter (on $I_\nu(\mathbf{s})$), whose characteristics are dependent on the direction $\mathbf{s_0}$, the physical baseline $\mathbf{b}$ and the wavelength $\lambda$. This spatial filter is associated with a 1D fringe pattern which can be plotted from the real and imaginary parts of the exponential.

Fig. 4.3.6 ⤵ shows the 1D fringe pattern (or array response) derived from the complex correlation. The real and imaginary parts are associated with the cosine correlator and sine correlator fringe patterns, respectively. For simplicity, we assume that $A(\mathbf{s})$ and $I_\nu(\mathbf{s})$ are constant.

Note that as baseline length increases, the spatial frequency it probes becomes larger: you get more peaks for the same angular distance on the sky. In other words, the longer your baseline, the more resolution you can achieve: this is consistent with the diffraction limit as defined for baselines - which is good, because the reverse would be troubling!
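The growth in the number of fringes with baseline length can be quantified by counting the sign changes of the cosine fringe pattern across the sky. The sketch below assumes constant $A$ and $I_\nu$ and a baseline expressed in wavelengths; the function name `count_zero_crossings` and the sampling density are choices made for this example.

```python
import math

def count_zero_crossings(b_over_lambda, n_samples=20001):
    """Count sign changes of cos(2*pi*(|b|/lambda)*cos(theta)), theta in [0, pi]."""
    values = [
        math.cos(2 * math.pi * b_over_lambda
                 * math.cos(math.pi * i / (n_samples - 1)))
        for i in range(n_samples)
    ]
    # A sign change between neighbouring samples marks one zero of the fringe.
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

for b_over_lambda in (5, 10, 20):
    print(b_over_lambda, count_zero_crossings(b_over_lambda))
```

Doubling the baseline length should double the number of zero crossings between the two horizons, i.e. the fringes become twice as fine.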



In [2]:
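import numpy as np
import matplotlib.pyplot as plt

# Angle between the baseline and the source direction, in radians
theta = np.radians(np.linspace(0., 180., 1000))
blambda = 5  # baseline length in units of wavelength, |b|/lambda

# Complex fringe pattern exp(-2*pi*i*(|b|/lambda)*cos(theta))
fringe = np.exp(-1j * 2 * np.pi * blambda * np.cos(theta))

# Absolute values of the real (cosine) and imaginary (sine) parts
fringe_cos_abs = np.abs(np.real(fringe))
fringe_sin_abs = np.abs(np.imag(fringe))

f = plt.figure(figsize=(6, 6))
plt.axes(polar=True)
plt.xlabel('theta')
plt.plot(theta, fringe_cos_abs, 'b', label="Cosine correlator")
plt.plot(theta, fringe_sin_abs, 'r', label="Sine correlator")
plt.legend(loc="lower center")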


Out[2]:
<matplotlib.legend.Legend at 0x37b6b10>

Figure 4.3.6: Fringe patterns derived from the cosine (blue) and sine (red) correlators. 90$^\circ$ corresponds to the zenith, and the 0$^\circ$-180$^\circ$ line corresponds to the ground.

We have defined the response of a complex correlator towards any direction $\mathbf{s}$. We will now introduce the direction of the phase centre (i.e. $\mathbf{s_0}$) by introducing the vector $\boldsymbol{\sigma}$ such that:

\begin{eqnarray} \mathbf{s}=\mathbf{s_0}+\boldsymbol{\sigma} \end{eqnarray}

Then

\begin{eqnarray} \underline{C}&=& \Delta \nu e^{-\imath 2\pi \frac{\mathbf{b}\cdot\mathbf{s_0}}{\lambda}} \int_{\Omega} A(\mathbf{s}) I_\nu(\mathbf{s}) e^{-\imath 2\pi \frac{\mathbf{b}\cdot \boldsymbol{\sigma}}{\lambda}} d\Omega \end{eqnarray}

The complex correlation $\underline{C}$ is the integral of the sky brightness as seen through a spatial filter. In other words, it gives us the intensity of a given Fourier component, defined by the baseline length and direction. From the expression of this complex correlation we define the complex visibility, $\underline{V}$, for which we can define an amplitude and a phase: $$\underline{V}=|V|e^{\imath\phi_V}=\int_{\Omega}A(\boldsymbol{\sigma}) I_\nu(\boldsymbol{\sigma}) e^{-\imath 2\pi \frac{\mathbf{b}\cdot \boldsymbol{\sigma}}{\lambda}} d\Omega$$
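
As a hedged illustration of the statement that a visibility is one Fourier component of the sky, the sketch below evaluates the visibility integral numerically for a toy 1D sky. Everything in it is an assumption made for the example: a small field of view so that $d\Omega$ reduces to $d\sigma$, a uniform response $A=1$, a Gaussian source of width `width`, and the hypothetical helper name `visibility_1d`. The numerical result is compared with the analytic Fourier transform of a Gaussian.

```python
import numpy as np

def visibility_1d(b_lambda, sigma, brightness, response=None):
    """Approximate V = integral of A * I * exp(-2j*pi*(b/lambda)*sigma) d(sigma).

    sigma must be a uniformly spaced 1D grid (radians, small angles assumed).
    """
    if response is None:
        response = np.ones_like(sigma)  # assumed uniform response, A = 1
    d_sigma = sigma[1] - sigma[0]
    fringe = np.exp(-2j * np.pi * b_lambda * sigma)
    return np.sum(response * brightness * fringe) * d_sigma

# Toy sky: a Gaussian source centred on the phase centre (sigma = 0).
width = 0.01                                  # source width in radians (assumed)
sigma = np.linspace(-0.2, 0.2, 20001)
brightness = np.exp(-sigma**2 / (2 * width**2))

for b_lambda in (0.0, 10.0, 30.0):
    numeric = visibility_1d(b_lambda, sigma, brightness)
    # Analytic Fourier transform of the Gaussian, for comparison.
    analytic = width * np.sqrt(2 * np.pi) * np.exp(-2 * (np.pi * width * b_lambda)**2)
    print(b_lambda, abs(numeric), analytic)
```

The amplitude drops as the baseline grows: a longer baseline probes a higher spatial frequency, at which this extended source has less power.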

Warning: The position vector is now $\boldsymbol{\sigma}$, which depends on our choice of $\mathbf{s_0}$. The visibility is a quantity which depends on the chosen phase centre, and will be different if we decide to define our phase centre elsewhere.

As a consequence, the complex correlation and the visibility are linked by: $$\underline{C}= \Delta \nu e^{-\imath 2\pi \frac{\mathbf{b}\cdot\mathbf{s_0}}{\lambda}} |V|e^{\imath \phi_V} = \Delta \nu |V| e^{\imath (\phi_V - 2\pi \frac{\mathbf{b}\cdot\mathbf{s_0}}{\lambda})}$$
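
The relation between $\underline{C}$ and $\underline{V}$ can be checked on the simplest possible sky. The sketch below assumes a single point source of flux `flux`, a uniform response, and a 1D geometry in which $\mathbf{b}\cdot\mathbf{s}/\lambda$ reduces to $(b/\lambda)\cos\theta$. All names and numerical values are illustrative only.

```python
import numpy as np

b_lambda = 5.0                     # baseline length in wavelengths (assumed)
delta_nu = 1.0e6                   # bandwidth in Hz (assumed)
flux = 2.0                         # point-source flux, arbitrary units (assumed)
l_src = np.cos(np.radians(70.0))   # direction cosine of the source (assumed)

def correlation(l_source):
    """Complex correlation of a point source: the fringe is evaluated at s."""
    return delta_nu * flux * np.exp(-2j * np.pi * b_lambda * l_source)

def visibility(l_source, l_phase_centre):
    """Visibility of the same source: the fringe is evaluated at sigma = s - s0."""
    sigma = l_source - l_phase_centre
    return flux * np.exp(-2j * np.pi * b_lambda * sigma)

for theta_0 in (90.0, 60.0):
    l_0 = np.cos(np.radians(theta_0))
    vis = visibility(l_src, l_0)
    # Rebuild C from V and the phase-centre term, then compare with C itself.
    rebuilt = delta_nu * np.exp(-2j * np.pi * b_lambda * l_0) * vis
    print(theta_0, abs(vis), np.degrees(np.angle(vis)),
          np.isclose(rebuilt, correlation(l_src)))
```

For this point source, moving the phase centre changes the phase of the visibility but not its amplitude, while the correlation $\underline{C}$ rebuilt from either visibility is the same.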

The trick is now to determine the amplitude $|V|$ and phase $\phi_V$ of the visibility (i.e. physical quantities) from those of the measured visibility. This is a calibration problem: we seek to go from measured quantities to physical quantities. In interferometry, we calibrate by comparing our measurement to a predefined model of the sky. The example given here is a simplified version of the more general framework, which will be introduced in $\S$ 8 ➞ and in the references therein.

4.3.3.2 Effect of the bandwidth, necessity of delay tracking

Measuring a correlation over a finite bandwidth $\Delta\nu$ will introduce a decorrelation of the two signals, because the fringe term $e^{-\imath 2 \pi \frac{\mathbf{b}\cdot \boldsymbol{\sigma}}{\lambda}}$ depends on $\nu$ through $\lambda = c/\nu$. The fringe patterns at the different wavelengths in the band do not line up, so their superposition washes out the interference: we lose temporal coherence. This reduces the contrast of the fringe pattern and therefore the amplitude of the correlation.

In other words, because we are sampling a finite segment of the bandwidth rather than the full frequency spectrum of our EM-wave, we will experience boundary problems similar to those described in the Fourier series section of $\S$ 4.1 ➞.

Let us investigate the impact of decorrelation on the fringe pattern, and how to account for it.

At a given frequency $\nu$, within an infinitesimal bandwidth $d \nu$, the correlator produces the output:

$$d\underline{C}= |V| e^{\imath (\phi_V - 2\pi \nu \tau)} d \nu$$

with $\tau= \frac{\mathbf{b}\cdot \mathbf{s_0}}{c}$

If we sum the response over a finite band $\Delta \nu$ centered at $\nu_0$:

\begin{align} \underline{C}&= |V| \int_{\nu_0-\Delta \nu /2}^{\nu_0+\Delta \nu /2}e^{\imath (\phi_V - 2\pi \nu \tau)} d\nu \\ &= |V| \int_{\nu_0-\Delta \nu /2}^{\nu_0+\Delta \nu /2} \cos (\phi_V - 2\pi \nu \tau)d\nu + \imath |V| \int_{\nu_0-\Delta \nu /2}^{\nu_0+\Delta \nu /2} \sin (\phi_V - 2\pi \nu \tau)d\nu\\ &= |V| \left[\frac{\sin (\phi_V - 2\pi \nu \tau)}{-2\pi \tau}\right]_{\nu_0-\Delta \nu /2}^{\nu_0+\Delta \nu /2} + \imath |V| \left[\frac{-\cos (\phi_V - 2\pi \nu \tau)}{-2\pi \tau}\right]_{\nu_0-\Delta \nu /2}^{\nu_0+\Delta \nu /2} \\ &= \frac{|V|}{-2\pi \tau} \left[ \sin (\phi_V - 2\pi (\nu_0 +\frac{\Delta \nu}{2}) \tau) - \sin (\phi_V - 2\pi (\nu_0 -\frac{\Delta \nu}{2}) \tau)\right] + \imath |V| \left[ \dots\right]_{\nu_0-\Delta \nu /2}^{\nu_0+\Delta \nu /2}\\ &= \frac{|V|}{2\pi \tau} \left[ 2 \cos{(\phi_V - 2\pi \nu_0 \tau) \sin{\pi\Delta\nu \tau}} \right] + \imath \frac{|V|}{2\pi\tau} \left[ 2 \sin{(\phi_V - 2\pi \nu_0 \tau) \sin{\pi\Delta\nu \tau}}\right]\\ &= |V|\Delta\nu \frac{\sin{\pi\Delta\nu \tau}}{\pi \Delta\nu \tau} e^{\imath (\phi_V - 2\pi \nu_0 \tau)}\\ &=|V|\Delta \nu \; \text{sinc}(\pi\Delta\nu \tau) e^{\imath (\phi_V - 2\pi \nu_0 \tau)} \end{align}

We see that the amplitude of the correlation is modulated by a sinc function, which depends on the bandwidth $\Delta\nu$ and the delay $\tau$ defined in the direction of the phase center. We still want to observe radio signals over some bandwidth: this effect is thus unwelcome! One way to remove this damping factor is to cancel $\tau$, the delay between the two signals. This is done by injecting a compensating delay $\tau_c=\tau$ in the signal path of the receiver that first received the signal.
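
The sinc attenuation derived above can be verified numerically. The sketch below is an illustration under assumed values (observing frequency, bandwidth, baseline length and source direction are arbitrary choices, as are the variable and function names). It sums $d\underline{C}$ over the band with a midpoint rule and compares the result with the closed form. Note that `np.sinc(x)` is the normalised $\sin(\pi x)/(\pi x)$, so the $\text{sinc}(\pi\Delta\nu\tau)$ of the text is written `np.sinc(delta_nu * tau)`.

```python
import numpy as np

c = 299792458.0              # speed of light in m/s
nu_0 = 1.4e9                 # band centre in Hz (assumed)
delta_nu = 10.0e6            # bandwidth in Hz (assumed)
baseline = 100.0             # baseline length in m (assumed)
theta_0 = np.radians(60.0)   # angle between baseline and phase centre (assumed)
amp_v, phi_v = 1.0, 0.3      # visibility amplitude and phase (assumed)

def band_correlation(tau, n_channels=20000):
    """Sum dC = |V| exp(i(phi_V - 2 pi nu tau)) d(nu) over the band (midpoint rule)."""
    d_nu = delta_nu / n_channels
    nu = nu_0 - delta_nu / 2 + (np.arange(n_channels) + 0.5) * d_nu
    return amp_v * np.sum(np.exp(1j * (phi_v - 2 * np.pi * nu * tau))) * d_nu

def closed_form(tau):
    """|V| delta_nu sinc(pi delta_nu tau) exp(i(phi_V - 2 pi nu_0 tau))."""
    phase = np.exp(1j * (phi_v - 2 * np.pi * nu_0 * tau))
    return amp_v * delta_nu * np.sinc(delta_nu * tau) * phase

tau = baseline * np.cos(theta_0) / c   # geometric delay towards the phase centre

print(np.isclose(band_correlation(tau), closed_form(tau), rtol=1e-6))
print(abs(band_correlation(tau)) / delta_nu)   # uncompensated: attenuated
print(abs(band_correlation(0.0)) / delta_nu)   # residual delay 0 after tau_c = tau
```

With these assumed values the uncompensated amplitude is only a small fraction of $|V|\Delta\nu$, whereas a residual delay of zero recovers the full amplitude.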

The direction $\mathbf{s_0}$ will slowly move in the sky as the Earth rotates. The phase center should thus be tracked by imposing a time-dependent delay $\tau_c$ in the appropriate signal path to compensate for $\tau$.

As a result, the fringes and their envelopes will always follow the phase center.

In the following code, we simulate the impact of observing with an interferometer over a finite bandwidth $\Delta \nu$. To do that, we sum only the exponential factors over a range of wavelengths $\Delta \lambda$, which mimics the computation of $\underline{C}$. The resulting figure is Fig. 4.3.7 ⤵.


In [5]:
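theta = np.linspace(0., 180., 1000)   # direction in degrees (90 = zenith)
blambda = np.linspace(4, 6, 100)      # b/lambda sampled across the band

# Sum the fringe term over the band: one complex exponential per b/lambda value.
# Named "fringe" rather than "filter" to avoid shadowing the Python builtin.
fringe = np.zeros(theta.size, dtype=complex)
for bl in blambda:
    fringe += np.exp(-1j*2*np.pi*bl*np.cos(np.radians(theta)))

# Plot absolute values, since the radius of a polar plot should be non-negative
fringe_cos_abs = np.abs(np.real(fringe))
fringe_sin_abs = np.abs(np.imag(fringe))

f = plt.figure(figsize=(10, 10))
plt.axes(polar=True)

plt.xlabel('theta')
plt.plot(np.radians(theta), fringe_cos_abs, 'b', label="Cosine correlator")
plt.plot(np.radians(theta), fringe_sin_abs, 'r', label="Sine correlator")
plt.legend(loc="lower center")
#plt.savefig("sinc1.eps")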


Out[5]:
<matplotlib.legend.Legend at 0x39e8bd0>

Figure 4.3.7: Directional fringe pattern caused by the finite bandwidth $\Delta \nu$ of our observation, which modulates the fringe amplitude with a sinc.

From this plot, we can see that the fringe pattern has a privileged direction. Observing over a finite bandwidth has turned the previous fringe pattern (which had a low directivity) into a directive fringe pattern, pointing towards the zenith. If the source is observed at a different elevation, the array response will be lower. Using delay tracking, we can follow a source as it moves on the celestial sphere. This guarantees that the maximum array response is pointed toward the phase center.
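
To illustrate how a compensating delay steers this directive response, the sketch below repeats the band-summed fringe calculation for an assumed phase centre at $\theta_0 = 60^\circ$. In this 1D toy model, inserting $\tau_c = \tau(\theta_0)$ amounts to replacing $\cos\theta$ by $\cos\theta - \cos\theta_0$ in the fringe phase. The function name `band_response` and all values are illustrative assumptions.

```python
import numpy as np

def band_response(theta_deg, b_lambda, theta_0_deg=90.0):
    """Amplitude of the fringe term summed over a range of b/lambda values.

    theta_0_deg is the direction whose geometric delay is compensated;
    90 degrees (zenith) corresponds to no compensation at all.
    """
    l_dir = np.cos(np.radians(theta_deg))
    l_0 = np.cos(np.radians(theta_0_deg))
    # Rows: b/lambda values across the band; columns: directions.
    phase = -2j * np.pi * np.outer(b_lambda, l_dir - l_0)
    return np.abs(np.exp(phase).sum(axis=0))

theta = np.linspace(0.0, 180.0, 1801)
b_lambda = np.linspace(4.0, 6.0, 100)

untracked = band_response(theta, b_lambda)
tracked = band_response(theta, b_lambda, theta_0_deg=60.0)

# Direction of maximum response without and with delay compensation.
print(theta[np.argmax(untracked)], theta[np.argmax(tracked)])
```

Without compensation the response peaks at the zenith. With the compensating delay, the peak moves to the chosen phase centre.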

Note: Delay tracking is the core principle of *phased arrays*, which electronically point the maximum array response towards the direction of interest by introducing delays between the receivers.

Warning: This is only one of several direction-dependent effects which occur in radio interferometry. The aim of this final section was to better explain the need for delay tracking.


4.3.4 Conclusion

In this section, we have discussed the simple case of a 1D 2-element interferometer. From the correlation of the two voltage signals, we created a complex visibility. This is the measured quantity obtained by observing the sky through a spatial filter whose characteristics depend on the projected baseline.

In the next section, we will continue to work with the complex visibility in a more general scope, by combining the notations defined in $\S$ 4.2 ➞ with the notions defined in the present section.

We will investigate the physical quantity which can be recovered through the sampling of the complex visibility function.

Important things to remember
• Different parts of the sky are incoherent: they do not interfere at the receiver level.
• The interference pattern is "created" by a special combination of the antenna signals.
• The important quantity to consider in an interferometer is the *projected* baseline which will depend on the time and direction of observation.
