In [ ]:
from IPython.display import HTML
HTML('../style/course.css')
HTML('../style/code_toggle.html')
The next decade will be an exciting one for HI science, since several major new surveys that will cover cosmological volumes are underway or set to begin in 2017 on SKA precursor facilities. These range from wide-field, relatively shallow surveys of the local Universe in emission and the distant one in absorption (WALLABY on ASKAP, FLASH on ASKAP, Apertif Shallow Northern Survey on WSRT, MALS on MeerKAT) to deeper pencil-beam surveys that will detect gas in and around galaxies out to $z \approx 1$ (CHILES on the VLA, LADUMA on MeerKAT, DINGO on ASKAP).
A common feature of the majority of the surveys mentioned above is that they are blind: they will uniformly survey patches of sky in which the HI content is not a priori known at the requisite sensitivity and resolution (this is, of course, why one would want to survey there in the first place!). Predicting the number and nature of detections anticipated in these surveys is therefore critical for estimating the scientific returns from these large investments of limited telescope time, the impact of modifications to survey plans, and the resources required to analyse the resulting datasets (e.g. Duffy et al. 2012; Maddox et al. 2016; Giovanelli & Haynes 2016). Such predictions can also be useful a posteriori, to understand the properties of survey detections and to control for systematics.
This chapter provides general guidelines for predicting HI detections for blind HI surveys of cosmological volumes. All reliable predictions require a thorough understanding of the planned survey's basic parameters. With these parameters in hand, we discuss predicting survey detections using the HI mass function, with a mention of other techniques. We then turn to predicting resolved galaxy properties.
The key ingredients for making robust predictions of HI detections in a given survey are the survey sensitivity and volume. In this section, we discuss the essential parameters that determine these quantities. Several excellent resources exist that describe the fundamental properties of radio telescopes, both in print (e.g. Thompson, Moran and Swenson 2001) and online (e.g. Essential Radio Astronomy, Radio Astronomy Tools and Techniques). Here, we tailor the more general (and complete) discussion in those resources to the specific case of HI survey predictions.
A survey's sensitivity is determined by a combination of the observational strategy and the instrument characteristics: the longer one integrates and the more sensitive the instrument, the more sensitive the resulting observation. If the observations are not limited by either astronomical confusion (e.g. Condon et al. 2012) or instrument systematics (e.g. Grobler et al. 2014), the defining equation for the point-source sensitivity ($\sigma_{\rm{PS}}$, in Jy) per spectral channel of a radio interferometer at the pointing centre is the radiometer equation: $$\sigma_{\rm{PS}} = \frac{SEFD}{n_c \sqrt{n_{pol}\,N(N-1)\,t_{int}\,\delta_\nu}}$$ where $SEFD$ (in Jy) is the system equivalent flux density of an individual antenna (defined as the flux density of a radio source that doubles the system temperature), $n_c$ is the correlator efficiency ($n_c \sim 1$ for modern systems), $n_{pol}$ is the number of polarization products included in the image ($n_{pol}=2$ for the vast majority of HI surveys), $N$ is the number of antennas, $t_{int}$ (in seconds) is the net integration time of the observation, and $\delta_\nu$ (in Hz) is the spectral channel width. This equation illustrates that the noise in an observation scales linearly with the SEFD (i.e. inversely with the antenna sensitivity), and with the inverse square root of both the observing time and the spectral channel width. For a given instrument configuration, estimates of $\sigma_{\rm{PS}}$ and the assumed spectral channel width $\delta_\nu$ are essential for predicting HI survey detections. Clearly, smoothing the data to a lower spectral resolution (larger $\delta_\nu$) will lower the noise per channel and so increase a survey's sensitivity.
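As a minimal sketch of how the radiometer equation above might be evaluated, the function below implements it term by term. The function name, argument names, and the example array parameters are illustrative assumptions, not values taken from any of the surveys discussed here.

```python
import math

def point_source_sensitivity(sefd_jy, n_ant, t_int_s, chan_width_hz,
                             n_pol=2, n_c=1.0):
    """Per-channel point-source rms (Jy) from the radiometer equation.

    sefd_jy       : SEFD of a single antenna, in Jy
    n_ant         : number of antennas N
    t_int_s       : net integration time, in seconds
    chan_width_hz : spectral channel width delta_nu, in Hz
    n_pol         : number of polarization products in the image
    n_c           : correlator efficiency
    """
    if n_ant < 2:
        raise ValueError("an interferometer needs at least two antennas")
    radicand = n_pol * n_ant * (n_ant - 1) * t_int_s * chan_width_hz
    return sefd_jy / (n_c * math.sqrt(radicand))

# Hypothetical array (assumed numbers): 64 antennas, SEFD = 400 Jy,
# 8 hours on source, 26 kHz channels.
sigma_ps = point_source_sensitivity(400.0, 64, 8 * 3600.0, 26e3)
print(f"sigma_PS = {sigma_ps * 1e3:.3f} mJy per channel")
```

Calling the function with a larger `chan_width_hz` shows directly how smoothing in frequency lowers the per-channel noise, while the confusion and systematics limits mentioned above are not modelled.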
It is important to recognize that the equation above provides the instrument sensitivity per synthesized beam per spectral channel; the flux from sources that are spatially or spectrally resolved relative to this beam or channel width will therefore be distributed across more than one pixel of the resulting dataset. In particular, the spatial scales to which an interferometer is sensitive are determined by the distribution of its antennas and the weighting applied during imaging (see here for a detailed derivation). The antenna configuration thus sets the angular resolution (and hence the synthesized beam) of an observation, which is required to determine the likelihood that survey detections will be spatially resolved.
The column density sensitivity of a survey dictates the detectability of spatially resolved sources (see Chapter 4). The column density sensitivity ($\sigma_{NHI}$, in $\rm{atoms} \, \rm{cm}^{-2}$) per spectral channel of an observation is given by: $$\sigma_{NHI} = 2.23 \times 10^{24} \frac{\sigma_{\rm{PS}} \, \delta_\nu}{\theta_a \, \theta_b \, \nu_c^2} ,$$ where $\theta_a$ and $\theta_b$ are the full widths at half maximum of the Gaussian beam along its major and minor axes, respectively (typically $\theta_a$=$\theta_b$ for planar arrays), $\nu_c$ is the observing frequency in GHz, and the other variables have the same definitions as above. The column density sensitivity of an observation therefore depends on both the point-source sensitivity and the synthesized beam shape; the latter property is critical for estimating resolved galaxy detections. Note that different weights applied to survey data at the imaging stage will change $\theta_a$ and $\theta_b$, and therefore the column density sensitivity.
Given the point-source and surface-brightness sensitivities of each pointing, the second important factor governing the HI detections in a survey is the survey volume. The areal coverage of the survey is clearly an important factor in determining this volume. For surveys that will tile the sky to achieve near-uniform sensitivity in the survey footprint (e.g. WALLABY on ASKAP), the survey area is a trivial factor to include when computing HI detections. The issue is more complex for surveys with variable sensitivity due to the sparseness of individual pointings (e.g. MALS on MeerKAT). A survey's volume is also determined by its spectral coverage. While the large bandwidths of modern correlators rarely limit survey volume, radio frequency interference (RFI) can limit a survey's sensitivity at some frequencies at which redshifted HI lines may fall (due to the elimination of corrupted baselines, effectively reducing $N$ in the sensitivity equation; Fernandez et al. 2013), or blind the survey altogether (e.g. Catinella et al. 2015). Spectral windows with strong RFI outside the protected 1.4 GHz bands should therefore be taken into account when estimating the survey volume for high-redshift HI detections.
We now turn to techniques for predicting HI detections in a survey of known sensitivity and volume. For surveys that probe a cosmological volume (i.e. a volume much larger than the one spanned by typical large-scale structures in the galaxy distribution; e.g. Martin et al. 2012), the HI mass function (HIMF) is an important tool for making such predictions (e.g. Giovanelli et al. 2005; Duffy et al. 2012; Maddox et al. 2016). In this section we focus on exploiting the HIMF to make predictions for the simplest case of spatially unresolved sources, and also include a brief description of other approaches. We defer a discussion of resolved disks to the next section.
Chapter 3 defines the HIMF as the number density of HI detections as a function of their HI mass, and it is now well-measured for HI masses in the range $M_{\rm{HI}} \gtrsim 10^{6.5}\,M_{\odot}$ in the local universe (e.g. Jones et al. 2016; our knowledge of the low-mass end of the HIMF is limited by the sensitivity of extant surveys to these faint sources, of which there are many). The number of HI detections as a function of HI mass in a given volume slice of an HI survey can therefore be obtained by integrating the HIMF: $$N_{gal}(M_{\rm{HI}}) = \int \phi(M_{\rm{HI}}) \, dV,$$ where $\phi(M_{\rm{HI}})$ is the HIMF and $dV$ is a volume element; the equation above may be evaluated analytically from Schechter function fits to the HIMF, although extrapolations of galaxy number counts beyond the range spanned by the data (e.g. $M_{\rm{HI}} \leq 10^6 M_{\odot}$) should be treated with caution. As described in Chapter 1, the HI mass of a source scales with the flux integral of the detected line and distance squared. Given the distance of a volume slice in a survey, the detectability of an HI source predicted to lie in that volume from the HIMF therefore depends on the expected linewidth and the survey sensitivity. Specifically, approximating the flux integral of the line as $S_{\rm{peak}} \, W$, so that $M_{\rm{HI}} \sim 235.6 \, D^2 \, S_{\rm{peak}} \, W$, the expected signal-to-noise $SN_{\rm{PS}}$ of a (point-source) detection in a given spectral channel is: $$SN_{\rm{PS}} \sim \frac{S_{\rm{peak}}}{\sigma_{\rm{PS}}} \sim \frac{M_{\rm{HI}}}{235.6 \, D^2 \, W \, \sigma_{\rm{PS}}},$$ where $M_{\rm{HI}}$ is the HI mass of the source in $M_{\odot}$, $D$ is the distance to the source in Mpc, $S_{\rm{peak}}$ is the peak flux of the source in mJy, $W$ is the characteristic width of the source in km/s, and $\sigma_{\rm{PS}}$ in mJy is the survey sensitivity for that channel width. The simplest approach for estimating $SN_{\rm{PS}}$ is to adopt a fixed $W$ for all detections (e.g. Maddox et al. 2016).
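The HIMF integral above can be sketched numerically as follows. This is an illustration under stated assumptions rather than a definitive recipe: the HIMF is taken to be a Schechter function expressed per dex of HI mass, it is assumed constant across the volume slice (so that the integral over $dV$ reduces to a multiplication by the slice volume), and the slice volume is Euclidean, which is adequate only at low redshift. All function names and the Schechter parameters in the example are placeholders, not fitted values from the works cited in the text.

```python
import math

def schechter_per_dex(log_m, phi_star, log_m_star, alpha):
    """Schechter HIMF in Mpc^-3 dex^-1 at log10(M_HI / Msun) = log_m."""
    x = 10.0 ** (log_m - log_m_star)
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

def shell_volume(area_deg2, d_min_mpc, d_max_mpc):
    """Euclidean volume (Mpc^3) of a survey slice between two distances."""
    solid_angle_sr = area_deg2 * (math.pi / 180.0) ** 2
    return solid_angle_sr / 3.0 * (d_max_mpc ** 3 - d_min_mpc ** 3)

def expected_counts(log_m_lo, log_m_hi, volume_mpc3,
                    phi_star, log_m_star, alpha, n_steps=200):
    """Expected number of galaxies with log_m_lo < log10(M_HI) < log_m_hi.

    Integrates the HIMF over log-mass with the trapezoid rule and
    multiplies by the slice volume (HIMF assumed uniform in the slice).
    """
    step = (log_m_hi - log_m_lo) / n_steps
    grid = [log_m_lo + i * step for i in range(n_steps + 1)]
    phi = [schechter_per_dex(m, phi_star, log_m_star, alpha) for m in grid]
    number_density = step * (sum(phi) - 0.5 * (phi[0] + phi[-1]))
    return volume_mpc3 * number_density

# Placeholder Schechter parameters and survey slice (assumed, illustrative).
volume = shell_volume(area_deg2=1000.0, d_min_mpc=50.0, d_max_mpc=100.0)
n_gal = expected_counts(8.0, 11.0, volume,
                        phi_star=5e-3, log_m_star=10.0, alpha=-1.3)
print(f"slice volume = {volume:.3e} Mpc^3, expected galaxies = {n_gal:.0f}")
```

The counts returned here are the galaxies present in the slice, not the detections: a detectability cut based on the signal-to-noise estimate still has to be applied per mass bin.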
For the inclined, rotating disks that are expected to make up the majority of HI sources (see Chapter 4), $W$ depends on the rotation speed in the outer disk $V_{rot}$, the disk inclination $i$ along the line-of-sight, and the velocity dispersion $\sigma_V$ of the disk: $$W \sim 2 \sqrt{ (V_{rot} \, \sin{i})^2 + \sigma_V^2 }. $$
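The linewidth relation above can be combined with the HI mass to flux conversion to estimate a point-source signal-to-noise. The sketch below assumes a boxcar line profile of height $S_{\rm{peak}}$ and width $W$, so that $M_{\rm{HI}} \sim 235.6 \, D^2 \, S_{\rm{peak}} \, W$ with $S_{\rm{peak}}$ in mJy; the function names, the default velocity dispersion, and the example galaxy are illustrative assumptions.

```python
import math

def line_width(v_rot_kms, incl_deg, sigma_v_kms=10.0):
    """Characteristic linewidth W (km/s) of an inclined rotating disk."""
    sin_i = math.sin(math.radians(incl_deg))
    return 2.0 * math.sqrt((v_rot_kms * sin_i) ** 2 + sigma_v_kms ** 2)

def peak_flux_mjy(m_hi_msun, dist_mpc, width_kms):
    """Peak flux density (mJy) of a boxcar profile containing m_hi_msun."""
    return m_hi_msun / (235.6 * dist_mpc ** 2 * width_kms)

def point_source_snr(m_hi_msun, dist_mpc, width_kms, sigma_ps_mjy):
    """Per-channel signal-to-noise of an unresolved source."""
    return peak_flux_mjy(m_hi_msun, dist_mpc, width_kms) / sigma_ps_mjy

# Hypothetical galaxy (assumed numbers): M_HI = 1e9 Msun at 100 Mpc,
# V_rot = 100 km/s, inclination 60 degrees, sigma_PS = 0.5 mJy per channel.
w = line_width(100.0, 60.0)
snr = point_source_snr(1e9, 100.0, w, 0.5)
print(f"W = {w:.1f} km/s, per-channel S/N = {snr:.2f}")
```

Because the flux is spread over $W$, a face-on disk of the same mass has a narrower, taller profile and a higher per-channel signal-to-noise than an edge-on one.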
The velocity dispersions of (warm) HI disks range from $8$ to $15\,\rm{km}\,\rm{s}^{-1}$, where the upper end of this range is most appropriate for dwarfs; the contribution of $\sigma_V$ to $W$ is therefore negligible for most galaxies with $i \gtrsim 30^{\circ}$. Estimates of $V_{rot}$ for a given $M_{\rm{HI}}$ can be gleaned from known scaling relations (e.g. the baryonic Tully-Fisher relation and the gas fraction relation; see Chapter 3), by abundance matching the HI velocity function, or by matching dark matter to HI masses in semi-analytic galaxy formation models (e.g. Duffy et al. 2012). We note that self-consistent hydrodynamical simulations that predict the HI properties of the galaxy population from first principles are emerging, and may become an important future tool for predicting HI detections (e.g. Lagos et al. 2014).
The HIMF is a reliable tool for predicting HI detections in the local universe ($z \leq 0.2$), but the evolution in this relation should be considered when predicting detections at higher redshift. The $dN/dz$ relations of Obreschkow et al. (2009) provide one estimate of this evolution.
We end this section by reiterating that the approach anchored by the HIMF described above is appropriate only for predicting number counts of gas-rich galaxies in cosmological volumes; it will break down when predicting detections from gas-poor systems that may not be well-represented in extant HI surveys (e.g. Huang et al. 2012; Brown et al. 2015), for galaxies with $M_{\rm{HI}} \lesssim 10^{6.5} \, M_{\odot}$ that are not reliably probed by these surveys (e.g. Jones et al. 2016), or for predicting detections in volumes where cosmic variance may play an important role. In these cases, it may be more appropriate to predict detections by using the colour-magnitude diagram and colour–HI mass relations (e.g. Brown et al. 2015), scaling relations derived specifically for low-mass galaxies (e.g. Bradford et al. 2015), or by using the distribution of galaxies mapped at another wavelength (e.g. in optical surveys) as a starting point.
As mentioned above, for resolved galaxy counts it is not the point-source sensitivity of the survey that is important but the column density sensitivity. This means that there is a trade-off between point-source sensitivity and the angular size of the beam: a larger beam yields better column density sensitivity, as long as the galaxy remains resolved, i.e. the synthesized beam is still completely filled with emission.
It also means that, under the assumption of constant angular size of the synthesized beam and constant channel width with frequency, the column density sensitivity of the survey is independent of distance. This is because the physical area sampled by the synthesized beam, and hence the HI mass it encloses at a given column density, increases with the distance squared, whereas the flux received from that mass decreases by the same factor.
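This cancellation can be written out in a few lines, as a sketch that assumes a beam of fixed solid angle $\Omega_{\rm{beam}}$ completely filled with emission of uniform column density $N_{\rm{HI}}$, and that neglects cosmological corrections (so it applies at low redshift). The symbols $A_{\rm{beam}}$, $M_{\rm{beam}}$ and $S_{\rm{beam}}$ are introduced here purely for illustration: $$A_{\rm{beam}} = \Omega_{\rm{beam}} \, D^2, \qquad M_{\rm{beam}} \propto N_{\rm{HI}} \, A_{\rm{beam}} = N_{\rm{HI}} \, \Omega_{\rm{beam}} \, D^2, \qquad S_{\rm{beam}} \propto \frac{M_{\rm{beam}}}{D^2} = N_{\rm{HI}} \, \Omega_{\rm{beam}}.$$ The factors of $D^2$ cancel, so the flux per beam produced by a given column density, and hence the column density that corresponds to a given noise level, does not depend on $D$.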
Therefore the distance limit in HI surveys for resolved galaxies is the point at which a galaxy no longer completely fills a synthesized beam, i.e. when it is no longer resolved. This means that it is the size of a galaxy that determines when a survey is no longer sensitive to that type of galaxy. It also means that merely increasing the synthesized beam size to gain sensitivity would quickly make the survey insensitive to the smallest galaxies. As explained in Chapter 4, the HI size of a galaxy is closely related to its HI mass, and hence to its HI flux.
The simplest way of obtaining estimated source counts of resolved galaxies is to construct a catalogue of fluxes and distances of galaxies in a cosmological volume, either by populating an N-body simulation with the HIMF or by taking an existing single-dish all-sky HI survey such as HIPASS (Meyer et al. 2004). The HI mass–HI diameter relation (e.g. Wang et al. 2016) then allows us to calculate the physical sizes of these galaxies, which can be converted to their angular sizes. It is then straightforward to compare these sizes to the angular size of the synthesized beam, match the cosmological volume used to that of the proposed survey, and count the resolved galaxies. When using an existing low-resolution survey the key parameter is the distance of the galaxy. However, as HI observations are line observations, the recessional velocity will always be known when the flux is measured, and hence this poses no problem beyond the Local Volume.
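The procedure described above can be sketched as follows. The code assumes a log-linear HI mass to HI diameter relation of the form $\log D_{\rm{HI}} = a \log M_{\rm{HI}} + b$ with $D_{\rm{HI}}$ in kpc; the default coefficients are those we recall from Wang et al. (2016) and should be checked against that paper before use. It also assumes the small-angle approximation with no cosmological corrections, and takes "resolved" to mean a disk spanning at least a chosen number of beam widths. The function names and the toy catalogue are hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def hi_diameter_kpc(m_hi_msun, slope=0.506, intercept=-3.293):
    """HI diameter (kpc) from a log-linear mass-size relation.

    Default coefficients are assumed values; verify before use.
    """
    return 10.0 ** (slope * math.log10(m_hi_msun) + intercept)

def angular_size_arcsec(diameter_kpc, dist_mpc):
    """Small-angle angular size; ignores cosmological corrections."""
    return diameter_kpc / (dist_mpc * 1.0e3) * ARCSEC_PER_RAD

def count_resolved(catalogue, beam_fwhm_arcsec, min_beams=1.0):
    """Count (M_HI [Msun], distance [Mpc]) entries spanning >= min_beams beams."""
    n_resolved = 0
    for m_hi_msun, dist_mpc in catalogue:
        theta = angular_size_arcsec(hi_diameter_kpc(m_hi_msun), dist_mpc)
        if theta >= min_beams * beam_fwhm_arcsec:
            n_resolved += 1
    return n_resolved

# Toy catalogue (made-up entries) and a hypothetical 30 arcsec beam.
toy_catalogue = [(1e10, 10.0), (1e9, 50.0), (1e7, 100.0)]
print(count_resolved(toy_catalogue, beam_fwhm_arcsec=30.0, min_beams=3.0))
```

Raising `min_beams` gives a stricter definition of "resolved"; the appropriate threshold depends on the science goal and is left as a free parameter here.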
The method described in the previous paragraph completely ignores the velocity structure and the orientation of the HI disk on the sky. These parameters are important because the sensitivity limit is calculated assuming that the HI column completely fills the synthesized beam and is entirely contained within the channel width. The first assumption is generally satisfied, as we are interested in resolved galaxies. However, if a galaxy is edge-on ($i=90^{\circ}$), this might not be true for thinner galaxies at larger distances. More importantly, because HI disks rotate, they have a velocity structure, and the observed velocities are spread over a range set by the inclination and the maximum rotational velocity, i.e. $V_{\rm{obs}}=V_{\rm{rot}} \times \sin(i)$. This means that for edge-on galaxies in general the emission is spread out over many different channels, and the exact distribution could even depend on the shape of the rotation curve. On the other hand, because we are considering disks, the column densities observed in an edge-on galaxy are much higher than those in a face-on galaxy, as the path through the galaxy is much longer in the radial direction.
In order to incorporate all these effects into the estimates and to ensure that the estimates are accurate, one would need to include phenomenologically correct HI disks in N-body simulations. For example, disks with realistic rotation curves and surface brightness distributions that are physically consistent, e.g. more massive galaxies have a higher rotation amplitude, would already be a tremendous improvement on the simple estimates discussed earlier. These disks should then be distributed with random orientations, and subsequently the flux received in a given velocity bin should be compared to the column density sensitivity of the survey in order to determine whether this galaxy would be detected. A first attempt at such work can be found in Duffy et al. (2012).
Even though the main issues of whether a galaxy can be detected are discussed in the previous paragraphs, the inclusion of more parameters, such as warps, bars and velocity dispersion, in the phenomenological representation of the disk would increase the accuracy of the predictions. How much so is of course dependent on whether the survey is sensitive to column densities at which these structures manifest themselves.