Including nuisance parameters in a model provides a way to explicitly account for and marginalize over systematic uncertainties.
This can be as simple as assigning a prior distribution to some quantity that would otherwise remain fixed, or it can involve a more explicit expansion of the model.
Often nuisance parameters represent latent variables - things that are logically or physically part of a model, but which cannot be directly measured.
Recall, from the Bayes Theorem chunk, our simple measurement of a galaxy's flux based on a number of detected counts:
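In sketch form, with any conversion from flux to expected counts absorbed into a known factor (written here as $C$ purely for illustration), that model might be summarized as

$F \sim$ some prior

$\mu = C\,F$

$N \sim \mathrm{Poisson}(\mu)$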
Now let's expand the model to account for the fact that we actually measure counts from both the galaxy and the background. Being good astronomers, we also remembered to take a second measurement of a source-free (background-only) patch of sky.
Subscripts: $b$ for background, $g$ for galaxy, $s$ for the science observation that includes both.
The background quantities could be regarded as nuisance parameters here - we need to account for our uncertainty in $F_b$, but measuring it isn't the point of the analysis.
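In the same sketch notation, and again writing the assumed-known flux-to-counts conversions for the two exposures as $C_s$ and $C_b$ (illustrative names), the expanded model might read

$F_g \sim$ some prior, $\quad F_b \sim$ some prior

$\mu_s = C_s\,(F_g + F_b)$, $\quad N_s \sim \mathrm{Poisson}(\mu_s)$

$\mu_b = C_b\,F_b$, $\quad N_b \sim \mathrm{Poisson}(\mu_b)$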
Often, especially in physics, the model for a data set naturally takes a hierarchical structure.
In statistics, this is related to the concept of exchangeability - as far as we know, individual sources of a given class are equivalent (until we measure them).
In practice, the hierarchy usually takes the form of common priors for the individual measured sources.
Let's modify the previous example as follows:
$n(x) = \phi^\ast x^\alpha e^{-x}; \quad x=\frac{L}{L^\ast}$
Here $n$ is the number density of galaxies as a function of luminosity (this is the Schechter function).
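As a concrete (Python) illustration of how this distribution could generate a galaxy population, note that $x^\alpha e^{-x}$ is a Gamma density with shape $\alpha+1$, which is normalizable only for $\alpha > -1$; the parameter values below are made up:

```python
import numpy as np

# Draw galaxy luminosities from the distribution n(x) ~ x**alpha * exp(-x),
# with x = L/Lstar. For alpha > -1 this is a Gamma distribution with shape
# alpha + 1; for alpha <= -1 (common in real luminosity functions) a faint-end
# cutoff in L would be needed to make the distribution proper.
alpha = -0.5          # illustrative value, not from the text
Lstar = 1.0e10        # illustrative characteristic luminosity (arbitrary units)
rng = np.random.default_rng(17)

x = rng.gamma(shape=alpha + 1.0, scale=1.0, size=1000)
L = Lstar * x         # mock sample of galaxy luminosities
```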
Now - what does the PGM for our experiment look like?
The exercises below will get you working with PGMs and model expressions (the $A\sim B$ type relationships) for a few scenarios that illustrate advanced (but common) features of data analysis.
When drawing a PGM, do note which parameters would need to have a prior defined, but don't worry about explicitly including prior parameters that are not part of the described model.
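If you prefer to typeset PGMs rather than sketch them by hand, the daft Python package can be used. Below is a minimal sketch of the galaxy-plus-background model from above; node names and positions are arbitrary choices, and the constructor details may differ between daft versions:

```python
import daft
import matplotlib.pyplot as plt

# Toy PGM for the galaxy + background counts model: shaded nodes are observed.
pgm = daft.PGM()
pgm.add_node("Fg", r"$F_g$", 1, 2)                 # latent galaxy flux
pgm.add_node("Fb", r"$F_b$", 2, 2)                 # latent background flux
pgm.add_node("Ns", r"$N_s$", 1, 1, observed=True)  # science counts
pgm.add_node("Nb", r"$N_b$", 2, 1, observed=True)  # background counts
pgm.add_edge("Fg", "Ns")
pgm.add_edge("Fb", "Ns")
pgm.add_edge("Fb", "Nb")
pgm.render()
plt.show()
```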
Produce the PGM and model expressions for the following scenario:
Type Ia supernova luminosities are "standardizable". This means that, using observable properties of the lightcurve (color $c$ and "stretch" $s$), SNIa luminosities can be corrected to a scale where they are almost equal. Almost, but not quite - there is still a residual intrinsic scatter, even after this correction.
The problem, described in a non-generative way, is as follows. We measure a number of counts for each of many SNIa, and can use the known telescope properties to convert that to a flux. The flux is related to the SNIa's intrinsic luminosity through a distance that is a function of its redshift, $z$. The color and stretch of the SNIa are also measured; these relate the actual luminosity to the "standard" SNIa luminosity. However, there is a scatter, which we assume to be Gaussian, between the actual luminosities of SNIa and this standard luminosity. The parameters that we ultimately care about are the standard luminosity, $\bar{L}$, and the intrinsic scatter, $\sigma$.
In practice, a common use of SNIa involves adding free cosmological parameters in the distance-redshift relation, but we can leave that out here.
It's pretty common for multiple quantities that can be derived from the same observation to have correlated measurement errors. X-ray spectroscopy of galaxy clusters provides a good example. From this type of data, we can measure gas temperatures and estimate the total gravitating mass (assuming the gas is in equilibrium with the gravitational potential). However, temperature is one of the observables needed in the hydrostatic equation to compute the mass (the other being gas density), so any uncertainty in the temperature measurement propagates to the mass.
Naturally, determining the gas temperature as a function of total mass for galaxy clusters is interesting. A standard model for this is a power law, $T\propto M^\beta$, with log-normal intrinsic scatter, $\tau$ (see the previous exercise). Equivalently, and slightly more simply, we could write $t=\alpha+\beta\,m$, with Gaussian scatter $\tau$, for $t=\log(T)$ and $m=\log(M)$.
The new wrinkle is that we have correlated measurement errors on $t$ and $m$. We'll follow the common practice of assuming that the measurement errors are Gaussian and that we can use a fixed estimate of them - but instead of simply having some $\sigma_t$ and $\sigma_m$ in the problem, we actually have an estimated $2\times2$ covariance matrix, $\Sigma$.
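To make this concrete, here is a short simulation sketch (all numbers are made up) of how correlated Gaussian measurement errors on $t$ and $m$ with a shared covariance matrix could arise:

```python
import numpy as np

# Joint Gaussian measurement errors on (t, m) = (log T, log M); the
# off-diagonal element of Sigma encodes the correlation between them.
t_true, m_true = 0.8, 14.5           # assumed true values (illustrative)
Sigma = np.array([[0.010, 0.012],
                  [0.012, 0.020]])   # assumed measurement covariance
rng = np.random.default_rng(3)

t_obs, m_obs = rng.multivariate_normal([t_true, m_true], Sigma)
```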
Draw the PGM and write down the model for the above scenario.
Future time-domain photometric surveys will discover too many supernovae to follow up and type spectroscopically. This means that if we want to do our SNIa experiment from the "intrinsic scatter" example, our data set will be contaminated by other supernova types.
Here we simplify things a little by eliminating the standardization process, and just assume that each of the $m$ SN types has luminosities that follow its own distribution. We also introduce group membership as a latent parameter.
Based on the model written out below, draw a PGM and come up with a more complete narration of what's going on than is provided above.
Some priors on $q_i$, $\bar{L}_i$, $\sigma_i$, for $i=1,\ldots,m$.
For each SN observed,
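(The per-SN expressions below are a plausible sketch rather than a unique prescription: they assume Gaussian per-type luminosity distributions and use $g_j$, an illustrative symbol, for the latent type of the $j$th SN.)

$g_j \sim \mathrm{Categorical}(q_1,\ldots,q_m)$, i.e. $P(g_j = i) = q_i$

$L_j \sim \mathrm{Normal}(\bar{L}_{g_j}, \sigma_{g_j})$

with $L_j$ then connected to the observed data (fluxes or counts) as in the previous exercises.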
An extremely common task is fitting a linear model to data with measurement uncertainties in both the $x$ and $y$ values, as well as intrinsic scatter in the underlying relation. Non-Bayesian approaches were developed in the 1990s, but have serious flaws (e.g., they cannot fit for the scatter), whereas this problem is neatly addressed by hierarchical modeling (see the exercises).
For the also-common assumptions of Gaussian uncertainties and scatter, generic codes that fit these models using Gibbs sampling (see Efficient Monte Carlo) have been written.
Code | Language | multiple $x$'s | multiple $y$'s
---|---|---|---
linmix_err | IDL | no | no
mlinmix_err | IDL | yes | no
linmix | Python | no | no
lrgs | R | yes | yes
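For orientation, and independent of any particular package's interface, here is a sketch (with made-up numbers) of the generative model these codes address, namely a latent linear relation with intrinsic scatter observed through Gaussian errors in both variables:

```python
import numpy as np

# Forward-simulate the generic linear-regression-with-scatter model:
# a latent straight line plus Gaussian intrinsic scatter, observed with
# Gaussian measurement errors in both x and y. All numbers are illustrative.
rng = np.random.default_rng(8)
n = 50
alpha, beta, scatter = 1.0, 0.5, 0.2             # intercept, slope, intrinsic scatter

x_true = rng.normal(0.0, 1.0, size=n)            # latent x values
y_true = alpha + beta * x_true + rng.normal(0.0, scatter, size=n)

x_err, y_err = 0.1, 0.15                         # assumed measurement uncertainties
x_obs = x_true + rng.normal(0.0, x_err, size=n)  # what we actually measure
y_obs = y_true + rng.normal(0.0, y_err, size=n)
```

A fit with one of the codes above would then aim to recover the slope, intercept, and intrinsic scatter while marginalizing over the latent true values.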