Source detection is different in character from parameter estimation: we are less interested in the flux of the source than we are in its existence.

Detection is a model comparison problem:
- $H_0$: there is no source present
- $H_1$: there is a source present, with flux $f$ and position $(x,y)$

One way to quantify the significance of the source's detection is to calculate and compare the evidences for the two models, ${\rm Pr}(d\,|\,H_0)$ and ${\rm Pr}(d\,|\,H_1)$.
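Written out explicitly, the two evidences might look like the following. This is a sketch that assumes (as a simplification not stated above) that $H_0$ has no free parameters, for example a known background level, so that its evidence is just the likelihood of the data with no source present:

$$
{\rm Pr}(d\,|\,H_1) = \int {\rm Pr}(d\,|\,f,x,y,H_1)\;{\rm Pr}(f,x,y\,|\,H_1)\;df\,dx\,dy
$$

$$
\frac{{\rm Pr}(d\,|\,H_1)}{{\rm Pr}(d\,|\,H_0)} = \frac{\int {\rm Pr}(d\,|\,f,x,y,H_1)\;{\rm Pr}(f,x,y\,|\,H_1)\;df\,dx\,dy}{{\rm Pr}(d\,|\,H_0)}
$$

The evidence for $H_1$ is the likelihood averaged over the prior, which is why the prior enters the detection significance at all.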

Notice that these will involve marginalizing over the model parameter prior PDFs that you assigned in writing down $H_1$: the probability of getting the data given model $H_1$ depends on your prior uncertainty on its parameters (because that is part of $H_1$).

What does this mean? Widening the prior ranges on the source position and flux makes any given point in parameter space less probable a priori, so the detected model looks ever more contrived. This decreases the evidence for $H_1$ and makes the detection less significant (but only linearly, remember).
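To see this scaling in numbers, here is a minimal sketch of a one-parameter toy problem. It is not the full three-parameter source model: it assumes a single datum `d` with Gaussian noise `sigma`, a source described by its flux alone, and a flat flux prior on `[0, f_max]`. All of the names and values are hypothetical, chosen for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def evidence_h0(d, sigma):
    """Pr(d|H0): no source, so the datum is pure noise around zero."""
    return normal_pdf(d, 0.0, sigma)

def evidence_h1(d, sigma, f_max):
    """Pr(d|H1): Gaussian likelihood marginalized over a flat prior on [0, f_max]."""
    # Integral of N(d | f, sigma) over f in [0, f_max], done analytically
    integral = normal_cdf(d / sigma) - normal_cdf((d - f_max) / sigma)
    # The flat prior density is 1 / f_max
    return integral / f_max

# Toy values (assumed): a datum 5 sigma above zero
d, sigma = 5.0, 1.0
for f_max in (10.0, 100.0, 1000.0):
    ratio = evidence_h1(d, sigma, f_max) / evidence_h0(d, sigma)
    print(f"f_max = {f_max:7.1f}   evidence ratio = {ratio:.3e}")
```

Once `f_max` is well beyond the region where the likelihood is appreciable, the integral stops changing and the evidence for $H_1$ falls as `1 / f_max`: each tenfold widening of the prior costs a factor of ten in the evidence ratio.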

Likewise, the maximum value that the evidence for $H_1$ can take occurs when the prior PDF is a delta function at the maximum likelihood point. At this point, the evidence ratio equals the likelihood ratio used in classical statistics. The procedure there would be to approximate the test statistic (twice the log of this likelihood ratio) as being drawn from a $\chi^2$ distribution with 3 degrees of freedom (the difference in the number of free parameters between the two models), and ask for the probability of getting a likelihood ratio larger than the observed value by chance, if in fact there was no source present.
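The classical calculation can be sketched in a few lines. This assumes the usual convention that the test statistic is twice the difference in log-likelihood, and it uses the closed-form survival function of a $\chi^2$ distribution with 3 degrees of freedom so that only the standard library is needed. The function names and the log-likelihood values are hypothetical, for illustration only.

```python
import math

def chi2_sf_3dof(x):
    """Pr(X > x) for a chi-squared variable with 3 degrees of freedom."""
    if x <= 0.0:
        return 1.0
    # Closed form for 3 degrees of freedom:
    # erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

def lrt_p_value(lnL_max_h1, lnL_h0):
    """p-value for the likelihood ratio test of H1 (3 extra parameters) against H0."""
    test_statistic = 2.0 * (lnL_max_h1 - lnL_h0)
    return chi2_sf_3dof(test_statistic)

# Hypothetical log-likelihoods: maximum over (f, x, y) under H1, and under H0
lnL_max_h1, lnL_h0 = -100.0, -110.0
p = lrt_p_value(lnL_max_h1, lnL_h0)
print(f"test statistic = {2.0 * (lnL_max_h1 - lnL_h0):.1f}, p-value = {p:.2e}")
```

Note that no prior appears anywhere in this calculation: the only inputs are the maximized likelihood under $H_1$ and the likelihood under $H_0$.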

The Bayesian evidence ratio is more conservative, because it takes into account the uncertainties on the source parameters present before the data were taken, and remaining after they have been included. However, it's expensive to compute, and depends on the prior assignment made. We will see next week that the Fermi group opted for the classical route because of this second factor.

In the limit of high signal-to-noise, the conclusions about detection significance made by analysts in both groups should agree, because in both approaches the result is dominated by the narrow sampling distribution, which will overwhelm almost any prior assigned.
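A rough numerical illustration of this agreement, again using a one-parameter toy rather than the full source model: a single datum at a given signal-to-noise ratio, a flat flux prior of adjustable width (in units of the noise), and a classical $p$-value computed from a $\chi^2$ distribution with 1 degree of freedom. The toy ignores the boundary at zero flux, and all names and values are assumptions made for this sketch.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_evidence_ratio(snr, prior_width):
    """ln[Pr(d|H1)/Pr(d|H0)] for one datum at d = snr (noise sigma = 1),
    with a flat flux prior on [0, prior_width]."""
    integral = normal_cdf(snr) - normal_cdf(snr - prior_width)
    ln_ev1 = math.log(integral / prior_width)
    ln_ev0 = -0.5 * snr ** 2 - 0.5 * math.log(2.0 * math.pi)
    return ln_ev1 - ln_ev0

def p_value(snr):
    """Classical p-value: the test statistic snr**2 compared to chi-squared with 1 dof."""
    return math.erfc(snr / math.sqrt(2.0))

for snr in (1.0, 2.0, 3.0, 5.0, 8.0):
    narrow = log_evidence_ratio(snr, 10.0)
    wide = log_evidence_ratio(snr, 1000.0)
    print(f"S/N = {snr:3.1f}  ln(ratio) narrow prior = {narrow:7.2f}  "
          f"wide prior = {wide:7.2f}  p-value = {p_value(snr):.2e}")
```

Widening the prior by a factor of 100 shifts the log evidence ratio by roughly $\ln 100 \approx 4.6$ whatever the data. At low signal-to-noise that shift can change the verdict, while at high signal-to-noise it is small compared to the total, and both the evidence ratio and the $p$-value point firmly to a detection.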

Project Pitch: explore the relationship between Bayesian model selection and classical hypothesis testing in more detail, by setting up some simple toy problems and computing both the evidences and the frequentist $p$-values.

A Short Note About Detection