Let's start by generating our dataset:
In [1]:
import numpy as np

from threeML import *

# Let's generate some data with y = Powerlaw(x)
gen_function = Powerlaw()

# Generate a dataset using the power law, with a
# constant 30% relative error
x = np.logspace(0, 2, 50)

xyl_generator = XYLike.from_function("sim_data", function=gen_function,
                                     x=x,
                                     yerr=0.3 * gen_function(x))

y = xyl_generator.y
y_err = xyl_generator.yerr
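As a quick sanity check on the simulated dataset (this snippet is our addition, not part of the original analysis), the mean relative deviation of the points from the generating model should be roughly $\sqrt{2/\pi} \times 0.3 \approx 0.24$ for Gaussian scatter with a 30% relative sigma:

# Sanity check (our addition): for Gaussian noise with a 30% relative
# sigma, the expected mean of |y - model| / model is sqrt(2/pi) * 0.3,
# i.e. about 0.24
rel_dev = np.abs(y - gen_function(x)) / gen_function(x)
print("Mean relative deviation: %.2f" % rel_dev.mean())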
We can now fit it easily with 3ML:
In [2]:
fit_function = Powerlaw()
xyl = XYLike("data", x, y, y_err)
parameters, like_values = xyl.fit(fit_function)
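The fit returns the best-fit parameters and the corresponding likelihood values; printing them is a quick way to inspect the result (a minimal sketch; the exact table format depends on the 3ML version):

# Display the best-fit parameters and the final -log(likelihood)
print(parameters)
print(like_values)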
Plot data and model:
In [3]:
xyl.plot(x_scale='log', y_scale='log')
Out[3]: (figure: the simulated data points and the best-fit power law, on log-log axes)
Compute the goodness of fit using Monte Carlo simulations (NOTE: if you repeat this exercise from the beginning many times, you should find that the quantity "gof" is a random number distributed uniformly between 0 and 1. That is the expected result if the model is a good representation of the data).
In [4]:
gof, all_results, all_like_values = xyl.goodness_of_fit()
print("The null-hypothesis probability from simulations is %.2f" % gof['data'])
The procedure outlined above works for any distribution of the data (Gaussian or Poisson). In this case we are using Gaussian data, so $-\log(\mathrm{likelihood})$ is just half of a $\chi^2$. We can therefore also use the $\chi^2$ test, which gives a similar result without performing simulations:
In [5]:
import scipy.stats

# Compute the number of degrees of freedom
n_dof = len(xyl.x) - len(fit_function.free_parameters)

# Get the observed value for chi2
# (the factor of 2 comes from the fact that the Gaussian -log(likelihood)
# is half of a chi2)
obs_chi2 = 2 * like_values['-log(likelihood)']['data']

# The survival function sf(x) = 1 - cdf(x) gives the probability of
# obtaining a chi2 at least as large as the observed one by chance
theoretical_gof = scipy.stats.chi2(n_dof).sf(obs_chi2)

print("The null-hypothesis probability from theory is %.2f" % theoretical_gof)
There are, however, many settings where a theoretical answer such as the one provided by the $\chi^2$ test does not exist. A simple example is a fit where the data follow Poisson statistics. In that case, the Monte Carlo computation can still provide the answer.
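As an illustration of the Poisson case (a minimal sketch; the poisson_data flag reflects our reading of the XYLike interface and may differ across 3ML versions), one can simulate Poisson counts from the power law and run the same Monte Carlo goodness-of-fit machinery:

# Simulate Poisson-distributed counts, using the power-law values as
# the Poisson rates (this simulation step is our own sketch)
y_poisson = np.random.poisson(gen_function(x))

# Declare the data as Poisson: 3ML then uses the Poisson likelihood,
# for which no simple chi2-style theoretical p-value exists
xyl_poisson = XYLike("poisson_data", x, y_poisson, poisson_data=True)

xyl_poisson.fit(Powerlaw())

gof_poisson, _, _ = xyl_poisson.goodness_of_fit()

print("Poisson null-hypothesis probability: %.2f" % gof_poisson['poisson_data'])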