In this notebook we will work through an example using the features of the foxi code to evaluate the expected utility of a mock scientific survey. This notebook assumes the reader is familiar with all of the concepts and is intended as a practical demonstration of the code. For more details about the utility functions themselves, as well as the inner workings of foxi, one can read the paper in which it was first developed: [https://arxiv.org/abs/1803.09491].
To begin with, we will import the module sys to append our path to foxi.
In [ ]:
import sys
path_to_foxi = '/PATH/foxi_train' # Give your path to foxi here.
sys.path.append(path_to_foxi + '/foxisource/')
from foxi import foxi
# These imports aren't strictly necessary to run foxi but they will be useful in our examples.
import numpy as np
from scipy.stats import multivariate_normal
Consider a scientific experiment over a 5D parameter space. Imagine that, after fitting a statistical model to the data, the posterior distribution is given by a multivariate Gaussian with non-trivial covariances. Let us quickly generate $10^3$ samples from this 5D posterior. Feel free to change these settings to check the final results.
In [ ]:
# Choose a state of low information to start with.
mock_posterior_best_fit = np.asarray([np.random.uniform(-1.0,1.0),
                                      np.random.uniform(-1.0,1.0),
                                      np.random.uniform(-1.0,1.0),
                                      np.random.uniform(-1.0,1.0),
                                      np.random.uniform(-1.0,1.0)])
# Set the posterior best fit to be a unit vector.
mock_posterior_best_fit = mock_posterior_best_fit/np.sqrt(np.sum(mock_posterior_best_fit**2.0))
# Set the posterior Fisher matrix to be diagonal for simplicity.
mock_posterior_fisher_matrix = np.asarray([[ np.random.uniform(0.01,1.0), 0.0, 0.0, 0.0, 0.0],
                                           [ 0.0, np.random.uniform(0.01,1.0), 0.0, 0.0, 0.0],  # This obviously assumes that the
                                           [ 0.0, 0.0, np.random.uniform(0.01,1.0), 0.0, 0.0],  # covariance matrix is constant
                                           [ 0.0, 0.0, 0.0, np.random.uniform(0.01,1.0), 0.0],  # with respect to the parameters.
                                           [ 0.0, 0.0, 0.0, 0.0, np.random.uniform(0.01,1.0)]])
# Give the posterior a unit-determinant Fisher matrix so that other realisations are comparable.
mock_posterior_fisher_matrix = mock_posterior_fisher_matrix/(np.linalg.det(mock_posterior_fisher_matrix)**(1.0/5.0))
# Quick inversion to generate the samples and mimic some weights too.
number_of_posterior_samples = 10**3
mock_posterior_covariance_matrix = np.linalg.inv(mock_posterior_fisher_matrix)
mock_posterior_samples = np.random.multivariate_normal(mock_posterior_best_fit,
                                                       mock_posterior_covariance_matrix,
                                                       number_of_posterior_samples)
mock_posterior_sample_weights = multivariate_normal.pdf(mock_posterior_samples,
                                                        mean=mock_posterior_best_fit,
                                                        cov=mock_posterior_covariance_matrix)
mock_posterior_samples_output = np.insert(mock_posterior_samples,0,mock_posterior_sample_weights,axis=1)
# Let's output this data to a file in the '/foxichains' directory which mimics a real MCMC output.
np.savetxt(path_to_foxi + '/foxichains/mock_posterior_samples.txt', mock_posterior_samples_output, delimiter='\t')
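As an optional sanity check (our own addition here, not part of the foxi pipeline), we can compare the sample covariance of the mock chains with the covariance matrix used to generate them; with $10^3$ samples the two should agree to within sampling noise.
In [ ]:
# Compare the diagonal of the sample covariance against the generating covariance.
print(np.cov(mock_posterior_samples.T).diagonal())
print(mock_posterior_covariance_matrix.diagonal())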
We should now generate a series of toy model priors which span the parameter space. These should represent a realistic collection of theoretical models but, for simplicity, and to capture the essential effects of the whole model space, we will choose Nm models that all have uniform hypersphere priors of radius Rm, with positions drawn from a Gaussian hyperprior centred on mock_posterior_best_fit. Let us now quickly generate $500$ samples for each of these models and save them to files so that foxi can read them in.
In [ ]:
# Feel free to vary these (though note that the number of model pairs to compare grows like Nm*(Nm-1)/2).
Nm = 10
Rm = 0.0001
number_of_prior_samples = 5*10**2
# Making the model spread relatively small - not strictly necessary but improves convergence properties.
hyperprior_covariance_matrix = 0.1*mock_posterior_covariance_matrix
# Generate the positions.
mock_prior_positions = np.random.multivariate_normal(mock_posterior_best_fit,hyperprior_covariance_matrix,Nm)
# Generate the 5D hypersphere samples and output to text files in the '/foxipriors' directory.
for im in range(0,Nm):
    R1 = np.random.uniform(0.0,Rm,size=number_of_prior_samples)
    R2 = np.random.uniform(0.0,Rm,size=number_of_prior_samples)
    R3 = np.random.uniform(0.0,Rm,size=number_of_prior_samples)
    R4 = np.random.uniform(0.0,Rm,size=number_of_prior_samples)
    R5 = np.random.uniform(0.0,Rm,size=number_of_prior_samples)
    angle1 = np.random.uniform(0.0,np.pi,size=number_of_prior_samples)
    angle2 = np.random.uniform(0.0,np.pi,size=number_of_prior_samples)
    angle3 = np.random.uniform(0.0,np.pi,size=number_of_prior_samples)
    angle4 = np.random.uniform(0.0,2.0*np.pi,size=number_of_prior_samples)
    parameter1 = R1*np.cos(angle1) + mock_prior_positions[im][0]
    parameter2 = R2*np.sin(angle1)*np.cos(angle2) + mock_prior_positions[im][1]
    parameter3 = R3*np.sin(angle1)*np.sin(angle2)*np.cos(angle3) + mock_prior_positions[im][2]
    parameter4 = R4*np.sin(angle1)*np.sin(angle2)*np.sin(angle3)*np.cos(angle4) + mock_prior_positions[im][3]
    parameter5 = R5*np.sin(angle1)*np.sin(angle2)*np.sin(angle3)*np.sin(angle4) + mock_prior_positions[im][4]
    mock_prior_samples = np.asarray([parameter1,parameter2,parameter3,parameter4,parameter5]).T
    np.savetxt(path_to_foxi + '/foxipriors/mock_prior' + str(im+1) + '_samples.txt', mock_prior_samples, delimiter='\t')
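As a quick illustrative check on the samples of the last model generated (again our own addition), each coordinate offset from the model centre is bounded by Rm by construction, so the largest per-coordinate spread should not exceed it.
In [ ]:
# The largest per-coordinate offset from the last model's centre should be at most Rm.
print(np.max(np.abs(mock_prior_samples - mock_prior_positions[Nm-1]), axis=0))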
Let us assume that the future likelihood will be a multivariate Gaussian over the parameters as well. The code can of course accommodate any specified fiducial point-dependent matrix, but it will be more instructive to simplify things in this way. In this instance, let us also model how the Fisher matrix varies with respect to the fiducial points with a polynomial.
In [ ]:
# Specify the polynomial behaviour of the forecast Fisher matrix with respect to the fiducial points.
def fisher_matrix(fiducial_point):
    mock_forecast_fisher_matrix = np.zeros((5,5))
    mock_forecast_fisher_matrix += mock_posterior_fisher_matrix
    mock_forecast_fisher_matrix[0][0] += mock_posterior_fisher_matrix[0][0]*((fiducial_point[0]-mock_posterior_best_fit[0])**2.0)
    mock_forecast_fisher_matrix[1][1] += mock_posterior_fisher_matrix[1][1]*((fiducial_point[1]-mock_posterior_best_fit[1])**2.0)
    mock_forecast_fisher_matrix[2][2] += mock_posterior_fisher_matrix[2][2]*((fiducial_point[2]-mock_posterior_best_fit[2])**2.0)
    mock_forecast_fisher_matrix[3][3] += mock_posterior_fisher_matrix[3][3]*((fiducial_point[3]-mock_posterior_best_fit[3])**2.0)
    mock_forecast_fisher_matrix[4][4] += mock_posterior_fisher_matrix[4][4]*((fiducial_point[4]-mock_posterior_best_fit[4])**2.0)
    return mock_forecast_fisher_matrix
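A quick check of this function (our addition): the quadratic terms vanish at the posterior best fit, so the forecast Fisher matrix evaluated there should reduce to the posterior Fisher matrix exactly.
In [ ]:
# At the best-fit point the polynomial terms vanish, recovering the posterior Fisher matrix.
print(np.allclose(fisher_matrix(mock_posterior_best_fit), mock_posterior_fisher_matrix))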
With the mock data, models and forecast Fisher matrix in place, we can now initialise a foxi instance by handing it the path to the working directory.
In [ ]:
foxi_instance = foxi(path_to_foxi)
The next step is to specify where our posterior samples of the current data are stored, how many samples to read in, what weights they carry, and whether any transformations should be applied to each column (none are needed in this case).
In [ ]:
chains_filename = 'mock_posterior_samples.txt'
# Note that the column numbers start from 0...
parameter_column_numbers = [1,2,3,4,5]
weights_column_number = 0
# Simply set this to the number of samples generated - this can be useful to get results out as a sanity check.
number_of_samples_to_read_in = number_of_posterior_samples
foxi_instance.set_chains(chains_filename,
                         parameter_column_numbers,
                         number_of_samples_to_read_in,
                         weights_column=weights_column_number, # All points are given weight 1 if this is ignored.
                         column_types=None) # No transformations needed here.
# One could have ['flat','log10','log'] specified for each column.
Now that the posterior chains have been set, we can do the same for our list of prior samples.
In [ ]:
# List the model file names to compute the expected utilities for.
model_name_list = ['mock_prior' + str(im+1) + '_samples.txt' for im in range(0,Nm)]
# List the column numbers to use for each file of prior samples.
prior_column_numbers = [[0,1,2,3,4] for im in range(0,Nm)]
# Once again, simply set this to the number of samples we made earlier for each prior.
number_of_prior_points_to_read_in = number_of_prior_samples
foxi_instance.set_model_name_list(model_name_list,
                                  prior_column_numbers,
                                  number_of_prior_points_to_read_in,
                                  prior_column_types=None) # Once again, no transformations needed here.
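The output of this step comes from a quick kernel density estimate (KDE) built over each set of prior samples. As a rough illustration of what such a density estimate looks like, here is a minimal sketch using scipy's gaussian_kde; this is for illustration only and is not foxi's internal implementation, which may differ.
In [ ]:
from scipy.stats import gaussian_kde
# Build a KDE from the first model's prior samples; gaussian_kde expects shape (n_dims, n_samples).
prior_samples = np.loadtxt(path_to_foxi + '/foxipriors/mock_prior1_samples.txt', delimiter='\t')
kde = gaussian_kde(prior_samples.T)
# Evaluate the estimated density at the first model's centre.
print(kde(mock_prior_positions[0].reshape(5,1)))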
How these KDEs are used depends on the specific numerical situation, which is once again covered in more detail in [https://arxiv.org/abs/1803.09491]. Our final step before running foxi is to give it a python function which returns the forecast Fisher matrix at each fiducial point (i.e. evaluated at each point of the current data chains). We do this like so:
In [ ]:
foxi_instance.set_fisher_matrix(fisher_matrix)
All of the necessary settings have now been applied for a full run of the foxi code. Depending on how many points were introduced at the level of the priors and chains, as well as how long it takes to evaluate the forecast Fisher matrix at each fiducial point, we may wish to continue on a compute cluster. The simple example we have chosen here, however, should run locally on a laptop in less than 10 minutes. Our main run command is the following:
In [ ]:
mix_models = True # Set this to 'True' so that the expected utilities for all possible model pairs are calculated.
                  # The default is 'False', which calculates the utilities all with respect to the reference model,
                  # here the 0-element in 'model_name_list'.
foxi_instance.run_foxifish(mix_models=mix_models)
Don't be too alarmed by a few error messages about dividing by zero: these occur infrequently, due to excessively low densities appearing in the numerical procedure, and the affected points are automatically discarded from the main result. The algorithm should terminate with a Number of maxed-out evidences for each model: message, which refers to the number of times each model was found to be ruled out at the level of its maximum likelihood.
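Once the run has terminated, a simple way (our addition, not a foxi feature) to confirm that it produced its output is to check for the expected file in the '/foxioutput' directory:
In [ ]:
import os
# 'run_foxifish' writes its results into the '/foxioutput' directory (file name set below).
print(os.path.exists(path_to_foxi + '/foxioutput/foxiplots_data_mix_models.txt'))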
This last step was the longest-running; from now on, the post-processing of the results from foxi will be quick enough to run locally on a laptop. The next step is to use rerun_foxi to analyse the output that run_foxifish dumped into the /foxioutput directory.
In [ ]:
foxiplot_file_name = 'foxiplots_data_mix_models.txt' # This is the generic name that 'run_foxifish' will set; change
                                                     # this to whatever you like as long as the file is in '/foxioutput'.
                                                     # If 'mix_models = False' then remove the '_mix_models' tag.
# Set this to the number of samples generated - useful to vary to check convergence though we will make plots later.
number_of_foxiplot_samples_to_read_in = number_of_posterior_samples
# We must set this feature to 'flat' in each column to perform no post-processing transformation that reweights the chains.
# This can be a little redundant as it makes more numerical sense to simply generate new chains.
post_chains_column_types = ['flat','flat','flat','flat','flat']
# Set this to 'True' for the output to include fully-formatted LaTeX tables!
TeX_output = True
# For the truly lazy - you can set the TeX name for each model in the table output too.
model_TeX_names = [r'${\cal M}_' + str(i) + r'$' for i in range(0,Nm)]
foxi_instance.rerun_foxi(foxiplot_file_name,
                         number_of_foxiplot_samples_to_read_in,
                         post_chains_column_types,
                         model_name_TeX_input=model_TeX_names,
                         TeX_output=TeX_output)
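rerun_foxi quotes a number of model pairs; as a quick arithmetic check (our addition), this count equals the unique pairs plus the auto-pairs, as explained in the note below.
In [ ]:
# Unique model pairs plus auto-pairs: Nm*(Nm-1)/2 + Nm, i.e. 55 for Nm = 10.
print(Nm*(Nm-1)//2 + Nm)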
Once again, don't be too alarmed by a few value error messages. If there are any error messages related to converting strings to floats, this is likely due to a delimiter problem in the text files that were read in; tab-delimited data should work fine (or, failing that, space-delimited). Note also that the number of model pairs quoted here is the number of unique pairs plus the auto-pairs (i.e. Model_i - Model_j as well as Model_i - Model_i), so it is larger than the number of unique pairs by N_m. A summary of the main results can be read from this file:
In [ ]:
print(open(path_to_foxi + '/foxioutput/foxiplots_data_summary.txt', 'r').read())
One thing to watch for in these results is the quoted expected Kullback-Leibler divergence <DKL> being of roughly the same value as the natural log of the number of points in the posterior chains, which in this example is $\ln (10^3) \simeq 6.9$. More chain points must be used for larger values, since otherwise the value of <DKL> is only a lower bound.
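For reference, this bound for our settings can be printed directly (our addition):
In [ ]:
# Natural log of the number of posterior chain points: an effective ceiling on a reliable <DKL>.
print(np.log(number_of_posterior_samples))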
A nice feature, if one sets TeX_output = True in rerun_foxi above, is the output of a fully-formatted $\LaTeX$ table incorporating all of these results. This can be found in the following file:
In [ ]:
print(open(path_to_foxi + '/foxioutput/foxiplots_data_summary_TeX.txt', 'r').read())
Another useful tool provided by foxi is to generate plots illustrating the numerical convergence (or lack thereof) of the calculated expected utilities.
In [ ]:
# Simply specify the number of bins in which to re-calculate the expected utility with more samples.
number_of_bins = 50
# We have already specified the other inputs here so let's simply generate the plots!
foxi_instance.plot_convergence(foxiplot_file_name,
                               number_of_bins,
                               number_of_foxiplot_samples_to_read_in,
                               post_chains_column_types)
The output can be found in /foxiplots/foxiplot_convergence.pdf.