Bayesian inference via posterior sampling

First we will need to wrap Chempy such that it internally produces priors and likelihoods, and returns a posterior.


In [1]:
%pylab inline


Populating the interactive namespace from numpy and matplotlib

In [2]:
# Loading the default parameters

from Chempy.parameter import ModelParameters
a = ModelParameters()

Chempy posterior with prior and with best-fit from paper

  • First we will initialise Chempy with its prior parameters and evaluate the Posterior for Sun+ observational constraint.
  • Then we initilise Chempy with the best parameters obtained from the MCMC (Figure 11 of the paper)

In [3]:
# Evaluating the Chempy posterior at the maximum prior parameters

from Chempy.cem_function import cem
a.testing_output = True
a.summary_pdf = True
a.observational_constraints_index = ['gas_reservoir','sn_ratio','sol_norm']
posterior, blobs = cem(a.p0,a)


[-2.29 -2.75 -0.8  -0.3   3.5   0.5   0.3 ]
('model_temp/', ' already exists. Content might be overwritten')
/home/jan/anaconda2/lib/python2.7/site-packages/Chempy-0.1-py2.7.egg/Chempy/making_abundances.py:49: RuntimeWarning: divide by zero encountered in log10
  cube_abundances[item] = np.where(cube_abundances[item] == 0. , -np.inf, np.log10(cube_abundances[item]) + 12.)
/home/jan/anaconda2/lib/python2.7/site-packages/Chempy-0.1-py2.7.egg/Chempy/data_to_test.py:684: RuntimeWarning: divide by zero encountered in log10
  plt.plot(gas_reservoir['time'],np.log10(gas_reservoir['Z']/solZ),label = "Z_gas_reservoir %.4f" %(np.log10(gas_reservoir['Z'][-1]/solZ)))
('l: ', -198.76735721199799, 'pr: ', 0.0, 'po: ', -198.76735721199799)
/home/jan/anaconda2/lib/python2.7/site-packages/Chempy-0.1-py2.7.egg/Chempy/data_to_test.py:693: RuntimeWarning: divide by zero encountered in log10
  plt.plot(cube['time'],np.log10(cube['Z']/solZ),label = "Z_gas %.4f" %(np.log10(cube['Z'][-1]/solZ)))
<matplotlib.figure.Figure at 0x7f9a338f7610>
<matplotlib.figure.Figure at 0x7f9a331253d0>
<matplotlib.figure.Figure at 0x7f9a33125990>

In [4]:
# Evaluating Chempy at the median posterior parameter values taken from the paper

a.p0 = np.array([-2.46, -3.07, -0.8, -0.31, 3.02, 0.47, -0.11])
posterior, blobs = cem(a.p0,a)


[-2.46 -3.07 -0.8  -0.31  3.02  0.47 -0.11]
('model_temp/', ' already exists. Content might be overwritten')
('l: ', -104.59464254162121, 'pr: ', -1.9270333333333323, 'po: ', -106.52167587495454)
<matplotlib.figure.Figure at 0x7f9a3a7fba90>
<matplotlib.figure.Figure at 0x7f9a30eb5b90>
<matplotlib.figure.Figure at 0x7f9a30f9d850>

MCMC workflow

A full MCMC sampling of the 7 dimensional parameter space of the paper takes too long for this tutorial. For now we do a small inference for only two parameters, high-mass slope of the IMF and number of SN Ia. This would have to run for about 30 minutes. You can uncomment mcmc(a) or you can just use the chain that is already saved to the github repository. All the results from this method will be stored in the folder source/mcmc/0/, where you can also inspect some convergence plots. You can run an MCMC for any Chempy parameter, you just need to define limits and a prior for each parameter in parameter.py.


In [5]:
# Convenience function for the MCMC

from Chempy.wrapper import mcmc
a.testing_output = False
a.summary_pdf = False
a.nwalkers = 8
a.ndim = 2
a.to_optimize = np.array(['high_mass_slope', 'log10_N_0'])
a.p0 = np.array([-2.29 ,-2.75])
#mcmc(a) 
# The code gives out the parameters and likelihoods for each single evaluation. 
# Also after each MCMC step the mean posterior value is shown.
# It will stop once the mean posterior stayed the same for 200 iterations.
# An example chain is already in the folder. Otherwise it takes about 30 minutes to run this chain on a laptop
# There is room for improvement: first minimizing to the 'global' maximum posterior and then sampling the PDF there with the MCMC.
# The algorithm loads the yield tables anew each time. They could be stored for the purpose of sampling instead.

In [6]:
# This function shows MCMC convergence plots and restructures the chain (flattens it and throws out burn-in)

from Chempy.plot_mcmc import restructure_chain
parameter_names = [r'$\alpha_\mathrm{IMF}$',r'$\log_{10}\left(\mathrm{N}_\mathrm{Ia}\right)$']
restructure_chain('mcmc/', parameter_names)


The chain has a length of 305 iterations, each iteration having 8 evaluations/walkers
Mean posteriors at the beginning and the end of the chain:
(-110.65742861947153, -229.82164507612799)
Mean posteriors after the burn-in tail is cut out:
(-110.65742861947153, -110.34496278934805)
We are left with a sample of 496 posterior evaluations from the converged MCMC chain
We have 496 iterations good enough posterior, their posteriors range from
(-109.67176962404118, -115.89082297031626)
('Highest posterior was obtained at parameters: ', array([[-2.38977345, -2.93490073]]))
('Number of unique posterior values: ', 371)
Inferred marginalized parameter distributions are:
('$\\alpha_\\mathrm{IMF}$', -2.3879762600534318, '+-', 0.0085970215284124803)
('$\\log_{10}\\left(\\mathrm{N}_\\mathrm{Ia}\\right)$', -2.9423306535922635, '+-', 0.034225756888401605)
<matplotlib.figure.Figure at 0x7f9a30febc10>
<matplotlib.figure.Figure at 0x7f9a30fa82d0>
<matplotlib.figure.Figure at 0x7f9a30f8a290>

Plotting the posterior PDF

I have a plotting routine which can plot the corner plot of the Posterior distribution and for comparison also plots the Prior distribution in the histograms as well as the 3sigma contour line in the 2d plot. Then we can inspect how much the data (Solar abundances + SN-ratio + Corona metallicity) narrows the parameter constraints.


In [7]:
# Corner plot of the posterior distribution in comparison to the Prior

from Chempy.plot_mcmc import plot_mcmc_chain
plot_mcmc_chain('mcmc/', use_scale = True)


<matplotlib.figure.Figure at 0x7f9a308affd0>

Interestingly the correlation here is negative, whereas in the paper the correlation is positive. The reason for negative correlation here is: the more yield from CC-SN are coming the less is needed from SN Ia and vice versa. In the paper, where we have more parameters they are positively correlated because the alpha elements (from CC-SNe) and the iron-peak elements (mainly from SN Ia) try to reach a similar plateau (at solar abundances). Their overproduction is then counterbalanced by a higher outflow fraction or later SFR peak. This is not possible when only using these two parameters. Hence we have an anti-correlation here.

Correlations with predictions and posterior

How do these parameters correlate with the predicted elements and with the posterior. We try to plot this next (the code is buggy and you will need to provide the name_list if you want to use your own example).


In [8]:
# Corner plot over parameters, predicted elemental abundances, posterior and predicted SN-ratio.

from Chempy.plot_mcmc import plot_element_correlation
plot_element_correlation('mcmc/')


['v-posterior', 'm-O', 'm-Mg', 'm-Si', 'm-Fe', 'm-SN-ratio', 'v-high_mass_slope', 'v-log10_N_0']

We see that Mg and O are perfectly aligned. Showing that one of the elements is completely redundant (at least when only inferring the high-mass slope of the IMF and SN Ia normalisation). Iron is not strongly correlated with the high-mass-slope, but with the Incidence of SN Ia. We also see that the SN-ratio is strongly correlated with the number of SN Ia. The posterior looks symmetric over parameters and predictions. This plot can help to identify redundant elements.

Your own stellar abundances, the wildcard!

If you want to use your own stellar abundances you can do that using the wildcard function. I will use as an example Sakari+ 2017 abundances of a red giant star of the intermediate age LMC cluster NGC1718.


In [9]:
# Abundance data from Sakari+ 2017 on NGC1718-9 (a little bit of averaging where two lines are given..)
# All elements except Fe are given in [X/Fe]

list_of_elements =         ['Fe','O', 'Na','Mg',   'Al', 'Si', 'Ca', 'Ti', 'V',   'Mn',  'Ni',  'Cu',  'Rb', 'Y',    'Zr',  'La', 'Eu']
list_of_abundances =     [-0.55,-0.13,-0.13, 0.11, 0.01, 0.11, 0.09, 0.00, -0.09, -0.19, -0.02, -0.63, -0.24, -0.04, -0.18, 0.27, 0.22]
list_of_abundance_errors = [0.01, 0.07, 0.07, 0.04, 0.07, 0.03, 0.1, 0.13, 0.08,  0.12,  0.05,  0.10,  0.09,  0.08,  0.06,  0.07, 0.05]


#cluster age

age = 2.0
age_error = 0.01 # So far not relevant

In [10]:
# This function produces the structured array which can be used with the wildcard likelihood function

from Chempy.data_to_test import produce_wildcard_stellar_abundances

produce_wildcard_stellar_abundances(stellar_identifier = 'NGC1718-9', 
                                     age_of_star = age,
                                     sigma_age = age_error,
                                     element_symbols = list_of_elements, 
                                     element_abundances = list_of_abundances,
                                     element_errors = list_of_abundance_errors)

In [11]:
# We reload the default parameters and set the observational constraints to the wildcard

a = ModelParameters()
a.testing_output = True
a.summary_pdf = True
a.stellar_identifier = 'NGC1718-9'
a.observational_constraints_index = ['wildcard']


# Evaluating Chempy shows that the stellar abundance is far away from Chempy prior parameters

posterior, blobs = cem(a.p0,a)


[-2.29 -2.75 -0.8  -0.3   3.5   0.5   0.3 ]
('model_temp/', ' already exists. Content might be overwritten')
('l: ', -3493.7782648281654, 'pr: ', 0.0, 'po: ', -3493.7782648281654)
<matplotlib.figure.Figure at 0x7f9a2cf97d90>

Finding the best parameter values for NGC1718-9

We see that the prior parameters fit the abundances of NGC1718-9 poorly. We can now run an MCMC looking for the PDF over parameter space.


In [12]:
# We set the variables in order to run the MCMC

a.testing_output = False
a.summary_pdf = False

# Uncomment the next line if you want to try (This takes about three hours on a modern 8core machine)

#mcmc(a) # We also provide an example chain in the tutorial/mcmc_wildcard/ folder

In [13]:
# This function shows MCMC convergence plots and restructures the chain (flattens it and throws out burn-in)

restructure_chain('mcmc_wildcard/',)


The chain has a length of 324 iterations, each iteration having 64 evaluations/walkers
Mean posteriors at the beginning and the end of the chain:
(-46.331378486928067, -3380.4856334108808)
Mean posteriors after the burn-in tail is cut out:
(-46.331378486928067, -46.098514808370261)
We are left with a sample of 448 posterior evaluations from the converged MCMC chain
We have 448 iterations good enough posterior, their posteriors range from
(-42.965865604445781, -53.090468957277366)
('Highest posterior was obtained at parameters: ', array([[-2.71082063, -3.33968071, -0.73864768, -0.48339269,  3.7317006 ,
         0.66381569,  0.5032127 ]]))
('Number of unique posterior values: ', 171)
Inferred marginalized parameter distributions are:
('$\\alpha_\\mathrm{IMF}$', -2.7110591120038712, '+-', 0.089681975285435908)
('$\\log_{10}\\left(\\mathrm{N}_\\mathrm{Ia}\\right)$', -3.3235198455767594, '+-', 0.16891706266659715)
('$\\log_{10}\\left(\\tau_\\mathrm{Ia}\\right)$', -0.73334683485337826, '+-', 0.31638234804932497)
('$\\log_{10}\\left(\\mathrm{SFE}\\right)$', -0.49435487780174769, '+-', 0.21141185183793493)
('$\\mathrm{SFR}_\\mathrm{peak}$', 4.7687348350052199, '+-', 1.1911277738954764)
('$\\mathrm{x}_\\mathrm{out}$', 0.59123829157221297, '+-', 0.16613353026894886)
('$\\log_{10}\\left(\\mathrm{f}_\\mathrm{corona}\\right)$', 0.34948278776153213, '+-', 0.28279414723849511)
<matplotlib.figure.Figure at 0x7f9a2cfdea50>
<matplotlib.figure.Figure at 0x7f9a2e992c10>
<matplotlib.figure.Figure at 0x7f9a5c082650>

In [14]:
# Corner plot of the posterior distribution in comparison to the Prior

plot_mcmc_chain('mcmc_wildcard/', use_scale = True)


<matplotlib.figure.Figure at 0x7f9a30996550>

Posterior PDF of NGC1718-9

The retrieved median posterior parameters are quite plausible for an intermediate age LMC cluster. We have relatively bottom-heavy IMF with low incidence of SN Ia. The star formation efficiency is a bit lower than our prior and the star formation peak is comparatively late. The outflow fraction is quite high and there are no constraints on the SN Ia delay time and the corona mass, which is expected, as we only have one stellar abundance at a specific time and no corona metallicity data constraint included.

When we now plot the predicted abundance pattern of the chempy model for the median posterior parameter values we see that the fit improved dratically. But similarly to the results from the paper the model has problems with a few elements, e.g. the under abundant Ti.


In [15]:
a = ModelParameters()
a.testing_output = True
a.summary_pdf = True
a.stellar_identifier = 'NGC1718-9'
a.observational_constraints_index = ['wildcard']


median_posterior = np.array([-2.73, -3.34, -0.76, -0.52, 4.6, 0.61, 0.36])

posterior, blobs = cem(median_posterior,a)


[-2.73 -3.34 -0.76 -0.52  4.6   0.61  0.36]
('model_temp/', ' already exists. Content might be overwritten')
('l: ', -40.340480213707174, 'pr: ', -5.0718055555555539, 'po: ', -45.412285769262731)
/home/jan/anaconda2/lib/python2.7/site-packages/Chempy-0.1-py2.7.egg/Chempy/making_abundances.py:49: RuntimeWarning: invalid value encountered in log10
  cube_abundances[item] = np.where(cube_abundances[item] == 0. , -np.inf, np.log10(cube_abundances[item]) + 12.)
<matplotlib.figure.Figure at 0x7f9a2ebc9190>

Continue investigating!

If you want you can run the mcmc using other yield tables and see which parameters are retrieved then...