Tools helpful for running MCMC chains of LCDM parameters

The WMAP posterior chain

The WMAP posterior chains are useful for a number of tasks. First, they are useful for generating a prior distribution on LCDM parameters when running a chain on a new data set. Also, the covariance matrix of the LCDM chain from WMAP can be used as a proposal distribution that has the correct degeneracies.

All the WMAP posterior chains are available online here. There are a few chains to choose from (depending on which parameters are allowed to vary and which data sets to include in the likelihood). If you want Julia to automaticall download the basic LCDM chain just uncomment the download and run commands below.



In [6]:

    
link      = "http://lambda.gsfc.nasa.gov/data/map/dr5/dcp/chains/wmap_lcdm_wmap9_chains_v5.tar.gz"
# download(link, "wmap_lcdm_wmap9_chains_v5.tar.gz")
# run(`gunzip wmap_lcdm_wmap9_chains_v5.tar.gz`)

wmap_path = "/Users/ethananderes/Dropbox/Courses/STA250CMB/data/wmap_chain"
# run(`tar -xf wmap_lcdm_wmap9_chains_v5.tar --directory $wmap_path`)
# run(`rm wmap_lcdm_wmap9_chains_v5.tar`)









    Out[6]:





"/Users/ethananderes/Dropbox/Courses/STA250CMB/data/wmap_chain"

Once the download is finished (and the files are in the right place as determined by wmap_path) we can take a peak at what the files look like and count how many samples we have.



In [7]:

    
run(`head $wmap_path/omegach2`)
nlines = countlines("$wmap_path/omegach2")









    



1       0.117062874352432095
2       0.108967379062375080
3       0.109528409355401521
4       0.101409184716838671
5       0.104083078229435091
6       0.107187006794221051
7       0.112446837434475005
8       0.107616613742339856
9       0.111562053244599030
10      0.111868995879391156






    Out[7]:





1296570

These chains are really long. Since we don't need all the iterations for this notebook I will only load in 100,000 of them from the standard 6 parameters.



In [8]:

    
nchain = 100_000
omega_b_chain     = readdlm("$wmap_path/omegabh2")[1:nchain,2]
omega_cdm_chain   = readdlm("$wmap_path/omegach2")[1:nchain,2]
tau_reio_chain    = readdlm("$wmap_path/tau")[1:nchain,2]
theta_s_chain     = readdlm("$wmap_path/thetastar")[1:nchain,2]
A_s_109_chain     = readdlm("$wmap_path/a002")[1:nchain,2]  # <-- 10⁹ * A_s
n_s_chain         = readdlm("$wmap_path/ns002")[1:nchain,2]
# note: kstar here is 0.002

full_chain    = hcat(omega_b_chain, omega_cdm_chain, tau_reio_chain, theta_s_chain, A_s_109_chain, n_s_chain)
names_chain   = [:omega_b, :omega_cdm, :tau_reio, :theta_s, :A_s_109, :n_s]









    Out[8]:





6-element Array{Symbol,1}:
 :omega_b  
 :omega_cdm
 :tau_reio 
 :theta_s  
 :A_s_109  
 :n_s

Now compute the approximated posterior covariance matrix of the LCDM parameters and the best fit.



In [11]:

    
Σwmap         = cov(full_chain)
wmap_best_fit = vec(mean(full_chain,1))
[names_chain wmap_best_fit]









    Out[11]:





6×2 Array{Any,2}:
 :omega_b    0.022635 
 :omega_cdm  0.113854 
 :tau_reio   0.0889852
 :theta_s    0.0103905
 :A_s_109    2.41154  
 :n_s        0.972309

Lets take a look at a trace plot of some of the iterations



In [12]:

    
using PyPlot

subplot(3,1,1)
plot(omega_b_chain[1:100:end], label = "omega_b_chain")
xlabel("iteration")
legend()
subplot(3,1,2)
plot(theta_s_chain[1:100:end], label = "theta_s_chain")
xlabel("iteration")
legend()
subplot(3,1,3)
plot(n_s_chain[1:100:end], label = "n_s_chain")
xlabel("iteration")
legend()









    












    Out[12]:





PyObject <matplotlib.legend.Legend object at 0x316404c10>

The loglikelihood of the 6 LCDM parameters given the bandpowers

We will use pypico for computing the tempurature spectral density.



In [14]:

    
using PyCall

@pyimport pypico
picodata_path = "/Users/ethananderes/Dropbox/Courses/STA250CMB/data/pypico/pico3_tailmonty_v34.dat"
const picoload = pypico.load_pico(picodata_path)

# --------  wrap pico
function pico(x)
    omega_b     = x[1]
    omega_cdm   = x[2]
    tau_reio    = x[3]
    theta_s     = x[4]
    A_s_109     = x[5]
    n_s         = x[6]
    plout::Dict{ASCIIString, Array{Float64,1}} = picoload[:get](;
        :re_optical_depth => tau_reio,
        symbol("scalar_amp(1)") =>  1e-9*A_s_109,
        :theta => theta_s,
        :ombh2 => omega_b,
        :omch2 => omega_cdm,
        symbol("scalar_spectral_index(1)") => n_s,
        :massive_neutrinos => 3.04,
        :helium_fraction => 0.25,
        :omnuh2 => 0.0,
        symbol("scalar_nrun(1)") => 0.0,
        :force     => true
    )
    clTT::Array{Float64,1} = plout["cl_TT"]
    ells   = 0:length(clTT)-1
    clTT .*= 2π ./ ells ./ (ells + 1)
    clTT[1] = 0.0
    return clTT
end









    Out[14]:





pico (generic function with 1 method)

Now we can define the loglikelihood as it depends on the bandpowers, the noise level σ² and the beam full width at half max b²



In [15]:

    
function loglike(x, bandpowers, σ², b²)
    ell = 0:length(bandpowers)-1
    cldd = pico(x) + σ² * exp(b² .* ell .* (ell + 1) ./ (8log(2)))
    rtn = 0.0
    @inbounds for l in ell[2:end]
      rtn -= log(cldd[l+1]) * (2l+1) / 2
      rtn -= (bandpowers[l+1] / cldd[l+1]) * (2l+1) / 2
    end
    return rtn
end









    Out[15]:





loglike (generic function with 1 method)

Now lets generate some fake data

In your homework 2 assignment I will ask you to run MCMC chains for generating posterior draws from this fake data. The LCDM parameter vector I will use for this fake data is a random draw from one of the iterations of the WMAP chain.

The noise level and beam I'm using to generate the fake data are defined as follows.



In [16]:

    
# beam and spectral density are set to somewhat resemble WMAP
const σ² = (10.0/3437.75)^2    # <---10μkarcmin noise level converted to radian pixels (1 radian = 3437.75 arcmin)
const b² = (0.0035)^2          # <-- pixel width 0.2ᵒ ≈ 12.0 armin ≈ 0.0035 radians









    Out[16]:





1.2250000000000001e-5

Now we randomly choose a LCDM parameter vector and save it to disk so we can check how your MCMC worked



In [18]:

    
using  HDF5
lcdm_sim_truth = full_chain[rand(1:nchain),:]
h5write("/Users/ethananderes/Dropbox/Courses/STA250CMB/lectures/julia_lecture5_LCDM_MCMC/lcdm_sim_truth.h5", "lcdm_sim_truth", lcdm_sim_truth)

Now we generate the spectral densities and simulate fake bandpowers (which are also saved to disk in the homework 2 directory)



In [19]:

    
clTT = pico(lcdm_sim_truth)
ell  = 0:length(clTT)-1
cldd = clTT + σ² * exp(b² .* ell .* (ell + 1) ./ (8log(2)))

bandpowers = Array(Float64, length(cldd))
for l in ell
  bandpowers[l+1] = abs2( √(cldd[l+1]) * randn() )
  for m in 1:l
    bandpowers[l+1] += 2abs2( √(cldd[l+1]/2) * randn() )
    bandpowers[l+1] += 2abs2( √(cldd[l+1]/2) * randn() )
  end
  bandpowers[l+1] ./= (2l + 1)
end
h5write("/Users/ethananderes/Dropbox/Courses/STA250CMB/homework/homework2/bandpowers.h5", "bandpowers", bandpowers)









    



pypico.datafiles.3e624ff23f85df8ba6355a4e18135245:326: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
pypico.datafiles.3e624ff23f85df8ba6355a4e18135245:116: RuntimeWarning: divide by zero encountered in divide

Here is a plot of the bandpowers compared to the signal spectral density and the noise



In [27]:

    
semilogy(ell[2:5:2000], bandpowers[2:5:2000], label="bandpowers")
semilogy(ell[2:5:2000], clTT[2:5:2000], label="temp spectrum")
semilogy(ell[2:5:2000], cldd[2:5:2000], label="data spectrum")
semilogy(ell[2:5:2000], (cldd-clTT)[2:5:2000], label="noise spectrum")
legend()









    












    Out[27]:





PyObject <matplotlib.legend.Legend object at 0x327c9a750>

NLopt

NLopt is a high quality optimization package ported to Julia. I highly recommend it. Usually, when optimizing a function you want to include gradient calculations. However, in the CMB case these are complicated. In NLopt you can choose from a collection of non-gradient optimization algorithms. We will use the one called LN_BOBYQA.



In [28]:

    
using NLopt

# here are some optimization algorithms that do not require gradient calculations
# see http://ab-initio.mit.edu/wiki/index.php/NLopt_Reference for a reference
algm = [:LN_BOBYQA, :LN_COBYLA, :LN_PRAXIS, :LN_NELDERMEAD, :LN_SBPLX]

llmin(x, grad)  = loglike(x, bandpowers, σ², b²)









    Out[28]:





llmin (generic function with 1 method)

NLopt expects that the function your trying to optimze takes two arguments: the variable x is the one your optimizing over and a variable grad which, when the funtion is called, over-writes grad with the gradient of the objective function at x. In this case we don't do anything to grad since we are only using non-gradient based algorithms.

The following code sets up the optimzation object.



In [29]:

    
#dx   = eig(Σwmap) |> x -> 0.01*x[2]*√(x[1]) # <--- start direction
opt = Opt(algm[1], 6)
#initial_step!(opt, dx)
upper_bounds!(opt, [0.034, 0.2,  0.55,  .0108, exp(4.0)/10,  1.25])
lower_bounds!(opt, [0.018, 0.06, 0.01,  .0102, exp(2.75)/10, 0.85])  # <-- pico training bounds
maxtime!(opt, 5*60.0)   # <--- max time in seconds
max_objective!(opt, llmin)

Note: I set the upper and lower bounds to the bounding box constraints used for training pypico.

Now we tell NLopt to optimize it.



In [30]:

    
optf, optx, ret = optimize(opt, wmap_best_fit)









    Out[30]:





(-3.8968e8,[0.0224791,0.108115,0.0782035,0.0104004,2.25854,0.9795],:ROUNDOFF_LIMITED)

The variable optf gives the objective value at the maximum, optx gives the optimal lcdm parameters and ret contains comments.

Lets take a look at the fitted values compared with the simulation truth and the WMAP best fit.



In [31]:

    
hcat(names_chain, optx, lcdm_sim_truth, wmap_best_fit)









    Out[31]:





6×4 Array{Any,2}:
 :omega_b    0.0224791  0.0223805  0.022635 
 :omega_cdm  0.108115   0.109353   0.113854 
 :tau_reio   0.0782035  0.0759736  0.0889852
 :theta_s    0.0104004  0.0104023  0.0103905
 :A_s_109    2.25854    2.28102    2.41154  
 :n_s        0.9795     0.976015   0.972309

Let's make sure optx is at a higher likelihood.



In [37]:

    
@show llmin(optx, 0);
@show llmin(lcdm_sim_truth, 0);
@show llmin(wmap_best_fit, 0);









    



llmin(optx,0) = -3.8967960904521793e8
llmin(lcdm_sim_truth,0) = -3.896796100340683e8
llmin(wmap_best_fit,0) = -3.896796706767549e8

Distributions

When programing MCMC chains the package Distributions is useful for generating proposals and for computing ratios of proposal densities. Here is how to define a Gaussian symmetric proposal. Note: I'm classifying it as constant as to give the type inference a guarentee that it will not change type withing the MCMC algorithm.



In [40]:

    
using Distributions
const lcdm_proposal_RV = MultivariateNormal(zeros(6), Σwmap)
generate_lcdm_prop(lcdm_curr)  = rand(lcdm_proposal_RV) + lcdm_curr









    



WARNING: redefining constant lcdm_proposal_RV
WARNING: Method definition generate_lcdm_prop(Any) in module Main at In[39]:3 overwritten at In[40]:3.






    Out[40]:





generate_lcdm_prop (generic function with 1 method)

Now I can generate proposals as follows



In [42]:

    
lcdm_curr = copy(optx)
lcdm_prop = generate_lcdm_prop(lcdm_curr)









    Out[42]:





6-element Array{Float64,1}:
 0.0226567
 0.112048 
 0.0762033
 0.0103927
 2.30137  
 0.979049

getDist for chain visulization

getDist is a great python package used for visualizing MCMC chains. Here is a link to the package and here is a link to some documentation.

I'm using Anaconda for python on my machine. I was able to easily install getDist with the following command at the terminal:

sudo pip install getdist

Just to give you a short demo of using it in Julia, the following code makes 1sigma and 2sigma contours of the WMAP chain on the 6 LCDM parameters we are interested in.



In [43]:

    
using PyCall, PyPlot

@pyimport getdist
@pyimport getdist.plots as plots
samples = getdist.MCSamples(samples=full_chain, names=names_chain)
g = plots.getSubplotPlotter()
g[:triangle_plot](samples, filled=true)
# g[:export]("output_file.pdf")  # <---use this if you want to save the image to disk

You can even compair two chains easily.



In [47]:

    
wider_chain = 1.5*(full_chain[1:10_000,:].-wmap_best_fit.') .+ optx.' 
samples1 = getdist.MCSamples(samples=wider_chain, names=names_chain)
samples2 = getdist.MCSamples(samples=full_chain[20_000:50_000,:], names=names_chain)
g = plots.getSubplotPlotter()
g[:triangle_plot]([samples1, samples2], filled=true)

MCMC Chain on LCDM



In [ ]:

    
mcmc_runs     = 2_000
mcmc_thin     = 5
mcmc_burn     = 300

# ------  choose a proposal distribution
const β_proposal_RV    = Normal(0.0, 0.2 )
const lcdm_proposal_RV = MultivariateNormal(zeros(lcdm_wmap), abs2(0.5) * lcdm_wmap_cov)

prop_β(β_curr)	       = rand(β_proposal_RV)    + β_curr
prop_lcdm(lcdm_curr)   = rand(lcdm_proposal_RV) + lcdm_curr

function mcmc_chain(numruns, thin, burn, lcdm_start, β_start)
	lcdmmat      = Array(Float64, 6, numruns)
	βmat         = Array(Float64, numruns)
	lcdmmat[:,1] = copy(lcdm_start)
	βmat[1] 	 = copy(β_start)
	curr_ll		 = loglike_bin(bin_pico(lcdm_start) + β_start*X_on_bin) 
	# curr_ll		+= logpdf(β_proposal_RV, β_start)
	curr_ll		+= logpdf(lcdm_wmap_RV, lcdm_start)
	for j = 2:numruns
		lcdm_prop = prop_lcdm(lcdmmat[:,j-1])
		β_prop    = prop_β(βmat[j-1])
		prop_ll   = loglike_bin(bin_pico(lcdm_prop) + β_prop*X_on_bin) 
		#prop_ll  += logpdf(β_proposal_RV, β_prop)
		prop_ll	 += logpdf(lcdm_wmap_RV, lcdm_prop)
		prob_accept = min(1.0, exp(prop_ll - curr_ll))
		if j % 250 == 1
			@show prob_accept
		end
		if rand() < prob_accept
			lcdmmat[:,j] = copy(lcdm_prop)
			βmat[j]	 = copy(β_prop)
			curr_ll		 = prop_ll
		else
			lcdmmat[:,j] = copy(lcdmmat[:, j-1])
			βmat[j]	 = copy(βmat[j-1])
		end
	end
	return lcdmmat[:,burn:thin:end], βmat[burn:thin:end]
end


#  ----- run the mcmc
@time lcdmmat, βmat = mcmc_chain(mcmc_runs, mcmc_thin, mcmc_burn, lcdm_binβ[1:6], lcdm_binβ[7])