Week 3 Homework

OGLE lightcurve fit

This week you will do your first MCMC fit of real data (for this class)!

REMINDER!!!

Before doing anything,

  1. Make a copy of this notebook in your clone of the private course repository. Edit that copy, not the version in the public repository clone.

  2. Remember that you will submit your solution by pull request to the private repository.

Once you've submitted your solution, don't forget to also fill out the very quick (and required!) weekly feedback form.

Additional reminder!

Next week's tutorial will be "Final project speed dating". Grad students should show up on Wednesday with some kind of concept for a final project, however half-baked. Undergrads are encouraged to do so as well, but not required.

Data

We'll be analyzing data from the Optical Gravitational Lensing Experiment (OGLE), which monitors stars in our galaxy in the hopes of detecting gravitational microlensing events that occur when a compact mass (e.g. a fainter star) passes in front of the monitored star.

Data are available through the OGLE Early Warning System. Scroll down a bit to the list of recent events and choose one to analyze. (Be original; don't all pick the same one.) The event summary page will include a plot of the lightcurve, with OGLE's own model fit overlaid as a magenta line.

As long as that magenta line looks vaguely reasonable, this should be a good data set to fit. Download the phot.dat file for your chosen event (linked at the bottom of the page).

As described on the OGLE page, the columns of this text file are

Hel.JD, I magnitude, magnitude error, seeing estimation (in pixels - 0.26"/pixel) and sky level

  • Heliocentric Julian Date. This is time, measured in days, since a fixed reference. The "heliocentric" part means that it has been corrected to the reference frame of the Sun; i.e., the few minutes of light travel time (one way or the other) by which photon arrival times shift as the Earth moves around its orbit have been subtracted off.

  • Measurements of magnitude in the $I$ band (a near-infrared band). Recall that astronomical magnitude, relative to a given reference source, is given by the relationship $m = m_\mathrm{ref} - 2.5\,\log_{10}\left(\frac{F}{F_\mathrm{ref}}\right)$, where $F$ is flux.

  • Measurement uncertainty on the $I$ magnitude, defined in some unspecified way (digging through papers might elucidate this).

  • The "seeing" and "sky level" quantities refer to the observing conditions, which we will not work with directly. These will have been accounted for (somehow) in deriving the best-fitting magnitude and its uncertainty.

Model

An excellent resource for gravitational lensing background (among other things) is Peter Schneider's Extragalactic Astronomy and Cosmology. It looks like we can download a pdf of the entire book for free (at least from within the Stanford network), which is awesomeness incarnate. The relevant section for Galactic microlensing is 2.5 (logical page 77), and the equations defining the microlensing model lightcurve are 2.92 and 2.93. Namely,

$F(t) = F_0 \frac{y(t)^2 + 2}{y(t)\sqrt{y(t)^2+4}}$

where

$y(t) = \sqrt{p^2 + \left( \frac{t-t_\mathrm{max}}{t_\mathrm{E}} \right)^2}$.

You'll of course also need the transformation between flux and magnitude, above. For convenience, let's parameterize the normalization of the model lightcurve in magnitudes rather than flux, i.e. $I_0$ rather than $F_0$; that way, all of the "ref" quantities in the magnitude definition are absorbed into this new parameter and we won't have to worry about them explicitly. With that substitution, the model parameters are $I_0$, $p$, $t_\mathrm{max}$ and $t_\mathrm{E}$. Following the discussion in Schneider, you should be able to come up with order-of-magnitude guesses at the correct values of these parameters.
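In code, and using the magnitude transformation above so that the normalization is carried by $I_0$, the model prediction might look like the sketch below (the function and argument names are just illustrative):

In [ ]:
def model_I(t, I0, p, tmax, tE):
    # Schneider eqs. 2.92 and 2.93: point-lens magnification as a function of time
    y = np.sqrt(p**2 + ((t - tmax) / tE)**2)
    mu = (y**2 + 2.0) / (y * np.sqrt(y**2 + 4.0))  # magnification, F(t)/F0
    # convert the magnified flux to a magnitude relative to the baseline I0
    return I0 - 2.5 * np.log10(mu)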

Lacking any better information, we'll assume that the sampling distributions for the magnitude measurements are Gaussian and independent, with means given by the "magnitude" column and standard deviations given by the "magnitude error" column, and that the time stamps are exact.
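Under those assumptions, the log-likelihood is a sum of independent Gaussian terms, one per measurement. A minimal sketch, assuming the hypothetical model_I function and data arrays above:

In [ ]:
def log_likelihood(params, t, I, Ierr):
    # independent Gaussian measurements with known standard deviations
    I0, p, tmax, tE = params
    resid = I - model_I(t, I0, p, tmax, tE)
    return -0.5 * np.sum((resid / Ierr)**2 + np.log(2.0 * np.pi * Ierr**2))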

Assignment

Do an MCMC fit of this microlensing model to your lightcurve data. In future weeks, we'll encourage you to use some of the MCMC packages that are out there, but for this problem the rule is:

You may only use code that you've written yourselves or adapted from the "Basic Monte Carlo" lesson/tutorial to do the fit.

This fit should be doable, if potentially annoying, with very simple MCMC methods. You'll appreciate the more advanced methods that much more when we get there!

Your solution should include the following:

  1. a PGM describing the model you're fitting
  2. expressions (in readable code, at a minimum) of the prior distributions and likelihood encoded by the PGM
  3. some justification of the choice of priors (can be brief)
  4. plots showing traces of each parameter, and an identified burn-in period
  5. some evaluation of how well the fit has converged (see below)
  6. 1D histograms of the parameter samples and 2D contour plots of parameter pairs (you can just use the corner package for this, which makes it trivial)
  7. "best fit" values and 68.3% confidence intervals from the 1D marginalized posteriors of each parameter. (Originally, this was going to specify the maximum-probability convention for best fit and CI, but I don't want everyone to be sidetracked by the lack of an easy-to-use function to calculate them. If you do have the time and desire to write such a function in Python, I'll bet all of your classmates will be grateful.)
  8. a plot of the best-fitting model lightcurve over the data, and some qualitative comments about how good a fit it appears (hint: depending on your data set, you may need to zoom in quite a lot to get a good look)
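For item 7, one simple convention is to report the median of the 1D marginalized samples along with the equal-tailed 68.3% interval between the 15.85th and 84.15th percentiles. A minimal sketch (note this is not the maximum-probability convention mentioned above, just an easy-to-compute alternative):

In [ ]:
def summarize(samples):
    # median and equal-tailed 68.3% credible interval from 1D samples
    lo, med, hi = np.percentile(samples, [15.85, 50.0, 84.15])
    return med, med - lo, hi - med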

A gift

Here's a piece of code which will calculate the Gelman-Rubin convergence criterion ($R$) for a single parameter, if you pass it a numpy.ndarray where the first axis corresponds to different chains and the second axis corresponds to MCMC steps. It's not particularly flexible or optimal, so feel free to improve it or to write your own.


In [ ]:
import numpy as np

def gelmanrubin(chains):
    # chains: ndarray of shape (M, N), i.e. M chains of N MCMC steps each
    M, N = chains.shape
    thetaJ = np.mean(chains, axis=1)  # per-chain means
    thetabar = np.mean(chains)        # grand mean over all samples
    # within-chain sample variances
    sJ = np.sum((chains - thetaJ[:, None])**2, axis=1) / (N - 1.0)
    W = np.mean(sJ)                                     # mean within-chain variance
    B = N / (M - 1.0) * np.sum((thetaJ - thetabar)**2)  # between-chain variance
    vartheta = (N - 1.0) / N * W + B / N  # pooled estimate of posterior variance
    return np.sqrt(vartheta / W)
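As a usage sketch, suppose you store your samples in an array of shape (number of chains, number of steps, number of parameters); then you could check each parameter after discarding burn-in like this (the array and variable names are just illustrative). A common rule of thumb is to hope for $R$ close to 1, e.g. less than roughly 1.1.

In [ ]:
# chains: ndarray of shape (M, nsteps, nparams); burnin: steps to discard
for j in range(chains.shape[2]):
    print('R for parameter', j, ':', gelmanrubin(chains[:, burnin:, j]))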