If you are running this notebook on , uncomment and execute the cell below. Otherwise you can jump down to the other import statements.
In [1]:
#!pip install emcee==3.0rc2
#!pip install corner
In [0]:
import numpy as np
import pandas as pd
from scipy.optimize import minimize, newton
import emcee
import corner
import matplotlib.pyplot as plt
np.random.seed(42)
In this example, we will download a time series of radial velocities for the star HD209458. This star hosts a Hot Jupiter exoplanet. In fact, this planet was the first to be seen in transit and was discovered 20 years ago yesterday!
Because the eccentricity is low for this planet, we can fit its orbit in the radial velocities with a relatively simple model: a sinusoid.
Below is a snippet of code that will download the time-series data from NASA Exoplanet Archive:
In [0]:
datafile = 'https://exoplanetarchive.ipac.caltech.edu/data/ExoData/0108/0108859/data/UID_0108859_RVC_001.tbl'
data = pd.read_fwf(datafile, header=0, names=['t', 'rv', 'rv_err'], skiprows=22)
data['t'] -= data['t'][0]
In [4]:
fig, ax = plt.subplots()
ax.errorbar(data['t'], data['rv'], data['rv_err'], fmt='o', ms=4)
ax.set_xlabel('Time (days)')
ax.set_ylabel(r'RV (m s$^{-1}$)');
First let's write a "get_model_predictions" function - this will resemble yesterday's same-named function, but instead of returning a line it should return a sinusoid. I suggest using the following free parameters, although there are a few alternative options that you may use instead:
theta = [period, # period of the sinusoid
amplitude, # semi-amplitude of the sinusoid
t0, # reference x at which sine phase = 0
rv0] # constant offset in y
In [0]:
def get_model_predictions(theta, t):
'''
Calculate RV predictions for parameters theta and timestamps t.
'''
period, amplitude, t0, rv0 = theta
model_preds = amplitude * np.sin(2. * np.pi / period * (t - t0)) + rv0
return model_preds
Write a lnprior function with flat priors on all parameters - again, this will be similar to yesterday's function, but with different values.
Hint: some of the bounds on these parameters will be physically motivated (i.e. orbital period cannot be negative). For others, you'll need to guess something reasonable but generous - i.e., a Hot Jupiter planet probably does not have an orbital period above a year or so.
In [0]:
def lnprior(theta):
period, amplitude, t0, rv0 = theta
if 0 < period <= 1e4 and 0 <= amplitude <= 1e3: # physical priors
lnp = np.log(1e-4) + np.log(1e-3)
else:
return -np.inf
if np.abs(t0) <= 1e3 and np.abs(rv0) <= 1e3: # generous flat priors
lnp += 2 * np.log(1/2e3)
else:
return -np.inf
return lnp
The following functions can be reused as-is from the previous day's Metropolis-Hastings exercise, so just copy-and-paste or import them:
lnlikelihood, lnposterior, hastings_ratio, propose_jump, mh_mcmc
In [0]:
def lnlikelihood(theta, y, x, y_unc):
model_preds = get_model_predictions(theta, x)
lnl = -np.sum((y-model_preds)**2/(2*y_unc**2))
return lnl
def lnposterior(theta, y, x, y_unc):
lnp = lnprior(theta)
if not np.isfinite(lnp):
return -np.inf
lnl = lnlikelihood(theta, y, x, y_unc)
lnpost = lnl + lnp
return lnpost
def hastings_ratio(theta_1, theta_0, y, x, y_unc):
lnpost1 = lnposterior(theta_1, y, x, y_unc)
lnpost0 = lnposterior(theta_0, y, x, y_unc)
h_ratio = np.exp(lnpost1 - lnpost0)
return h_ratio
def propose_jump(theta, cov):
if np.shape(theta) == np.shape(cov):
cov = np.diag(np.array(cov)**2)
proposed_position = np.random.multivariate_normal(theta, cov)
return proposed_position
def mh_mcmc(theta_0, cov, nsteps, y, x, y_unc):
positions = np.zeros((nsteps+1, len(theta_0)))
lnpost_at_pos = -np.inf*np.ones(nsteps+1)
acceptance_ratio = np.zeros_like(lnpost_at_pos)
accepted = 0
positions[0] = theta_0
lnpost_at_pos[0] = lnposterior(theta_0, y, x, y_unc)
for step_num in np.arange(1, nsteps+1):
proposal = propose_jump(positions[step_num-1], cov)
H = hastings_ratio(proposal, positions[step_num-1], y, x, y_unc)
R = np.random.uniform()
if H > R:
accepted += 1
positions[step_num] = proposal
lnpost_at_pos[step_num] = lnposterior(proposal, y, x, y_unc)
acceptance_ratio[step_num] = float(accepted)/step_num
else:
positions[step_num] = positions[step_num-1]
lnpost_at_pos[step_num] = lnpost_at_pos[step_num-1]
acceptance_ratio[step_num] = float(accepted)/step_num
return (positions, lnpost_at_pos, acceptance_ratio)
Let's start with initialization values.
To save some time, I will assert that if we made a Lomb-Scargle periodogram of the RVs, there would be a peak near period = 3.53 days, so start with that guess and let's figure out what the best values might be for the other parameters.
(If you finish early and are up for a bonus problem, you can double-check my assertion using astropy timeseries!)
In [0]:
theta_0 = [3.53, 80, 0, 0] # [period, amplitude, t0, rv0] starting guesses
Now run the MCMC for 5000 steps. I'll give you (the diagonal of a) covariance matrix to start with. As you saw yesterday afternoon, this cov
parameter sets the step sizes that the M-H algorithm will use when it proposes new values.
In [9]:
cov = [0.01, 1, 0.05, 0.01]
pos, lnpost, acc = mh_mcmc(theta_0, cov, 5000,
data['rv'], data['t'], data['rv_err'])
Do a pairs plot for the first two parameters. Does the behavior of this chain seem efficient?
In [10]:
fig, ax = plt.subplots()
ax.plot(pos[:,0], pos[:,1], 'o-', alpha=0.3)
ax.plot(theta_0[0], theta_0[1], '*', ms=30,
mfc='Crimson', mec='0.8', mew=2,
alpha=0.7)
ax.set_xlabel('Period', fontsize=14)
ax.set_ylabel(r'K (m s$^{-1}$)', fontsize=14)
fig.tight_layout()
This chain looks super inefficient for a couple of reasons: one, it's wandering from from the starting point, which implies a poor initialization and would require us to drop samples from the beginning (burn-in); and two, the acceptance fraction is low and it spends a long time at each point.
We'll use scipy.optimize.minimize to get best-fit parameters. Remember that the lnlikelihood function needs to be maximized not minimized, so we'll need a new function that works the same way, but negative.
In [11]:
def nll(*par):
'''
The negative ln(likelihood).
'''
return -1. * lnlikelihood(*par)
res = minimize(nll, theta_0,
args=(data['rv'], data['t'], data['rv_err']),
method='Powell')
print('Optimizer finished with message "{0}" and \n\
best-fit parameters {1}'.format(res['message'], res['x']))
Plot the data points and your best-fit model. Does the fit look reasonable? (You may need to zoom into a small time range to tell.)
In [12]:
plt.errorbar(data['t'], data['rv'], data['rv_err'],
fmt='o', ms=4)
xs = np.linspace(-0.1, 6, 1000)
plt.plot(xs, get_model_predictions(res['x'], xs), c='DarkOrange')
plt.xlim([-0.1,6])
plt.xlabel('Time (days)')
plt.ylabel(r'RV (m s$^{-1}$)');
Another way to see if we're on the right track is to plot the data phased to the orbital period that we found. Do that and optionally overplot the phased model as well.
In [13]:
period, amplitude, t0, rv0 = res['x']
fig, ax = plt.subplots()
phased_t = (data['t'] - t0) % period
ax.errorbar(phased_t / period, data['rv'], data['rv_err'],
fmt='o', ms=4)
phase_xs = np.linspace(0, period, 100)
ax.plot(phase_xs / period, get_model_predictions(res['x'], phase_xs + t0),
c='DarkOrange')
ax.set_xlabel('Phase')
ax.set_ylabel(r'RV (m s$^{-1}$)');
Now re-run the MCMC using these parameters as the initial values and make another pairs plot. Again, I'm going to give you some step size parameters to start with. Because we're now initializing the chain close to the likelihood maximum, we don't want it to move too far away, so I've lowered the values of cov
.
In [0]:
theta_bestfit = res['x']
cov = [0.001, 0.1, 0.01, 0.1]
pos, lnpost, acc = mh_mcmc(theta_bestfit, cov, 5000,
data['rv'], data['t'], data['rv_err'])
In [15]:
fig, ax = plt.subplots()
ax.plot(pos[:,0], pos[:,1], 'o-', alpha=0.3)
ax.plot(theta_bestfit[0], theta_bestfit[1], '*', ms=30,
mfc='Crimson', mec='0.8', mew=2,
alpha=0.7)
ax.set_xlabel('Period', fontsize=14)
ax.set_ylabel(r'K (m s$^{-1}$)', fontsize=14)
fig.tight_layout()
The chain is now staying relatively stationary, which is good! However, it's still spending a long time at each point.
First, let's just plot the sequence of orbital period values in the chain in a trace plot. From eyeballing this sequence, about how many steps do you think are needed to reach a sample that is independent from the previous one(s)?
In [16]:
plt.plot(pos[:,0]);
Writing an autocorrelation function for this purpose actually gets a bit tricky, so we'll use the built-in functionality of emcee.
For the documentation on these functions, check the emcee user guide.
For a more in-depth look at how this is calculated and why it's tricky, check out this tutorial.
In [17]:
acf = emcee.autocorr.function_1d(pos[:,0])
plt.plot(acf)
plt.xlabel('Lag')
plt.ylabel('Normalized ACF');
In [18]:
act = emcee.autocorr.integrated_time(pos[:,0], quiet=True)
print('The integrated autocorrelation time is estimated as: {0}'.format(act))
This is worrying - it means we have achieved very few actual independent draws from the posterior in our chain.
In [0]:
cov = [0.0001, 0.1, 0.01, 0.1]
pos, lnpost, acc = mh_mcmc(theta_bestfit, cov, 5000,
data['rv'], data['t'], data['rv_err'])
In [20]:
plt.plot(pos[:,0]);
In [21]:
acf = emcee.autocorr.function_1d(pos[:,0])
plt.plot(acf)
plt.xlabel('Lag')
plt.ylabel('Normalized ACF');
In [22]:
act = emcee.autocorr.integrated_time(pos[:,0])
print('The integrated autocorrelation time is estimated as: {0}'.format(act))
Much better!!
Using the step sizes and starting conditions that you deem best, run your MCMC for at least 500x the auto-correlation length to get a large number of independent samples. Plot the posterior distribution of radial velocity semi-amplitude K. This parameter is arguably the most important output of an RV fit, because it is a measurement of the mass of the planet.
In [0]:
pos, lnpost, acc = mh_mcmc(theta_bestfit, cov, 20000,
data['rv'], data['t'], data['rv_err'])
In [24]:
plt.hist(pos[:,1])
plt.xlabel(r'K (m s$^{-1}$)');
From these results, what can we say about the true value of K? What is the probability that K > 84 m/s? 85 m/s? 90 m/s? Are these numbers a reliable estimator of the true probability, in your opinion?
In [25]:
N_tot = len(pos[:,1])
print('The probability that K > 84 m/s is: {0:.2f}'.format(np.sum(pos[:,1] > 84.)/N_tot))
print('The probability that K > 85 m/s is: {0:.2f}'.format(np.sum(pos[:,1] > 85.)/N_tot))
print('The probability that K > 90 m/s is: {0:.2f}'.format(np.sum(pos[:,1] > 90.)/N_tot))
Note: we have not actually sampled parameter space around K > 90 m/s, so take this estimate with a grain of salt -- we can certainly conclude that the probability of K > 90 is low, but we'd need to actually calculate posterior values around K = 90 before we'd have a reliable estimate of the PDF there.
In the previous example, the orbit we were fitting had negligible eccentricity, so we were able to fit it with a sinusoid. In this example, we'll look at the high-eccentricity planet HD 80606b and fit a full Keplerian model to its RV data. This requires introducing some new free parameters to the model, which as we will see are not always straightforward to sample!
In [0]:
datafile = 'https://exoplanetarchive.ipac.caltech.edu/data/ExoData/0045/0045982/data/UID_0045982_RVC_006.tbl'
data = pd.read_fwf(datafile, header=0, names=['t', 'rv', 'rv_err'], skiprows=21)
data['t'] -= data['t'][0]
In [27]:
fig, ax = plt.subplots()
ax.errorbar(data['t'], data['rv'], data['rv_err'], fmt='o', ms=4)
ax.set_xlabel('Time (days)')
ax.set_ylabel(r'RV (m s$^{-1}$)');
In [28]:
plt.errorbar(data['t'] % 111.4, data['rv'], data['rv_err'], fmt='o', ms=4)
plt.xlabel('Phase (days)')
plt.ylabel(r'RV (m s$^{-1}$)');
This planet's orbit should look pretty different from a sine wave!
Remake the get_model_predictions
and lnprior
functions to fit a Keplerian.
Since this is a bit in the weeds of astronomy for the purposes of this workshop, I've gone ahead and written a solver for Kepler's equation and a get_model_predictions
function that will deliver RVs for you. Read over the docstring and use the information given there to write a lnprior
function for theta
.
In [0]:
def calc_ea(ma, ecc):
# Kepler solver - calculates eccentric anomaly
tolerance = 1e-3
ea = np.copy(ma)
while True:
diff = ea - ecc * np.sin(ea) - ma
ea -= diff / (1. - ecc * np.cos(ea))
if abs(diff).all() <= tolerance:
break
return ea
def get_model_predictions(theta, t):
'''
Calculate Keplerian orbital RVs
Input
-----
theta : list
A list of values for the following parameters:
Orbital period,
RV semi-amplitude,
eccentricity (between 0-1),
omega (argument of periastron; an angle in radians
denoting the orbital phase where the planet
passes closest to the host star)
Tp (time of periastron; reference timestamp for the above)
RV0 (constant RV offset)
t : list or array
Timestamps at which to calculate the RV
Returns
-------
rvs : list or array
Predicted RVs at the input times.
'''
P, K, ecc, omega, tp, rv0 = theta
ma = 2. * np.pi / P * (t - tp) # mean anomaly
ea = calc_ea(ma, ecc) # eccentric anomaly
f = 2.0 * np.arctan2(np.sqrt(1+ecc)*np.sin(ea/2.0),
np.sqrt(1-ecc)*np.cos(ea/2.0)) # true anomaly
rvs = - K * (np.cos(omega + f) + ecc*np.cos(omega))
return rvs + rv0
def lnprior(theta):
period, amplitude, ecc, omega, tp, rv0 = theta
if 0 < period <= 1e5 and 0 <= amplitude <= 1e4: # physical priors
lnp = np.log(1e-5) + np.log(1e-4)
else:
return -np.inf
if np.abs(tp) <= 1e4 and np.abs(rv0) <= 1e4: # generous flat priors
lnp += 2 * np.log(1/2e4)
else:
return -np.inf
if 0 <= ecc < 1 and 0 < omega < 2*np.pi: # more physical priors
lnp += np.log(1) + np.log(1/(2*np.pi))
else:
return -np.inf
return lnp
In [30]:
theta_0 = [111.4, 480, 0.95, 2.0, 89, -200] # P, K, ecc, omega, tp, rv0
plt.errorbar(data['t'], data['rv'], data['rv_err'],
fmt='o', ms=4)
xs = np.linspace(900, 1050, 1000)
plt.plot(xs, get_model_predictions(theta_0, xs), c='DarkOrange')
plt.xlim([900,1050])
plt.xlabel('Time (days)')
plt.ylabel(r'RV (m s$^{-1}$)');
In [31]:
cov = [0.1, 100, 0.01, 0.1, 0.1, 100]
pos, lnpost, acc = mh_mcmc(theta_0, cov, 10000,
data['rv'], data['t'], data['rv_err'])
In [32]:
plt.plot(pos[:,2])
plt.ylabel('Eccentricity')
plt.xlabel('Step');
In [33]:
corner.corner(pos, labels=['Period (days)', r"K (m s$^{-1}$)",
r"$e$", r"$\omega$",
r"T$_p$", r"RV$_0$ (m s$^{-1}$)"]);
It's hard to tell since we have very few independent samples, but $e$ and $\omega$ are definitely both highly correlated with many parameters and with each other!
Ford (2005) suggests mitigating this issue by reparameterizing the orbital parameters $e$ and $\omega$ as $e cos\omega$ and $e sin\omega$. Modify the get_model_predictions
and lnprior
functions accordingly and rerun the MCMC. Does performance improve?
Note: the efficiency of a basic MCMC in this situation is never going to be excellent. We'll talk more about challenging cases like this and how to deal with them in later lectures!
In [0]:
def get_model_predictions(theta, t):
P, K, esinw, ecosw, tp, rv0 = theta
omega = np.arctan2(esinw, ecosw)
ecc = esinw / np.sin(omega)
ma = 2. * np.pi / P * (t - tp) # mean anomaly
ea = calc_ea(ma, ecc) # eccentric anomaly
f = 2.0 * np.arctan2(np.sqrt(1+ecc)*np.sin(ea/2.0),
np.sqrt(1-ecc)*np.cos(ea/2.0)) # true anomaly
rvs = - K * (np.cos(omega + f) + ecc*np.cos(omega))
return rvs + rv0
def lnprior(theta):
period, amplitude, esinw, ecosw, tp, rv0 = theta
if 0 < period <= 1e5 and 0 <= amplitude <= 1e4: # physical priors
lnp = np.log(1e-5) + np.log(1e-4)
else:
return -np.inf
if np.abs(tp) <= 1e4 and np.abs(rv0) <= 1e4: # generous flat priors
lnp += 2 * np.log(1/2e4)
else:
return -np.inf
if -1 <= esinw < 1 and -1 < ecosw < 1: # more physical priors
lnp += 2 * np.log(1/2)
else:
return -np.inf
return lnp
In [35]:
theta_0 = [111.4, 480, 0.95 * np.cos(2), 0.95 * np.sin(2), 89, -200]
cov = [0.1, 100, 0.1, 0.1, 0.1, 100]
pos, lnpost, acc = mh_mcmc(theta_0, cov, 5000,
data['rv'], data['t'], data['rv_err'])
In [36]:
plt.plot(pos[:,2])
plt.ylabel('ecosw')
plt.xlabel('Step');
In [37]:
corner.corner(pos, labels=['Period (days)', r"K (m s$^{-1}$)",
r"$e\sin(\omega)$", r"$e\cos(\omega)$",
r"T$_p$", r"RV$_0$ (m s$^{-1}$)"]);
In [0]: