In [1]:
%run notebook_setup
In this science demo tutorial, we will reproduce the results in Thompson et al. 2019, who found and followed-up a candidate stellar-mass black hole companion to a giant star in the Milky Way. We will first use The Joker to constrain the orbit of the system using the TRES follow-up radial velocity data released in their paper and show that we get consistent period and companion mass constraints from modeling these data. We will then do a joint analysis of the TRES and APOGEE data for this source by simultaneously fitting for and marginalizing over an unknown constant velocity offset between the two surveys.
A bunch of imports we will need later:
In [2]:
from astropy.io import ascii
from astropy.time import Time
import astropy.units as u
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pymc3 as pm
import exoplanet.units as xu
import exoplanet as xo
import corner
import thejoker as tj
from twobody.transforms import get_m2_min
In [3]:
# set up a random number generator to ensure reproducibility
seed = 42
rnd = np.random.default_rng(seed=seed)
In [4]:
tres_tbl = ascii.read(
"""8006.97517 0.000 0.075
8023.98151 -43.313 0.075
8039.89955 -27.963 0.045
8051.98423 10.928 0.118
8070.99556 43.782 0.075
8099.80651 -30.033 0.054
8106.91698 -42.872 0.135
8112.81800 -44.863 0.088
8123.79627 -25.810 0.115
8136.59960 15.691 0.146
8143.78352 34.281 0.087""",
names=['HJD', 'rv', 'rv_err'])
tres_tbl['rv'].unit = u.km/u.s
tres_tbl['rv_err'].unit = u.km/u.s
In [5]:
apogee_tbl = ascii.read(
"""6204.95544 -37.417 0.011
6229.92499 34.846 0.010
6233.87715 42.567 0.010""",
names=['HJD', 'rv', 'rv_err'])
apogee_tbl['rv'].unit = u.km/u.s
apogee_tbl['rv_err'].unit = u.km/u.s
In [6]:
tres_data = tj.RVData(
t=Time(tres_tbl['HJD'] + 2450000, format='jd', scale='tcb'),
rv=u.Quantity(tres_tbl['rv']),
rv_err=u.Quantity(tres_tbl['rv_err']))
apogee_data = tj.RVData(
t=Time(apogee_tbl['HJD'] + 2450000, format='jd', scale='tcb'),
rv=u.Quantity(apogee_tbl['rv']),
rv_err=u.Quantity(apogee_tbl['rv_err']))
Let's now plot the data from these two instruments:
In [7]:
for d, name in zip([tres_data, apogee_data], ['TRES', 'APOGEE']):
d.plot(color=None, label=name)
plt.legend(fontsize=18)
Out[7]:
The two data sets are separated by a large gap in observations between the end of APOGEE and the start of the RV follow-up with TRES. Since there are more observations with TRES, we will start by running The Joker with just data from TRES before using all of the data. Let's plot the TRES data alone:
In [8]:
_ = tres_data.plot()
It is pretty clear that there is a periodic signal in the data, with a period between 10s to ~100 days (from eyeballing the plot above), so this limits the range of periods we need to sample over with The Joker below. The reported uncertainties on the individual RV measurements (plotted above, I swear) are all very small (typically smaller than the markers). So, we may want to allow for the fact that these could be under-estimated. With The Joker, we support this by accepting an additional nonlinear parameter, s
, that specifies a global, extra uncertainty that is added in quadrature to the data uncertainties while running the sampler. That is, the uncertainties used for computing the likelihood in The Joker are computed as:
$$
\sigma_n = \sqrt{\sigma_{n,0}^2 + s^2}
$$
where $\sigma_{n,0}$ are the values reported for each $n$ data point in the tables above. We'll use a log-normal prior on this extra error term, but will otherwise use the default prior form for The Joker:
In [9]:
with pm.Model() as model:
# Allow extra error to account for under-estimated error bars
s = xu.with_unit(pm.Lognormal('s', -2, 1),
u.km/u.s)
prior = tj.JokerPrior.default(
P_min=16*u.day, P_max=128*u.day, # Range of periods to consider
sigma_K0=30*u.km/u.s, P0=1*u.year, # scale of the prior on semiamplitude, K
sigma_v=25*u.km/u.s, # std dev of the prior on the systemic velocity, v0
s=s
)
With the prior set up, we can now generate prior samples, and run the rejection sampling step of The Joker:
In [10]:
# Generate a large number of prior samples:
prior_samples = prior.sample(size=1_000_000,
random_state=rnd)
In [11]:
# Run rejection sampling with The Joker:
joker = tj.TheJoker(prior, random_state=rnd)
samples = joker.rejection_sample(tres_data, prior_samples,
max_posterior_samples=256)
samples
Out[11]:
Only 1 sample is returned from the rejection sampling step - let's see how well it matches the data:
In [12]:
_ = tj.plot_rv_curves(samples, data=tres_data)
Let's look at the values of the sample that was returned, and compare that to the values reported in Thompson et al. 2019, included below for convenience: $$ P = 83.205 \pm 0.064\\ e = 0.00476 \pm 0.00255\\ K = 44.615 \pm 0.123 $$
In [13]:
samples.tbl['P', 'e', 'K']
Out[13]:
Already these look very consistent with the values inferred in the paper!
Let's now also plot the data phase-folded on the period returned in the one sample we got from The Joker:
In [14]:
_ = tres_data.plot(phase_fold=samples[0]['P'])
At this point, since the data are very constraining, we could use this one Joker sample to initialize standard MCMC to generate posterior samplings in the orbital parameters for this system. We will do that below, but first let's see how things look if we include both TRES and APOGEE data in our modeling.
One of the challenges with incorporating data from the two surveys is that they were taken with two different spectrographs, and there could be instrumental offsets that manifest as shifts in the absolute radial velocities measured between the two instruments. The Joker now supports simultaneously sampling over additional parameters that represent instrumental or calibratrion offsets, so let's take a look at how to run The Joker in this mode.
To start, we can pack the two datasets into a single list that contains data from both surveys:
In [15]:
data = [apogee_data, tres_data]
Before we run anything, let's try phase-folding both datasets on the period value we got from running on the TRES data alone:
In [16]:
tres_data.plot(color=None, phase_fold=np.mean(samples['P']))
apogee_data.plot(color=None, phase_fold=np.mean(samples['P']))
Out[16]:
That looks pretty good, but the period is clearly slightly off and there seems to be a constant velocity offset between the two surveys, given that the APOGEE RV points don't seem to lie in the RV curve. So, let's now try running The Joker on the joined dataset!
To allow for an unknown constant velocity offset between TRES and APOGEE, we have to define a new parameter for this offset and specify a prior. We'll put a Gaussian prior on this offset parameter (named dv0_1
below), with a mean of 0 and a standard deviation of 10 km/s, because it doesn't look like the surveys have a huge offset.
In [17]:
with pm.Model() as model:
# The parameter that represents the constant velocity offset between
# APOGEE and TRES:
dv0_1 = xu.with_unit(pm.Normal('dv0_1', 0, 5.),
u.km/u.s)
# The same extra uncertainty parameter as previously defined
s = xu.with_unit(pm.Lognormal('s', -2, 1),
u.km/u.s)
# We can restrict the prior on prior now, using the above
prior_joint = tj.JokerPrior.default(
# P_min=16*u.day, P_max=128*u.day,
P_min=75*u.day, P_max=90*u.day,
sigma_K0=30*u.km/u.s, P0=1*u.year,
sigma_v=25*u.km/u.s,
v0_offsets=[dv0_1],
s=s
)
prior_samples_joint = prior_joint.sample(size=10_000_000,
random_state=rnd)
In [18]:
# Run rejection sampling with The Joker:
joker_joint = tj.TheJoker(prior_joint, random_state=rnd)
samples_joint = joker_joint.rejection_sample(data,
prior_samples_joint,
max_posterior_samples=256)
samples_joint
Out[18]:
Here we again only get one sample back from The Joker, because these ata are so constraining:
In [19]:
_ = tj.plot_rv_curves(samples_joint, data=data)
Now, let's fire up standard MCMC, using the one Joker sample to initialize. We will use the NUTS sampler in pymc3
to run here. When running MCMC to model radial velocities with Keplerian orbits, it is typically important to think about the parametrization. There are several angle parameters in the two-body problem (e.g., argument of pericenter, phase, inclination, etc.) that can be especially hard to sample over naïvely. Here, for running MCMC, we will instead sample over $M_0 - \omega, \omega$ instead of $M_0, \omega$, and we will define these angles as exoplanet.distributions.Angle
distributions, which internally transform and sample in $\cos{x}, \sin{x}$ instead:
In [20]:
from exoplanet.distributions import Angle
with pm.Model():
# See note above: when running MCMC, we will sample in the parameters
# (M0 - omega, omega) instead of (M0, omega)
M0_m_omega = xu.with_unit(Angle('M0_m_omega'), u.radian)
omega = xu.with_unit(Angle('omega'), u.radian)
# M0 = xu.with_unit(Angle('M0'), u.radian)
M0 = xu.with_unit(pm.Deterministic('M0', M0_m_omega + omega),
u.radian)
# The same offset and extra uncertainty parameters as above:
dv0_1 = xu.with_unit(pm.Normal('dv0_1', 0, 5.), u.km/u.s)
s = xu.with_unit(pm.Lognormal('s', -2, 0.5),
u.km/u.s)
prior_mcmc = tj.JokerPrior.default(
P_min=16*u.day, P_max=128*u.day,
sigma_K0=30*u.km/u.s, P0=1*u.year,
sigma_v=25*u.km/u.s,
v0_offsets=[dv0_1],
s=s,
pars={'M0': M0, 'omega': omega}
)
joker_mcmc = tj.TheJoker(prior_mcmc, random_state=rnd)
mcmc_init = joker_mcmc.setup_mcmc(data, samples_joint)
trace = pm.sample(
tune=1000, draws=1000,
start=mcmc_init,
step=xo.get_dense_nuts_step(target_accept=0.95),
random_seed=seed)
We can now use pymc3
to look at some statistics of the MC chains to assess convergence:
In [21]:
pm.summary(trace, var_names=prior_mcmc.par_names)
Out[21]:
We can then transform the MCMC samples back into a JokerSamples
instance so we can manipulate and visualize the samples:
In [22]:
mcmc_samples = joker_mcmc.trace_to_samples(trace, data=data)
mcmc_samples.wrap_K()
Out[22]:
For example, we can make a corner plot of the orbital parameters (note the strong degenceracy between M0
and omega
! But also note that we don't sample in these parameters explicitly, so this shouldn't affect convergence):
In [23]:
df = mcmc_samples.tbl.to_pandas()
_ = corner.corner(df)
We can also use the median MCMC sample to fold the data and plot residuals relative to our inferred RV model:
In [24]:
fig, axes = plt.subplots(2, 1, figsize=(6, 8), sharex=True)
_ = tj.plot_phase_fold(mcmc_samples.median(), data, ax=axes[0], add_labels=False)
_ = tj.plot_phase_fold(mcmc_samples.median(), data, ax=axes[1], residual=True)
for ax in axes:
ax.set_ylabel(f'RV [{apogee_data.rv.unit:latex_inline}]')
axes[1].axhline(0, zorder=-10, color='tab:green', alpha=0.5)
axes[1].set_ylim(-1, 1)
Out[24]:
Finally, let's convert our orbit samples into binary mass function, $f(M)$, values to compare with one of the main conclusions of the Thompson et al. paper. We can do this by first converting the samples to KeplerOrbit
objects, and then using the .m_f
attribute to get the binary mass function values:
In [ ]:
mfs = u.Quantity([mcmc_samples.get_orbit(i).m_f
for i in np.random.choice(len(mcmc_samples), 1024)])
plt.hist(mfs.to_value(u.Msun), bins=32);
plt.xlabel(rf'$f(M)$ [{u.Msun:latex_inline}]');
# Values from Thompson et al., showing 1-sigma region
plt.axvline(0.766, zorder=100, color='tab:orange')
plt.axvspan(0.766 - 0.00637,
0.766 + 0.00637,
zorder=10, color='tab:orange',
alpha=0.4, lw=0)
In the end, using both the APOGEE and TRES data, we confirm the results from the paper, and find that the binary mass function value suggests a large mass companion. A success for reproducible science!