This tutorial shows how to fit a single-order spectrum using our previous setup on some ~mysterious~ IRTF SpeX data. The spectrum is available for download here.
Normally, you would pre-process your data: loading the FITS files, separating out the wavelengths, fluxes, uncertainties, and any masks. In addition, you would need to convert your data into the same units as your emulator. In our case, the PHOENIX emulator uses $\mathrm{\AA}$ and $\mathrm{erg/cm^2/s/cm}$. For this example, though, I've already created a spectrum that you can load directly.
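If you do need to build the spectrum yourself, a minimal pre-processing sketch might look like the following. The file name and FITS column names here are hypothetical, I'm assuming the Spectrum constructor accepts wavelengths, fluxes, and uncertainties in that order, and the final save call assumes a counterpart to the Spectrum.load used below, so adapt this to your own reduction products.
import astropy.units as u
from astropy.io import fits
from Starfish.spectrum import Spectrum
# hypothetical reduced SpeX order; adjust the extension and column names to your pipeline
with fits.open("my_spex_order.fits") as hdul:
    wave = hdul[1].data["wavelength"] * u.micron
    flux = hdul[1].data["flux"] * u.Unit("erg / (cm2 s micron)")
    sigma = hdul[1].data["uncertainty"] * u.Unit("erg / (cm2 s micron)")
# convert to the emulator's units: Angstrom and erg/cm^2/s/cm
wave = wave.to(u.AA)
flux = flux.to(u.Unit("erg / (cm2 s cm)"))
sigma = sigma.to(u.Unit("erg / (cm2 s cm)"))
data = Spectrum(wave.value, flux.value, sigma.value)
data.save("example_spec.hdf5")  # assumed save() counterpart to Spectrum.load()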
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use("seaborn")
plt.rcParams["text.usetex"] = True
In [2]:
from Starfish.spectrum import Spectrum
data = Spectrum.load("example_spec.hdf5")
data.plot();
Now we can set up our initial model. At minimum, we need an emulator, our data, and an initial set of the library grid parameters. Every extra keyword argument we add becomes another model parameter. For more information on what parameters are available and what effect they have, see the SpectrumModel documentation.
Some of these parameters are based on guesses or pre-existing knowledge. In particular, if you want to fit log_scale, you should spend some time tuning it by eye first. We also want our global_cov:log_amp to be reasonable, so pay attention to the $\sigma$ contours in the residual plots, too.
There aren't any previous in-depth works on this star, so we will start with some values based on the spectral type alone.
In [3]:
from Starfish.models import SpectrumModel
model = SpectrumModel(
    "F_SPEX_emu.hdf5",
    data,
    grid_params=[6800, 4.2, 0],
    Av=0,
    global_cov=dict(log_amp=38, log_ls=2),
)
model
Out[3]:
In this plot, we can see the data and the model in the left panel, the absolute errors (residuals) along with the diagonal of the covariance matrix shown as $\sigma$ contours in the top-right, and the relative errors (residuals / flux) in the bottom-right.
In [5]:
model.plot();
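If you do plan to fit log_scale, the plot above is the place to tune it by eye: build a few trial models with different values and keep the one that puts the model flux on the same scale as the data. This is only a sketch; the trial values below are placeholders, and it simply re-uses the constructor call from above with an extra log_scale keyword (recall that any extra keyword becomes a model parameter).
# eyeball-tuning log_scale: try a few placeholder values and compare the plots
for trial_log_scale in (-12, -11, -10):
    trial_model = SpectrumModel(
        "F_SPEX_emu.hdf5",
        data,
        grid_params=[6800, 4.2, 0],
        Av=0,
        global_cov=dict(log_amp=38, log_ls=2),
        log_scale=trial_log_scale,
    )
    print("log_scale =", trial_log_scale)
    trial_model.plot()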
Now let's do a maximum a posteriori (MAP) point estimate for our data.
We freeze logg here because the PHOENIX models' response to logg is relatively flat compared to our data, so we fix its value using the freeze mechanics. This is equivalent to applying a $\delta$-function prior.
In [6]:
model.freeze("logg")
model.labels # These are the fittable parameters
Out[6]:
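The frozen parameter keeps its current value; it simply drops out of the list of fittable parameters. If you change your mind later, I believe the SpectrumModel API provides a complementary thaw method that reverses this; treat the call below as an assumption and check the documentation.
# release logg again if we later decide to fit it (assumes thaw mirrors freeze)
# model.thaw("logg")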
Here we specify some priors using scipy.stats distributions. If you have a custom distribution you want to use, create a class and make sure it has a logpdf member function. Keep in mind that scipy.stats distributions are parametrized by loc and scale, so st.uniform(-0.5, 0.5) below is uniform on $[-0.5, 0]$ and st.halfnorm(0, 0.2) is a half-normal with scale 0.2.
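As a concrete illustration, here is a hypothetical hand-rolled prior, a normal distribution truncated to a fixed range, which only needs to expose logpdf to be usable in the priors dictionary below:
import numpy as np

# a hypothetical custom prior: a normal distribution truncated to [lower, upper]
class TruncatedNormalPrior:
    def __init__(self, mean, std, lower, upper):
        self.mean = mean
        self.std = std
        self.lower = lower
        self.upper = upper

    def logpdf(self, x):
        # outside the allowed range the prior probability is zero
        if x < self.lower or x > self.upper:
            return -np.inf
        # inside, an (unnormalized) Gaussian log-density is enough for MAP/MCMC
        return -0.5 * ((x - self.mean) / self.std) ** 2

# this could then be used alongside the scipy.stats priors below, e.g.
# "Z": TruncatedNormalPrior(0, 0.3, -0.5, 0.5)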
In [8]:
import scipy.stats as st
priors = {
"T": st.norm(6800, 100),
"Z": st.uniform(-0.5, 0.5),
"Av": st.halfnorm(0, 0.2),
"global_cov:log_amp": st.norm(38, 1),
"global_cov:log_ls": st.uniform(0, 10),
}
Using the above priors, we can do our MAP optimization using scipy.optimize.minimize, which is conveniently baked into the train method of our model. This should give us a good starting point for our MCMC sampling later.
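Under the hood this is just numerical optimization of the negative log-posterior. A rough, hand-rolled sketch of the same idea (not the actual train implementation; it assumes a get_param_vector counterpart to the set_param_vector method used later in this example, and the optimizer choice is illustrative) would be:
from scipy.optimize import minimize

def neg_log_posterior(P):
    # update the model with the trial parameter vector and evaluate
    # the log-likelihood plus log-priors, negated for minimization
    model.set_param_vector(P)
    return -model.log_likelihood(priors)

P0 = model.get_param_vector()  # assumed accessor for the current parameter vector
soln = minimize(neg_log_posterior, P0, method="Nelder-Mead")
model.set_param_vector(soln.x)  # leave the model at the optimum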
In [9]:
%time model.train(priors)
Out[9]:
In [11]:
model
Out[11]:
In [13]:
model.plot();
In [14]:
model.save("example_MAP.toml")
Now, we will sample from our model. Note that Starfish is designed to be flexible enough to plug into any sampler front-end that works with a black-box likelihood function. In our case, we will continue with emcee, which provides an ensemble sampler; we are using a pre-release of version 3.0. This document serves only as an example, and details about emcee's usage should be sought in its documentation.
For this basic example, I will freeze the global covariance parameters (we never added any local covariance kernels), so we are only sampling over T, Z, and Av.
In [15]:
import emcee
emcee.__version__
Out[15]:
In [17]:
model.load("example_MAP.toml")
model.freeze("global_cov")
model.labels
Out[17]:
In [19]:
import numpy as np
# Set our walkers and dimensionality
nwalkers = 50
ndim = len(model.labels)
# Initialize gaussian ball for starting point of walkers
scales = {"T": 1, "Av": 0.01, "Z": 0.01}
ball = np.random.randn(nwalkers, ndim)
for i, key in enumerate(model.labels):
    ball[:, i] *= scales[key]
    ball[:, i] += model[key]
In [20]:
# our objective to maximize
def log_prob(P, priors):
    model.set_param_vector(P)
    return model.log_likelihood(priors)

# Set up our backend and sampler
backend = emcee.backends.HDFBackend("example_chain.hdf5")
backend.reset(nwalkers, ndim)
sampler = emcee.EnsembleSampler(
    nwalkers, ndim, log_prob, args=(priors,), backend=backend
)
Here we start our sampler and, following this example, check for convergence every 10 steps, with a maximum burn-in of 1000 steps.
In [21]:
max_n = 1000
# We'll track how the average autocorrelation time estimate changes
index = 0
autocorr = np.empty(max_n)
# This will be useful for testing convergence
old_tau = np.inf
# Now we'll sample for up to max_n steps
for sample in sampler.sample(ball, iterations=max_n, progress=True):
    # Only check convergence every 10 steps
    if sampler.iteration % 10:
        continue
    # Compute the autocorrelation time so far
    # Using tol=0 means that we'll always get an estimate even
    # if it isn't trustworthy
    tau = sampler.get_autocorr_time(tol=0)
    autocorr[index] = np.mean(tau)
    index += 1
    # skip the convergence math if it's just going to yell at us
    if not np.isfinite(tau).all():
        continue
    # Check convergence
    converged = np.all(tau * 10 < sampler.iteration)
    converged &= np.all(np.abs(old_tau - tau) / tau < 0.01)
    if converged:
        print(f"Converged at sample {sampler.iteration}")
        break
    old_tau = tau
After our model has converged, let's take a few extra samples to make sure we have clean chains. Remember, we have 50 walkers, so 100 more steps gives us 5,000 additional samples across the ensemble!
In [22]:
sampler.run_mcmc(backend.get_last_sample(), 100, progress=True);
In [23]:
import arviz as az
import corner
print(az.__version__, corner.__version__)
In [24]:
reader = emcee.backends.HDFBackend("example_chain.hdf5")
full_data = az.from_emcee(reader, var_names=model.labels)
In [25]:
az.plot_trace(full_data);
After seeing our full traces, let's discard the burn-in and thin the chains.
In [26]:
tau = reader.get_autocorr_time(tol=0)
burnin = int(tau.max())
thin = int(0.3 * np.min(tau))
burn_samples = reader.get_chain(discard=burnin, thin=thin)
log_prob_samples = reader.get_log_prob(discard=burnin, thin=thin)
log_prior_samples = reader.get_blobs(discard=burnin, thin=thin)
dd = dict(zip(model.labels, burn_samples.T))
burn_data = az.from_dict(dd)
In [27]:
az.plot_trace(burn_data);
In [28]:
az.summary(burn_data)
Out[28]:
In [30]:
az.plot_posterior(burn_data, ["T", "Z", "Av"]);
In [31]:
# See https://corner.readthedocs.io/en/latest/pages/sigmas.html#a-note-about-sigmas
sigmas = ((1 - np.exp(-0.5)), (1 - np.exp(-2)))
corner.corner(
    burn_samples.reshape((-1, 3)),
    labels=model.labels,
    quantiles=(0.05, 0.16, 0.84, 0.95),
    levels=sigmas,
    show_titles=True,
);
After looking at our posteriors, let's take a look at our fit.
In [32]:
best_fit = dict(az.summary(burn_data)["mean"])
model.set_param_dict(best_fit)
model
Out[32]:
In [34]:
model.plot();
And finally, we can save our best fit.
In [35]:
model.save("example_sampled.toml")
Now, on to the next star!