Analysis for the case that the proto-stellar (i.e., initial) helium mass fraction is fixed to a value that is linearly proportional to the heavy element mass fraction, $Z_i$. Specifically, we assume that
\begin{equation} Y_i = Y_{\textrm{prim}} + \frac{\Delta Y}{\Delta Z}\left(Z_i - Z_{\textrm{prim}}\right), \end{equation}where $Y_{\textrm{prim}}$ and $Z_{\textrm{prim}}$ are the primordial helium and heavy element abundances immediately following big bang nucleosynthesis, $Z_i$ is the proto-stellar heavy element abundance, and $\Delta Y / \Delta Z$ is a helium enrichment factor. Here, we assume $Z_{\textrm{prim}} = 0$, $Y_{\textrm{prim}} = 0.2488$ (Piembert et al. 2007), and $\Delta Y / \Delta Z = 1.76$. The latter was determined based on the enrichment factor required during our solar model calibration and is consistent with more rigorous studies that suggest $\Delta Y / \Delta Z = 2 \pm 1$ (e.g., Casagrande et al. 2007). We shall return address the validity of the helium enrichment factor later.
Three different comparisons with models were created: one with a weak metallicity constraint for stars that don't have quoted observational uncertainties (assumed $1\sigma = \pm 0.20$ dex), one with a strong metallicity contraint ($1\sigma = 0.05$ dex), and a run where we consider only contraints on the bolometric flux and angular diameter, since the observed effective temperature is a derived quantity.
First, we must load the data from each comparison (note, data is pre-processed). We'll start by loading data representing the most likely values from the MCMC runs,
In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
FeH_weak = np.genfromtxt('data/run09_mle_props.txt') # weak metallicity prior
FeH_strong = np.genfromtxt('data/run10_mle_props.txt') # strong metallicity prior
no_Teff = np.genfromtxt('data/run11_mle_props.txt') # no Teff observable in likelihood function
The pre-processed data files have observational quantities embedded within them. However, specific star names are (for the moment) kept hidden for proprietary reasons. It means we're storing a bit more data in memeory, since each data file contains observed quantities, but no matter. Anyway, we can look how well the MCMC does recovering particular values.
Starting with distance,
In [2]:
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
# set axis labels and plot one-to-one line
for ax in axes:
ax.set_xlabel('Observed Distance (pc)', fontsize=22.)
ax.plot( np.arange(0.0, 20.0, 1.0), np.arange(0.0, 20.0, 1.0), '--', lw=2, color='#444444')
axes[0].set_ylabel('Inferred Distance (pc)', fontsize=22.)
# plot distance data
axes[0].plot(1.0/FeH_weak[:, 20], FeH_weak[:, 4], 'o', color='#4682B4')
axes[1].plot(1.0/FeH_strong[:, 20], FeH_strong[:, 4], 'o', color='#4682B4')
axes[2].plot(1.0/no_Teff[:, 20], no_Teff[:, 4], 'o', color='#4682B4')
Out[2]:
The first two cases accurately reproduce the observed distances. However, the third case (no $T_{\textrm{eff}}$ constraint) does not. This may suggest that the latter MCMC runs did not fully converge. Scratch that -- there was a typo in the code that lead to reasonable looking, but incorrect results. Distances are well recovered. Moving now to metallicity,
In [3]:
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
# set axis labels and plot one-to-one line
for ax in axes:
ax.set_xlabel('Observed [Fe/H] (dex)', fontsize=22.)
ax.set_xlim((-0.5, 0.5))
ax.set_ylim((-0.5, 0.5))
ax.plot( np.arange(-0.5, 0.51, 0.5), np.arange(-0.5, 0.51, 0.5), '--', lw=2, color='#444444')
axes[0].set_ylabel('Inferred [M/H] (dex)', fontsize=22.)
# plot metallicity data
axes[0].plot(FeH_weak[:, 30], FeH_weak[:, 1], 'o', color='#4682B4')
axes[1].plot(FeH_strong[:, 30], FeH_strong[:, 1], 'o', color='#4682B4')
axes[2].plot(no_Teff[:, 30], no_Teff[:, 1], 'o', color='#4682B4')
Out[3]:
Metallicities are not so well reproduced without a strong metallicity prior (small uncertainty in the observed value). Even then, there is an apparent preferences for some stars to prefer metallicities around [Fe/H] $= -0.15$ dex. There is also a slight systematic shift of the inferred metallicities toward higher values. This is not necessarily indicative of a complete failure of the models to reproduce the observed values, as we must also be concerned with the present day metallicity, not the proto-stellar metallicity, which is inferred from the models. Due to gravitational settling and various diffusion processes, we should expect a marginally higher metallicity value among the proto-stellar values by upward of 0.05 dex. Unfortunately, the expected offset is temperature dependent, as it is a strong function of the depth of the convection zone and the temporal evolution of the surface convection zone depth. Fully convective stars, for example, are not expected to be affected by diffusive process as the timescale for mixing by the convection zone is orders of magnitude shorter. Nevertheless, as can be seen in the left-most panel, the models (if given the opportunity) will prefer significantly higher metallicity values, which should be seen as a potential issue among the models.
Masses inferred from the three data sets may also be compared.
In [4]:
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
# set axis labels and plot one-to-one line
for ax in axes:
ax.set_xlabel('Inferred Mass ($M_{\odot}$), weak prior', fontsize=18.)
ax.set_xlim((0.0, 1.0))
ax.set_ylim((0.0, 1.0))
ax.plot( np.arange(0.0, 1.01, 0.5), np.arange(0.0, 1.01, 0.5), '--', lw=2, color='#444444')
axes[0].set_ylabel('Inferred Mass ($M_{\odot}$), strong prior', fontsize=18.)
axes[1].set_ylabel('Inferred Mass ($M_{\odot}$), no $T_{\\rm eff}$', fontsize=18.)
# plot masses
axes[0].plot(FeH_weak[:, 0], FeH_strong[:, 0], 'o', color='#4682B4')
axes[1].plot(FeH_weak[:, 0], no_Teff[:, 0], 'o', color='#4682B4')
Out[4]:
In general, masses estimates are consistent among all three trials. Masses inferred in the weak metallicity prior trial are systematically higher than those in the other two trials, but not by a large margin. Issues arise in the vicinity of $0.80 M_{\odot}$, with the strong metallicity prior and no $T_{\textrm{eff}}$ trials showing a propensity for stars to have masses around $0.77 M_{\odot}$. This propensity does not appear to exist under the weak metallicity prior. While the effective temperature is a derived quantity, results of trials adopting an effective temperature prior appear to derive more reliable results that are devoid of a pile up in the $0.80 M_{\odot}$ vicinity. Masses above that threshold are most certainly expected, particularly in the case of Gl 559 B ($\alpha$ Cen B), which has an expected mass of approximately $0.92±0.02 M_{\odot}$. Unfortunately, none of the various trials were able to recover this mass within the $1\sigma$ limit.
How well do the various trials recover the observed properties? Let's start with the two primary observables: angular diameter and bolometric flux. These are inextricably coupled to the star's distance when comparing with stellar evolution models, which only provide information on the stellar radius and bolometric luminosity. Nevertheless, we saw above that distances were well recovered, permitting a reliable comparison of the more fundamental observables. First we'll define these new properties in new arrays,
In [5]:
# define in terms of (obs - model)/sigma_observed
#
# angular diameters
dTheta_sigma_weak = (FeH_weak[:, 18] - FeH_weak[:, 8])/FeH_weak[:, 19]
dTheta_sigma_strong = (FeH_strong[:, 18] - FeH_strong[:, 8])/FeH_strong[:, 19]
dTheta_sigma_noTeff = (no_Teff[:, 18] - no_Teff[:, 8])/no_Teff[:, 19]
# bolometric fluxes (note: strange units for model properties)
dFbol_sigma_weak = (FeH_weak[:, 22] - 10**(FeH_weak[:, 7] + 8.0))/FeH_weak[:, 23]
dFbol_sigma_strong = (FeH_strong[:, 22] - 10**(FeH_strong[:, 7] + 8.0))/FeH_strong[:, 23]
dFbol_sigma_noTeff = (no_Teff[:, 22] - 10**(no_Teff[:, 7] + 8.0))/no_Teff[:, 23]
# effective temperatures
dTeff_sigma_weak = (FeH_weak[:, 24] - 10**FeH_weak[:, 6])/FeH_weak[:, 25]
dTeff_sigma_strong = (FeH_strong[:, 24] - 10**FeH_strong[:, 6])/FeH_strong[:, 25]
Now we can compare results from all three trials at once, with relative discrepancies normalized to the observational uncertainties.
In [6]:
from matplotlib.patches import Ellipse
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
# set axis properties
ax.set_xlabel('$\\Delta F_{\\rm bol} / \\sigma$', fontsize=22.)
ax.set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot 68% and 99% confidence intervals
ells = [Ellipse(xy=(0.0, 0.0), width=2.*x, height=2.*x, angle=0.0, lw=3, fill=False,
linestyle='dashed', edgecolor='#333333') for x in [1.0, 3.0]]
for e in ells:
ax.add_artist(e)
# plot results of trials
ax.plot(dFbol_sigma_weak, dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
ax.plot(dFbol_sigma_strong, dTheta_sigma_strong, 's', color='#800000', markersize=9.0)
ax.plot(dFbol_sigma_noTeff, dTheta_sigma_noTeff, '^', color='#444444', markersize=9.0)
Out[6]:
Results are rather consistent, at least in the morphology of the scatter. Beyond the 68% confidence interval region (inner ellipse), there is very little scatter into quadrants III and IV with only a handful of points lying in quadrant I. Most of the scatter resids in quadrant II, which can be interpretted to mean that models tend to infer radii that are too small compared to observations with bolometric fluxes that are slightly too large. While the scatter along the bolometric flux axis is reveals a general consistency with the observations, there is still a systematic offset toward the left side of the zero-point. One may interpret this as the model attempting to produce larger radii while maintaining agreement with the bolometric flux and effective temperature measurements.
Can we discern any correlations of these errors among the parameter present in the data? Let's start with potential correations with the observables themselves.
In [7]:
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Bolometric Flux (erg s$^{-1}$ cm$^{-2}$)', fontsize=22.)
axes[0].set_ylabel('$\\Delta F_{\\rm bol} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.semilogx([1.e-1, 1.e3], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].semilogx(FeH_strong[:, 22], dFbol_sigma_strong, 's', color='#800000', markersize=7.0)
axes[1].semilogx(FeH_strong[:, 22], dTheta_sigma_strong, 's', color='#800000', markersize=7.0)
axes[0].semilogx(FeH_weak[:, 22], dFbol_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].semilogx(FeH_weak[:, 22], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[7]:
Note that this does not include the trials where the effective temperature was neglected in the likelihood estimator. There is good agreement between the two samples, despite changing the metallicity prior.
In [8]:
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Angular Diameter (mas)', fontsize=22.)
axes[0].set_ylabel('$\\Delta F_{\\rm bol} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.semilogx([1.e-1, 1.e1], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].semilogx(FeH_strong[:, 18], dFbol_sigma_strong, 's', color='#800000', markersize=7.0)
axes[1].semilogx(FeH_strong[:, 18], dTheta_sigma_strong, 's', color='#800000', markersize=7.0)
axes[0].semilogx(FeH_weak[:, 18], dFbol_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].semilogx(FeH_weak[:, 18], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[8]:
Computing the mean and median offsets of bolometric flux and angular diameter from the zero-points, we find
In [9]:
# compute mean of the offsets
print 'Weak Metallicity Prior'
print 'Mean Bolometric Flux Error: {:+5.2f} sigma'.format(np.mean(dFbol_sigma_weak))
print 'Mean Angular Diameter Error: {:+5.2f} sigma\n'.format(np.mean(dTheta_sigma_weak))
print 'Strong Metallicity Prior'
print 'Mean Bolometric Flux Error: {:+5.2f} sigma'.format(np.mean(dFbol_sigma_strong))
print 'Mean Angular Diameter Error: {:+5.2f} sigma\n'.format(np.mean(dTheta_sigma_strong))
# compute median of the offsets
print 'Weak Metallicity Prior'
print 'Median Bolometric Flux Error: {:+5.2f} sigma'.format(np.median(dFbol_sigma_weak))
print 'Median Angular Diameter Error: {:+5.2f} sigma\n'.format(np.median(dTheta_sigma_weak))
print 'Strong Metallicity Prior'
print 'Median Bolometric Flux Error: {:+5.2f} sigma'.format(np.median(dFbol_sigma_strong))
print 'Median Angular Diameter Error: {:+5.2f} sigma\n'.format(np.median(dTheta_sigma_strong))
Rephrasing these as fractional relative errors,
In [10]:
# define in terms of (obs - model)/obs
#
# angular diameters
dTheta_weak = (FeH_weak[:, 18] - FeH_weak[:, 8])/FeH_weak[:, 18]
dTheta_strong = (FeH_strong[:, 18] - FeH_strong[:, 8])/FeH_strong[:, 18]
# bolometric fluxes (note: strange units for model properties)
dFbol_weak = (FeH_weak[:, 22] - 10**(FeH_weak[:, 7] + 8.0))/FeH_weak[:, 22]
dFbol_strong = (FeH_strong[:, 22] - 10**(FeH_strong[:, 7] + 8.0))/FeH_strong[:, 22]
# compute mean of the offsets
print 'Weak Metallicity Prior'
print 'Mean Bolometric Flux Error: {:+5.2f}%'.format(np.mean(dFbol_weak)*100.)
print 'Mean Angular Diameter Error: {:+5.2f}%\n'.format(np.mean(dTheta_weak)*100.)
print 'Strong Metallicity Prior'
print 'Mean Bolometric Flux Error: {:+5.2f}%'.format(np.mean(dFbol_strong)*100.)
print 'Mean Angular Diameter Error: {:+5.2f}%\n'.format(np.mean(dTheta_strong)*100.)
# compute median of the offsets
print 'Weak Metallicity Prior'
print 'Median Bolometric Flux Error: {:+5.2f}%'.format(np.median(dFbol_weak)*100.)
print 'Median Angular Diameter Error: {:+5.2f}%\n'.format(np.median(dTheta_weak)*100.)
print 'Strong Metallicity Prior'
print 'Median Bolometric Flux Error: {:+5.2f}%'.format(np.median(dFbol_strong)*100.)
print 'Median Angular Diameter Error: {:+5.2f}%\n'.format(np.median(dTheta_strong)*100.)
Errors on stellar radii (given that we've recovered measured distances) are on the order of 2%, depending on the strength of the metallicity prior. This is consistent with mean offsets observed among low-mass stars in detached double-lined ecilpsing binaries (Feiden & Chaboyer 2012; [Spada et al. 2013). However, it's quite possible that the models are compensating for true modeling errors with older ages and higher metallicities. Let's look at the distribution of ages.
In [11]:
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
# set axis properties
ax.set_xlabel('$\\log_{10}(age / yr)$', fontsize=22.)
ax.set_ylabel('Number', fontsize=22.)
strong_hist = ax.hist(FeH_strong[:, 3], bins=20, facecolor='#800000', alpha=0.6)
weak_hist = ax.hist(FeH_weak[:, 3], bins=20, facecolor='#4682B4', alpha=0.6)
There is, in fact, a significant number of stars that are predicted to have ages comparable to the age of the Universe. While we cannot rule out the possibility that some stars have older ages consistent with the thick disk or halo populations, it is unlikely that such a large fraction of local field stars (nearly 50%) have ages older than 12 Gyr. This, in part, reflects the difficult of acquiring ages for low-mass stars, particuarly those in the M dwarf regime, where the stellar properties are not expected to evolve significantly over the stellar lifetime. A preference for older ages then only reflects that fact that some evolution does occur in the models, leading to almost negligibly larger radii that are only slightly preferred given the systematic offset observed above.
While considering final distributions, here is the final mass distribution from the weak metallicity prior trial.
In [12]:
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
# set axis properties
ax.set_xlabel('Mass ($M_{\\odot}$)', fontsize=22.)
ax.set_ylabel('Number', fontsize=22.)
weak_hist = ax.hist(FeH_weak[:, 0], bins=10, facecolor='#4682B4', alpha=0.6)
In [13]:
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Mass ($M_{\\odot}$)', fontsize=22.)
axes[0].set_ylabel('$\\Delta T_{\\rm eff} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.plot([0.0, 1.0e0], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].plot(FeH_strong[:, 0], dTeff_sigma_strong, 's', color='#800000', markersize=7.0)
axes[1].plot(FeH_strong[:, 0], dTheta_sigma_strong, 's', color='#800000', markersize=7.0)
axes[0].plot(FeH_weak[:, 0], dTeff_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].plot(FeH_weak[:, 0], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[13]:
There do not appear to be trends among model-observation discrepancies as a function of mass. Instead, models tend to systmatically under predict stellar radii and over predict effective temperatures. Below $0.5 M_{\odot}$, there is increased scatter owing to four significant outliers. One of these, Gl 388 (AD Leo), is a candidate young star, so we should perhaps not be surprised by the failure of models to reproduce its properties. It stands out in our above analysis of how well models recovered observed bolometric fluxes. Since the star is young, the models are not able to reproduce its bolometric flux; Gl 388 shows the largest disagreements when it comes to bolometric flux, adding further proof that the system may very well be young.
In [14]:
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Effective Temperature (K)', fontsize=22.)
axes[0].set_ylabel('$\\Delta T_{\\rm eff} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.plot([2500, 6000], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].plot(FeH_strong[:, 24], dTeff_sigma_strong, 's', color='#800000', markersize=7.0)
axes[1].plot(FeH_strong[:, 24], dTheta_sigma_strong, 's', color='#800000', markersize=7.0)
axes[0].plot(FeH_weak[:, 24], dTeff_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].plot(FeH_weak[:, 24], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[14]:
As with stellar mass, there is no immediate correlation between observed effective temperatures and model inferred fundamental properties. However, it is clear that the largest errors appear at the coolest temperatures, with the presence of the same four outliers as above. We also again note that, even for the hottest stars, there appears to be a systematic offset of model errors toward hotter temperatures and smaller radii. Individual errors are not necessarily significant, often located within $1\sigma$ of the zero-point. However, the sysetmatic nature of the errors betrays the existence of possible errors among K dwarfs. A problem thought to be exclusive of M dwarfs appears to extend throughout the K dwarf regime.
In [15]:
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Observed Metallicity (dex)', fontsize=22.)
axes[0].set_ylabel('$\\Delta T_{\\rm eff} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.set_xlim(-0.5, 0.5)
ax.plot([-0.5, 0.5], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].plot(FeH_strong[:, 30], dTeff_sigma_strong, 's', color='#800000', markersize=7.0)
axes[1].plot(FeH_strong[:, 30], dTheta_sigma_strong, 's', color='#800000', markersize=7.0)
axes[0].plot(FeH_weak[:, 30], dTeff_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].plot(FeH_weak[:, 30], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[15]:
In [16]:
# load Mann et al. (2015) data
mann = np.genfromtxt('data/dmestar_model_mcmcTLD_Final_FeH.txt')
# plot comparison
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Effective Temperature (K)', fontsize=22.)
axes[0].set_ylabel('$\\Delta T_{\\rm eff} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.plot([2500, 6000], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].plot(mann[:, 32], (mann[:, 32] - mann[:, 18])/mann[:, 33], 'o', color='#555555', markersize=6.0)
axes[1].plot(mann[:, 32], (mann[:, 36] - mann[:, 24])/mann[:, 37], 'o', color='#555555', markersize=6.0)
axes[0].plot(FeH_weak[:, 24], dTeff_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].plot(FeH_weak[:, 24], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[16]:
In [17]:
fig, axes = plt.subplots(2, 1, figsize=(10, 8))
# set axis properties
axes[1].set_xlabel('Mass ($M_{\\odot}$)', fontsize=22.)
axes[0].set_ylabel('$\\Delta T_{\\rm eff} / \\sigma$', fontsize=22.)
axes[1].set_ylabel('$\\Delta \\Theta / \\sigma$', fontsize=22.)
# plot zero points
for ax in axes:
ax.plot([0.0, 1.0], [0.0, 0.0], '--', lw=2, color='#333333')
# plot relative errors
axes[0].plot(mann[:, 3], (mann[:, 32] - mann[:, 18])/mann[:, 33], 'o', color='#555555', markersize=6.0)
axes[1].plot(mann[:, 3], (mann[:, 36] - mann[:, 24])/mann[:, 37], 'o', color='#555555', markersize=6.0)
axes[0].plot(FeH_weak[:, 0], dTeff_sigma_weak, 'o', color='#4682B4', markersize=9.0)
axes[1].plot(FeH_weak[:, 0], dTheta_sigma_weak, 'o', color='#4682B4', markersize=9.0)
Out[17]:
In [ ]: