In [1]:
import datetime
print(datetime.datetime.now().isoformat())


2016-09-06T17:34:40.378692

Is there a cutoff in Dst that allows slot filling?

Method

Using a statistical model of the relationship between the slot filling at 464keV vs Dst.

We model the number of slot filling events as a random sample from a binomial distribution meaning that there is a probability of the event occuring and there may be a parameter that changes the probability. This is not meant as a correlation but as a success failure creiteria.

From Wikipedia: In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.

$ change \sim Bin(n,p) \\ logit(p) = \alpha + \beta x \\ a \sim N(0,5) \\ \beta \sim N(0,10) $

where we set vague priors for $\alpha$ and $\beta$, the parameters for the logistic model.

This is the same technique used in the estimation of deaths due to a concentration of a chemical.


In [2]:
# http://onlinelibrary.wiley.com/doi/10.1002/2016JA022652/epdf


import pymc
from pprint import pprint
import numpy as np
import matplotlib.pyplot as plt
import spacepy.plot as spp


This unreleased version of SpacePy is not supported by the SpacePy team.
/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/matplotlib/__init__.py:872: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))

Data delivered by Geoff Reeves 9/6/2016


In [3]:
# min_Dst, min_L
data = np.asarray([
65.000, 3.8000,  
50.000, 3.7000,
67.000, 3.5000,
61.000, 3.4000, 
77.000, 3.2000,
99.000, 2.8900,  
87.000, 2.8000,  
98.000, 2.8000, 
96.000, 2.8000,
93.000, 2.3000,
92.000, 2.3000, 
225.00, 2.3000, 
206.00, 2.3000,
125.00, 2.3000]).reshape((-1,2))

dst = data[:,0]
minL = data[:,1]
print(dst, minL, data.dtype)


[  65.   50.   67.   61.   77.   99.   87.   98.   96.   93.   92.  225.
  206.  125.] [ 3.8   3.7   3.5   3.4   3.2   2.89  2.8   2.8   2.8   2.3   2.3   2.3
  2.3   2.3 ] float64

In [4]:
# make bins in Dst
dst_bins = np.arange(25, 300, 25)
print(dst_bins)
dst_bins_centers = np.asarray([dst_bins[:-1] + np.diff(dst_bins)/2]).T[:,0]
print(dst_bins_centers, dst_bins_centers.shape)
n_events_dig = np.digitize(dst, dst_bins)
print(n_events_dig)
n_events = np.zeros(len(dst_bins)-1)
success = np.zeros_like(n_events)
for i, v in enumerate(np.unique(n_events_dig)):
    n_events[v-1] = np.sum(n_events_dig==v)
    success[v-1] = np.sum(minL[n_events_dig==v] <= 2.4)
print(n_events)
print(success)


[ 25  50  75 100 125 150 175 200 225 250 275]
[  37.5   62.5   87.5  112.5  137.5  162.5  187.5  212.5  237.5  262.5] (10,)
[2 2 2 2 3 3 3 3 3 3 3 9 8 5]
[ 0.  4.  7.  0.  1.  0.  0.  1.  1.  0.]
[ 0.  0.  2.  0.  1.  0.  0.  1.  1.  0.]

In [5]:
plt.plot(dst, minL, 'o')
plt.xlim((25, 250))
plt.ylim((2.2, 4.0))
plt.xlabel('Min Dst')
plt.ylabel('Min L Shell')


Out[5]:
<matplotlib.text.Text at 0x115d82d68>

In [6]:
plt.plot(dst, minL, 'o')
plt.xlim((25, 250))
plt.ylim((2.2, 4.0))
plt.xlabel('Min Dst')
plt.ylabel('Min L Shell')
for v in dst_bins:
    plt.axvline(v, lw=0.5)
plt.axhline(2.4, c='r', lw=1)


Out[6]:
<matplotlib.lines.Line2D at 0x1192edeb8>

Oberved Data:

  • n_pts : the number of points in each of the 25nT wide Dst bins
  • successes : the number of events in each bin where the slot was filled (Min L = 2.3)
  • dst_bins_centers : the centers of the Dst bins

In [7]:
# define invlogit function
def invlogit(x):
  return pymc.exp(x) / (1 + pymc.exp(x))

Setup the Bayesian model accorind to the description above. Run the Markov chain monte carlo (MCMC) to sample the posterior distributions for $\alpha$ and $\beta$


In [8]:
# define priors
# these are wide uninformative priors
# alpha = pymc.Normal('alpha', mu=0, tau=1.0/5**2)
# beta = pymc.Normal('beta', mu=0, tau=1.0/10**2)
alpha = pymc.Uniform('alpha', -500, 100, value = 1e-2)
beta = pymc.Uniform('beta', -100, 100, value = 1e-2)


# cannot feed in zero events
ind = n_events > 0
print(n_events.shape, dst_bins_centers.shape, success.shape, )

# define likelihood
p = pymc.InvLogit('p', alpha + beta*dst_bins_centers[ind])
print('n_events', n_events[ind], 'p', p.value, 'success', success[ind], )
y = pymc.Binomial('y_obs', n=n_events[ind], p=p, value=success[ind], observed=True)


(10,) (10,) (10,)
n_events [ 4.  7.  1.  1.  1.] p [ 0.65362233  0.70785727  0.79979282  0.89425874  0.9156763 ] success [ 0.  2.  1.  1.  1.]

In [9]:
# inference
m = pymc.Model([alpha, beta, y])
mc = pymc.MCMC(m)
mc.sample(iter=500000, burn=1000, burn_till_tuned=True, thin=300)


 [-----------------102%------------------] 519038 of 504000 complete in 61.0 sec
/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/base.py:282: UserWarning: 
Error tallying Metropolis_beta_adaptive_scale_factor, will not try to tally it again this chain.
Did you make all the samevariables and step methods tallyable
as were tallyable last time you used the database file?

Error:

Traceback (most recent call last):
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/ram.py", line 97, in tally
    self._trace[chain][self._index[chain]] = value.copy()
AttributeError: 'float' object has no attribute 'copy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/base.py", line 272, in tally
    self._traces[name].tally(chain)
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/ram.py", line 99, in tally
    self._trace[chain][self._index[chain]] = value
IndexError: index 1663 is out of bounds for axis 0 with size 1663

  %s""" % (name, ''.join(traceback.format_exception(cls, inst, tb))))
/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/base.py:282: UserWarning: 
Error tallying beta, will not try to tally it again this chain.
Did you make all the samevariables and step methods tallyable
as were tallyable last time you used the database file?

Error:

Traceback (most recent call last):
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/base.py", line 272, in tally
    self._traces[name].tally(chain)
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/ram.py", line 97, in tally
    self._trace[chain][self._index[chain]] = value.copy()
IndexError: index 1663 is out of bounds for axis 0 with size 1663

  %s""" % (name, ''.join(traceback.format_exception(cls, inst, tb))))
/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/base.py:282: UserWarning: 
Error tallying Metropolis_alpha_adaptive_scale_factor, will not try to tally it again this chain.
Did you make all the samevariables and step methods tallyable
as were tallyable last time you used the database file?

Error:

Traceback (most recent call last):
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/ram.py", line 97, in tally
    self._trace[chain][self._index[chain]] = value.copy()
AttributeError: 'float' object has no attribute 'copy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/base.py", line 272, in tally
    self._traces[name].tally(chain)
  File "/Users/balarsen/miniconda3/envs/python3/lib/python3.5/site-packages/pymc/database/ram.py", line 99, in tally
    self._trace[chain][self._index[chain]] = value
IndexError: index 1663 is out of bounds for axis 0 with size 1663

  %s""" % (name, ''.join(traceback.format_exception(cls, inst, tb))))

Make diagnostic plots of the posteriour distributions as created using MCMC.


In [10]:
pymc.Matplot.plot(mc)


Plotting beta
Plotting alpha

In [11]:
pprint(mc.stats())


{'alpha': {'95% HPD interval': array([-301.25971684,  -19.21738056]),
           'mc error': 8.4062612623799797,
           'mean': -137.34146724447487,
           'n': 1663,
           'quantiles': {2.5: -302.04041700719802,
                         25: -194.51995197850934,
                         50: -117.19761697054732,
                         75: -71.946885728797639,
                         97.5: -19.463116643488519},
           'standard deviation': 84.107147757893117},
 'beta': {'95% HPD interval': array([ 0.19840535,  3.42479563]),
          'mc error': 0.096088418135145293,
          'mean': 1.5571875895218161,
          'n': 1663,
          'quantiles': {2.5: 0.21013046319166212,
                        25: 0.81163990684335996,
                        50: 1.3272919301315278,
                        75: 2.2197106799675064,
                        97.5: 3.4439594678644863},
          'standard deviation': 0.96147239105070614}}

In [12]:
xp = np.linspace(dst_bins_centers[ind].min(), dst_bins_centers[ind].max(), 100)
a = alpha.stats()['mean']
b = beta.stats()['mean']
y_val = invlogit(a + b*xp).value
plt.plot(xp, y_val)
plt.scatter(dst_bins_centers[ind], success[ind]/n_events[ind], s=50);
plt.xlabel('Minimum Dst')
plt.ylabel('Probability slot is filled')
plt.gca().ticklabel_format(useOffset=False)


Predictions based on this model


In [13]:
# get the minimum Dst where 99% should be successes
ind99 = y_val >= 0.99
minDst99 = xp[ind99][0]
print('At a minimum Dst of {0:0.0f}nT it is predicted to have a 99% percent of a slot filling'.format(minDst99))


At a minimum Dst of 93nT it is predicted to have a 99% percent of a slot filling

Plot up many lines for a feel at uncertantity


In [14]:
# one should be able to get estimates of the line uncertainity
ilu = np.empty((1000, len(xp)), dtype=float)
for ii, v in enumerate(np.random.random_integers(0, len(alpha.trace[:])-1, 1000)):
  ilu[ii] = invlogit(alpha.trace[:][v] + beta.trace[:][v]*xp).value

In [15]:
xp = np.linspace(dst_bins_centers[ind].min(), dst_bins_centers[ind].max(), 100)
for v in ilu:
  plt.plot(xp, v, alpha=.01, c='r')


a = alpha.stats()['mean']
b = beta.stats()['mean']
plt.plot(xp, invlogit(a + b*xp).value)

a = alpha.stats()['quantiles'][50]
b = beta.stats()['quantiles'][50]
plt.plot(xp, invlogit(a + b*xp).value, c='y')


plt.scatter(dst_bins_centers[ind], success[ind]/n_events[ind], s=50);
plt.xlabel('Minimum Dst')
plt.ylabel('Probability slot is filled')


Out[15]:
<matplotlib.text.Text at 0x11b442d30>

Same as previous figure with the red lines overlayed as 100 joint draws from the model posterior in order to show spread.


In [ ]:


In [ ]:


In [ ]:


In [ ]: