Replication of findings from Pitt & Hill (2016): Table 1


In [168]:
# some programmatic housekeeping
import os
import sys
import numpy as np
import pandas as pd
from scipy.stats import poisson
import matplotlib.pyplot as plt
import seaborn as sns

sns.set()
np.random.seed(215)
%matplotlib inline

notebook = "PittHill_table1.ipynb" # replace with FILENAME
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(notebook), '..', 'data')))

In [ ]:
def PoissonTriples_exact(moment, j, k, N):
    """
    This function computes the probability that triples of Poisson random variables
    contain their own rounded mean based on the formula given in Pitt & Hill, 2016.
    
    Parameters
    ----------
    moment : integer
              The mean-variance parameter of the Poisson distribution from which
              triples of Poisson random variables are to be generated.
              
    j : integer
         The...
         
    k : integer
         The...
         
    N : integer
         The...
    
    Returns
    -------
    prob : numeric
            The exact probability that triples of Poisson random variables contain
            their own rounded means.
    """
    for j in list(range(2, N + 1)):
        for k in list(range(j, N + 1)):
            inner = poisson.pmf(k - (j / 2), moment) + ((j % 2) * poisson.pmf(k - (j / 2) - 1, moment))
            outer = poisson.pmf(k, moment) * poisson.pmf(k - j, moment)
            prob = outer * inner
    
    prob = 6 * prob
    return(prob)

In [169]:
def PoissonTriples_empirical(moment, n_times):
    """
    This function computes the probability that triples of Poisson random variables
    contain their own rounded mean.
    
    Parameters
    ----------
    moment : integer
              The mean-variance parameter of the Poisson distribution from which
              triples of random variables are to be generated.
    n_times : integer
               The number of Poisson triples (from a distribution parameterized by
               the moment argument) to use in computing the probability.
               
    Returns
    -------
    prob : numeric
            The empirically computed probability that triples of Poisson random
            variables contain their rounded own means.
    """
    inCounter = 0
    
    for i in range(n_times):
        poisTriple = poisson.rvs(moment, size = 3)
        tripleMean = int(np.round(np.mean(poisTriple), decimals = 0))
        meanCheck = np.sum(np.in1d(poisTriple, tripleMean))
    
        if meanCheck != 0:
            inCounter += 1
    
    prob = inCounter / n_times
    return(prob)

In [170]:
nTriples = 1000 # number of RV triples to generate to compute the empirical probability
poisMoments = range(1, 1000) # each mean-variance parameter of the Poisson distribution
probs = np.zeros(len(poisMoments))

# Computing the inclusion probability of the mean using (nTriples) of Poisson RV triples
# across as many values of the mean-variance parameter as specified above. This loop takes
# a while to run -- on the order of 2-5 minutes...
for i in range(len(poisMoments)):
    probs[i] = PoissonTriples_empirical(poisMoments[i], nTriples)

In [171]:
plt.figure(num = None, figsize=(12, 9), dpi=80)
plt.plot(poisMoments, probs)
plt.xlabel('$\lambda$ (moment) parameter of Poisson distribution')
plt.ylabel('Probability')
plt.title(r'Histogram of Inclusion Probabilities from Poisson Triples')
plt.axis([min(poisMoments), max(poisMoments), 0, min(1.5 * max(probs), 1)])
plt.show()


The plot above illustrates the decrease in the empirical inclusion probabilities for 1000 triples of Poisson random variables containing their own rounded mean, across increasing magnitudes of the Poisson mean-variance (moment) parameter, which is set to range from 1 to 999.


In [172]:
probs_df = pd.DataFrame(data = [poisMoments, probs]).transpose()
probs_df.columns = ["$\lambda$", "Inclusion Probability"]
table1 = probs_df.head(25)
table1


Out[172]:
$\lambda$ Inclusion Probability
0 1.0 0.795
1 2.0 0.697
2 3.0 0.586
3 4.0 0.552
4 5.0 0.531
5 6.0 0.467
6 7.0 0.445
7 8.0 0.423
8 9.0 0.401
9 10.0 0.420
10 11.0 0.369
11 12.0 0.383
12 13.0 0.373
13 14.0 0.314
14 15.0 0.352
15 16.0 0.343
16 17.0 0.336
17 18.0 0.321
18 19.0 0.316
19 20.0 0.286
20 21.0 0.287
21 22.0 0.299
22 23.0 0.277
23 24.0 0.253
24 25.0 0.269

The table above displays the empirical inclusion probability of the mean of 1000 Poisson random variables across differing values of the $\lambda$ parameter. The empirical inclusion probability of the mean is computed by generating 1000 triples of Poisson random variables for a given realization of the $\lambda$ parameter, counting the number of triples that contain their own rounded mean, and dividing this by the number of triples generated (1000). This is performed for realizations of the Poisson $\lambda$ parameter from 1 to 999.