In [1]:
import numpy as np, pandas as pd, matplotlib.pyplot as plt, seaborn as sns
%matplotlib inline
sns.set_style('whitegrid')
sns.set_context('poster')
Confusion matrices for physician-certified verbal autopsy and random-allocation verbal autopsy. Panel A shows the confusion matrix for physician-certified verbal autopsy (with a length-three cause list for clarity). The entry in each cell counts the number of deaths truly due to the row cause that were predicted to be due to the column cause. For example, the value 83 in the “other” row, “stroke” column indicates that 83 deaths truly due to causes other than stroke or diabetes were (incorrectly) attributed to stroke by physicians. This table demonstrates that (for this dataset) physicians are right more often than they are wrong when they predict stroke as the cause of death, but wrong more often than they are right when they predict diabetes. Panel B shows the confusion matrix for Random Allocation on the same dataset, where random chance predicts stroke and diabetes incorrectly for the vast majority of cases. True and PCVA data are from Lozano et al., 2011, in which physicians were presented with VAI data for deaths whose underlying causes were known to meet stringent clinical diagnostic criteria, and their results were compared to the truth.18
In [2]:
df = pd.read_csv('../3-data/pcva_results.csv')
In [3]:
df
Out[3]:
In [4]:
# collapse the gold-standard cause list to three causes
df['gs3'] = 'Other'
df.loc[df.gs_text34=='Stroke', 'gs3'] = 'Stroke'
df.loc[df.gs_text34=='Diabetes', 'gs3'] = 'Diabetes'
In [5]:
df.gs3.value_counts()
Out[5]:
In [6]:
# collapse the physician-assigned ICD-10 codes to the same three causes
df['pc3'] = 'Other'
df.loc[df.gs_assigned_MRR_1=='I64', 'pc3'] = 'Stroke'
df.loc[df.gs_assigned_MRR_1=='E10', 'pc3'] = 'Diabetes'
In [7]:
df.pc3.value_counts()
Out[7]:
In [8]:
# confusion matrix: rows = true cause (gs3), columns = predicted cause (pc3)
cnts = df.groupby(['gs3', 'pc3']).id.count()
cnts = cnts.unstack()
# reorder rows and columns for display
cnts = pd.DataFrame(cnts, columns=['Stroke', 'Diabetes', 'Other'], index=['Stroke', 'Diabetes', 'Other'])
cnts
Out[8]:
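As an aside, the same confusion matrix can be built in a single call with pd.crosstab; the following sketch is equivalent to the groupby/unstack construction above, with reindex fixing the display order:

# equivalent one-liner: cross-tabulate true vs. predicted causes
order = ['Stroke', 'Diabetes', 'Other']
pd.crosstab(df.gs3, df.pc3).reindex(index=order, columns=order)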
In [9]:
# spot-check a few things
assert np.all(cnts.sum(axis=0) - df.pc3.value_counts() == 0)
assert np.all(cnts.sum(axis=1) - df.gs3.value_counts() == 0)
assert cnts.loc['Stroke', 'Stroke'] == ((df.pc3=='Stroke') & (df.gs3=='Stroke')).sum()
assert cnts.loc['Stroke', 'Other'] == ((df.pc3=='Other') & (df.gs3=='Stroke')).sum()
In [10]:
# set seed for reproducibility
np.random.seed(12345)
# random allocation: assign each death one of the three causes uniformly at random
df['ra3'] = np.random.choice(['Stroke', 'Diabetes', 'Other'], size=len(df.index))
In [11]:
# confusion matrix for random allocation: rows = true cause, columns = predicted cause
cnts = df.groupby(['gs3', 'ra3']).id.count()
cnts = cnts.unstack()
cnts = pd.DataFrame(cnts, columns=['Stroke', 'Diabetes', 'Other'], index=['Stroke', 'Diabetes', 'Other'])
cnts
Out[11]:
Confusion matrix for Random-From-Train verbal autopsy. The confusion matrix for Random-From-Train (with a length-three cause list for clarity). As in Table 1, the entry in each cell counts the number of deaths truly due to the row cause that were predicted to be due to the column cause. This table demonstrates that while Random-From-Train is inaccurate at the individual level, at the population level the prediction distribution can closely match the truth.
In [12]:
# random-from-train: sample predicted causes from the empirical cause distribution
df['rft'] = np.random.choice(df.gs3, size=len(df.index))
In [13]:
# confusion matrix for random-from-train
cnts = df.groupby(['gs3', 'rft']).id.count()
cnts = cnts.unstack()
cnts = pd.DataFrame(cnts, columns=['Stroke', 'Diabetes', 'Other'], index=['Stroke', 'Diabetes', 'Other'])
cnts
Out[13]:
In [14]:
# compare true (row) and predicted (column) totals
pd.DataFrame({'row sums': cnts.sum(axis=1), 'col sums': cnts.sum(axis=0)})
Out[14]:
To do this, we performed a Monte Carlo calculation of the CSMF accuracy of Random Allocation, by simulating a dataset with known CSMF distribution, assigning “predicted” causes of death uniformly at random, and measuring the CSMF accuracy of the predictions.
The distribution of the simulated dataset is an important and subtle detail of this calculation. We sampled the true CSMF distribution from an uninformative Dirichlet distribution (which gives equal probability to all possible CSMF distributions). We generated XXX replicates of the Monte Carlo simulation and calculated the mean and standard deviation of the CSMF accuracy.
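For reference, the CSMF accuracy metric used throughout is one minus the total absolute error between the true and predicted cause fractions, normalized by the worst possible error. A minimal single-replicate sketch (the function defined in the next cell computes the same quantity, vectorized over replicates):

import numpy as np

def csmf_accuracy(csmf_true, csmf_pred):
    # CSMF accuracy for a single replicate:
    # 1 - sum_j |true_j - pred_j| / (2 * (1 - min_j true_j)),
    # so a perfect prediction scores 1 and the worst possible scores 0
    csmf_true = np.asarray(csmf_true, dtype=float)
    csmf_pred = np.asarray(csmf_pred, dtype=float)
    err = np.abs(csmf_true - csmf_pred).sum()
    return 1 - err / (2 * (1 - csmf_true.min()))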
In [15]:
def csmf_acc_for_random(J=34, r=500, n=10000, seed=12345):
    """ Use Monte Carlo to approximate the CSMF accuracy
    of randomly assigning deaths to causes
    J : int, number of causes
    r : int, number of replicates in Monte Carlo approx
    n : int, size of test db
    """
    # set random seed for reproducibility
    np.random.seed(seed)

    # generate r true CSMFs from an uninformative dirichlet prior
    csmf_true = np.random.dirichlet(np.ones(J), size=r)

    # generate r predicted CSMFs by allocating n deaths
    # uniformly at random among the J causes
    csmf_rand = np.random.multinomial(n, np.ones(J) / float(J), size=r) / float(n)
    assert np.allclose(csmf_rand.sum(axis=1), 1)  # rows sum to one (modulo machine precision)

    # calculate CSMF accuracy for all replicates
    csmf_acc = 1 - np.sum(np.absolute(csmf_true - csmf_rand), axis=1) \
                   / (2 * (1 - np.min(csmf_true, axis=1)))
    return csmf_acc

acc = csmf_acc_for_random(J=34)
acc.mean(), acc.std()
Out[15]:
We also used this simulation framework to perform a Monte Carlo calculation of the concordance of random allocation, which provides a cross-check for the analytical derivation of CCC in [ref]. Since a uniformly random prediction matches any given true cause with probability 1/J, the expected concordance of random allocation is 1/J. We repeated the simulations for cause lists ranging from 3 to 50 causes.
In [16]:
def concordance_for_random(J=34, r=500, n=10000, seed=12345):
    """ Use Monte Carlo to approximate the concordance
    of randomly assigning deaths to causes
    J : int, number of causes
    r : int, number of replicates in Monte Carlo approx
    n : int, size of test db
    """
    # set random seed for reproducibility
    np.random.seed(seed)

    # generate r replicates of n underlying causes of death
    # (true and predicted), each uniform on {0, ..., J-1}
    cause_true = np.floor(np.random.uniform(0, J, size=(r,n)))
    cause_pred = np.floor(np.random.uniform(0, J, size=(r,n)))

    # calculate per-cause concordance for the r replicates
    c = np.empty((J,r))
    for j in range(J):
        n_j = (cause_true == j).sum(axis=1)
        n_j = np.array(n_j, dtype=float)  # ensure floating point division
        c[j] = ((cause_true == j) & (cause_pred == j)).sum(axis=1) / n_j

    # average over causes to get overall concordance per replicate
    concordance = np.mean(c, axis=0)
    assert concordance.shape == (r,)
    return concordance

c = concordance_for_random(J=3)
c.mean(), c.std()
Out[16]:
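As a quick sanity check (a sketch using the function just defined), the Monte Carlo means can be compared directly against the analytic value 1/J, which appears on the x-axis of Figure 2:

# compare Monte Carlo concordance of random allocation to the analytic value 1/J
for J in [3, 10, 34, 50]:
    print(J, concordance_for_random(J=J).mean(), 1/float(J))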
This simulation setting also provided an opportunity to demonstrate the importance of randomly resampling the cause fractions of the test set from an uninformative Dirichlet distribution (a technical point that perhaps has not been sufficiently appreciated since its introduction in [ref]). To do so, we compared the CCCSMF accuracy of Random Allocation with that of Random-From-Train, where the training data was either uniformly distributed among causes or distributed identically to the test data.
In [17]:
def csmf_acc_for_rft(J=34, r=500, n=10000, seed=12345, uniform_train=False):
    """ Use Monte Carlo to approximate the CSMF accuracy
    of randomly assigning deaths according to the train set distribution
    J : int, number of causes
    r : int, number of replicates in Monte Carlo approx
    n : int, size of test db
    uniform_train : bool, if True, distribute the train set uniformly
        among causes instead of matching the test distribution
    """
    # set random seed for reproducibility
    np.random.seed(seed)

    # generate r true CSMFs from an uninformative dirichlet prior
    csmf_true = np.random.dirichlet(np.ones(J), size=r)
    assert np.allclose(csmf_true.sum(axis=1), 1)  # rows sum to one (modulo machine precision)

    # generate a train set of size n for each replicate
    X_train = np.empty((r,n))
    for i in range(r):
        X_train[i,:] = np.random.choice(range(len(csmf_true[i])), p=csmf_true[i], size=n)

    # re-calculate csmf for the train set
    # (since the empirical distribution differs slightly from the sampled one)
    csmf_true = np.empty((r,J))
    for i in range(r):
        for j in range(J):
            csmf_true[i,j] = (X_train[i] == j).sum() / float(n)
    assert np.allclose(csmf_true.sum(axis=1), 1)  # rows sum to one (modulo machine precision)

    # resample train set to have equal class sizes, _if requested_
    if uniform_train:
        X_train = np.empty((r,n))
        for i in range(r):
            X_train[i,:] = np.random.choice(range(J), p=np.ones(J)/float(J), size=n)

    # generate test set predictions using random-from-train
    X_test = np.empty((r,n))
    for i in range(r):
        X_test[i] = np.random.choice(X_train[i], size=n, replace=True)

    # calculate csmf for the test set predictions
    csmf_rft = np.empty((r,J))
    for i in range(r):
        for j in range(J):
            csmf_rft[i,j] = (X_test[i] == j).sum() / float(n)
    assert np.allclose(csmf_rft.sum(axis=1), 1)  # rows sum to one (modulo machine precision)

    # calculate CSMF accuracy for all replicates
    csmf_acc = 1 - np.sum(np.absolute(csmf_true - csmf_rft), axis=1) \
                   / (2 * (1 - np.min(csmf_true, axis=1)))
    return csmf_acc

acc = csmf_acc_for_rft(J=34)
print(acc.mean(), acc.std())
acc = csmf_acc_for_rft(J=34, uniform_train=True)
print(acc.mean(), acc.std())
In [18]:
import sys
In [19]:
%%time
results = []
for J in range(3,51):
    sys.stdout.write(str(J)+' ')
    results_J = {'J': float(J)}  # store J as float so 1/df.J below is float division
    acc = csmf_acc_for_random(J, r=10000)
    results_J['acc_rand'] = acc.mean()
    results_J['acc_rand_lb'] = np.percentile(acc, 2.5)
    results_J['acc_rand_ub'] = np.percentile(acc, 97.5)
    c = concordance_for_random(J, r=500)
    results_J['conc_rand'] = c.mean()
    results_J['conc_rand_lb'] = np.percentile(c, 2.5)
    results_J['conc_rand_ub'] = np.percentile(c, 97.5)
    acc = csmf_acc_for_rft(J, r=500)
    results_J['acc_rft'] = acc.mean()
    results_J['acc_rft_lb'] = np.percentile(acc, 2.5)
    results_J['acc_rft_ub'] = np.percentile(acc, 97.5)
    acc = csmf_acc_for_rft(J, r=500, uniform_train=True)
    results_J['acc_rft_unif'] = acc.mean()
    results_J['acc_rft_unif_lb'] = np.percentile(acc, 2.5)
    results_J['acc_rft_unif_ub'] = np.percentile(acc, 97.5)
    results.append(results_J)
print()
# collect results into a DataFrame (avoids the deprecated DataFrame.append)
df_sim = pd.DataFrame(results)
In [20]:
df = df_sim
df.head()
Out[20]:
CCCSMF accuracy of Random Allocation and Random-From-Train with and without resampling the test CSMF distribution. This table demonstrates the importance of resampling the CSMF distribution in the test set; if the test and train sets have the same CSMF distribution, then simple approaches like Random-From-Train, as well as state-of-the-art approaches like King-Lu,23 can appear to have better performance than is justified, due to “overfitting”.
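The chance correction referred to here is a linear rescaling under which random allocation scores approximately zero; a minimal sketch of the transformation applied in the next two cells:

def cccsmf_accuracy(csmf_acc, chance=0.632):
    # chance-corrected CSMF accuracy: rescale so that chance-level
    # performance (CSMF accuracy of about 1 - 1/e = 0.632 for random
    # allocation) maps to 0 and perfect performance maps to 1
    return (csmf_acc - chance) / (1 - chance)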
In [21]:
df.index = df.J
# chance-corrected CSMF accuracy at selected cause-list lengths
np.round(((df.filter(['acc_rft', 'acc_rand', 'acc_rft_unif']) - .632) / (1 - .632)).loc[[5, 15, 25, 35, 50]], 3)
Out[21]:
Figure 1. CSMF accuracy of random allocation as a function of CoD list length. The mean CSMF accuracy of random allocation was calculated with 10,000 Monte Carlo replicates for cause-list lengths ranging from 3 to 50. The CSMF accuracy decreases monotonically as a function of J and appears to stay above 1 − 1/e ≈ 0.632, the value we selected for our chance-correction parameter.
In [22]:
df.acc_rand.plot(color='k',marker='o')
plt.ylabel('CSMF Accuracy\nof Random Allocation', rotation=0, ha='right')
plt.xlabel('Number of Causes ($J$)')
plt.axis(xmin=0, xmax=53)
plt.subplots_adjust(left=.3)
Figure 2. Comparison of concordance from Monte Carlo calculation and analytic calculation. The analogous chance-correction value for concordance was calculated analytically in Murray et al.13, and we confirmed its accuracy in our simulation environment. The absolute relative difference was always less than 1%.
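The <1% claim can be checked directly from the simulation results (a sketch, assuming df still holds the df_sim results with index J):

# maximum relative difference between Monte Carlo and analytic (1/J) concordance
rel_diff = np.abs(df.conc_rand - 1/df.J) / (1/df.J)
print(rel_diff.max())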
In [23]:
plt.plot(1/df.J, df.conc_rand, 'ko')
plt.ylabel('Monte Carlo Estimate\nof Concordance\nof Random Allocation', rotation=0, ha='right')
plt.xlabel('Analytically Calculated Concordance\nof Random Allocation')
df.index = df.J
for j in [3, 5, 10, 25, 50]:
    plt.text(1/float(j), df.conc_rand[j], ' $J=%d$' % j, ha='left', va='center', fontsize=24)
plt.subplots_adjust(left=.3)
Figure 3. Comparison of individual-level and population-level prediction quality for three commonly used methods: InterVA, Tariff, and physician-certified verbal autopsy (PCVA). Questions that rely on the deceased having health care experience (HCE) are necessary for population-level PCVA quality to surpass random guessing. Data from Murray et al.12
In [24]:
df = pd.read_excel('../3-data/va_methods_comparison.xlsx')
# chance-correct accuracy
df.accuracy = (df.accuracy - 0.632) / (1 - 0.632)
df.head()
Out[24]:
In [25]:
fig, ax_list = plt.subplots(2, 3, sharex=True, sharey=True, figsize=(15,10))
for i, hce in enumerate(['With HCE', 'Without HCE']):
    for j, module in enumerate(['Neonate', 'Child', 'Adult']):
        t = df[(df.hce==hce)&(df.module==module)&df.model.isin(['InterVA', 'PCVA', 'Tariff'])]
        t = t.dropna(subset=['accuracy', 'CCC'])
        ax = ax_list[i,j]
        ax.axvline(0.64, color='grey', linestyle='--', linewidth=2)
        ax.plot(t.accuracy, t.CCC, 'o', ms=15, color='grey', mec='k', mew=2, alpha=1)
        # label each point, nudging the Tariff label in the first column
        # to the left so it stays inside the axes
        for k in t.index:
            ha = 'left'
            dx = .06
            dy = 0
            if t.model[k] == 'Tariff' and j == 0:
                ha = 'right'
                dx = -.06
            ax.text(t.accuracy[k]+dx, t.CCC[k]+dy, t.model[k], ha=ha, va='center', fontsize=18)
        if i == 0:
            ax.set_title(module)
    ax = ax_list[i,2]
    ax.set_ylabel(hce, labelpad=-320)
ax = ax_list[0,0]
ax.set_ylabel('Individual-Level Quality (CCC)', ha='center', position=(0,0))
ax = ax_list[1,1]
ax.set_xlabel('Population-Level Quality (CCCSMF Accuracy)')
fig.subplots_adjust(wspace=0.1, hspace=0.1)
ax.set_xticks([-.3, 0, .3])
ax.set_yticks([.2, .4, .6, .8])
ax.axis(xmin=-.4, xmax=.6, ymin=.15, ymax=.55)
Out[25]: