Evaluate classification accuracy

This notebook demonstrates how to evaluate classification accuracy of "novel taxa". Due to the unique nature of this analysis, the metrics that we use to evaluate classification accuracy of "novel taxa" are different from those used for mock and simulated communities.

The key measure here is rate of match vs. overclassification, hence P/R/F are not useful metrics. Instead, we define and measure the following as percentages:

Match vs. overclassification rate
- Match: assignment == L - 1 (e.g., a novel species is assigned the correct genus)
- overclassification: assignment == L (e.g., correct genus but assigns to a near neighbor)
- misclassification: incorrect assignment at L - 1 (e.g., wrong genus-level assignment)

Where L = taxonomic level being tested

Functions



In [1]:

    
from tax_credit.framework_functions import novel_taxa_classification_evaluation
from tax_credit.eval_framework import parameter_comparisons
from tax_credit.plotting_functions import (pointplot_from_data_frame,
                                           heatmap_from_data_frame,
                                           per_level_kruskal_wallis,
                                           rank_optimized_method_performance_by_dataset)

import seaborn.xkcd_rgb as colors
from os.path import expandvars, join, exists
from glob import glob
from IPython.display import display, Markdown
import pandas as pd

Evaluate classification results

First, enter in filepaths and directory paths where your data are stored, and the destination



In [7]:

    
project_dir = expandvars("../../")
analysis_name = "novel-taxa-simulations"
precomputed_results_dir = join(project_dir, "data", "precomputed-results", analysis_name)
expected_results_dir = join(project_dir, "data", analysis_name)
summary_fp = join(precomputed_results_dir, 'evaluate_classification_summary.csv')

results_dirs = glob(join(precomputed_results_dir, '*', '*', '*', '*'))

# we can save plots in this directory
outdir = expandvars("../../plots/")

This cell performs the classification evaluation and should not be modified.



In [3]:

    
force = True
if force or not exists(summary_fp):
    accuracy_results = novel_taxa_classification_evaluation(results_dirs, expected_results_dir, summary_fp)
else:
    accuracy_results = pd.DataFrame.from_csv(summary_fp)

Plot classification accuracy

Finally, we plot our results. Line plots show the mean +/- 95% confidence interval for each classification result at each taxonomic level (1 = phylum, 6 = species) in each dataset tested. Do not modify the cell below, except to adjust the color_palette used for plotting. This palette can be a dictionary of colors for each group, as shown below, or a seaborn color palette.

Precision = Proportion of classifications that were correct. For novel taxa, this means a match at the last common ancestor (LCA) (level-1). True Positives / (True Positives + False Positives)

Recall = Proportion of reads that were correctly classified. Equals the number of exact matches to the LCA. True Positives / (True Positives + False Negatives)

F-measure = Harmonic mean of Precision and Recall

overclassification_ratio = proportion of taxa that were assigned to correct lineage but to a deeper taxonomic level than expected, rather than to LCA. E.g., assignment to another species in the clade

underclassification_ratio = proportion of assignments to correct lineage but to a lower level than expected.

misclassification_ratio = proportion of assignments to an incorrect lineage.



In [4]:

    
color_palette={
    'expected': 'black', 'rdp': colors['baby shit green'], 'sortmerna': colors['macaroni and cheese'],
    'uclust': 'coral', 'blast': 'indigo', 'blast+': colors['electric purple'], 'naive-bayes': 'dodgerblue',
    'naive-bayes-bespoke': 'blue', 'vsearch': 'firebrick'
}

y_vars = ["Precision", "Recall", "F-measure",
          "overclassification_ratio", 
          "underclassification_ratio", "misclassification_ratio"]

Plot per-level classification accuracy

For novel-taxa analysis, a separate classification is performed at each taxonomic level using different test (unique taxa at level L) and training sets (ref - test taxonomies). Hence, results at each level L represent independent tests, unlike for mock and simulated communities where each level represents the accuracy of each species-level classification trimmed to level L. For novel taxa, results at level L indicate the accuracy with with method M assigns the correct lineage to a "novel" taxon, which is unrepresented in the reference at level L, e.g., level 6 indicates the performance with which each classifier assigns the correct genus to each species.



In [5]:

    
point = pointplot_from_data_frame(accuracy_results, "level", y_vars, 
                                  group_by="Dataset", color_by="Method",
                                  color_palette=color_palette)



In [8]:

    
for k, v in point.items():
    v.savefig(join(outdir, 'novel-{0}-lineplots.pdf'.format(k)))

Per-level classification accuracy statistic

Kruskal-Wallis FDR-corrected p-values comparing classification methods at each level of taxonomic assignment



In [9]:

    
result = per_level_kruskal_wallis(accuracy_results, y_vars, group_by='Method', 
                                  dataset_col='Dataset', alpha=0.05, 
                                  pval_correction='fdr_bh')
result









    



/home/ben/miniconda3/envs/qiime2-2017.6/lib/python3.5/site-packages/statsmodels/stats/multitest.py:320: RuntimeWarning: invalid value encountered in less_equal
  reject = pvals_sorted <= ecdffactor*alpha






    Out[9]:







  
    
      
      Dataset
      Variable
      1
      2
      3
      4
      5
      6
    
  
  
    
      0
      B1-REF
      Precision
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      1
      B1-REF
      Recall
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      2
      B1-REF
      F-measure
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      3
      B1-REF
      overclassification_ratio
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      4
      B1-REF
      underclassification_ratio
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      5
      B1-REF
      misclassification_ratio
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      6
      F1-REF
      Precision
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      7
      F1-REF
      Recall
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      8
      F1-REF
      F-measure
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      9
      F1-REF
      overclassification_ratio
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      10
      F1-REF
      underclassification_ratio
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      11
      F1-REF
      misclassification_ratio
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN

Heatmaps of method accuracy by parameter

Heatmaps show the performance of individual method/parameter combinations at each taxonomic level, in each reference database (i.e., for bacterial and fungal novel-taxa datasets individually).



In [10]:

    
heatmap_from_data_frame(accuracy_results, metric="Precision", rows=["Method", "Parameters"], cols=["Dataset", "level"])









    












    Out[10]:





<matplotlib.axes._subplots.AxesSubplot at 0x7eff7bd18be0>



In [11]:

    
heatmap_from_data_frame(accuracy_results, metric="Recall", rows=["Method", "Parameters"], cols=["Dataset", "level"])









    












    Out[11]:





<matplotlib.axes._subplots.AxesSubplot at 0x7eff80c68908>



In [12]:

    
heatmap_from_data_frame(accuracy_results, metric="F-measure", rows=["Method", "Parameters"], cols=["Dataset", "level"])









    












    Out[12]:





<matplotlib.axes._subplots.AxesSubplot at 0x7eff80a94f98>



In [13]:

    
heatmap_from_data_frame(accuracy_results, metric="overclassification_ratio", rows=["Method", "Parameters"], cols=["Dataset", "level"])









    












    Out[13]:





<matplotlib.axes._subplots.AxesSubplot at 0x7eff7a006668>



In [14]:

    
heatmap_from_data_frame(accuracy_results, metric="underclassification_ratio", rows=["Method", "Parameters"], cols=["Dataset", "level"])









    












    Out[14]:





<matplotlib.axes._subplots.AxesSubplot at 0x7eff7853eb38>



In [15]:

    
heatmap_from_data_frame(accuracy_results, metric="misclassification_ratio", rows=["Method", "Parameters"], cols=["Dataset", "level"])









    












    Out[15]:





<matplotlib.axes._subplots.AxesSubplot at 0x7eff783326d8>

Rank-based statistics comparing the performance of the optimal parameter setting run for each method on each data set.

Rank parameters for each method to determine the best parameter configuration within each method. Count best values in each column indicate how many samples a given method achieved within one mean absolute deviation of the best result (which is why they may sum to more than the total number of samples).



In [16]:

    
for method in accuracy_results['Method'].unique():
    top_params = parameter_comparisons(accuracy_results[accuracy_results["level"] == 6], 
                                       method, metrics=["Precision", "Recall", 
                                                        "overclassification_ratio", 
                                                        "underclassification_ratio", 
                                                        "misclassification_ratio",
                                                        "F-measure"],
                                       ascending={"Precision": False, "Recall": False, 
                                                        "overclassification_ratio": True, 
                                                        "underclassification_ratio": True, 
                                                        "misclassification_ratio": True,
                                                        "F-measure": False},
                                       sample_col='Dataset', method_col='Method',
                                       dataset_col='Dataset')
    display(Markdown('## {0}'.format(method)))
    display(top_params[:5])









    




blast+






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      0.001:10:0.51:0.8
      20.0
      10.0
      20.0
      0.0
      17.0
      10.0
    
    
      0.001:10:0.75:0.8
      18.0
      14.0
      13.0
      9.0
      18.0
      7.0
    
    
      0.001:10:0.99:0.8
      9.0
      20.0
      0.0
      16.0
      20.0
      0.0
    
    
      0.001:1:0.51:0.97
      0.0
      0.0
      0.0
      14.0
      10.0
      0.0
    
    
      0.001:1:0.99:0.97
      0.0
      0.0
      0.0
      14.0
      10.0
      0.0
    
  








    




naive-bayes






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      0.001:[6,6]:0.94
      20.0
      0.0
      20.0
      15.0
      0.0
      0.0
    
    
      0.001:[8,8]:0.98
      20.0
      0.0
      19.0
      17.0
      0.0
      0.0
    
    
      0.001:[7,7]:0.98
      20.0
      0.0
      20.0
      17.0
      0.0
      0.0
    
    
      0.001:[6,6]:0.96
      20.0
      0.0
      20.0
      17.0
      0.0
      0.0
    
    
      0.001:[6,6]:0.98
      20.0
      0.0
      19.0
      20.0
      2.0
      0.0
    
  








    




vsearch






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      10:0.51:0.8
      20.0
      10.0
      20.0
      0.0
      17.0
      10.0
    
    
      10:0.51:0.9
      10.0
      10.0
      10.0
      1.0
      16.0
      0.0
    
    
      10:0.99:0.8
      10.0
      17.0
      1.0
      12.0
      20.0
      1.0
    
    
      10:0.51:0.97
      0.0
      0.0
      0.0
      20.0
      20.0
      0.0
    
    
      10:0.51:0.99
      0.0
      0.0
      0.0
      20.0
      20.0
      0.0
    
  








    




rdp






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      0.8
      20.0
      0.0
      20.0
      18.0
      7.0
      0.0
    
    
      0.9
      20.0
      9.0
      20.0
      20.0
      16.0
      0.0
    
    
      0.7
      17.0
      0.0
      20.0
      4.0
      0.0
      0.0
    
    
      1.0
      16.0
      17.0
      10.0
      20.0
      20.0
      0.0
    
    
      0.6
      4.0
      0.0
      16.0
      0.0
      0.0
      0.0
    
  








    




sortmerna






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      1.0:0.9:5:0.9:1.0
      20.0
      19.0
      7.0
      19.0
      18.0
      0.0
    
    
      1.0:0.8:3:0.8:1.0
      20.0
      13.0
      20.0
      10.0
      13.0
      0.0
    
    
      1.0:0.8:5:0.8:1.0
      20.0
      20.0
      18.0
      10.0
      15.0
      0.0
    
    
      0.76:0.9:5:0.9:1.0
      20.0
      18.0
      10.0
      19.0
      12.0
      0.0
    
    
      0.76:0.9:5:0.8:1.0
      20.0
      18.0
      10.0
      19.0
      12.0
      0.0
    
  








    




blast






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      0.001
      20
      20
      20
      3
      5
      6
    
    
      1
      20
      20
      20
      3
      5
      6
    
    
      1000
      20
      20
      20
      3
      5
      6
    
    
      1e-10
      20
      20
      20
      4
      7
      2
    
  








    




uclust






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      1.0:0.8:3
      20.0
      12.0
      20.0
      16.0
      13.0
      0.0
    
    
      0.51:0.8:5
      20.0
      10.0
      20.0
      0.0
      11.0
      8.0
    
    
      0.76:0.8:3
      20.0
      12.0
      20.0
      16.0
      13.0
      0.0
    
    
      0.76:0.8:5
      20.0
      12.0
      20.0
      10.0
      13.0
      0.0
    
    
      0.76:0.9:5
      20.0
      14.0
      10.0
      20.0
      12.0
      0.0
    
  








    




naive-bayes-bespoke






    







  
    
      
      F-measure
      Precision
      Recall
      misclassification_ratio
      overclassification_ratio
      underclassification_ratio
    
  
  
    
      0.001::[6,6]:0.98
      20.0
      0.0
      18.0
      19.0
      0.0
      0.0
    
    
      0.001::[6,6]:0.94
      20.0
      0.0
      19.0
      14.0
      0.0
      0.0
    
    
      0.001::[6,6]:0.96
      20.0
      0.0
      19.0
      16.0
      0.0
      0.0
    
    
      0.001::[7,7]:0.98
      20.0
      0.0
      19.0
      17.0
      0.0
      0.0
    
    
      0.001::[6,6]:0.92
      19.0
      0.0
      20.0
      8.0
      0.0
      3.0

Rank performance of optimized methods

Now we rank the top-performing method/parameter combination for each method at species level. Methods are ranked by top F-measure, and the average value for each metric is shown (rather than count best as above). F-measure distributions are plotted for each method, and compared using paired t-tests with FDR-corrected P-values. This cell does not need to be altered, unless if you wish to change the metric used for sorting best methods and for plotting.



In [17]:

    
boxes = rank_optimized_method_performance_by_dataset(accuracy_results, dataset="Dataset",
                                                     metric="F-measure",
                                                     level="level",
                                                     level_range=range(6,7),
                                                     display_fields=["Method",
                                                                     "Parameters",
                                                                     "Precision",
                                                                     "Recall",
                                                                     "F-measure",
                                                                     "overclassification_ratio",
                                                                     "underclassification_ratio",
                                                                     "misclassification_ratio"],
                                                     paired=True,
                                                     parametric=True,
                                                     color=None,
                                                     color_palette=color_palette)









    




B1-REF level 6






    







  
    
      
      Method
      Parameters
      Precision
      Recall
      F-measure
      overclassification_ratio
      underclassification_ratio
      misclassification_ratio
    
  
  
    
      6
      uclust
      0.76:0.8:5
      0.346457
      0.168196
      0.226206
      0.160282
      0.512495
      0.159026
    
    
      7
      vsearch
      10:0.51:0.8
      0.332746
      0.171471
      0.225973
      0.101851
      0.482356
      0.244321
    
    
      1
      blast+
      0.001:10:0.51:0.8
      0.331765
      0.170688
      0.225072
      0.099453
      0.483138
      0.246721
    
    
      5
      sortmerna
      0.76:0.8:5:0.8:1.0
      0.306492
      0.150548
      0.201736
      0.176592
      0.506561
      0.166299
    
    
      2
      naive-bayes
      0.001:[6,6]:0.94
      0.276945
      0.117913
      0.165328
      0.158884
      0.573784
      0.149419
    
    
      3
      naive-bayes-bespoke
      0.001::[6,6]:0.9
      0.242715
      0.125204
      0.165107
      0.203411
      0.483745
      0.187640
    
    
      4
      rdp
      0.9
      0.289102
      0.103589
      0.152392
      0.121990
      0.640886
      0.133535
    
    
      0
      blast
      0.001
      0.000000
      0.000000
      0.000000
      0.401618
      0.016360
      0.582022
    
  








    







  
    
      
      
      stat
      P
      FDR P
    
    
      Method A
      Method B
      
      
      
    
  
  
    
      blast+
      naive-bayes
      9.157050
      7.409731e-06
      1.220426e-05
    
    
      vsearch
      -0.373432
      7.174717e-01
      8.035683e-01
    
    
      rdp
      10.541452
      2.303171e-06
      5.862616e-06
    
    
      sortmerna
      5.177653
      5.811739e-04
      7.396758e-04
    
    
      blast
      50.671187
      2.279314e-12
      1.595520e-11
    
    
      uclust
      -0.210259
      8.381482e-01
      9.026212e-01
    
    
      naive-bayes-bespoke
      10.286754
      2.826549e-06
      6.595281e-06
    
    
      naive-bayes
      vsearch
      -9.774469
      4.325013e-06
      8.073358e-06
    
    
      rdp
      3.591854
      5.821895e-03
      7.087524e-03
    
    
      sortmerna
      -6.841872
      7.541793e-05
      1.055851e-04
    
    
      blast
      56.453517
      8.642542e-13
      8.545258e-12
    
    
      uclust
      -9.987936
      3.614368e-06
      7.228736e-06
    
    
      naive-bayes-bespoke
      0.080303
      9.377534e-01
      9.656222e-01
    
    
      vsearch
      rdp
      10.682818
      2.059514e-06
      5.766638e-06
    
    
      sortmerna
      5.660557
      3.094161e-04
      4.125548e-04
    
    
      blast
      60.776861
      4.455565e-13
      8.545258e-12
    
    
      uclust
      -0.044313
      9.656222e-01
      9.656222e-01
    
    
      naive-bayes-bespoke
      10.181918
      3.079015e-06
      6.631724e-06
    
    
      rdp
      sortmerna
      -8.023815
      2.161518e-05
      3.362361e-05
    
    
      blast
      35.711508
      5.237240e-11
      2.094896e-10
    
    
      uclust
      -11.822569
      8.740744e-07
      3.059261e-06
    
    
      naive-bayes-bespoke
      -3.409434
      7.756568e-03
      9.049329e-03
    
    
      sortmerna
      blast
      56.091988
      9.155633e-13
      8.545258e-12
    
    
      uclust
      -9.162465
      7.373867e-06
      1.220426e-05
    
    
      naive-bayes-bespoke
      7.002351
      6.308346e-05
      9.296510e-05
    
    
      blast
      uclust
      -48.031956
      3.682823e-12
      2.062381e-11
    
    
      naive-bayes-bespoke
      -46.496631
      4.928292e-12
      2.299870e-11
    
    
      uclust
      naive-bayes-bespoke
      11.553916
      1.062524e-06
      3.305629e-06
    
  








    












    




F1-REF level 6






    







  
    
      
      Method
      Parameters
      Precision
      Recall
      F-measure
      overclassification_ratio
      underclassification_ratio
      misclassification_ratio
    
  
  
    
      6
      uclust
      0.76:0.8:5
      0.761485
      0.492482
      0.598056
      0.094481
      0.353130
      0.059907
    
    
      7
      vsearch
      10:0.51:0.8
      0.697564
      0.469367
      0.561082
      0.126237
      0.326975
      0.077421
    
    
      1
      blast+
      0.001:10:0.51:0.8
      0.690091
      0.461693
      0.553177
      0.095593
      0.331007
      0.111707
    
    
      5
      sortmerna
      0.76:0.8:5:0.9:1.0
      0.637763
      0.458432
      0.533388
      0.158935
      0.281008
      0.101625
    
    
      4
      rdp
      0.9
      0.617385
      0.379322
      0.469883
      0.193591
      0.385601
      0.041486
    
    
      2
      naive-bayes
      0.001:[6,6]:0.98
      0.519976
      0.361091
      0.426155
      0.275799
      0.305106
      0.058005
    
    
      3
      naive-bayes-bespoke
      0.001::[6,6]:0.98
      0.511613
      0.357635
      0.420927
      0.282702
      0.300502
      0.059161
    
    
      0
      blast
      0.001
      0.000000
      0.000000
      0.000000
      0.616293
      0.227025
      0.156682
    
  








    







  
    
      
      
      stat
      P
      FDR P
    
    
      Method A
      Method B
      
      
      
    
  
  
    
      blast+
      naive-bayes
      18.772981
      1.585850e-08
      2.913244e-08
    
    
      vsearch
      -1.661258
      1.310287e-01
      1.310287e-01
    
    
      rdp
      14.325652
      1.680985e-07
      2.241313e-07
    
    
      sortmerna
      4.824997
      9.404200e-04
      1.012760e-03
    
    
      blast
      86.602654
      1.849095e-14
      7.396382e-14
    
    
      uclust
      -14.951978
      1.159958e-07
      1.623941e-07
    
    
      naive-bayes-bespoke
      18.669679
      1.664711e-08
      2.913244e-08
    
    
      naive-bayes
      vsearch
      -27.091117
      6.162722e-10
      1.660726e-09
    
    
      rdp
      -7.095229
      5.696537e-05
      6.380122e-05
    
    
      sortmerna
      -23.185040
      2.457519e-09
      4.915037e-09
    
    
      blast
      119.401962
      1.029548e-15
      9.732375e-15
    
    
      uclust
      -26.918133
      6.524282e-10
      1.660726e-09
    
    
      naive-bayes-bespoke
      3.830653
      4.023696e-03
      4.172722e-03
    
    
      vsearch
      rdp
      18.539728
      1.770153e-08
      2.915546e-08
    
    
      sortmerna
      7.249331
      4.819741e-05
      5.623032e-05
    
    
      blast
      112.649990
      1.737924e-15
      9.732375e-15
    
    
      uclust
      -7.354822
      4.305133e-05
      5.241031e-05
    
    
      naive-bayes-bespoke
      24.446490
      1.535902e-09
      3.583770e-09
    
    
      rdp
      sortmerna
      -14.970077
      1.147836e-07
      1.623941e-07
    
    
      blast
      121.403333
      8.865654e-16
      9.732375e-15
    
    
      uclust
      -29.598787
      2.800567e-10
      9.801983e-10
    
    
      naive-bayes-bespoke
      7.686589
      3.041681e-05
      3.871231e-05
    
    
      sortmerna
      blast
      117.676919
      1.173532e-15
      9.732375e-15
    
    
      uclust
      -17.433748
      3.037369e-08
      4.724796e-08
    
    
      naive-bayes-bespoke
      24.119252
      1.731081e-09
      3.728482e-09
    
    
      blast
      uclust
      -113.819445
      1.583758e-15
      9.732375e-15
    
    
      naive-bayes-bespoke
      -102.536284
      4.049976e-15
      1.889989e-14
    
    
      uclust
      naive-bayes-bespoke
      27.652978
      5.133382e-10
      1.597052e-09



In [18]:

    
for k, v in boxes.items():
    v.get_figure().savefig(join(outdir, 'novel-{0}-boxplots.pdf'.format(k)))



In [ ]:

	Dataset	Variable	1	2	3	4	5	6
0	B1-REF	Precision	NaN	NaN	NaN	NaN	NaN	NaN
1	B1-REF	Recall	NaN	NaN	NaN	NaN	NaN	NaN
2	B1-REF	F-measure	NaN	NaN	NaN	NaN	NaN	NaN
3	B1-REF	overclassification_ratio	NaN	NaN	NaN	NaN	NaN	NaN
4	B1-REF	underclassification_ratio	NaN	NaN	NaN	NaN	NaN	NaN
5	B1-REF	misclassification_ratio	NaN	NaN	NaN	NaN	NaN	NaN
6	F1-REF	Precision	NaN	NaN	NaN	NaN	NaN	NaN
7	F1-REF	Recall	NaN	NaN	NaN	NaN	NaN	NaN
8	F1-REF	F-measure	NaN	NaN	NaN	NaN	NaN	NaN
9	F1-REF	overclassification_ratio	NaN	NaN	NaN	NaN	NaN	NaN
10	F1-REF	underclassification_ratio	NaN	NaN	NaN	NaN	NaN	NaN
11	F1-REF	misclassification_ratio	NaN	NaN	NaN	NaN	NaN	NaN

	F-measure	Precision	Recall	misclassification_ratio	overclassification_ratio	underclassification_ratio
0.001:10:0.51:0.8	20.0	10.0	20.0	0.0	17.0	10.0
0.001:10:0.75:0.8	18.0	14.0	13.0	9.0	18.0	7.0
0.001:10:0.99:0.8	9.0	20.0	0.0	16.0	20.0	0.0
0.001:1:0.51:0.97	0.0	0.0	0.0	14.0	10.0	0.0
0.001:1:0.99:0.97	0.0	0.0	0.0	14.0	10.0	0.0

	F-measure	Recall	misclassification_ratio	overclassification_ratio
0.001:[6,6]:0.94	20.0	20.0	15.0	0.0
0.001:[8,8]:0.98	20.0	19.0	17.0	0.0
0.001:[7,7]:0.98	20.0	20.0	17.0	0.0
0.001:[6,6]:0.96	20.0	20.0	17.0	0.0
0.001:[6,6]:0.98	20.0	19.0	20.0	2.0

	F-measure	Precision	Recall	misclassification_ratio	overclassification_ratio	underclassification_ratio
10:0.51:0.8	20.0	10.0	20.0	0.0	17.0	10.0
10:0.51:0.9	10.0	10.0	10.0	1.0	16.0	0.0
10:0.99:0.8	10.0	17.0	1.0	12.0	20.0	1.0
10:0.51:0.97	0.0	0.0	0.0	20.0	20.0	0.0
10:0.51:0.99	0.0	0.0	0.0	20.0	20.0	0.0

	F-measure	Precision	Recall	misclassification_ratio	overclassification_ratio
1.0:0.9:5:0.9:1.0	20.0	19.0	7.0	19.0	18.0
1.0:0.8:3:0.8:1.0	20.0	13.0	20.0	10.0	13.0
1.0:0.8:5:0.8:1.0	20.0	20.0	18.0	10.0	15.0
0.76:0.9:5:0.9:1.0	20.0	18.0	10.0	19.0	12.0
0.76:0.9:5:0.8:1.0	20.0	18.0	10.0	19.0	12.0

	F-measure	Precision	Recall	misclassification_ratio	overclassification_ratio	underclassification_ratio
1.0:0.8:3	20.0	12.0	20.0	16.0	13.0	0.0
0.51:0.8:5	20.0	10.0	20.0	0.0	11.0	8.0
0.76:0.8:3	20.0	12.0	20.0	16.0	13.0	0.0
0.76:0.8:5	20.0	12.0	20.0	10.0	13.0	0.0
0.76:0.9:5	20.0	14.0	10.0	20.0	12.0	0.0

	F-measure	Recall	misclassification_ratio	underclassification_ratio
0.001::[6,6]:0.98	20.0	18.0	19.0	0.0
0.001::[6,6]:0.94	20.0	19.0	14.0	0.0
0.001::[6,6]:0.96	20.0	19.0	16.0	0.0
0.001::[7,7]:0.98	20.0	19.0	17.0	0.0
0.001::[6,6]:0.92	19.0	20.0	8.0	3.0

	Method	Parameters	Precision	Recall	F-measure	overclassification_ratio	underclassification_ratio	misclassification_ratio
6	uclust	0.76:0.8:5	0.346457	0.168196	0.226206	0.160282	0.512495	0.159026
7	vsearch	10:0.51:0.8	0.332746	0.171471	0.225973	0.101851	0.482356	0.244321
1	blast+	0.001:10:0.51:0.8	0.331765	0.170688	0.225072	0.099453	0.483138	0.246721
5	sortmerna	0.76:0.8:5:0.8:1.0	0.306492	0.150548	0.201736	0.176592	0.506561	0.166299
2	naive-bayes	0.001:[6,6]:0.94	0.276945	0.117913	0.165328	0.158884	0.573784	0.149419
3	naive-bayes-bespoke	0.001::[6,6]:0.9	0.242715	0.125204	0.165107	0.203411	0.483745	0.187640
4	rdp	0.9	0.289102	0.103589	0.152392	0.121990	0.640886	0.133535
0	blast	0.001	0.000000	0.000000	0.000000	0.401618	0.016360	0.582022

		stat	P	FDR P
Method A	Method B
blast+	naive-bayes	9.157050	7.409731e-06	1.220426e-05
	vsearch	-0.373432	7.174717e-01	8.035683e-01
	rdp	10.541452	2.303171e-06	5.862616e-06
	sortmerna	5.177653	5.811739e-04	7.396758e-04
	blast	50.671187	2.279314e-12	1.595520e-11
	uclust	-0.210259	8.381482e-01	9.026212e-01
	naive-bayes-bespoke	10.286754	2.826549e-06	6.595281e-06
naive-bayes	vsearch	-9.774469	4.325013e-06	8.073358e-06
	rdp	3.591854	5.821895e-03	7.087524e-03
	sortmerna	-6.841872	7.541793e-05	1.055851e-04
	blast	56.453517	8.642542e-13	8.545258e-12
	uclust	-9.987936	3.614368e-06	7.228736e-06
	naive-bayes-bespoke	0.080303	9.377534e-01	9.656222e-01
vsearch	rdp	10.682818	2.059514e-06	5.766638e-06
	sortmerna	5.660557	3.094161e-04	4.125548e-04
	blast	60.776861	4.455565e-13	8.545258e-12
	uclust	-0.044313	9.656222e-01	9.656222e-01
	naive-bayes-bespoke	10.181918	3.079015e-06	6.631724e-06
rdp	sortmerna	-8.023815	2.161518e-05	3.362361e-05
	blast	35.711508	5.237240e-11	2.094896e-10
	uclust	-11.822569	8.740744e-07	3.059261e-06
	naive-bayes-bespoke	-3.409434	7.756568e-03	9.049329e-03
sortmerna	blast	56.091988	9.155633e-13	8.545258e-12
	uclust	-9.162465	7.373867e-06	1.220426e-05
	naive-bayes-bespoke	7.002351	6.308346e-05	9.296510e-05
blast	uclust	-48.031956	3.682823e-12	2.062381e-11
blast	naive-bayes-bespoke	-46.496631	4.928292e-12	2.299870e-11
uclust	naive-bayes-bespoke	11.553916	1.062524e-06	3.305629e-06

	Method	Parameters	Precision	Recall	F-measure	overclassification_ratio	underclassification_ratio	misclassification_ratio
6	uclust	0.76:0.8:5	0.761485	0.492482	0.598056	0.094481	0.353130	0.059907
7	vsearch	10:0.51:0.8	0.697564	0.469367	0.561082	0.126237	0.326975	0.077421
1	blast+	0.001:10:0.51:0.8	0.690091	0.461693	0.553177	0.095593	0.331007	0.111707
5	sortmerna	0.76:0.8:5:0.9:1.0	0.637763	0.458432	0.533388	0.158935	0.281008	0.101625
4	rdp	0.9	0.617385	0.379322	0.469883	0.193591	0.385601	0.041486
2	naive-bayes	0.001:[6,6]:0.98	0.519976	0.361091	0.426155	0.275799	0.305106	0.058005
3	naive-bayes-bespoke	0.001::[6,6]:0.98	0.511613	0.357635	0.420927	0.282702	0.300502	0.059161
0	blast	0.001	0.000000	0.000000	0.000000	0.616293	0.227025	0.156682