Evaluate mock community classification accuracy for `nb-extra`

The purpose of this notebook is to evaluate taxonomic classification accuracy of mock communities using different classification methods.

Contains an additional section that uses CART analysis.

Prepare the environment

First we'll import various functions that we'll need for generating the report.



In [1]:

    
%matplotlib inline
from os.path import join, exists, expandvars
import pandas as pd
from IPython.display import display, Markdown
import seaborn.xkcd_rgb as colors
from tax_credit.plotting_functions import (pointplot_from_data_frame,
                                           boxplot_from_data_frame,
                                           heatmap_from_data_frame,
                                           per_level_kruskal_wallis,
                                           beta_diversity_pcoa,
                                           average_distance_boxplots,
                                           rank_optimized_method_performance_by_dataset)
from tax_credit.eval_framework import (evaluate_results,
                                       method_by_dataset_a1,
                                       parameter_comparisons,
                                       merge_expected_and_observed_tables,
                                       filter_df)

Configure local environment-specific values

This is the only cell that you will need to edit to generate basic reports locally. After editing this cell, you can run all cells in this notebook to generate your analysis report. This will take a few minutes to run, as results are computed at multiple taxonomic levels.

To generate the default results contained within tax-credit, only project_dir may need to be changed in this cell. To analyze results separately from the tax-credit precomputed results, the other variables in this cell will also need to be set.



In [2]:

    
## project_dir should be the directory where you've downloaded (or cloned) the 
## tax-credit repository. 
project_dir = join('..', '..')

## expected_results_dir contains expected composition data in the structure
## expected_results_dir/<dataset name>/<reference name>/expected/
expected_results_dir = join(project_dir, "data/precomputed-results/", "mock-community")

## mock_results_fp designates the files to which summary results are written.
## If this file exists, it can be read in to generate results plots, instead
## of computing new scores.
mock_results_fp = join(expected_results_dir, 'broad_sweep_results.tsv')

## results_dirs should contain the directory or directories where
## results can be found. By default, this is the same location as expected 
## results included with the project. If other results should be included, 
## absolute paths to those directories should be added to this list.
results_dirs = [expected_results_dir]

## directory containing mock community data, e.g., feature table without taxonomy
mock_dir = join(project_dir, "data", "mock-community")

## Minimum number of times an OTU must be observed for it to be included in analyses. Edit this
## to analyze the effect of the minimum count on taxonomic results.
min_count = 1

## Define the range of taxonomic levels over which to compute accuracy scores.
## The default given below will compute class (level 2) through species (level 6)
taxonomy_level_range = range(2,7)
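As a rough illustration of what a taxonomic level means here, the sketch below truncates a semicolon-delimited lineage to a given level. It assumes zero-based levels with kingdom at level 0, which is consistent with this notebook treating species as level 6 and genus as level 5. The helper `truncate_taxonomy` and the example lineage are hypothetical and are not part of tax-credit.

```python
def truncate_taxonomy(lineage, level, sep=';'):
    """Keep ranks 0..level of a delimited lineage (assumes kingdom is level 0)."""
    ranks = [rank.strip() for rank in lineage.split(sep)]
    return sep.join(ranks[:level + 1])


# Hypothetical Greengenes-style lineage, used only for illustration.
lineage = ('k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; '
           'f__Lactobacillaceae; g__Lactobacillus; s__brevis')

# Print the lineage as it would look at each level in range(2, 7).
for level in range(2, 7):
    print(level, truncate_taxonomy(lineage, level))
```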



In [3]:

    
dataset_ids = ['mock-' + str(m) for m in (3, 12, 18, 22, 24, '26-ITS1', '26-ITS9')]

Find mock community pre-computed tables, expected tables, and "query" tables

Next we'll use the paths defined above to find all of the tables that will be compared. These include the pre-computed result tables (i.e., the ones that the new methods will be compared to), the expected result tables (i.e., the tables containing the known composition of the mock microbial communities), and the query result tables (i.e., the tables generated with the new method(s) that we want to compare to the pre-computed result tables).

Note: if you have additional methods to add, set append=True. If you are attempting to recompute pre-computed results, set force=True.

This cell will take a few minutes to run if new results are being added, so hold onto your hat. If you are attempting to re-compute everything, it may take an hour or so, so go take a nap.



In [4]:

    
mock_results = evaluate_results(results_dirs, 
                                expected_results_dir, 
                                mock_results_fp, 
                                mock_dir,
                                taxonomy_level_range=taxonomy_level_range, 
                                min_count=min_count,
                                taxa_to_keep=None, 
                                md_key='taxonomy', 
                                subsample=False,
                                per_seq_precision=True,
                                exclude=['other'],
                                method_ids=['nb-extra'],
                                append=False,
                                force=False)









    



../../data/precomputed-results/mock-community/mock_results.tsv already exists.
Reading in pre-computed evaluation results.
To overwrite, set force=True
Results have been filtered to only include datasets or reference databases or methods or parameters that are explicitly set by results params. To disable this function and load all results, set dataset_ids and reference_ids and method_ids and parameter_ids to None.



In [5]:

    
mock_results['Reference'].unique()









    Out[5]:





array(['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read',
       'unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full',
       'unite_20.11.2016_clean_read'], dtype=object)

Restrict analyses to a set of datasets or references, e.g., to exclude taxonomy assignments made for the purpose of reference database comparisons. This can be performed as shown below. Alternatively, specific reference databases, datasets, methods, or parameters can be chosen by setting dataset_ids, reference_ids, method_ids, and parameter_ids in the evaluate_results command above.



In [6]:

    
#mock_results = filter_df(mock_results, column_name='Reference',
#                         values=['gg_13_8_otus_amplicon', 'gg_13_8_otus_read', 'gg_13_8_otus_full'],
#                         exclude=False)
mock_results = mock_results.reset_index(drop=True)
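The filtering step can be pictured with a small stand-in written in plain Python. This is only a sketch that mirrors the argument names used in the commented-out call above (`column_name`, `values`, `exclude`); it is an assumption about the behavior, not the implementation of `filter_df`, and `filter_rows` is a hypothetical name.

```python
def filter_rows(rows, column_name, values, exclude=False):
    """Keep rows whose column value is in `values` (or not in it when exclude=True)."""
    wanted = set(values)
    return [row for row in rows if (row[column_name] in wanted) != exclude]


# Two illustrative rows using reference names that appear in this notebook.
rows = [
    {'Reference': 'gg_13_8_otus_full', 'Method': 'nb-extra'},
    {'Reference': 'unite_20.11.2016_clean_full', 'Method': 'nb-extra'},
]
gg_only = filter_rows(rows, 'Reference', ['gg_13_8_otus_full'])
non_gg = filter_rows(rows, 'Reference', ['gg_13_8_otus_full'], exclude=True)
print(len(gg_only), len(non_gg))
```

With a pandas DataFrame, the same selection is commonly written as `df[df['Reference'].isin(values)]`.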

Compute and summarize precision, recall, and F-measure for mock communities

In this evaluation, we compute and summarize precision, recall, and F-measure of each result (pre-computed and query) based on the known composition of the mock communities. We then summarize the results in two ways: first with boxplots, and second with a table of the top methods based on their F-measures. Higher scores indicate better accuracy.
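For readers unfamiliar with these scores, the sketch below computes precision, recall, and F-measure from an expected and an observed set of taxa. It is a simplified presence/absence version under our own assumptions; the scores computed by `evaluate_results` may be weighted by sequence abundance, so treat `precision_recall_f` and the taxon names as hypothetical.

```python
def precision_recall_f(expected_taxa, observed_taxa):
    """Presence/absence precision, recall and F-measure for two taxon sets."""
    expected, observed = set(expected_taxa), set(observed_taxa)
    true_pos = len(expected & observed)    # expected taxa that were observed
    false_pos = len(observed - expected)   # observed taxa that were not expected
    false_neg = len(expected - observed)   # expected taxa that were missed
    precision = true_pos / (true_pos + false_pos) if observed else 0.0
    recall = true_pos / (true_pos + false_neg) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical mock community: three of four expected taxa are recovered,
# plus one taxon that should not be there.
expected = {'taxon_a', 'taxon_b', 'taxon_c', 'taxon_d'}
observed = {'taxon_a', 'taxon_b', 'taxon_c', 'taxon_e'}
print(precision_recall_f(expected, observed))
```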

As a first step, we will evaluate average method performance at each taxonomic level for each method within each reference dataset type.

Note that, as parameter configurations can cause results to vary widely, average results are not a good representation of the "best" results. See here for results using optimized parameters for each method.

First we will define our color palette and the variables we want to plot. Via seaborn, we can apply the xkcd crowdsourced color names. If that still doesn't match your hue, use hex codes.



In [7]:

    
color_pallette={
    'nb-extra': 'black'
}

y_vars = ["Precision", "Recall", "F-measure", "Taxon Accuracy Rate", "Taxon Detection Rate"]



In [12]:

    
pointplot_from_data_frame?



In [15]:

    
pointplot_from_data_frame(mock_results, "Level", y_vars,
                          "Reference", "Method", color_pallette)

    Out[15]:





{'F-measure': <seaborn.axisgrid.FacetGrid at 0x11cb2a0f0>,
 'Precision': <seaborn.axisgrid.FacetGrid at 0x10eabd470>,
 'Recall': <seaborn.axisgrid.FacetGrid at 0x1207b8a90>,
 'Taxon Accuracy Rate': <seaborn.axisgrid.FacetGrid at 0x11ac44550>,
 'Taxon Detection Rate': <seaborn.axisgrid.FacetGrid at 0x11b6e1a90>}

CART Analysis

In this section we will use Classification and Regression Trees (CART) to try to pick good parameters for the naïve Bayes classifier. In each case, we pick the path to the leaf that yields the highest expected F-measure. Also, we unconventionally turn off pruning, so that all parameters are specified. This has the effect of picking arbitrary parameters towards the leaves of the decision tree, where it doesn't matter as much which parameters we choose.

This section requires the additional dependencies of rpy2 in Python and rpart in R. If you do not wish to install those dependencies, skip the CART Analysis section.
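The `method="anova"` fits used in this section grow a regression tree by repeatedly choosing the split that most reduces the within-node sum of squared errors of the response (here, F-measure). The sketch below shows that criterion for a single numeric predictor in plain Python. It is not rpart's implementation, the function names are hypothetical, and the F-measure values are made up for illustration.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    centre = sum(values) / len(values)
    return sum((v - centre) ** 2 for v in values)


def best_numeric_split(xs, ys):
    """Return (threshold, reduction) for the split `x < threshold` that most
    reduces the sum of squared errors of ys; threshold is None if no split helps."""
    parent = sum_squared_error(ys)
    best = (None, 0.0)
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between adjacent observed values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        reduction = parent - sum_squared_error(left) - sum_squared_error(right)
        if reduction > best[1]:
            best = (threshold, reduction)
    return best


# Invented F-measures for three feature counts swept in this notebook.
n_features = [1024, 1024, 8192, 8192, 65536, 65536]
f_measure = [0.90, 0.92, 0.96, 0.97, 0.96, 0.98]
threshold, reduction = best_numeric_split(n_features, f_measure)
print(threshold, reduction)
```

Because candidate thresholds are midpoints between adjacent observed values, a sweep over 1024, 8192, and 65536 features yields cut points of 4608 and 36864, which matches thresholds such as `N.Features< 4608` in the output of this section.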



In [16]:

    
mock_results['Reference'].unique()









    Out[16]:





array(['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read',
       'unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full',
       'unite_20.11.2016_clean_read'], dtype=object)



In [17]:

    
from itertools import product

from pandas import DataFrame, concat, to_numeric
from numpy import mean
import rpy2



In [18]:

    
%load_ext rpy2.ipython



In [19]:

    
%R require(rpart)









    



/Users/benkaehler/miniconda3/envs/qiime2-2017.5/lib/python3.5/site-packages/rpy2/rinterface/__init__.py:185: RRuntimeWarning: Loading required package: rpart

  warnings.warn(x, RRuntimeWarning)






    Out[19]:





array([1], dtype=int32)

Split the Parameter String and Aggregate by Community



In [20]:

    
columns = ['Alpha', 'Class-Prior', 'N-Features', 'Ngram-Range', 'Norm', 'Use-IDF', 'Confidence']
params = DataFrame((s.split(':') for s in mock_results['Parameters']), columns=columns)
keepers = ['Dataset', 'Level', 'Reference']
raw_param_results = concat([mock_results[keepers + ['F-measure']], params], axis=1)
raw_param_results = raw_param_results.apply(to_numeric, errors='ignore')
param_results = raw_param_results.groupby(keepers + columns, as_index=False).mean()
len(param_results)









    Out[20]:





223090
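The parameter strings in the results table pack seven colon-separated settings into one field. A minimal sketch of the split, using the same column names as the cell above and a parameter string taken from the results in this notebook (`parse_parameters` is a hypothetical helper):

```python
COLUMNS = ['Alpha', 'Class-Prior', 'N-Features', 'Ngram-Range',
           'Norm', 'Use-IDF', 'Confidence']


def parse_parameters(parameter_string):
    """Split a colon-delimited parameter string into a dict of named fields."""
    fields = parameter_string.split(':')
    if len(fields) != len(COLUMNS):
        raise ValueError('expected {0} fields, got {1}'.format(
            len(COLUMNS), len(fields)))
    return dict(zip(COLUMNS, fields))


parsed = parse_parameters('0.001:prior:65536:[8,8]:l2:True:0.4')
for name in COLUMNS:
    print(name, parsed[name])
```

The values remain strings at this point, which is why the cell above passes the combined frame through `to_numeric`.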



In [21]:

    
%%R
recommend_params <- function(data, prior, levels, references)
{
    data = data[data[,"Reference"] %in% references,]
    data = data[data[,"Class.Prior"] == prior,]
    data = data[data[,"Level"] %in% levels,]
    fit <- rpart(F.measure ~ Confidence + Use.IDF + Ngram.Range + N.Features + Alpha + Reference + Norm, 
                 data=data,
                 method="anova",
                 control=rpart.control(cp=0))
    rightmost_leaf <- fit$frame[fit$frame[,"yval"] == max(fit$frame[,"yval"]),]
    path.rpart(fit, as.numeric(rownames(rightmost_leaf)))
}



In [22]:

    
priors = ['uniform', 'prior']
reference_sets = [
    ['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read'],
    ['unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full',
    'unite_20.11.2016_clean_read']
]
level_sets = [[2,3,4,5], [6]]
for prior, levels, references in product(priors, level_sets, reference_sets):
    display(Markdown("Prior: `" + prior + '`'))
    display(Markdown("References: `"  + str(references) + '`'))
    display(Markdown("Levels: `" + str(levels) + '`'))
    %R -i param_results,prior,levels,references recommend_params(param_results, prior, levels, references)









    




Prior: uniform

References: ['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read']

Levels: [2, 3, 4, 5]







    





 node number: 31743 
   root
   Norm=l2,None
   Ngram.Range=[16,16],[8,8]
   Use.IDF=False
   N.Features< 4608
   Reference=gg_13_8_otus_amplicon,gg_13_8_otus_read
   Ngram.Range=[8,8]
   Norm=l2
   Alpha>=0.0055
   Confidence< 0.7
   Alpha< 0.055
   Confidence< 0.5
   Reference=gg_13_8_otus_read
   Confidence>=0.1
   Confidence>=0.3







    




Prior: uniform

References: ['unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full', 'unite_20.11.2016_clean_read']

Levels: [2, 3, 4, 5]







    





 node number: 30671 
   root
   Norm=l2,None
   Reference=unite_20.11.2016_clean_amplicon,unite_20.11.2016_clean_full
   Norm=l2
   Ngram.Range=[16,16],[4,16],[8,8]
   Alpha< 0.055
   N.Features>=4608
   Reference=unite_20.11.2016_clean_full
   Ngram.Range=[16,16],[8,8]
   Confidence>=0.5
   Alpha>=0.0055
   N.Features>=3.686e+04
   Ngram.Range=[16,16]
   Confidence>=0.7
   Use.IDF=True







    




Prior: uniform

References: ['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read']

Levels: [6]







    





 node number: 1919 
   root
   Norm=l2,None
   Ngram.Range=[16,16],[8,8]
   Reference=gg_13_8_otus_full
   N.Features>=4608
   Use.IDF=False
   Alpha< 0.055
   N.Features< 3.686e+04
   Norm=None
   Ngram.Range=[8,8]
   Confidence>=0.5







    




Prior: uniform

References: ['unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full', 'unite_20.11.2016_clean_read']

Levels: [6]







    





 node number: 2047 
   root
   Norm=l2,None
   Reference=unite_20.11.2016_clean_full
   Ngram.Range=[16,16],[8,8]
   Alpha< 0.055
   N.Features>=4608
   Alpha< 0.0055
   N.Features< 3.686e+04
   Norm=l2
   Ngram.Range=[8,8]
   Confidence>=0.3







    




Prior: prior

References: ['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read']

Levels: [2, 3, 4, 5]







    





 node number: 7675 
   root
   Norm=l2,None
   Ngram.Range=[16,16],[4,4],[8,8]
   Reference=gg_13_8_otus_full
   Ngram.Range=[16,16],[8,8]
   Norm=l2
   N.Features>=4608
   Alpha< 0.055
   N.Features< 3.686e+04
   Ngram.Range=[8,8]
   Confidence>=0.7
   Alpha>=0.0055
   Use.IDF=False







    




Prior: prior

References: ['unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full', 'unite_20.11.2016_clean_read']

Levels: [2, 3, 4, 5]







    





 node number: 3999 
   root
   Norm=l2,None
   Reference=unite_20.11.2016_clean_full
   Ngram.Range=[16,16],[8,8]
   N.Features>=4608
   Norm=l2
   Confidence>=0.7
   Alpha< 0.055
   Ngram.Range=[8,8]
   Alpha>=0.0055
   N.Features< 3.686e+04
   Use.IDF=False







    




Prior: prior

References: ['gg_13_8_otus_amplicon', 'gg_13_8_otus_full', 'gg_13_8_otus_read']

Levels: [6]







    





 node number: 7935 
   root
   Norm=l2,None
   Norm=l2
   Ngram.Range=[16,16],[4,16],[8,8]
   Reference=gg_13_8_otus_full
   N.Features>=4608
   Ngram.Range=[16,16],[8,8]
   Confidence< 0.5
   N.Features< 3.686e+04
   Alpha< 0.055
   Alpha< 0.0055
   Ngram.Range=[8,8]
   Use.IDF=False







    




Prior: prior

References: ['unite_20.11.2016_clean_amplicon', 'unite_20.11.2016_clean_full', 'unite_20.11.2016_clean_read']

Levels: [6]







    





 node number: 1007 
   root
   Reference=unite_20.11.2016_clean_full
   Norm=l2,None
   Ngram.Range=[16,16],[8,8]
   N.Features>=4608
   Alpha>=0.055
   Use.IDF=True
   Confidence>=0.5
   Norm=l2
   Ngram.Range=[16,16]

Kruskal-Wallis between-method accuracy comparisons

Kruskal-Wallis FDR-corrected p-values comparing classification methods at each level of taxonomic assignment
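The call below passes `pval_correction='fdr_bh'`, which we assume denotes the Benjamini-Hochberg false discovery rate procedure (the name statsmodels uses for it). As a sketch of what that correction does to a list of raw p-values, here is a plain-Python version; `benjamini_hochberg` is a hypothetical helper and the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]
corrected = benjamini_hochberg(raw)
print(corrected)
```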



In [23]:

    
result = per_level_kruskal_wallis(mock_results, y_vars, group_by='Method', 
                                  dataset_col='Reference', level_name='Level',
                                  levelrange=range(2,7), alpha=0.05, 
                                  pval_correction='fdr_bh')
result









    Out[23]:







  
    
      
    Reference                        Variable              2    3    4    5    6
0   gg_13_8_otus_amplicon            Precision             1.0  1.0  1.0  1.0  1.0
1   gg_13_8_otus_amplicon            Recall                1.0  1.0  1.0  1.0  1.0
2   gg_13_8_otus_amplicon            F-measure             1.0  1.0  1.0  1.0  1.0
3   gg_13_8_otus_amplicon            Taxon Accuracy Rate   1.0  1.0  1.0  1.0  1.0
4   gg_13_8_otus_amplicon            Taxon Detection Rate  1.0  1.0  1.0  1.0  1.0
5   gg_13_8_otus_full                Precision             1.0  1.0  1.0  1.0  1.0
6   gg_13_8_otus_full                Recall                1.0  1.0  1.0  1.0  1.0
7   gg_13_8_otus_full                F-measure             1.0  1.0  1.0  1.0  1.0
8   gg_13_8_otus_full                Taxon Accuracy Rate   1.0  1.0  1.0  1.0  1.0
9   gg_13_8_otus_full                Taxon Detection Rate  1.0  1.0  1.0  1.0  1.0
10  gg_13_8_otus_read                Precision             1.0  1.0  1.0  1.0  1.0
11  gg_13_8_otus_read                Recall                1.0  1.0  1.0  1.0  1.0
12  gg_13_8_otus_read                F-measure             1.0  1.0  1.0  1.0  1.0
13  gg_13_8_otus_read                Taxon Accuracy Rate   1.0  1.0  1.0  1.0  1.0
14  gg_13_8_otus_read                Taxon Detection Rate  1.0  1.0  1.0  1.0  1.0
15  unite_20.11.2016_clean_amplicon  Precision             1.0  1.0  1.0  1.0  1.0
16  unite_20.11.2016_clean_amplicon  Recall                1.0  1.0  1.0  1.0  1.0
17  unite_20.11.2016_clean_amplicon  F-measure             1.0  1.0  1.0  1.0  1.0
18  unite_20.11.2016_clean_amplicon  Taxon Accuracy Rate   1.0  1.0  1.0  1.0  1.0
19  unite_20.11.2016_clean_amplicon  Taxon Detection Rate  1.0  1.0  1.0  1.0  1.0
20  unite_20.11.2016_clean_full      Precision             1.0  1.0  1.0  1.0  1.0
21  unite_20.11.2016_clean_full      Recall                1.0  1.0  1.0  1.0  1.0
22  unite_20.11.2016_clean_full      F-measure             1.0  1.0  1.0  1.0  1.0
23  unite_20.11.2016_clean_full      Taxon Accuracy Rate   1.0  1.0  1.0  1.0  1.0
24  unite_20.11.2016_clean_full      Taxon Detection Rate  1.0  1.0  1.0  1.0  1.0
25  unite_20.11.2016_clean_read      Precision             1.0  1.0  1.0  1.0  1.0
26  unite_20.11.2016_clean_read      Recall                1.0  1.0  1.0  1.0  1.0
27  unite_20.11.2016_clean_read      F-measure             1.0  1.0  1.0  1.0  1.0
28  unite_20.11.2016_clean_read      Taxon Accuracy Rate   1.0  1.0  1.0  1.0  1.0
29  unite_20.11.2016_clean_read      Taxon Detection Rate  1.0  1.0  1.0  1.0  1.0

Violin plots of per-level accuracy

The plots below show the distribution of each accuracy metric across parameter configurations, grouped by method.

Now we will focus on results at species level (level 6); for genus level, change the filter to level 5.



In [24]:

    
mock_results_6 = mock_results[mock_results['Level'] == 6]



In [30]:

    
boxplot_from_data_frame?



In [31]:

    
boxplot_from_data_frame(mock_results_6, group_by="Method", metric="Precision", color_palette=color_pallette)









    












    Out[31]:





<matplotlib.axes._subplots.AxesSubplot at 0x12908c2e8>



In [32]:

    
boxplot_from_data_frame(mock_results_6, group_by="Method", metric="Recall", color_palette=color_pallette)









    












    Out[32]:





<matplotlib.axes._subplots.AxesSubplot at 0x11f65edd8>



In [33]:

    
boxplot_from_data_frame(mock_results_6, group_by="Method", metric="F-measure", color_palette=color_pallette)









    












    Out[33]:





<matplotlib.axes._subplots.AxesSubplot at 0x11b812f28>



In [34]:

    
boxplot_from_data_frame(mock_results_6, group_by="Method", metric="Taxon Accuracy Rate", color_palette=color_pallette)









    












    Out[34]:





<matplotlib.axes._subplots.AxesSubplot at 0x133b1f828>



In [35]:

    
boxplot_from_data_frame(mock_results_6, group_by="Method", metric="Taxon Detection Rate", color_palette=color_pallette)









    












    Out[35]:





<matplotlib.axes._subplots.AxesSubplot at 0x1291db2e8>

Method Optimization

Which method/parameter configuration performed "best" for a given score? We can rank the top-performing configuration by dataset, method, and taxonomic level.

First, the top-performing method/configuration combination by dataset.
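Conceptually, this step keeps the highest-scoring row for each dataset. The sketch below shows that idea on a list of dictionaries; it is an assumption about what such a ranking involves, not the implementation of `method_by_dataset_a1`, and the rows and configuration labels are invented for illustration.

```python
def best_per_dataset(rows, metric='F-measure'):
    """Return {dataset: row} holding the row with the highest metric per dataset."""
    best = {}
    for row in rows:
        current = best.get(row['Dataset'])
        if current is None or row[metric] > current[metric]:
            best[row['Dataset']] = row
    return best


# Invented scores for two hypothetical parameter configurations.
rows = [
    {'Dataset': 'mock-3', 'Parameters': 'config-a', 'F-measure': 0.91},
    {'Dataset': 'mock-3', 'Parameters': 'config-b', 'F-measure': 0.97},
    {'Dataset': 'mock-12', 'Parameters': 'config-a', 'F-measure': 0.88},
    {'Dataset': 'mock-12', 'Parameters': 'config-b', 'F-measure': 0.84},
]
best = best_per_dataset(rows)
for dataset, row in best.items():
    print(dataset, row['Parameters'], row['F-measure'])
```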



In [36]:

    
for i in range(1, 27):
    display(Markdown('## mock-{0}'.format(i)))
    best = method_by_dataset_a1(mock_results_6, 'mock-{0}'.format(i))
    display(best)









    




mock-1

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-2

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-3

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate
1815209  nb-extra  0.001:prior:65536:[8,8]:l2:True:0.4    1.0        1.0       1.0        1.0                  0.7

mock-4

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-5

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-6

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-7

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-8

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-9

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-10

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-11

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-12

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate
5489     nb-extra  0.01:uniform:1024:[16,16]:l2:True:0.4  0.995096   0.954609  0.974432   0.295455             0.619048

mock-13

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-14

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-15

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-16

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-17

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-18

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate
39219    nb-extra  0.1:prior:1024:[8,8]:l2:True:0.6       0.99998    1.0       0.99999    0.8125               0.866667

mock-19

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-20

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-21

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-22

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate
74449    nb-extra  0.001:prior:65536:[16,16]:l2:True:0.8  0.999744   0.885041  0.938902   0.62963              0.894737

mock-23

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-24

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate
103099   nb-extra  0.1:prior:8192:[4,16]:l2:True:0.8      1.0        0.889646  0.941601   0.5                  0.125

mock-25

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

mock-26

         Method    Parameters                             Precision  Recall    F-measure  Taxon Accuracy Rate  Taxon Detection Rate

Now we can determine which parameter configuration performed best for each method. The "count best" values in each column indicate how many samples a given parameter configuration scored within one mean absolute deviation of the best result (which is why they may sum to more than the total number of samples).
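One way to read the "count best" columns is sketched below: for every sample, find the best score across configurations, compute the mean absolute deviation of that sample's scores, and credit each configuration that falls within one deviation of the best. This is our interpretation of the description above, not the code of `parameter_comparisons`; the function name, configuration labels, and scores are hypothetical.

```python
from statistics import mean


def count_best(scores_by_config):
    """scores_by_config maps config -> list of scores, one per sample, in the
    same sample order. Returns config -> number of samples where the config is
    within one mean absolute deviation of that sample's best score."""
    configs = list(scores_by_config)
    n_samples = len(scores_by_config[configs[0]])
    counts = dict.fromkeys(configs, 0)
    for sample in range(n_samples):
        sample_scores = [scores_by_config[c][sample] for c in configs]
        best = max(sample_scores)
        centre = mean(sample_scores)
        deviation = mean(abs(score - centre) for score in sample_scores)
        for config in configs:
            if scores_by_config[config][sample] >= best - deviation:
                counts[config] += 1
    return counts


# Invented scores for three configurations on two samples.
scores = {
    'A': [0.90, 0.50],
    'B': [0.88, 0.90],
    'C': [0.50, 0.90],
}
counts = count_best(scores)
print(counts)
```

In this example the counts sum to 4 even though there are only 2 samples, because more than one configuration can be credited for the same sample.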



In [37]:

    
for method in mock_results_6['Method'].unique():
    top_params = parameter_comparisons(
        mock_results_6, method, 
        metrics=['Precision', 'Recall', 'F-measure',
                 'Taxon Accuracy Rate', 'Taxon Detection Rate'])
    display(Markdown('## {0}'.format(method)))
    display(top_params[:10])









    




nb-extra






    







  
    
      
                                       F-measure  Precision  Recall  Taxon Accuracy Rate  Taxon Detection Rate
0.01:prior:1024:[8,8]:l2:False:0.0     139.0      125.0      145.0   145.0                74.0
0.01:prior:65536:[8,8]:l2:True:0.2     140.0      126.0      146.0   51.0                 74.0
0.01:prior:65536:[8,8]:l2:True:0.0     140.0      126.0      146.0   51.0                 74.0
0.1:prior:1024:[8,8]:l2:True:0.0       139.0      125.0      145.0   159.0                73.0
0.1:prior:8192:[8,8]:l2:True:0.0       140.0      126.0      146.0   130.0                73.0
0.1:prior:8192:[8,8]:l2:True:0.2       140.0      126.0      146.0   127.0                73.0
0.01:prior:1024:[16,16]:l2:False:0.0   135.0      121.0      141.0   143.0                72.0
0.01:prior:8192:[8,8]:l2:False:0.0     140.0      126.0      146.0   86.0                 72.0
0.1:prior:1024:[16,16]:l2:True:0.2     135.0      121.0      141.0   147.0                72.0
0.1:prior:1024:[16,16]:l2:True:0.0     135.0      121.0      141.0   149.0                72.0



In [38]:

    
uniform_6 = mock_results_6[['uniform' in p for p in mock_results_6['Parameters']]]
for method in uniform_6['Method'].unique():
    top_params = parameter_comparisons(
        uniform_6, method, 
        metrics=['Precision', 'Recall', 'F-measure',
                 'Taxon Accuracy Rate', 'Taxon Detection Rate'])
    display(Markdown('## {0}'.format(method)))
    display(top_params[:10])









    




nb-extra






    







  
    
      
                                       F-measure  Precision  Recall  Taxon Accuracy Rate  Taxon Detection Rate
0.001:uniform:65536:[8,8]:l2:True:0.4  65.0       60.0       72.0    73.0                 74.0
0.001:uniform:8192:[8,8]:l2:False:0.2  63.0       58.0       71.0    72.0                 74.0
0.1:uniform:65536:[8,8]:None:True:0.8  63.0       56.0       73.0    70.0                 74.0
0.1:uniform:65536:[8,8]:None:True:0.6  63.0       56.0       73.0    70.0                 74.0
0.1:uniform:65536:[8,8]:None:True:0.4  63.0       56.0       73.0    70.0                 74.0
0.1:uniform:65536:[8,8]:None:True:0.2  63.0       56.0       73.0    70.0                 74.0
0.1:uniform:65536:[8,8]:None:True:0.0  63.0       56.0       73.0    70.0                 74.0
0.01:uniform:8192:[8,8]:l2:True:0.0    63.0       57.0       69.0    70.0                 74.0
0.001:uniform:8192:[8,8]:l2:False:0.0  63.0       58.0       71.0    71.0                 74.0
0.001:uniform:65536:[8,8]:l2:True:0.0  63.0       58.0       72.0    70.0                 74.0

Optimized method performance

And, finally, which method performed best at each individual taxonomic level for each reference dataset (i.e., across all fungal and bacterial mock communities combined)?

For this analysis, we rank the top-performing method/parameter combination for each method at family through species levels. Methods are ranked by top F-measure, and the average value for each metric is shown (rather than count best as above). F-measure distributions are plotted for each method, and compared using paired t-tests with FDR-corrected P-values. This cell does not need to be altered unless you wish to change the metric used for sorting best methods and for plotting.
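The paired t-test mentioned above compares two methods on the same set of samples by testing whether the mean of their per-sample differences is zero. A minimal sketch of the test statistic follows; the F-measure values are invented, `paired_t_statistic` is a hypothetical helper, and converting the statistic to a P-value (for example with `scipy.stats.ttest_rel`) is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(scores_a) != len(scores_b):
        raise ValueError('paired samples must have the same length')
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    spread = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError('all paired differences are identical')
    return mean(diffs) / (spread / sqrt(len(diffs)))


# Invented per-sample F-measures for two hypothetical methods.
method_a = [0.9, 0.8, 0.7]
method_b = [0.8, 0.6, 0.4]
t_stat = paired_t_statistic(method_a, method_b)
print(t_stat)
```

With a single method in the results, as in this notebook, there is no second method to pair against, which would explain why the comparison tables in the output are empty.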



In [39]:

    
rank_optimized_method_performance_by_dataset(mock_results,
                                             dataset="Reference",
                                             metric="F-measure",
                                             level_range=range(4,7),
                                             display_fields=["Method",
                                                             "Parameters",
                                                             "Taxon Accuracy Rate",
                                                             "Taxon Detection Rate",
                                                             "Precision",
                                                             "Recall",
                                                             "F-measure"],
                                             paired=True,
                                             parametric=True,
                                             color=None,
                                             color_palette=color_pallette)









    




gg_13_8_otus_amplicon level 4






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[16,16]:l2:True:0.8	0.79959	0.829692	0.999721	0.999976	0.999848

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_amplicon level 5






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.01:prior:1024:[8,8]:l2:True:0.6	0.816088	0.800213	0.993622	0.993945	0.993784

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_amplicon level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:1024:[16,16]:l2:False:0.4	0.879643	0.787719	0.968601	0.969304	0.968952

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_full level 4






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:uniform:65536:[8,8]:l2:False:0.8	0.783437	0.829692	0.999728	0.999987	0.999857

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_full level 5






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.01:prior:8192:[8,8]:l2:False:0.8	0.818158	0.790009	0.993642	0.993903	0.993772

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_full level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:8192:[8,8]:l2:False:0.2	0.861703	0.783638	0.976765	0.977484	0.977124

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_read level 4






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.01:uniform:8192:[16,16]:l2:True:0.6	0.782342	0.829692	0.999724	0.999972	0.999848

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_read level 5






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.01:prior:1024:[8,8]:l2:True:0.6	0.810208	0.800213	0.993659	0.993945	0.993802

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_read level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:1024:[8,8]:l2:False:0.4	0.871071	0.787719	0.968596	0.969297	0.968946

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_amplicon level 4






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[4,16]:l2:False:0.8	0.723485	0.54717	0.907487	0.921429	0.912529

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_amplicon level 5






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[4,16]:l2:False:0.8	0.700848	0.526587	0.873847	0.891686	0.881093

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_amplicon level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:1024:[8,8]:l2:False:0.4	0.540731	0.256218	0.975314	0.674285	0.725063

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_full level 4






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:8192:[16,16]:l2:True:0.8	0.817835	0.539819	0.985147	0.997707	0.991269

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_full level 5






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:8192:[16,16]:l2:True:0.8	0.817835	0.538808	0.985147	0.997707	0.991269

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_full level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:8192:[16,16]:l2:True:0.8	0.817835	0.538808	0.985147	0.997707	0.991269

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_read level 4






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[8,8]:l2:True:0.8	0.752057	0.472678	0.84	0.786714	0.777958

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_read level 5






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[8,8]:l2:True:0.6	0.730044	0.472341	0.840278	0.786714	0.778107

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_read level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.01:prior:8192:[4,16]:l2:True:0.6	0.399013	0.257933	0.809594	0.674285	0.720384

Method A	Method B	stat	P
      
      
    
  
  
  








    












    Out[39]:





{'gg_13_8_otus_amplicon': <matplotlib.axes._subplots.AxesSubplot at 0x12f48f3c8>,
 'gg_13_8_otus_full': <matplotlib.axes._subplots.AxesSubplot at 0x14814b9b0>,
 'gg_13_8_otus_read': <matplotlib.axes._subplots.AxesSubplot at 0x14773fc50>,
 'unite_20.11.2016_clean_amplicon': <matplotlib.axes._subplots.AxesSubplot at 0x146d9df98>,
 'unite_20.11.2016_clean_full': <matplotlib.axes._subplots.AxesSubplot at 0x13505b5c0>,
 'unite_20.11.2016_clean_read': <matplotlib.axes._subplots.AxesSubplot at 0x1347a5f98>}
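
As a rough illustration of the ranking step used above, the following sketch averages a metric over samples for every method/parameter combination and keeps the top-scoring combination for each method within each reference and level. It is not the `rank_optimized_method_performance_by_dataset` implementation; the data frame, its values, and the helper `best_configuration_per_method` are hypothetical and only mirror the column names shown in the tables.

```python
import pandas as pd

# Hypothetical per-sample results. Column names mirror the tables in this
# notebook, but the methods, parameters, and values are made up.
results = pd.DataFrame({
    "Reference": ["ref_a"] * 8,
    "Level": [6] * 8,
    "Method": ["m1"] * 4 + ["m2"] * 4,
    "Parameters": ["p1", "p1", "p2", "p2", "q1", "q1", "q2", "q2"],
    "Dataset": ["mock-1", "mock-2"] * 4,
    "F-measure": [0.90, 0.80, 0.95, 0.85, 0.70, 0.60, 0.65, 0.75],
})


def best_configuration_per_method(df, metric="F-measure"):
    """Average `metric` over samples, then keep each method's top configuration."""
    keys = ["Reference", "Level", "Method"]
    # Mean of the metric for every method/parameter combination.
    means = df.groupby(keys + ["Parameters"], as_index=False)[metric].mean()
    # Row label of the highest mean within each reference/level/method group.
    top_rows = means.groupby(keys)[metric].idxmax()
    return (means.loc[top_rows]
                 .sort_values(metric, ascending=False)
                 .reset_index(drop=True))


best = best_configuration_per_method(results)
print(best)
```

Because the reported values are means over samples, the F-measure in a row need not equal the harmonic mean of that row's Precision and Recall.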



In [40]:

    
rank_optimized_method_performance_by_dataset(mock_results,
                                             dataset="Reference",
                                             metric="Taxon Accuracy Rate",
                                             level_range=range(6,7),
                                             display_fields=["Method",
                                                             "Parameters",
                                                             "Taxon Accuracy Rate",
                                                             "Taxon Detection Rate",
                                                             "Precision",
                                                             "Recall",
                                                             "F-measure"],
                                             paired=True,
                                             parametric=True,
                                             color=None,
                                             color_palette=color_pallette)









    




gg_13_8_otus_amplicon level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:65536:[4,4]:l1:True:0.0	1.0	0.107555	0.50601	0.506222	0.506116

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_full level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:1024:[4,4]:l1:True:0.0	1.0	0.071464	0.449513	0.449712	0.449613

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




gg_13_8_otus_read level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.1:prior:65536:[4,4]:l1:True:0.0	1.0	0.12184	0.506509	0.506721	0.506615

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_amplicon level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[16,16]:l1:True:0.0	0.886792	0.093911	0.438905	0.45676	0.447373

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_full level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.01:prior:65536:[16,16]:l1:False:0.0	1.0	0.274014	0.638015	0.673886	0.65502

Method A	Method B	stat	P
      
      
    
  
  
  








    












    




unite_20.11.2016_clean_read level 6






    







  
    
      
	Method	Parameters	Taxon Accuracy Rate	Taxon Detection Rate	Precision	Recall	F-measure
0	nb-extra	0.001:prior:65536:[8,8]:l1:False:0.2	1.0	0.093911	0.438905	0.45676	0.447373

Method A	Method B	stat	P
      
      
    
  
  
  








    












    Out[40]:





{'gg_13_8_otus_amplicon': <matplotlib.axes._subplots.AxesSubplot at 0x1343aa2e8>,
 'gg_13_8_otus_full': <matplotlib.axes._subplots.AxesSubplot at 0x133e8beb8>,
 'gg_13_8_otus_read': <matplotlib.axes._subplots.AxesSubplot at 0x133d7d710>,
 'unite_20.11.2016_clean_amplicon': <matplotlib.axes._subplots.AxesSubplot at 0x134487438>,
 'unite_20.11.2016_clean_full': <matplotlib.axes._subplots.AxesSubplot at 0x12e92d9e8>,
 'unite_20.11.2016_clean_read': <matplotlib.axes._subplots.AxesSubplot at 0x12e548e10>}
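
Both calls above request paired, parametric comparisons (`paired=True`, `parametric=True`), which the text describes as paired t-tests with FDR-corrected P-values. The sketch below shows one way such a comparison could be computed, assuming Benjamini-Hochberg as the FDR procedure (the correction actually used by tax-credit is not stated here). The function names and the example values are hypothetical. With a single method there are no pairs to test, which would be consistent with the empty stat/P tables in this notebook.

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_rel


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest P-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest P-value downwards and cap at 1.
    adjusted_sorted = np.minimum(np.minimum.accumulate(scaled[::-1])[::-1], 1.0)
    adjusted = np.empty(n)
    adjusted[order] = adjusted_sorted
    return adjusted


def pairwise_paired_ttests(scores):
    """Compare every pair of methods with a paired t-test.

    `scores` maps a method name to per-sample values, with samples in the
    same order for every method. Returns (method A, method B, stat, P) tuples.
    """
    pairs = list(combinations(sorted(scores), 2))
    if not pairs:  # a single method leaves nothing to compare
        return []
    tests = [ttest_rel(scores[a], scores[b]) for a, b in pairs]
    fdr = benjamini_hochberg([t.pvalue for t in tests])
    return [(a, b, float(t.statistic), float(q))
            for (a, b), t, q in zip(pairs, tests, fdr)]


# Hypothetical F-measures for three methods on the same five samples.
f_measure = {
    "method_a": [0.91, 0.88, 0.95, 0.90, 0.93],
    "method_b": [0.85, 0.80, 0.90, 0.86, 0.88],
    "method_c": [0.90, 0.89, 0.94, 0.91, 0.92],
}
comparisons = pairwise_paired_ttests(f_measure)
for method_a, method_b, stat, p in comparisons:
    print(method_a, method_b, round(stat, 3), round(p, 4))

# One method only: the result is an empty list.
single = pairwise_paired_ttests({"nb-extra": [0.97, 0.99, 0.98]})
```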



In [ ]:

	Reference	Variable	2	3	4	5	6
0	gg_13_8_otus_amplicon	Precision	1.0	1.0	1.0	1.0	1.0
1	gg_13_8_otus_amplicon	Recall	1.0	1.0	1.0	1.0	1.0
2	gg_13_8_otus_amplicon	F-measure	1.0	1.0	1.0	1.0	1.0
3	gg_13_8_otus_amplicon	Taxon Accuracy Rate	1.0	1.0	1.0	1.0	1.0
4	gg_13_8_otus_amplicon	Taxon Detection Rate	1.0	1.0	1.0	1.0	1.0
5	gg_13_8_otus_full	Precision	1.0	1.0	1.0	1.0	1.0
6	gg_13_8_otus_full	Recall	1.0	1.0	1.0	1.0	1.0
7	gg_13_8_otus_full	F-measure	1.0	1.0	1.0	1.0	1.0
8	gg_13_8_otus_full	Taxon Accuracy Rate	1.0	1.0	1.0	1.0	1.0
9	gg_13_8_otus_full	Taxon Detection Rate	1.0	1.0	1.0	1.0	1.0
10	gg_13_8_otus_read	Precision	1.0	1.0	1.0	1.0	1.0
11	gg_13_8_otus_read	Recall	1.0	1.0	1.0	1.0	1.0
12	gg_13_8_otus_read	F-measure	1.0	1.0	1.0	1.0	1.0
13	gg_13_8_otus_read	Taxon Accuracy Rate	1.0	1.0	1.0	1.0	1.0
14	gg_13_8_otus_read	Taxon Detection Rate	1.0	1.0	1.0	1.0	1.0
15	unite_20.11.2016_clean_amplicon	Precision	1.0	1.0	1.0	1.0	1.0
16	unite_20.11.2016_clean_amplicon	Recall	1.0	1.0	1.0	1.0	1.0
17	unite_20.11.2016_clean_amplicon	F-measure	1.0	1.0	1.0	1.0	1.0
18	unite_20.11.2016_clean_amplicon	Taxon Accuracy Rate	1.0	1.0	1.0	1.0	1.0
19	unite_20.11.2016_clean_amplicon	Taxon Detection Rate	1.0	1.0	1.0	1.0	1.0
20	unite_20.11.2016_clean_full	Precision	1.0	1.0	1.0	1.0	1.0
21	unite_20.11.2016_clean_full	Recall	1.0	1.0	1.0	1.0	1.0
22	unite_20.11.2016_clean_full	F-measure	1.0	1.0	1.0	1.0	1.0
23	unite_20.11.2016_clean_full	Taxon Accuracy Rate	1.0	1.0	1.0	1.0	1.0
24	unite_20.11.2016_clean_full	Taxon Detection Rate	1.0	1.0	1.0	1.0	1.0
25	unite_20.11.2016_clean_read	Precision	1.0	1.0	1.0	1.0	1.0
26	unite_20.11.2016_clean_read	Recall	1.0	1.0	1.0	1.0	1.0
27	unite_20.11.2016_clean_read	F-measure	1.0	1.0	1.0	1.0	1.0
28	unite_20.11.2016_clean_read	Taxon Accuracy Rate	1.0	1.0	1.0	1.0	1.0
29	unite_20.11.2016_clean_read	Taxon Detection Rate	1.0	1.0	1.0	1.0	1.0

	F-measure	Precision	Recall	Taxon Accuracy Rate	Taxon Detection Rate
0.01:prior:1024:[8,8]:l2:False:0.0	139.0	125.0	145.0	145.0	74.0
0.01:prior:65536:[8,8]:l2:True:0.2	140.0	126.0	146.0	51.0	74.0
0.01:prior:65536:[8,8]:l2:True:0.0	140.0	126.0	146.0	51.0	74.0
0.1:prior:1024:[8,8]:l2:True:0.0	139.0	125.0	145.0	159.0	73.0
0.1:prior:8192:[8,8]:l2:True:0.0	140.0	126.0	146.0	130.0	73.0
0.1:prior:8192:[8,8]:l2:True:0.2	140.0	126.0	146.0	127.0	73.0
0.01:prior:1024:[16,16]:l2:False:0.0	135.0	121.0	141.0	143.0	72.0
0.01:prior:8192:[8,8]:l2:False:0.0	140.0	126.0	146.0	86.0	72.0
0.1:prior:1024:[16,16]:l2:True:0.2	135.0	121.0	141.0	147.0	72.0
0.1:prior:1024:[16,16]:l2:True:0.0	135.0	121.0	141.0	149.0	72.0

	F-measure	Precision	Recall	Taxon Accuracy Rate	Taxon Detection Rate
0.001:uniform:65536:[8,8]:l2:True:0.4	65.0	60.0	72.0	73.0	74.0
0.001:uniform:8192:[8,8]:l2:False:0.2	63.0	58.0	71.0	72.0	74.0
0.1:uniform:65536:[8,8]:None:True:0.8	63.0	56.0	73.0	70.0	74.0
0.1:uniform:65536:[8,8]:None:True:0.6	63.0	56.0	73.0	70.0	74.0
0.1:uniform:65536:[8,8]:None:True:0.4	63.0	56.0	73.0	70.0	74.0
0.1:uniform:65536:[8,8]:None:True:0.2	63.0	56.0	73.0	70.0	74.0
0.1:uniform:65536:[8,8]:None:True:0.0	63.0	56.0	73.0	70.0	74.0
0.01:uniform:8192:[8,8]:l2:True:0.0	63.0	57.0	69.0	70.0	74.0
0.001:uniform:8192:[8,8]:l2:False:0.0	63.0	58.0	71.0	71.0	74.0
0.001:uniform:65536:[8,8]:l2:True:0.0	63.0	58.0	72.0	70.0	74.0
