Experimental Results from a Decision Tree-based NER model

Decision Trees, as opposed to other machine learning techniques such as SVMs and Neural Networks, provide a human-interpretable classification model. We will exploit this both to generate readable tree visualizations and to glean information for feature selection in our high-dimensionality datasets.
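As a concrete illustration of that interpretability, a fitted scikit-learn DecisionTreeClassifier exposes per-feature importance scores that can guide feature selection. The snippet below is a minimal sketch on toy data with made-up feature names, not output from the actual project files:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy stand-ins for an ARFF-derived feature matrix and its attribute names.
X = np.random.randint(0, 2, size=(100, 5))
y = np.random.randint(0, 2, size=100)
feature_names = ["init_cap", "all_caps", "has_digit", "has_hyphen", "word_len"]

dtree = DecisionTreeClassifier().fit(X, y)

# Gini-based importances sum to 1.0; features that never appear in a split score 0
# and are natural candidates to drop from a high-dimensionality feature set.
for name, importance in sorted(zip(feature_names, dtree.feature_importances_),
                               key=lambda pair: pair[1], reverse=True):
    print("%-12s %.3f" % (name, importance))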

This report provides precision, recall, and F-measure values for Decision Trees built on the orthographic, orthographic + morphological, and orthographic + morphological + lexical feature sets for the Adverse Reaction, Indication, Active Ingredient, and Inactive Ingredient entities. A viewable Decision Tree structure is also generated for each fold.


The file 'decisiontree.py' builds a Decision Tree classifier on the sparse-format ARFF file passed in as a parameter. The resulting model is saved in the models directory with the name 'decisiontree_[featuresets]_[entity name].pkl'.
The file 'evaluate_decisiontree.py' evaluates a given Decision Tree model stored inside a '.pkl' file, outputting the appropriate statistics and saving a PDF image of the decision structure underlying the given model.
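Neither script is reproduced in this report, but their core steps roughly follow the pattern below. This is a hedged sketch: it assumes scikit-learn's DecisionTreeClassifier, the joblib persistence and the Tools.arff_converter helper used later in this notebook, and that the last ARFF attribute is the class label; the function names and the "Entity" class value are placeholders.

from sklearn import tree
from sklearn.externals import joblib          # same joblib import used later in this notebook
from sklearn.metrics import precision_score, recall_score, confusion_matrix
from Tools import arff_converter

def train_fold(training_arff, model_path):
    """Rough outline of decisiontree.py: fit a tree on one training fold and pickle it."""
    df = arff_converter.arff2df(training_arff)
    X, y = df.iloc[:, :-1], df.iloc[:, -1]    # last ARFF attribute is the class label
    model = tree.DecisionTreeClassifier().fit(X, y)
    joblib.dump(model, model_path)
    return model

def evaluate_fold(testing_arff, model_path):
    """Rough outline of evaluate_decisiontree.py: print precision, recall and the confusion matrix."""
    df = arff_converter.arff2df(testing_arff)
    X, y = df.iloc[:, :-1], df.iloc[:, -1]
    predicted = joblib.load(model_path).predict(X)
    # assumption: "Entity" is the positive class value; substitute the actual ARFF label
    print("Precision: %f" % precision_score(y, predicted, pos_label="Entity"))
    print("Recall: %f" % recall_score(y, predicted, pos_label="Entity"))
    print(confusion_matrix(y, predicted))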

All ARFF files were cleaned with 'arff_translator.py'. This cleaning consisted of removing, from each instance, a comma that was mistakenly inserted during file creation.

python3 arff_translator.py [filename]
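The translator itself is not shown here; a minimal sketch of that kind of clean-up might look like the following (the position of the stray comma is an assumption):

import sys

# Hypothetical stand-in for arff_translator.py: drop one stray comma per data instance.
filename = sys.argv[1]
with open(filename) as f:
    lines = f.readlines()

with open(filename, "w") as f:
    in_data = False
    for line in lines:
        if line.strip().lower() == "@data":
            in_data = True
        elif in_data and line.strip():
            # assumption: the mistakenly inserted comma is the first one on the instance line
            line = line.replace(",", "", 1)
        f.write(line)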

Adverse Reaction Feature Set

Orthographic Features


In [5]:
import os
import subprocess

""" Creates models for each fold and runs evaluation with results """
featureset = "o"
entity_name = "adversereaction"

for fold in range(1, 1):  # empty range on purpose: training has already been done
    training_data = "../ARFF_Files/%s_ARFF/_%s/_train/%s_train-%i.arff" % (entity_name, featureset, entity_name, fold)
    os.system("python3 decisiontree.py -tr %s" % training_data)


for fold in range(1, 11):
    testing_data = "../ARFF_Files/%s_ARFF/_%s/_test/%s_test-%i.arff" % (entity_name, featureset, entity_name, fold)
    output = subprocess.check_output("python3 evaluate_decisiontree.py -te %s" % testing_data, shell=True)
    print(output.decode('utf-8'))


adversereaction_test-1.arff
Precision: 0.961538
Recall: 0.013789
[[   25  1788]
 [    1 16927]]


adversereaction_test-2.arff
Precision: 0.750000
Recall: 0.008167
[[    9  1093]
 [    3 19878]]


adversereaction_test-3.arff
Precision: 0.333333
Recall: 0.001961
[[    1   509]
 [    2 10642]]


adversereaction_test-4.arff
Precision: 1.000000
Recall: 0.009394
[[   11  1160]
 [    0 10655]]


adversereaction_test-5.arff
Precision: 0.571429
Recall: 0.010852
[[   20  1823]
 [   15 18196]]


adversereaction_test-6.arff
Precision: 0.166667
Recall: 0.002210
[[    2   903]
 [   10 13178]]


adversereaction_test-7.arff
Precision: 0.800000
Recall: 0.006098
[[    4   652]
 [    1 18655]]


adversereaction_test-8.arff
Precision: 0.708333
Recall: 0.020118
[[   17   828]
 [    7 15856]]


adversereaction_test-9.arff
Precision: 0.500000
Recall: 0.001765
[[   2 1131]
 [   2 8715]]


adversereaction_test-10.arff
Precision: 0.538462
Recall: 0.006261
[[    7  1111]
 [    6 15010]]


Rather lackluster performance: precision fluctuates wildly across folds, and recall never rises above roughly 2%.
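The matrices printed above appear to be laid out with the entity class in the first row and column, i.e. [[TP, FN], [FP, TN]]; the reported precision and recall follow directly from those counts. A quick check against fold 1:

# Fold 1 counts read off the printed matrix (assumed layout [[TP, FN], [FP, TN]]).
tp, fn = 25, 1788
fp, tn = 1, 16927

precision = tp / (tp + fp)   # 25 / 26    ~= 0.961538
recall = tp / (tp + fn)      # 25 / 1813  ~= 0.013789
f1 = 2 * precision * recall / (precision + recall)
print("Precision: %f  Recall: %f  F1: %f" % (precision, recall, f1))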

Orthographic + Morphological Features


In [9]:
import os
import subprocess

""" Creates models for each fold and runs evaluation with results """
featureset = "om"
entity_name = "adversereaction"

for fold in range(1, 1):  # empty range on purpose: training has already been done
    training_data = "../ARFF_Files/%s_ARFF/_%s/_train/%s_train-%i.arff" % (entity_name, featureset, entity_name, fold)
    os.system("python3 decisiontree.py -tr %s" % training_data)


for fold in range(1, 11):
    testing_data = "../ARFF_Files/%s_ARFF/_%s/_test/%s_test-%i.arff" % (entity_name, featureset, entity_name, fold)
    output = subprocess.check_output("python3 evaluate_decisiontree.py -te %s" % testing_data, shell=True)
    print(output.decode('utf-8'))


adversereaction_test-1.arff
Precision: 0.810458
Recall: 0.478764
[[  868   945]
 [  203 16725]]


adversereaction_test-2.arff
Precision: 0.475576
Recall: 0.468240
[[  516   586]
 [  569 19312]]


adversereaction_test-3.arff
Precision: 0.487965
Recall: 0.437255
[[  223   287]
 [  234 10410]]


adversereaction_test-4.arff
Precision: 0.795165
Recall: 0.533732
[[  625   546]
 [  161 10494]]


adversereaction_test-5.arff
Precision: 0.767084
Recall: 0.432447
[[  797  1046]
 [  242 17969]]


adversereaction_test-6.arff
Precision: 0.607207
Recall: 0.372376
[[  337   568]
 [  218 12970]]


adversereaction_test-7.arff
Precision: 0.423135
Recall: 0.423780
[[  278   378]
 [  379 18277]]


adversereaction_test-8.arff
Precision: 0.526387
Recall: 0.460355
[[  389   456]
 [  350 15513]]


adversereaction_test-9.arff
Precision: 0.797601
Recall: 0.469550
[[ 532  601]
 [ 135 8582]]


adversereaction_test-10.arff
Precision: 0.732477
Recall: 0.560823
[[  627   491]
 [  229 14787]]


It appears that adding the morphological features greatly increased classifier performance; a rough macro-averaged F-measure comparison between the two feature sets is sketched below.
After that, find the underlying decision tree structure generated for one of the folds.
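The report calls for F-measure as well, so the per-fold precision and recall printed above can be averaged and combined. This is just a convenience sketch over the numbers listed in this section, not output from evaluate_decisiontree.py:

def macro_scores(precisions, recalls):
    """Average per-fold precision/recall and combine them into a single F-measure."""
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    return p, r, 2 * p * r / (p + r)

# Per-fold values copied from the output above.
o_precision = [0.961538, 0.750000, 0.333333, 1.000000, 0.571429,
               0.166667, 0.800000, 0.708333, 0.500000, 0.538462]
o_recall = [0.013789, 0.008167, 0.001961, 0.009394, 0.010852,
            0.002210, 0.006098, 0.020118, 0.001765, 0.006261]
om_precision = [0.810458, 0.475576, 0.487965, 0.795165, 0.767084,
                0.607207, 0.423135, 0.526387, 0.797601, 0.732477]
om_recall = [0.478764, 0.468240, 0.437255, 0.533732, 0.432447,
             0.372376, 0.423780, 0.460355, 0.469550, 0.560823]

print("o:  P=%.3f R=%.3f F=%.3f" % macro_scores(o_precision, o_recall))
print("om: P=%.3f R=%.3f F=%.3f" % macro_scores(om_precision, om_recall))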


In [41]:
import graphviz
from sklearn.externals import joblib
from Tools import arff_converter
from sklearn import tree

featureset = "o"
entity_name = "adversereaction"

fold = 3
training_data = "../ARFF_Files/%s_ARFF/_%s/_train/%s_train-%i.arff" % (entity_name, featureset, entity_name, fold)
dataset = arff_converter.arff2df(training_data)
dtree = joblib.load('../Models/decisiontree/adversereaction_o/decisiontree_o_adversereaction_train-%i.arff.pkl' % fold)

# Export the fitted tree in dot format, labelling nodes with feature and class names.
tree.export_graphviz(dtree, out_file="visual/temptree.dot",
                     feature_names=dataset.columns.values[:-1],
                     class_names=["Entity", "Non-Entity"], label='all',
                     filled=True, rounded=True, proportion=False,
                     leaves_parallel=True, special_characters=True)

with open("visual/temptree.dot") as f:
    dot_graph = f.read()
graphviz.Source(dot_graph)


Out[41]:
'Source.gv.pdf'
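The returned value appears to be the path of the PDF that graphviz wrote next to the notebook. To save the same dot source under a more descriptive name, it can be rendered explicitly; the filename below is just an example:

# Render the exported dot source to a named PDF instead of the default 'Source.gv.pdf'.
graph = graphviz.Source(dot_graph, filename="visual/decisiontree_o_adversereaction_fold3", format="pdf")
graph.render()   # writes visual/decisiontree_o_adversereaction_fold3.pdf and returns its path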
