In this notebook we demonstrate basic document-level classification of reports with respect to a single finding (fever). We use Pandas to read our data from a MySQL database and then to add our classification as a new column in the dataframe.
Many of the common pyConTextNLP tasks have been wrapped into functions contained in the radnlp package. We import multiple modules that will allow us to write concise code.
In [1]:
from utils import *
import pandas as pd
data = get_data()
data.head(5)
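The get_data helper comes from utils and its body is not shown in this notebook. As a rough sketch of the general idea only (loading report text from a SQL database into a DataFrame), the example below uses an in-memory SQLite database as a stand-in for MySQL. The table name reports, the column names, and the sample rows are all hypothetical and are not taken from the actual data source.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database used only as a stand-in for the MySQL source.
# The table name, column names, and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, impression TEXT)")
conn.executemany(
    "INSERT INTO reports (id, impression) VALUES (?, ?)",
    [(1, "Patient presents with fever."), (2, "No acute findings.")],
)
conn.commit()

# pandas runs the query and returns the result set as a DataFrame.
demo = pd.read_sql_query("SELECT id, impression FROM reports ORDER BY id", conn)
conn.close()

demo.head(5)
```

With a real MySQL server, the connection object would come from a MySQL driver or an SQLAlchemy engine instead of sqlite3; the read_sql_query call itself keeps the same shape.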
We now need to apply our schema to the reports. Since our data is in a Pandas data frame, the easiest way to process our reports is with the DataFrame apply method. We use lambda to create an anonymous function that applies analyze_report to the "impression" column with the modifiers, targets, etc. that we have read in separately. analyze_report returns a dictionary whose keys are any identified targets defined in the "targets" file and whose values are tuples.
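To make the apply-with-lambda pattern concrete, here is a minimal, self-contained sketch. The function toy_analyze_report is a hypothetical stand-in and is not the radnlp analyze_report: it only does a case-insensitive substring search, and the tuple it returns (a mention count) is a placeholder rather than the real return format.

```python
import pandas as pd


def toy_analyze_report(text, targets):
    """Hypothetical stand-in for analyze_report.

    Maps each target found in the text to a one-element placeholder tuple
    holding the number of case-insensitive substring matches.
    """
    lowered = text.lower()
    return {t: (lowered.count(t),) for t in targets if t in lowered}


# Made-up sample reports, for illustration only.
reports = pd.DataFrame(
    {"impression": ["Patient has fever. Fever persists.", "No acute findings."]}
)
targets = ["fever"]

# axis=1 passes each row to the lambda, which pulls out the "impression"
# text and forwards it together with the separately loaded targets.
reports["result"] = reports.apply(
    lambda row: toy_analyze_report(row["impression"], targets), axis=1
)
```

Each cell of the new column holds one dictionary per report, and a report with no matching target gets an empty dictionary.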
In [3]:
options = getOptions()
kb = get_kb_rules_schema(options)
#data = data.dropna()
data["pe rslt"] = data.apply(
    lambda x: analyze_report(x["impression"],
                             kb["modifiers"],
                             kb["targets"],
                             kb["rules"],
                             kb["schema"]),
    axis=1)
In [4]:
view_markup(data, colors)