2018.02.09 - prelim - disagreement analysis

Setup

Setup - Imports


In [1]:
import datetime
import six

print( "packages imported at " + str( datetime.datetime.now() ) )


packages imported at 2019-04-11 18:59:35.867114

Setup - virtualenv jupyter kernel

If you are using a virtualenv, make sure that you:

  • have installed your virtualenv as a kernel.
  • choose the kernel for your virtualenv as the kernel for your notebook (Kernel --> Change kernel).

Since I use a virtualenv, I need to get it activated somehow inside this notebook. One option is to run ../dev/wsgi.py in this notebook, to configure the Python environment manually as if you had activated the sourcenet virtualenv. To do this, you'd make a code cell that contains:

%run ../dev/wsgi.py

This is sketchy, however, because of the changes it makes to your Python environment within the context of whatever your current kernel is. I'd worry about collisions with the actual Python 3 kernel. Better, you can install your virtualenv as a separate kernel. Steps:

  • activate your virtualenv:

      workon sourcenet
  • in your virtualenv, install the package ipykernel.

      pip install ipykernel
  • use the ipykernel python program to install the current environment as a kernel:

      python -m ipykernel install --user --name <env_name> --display-name "<display_name>"
    
    

    sourcenet example:

      python -m ipykernel install --user --name sourcenet --display-name "sourcenet (Python 3)"

More details: http://ipython.readthedocs.io/en/stable/install/kernel_install.html


In [2]:
%pwd


Out[2]:
'/home/jonathanmorgan/work/django/research/work/phd_work/methods/evaluate_disagreements'

Setup - Initialize Django

First, initialize my dev django project, so I can run code in this notebook that references my django models and can talk to the database using my project's settings.


In [4]:
%run ../django_init.py


django initialized at 2019-04-11 19:00:51.443162

In [5]:
# django imports
from context_analysis.models import Reliability_Names_Evaluation

Setup - field metadata


In [6]:
# CONSTANTS-ish - names of boolean model fields.
FIELD_NAME_IS_AMBIGUOUS = "is_ambiguous"
FIELD_NAME_IS_ATTRIBUTION_COMPOUND = "is_attribution_compound"
FIELD_NAME_IS_ATTRIBUTION_FOLLOW_ON = "is_attribution_follow_on"
FIELD_NAME_IS_ATTRIBUTION_PRONOUN = "is_attribution_pronoun"
FIELD_NAME_IS_ATTRIBUTION_SECOND_HAND = "is_attribution_second_hand"
FIELD_NAME_IS_COMPLEX = "is_complex"
FIELD_NAME_IS_COMPOUND_NAMES = "is_compound_names"
FIELD_NAME_IS_CONTRIBUTED_TO = "is_contributed_to"
FIELD_NAME_IS_DICTIONARY_ERROR = "is_dictionary_error"
FIELD_NAME_IS_DISAMBIGUATION = "is_disambiguation"
FIELD_NAME_IS_EDITING_ERROR = "is_editing_error"
FIELD_NAME_IS_ERROR = "is_error"
FIELD_NAME_IS_FOREIGN_NAMES = "is_foreign_names"
FIELD_NAME_IS_GENDER_CONFUSION = "is_gender_confusion"
FIELD_NAME_IS_INITIALS_ERROR = "is_initials_error"
FIELD_NAME_IS_INTERESTING = "is_interesting"
FIELD_NAME_IS_LAYOUT_OR_DESIGN = "is_layout_or_design"
FIELD_NAME_IS_LIST = "is_list"
FIELD_NAME_IS_LOOKUP_ERROR = "is_lookup_error"
FIELD_NAME_IS_NO_HTML = "is_no_html"
FIELD_NAME_IS_NOT_HARD_NEWS = "is_not_hard_news"
FIELD_NAME_IS_POSSESSIVE = "is_possessive"
FIELD_NAME_IS_PRONOUNS = "is_pronouns"
FIELD_NAME_IS_PROPER_NOUN = "is_proper_noun"
FIELD_NAME_IS_QUOTE_DISTANCE = "is_quote_distance"
FIELD_NAME_IS_SAID_VERB = "is_said_verb"
FIELD_NAME_IS_SHORT_N_GRAM = "is_short_n_gram"
FIELD_NAME_IS_SOFTWARE_ERROR = "is_software_error"
FIELD_NAME_IS_SPANISH = "is_spanish"
FIELD_NAME_IS_SPORTS = "is_sports"
FIELD_NAME_IS_STRAIGHTFORWARD = "is_straightforward"
FIELD_NAME_IS_TITLE = "is_title"
FIELD_NAME_IS_TITLE_COMPLEX = "is_title_complex"
FIELD_NAME_IS_TITLE_PREFIX = "is_title_prefix"

In [7]:
# CONSTANTS-ish - other related field names.
FIELD_NAME_IS_SUBJECT_SHB_AUTHOR = "is_subject_shb_author"
FIELD_NAME_IS_NOT_A_PERSON = "is_not_a_person"

In [8]:
# CONSTANTS-ish - names of properties per field.
PROP_NAME = "name"
PROP_TAG_LIST = "tag_list"
PROP_IS_ERROR = "is_error"
PROP_ASSOCIATED_FIELDS = "associated_fields"

# CONSTANTS-ish - map of field names to field traits.
FIELD_NAME_TO_TRAITS_MAP = {}

# CONSTANTS-ish - map tag values to field names.
TAG_TO_FIELD_NAME_MAP = {}

In [9]:
# set up mapping of field names to traits.
temp_traits_map = {}

# FIELD_NAME_IS_AMBIGUOUS = "is_ambiguous"
temp_traits_map = {}
field_name = FIELD_NAME_IS_AMBIGUOUS
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'ambiguous', 'ambiguity' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_ATTRIBUTION_COMPOUND = "is_attribution_compound"
temp_traits_map = {}
field_name = FIELD_NAME_IS_ATTRIBUTION_COMPOUND
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'compound_attribution' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_ATTRIBUTION_FOLLOW_ON = "is_attribution_follow_on"
temp_traits_map = {}
field_name = FIELD_NAME_IS_ATTRIBUTION_FOLLOW_ON
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'follow_on_attribution' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_ATTRIBUTION_PRONOUN = "is_attribution_pronoun"
temp_traits_map = {}
field_name = FIELD_NAME_IS_ATTRIBUTION_PRONOUN
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'pronoun_attribution' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_ATTRIBUTION_SECOND_HAND = "is_attribution_second_hand"
temp_traits_map = {}
field_name = FIELD_NAME_IS_ATTRIBUTION_SECOND_HAND
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'second_hand' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_COMPLEX = "is_complex"
temp_traits_map = {}
field_name = FIELD_NAME_IS_COMPLEX
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'complex' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_COMPOUND_NAMES = "is_compound_names"
temp_traits_map = {}
field_name = FIELD_NAME_IS_COMPOUND_NAMES
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'compound_names' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_CONTRIBUTED_TO = "is_contributed_to"
temp_traits_map = {}
field_name = FIELD_NAME_IS_CONTRIBUTED_TO
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'contributed_to' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR, FIELD_NAME_IS_SUBJECT_SHB_AUTHOR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_DICTIONARY_ERROR = "is_dictionary_error"
temp_traits_map = {}
field_name = FIELD_NAME_IS_DICTIONARY_ERROR
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'dictionary_error' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_DISAMBIGUATION = "is_disambiguation"
temp_traits_map = {}
field_name = FIELD_NAME_IS_DISAMBIGUATION
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'disambiguation' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_EDITING_ERROR = "is_editing_error"
temp_traits_map = {}
field_name = FIELD_NAME_IS_EDITING_ERROR
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'editing_error' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_ERROR = "is_error"
temp_traits_map = {}
field_name = FIELD_NAME_IS_ERROR
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'error' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_FOREIGN_NAMES = "is_foreign_names"
temp_traits_map = {}
field_name = FIELD_NAME_IS_FOREIGN_NAMES
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'foreign_names' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_GENDER_CONFUSION = "is_gender_confusion"
temp_traits_map = {}
field_name = FIELD_NAME_IS_GENDER_CONFUSION
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'gender_confusion' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_INITIALS_ERROR = "is_initials_error"
temp_traits_map = {}
field_name = FIELD_NAME_IS_INITIALS_ERROR
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'initials' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_INTERESTING = "is_interesting"
temp_traits_map = {}
field_name = FIELD_NAME_IS_INTERESTING
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'interesting' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_LAYOUT_OR_DESIGN = "is_layout_or_design"
temp_traits_map = {}
field_name = FIELD_NAME_IS_LAYOUT_OR_DESIGN
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'layout_or_design' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_LIST = "is_list"
temp_traits_map = {}
field_name = FIELD_NAME_IS_LIST
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'list', 'lists' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_LOOKUP_ERROR = "is_lookup_error"
temp_traits_map = {}
field_name = FIELD_NAME_IS_LOOKUP_ERROR
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'lookup' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_NO_HTML = "is_no_html"
temp_traits_map = {}
field_name = FIELD_NAME_IS_NO_HTML
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'no_html' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_NOT_HARD_NEWS = "is_not_hard_news"
temp_traits_map = {}
field_name = FIELD_NAME_IS_NOT_HARD_NEWS
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'non_news' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_POSSESSIVE = "is_possessive"
temp_traits_map = {}
field_name = FIELD_NAME_IS_POSSESSIVE
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'possessive' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_PRONOUNS = "is_pronouns"
temp_traits_map = {}
field_name = FIELD_NAME_IS_PRONOUNS
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'pronouns' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_PROPER_NOUN = "is_proper_noun"
temp_traits_map = {}
field_name = FIELD_NAME_IS_PROPER_NOUN
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'proper_noun', 'proper_nouns' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR, FIELD_NAME_IS_NOT_A_PERSON ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_QUOTE_DISTANCE = "is_quote_distance"
temp_traits_map = {}
field_name = FIELD_NAME_IS_QUOTE_DISTANCE
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'quote_distance' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_SAID_VERB = "is_said_verb"
temp_traits_map = {}
field_name = FIELD_NAME_IS_SAID_VERB
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'said_verb' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_SHORT_N_GRAM = "is_short_n_gram"
temp_traits_map = {}
field_name = FIELD_NAME_IS_SHORT_N_GRAM
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'short_n-gram' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_SOFTWARE_ERROR = "is_software_error"
temp_traits_map = {}
field_name = FIELD_NAME_IS_SOFTWARE_ERROR
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'software_error' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_SPANISH = "is_spanish"
temp_traits_map = {}
field_name = FIELD_NAME_IS_SPANISH
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'spanish' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_SPORTS = "is_sports"
temp_traits_map = {}
field_name = FIELD_NAME_IS_SPORTS
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'sports' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_STRAIGHTFORWARD = "is_straightforward"
temp_traits_map = {}
field_name = FIELD_NAME_IS_STRAIGHTFORWARD
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'straightforward' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = []
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_TITLE = "is_title"
temp_traits_map = {}
field_name = FIELD_NAME_IS_TITLE
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'complex_title', 'complex_titles', 'title_prefix' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_TITLE_COMPLEX = "is_title_complex"
temp_traits_map = {}
field_name = FIELD_NAME_IS_TITLE_COMPLEX
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'complex_title', 'complex_titles' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map

# FIELD_NAME_IS_TITLE_PREFIX = "is_title_prefix"
temp_traits_map = {}
field_name = FIELD_NAME_IS_TITLE_PREFIX
temp_traits_map[ PROP_NAME ] = field_name
temp_traits_map[ PROP_TAG_LIST ] = [ 'title_prefix' ]
temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = [ FIELD_NAME_IS_ERROR ]
FIELD_NAME_TO_TRAITS_MAP[ field_name ] = temp_traits_map
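
The per-field blocks above are deliberately explicit, but they are repetitive. A table-driven helper could shrink this cell considerably; a minimal sketch (the add_field_traits() helper is hypothetical, not part of the project):

# hypothetical helper - registers one field's traits in FIELD_NAME_TO_TRAITS_MAP.
def add_field_traits( field_name_IN, tag_list_IN, associated_fields_IN = None ):

    temp_traits_map = {}
    temp_traits_map[ PROP_NAME ] = field_name_IN
    temp_traits_map[ PROP_TAG_LIST ] = tag_list_IN
    temp_traits_map[ PROP_ASSOCIATED_FIELDS ] = associated_fields_IN if ( associated_fields_IN is not None ) else []
    FIELD_NAME_TO_TRAITS_MAP[ field_name_IN ] = temp_traits_map

#-- END function add_field_traits() --#

# equivalent to the first two blocks above:
add_field_traits( FIELD_NAME_IS_AMBIGUOUS, [ 'ambiguous', 'ambiguity' ] )
add_field_traits( FIELD_NAME_IS_ATTRIBUTION_COMPOUND, [ 'compound_attribution' ], [ FIELD_NAME_IS_ERROR ] )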

In [10]:
# set up mapping of tag values to field names in TAG_TO_FIELD_NAME_MAP.

# declare variables
current_field_name = None
current_traits = None
tag_list = None
current_tag = None

# loop over things in FIELD_NAME_TO_TRAITS_MAP.
for current_field_name in six.iterkeys( FIELD_NAME_TO_TRAITS_MAP ):
    
    # get traits dictionary for field name.
    current_traits = FIELD_NAME_TO_TRAITS_MAP.get( current_field_name, None )
    
    # retrieve tag list for field.
    tag_list = current_traits.get( PROP_TAG_LIST )
    for current_tag in tag_list:
        
        # get list of fields for tag
        tag_field_list = TAG_TO_FIELD_NAME_MAP.get( current_tag, [] )
        
        # append current field.
        tag_field_list.append( current_field_name )
        
        # put list back.
        TAG_TO_FIELD_NAME_MAP[ current_tag ] = tag_field_list
        
    #-- END loop over tags for a given field --#
    
#-- END loop over field names. --#

print( "Map of tags to field names in TAG_TO_FIELD_NAME_MAP: {}".format( str( TAG_TO_FIELD_NAME_MAP ) ) )


Map of tags to field names in TAG_TO_FIELD_NAME_MAP: {'ambiguous': ['is_ambiguous'], 'ambiguity': ['is_ambiguous'], 'compound_attribution': ['is_attribution_compound'], 'follow_on_attribution': ['is_attribution_follow_on'], 'pronoun_attribution': ['is_attribution_pronoun'], 'second_hand': ['is_attribution_second_hand'], 'complex': ['is_complex'], 'compound_names': ['is_compound_names'], 'contributed_to': ['is_contributed_to'], 'dictionary_error': ['is_dictionary_error'], 'disambiguation': ['is_disambiguation'], 'editing_error': ['is_editing_error'], 'error': ['is_error'], 'foreign_names': ['is_foreign_names'], 'gender_confusion': ['is_gender_confusion'], 'initials': ['is_initials_error'], 'interesting': ['is_interesting'], 'layout_or_design': ['is_layout_or_design'], 'list': ['is_list'], 'lists': ['is_list'], 'lookup': ['is_lookup_error'], 'no_html': ['is_no_html'], 'non_news': ['is_not_hard_news'], 'possessive': ['is_possessive'], 'pronouns': ['is_pronouns'], 'proper_noun': ['is_proper_noun'], 'proper_nouns': ['is_proper_noun'], 'quote_distance': ['is_quote_distance'], 'said_verb': ['is_said_verb'], 'short_n-gram': ['is_short_n_gram'], 'software_error': ['is_software_error'], 'spanish': ['is_spanish'], 'sports': ['is_sports'], 'straightforward': ['is_straightforward'], 'complex_title': ['is_title', 'is_title_complex'], 'complex_titles': ['is_title', 'is_title_complex'], 'title_prefix': ['is_title', 'is_title_prefix']}

Evaluating disagreements

Look at stats for disagreements and evaluation, including human and computer errors.

Process: Look at each instance where there is a disagreement and make sure the human coding is correct.

Most are probably instances where the computer screwed up, but since we are calling this human coding "ground truth", we want to winnow out as much human error as possible.

For each disagreement, to check for coder error (like capturing just a name part for a person whose full name was in the story), click the link in the "Article ID" column. It takes you to a view of the article that includes all the people who coded it, with each detected mention or quotation displayed next to the paragraph where the person was first detected.

If it is not human error, remove the TODO tag, filling in details on the disagreement in the Reliability_Names_Evaluation record created for the removal of the tag (details: Disagreement tracking process).

If human error:

  • 1) look at all the disagreements for the article.
  • 2) remove all TODO tags from all disagreements, and fill in details for each.
  • 3) follow steps below to create ground_truth copy and fix it.
  • 4) rebuild Reliability_Names for article and cleanup.
  • 5) then, do any deletes or merges you need to do, so you only do them once.

Pull together some numbers and analysis from disagreement work:

  • counts of disagreements.
  • which were human and computer error.
  • ratio of human error to machine error.
  • proportion of human and machine errors.
  • overall number of disagreements compared to all decisions (see the sketch after this list).
  • characterization of potential for systematic issues (not as bad as I feared).
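
For the "disagreements compared to all decisions" item, a sketch of the comparison. Assumption: the Reliability_Names model lives alongside Reliability_Names_Evaluation in context_analysis.models.

# sketch - compare unique disagreements to all coding decisions for the month.
from context_analysis.models import Reliability_Names  # assumption: same app as the evaluation model.

total_decision_count = Reliability_Names.objects.filter( label = "prelim_month" ).count()
disagreement_count = 415  # unique disagreements, per the counts below.
print( "{} disagreements / {} decisions = {}".format( disagreement_count, total_decision_count, ( disagreement_count / total_decision_count ) ) )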

Disagreement tracking process

From 2017.07.01-work_log-prelim_month-evaluate_disagreements.ipynb - Disagreement resolution:

For each disagreement, click on the article ID link in the row to go to the article and check to see if the human coding for the disagreement in question is correct ( http://research.local/research/context/text/article/article_data/view_with_text/ ).

Once you've evaluated and verified the human coding, remove the "TODO" tag from the current record (either from the single-article view above if you've removed all disagreements, or from the disagreement view if not):

  • Click the checkbox in the "select" column next to the record whose evaluation is complete.
  • In the "Reliability names action:" field, select "Remove tag(s) from selected".
  • In the "Tag(s) - (comma-delimited):" field, enter "TODO" (without the quotes).
  • Click the "Do Action" button.
  • This will also place information on the Reliability_Names record into a Reliability_Names_Evaluation record in the database. The message that results from this action completing will include a link to the record (the first number in the output). Click the link to open the record and update it with additional details. Specifically:

    • status - status of human coder's coding:

      • If the human coder got it right, status is "CORRECT", even if OpenCalais had an egregious error.
      • If this is because the coding screen couldn't capture compound names initially, set status to "INCOMPLETE", set the status message to "SKIPPED because screen couldn't deal with compound names", put the compound name string in notes, and then add tag "compound_names".
      • If the OC coder had an issue because we had to smoosh all paragraphs together (it didn't deal well with HTML markup in the body of the text it processed), set status to "CORRECT", set status message to "OC ERROR because of formatting", and then explain the problem in the notes. If the article is a list or column with odd formatting, consider flagging the article for removal from the study.
      • If you have to update ground truth, set "status" to "ERROR".
      • else, use your best judgment.
    • If problems were caused by automated coder error, click the "is_automated_error" checkbox.

    • update the "status_message" so it contains a brief description of what exactly happened (should have been mentioned, should have been quoted, missed the person entirely, etc.).
    • update "Notes" with more details.

      • If the person should have been quoted or mentioned, note the graf # and the text of the paragraph that indicates this.
    • add "Tags" if appropriate (for sports articles, for example, add "sports" tag).

NOTE: Always remove the TODO tag first, so you have a record of status for each Reliability_Names record. Then, once you've done that, you can merge, delete, etc.

Data in Reliability_Names_Evaluation

Overall disagreement log

  • Track each Reliability_Names record with a disagreement that we evaluate (all "remove tags" events with label "prelim_month").
  • Moved to Reliability_Names_Evaluation table in django: http://research.local/research/admin/context_analysis/reliability_names_evaluation/?label=prelim_month&o=-1.7.8.3.5
  • 428 total records.
  • of those, 13 are the same person and article but a different Reliability_Names record - disagreements that had to be corrected twice because Reliability_Names was rebuilt for the article (either human error, or something weird). SQL:

      SELECT sarne1.person_name,
          sarne1.id,
          sarne1.status,
          sarne1.original_reliability_names_id,
          sarne1.is_duplicate,
          sarne2.is_duplicate,
          sarne2.id,
          sarne2.status,
          sarne2.original_reliability_names_id
      FROM context_analysis_reliability_names_evaluation AS sarne1,
          context_analysis_reliability_names_evaluation AS sarne2
      WHERE sarne1.id != sarne2.id
          AND sarne1.label = 'prelim_month'
          AND sarne2.label = 'prelim_month'
          AND sarne1.event_type = 'remove_tags'
          AND sarne2.event_type = 'remove_tags'
          AND sarne1.article_id = sarne2.article_id
          AND sarne1.person_name = sarne2.person_name
          AND sarne1.original_reliability_names_id != sarne2.original_reliability_names_id
          AND sarne2.original_reliability_names_id > sarne1.original_reliability_names_id
      ORDER BY sarne1.id ASC;
  • So, 428 - 13 = 415 unique disagreements.

  • Could regenerate Reliability_Names without ground_truth to look at original counts? Should be able to... Just need to make sure I remember all steps...

    • would need to clear single names, then I'd be left with disagreements. Not worth it.

flag duplicates

TODO:

  • duplicates

    • Make sure that for each pair of duplicates, one has "is_duplicate" checked.
    • to start, mark all that have duplicates as "is_to_do" = True.

Step 1: get IDs of records with duplicates

SELECT DISTINCT ( sarne1.id )
FROM context_analysis_reliability_names_evaluation AS sarne1,
    context_analysis_reliability_names_evaluation AS sarne2
WHERE sarne1.id != sarne2.id
    AND sarne1.label = 'prelim_month'
    AND sarne2.label = 'prelim_month'
    AND sarne1.event_type = 'remove_tags'
    AND sarne2.event_type = 'remove_tags'
    AND sarne1.article_id = sarne2.article_id
    AND sarne1.person_name = sarne2.person_name
    AND sarne1.original_reliability_names_id != sarne2.original_reliability_names_id
    AND sarne2.original_reliability_names_id > sarne1.original_reliability_names_id
ORDER BY sarne1.id ASC;

and

SELECT DISTINCT ( sarne2.id )
FROM context_analysis_reliability_names_evaluation AS sarne1,
    context_analysis_reliability_names_evaluation AS sarne2
WHERE sarne1.id != sarne2.id
    AND sarne1.label = 'prelim_month'
    AND sarne2.label = 'prelim_month'
    AND sarne1.event_type = 'remove_tags'
    AND sarne2.event_type = 'remove_tags'
    AND sarne1.article_id = sarne2.article_id
    AND sarne1.person_name = sarne2.person_name
    AND sarne1.original_reliability_names_id != sarne2.original_reliability_names_id
    AND sarne2.original_reliability_names_id > sarne1.original_reliability_names_id
ORDER BY sarne2.id ASC;
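
Rather than hand-copying the IDs from those two result sets into the next cell, one could collect them with Django's raw database connection. A sketch, assuming the two SELECT statements above have been pasted into the string variables sarne1_id_sql and sarne2_id_sql:

# sketch - collect duplicate-pair IDs from the two queries above instead of hand-copying them.
from django.db import connection

# assumption: the two SELECT statements above, stored as strings.
query_list = [ sarne1_id_sql, sarne2_id_sql ]

ids_to_process = []
with connection.cursor() as cursor:

    for current_sql in query_list:

        cursor.execute( current_sql )
        for current_row in cursor.fetchall():

            # first (only) column is the evaluation record ID.
            ids_to_process.append( current_row[ 0 ] )

        #-- END loop over result rows --#

    #-- END loop over queries --#

#-- END with connection.cursor() --#

print( "IDs to process: {}".format( ids_to_process ) )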

In [ ]:
# Got IDs that contain duplicates, now tag them as todo
ids_to_process = []
ids_to_process.append( 15 )
ids_to_process.append( 33 )
ids_to_process.append( 75 )
ids_to_process.append( 405 )
ids_to_process.append( 435 )
ids_to_process.append( 512 )
ids_to_process.append( 556 )
ids_to_process.append( 586 )
ids_to_process.append( 610 )
ids_to_process.append( 620 )
ids_to_process.append( 635 )
ids_to_process.append( 646 )
ids_to_process.append( 16 )
ids_to_process.append( 34 )
ids_to_process.append( 76 )
ids_to_process.append( 407 )
ids_to_process.append( 432 )
ids_to_process.append( 513 )
ids_to_process.append( 517 )
ids_to_process.append( 558 )
ids_to_process.append( 596 )
ids_to_process.append( 611 )
ids_to_process.append( 619 )
ids_to_process.append( 637 )
ids_to_process.append( 651 )
do_save = False

# retrieve model instances.

# get all evaluation records with label = "prelim_month" and IDs in our list.
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( pk__in = ids_to_process )

# count?
eval_count = evaluation_qs.count()
print( "record count: {}".format( str( eval_count ) ) )

# loop, setting "is_to_do" to True on each and saving.
for current_eval in evaluation_qs:
    
    # set is_to_do to True and set work_status to "duplicate_processing".
    current_eval.is_to_do = True
    current_eval.work_status = "duplicate_processing"
    
    # save?
    if ( do_save == True ):
        
        # save! save! save!
        current_eval.save()
        
    #-- END check to see if we save(). --#
    
#-- END loop over QuerySet. --#

It turns out that many of these were a person the human coder missed that, once fixed, revealed a problem with the automated coding. So many were actually not duplicates; they were two separate issues with the same person.

review tags

TODO:

  • get all tags.
  • make boolean columns for each tag.
  • for each record, set flags based on tags.
  • Not going to flag all as todo and "review_disagreements".
  • Just use the tags I set, imperfect as they are, make basic counts and pick good ones for writing about disagreements.

In [ ]:
'''
django-taggit documentation: https://github.com/alex/django-taggit

Adding tags to a model:

    from django.db import models
    
    from taggit.managers import TaggableManager
    
    class Food(models.Model):
        # ... fields here
    
        tags = TaggableManager()

Interacting with a model that has tags:

    >>> apple = Food.objects.create(name="apple")
    >>> apple.tags.add("red", "green", "delicious")
    >>> apple.tags.all()
    [<Tag: red>, <Tag: green>, <Tag: delicious>]
    >>> apple.tags.remove("green")
    >>> apple.tags.all()
    [<Tag: red>, <Tag: delicious>]
    >>> Food.objects.filter(tags__name__in=["red"])
    [<Food: apple>, <Food: cherry>]
    
    # include only those with certain tags.
    #tags_in_list = [ "prelim_unit_test_001", "prelim_unit_test_002", "prelim_unit_test_003", "prelim_unit_test_004", "prelim_unit_test_005", "prelim_unit_test_006", "prelim_unit_test_007" ]
    tags_in_list = [ "grp_month", ]
    if ( len( tags_in_list ) > 0 ):
    
        # filter
        print( "filtering to just articles with tags: " + str( tags_in_list ) )
        grp_article_qs = grp_article_qs.filter( tags__name__in = tags_in_list )
        
    #-- END check to see if we have a specific list of tags we want to include --#

'''

# imports
from context_analysis.models import Reliability_Names_Evaluation

# declare variables
evaluation_qs = None
record_count = -1
record_counter = -1
current_record = None
tag_to_count_map = {}
tag_qs = None
tag_list = None
current_tag = ""
cleaned_tag = ""
current_count = -1
tag_count = -1
no_tags_list = []

# get all evaluation records with label = "prelim_month" and event_type = "remove_tags".
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )

# first, just make sure that worked.
record_count = evaluation_qs.count()

# Check count of articles retrieved.
print( "Got " + str( record_count ) + " evaluations records." )

# loop over evaluations.
no_tags_count = 0
for current_record in evaluation_qs:

    # get tags.
    # current_article.tags.add( tag_value )
    tag_qs = current_record.tags.all()
    
    # output the tags.
    #print( "- Tags for record " + str( current_record.id ) + " : " + str( tag_qs ) )
    
    # count tags
    tag_count = tag_qs.count()
    
    # got tags?
    if ( tag_count > 0 ):
    
        # loop over tags.
        for current_tag in tag_qs:

            # standardize
            cleaned_tag = str( current_tag )

            # to lower case
            cleaned_tag = cleaned_tag.lower()

            # strip()
            cleaned_tag = cleaned_tag.strip()

            # in map?  Get current count.
            current_count = 0
            if ( cleaned_tag in tag_to_count_map ): 

                # It is in map - get counter for it.
                current_count = tag_to_count_map.get( cleaned_tag, None )

            #-- END check to see if tag in map --#

            # increment count and store.
            current_count += 1
            tag_to_count_map[ cleaned_tag ] = current_count

        #-- END loop over tags --#
        
    else:
        
        # no tags - add this record to the no-tags list.
        no_tags_list.append( current_record )
        
    #-- END check to see if tags or not --#
    
#-- END loop over records --#

# output number of tagless evaluations
no_tags_count = len( no_tags_list )
print( "--> Count of articles with no tags: {}".format( str( no_tags_count ) ) )

# output tags
key_view = six.viewkeys( tag_to_count_map )
tag_list = list( key_view )
tag_list.sort()
for tag_string in tag_list:

    # print each tag and its count.
    current_count = tag_to_count_map.get( tag_string, -1 )
    print( "- {} ( {} )".format( str( tag_string ), str( current_count ) ) )
    
#-- END print tags. --#

Output:

Got 428 evaluation records.

--> Count of articles with no tags: 188

  • ambiguity ( 2 )
  • ambiguous ( 10 )
  • complex ( 23 )
  • complex_title ( 1 )
  • complex_titles ( 25 )
  • compound_attribution ( 1 )
  • compound_names ( 9 )
  • contributed_to ( 2 )
  • dictionary_error ( 20 )
  • disambiguation ( 4 )
  • editing_error ( 3 )
  • error ( 204 )
  • follow_on_attribution ( 24 )
  • foreign_names ( 1 )
  • gender_confusion ( 2 )
  • initials ( 5 )
  • interesting ( 198 )
  • layout_or_design ( 3 )
  • list ( 14 )
  • lists ( 1 )
  • lookup ( 5 )
  • no_html ( 5 )
  • non_news ( 12 )
  • possessive ( 1 )
  • pronoun_attribution ( 2 )
  • pronouns ( 4 )
  • proper_noun ( 2 )
  • proper_nouns ( 38 )
  • quote_distance ( 12 )
  • said_verb ( 29 )
  • second_hand ( 10 )
  • short_n-gram ( 5 )
  • software_error ( 1 )
  • spanish ( 1 )
  • sports ( 6 )
  • straightforward ( 74 )
  • title_prefix ( 8 )

tags to create:

  • is_ambiguous = ambiguous, ambiguity
  • --> is_attribution_compound = compound_attribution
  • --> is_attribution_follow_on = follow_on_attribution
  • --> is_attribution_pronoun = pronoun_attribution
  • --> is_attribution_second_hand = second_hand
  • is_complex = complex
  • --> is_compound_names = compound_names
  • --> is_contributed_to (and is_subject_shb_author) = contributed_to
  • --> is_dictionary_error = dictionary_error
  • --> is_disambiguation = disambiguation
  • --> is_editing_error = editing_error
  • is_error = error
  • --> is_foreign_names = foreign_names
  • --> is_gender_confusion = gender_confusion
  • --> is_initials_error = initials
  • is_interesting = interesting
  • --> is_layout_or_design = layout_or_design
  • is_list = list, lists
  • --> is_lookup_error = lookup
  • is_no_html = no_html
  • is_not_hard_news = non_news
  • --> is_possessive = possessive
  • --> is_pronouns = pronouns
  • --> is_proper_noun (and is_not_a_person) = proper_noun, proper_nouns
  • --> is_quote_distance = quote_distance
  • --> is_said_verb = said_verb
  • --> is_short_n_gram = short_n-gram
  • --> is_software_error = software_error
  • --> is_spanish = spanish
  • is_sports = sports
  • is_straightforward = straightforward
  • is_title = complex_title, complex_titles, title_prefix
  • is_title_complex = complex_title, complex_titles
  • is_title_prefix = title_prefix

Remember:

  • If an error ("-->"), make sure to set "is_error" as well.
  • And set "is_automated_error" if it was an automated error.

Notes:

  • verified that all fields are in model.
  • admin:

    • make sure that all fields are in admin.
    • reorganizing to make it a little less crazy.
    • make sure all are in limit list.
  • Need to automatically set the flags based on the tag values.

Set up meta-data on fields, tags, and how they relate.

Now we set booleans based on tags: first, see if the tag is mapped to a field; then, if so, look up the field's traits to figure out which booleans to set.
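
Before running the update, we can sanity-check that every tag counted above maps to at least one field, so the "Unknown tag" branch in the cell below never fires. A quick sketch (assumes tag_to_count_map from the tag-counting cell above is still in scope):

# sanity check - every observed tag should appear in TAG_TO_FIELD_NAME_MAP.
unmapped_tag_list = [ current_tag for current_tag in tag_to_count_map.keys() if current_tag not in TAG_TO_FIELD_NAME_MAP ]
print( "unmapped tags: {}".format( unmapped_tag_list ) )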


In [28]:
'''
django-taggit documentation: https://github.com/alex/django-taggit
(see the docstring of the "review tags" cell above for usage examples)
'''

# imports
from context_analysis.models import Reliability_Names_Evaluation

# declare variables
evaluation_qs = None
record_count = -1
record_counter = -1
current_record = None
tag_set = set()
tag_qs = None
tag_list = None
current_tag = ""
cleaned_tag = ""
field_name_list = None
current_field_name = None
current_traits = None
related_field_name_list = None
related_field_name = None
do_save = True

# get all evaluation records with label = "prelim_month" and event_type = "remove_tags".
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )

# first, just make sure that worked.
record_count = evaluation_qs.count()

# Check count of articles retrieved.
print( "Got " + str( record_count ) + " evaluations records." )

# loop over evaluations.
for current_record in evaluation_qs:

    # get tags.
    # current_article.tags.add( tag_value )
    tag_qs = current_record.tags.all()
    
    # output the tags.
    #print( "- Tags for record " + str( current_record.id ) + " : " + str( tag_qs ) )
    
    # loop over tags.
    for current_tag in tag_qs:
        
        # standardize
        cleaned_tag = str( current_tag )

        # to lower case
        cleaned_tag = cleaned_tag.lower()
        
        # strip()
        cleaned_tag = cleaned_tag.strip()
        
        # First, try to retrieve field name for current tag.
        field_name_list = TAG_TO_FIELD_NAME_MAP.get( cleaned_tag, None )
        
        # got a field name?
        if ( field_name_list is not None ):
            
            # loop over items in list.
            for current_field_name in field_name_list:

                # set field to True.
                setattr( current_record, current_field_name, True )

                # retrieve field's traits.
                current_traits = FIELD_NAME_TO_TRAITS_MAP.get( current_field_name, None )

                # get list of related fields
                related_field_name_list = current_traits.get( PROP_ASSOCIATED_FIELDS, None )

                # got anything?
                if ( ( related_field_name_list is not None ) and ( len( related_field_name_list ) > 0 ) ):

                    # yes - set related fields to True, also.
                    for related_field_name in related_field_name_list:

                        # set field to True.
                        setattr( current_record, related_field_name, True )

                    #-- END loop over related field names. --#

                #-- END check to see if any related fields. --#
                
            #-- END loop over field names for this tag. --#

        else:
            
            # Unknown tag!
            print( "!!!! Unknown tag: {}".format( cleaned_tag ) )
            
        #-- END check to see what tag we are processing. --#
        
    #-- END loop over tags --#
    
    # are we saving the results of this grand endeavour?
    if ( do_save == True ):
        
        # do save.
        current_record.save()
        
    #-- END check to see if saving. --#
    
#-- END loop over records --#

# output
print( "Completed at {}".format( str( datetime.datetime.now() ) ) )


Got 428 evaluation records.
Completed at 2018-02-20 20:33:18.034532

In [29]:
# Generate counts for each field.

# declare variables
field_names_view = None
field_names_list = None
current_field_name = None
my_kwargs = None
kwarg_name = None
kwarg_value = None
evaluation_qs = None
filtered_qs = None
filtered_count = -1

# first, get all evaluation instances with label = "prelim_month" and event type "remove_tags".
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )

# get view of keys.
field_names_view = six.viewkeys( FIELD_NAME_TO_TRAITS_MAP )

# convert to sorted list
field_names_list = list( field_names_view )
field_names_list.sort()

# loop over things in FIELD_NAME_TO_TRAITS_MAP.
for current_field_name in field_names_list:
    
    #print( "current field name: {}".format( current_field_name ) )
    
    # filter and count records where the current field is True.
    my_kwargs = {}
    kwarg_name = current_field_name
    kwarg_value = True
    my_kwargs[ kwarg_name ] = kwarg_value
    #print( my_kwargs )
    
    # filter.
    filtered_qs = evaluation_qs.filter( **my_kwargs )
    
    # count
    filtered_count = filtered_qs.count()

    # output
    print( "- field {} count: {}".format( current_field_name, str( filtered_count ) ) )
    
#-- END loop over field names. --#


- field is_ambiguous count: 67
- field is_attribution_compound count: 1
- field is_attribution_follow_on count: 24
- field is_attribution_pronoun count: 2
- field is_attribution_second_hand count: 10
- field is_complex count: 23
- field is_compound_names count: 9
- field is_contributed_to count: 2
- field is_dictionary_error count: 20
- field is_disambiguation count: 4
- field is_editing_error count: 3
- field is_error count: 221
- field is_foreign_names count: 1
- field is_gender_confusion count: 2
- field is_initials_error count: 5
- field is_interesting count: 198
- field is_layout_or_design count: 3
- field is_list count: 15
- field is_lookup_error count: 5
- field is_no_html count: 5
- field is_not_hard_news count: 16
- field is_possessive count: 1
- field is_pronouns count: 4
- field is_proper_noun count: 40
- field is_quote_distance count: 12
- field is_said_verb count: 29
- field is_short_n_gram count: 5
- field is_software_error count: 1
- field is_spanish count: 1
- field is_sports count: 6
- field is_straightforward count: 74
- field is_title count: 33
- field is_title_complex count: 26
- field is_title_prefix count: 8

disagreement reason summary

field name                    field count   tag count
is_ambiguous                  67            12
is_attribution_compound       1             1
is_attribution_follow_on      24            24
is_attribution_pronoun        2             2
is_attribution_second_hand    10            10
is_complex                    23            23
is_compound_names             9             9
is_contributed_to             2             2
is_dictionary_error           20            20
is_disambiguation             4             4
is_editing_error              3             3
is_error                      221           204
is_foreign_names              1             1
is_gender_confusion           2             2
is_initials_error             5             5
is_interesting                198           198
is_layout_or_design           3             3
is_list                       15            15
is_lookup_error               5             5
is_no_html                    5             5
is_not_hard_news              16            12
is_possessive                 1             1
is_pronouns                   4             4
is_proper_noun                40            40
is_quote_distance             12            12
is_said_verb                  29            29
is_short_n_gram               5             5
is_software_error             1             1
is_spanish                    1             1
is_sports                     6             6
is_straightforward            74            74
is_title                      33            34
is_title_complex              26            26
is_title_prefix               8             8
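
The tag-count column above can be derived from the maps and counts already defined, rather than assembled by hand. A sketch (assumes tag_to_count_map from the tag-counting cell is still in scope):

# sketch - for each field, sum the counts of its mapped tags, for comparison with the field counts.
for current_field_name in sorted( FIELD_NAME_TO_TRAITS_MAP.keys() ):

    # sum counts for all tags mapped to this field.
    current_traits = FIELD_NAME_TO_TRAITS_MAP.get( current_field_name )
    tag_count = sum( tag_to_count_map.get( current_tag, 0 ) for current_tag in current_traits.get( PROP_TAG_LIST ) )
    print( "- field {} tag count: {}".format( current_field_name, tag_count ) )

#-- END loop over field names --#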

Ground truth coding fixed

  • For some, the error will be on the part of the human coder. For human error, we create a new "ground_truth" record that we will correct, so we preserve original coding (and evidence of errors) in case we want or need that information later. Below, we have a table of the articles where we had to fix ground truth. To find the original coding, click the Article link.
  • Denoted by records with "is_ground_truth_fixed" set to True in the Reliability_Names_Evaluation table in django: http://research.local/research/admin/context_analysis/reliability_names_evaluation/?is_ground_truth_fixed__exact=1&label=prelim_month&o=-1.7.8.3.5
  • 130 total (130/415 = 31.3% - this is a lot - is this right?)

    • 4 duplicates, so call it 126.
    • based on "prelim_month_human" Reliability_Names tag, 135 disagreements between original and corrected coding. Probably some merging needed here?
    • of those, 4 are same person and article, but different Reliability_Names record, so disagreements that had to be corrected twice because of rebuilding Reliability_Names for the article (either human error, or something else weird). SQL:

        SELECT sarne1.person_name,
            sarne1.id,
            sarne1.status,
            sarne1.original_reliability_names_id,
            sarne1.article_id,
            sarne1.is_duplicate,
            sarne2.is_duplicate,
            sarne2.id,
            sarne2.status,
            sarne2.original_reliability_names_id,
            sarne2.article_id
        FROM context_analysis_reliability_names_evaluation AS sarne1,
            context_analysis_reliability_names_evaluation AS sarne2
        WHERE sarne1.id != sarne2.id
            AND sarne1.label = 'prelim_month'
            AND sarne2.label = 'prelim_month'
            AND sarne1.event_type = 'remove_tags'
            AND sarne2.event_type = 'remove_tags'
            AND sarne1.is_ground_truth_fixed = TRUE
            AND sarne2.is_ground_truth_fixed = TRUE
            AND sarne1.article_id = sarne2.article_id
            AND sarne1.person_name = sarne2.person_name
            AND sarne1.original_reliability_names_id != sarne2.original_reliability_names_id
            AND sarne2.original_reliability_names_id > sarne1.original_reliability_names_id
        ORDER BY sarne1.id ASC;
      
      

      Results (looks like these are ones that had to be merged, so ... to minimize: when there is ambiguity, assume error in creating the data and treat them as duplicates, reducing the count by the number of duplicates):

(columns: the sarne1 record, then the matching sarne2 record; orig_rn_id = original_reliability_names_id)

person_name       id    status   orig_rn_id   article_id   id    status    orig_rn_id   article_id
Jeff Hawkins      33    ERROR    9408         21007        34    ERROR     9414         21007
Fritz Wahlfield   405   ERROR    10330        22415        407   CORRECT   10997        22415
John Agar         610   ERROR    8917         23904        611   ERROR     8918         23904
Rachael Recker    620   ERROR    8968         23920        619   ERROR     8976         23920
  • work:

    • set all with is_ground_truth_fixed = True so that is_to_do = True and work_status = "metadata_review".
    • update all with is_ground_truth_fixed = True so is_human_error = True.

Mark all ground truth updates as TODO


In [ ]:
# get all evaluation records with label = "prelim_month", is_ground_truth_fixed = True, and event_type = "remove_tags".
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )

# count?
eval_count = evaluation_qs.count()
print( "record count: {}".format( str( eval_count ) ) )

# loop, setting "is_to_do" to True on each and saving.
for current_eval in evaluation_qs:
    
    # set is_to_do to True and set work_status to "metadata_review".
    current_eval.is_to_do = True
    current_eval.work_status = "metadata_review"
    current_eval.save()
    
#-- END loop over QuerySet. --#

Mark all ground truth updates as human error


In [ ]:
# get all evaluation records with label = "prelim_month", is_ground_truth_fixed = True, and event_type = "remove_tags".
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )

# count?
eval_count = evaluation_qs.count()
print( "record count: {}".format( str( eval_count ) ) )

# loop, setting "is_human_error" to True on each and saving.
for current_eval in evaluation_qs:
    
    # set is_human_error to True.
    current_eval.is_human_error = True
    current_eval.save()
    
#-- END loop over QuerySet. --#

Count ground truth updates

Task 1: go through all "is_to_do" and update the metadata booleans.


In [ ]:
# get all evaluation records with:
# - label = "prelim_month"
# - is_ground_truth_fixed = True
# - event_type = "remove_tags"
# - is_duplicate = False
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )
# evaluation_qs = evaluation_qs.filter( is_to_do = True )  # 130 originally
evaluation_qs = evaluation_qs.filter( is_duplicate = False )  # now 123!

# count?
eval_count = evaluation_qs.count()
print( "record count: {}".format( str( eval_count ) ) )

# 123

Task 2: update counts of characterizations above. When filtering for counts:

  • filter out duplicates when generating counts (include only where "is_duplicate" = False).
  • filter out skipped when generating counts (include only where "is_skipped" = False - see the sketch after this list).
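
A sketch of the combined base filter, assuming is_skipped is a boolean field on the model as the item above implies:

# sketch - base queryset for counts, excluding duplicates and skipped records.
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )
evaluation_qs = evaluation_qs.filter( is_duplicate = False )
evaluation_qs = evaluation_qs.filter( is_skipped = False )  # assumption: field exists, per the note above.
print( "record count: {}".format( str( evaluation_qs.count() ) ) )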

In [ ]:
# get all evaluation records with:
# - label = "prelim_month"
# - is_ground_truth_fixed = True
# - is_human_error = True
# - event_type = "remove_tags"
# - is_duplicate = False
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.filter( is_human_error = True )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )
# evaluation_qs = evaluation_qs.filter( is_to_do = True )  # 130 originally
evaluation_qs = evaluation_qs.filter( is_duplicate = False )  # now 123!

# count?
eval_count = evaluation_qs.count()
print( "record count: {}".format( str( eval_count ) ) )

# 102

Analysis:

  • 130 total, 123 without duplicates, and 102 if one also removes those where the human coder skipped a person because of a limitation of the coding application (compound names).
  • 102/415 = 24.57831%
  • number of affected articles?
  • characterization of the problems:

    • is_missed_author:
    • is_missed_subject:
    • is_skipped (limitation of coding application):
    • is_author_shb_subject: *
    • is_subject_shb_author: *
    • is_quoted_shb_mentioned: *
    • is_mentioned_shb_quoted: *
    • is_wrong_text_captured: *
    • is_duplicate:
  • note:

    • evaluation 469 is both author and subject.

TODO:

  • work:

    • figure out the number of affected articles (should just be a count of Article_Data by the ground_truth user - see the sketch after this list).
    • update metadata for all disagreements (all "remove_tags" events).
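
A sketch of that affected-article count. Heavy caveats: the module path for Article_Data, its coder field, and the "ground_truth" username are all assumptions here, not verified against the project:

# sketch only - module path, field names, and username are assumptions.
from django.contrib.auth.models import User
from context_text.models import Article_Data  # assumption: Article_Data lives in context_text.

# assumption: corrected coding is stored under a user named "ground_truth".
ground_truth_user = User.objects.get( username = "ground_truth" )

# count distinct articles with Article_Data coded by that user.
article_data_qs = Article_Data.objects.filter( coder = ground_truth_user )
affected_article_count = article_data_qs.values( "article_id" ).distinct().count()
print( "affected articles: {}".format( str( affected_article_count ) ) )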

Reliability_Names records merged

Deleted Reliability_Names records

In summary:

Evaluating Disagreements - Human Error

Human error:

  • Per article (how many have ground truth?)
  • Per decision? How many errors, compared to total number of decisions, and what kind of errors?

human error grouped by article


In [20]:
# declare variables
eval_record = None
current_article = None
article_id = None
article_id_to_count_map = None
article_id_to_instance_map = None
article_count = None
average_count = None

# get all evaluation records with:
# - label = "prelim_month"
# - is_ground_truth_fixed = True
# - is_human_error = True
# - event_type = "remove_tags"
# - is_duplicate = False
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.filter( is_human_error = True )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )
# evaluation_qs = evaluation_qs.filter( is_to_do = True )  # 130 originally
evaluation_qs = evaluation_qs.filter( is_duplicate = False )  # now 123!

# count?
eval_count = evaluation_qs.count()
print( "record count: {}".format( str( eval_count ) ) )

# loop
article_id_to_count_map = {}
article_id_to_instance_map = {}
article_count = 0
for eval_record in evaluation_qs:
    
    # get article ID
    current_article = eval_record.article
    article_id = current_article.id
    
    # save instance
    if article_id not in article_id_to_instance_map:
        
        # add it
        article_id_to_instance_map[ article_id ] = current_article
        
    #-- END check if already saved --#
    
    # update count
    article_count = article_id_to_count_map.get( article_id, 0 )
    article_count += 1
    article_id_to_count_map[ article_id ] = article_count
    
#-- END loop over records --#

# output
for article_id, article_count in six.iteritems( article_id_to_count_map ):
    
    # print results
    print( "- article {}: {}".format( article_id, article_count ) )
    
#-- END loop over articles --#

# now, get and output count of articles and average per article.
article_count = len( article_id_to_count_map )
average_count = eval_count / article_count
print( "" )
print( "Article-level info:" )
print( "- article count: {}".format( article_count ) )
print( "- average per article: {}".format( average_count ) )


record count: 102
- article 23476: 1
- article 23356: 2
- article 23264: 1
- article 23179: 2
- article 23158: 1
- article 23146: 2
- article 23125: 1
- article 23118: 1
- article 23055: 1
- article 22870: 1
- article 22869: 2
- article 22105: 3
- article 22705: 8
- article 22663: 1
- article 22630: 1
- article 22608: 1
- article 22566: 1
- article 22499: 1
- article 22443: 1
- article 22410: 1
- article 22379: 1
- article 22355: 1
- article 22325: 1
- article 22302: 1
- article 22131: 1
- article 22034: 1
- article 21984: 1
- article 21955: 1
- article 21927: 1
- article 21899: 1
- article 21888: 1
- article 21880: 1
- article 21863: 1
- article 21829: 1
- article 21828: 2
- article 21794: 1
- article 21702: 1
- article 21675: 1
- article 21644: 1
- article 21627: 1
- article 21588: 1
- article 21557: 2
- article 21551: 1
- article 21515: 1
- article 21435: 1
- article 21355: 1
- article 21338: 1
- article 21305: 1
- article 21268: 1
- article 21174: 1
- article 21161: 2
- article 21043: 1
- article 21007: 2
- article 21001: 1
- article 20930: 1
- article 20919: 1
- article 20981: 1
- article 20854: 2
- article 20815: 1
- article 20813: 1
- article 23482: 1
- article 24131: 1
- article 24024: 1
- article 23983: 1
- article 23921: 1
- article 23914: 1
- article 23904: 2
- article 23861: 1
- article 23865: 2
- article 23804: 2
- article 23745: 1
- article 23736: 2
- article 23536: 2
- article 24132: 1
- article 24022: 1
- article 23920: 1
- article 23502: 1
- article 22415: 1
- article 21350: 1

Article-level info:
- article count: 79
- average per article: 1.2911392405063291

overall counts within human error


In [18]:
# Generate counts for each field.

# declare variables
field_names_view = None
field_names_list = None
current_field_name = None
my_kwargs = None
kwarg_name = None
kwarg_value = None
evaluation_qs = None
total_count = None
filtered_qs = None
filtered_count = -1

# get human-error evaluations: label = "prelim_month", event type "remove_tags", ground truth fixed, human error, no duplicates.
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.filter( is_human_error = True )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )
evaluation_qs = evaluation_qs.filter( is_duplicate = False )  # now 123!
total_count = evaluation_qs.count()

# get view of keys.
field_names_view = six.viewkeys( FIELD_NAME_TO_TRAITS_MAP )

# convert to sorted list
field_names_list = list( field_names_view )
field_names_list.sort()

# loop over things in FIELD_NAME_TO_TRAITS_MAP.
field_string = ""
zero_list = []
non_zero_list = []
for current_field_name in field_names_list:
    
    #print( "current field name: {}".format( current_field_name ) )
    
    # filter and count records where the current field is True.
    my_kwargs = {}
    kwarg_name = current_field_name
    kwarg_value = True
    my_kwargs[ kwarg_name ] = kwarg_value
    #print( my_kwargs )
    
    # filter.
    filtered_qs = evaluation_qs.filter( **my_kwargs )
    
    # count
    filtered_count = filtered_qs.count()
    
    # how many?
    field_string = "- field {} count: {}".format( current_field_name, str( filtered_count ) )
    if ( filtered_count > 0 ):
        
        # add to non-zero list
        non_zero_list.append( field_string )
        
    else:
        
        # add to zero list
        zero_list.append( field_string )
        
    #-- END count check --#

#-- END loop over field names. --#

print( "Total records: {}\n".format( total_count ) )
print( "tags found:" )
print( "\n".join( non_zero_list ) )
print( "\ntags not found:" )
print( "\n".join( zero_list ) )


Total records: 102

tags found:
- field is_ambiguous count: 5
- field is_attribution_second_hand count: 1
- field is_complex count: 2
- field is_error count: 5
- field is_initials_error count: 2
- field is_interesting count: 1
- field is_list count: 5
- field is_not_hard_news count: 6
- field is_proper_noun count: 1
- field is_sports count: 3

tags not found:
- field is_attribution_compound count: 0
- field is_attribution_follow_on count: 0
- field is_attribution_pronoun count: 0
- field is_compound_names count: 0
- field is_contributed_to count: 0
- field is_dictionary_error count: 0
- field is_disambiguation count: 0
- field is_editing_error count: 0
- field is_foreign_names count: 0
- field is_gender_confusion count: 0
- field is_layout_or_design count: 0
- field is_lookup_error count: 0
- field is_no_html count: 0
- field is_possessive count: 0
- field is_pronouns count: 0
- field is_quote_distance count: 0
- field is_said_verb count: 0
- field is_short_n_gram count: 0
- field is_software_error count: 0
- field is_spanish count: 0
- field is_straightforward count: 0
- field is_title count: 0
- field is_title_complex count: 0
- field is_title_prefix count: 0

Disagreements - Computer Error

Computer error - look over classes of error for trends (systemic error) and interesting cases.

overall counts within computer error


In [15]:
# Generate counts for each field.

# declare variables
field_names_view = None
field_names_list = None
current_field_name = None
my_kwargs = None
kwarg_name = None
kwarg_value = None
evaluation_qs = None
total_count = None
filtered_qs = None
filtered_count = -1

# get computer-error evaluations: label = "prelim_month", event type "remove_tags", excluding fixed ground truth and human error.
evaluation_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
evaluation_qs = evaluation_qs.filter( event_type = "remove_tags" )
evaluation_qs = evaluation_qs.exclude( is_ground_truth_fixed = True )
evaluation_qs = evaluation_qs.exclude( is_human_error = True )
total_count = evaluation_qs.count()

# get view of keys.
field_names_view = six.viewkeys( FIELD_NAME_TO_TRAITS_MAP )

# convert to sorted list
field_names_list = list( field_names_view )
field_names_list.sort()

# loop over things in FIELD_NAME_TO_TRAITS_MAP.
field_string = ""
zero_list = []
non_zero_list = []
for current_field_name in field_names_list:
    
    #print( "current field name: {}".format( current_field_name ) )
    
    # filter and count records where the current field is True.
    my_kwargs = {}
    kwarg_name = current_field_name
    kwarg_value = True
    my_kwargs[ kwarg_name ] = kwarg_value
    #print( my_kwargs )
    
    # filter.
    filtered_qs = evaluation_qs.filter( **my_kwargs )
    
    # count
    filtered_count = filtered_qs.count()
    
    # how many?
    field_string = "- field {} count: {}".format( current_field_name, str( filtered_count ) )
    if ( filtered_count > 0 ):
        
        # add to non-zero list
        non_zero_list.append( field_string )
        
    else:
        
        # add to zero list
        zero_list.append( field_string )
        
    #-- END count check --#

#-- END loop over field names. --#

print( "Total records: {}\n".format( total_count ) )
print( "tags found:" )
print( "\n".join( non_zero_list ) )
print( "\ntags not found:" )
print( "\n".join( zero_list ) )


Total records: 301

tags found:
- field is_ambiguous count: 62
- field is_attribution_compound count: 1
- field is_attribution_follow_on count: 24
- field is_attribution_pronoun count: 2
- field is_attribution_second_hand count: 9
- field is_complex count: 21
- field is_compound_names count: 1
- field is_contributed_to count: 2
- field is_dictionary_error count: 20
- field is_disambiguation count: 4
- field is_editing_error count: 3
- field is_error count: 208
- field is_foreign_names count: 1
- field is_gender_confusion count: 2
- field is_initials_error count: 3
- field is_interesting count: 197
- field is_layout_or_design count: 3
- field is_list count: 10
- field is_lookup_error count: 5
- field is_no_html count: 5
- field is_not_hard_news count: 10
- field is_possessive count: 1
- field is_pronouns count: 4
- field is_proper_noun count: 39
- field is_quote_distance count: 12
- field is_said_verb count: 29
- field is_short_n_gram count: 5
- field is_software_error count: 1
- field is_spanish count: 1
- field is_sports count: 3
- field is_straightforward count: 74
- field is_title count: 33
- field is_title_complex count: 26
- field is_title_prefix count: 8

tags not found:
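
(The list is empty here: every field has at least one record among the computer-error evaluations.)

To pull together the human-error-to-machine-error ratio promised at the top, a sketch reusing the exact filters from the human-error and computer-error cells above:

# sketch - human vs. computer error counts, using the same filters as the two cells above.
base_qs = Reliability_Names_Evaluation.objects.filter( label = "prelim_month" )
base_qs = base_qs.filter( event_type = "remove_tags" )

# human error: ground truth fixed, human error, not a duplicate (as in the human-error cell).
human_error_count = base_qs.filter( is_ground_truth_fixed = True, is_human_error = True, is_duplicate = False ).count()

# computer error: everything not flagged as fixed ground truth or human error (as in the cell above).
computer_error_count = base_qs.exclude( is_ground_truth_fixed = True ).exclude( is_human_error = True ).count()

print( "human error: {}".format( str( human_error_count ) ) )
print( "computer error: {}".format( str( computer_error_count ) ) )
print( "ratio ( human / computer ): {}".format( str( human_error_count / computer_error_count ) ) )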