methods paper planning
In [1]:
import datetime
import six
print( "packages imported at " + str( datetime.datetime.now() ) )
If you are using a virtualenv, make sure it is available inside this notebook. Since I use a virtualenv, I need to get it activated somehow. One option is to run ../dev/wsgi.py in this notebook, to configure the Python environment manually as if you had activated the sourcenet virtualenv. To do this, you'd make a code cell that contains:
%run ../dev/wsgi.py
This is sketchy, however, because of the changes it makes to your Python environment within the context of whatever your current kernel is. I'd worry about collisions with the actual Python 3 kernel. Better: install your virtualenv as a separate kernel. Steps:
activate your virtualenv:
workon sourcenet
in your virtualenv, install the package ipykernel:
pip install ipykernel
use the ipykernel python program to install the current environment as a kernel (for this project, the env_name is sourcenet):
python -m ipykernel install --user --name <env_name> --display-name "<display_name>"
example:
python -m ipykernel install --user --name sourcenet --display-name "sourcenet (Python 3)"
More details: http://ipython.readthedocs.io/en/stable/install/kernel_install.html
In [2]:
%pwd
Out[2]:
First, initialize my dev django project, so I can run code in this notebook that references my django models and can talk to the database using my project's settings.
In [3]:
%run django_init.py
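For reference, a django_init.py for this kind of setup is usually just a couple of lines; this is a hypothetical sketch (the settings module name "research.settings" is assumed here, not taken from the actual file):

# Hypothetical sketch of a django_init.py - the actual file may differ.
import os
import django

# point Django at the project's settings module (module name assumed), then initialize.
os.environ.setdefault( "DJANGO_SETTINGS_MODULE", "research.settings" )
django.setup()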
In [4]:
# django imports
from context_analysis.models import Reliability_Names_Evaluation
In the methods folder, here is the order the code was run for analysis:
- data_creation (results in data folder)
- reliability - for human reliability.
- evaluate_disagreements - correct human coding when they are wrong in a disagreement, to create "ground truth".
- precision_recall
- reliability - for comparing human and computer coding to baseline.
- network_analysis
- results
Below are the criteria used for each paper to filter down to just locally-implemented hard news articles.
For actual code, see ./data_creation/data_creation-filter_locally_implemented_hard_news.ipynb
Definition of local hard news and in-house implementor:
Grand Rapids Press
context_text/examples/articles/articles-GRP-local_news.py
local hard news sections (stored in Article.GRP_NEWS_SECTION_NAME_LIST):
excluding any publications with index term of "Column".
in-house implementor (based on byline patterns, stored in sourcenet.models.Article.Q_GRP_IN_HOUSE_AUTHOR):
Byline ends in "/ THE GRAND RAPIDS PRESS", ignore case.
Q( author_varchar__iregex = r'.* */ *THE GRAND RAPIDS PRESS$' )
Byline ends in "/ PRESS * EDITOR", ignore case.
Q( author_varchar__iregex = r'.* */ *PRESS .* EDITOR$' )
Byline ends in "/ GRAND RAPIDS PRESS * BUREAU", ignore case.
Q( author_varchar__iregex = r'.* */ *GRAND RAPIDS PRESS .* BUREAU$' )
Byline ends in "/ SPECIAL TO THE PRESS", ignore case.
Q( author_varchar__iregex = r'.* */ *SPECIAL TO THE PRESS$' )
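As a sketch of how these byline patterns might be combined in practice (the canonical logic lives in context_text/examples/articles/articles-GRP-local_news.py and Article.Q_GRP_IN_HOUSE_AUTHOR; the "Column" exclusion is omitted here, and the section field name is an assumption):

# Hypothetical sketch - not the canonical filter code.
# Assumes Article has "section" and "author_varchar" fields, and that the
# byline Q() patterns above are simply OR'ed together.
from django.db.models import Q
from context_text.models import Article

grp_in_house_q = Q( author_varchar__iregex = r'.* */ *THE GRAND RAPIDS PRESS$' )
grp_in_house_q = grp_in_house_q | Q( author_varchar__iregex = r'.* */ *PRESS .* EDITOR$' )
grp_in_house_q = grp_in_house_q | Q( author_varchar__iregex = r'.* */ *GRAND RAPIDS PRESS .* BUREAU$' )
grp_in_house_q = grp_in_house_q | Q( author_varchar__iregex = r'.* */ *SPECIAL TO THE PRESS$' )

# local hard news sections, in-house bylines only
grp_local_hard_news_qs = Article.objects.filter( section__in = Article.GRP_NEWS_SECTION_NAME_LIST )
grp_local_hard_news_qs = grp_local_hard_news_qs.filter( grp_in_house_q )
print( "GRP local hard news, in-house count = {}".format( grp_local_hard_news_qs.count() ) )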
In [5]:
from context_text.models import Article
In [ ]:
# how many articles in "grp_month"?
article_qs = Article.objects.filter( tags__name__in = [ "grp_month" ] )
grp_month_count = article_qs.count()
print( "grp_month count = {}".format( grp_month_count ) )
Definition of local hard news and in-house implementor:
Detroit News
context_text/examples/articles/articles-TDN-local_news.py
local hard news sections (stored in DTNB.NEWS_SECTION_NAME_LIST; DTNB is imported via from context_text.collectors.newsbank.newspapers.DTNB import DTNB):
in-house implementor (based on byline patterns, stored in DTNB.Q_IN_HOUSE_AUTHOR):
Byline ends in "/ The Detroit News", ignore case.
Q( author_varchar__iregex = r'.*\s*/\s*the\s*detroit\s*news$' )
Byline ends in "Special to The Detroit News", ignore case.
Q( author_varchar__iregex = r'.*\s*/\s*special\s*to\s*the\s*detroit\s*news$' )
Byline ends in "Detroit News * Bureau", ignore case.
Q( author_varchar__iregex = r'.*\s*/\s*detroit\s*news\s*.*\s*bureau$' )
TODO:
Outline in voodoopad - "Dropbox/academia/MSU/program_stuff/voodoopad/phd.vpdoc", note "Prelim - Notes".
Coders trained on 7 samples of 10 articles each. For each training set, the coders coded the articles, I reviewed their coding and updated the protocol, and then we reviewed problems and protocol changes together. After 7 sets, we ran the formal reliability test.
Article traits:
Sample size:
Equation: $$n = \frac {(N-1)(SE)^2 + PQN}{(N-1)(SE)^2 + PQ}$$
WHERE: n = minimum sample size; N = population size; SE = standard error (the confidence interval divided by the z-score, here 0.05 / 1.64); P = assumed level of agreement in the population (here 0.95); Q = 1 - P.
from:
Articles in the reliability test sample:
number of people detected?
In [3]:
# ==> prelim_month
# init variables
n = 441
p = 0.95
ci = 0.05
z = 1.64
se = ci / z
q = 1 - p
sample_size = None
# calculate sample_size
n_minus_1 = n - 1
se_squared = se ** 2
p_times_q = p * q
n_minus_1_times_se_squared = n_minus_1 * se_squared
numerator = n_minus_1_times_se_squared + ( p_times_q * n )
denominator = n_minus_1_times_se_squared + p_times_q
sample_size = numerator / denominator
print( "prelim_month reliability minimum sample size: {}".format( sample_size ) )
for "grp_month
":
In [8]:
# ==> grp_month
# init variables
n = 1000000000
p = 0.95
ci = 0.05
z = 1.64
se = ci / z
q = 1 - p
sample_size = None
# calculate sample_size
n_minus_1 = n - 1
se_squared = se ** 2
p_times_q = p * q
n_minus_1_times_se_squared = n_minus_1 * se_squared
numerator = n_minus_1_times_se_squared + ( p_times_q * n )
denominator = n_minus_1_times_se_squared + p_times_q
sample_size = numerator / denominator
print( "prelim_month reliability minimum sample size: {}".format( sample_size ) )
In [2]:
# ==> original sample
# init variables
n = 461
p = 0.95
ci = 0.05
z = 1.64
se = ci / z
q = 1 - p
sample_size = None
# calculate sample_size
n_minus_1 = n - 1
se_squared = se ** 2
p_times_q = p * q
n_minus_1_times_se_squared = n_minus_1 * se_squared
numerator = n_minus_1_times_se_squared + ( p_times_q * n )
denominator = n_minus_1_times_se_squared + p_times_q
sample_size = numerator / denominator
print( "original sample reliability minimum sample size: {}".format( sample_size ) )
for original sample:
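The three cells above repeat the same arithmetic, so a hypothetical helper to consolidate them might look like this (the grp_month population of 1,000,000,000 is just the effectively-unbounded case used above):

# Hypothetical helper consolidating the repeated sample-size cells above.
# Implements n = [ (N-1)(SE)^2 + PQN ] / [ (N-1)(SE)^2 + PQ ], with SE = ci / z.
def calculate_minimum_sample_size( population_size, p = 0.95, ci = 0.05, z = 1.64 ):
    se = ci / z
    q = 1 - p
    n_minus_1_times_se_squared = ( population_size - 1 ) * ( se ** 2 )
    p_times_q = p * q
    numerator = n_minus_1_times_se_squared + ( p_times_q * population_size )
    denominator = n_minus_1_times_se_squared + p_times_q
    return numerator / denominator

# reproduce the three cells above
for label, population in [ ( "prelim_month", 441 ), ( "grp_month", 1000000000 ), ( "original sample", 461 ) ]:
    print( "{} reliability minimum sample size: {}".format( label, calculate_minimum_sample_size( population ) ) )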
A little exploration to see what the tags below contain.
In [6]:
from context_text.models import Article
In [7]:
# how many articles in "prelim_reliability_test"?
article_qs = Article.objects.filter( tags__name__in = [ "prelim_reliability_test" ] )
reliability_sample_count = article_qs.count()
print( "prelim_reliability_test count = {}".format( reliability_sample_count ) )
In [8]:
# how many articles in "prelim_reliability_combined"?
article_qs = Article.objects.filter( tags__name__in = [ "prelim_reliability_combined" ] )
reliability_sample_count = article_qs.count()
print( "prelim_reliability_combined count = {}".format( reliability_sample_count ) )
So:
- prelim_reliability_test is just Grand Rapids Press, not what I'm reporting for the paper.
- prelim_reliability_combined is GRP plus The Detroit News, and is what I'm reporting in the paper.

The original code to generate data is in context_analysis/examples/reliability/reliability-build_name_data.py. It was used to create all the Reliability_Names data for the formal reliability test, including the initial run that only contained GRP articles, and some runs that included the automated coding alongside (no need).
The main label for reliability is prelim_reliability_combined_human, which would not have included the index 4 with the automated coder:
In [ ]:
from __future__ import unicode_literals
# django imports
from django.contrib.auth.models import User
# sourcenet imports
from context_text.shared.context_text_base import ContextTextBase
# context_analysis imports
from context_analysis.reliability.reliability_names_builder import ReliabilityNamesBuilder
# declare variables
my_reliability_instance = None
tag_list = None
label = ""
# declare variables - user setup
current_coder = None
current_coder_id = -1
current_index = -1
# declare variables - Article_Data filtering.
coder_type = ""
# make reliability instance
my_reliability_instance = ReliabilityNamesBuilder()
#===============================================================================
# configure
#===============================================================================
# list of tags of articles we want to process.
tag_list = [ "prelim_reliability_combined", ]
# label to associate with results, for subsequent lookup.
label = "prelim_reliability_combined_human"
# ! ====> map coder user IDs to indices within the reliability names table.
# set it up so that...
# ...coder ID 8 is index 1...
current_coder_id = 8
current_index = 1
my_reliability_instance.add_coder_at_index( current_coder_id, current_index )
# ...coder ID 9 is index 2...
current_coder_id = 9
current_index = 2
my_reliability_instance.add_coder_at_index( current_coder_id, current_index )
# ...coder ID 10 is index 3...
current_coder_id = 10
current_index = 3
my_reliability_instance.add_coder_at_index( current_coder_id, current_index )
# output debug JSON to file
#my_reliability_instance.debug_output_json_file_path = "/home/jonathanmorgan/" + label + ".json"
#===============================================================================
# process
#===============================================================================
# process articles
my_reliability_instance.process_articles( tag_list )
# output to database.
my_reliability_instance.output_reliability_data( label )
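A quick, hypothetical way to confirm the output landed (assuming Reliability_Names lives in context_analysis.models and has a label field, as implied by output_reliability_data( label ) above):

# Hypothetical follow-up check - model location and field name are assumptions.
from context_analysis.models import Reliability_Names

row_count = Reliability_Names.objects.filter( label = "prelim_reliability_combined_human" ).count()
print( "Reliability_Names rows for label \"prelim_reliability_combined_human\" = {}".format( row_count ) )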
Path to Dropbox folder that holds PDF and Excel file output of reliability numbers:
To view results: https://research.local/research/context/analysis/reliability/names/results/view
The human-only results (the ones I will write about) are results with labels:
- "prelim_reliability_combined_human_final" - this is the latest code, regenerated recently. It is identical to the results from the old code (numbers from 2016.08.27):
  - Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/reliability/2016-data/2016.08.27-reliability-prelim_reliability_combined_human.pdf
  - Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/reliability/2016-data/2016.08.27-reliability-prelim_reliability_combined_human.xlsx
- "prelim_reliability_combined_human"

The results for "prelim_reliability_combined_human", shown below, are not identical - but the results stored in Dropbox (see above) are identical to those for "prelim_reliability_combined_human_final". Very strange. Since they match the original numbers, and since they are lower, I'll just use "prelim_reliability_combined_human_final".
This is the code invoked by the page https://research.local/research/context/analysis/reliability/names/results/view
In [ ]:
# start to support python 3:
from __future__ import unicode_literals
from __future__ import division
#==============================================================================#
# ! imports
#==============================================================================#
# grouped by functional area, then alphabetical order by package, then
# alphabetical order by name of thing being imported.
# context_analysis imports
from context_analysis.reliability.reliability_names_analyzer import ReliabilityNamesAnalyzer
#==============================================================================#
# ! logic
#==============================================================================#
# declare variables
my_analysis_instance = None
label = ""
indices_to_process = -1
result_status = ""
# make reliability instance
my_analysis_instance = ReliabilityNamesAnalyzer()
# database connection information - 2 options... Enter it here:
#my_analysis_instance.db_username = ""
#my_analysis_instance.db_password = ""
#my_analysis_instance.db_host = "localhost"
#my_analysis_instance.db_name = "sourcenet"
# Or set up the following properties in Django_Config, inside the django admins.
# All have application of: "sourcenet-db-admin":
# - db_username
# - db_password
# - db_host
# - db_port
# - db_name
# run the analyze method, see what happens.
#label = "prelim_reliability_test"
#indices_to_process = 3
#label = "prelim_reliability_combined_human"
#indices_to_process = 3
#label = "name_data_test_combined_human"
#indices_to_process = 3
#label = "prelim_reliability_combined_human_final"
#indices_to_process = 3
#label = "prelim_reliability_combined_all"
#indices_to_process = 4
#label = "prelim_reliability_combined_all_final"
#indices_to_process = 4
#label = "prelim_reliability_test_human"
#indices_to_process = 3
#label = "prelim_reliability_test_all"
#indices_to_process = 4
label = "prelim_month"
indices_to_process = 2
result_status = my_analysis_instance.analyze_reliability_names( label, indices_to_process )
Notebooks with the original, pre-web-page code to calculate reliability:
Go to: https://research.local/research/context/analysis/reliability/names/results/view
Label: prelim_reliability_combined_human_final
Results:
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_reliability_combined_human_final | 10 | 1 | 2 | 98 | 0.9795918367 | -0.0051546392 | 0.9727891156 | 0.9795918367 | 0.9791722296 | 1.0000000000 | 1.0000000000 | 96 | 0.9795918367 | -0.0051546392 | 0.9782312925 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 96 | ||
prelim_reliability_combined_human_final | 11 | 1 | 3 | 10 | 98 | 0.9897959184 | 0.0000000000 | 0.9863945578 | 0.9897959184 | 0.9895861148 | 1.0000000000 | 1.0000000000 | 97 | 0.9897959184 | 0.0000000000 | 0.9891156463 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 97 | |
prelim_reliability_combined_human_final | 12 | 2 | 3 | 10 | 98 | 0.9897959184 | 0.0000000000 | 0.9863945578 | 0.9897959184 | 0.9895816637 | 1.0000000000 | 1.0000000000 | 97 | 0.9897959184 | 0.0000000000 | 0.9891156463 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 97 | |
Averages: | 98 | 0.9863945578333333333333333333 | -0.001718213066666666666666666667 | 0.9818594104 | 0.9863945578333333333333333333 | 0.9861133360333333333333333333 | 1.0000000000 | 1.0000000000 | 96.66666666666666666666666667 | 0.9863945578333333333333333333 | -0.001718213066666666666666666667 | 0.9854875283666666666666666667 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 96.66666666666666666666666667 |
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N | 1st graf % | 1st graf A | 1st index % | 1st index A | org hash % | org hash A |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_reliability_combined_human_final | 10 | 1 | 2 | 399 | 0.9122807018 | 0.1407669798 | 0.8830409357 | 0.9122807018 | 0.9118944818 | 1.0000000000 | 1.0000000000 | 360 | 0.8922305764 | 0.7955934892 | 0.8850459482 | 0.9777777778 | 0.9523699116 | 0.9750000000 | 360 | 0.5363408521 | 0.9573273382 | 0.5087719298 | 0.9101064582 | 0.5839598997 | 0.5626481371 | ||
prelim_reliability_combined_human_final | 11 | 1 | 3 | 10 | 399 | 0.8972431078 | 0.2505447123 | 0.8629908104 | 0.8972431078 | 0.8965318523 | 1.0000000000 | 1.0000000000 | 349 | 0.8746867168 | 0.7694145966 | 0.8663324979 | 0.9742120344 | 0.9446517907 | 0.9709885387 | 349 | 0.4962406015 | 0.9117737368 | 0.4736842105 | 0.8747093023 | 0.5664160401 | 0.5380809123 | |
prelim_reliability_combined_human_final | 12 | 2 | 3 | 10 | 399 | 0.9147869674 | 0.1062664908 | 0.8863826232 | 0.9147869674 | 0.9144471807 | 1.0000000000 | 1.0000000000 | 362 | 0.8972431078 | 0.8055310893 | 0.8903926483 | 0.9806629834 | 0.9591258208 | 0.9782458564 | 362 | 0.5037593985 | 0.9086158161 | 0.4812030075 | 0.8724327241 | 0.5664160401 | 0.5514299936 | |
Averages: | 399 | 0.9081035923333333333333333333 | 0.1658593943 | 0.8774714564333333333333333333 | 0.9081035923333333333333333333 | 0.9076245049333333333333333333 | 1.0000000000 | 1.0000000000 | 357 | 0.8880534670 | 0.7901797250333333333333333333 | 0.8805903648 | 0.9775509318666666666666666667 | 0.9520491743666666666666666667 | 0.9747447983666666666666666667 | 357 | 0.5121136173666666666666666667 | 0.9259056303666666666666666667 | 0.4878863826 | 0.8857494948666666666666666667 | 0.5722639933 | 0.5507196810 |
This is not the latest code, so I am not reporting it, but I'm including it here for reference.
Go to: https://research.local/research/context/analysis/reliability/names/results/view
Label: prelim_reliability_combined_human
Results:
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_reliability_combined_human | 37 | 1 | 2 | 9 | 98 | 0.9795918367 | -0.0051546392 | 0.9727891156 | 0.9795918367 | 0.9791722296 | 1.0000000000 | 1.0000000000 | 96 | 0.9795918367 | -0.0051546392 | 0.9782312925 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 96 | |
prelim_reliability_combined_human | 38 | 1 | 3 | 98 | 0.9897959184 | 0.0000000000 | 0.9863945578 | 0.9897959184 | 0.9895861148 | 1.0000000000 | 1.0000000000 | 97 | 0.9897959184 | 0.0000000000 | 0.9891156463 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 97 | ||
prelim_reliability_combined_human | 39 | 2 | 3 | 9 | 98 | 0.9897959184 | 0.0000000000 | 0.9863945578 | 0.9897959184 | 0.9895816637 | 1.0000000000 | 1.0000000000 | 97 | 0.9897959184 | 0.0000000000 | 0.9891156463 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 97 | |
Averages: | 98 | 0.9863945578333333333333333333 | -0.001718213066666666666666666667 | 0.9818594104 | 0.9863945578333333333333333333 | 0.9861133360333333333333333333 | 1.0000000000 | 1.0000000000 | 96.66666666666666666666666667 | 0.9863945578333333333333333333 | -0.001718213066666666666666666667 | 0.9854875283666666666666666667 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 96.66666666666666666666666667 |
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N | 1st graf % | 1st graf A | 1st index % | 1st index A | org hash % | org hash A |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_reliability_combined_human | 37 | 1 | 2 | 9 | 398 | 0.9170854271 | 0.1524794056 | 0.8894472362 | 0.9145728643 | 0.9142174364 | 0.9972299169 | 0.9972247563 | 361 | 0.8969849246 | 0.8038230285 | 0.8901172529 | 0.9778393352 | 0.9524469067 | 0.9750692521 | 361 | 0.5402010050 | 0.9575247587 | 0.5125628141 | 0.9105087189 | 0.5854271357 | 0.5645628699 | |
prelim_reliability_combined_human | 38 | 1 | 3 | 398 | 0.9020100503 | 0.2639413147 | 0.8693467337 | 0.8994974874 | 0.8988353338 | 0.9971428571 | 0.9971374163 | 350 | 0.8793969849 | 0.7772888300 | 0.8713567839 | 0.9742857143 | 0.9447435683 | 0.9710714286 | 350 | 0.5000000000 | 0.9121951220 | 0.4773869347 | 0.8752880184 | 0.5703517588 | 0.5427100012 | ||
prelim_reliability_combined_human | 39 | 2 | 3 | 9 | 398 | 0.9145728643 | 0.0615886682 | 0.8860971524 | 0.9145728643 | 0.9142514529 | 1.0000000000 | 1.0000000000 | 362 | 0.8969849246 | 0.8042530448 | 0.8901172529 | 0.9806629834 | 0.9591258208 | 0.9782458564 | 362 | 0.5050251256 | 0.9086158161 | 0.4824120603 | 0.8724327241 | 0.5653266332 | 0.5506229232 | |
Averages: | 398 | 0.9112227805666666666666666667 | 0.1593364628333333333333333333 | 0.8816303741 | 0.9095477386666666666666666667 | 0.9091014077 | 0.9981242580 | 0.9981207242 | 357.6666666666666666666666667 | 0.8911222780333333333333333333 | 0.7951216344333333333333333333 | 0.8838637632333333333333333333 | 0.9775960109666666666666666667 | 0.9521054319333333333333333333 | 0.9747955123666666666666666667 | 357.6666666666666666666666667 | 0.5150753768666666666666666667 | 0.9261118989333333333333333333 | 0.4907872697 | 0.8860764871333333333333333333 | 0.5737018425666666666666666667 | 0.5526319314333333333333333333 |
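For reference, here is a minimal, generic sketch of the two-coder agreement quantities reported in these tables - raw percent agreement and a chance-corrected coefficient - assuming the "pi" columns are Scott's pi over a nominal variable. This is not the ReliabilityNamesAnalyzer implementation, just an illustration of the arithmetic.

# Generic sketch of percent agreement and Scott's pi for two coders (illustration only).
from collections import Counter

def percent_agreement( coder1_values, coder2_values ):
    matches = sum( 1 for v1, v2 in zip( coder1_values, coder2_values ) if v1 == v2 )
    return matches / len( coder1_values )

def scotts_pi( coder1_values, coder2_values ):
    observed = percent_agreement( coder1_values, coder2_values )
    # expected agreement from pooled category proportions
    pooled_counts = Counter( coder1_values ) + Counter( coder2_values )
    total_count = len( coder1_values ) + len( coder2_values )
    expected = sum( ( category_count / total_count ) ** 2 for category_count in pooled_counts.values() )
    return ( observed - expected ) / ( 1 - expected )

# toy example: two coders' detect decisions (1 = detected, 0 = not detected)
coder_1 = [ 1, 1, 1, 0, 1, 1 ]
coder_2 = [ 1, 1, 1, 1, 1, 0 ]
print( "percent agreement = {}".format( percent_agreement( coder_1, coder_2 ) ) )
print( "Scott's pi = {}".format( scotts_pi( coder_1, coder_2 ) ) )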
In preparation for using the human coding as a standard against which data created with the automated tool is assessed, I performed a couple of cleaning steps:
In the CA protocol, we ignored people who were referred to only by a single name part, to avoid ambiguity when assigning a last name. So I removed all instances where a person was only ever referenced using a single-part name (mostly just a first name), to remove that potential source of ambiguity.
Example: "Joe Smith's wife Sandy" - we could assume her name is Sandy Smith, but it could be something else. For this study, I removed that potential ambiguity by discarding instances where a given person's full name is never used.
Exceptions:
Types of single-named entities:
Of 143 single names removed from analysis data, 15 instances were out-and-out errors (89.5% correct):
Only noted one instance where the single-name person was quoted ("Linda") - Article 23223 | Article_Data 3212 | 12096 (AS) - Linda ( id = 2911; capture_method = OpenCalais_REST_API_v2 ) (quoted; individual) ==> name: Linda |
Assessment - OpenCalais is actually quite good at identifying single name-part references to people, for the most part. It even sometimes tacked on a last name based on the context in the article. But, most of the time it did not. Appears to be built to know of this potential, but tuned to only take action when it is certain. Not built to assume name relationships implied by things like "survived by" or "Smith's children X, Y, and Z". This is something that could be leveraged in a post-processing step if single names were left in.
Errors:
Article 21116
Article 22765
Article 23055
Article 23491
Article 23559
Article 23631
Article 23631
Article 23921
Article 23974
Article 21080
Moved to 2018.02.09-prelim-disagreement_analysis.ipynb.
Details:
- 2018.02.09-prelim-disagreement_analysis.ipynb --> Deleted Reliability_Names records
- 2018.02.09-prelim-disagreement_analysis.ipynb --> disagreement reason summary
- 2018.02.09-prelim-disagreement_analysis.ipynb --> review tags

prelim_month vs. prelim_month_human:
- prelim_month - Reliability_Names data with label prelim_month, where coder 1 is "ground truth" (corrected human coding) and coder 2 is data created by OpenCalais.
- prelim_month_human - Reliability_Names data with label prelim_month_human, where coder 1 is "ground truth" (corrected human coding) and coder 2 is uncorrected human coding (for comparison).

prelim_month
Calculate precision and recall for automated versus baseline - set it up so that coder 1 is the human coding with the ground_truth user having precedence, and coder 2 is the automated coding output.
Jupyter notebooks:
Create Reliability_Names data where coder 1 is ground truth, and coder 2 is automated coder: https://research.local:8000/user/jonathanmorgan/notebooks/work/django/research/work/phd_work/methods/data_creation/prelim_month-create_Reliability_Names_data.ipynb
Calculate confusion matrices and precision/recall/F1: https://research.local:8000/user/jonathanmorgan/notebooks/work/django/research/work/phd_work/methods/precision_recall/prelim_month-confusion_matrix.ipynb
results are in Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/precision_and_recall/.
score | detect | type "author" | type "subject" | type "source" |
---|---|---|---|---|
TP | 2315 | 454 | 631 | 1080 |
TN | 0 | 1990 | 1580 | 1172 |
FP | 68 | 0 | 152 | 66 |
FN | 63 | 2 | 83 | 128 |
precision | 0.97146 | 1 | 0.80587 | 0.94241 |
recall | 0.97351 | 0.99561 | 0.88375 | 0.89404 |
F1 | 0.97248 | 0.9978 | 0.84302 | 0.91759 |
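For reference, the precision/recall/F1 rows are just the standard confusion-matrix arithmetic; a minimal sketch using the "detect" column above (counts copied from the table):

# Sketch of the precision / recall / F1 arithmetic behind the table above.
def precision_recall_f1( tp, fp, fn ):
    precision = tp / ( tp + fp )
    recall = tp / ( tp + fn )
    f1 = 2 * ( precision * recall ) / ( precision + recall )
    return precision, recall, f1

detect_precision, detect_recall, detect_f1 = precision_recall_f1( tp = 2315, fp = 68, fn = 63 )
print( "detect: precision = {:.5f}, recall = {:.5f}, F1 = {:.5f}".format( detect_precision, detect_recall, detect_f1 ) )
# detect: precision = 0.97146, recall = 0.97351, F1 = 0.97248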
prelim_month_human
Calculate precision and recall for humans versus ground truth - set it up so that coder 1 is as it was for the computer comparison (ground_truth having precedence), then set coder 2 up the same way, but without ground_truth...
Jupyter notebooks:
results are in Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/precision_and_recall/.
score | detect | type "author" | type "subject" | type "source" |
---|---|---|---|---|
TP | 2309 | 453 | 646 | 1188 |
TN | 0 | 1962 | 1669 | 1189 |
FP | 19 | 1 | 24 | 16 |
FN | 93 | 5 | 82 | 28 |
precision | 0.99184 | 0.9978 | 0.96418 | 0.98671 |
recall | 0.96128 | 0.98908 | 0.88736 | 0.97697 |
F1 | 0.97632 | 0.99342 | 0.92418 | 0.98182 |
prelim_month
Run the reliability calculations for prelim_month just to get the lookup assessment (since lookup is not classification, precision and recall make no sense for it).
results:
prelim_month - Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/reliability/2016-data/prelim_month-reliability_results.pdf.
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_month | 41 | 1 | 2 | 9 | 2 | 456 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 456 | 0.9956140351 | -0.0010989011 | 0.9941520468 | 0.9956140351 | -0.0010989011 | 0.9934210526 | 456 |
Averages: | 456 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 456 | 0.9956140351 | -0.0010989011 | 0.9941520468 | 0.9956140351 | -0.0010989011 | 0.9934210526 | 456 |
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N | 1st graf % | 1st graf A | 1st index % | 1st index A | org hash % | org hash A |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_month | 41 | 1 | 2 | 9 | 2 | 1990 | 0.9341708543 | -0.0337750065 | 0.8683417085 | 0.9271356784 | 0.9270088091 | 0.9924690694 | 0.9924634556 | 1859 | 0.8597989950 | 0.7240822488 | 0.8130653266 | 0.9203873050 | 0.8309561562 | 0.8805809575 | 1859 | 0.3437185930 | 0.6123922212 | 0.3366834171 | 0.6206739538 | 0.2412060302 | -0.2349657677 |
Averages: | 1990 | 0.9341708543 | -0.0337750065 | 0.8683417085 | 0.9271356784 | 0.9270088091 | 0.9924690694 | 0.9924634556 | 1859 | 0.8597989950 | 0.7240822488 | 0.8130653266 | 0.9203873050 | 0.8309561562 | 0.8805809575 | 1859 | 0.3437185930 | 0.6123922212 | 0.3366834171 | 0.6206739538 | 0.2412060302 | -0.2349657677 |
prelim_month_human
Run the reliability calculations for prelim_month_human just to get the lookup assessment (since lookup is not classification, precision and recall make no sense for it).
Jupyter notebook of agreement between corrected and uncorrected human coding: https://research.local:8000/user/jonathanmorgan/notebooks/work/django/research/work/phd_work/methods/reliability/prelim_month_human-reliability.ipynb
results are in Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/reliability/2016-data/prelim_month_human-reliability_results.pdf.
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_month_human | 42 | 1 | 2 | 9 | 9 | 459 | 0.9869281046 | -0.0054824561 | 0.9738562092 | 0.9869281046 | 0.9864845280 | 1.0000000000 | 1.0000000000 | 453 | 0.9869281046 | -0.0054824561 | 0.9825708061 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 453 |
Averages: | 459 | 0.9869281046 | -0.0054824561 | 0.9738562092 | 0.9869281046 | 0.9864845280 | 1.0000000000 | 1.0000000000 | 453 | 0.9869281046 | -0.0054824561 | 0.9825708061 | 1.0000000000 | 1.0000000000 | 1.0000000000 | 453 |
label | results ID | coder1 index | coder2 index | coder1 ID | coder2 ID | count | detect % | detect A | detect pi | lookup % | lookup A | lookup-NZ % | lookup-NZ A | lookup-NZ N | type % | type A | type pi | type-NZ % | type-NZ A | type-NZ pi | type-NZ N | 1st graf % | 1st graf A | 1st index % | 1st index A | org hash % | org hash A |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
prelim_month_human | 42 | 1 | 2 | 9 | 9 | 1962 | 0.9459734964 | -0.0275013096 | 0.8919469929 | 0.9454638124 | 0.9453873170 | 0.9994612069 | 0.9994608121 | 1856 | 0.9347604485 | 0.8674336065 | 0.9130139314 | 0.9881465517 | 0.9740898999 | 0.9822198276 | 1856 | 0.6121304791 | 1.0000000000 | 0.6116207951 | 0.9991667040 | 0.9576962283 | 0.9550945519 |
Averages: | 1962 | 0.9459734964 | -0.0275013096 | 0.8919469929 | 0.9454638124 | 0.9453873170 | 0.9994612069 | 0.9994608121 | 1856 | 0.9347604485 | 0.8674336065 | 0.9130139314 | 0.9881465517 | 0.9740898999 | 0.9822198276 | 1856 | 0.6121304791 | 1.0000000000 | 0.6116207951 | 0.9991667040 | 0.9576962283 | 0.9550945519 |
Generate some basic network statistics from the ground truth and automated attribution data, characterize and compare using QAP (including explaining substantial limitations of this given sparseness of networks).
This notebook, methods_paper_planning.ipynb, is the master network analysis planning notebook now, not network_analysis/methods-network_analysis-create_network_data.ipynb.
ARCHIVE - Original master network analysis notebook: network_analysis/methods-network_analysis-create_network_data.ipynb (previously named 2017.11.14-work_log-prelim-network_analysis.ipynb).
examine traits of ground_truth and automated networks
1) create network data for each time period - network_analysis/methods-network_analysis-create_network_data.ipynb
2) network descriptives, for comparison across network slices.
3) QAP comparison of networks.
4) Question: do we care about author-info?
network_analysis/methods-network_analysis-create_network_data.ipynb
Overview:
Section 2 (2.1-2.3): Deriving network data - for all network analysis, contains the exact settings used to create the network data for each time period.
2.1 - for original week (12/06/2009-12/12/2009), original coders, networks output include:
2.2 - original week (12/06/2009-12/12/2009), new coders, networks output include:
2.3 - entire month (12/01/2009-12/31/2009).
includes all people for the entire month, networks for:
Section 3 - output from here is basis for all the R author info stuff below:
network_analysis/igraph - R igraph analysis
- new notebooks: broke out into one file per time period, and separate R data files:
  - igraph-grp_month-full_month.RData
  - igraph-grp_month-week_1.RData
  - igraph-grp_month-week_2.RData
  - igraph-grp_month-week_3.RData
- original notebook: network_analysis/igraph/2017.12.02-work_log-prelim-R-igraph-grp_month.ipynb - basic network analysis of new month and week (nodes for all people from entire month, ties for whole month, then just first week) using igraph.
network_analysis/statnet - R statnet analysis
- all data stored in a single RData file (statnet-grp_month.RData)
- context_analysis/r/sna/statnet/functions-statnet.r
network creation, descriptives, and QAP notebooks:
The notebooks below each create network data for a single time period, then analyze each period separately. Each includes:
notebooks:
network_analysis/statnet/R-statnet-grp_month-full_month.ipynb - full month of data.
network_analysis/statnet/R-statnet-grp_month-week_1.ipynb - full week 1 of three (2009-12-06 to 2009-12-12).
network_analysis/statnet/R-statnet-grp_month-week_2.ipynb - full week 2 of three (2009-12-13 to 2009-12-19).
network_analysis/statnet/R-statnet-grp_month-week_3.ipynb - full week 3 of three (2009-12-20 to 2009-12-26).
network comparison - then, the networks created above are compared week-to-week and each week to the month as a whole to start to look at what constitutes a network snapshot.
network_analysis/statnet/R-statnet-grp_month-compare_graphs.ipynb - QAP comparisons, both automated-to-automated and human-to-human, of:
network_analysis/statnet/R-statnet-grp_month-compare_graphs_cross_source.ipynb - To look at difference mixing and matching human and automated makes for analysis, includes QAP comparisons, baseline-to-automated (b2a) and automated-to-baseline (a2b), of:
ARCHIVE: original notebook: network_analysis/statnet/2017.12.02-work_log-prelim-R-statnet-grp_month.ipynb - basic network analysis of new month and week (nodes for all people from entire month, ties for whole month, then just first week) using statnet. Broke out into one notebook per time period, and one notebook for comparisons across time periods. All data still stored in a single RData file (statnet-grp_month.RData
).
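For reference, the QAP comparisons mentioned above are run in R with statnet; as a rough illustration of the idea only, a graph correlation plus QAP permutation test can be sketched in Python/numpy like this (this is not the code used for the paper):

# Illustrative QAP sketch (numpy), not the statnet code actually used.
import numpy as np

def qap_correlation( matrix_a, matrix_b, permutation_count = 1000, random_seed = 123 ):
    # correlation between off-diagonal entries, plus a permutation-based p-value
    rng = np.random.default_rng( random_seed )
    node_count = matrix_a.shape[ 0 ]
    off_diagonal = ~np.eye( node_count, dtype = bool )

    def graph_correlation( m1, m2 ):
        return np.corrcoef( m1[ off_diagonal ], m2[ off_diagonal ] )[ 0, 1 ]

    observed = graph_correlation( matrix_a, matrix_b )
    null_correlations = []
    for _ in range( permutation_count ):
        node_permutation = rng.permutation( node_count )
        permuted_b = matrix_b[ node_permutation ][ :, node_permutation ]
        null_correlations.append( graph_correlation( matrix_a, permuted_b ) )
    p_value = np.mean( np.abs( np.array( null_correlations ) ) >= abs( observed ) )
    return observed, p_value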
Analysis of these descriptives and QAP correlations:
- Comparison between automated and baseline networks (month, week1, week2, week3): Dropbox/academia/MSU/program_stuff/prelim_paper/paper/latest/network_snapshots-compare_automated_to_baseline.xlsx
- from network_analysis/statnet/R-statnet-grp_month-compare_graphs.ipynb and network_analysis/statnet/R-statnet-grp_month-compare_graphs_cross_source.ipynb above.

network_analysis/author_info - Information on the authors in the data set and their network characteristics.
NOTE: This still looks to be dependent on the Python author info code run in network_analysis/methods-network_analysis-create_network_data.ipynb, section 3.1 (vectors of person IDs and counts are hard-coded in the R code).
new notebooks:
original notebooks:
TODO: update all below so they include the two additional weeks.
DONE:
Updated ArticleSelectForm and PersonSelectForm to include a field for "coder_id_priority_list"/"person_coder_id_priority_list".
Created method NetworkOutput.get_coder_id_list() that:
- if prioritized list is present:
Updated NetworkOutput.create_query_set() to use the get_coder_id_list() method.
Need to update NetworkOutput.remove_duplicate_article_data() - it is where we choose which Article_Data to omit per article when there are duplicates. It needs to go in the order of the priority list (see the sketch after the test list below). Might already do this... Nope.
Need to test
person-coded articles:
look for differences in:
automated coder:
As long as the tests above check out, then try out the whole month with the prioritized coder list.
Need to update NetworkDataOutput and children? Looks like no - it all comes down to remove_duplicate_article_data().
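A minimal sketch of the priority-based selection described above (hypothetical - not the actual NetworkOutput.remove_duplicate_article_data() implementation; it assumes each Article_Data exposes its coder's ID as coder_id):

# Hypothetical illustration of picking one Article_Data per article by coder priority.
def choose_article_data_by_priority( article_data_list, coder_id_priority_list ):
    coder_id_to_data = {}
    for article_data in article_data_list:
        coder_id_to_data[ article_data.coder_id ] = article_data
    # walk the priority list in order, return the first match
    for coder_id in coder_id_priority_list:
        if coder_id in coder_id_to_data:
            return coder_id_to_data[ coder_id ]
    return None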
context_text/R/sna/sna_author_info.r.
Next step is to pull analysis together in an Excel spreadsheet like I did last time.
For old results and more detailed notes on implementation and interpretation, see Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/archive/prelim_v1-2015/analysis_summary.xlsx.
New analysis file: Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/analysis_summary-2017.12.24.xlsx
Analysis charts for paper (should take all tables, convert to markdown, and add to this notebook): Dropbox/academia/MSU/program_stuff/prelim_paper/paper/latest/methods-charts.xlsx
Notes:
In general, revised procedure:
- content analysis protocol to create testing data.
- have automated tool code same articles.
- so, won't look as much at comparing humans to computer in terms of agreement for content analysis:
Removed tabs:
- agree-prelim_reliability - old reliability coding between 2 human coders.
- agree-prelim_network-mentions - agreement between traits of network data derived from human and computer code - tie weights.
- values-detect_names - survey of name detection descriptives - counts across all names of how many were detected and not, per coder. Will see if we need to derive this again for new coders. Probably won't.
- values-count_ties - descriptives and comparison of tie weights between human and computer, to look at something like precision and recall (confusion matrix), but just comparing human and computer, not treating human as ground truth. No need for this with precision and recall stats.
- counts_per_person - not sure what this is...
- disagreements - similar to values-count_ties, but higher-level analysis. Will have to create new disagreement information from results of disagreement analysis in creating evaluation data.

Updated spreadsheet:
Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/analysis_summary-2017.12.24.xlsx
Agreement results:
tabs "CA-reliability-author
" and "CA-reliability-subject
" are derived from work labeled "prelim_reliability_combined
" = articles from both Grand Rapids Press and Detroit News, to minimally test cross-paper use of protocol.
Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/reliability/2016-data/2016.08.27-reliability-prelim_reliability_combined_human.xlsx
old results have "mentions". Mentions are weights from network data back when I derived it and stored it in a table of my own design ("context_analysis_reliability_ties
"), rather than outputting in formats readable by SNA packages. Omitting this in favor of precision and recall and network statistics.
Network results:
Main sources:
2017.12.02-work_log-prelim-R-igraph-grp_month.ipynb - basic network analysis of new month and week using igraph.
Other:
Path to paper: Dropbox/academia/MSU/program_stuff/prelim_paper/paper/latest/Morgan-Prelim.docx
TODO:
DONE:
update methods
update results
TODO:
DONE: