Methods - network analysis - create network data

2017.11.14 - work log - prelim - network analysis

NOTE: The work captured here is outdated. See methods_paper_planning.ipynb --> Network Analysis for the up-to-date network analysis summary.

Setup

Setup - Imports


In [1]:
from __future__ import unicode_literals
from __future__ import division

# python base imports
import datetime

# import six
import six

print( "packages imported at " + str( datetime.datetime.now() ) )


packages imported at 2018-03-17 21:55:28.967228

Setup - Initialize Django

First, initialize my dev django project, so I can run code in this notebook that references my django models and can talk to the database using my project's settings.


In [2]:
%run ../django_init.py


/home/jonathanmorgan/.virtualenvs/sourcenet/lib/python3.5/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)
django initialized at 2018-03-18 01:55:33.353472

In [3]:
# django imports
from django.contrib.auth.models import User

# sourcenet imports
from context_text.shared.context_text_base import ContextTextBase

# context_analysis imports
from context_analysis.network.network_person_info import NetworkPersonInfo

# sourcenet imports
from context_text.models import Article
from context_text.models import Article_Data
from context_text.models import Person

Network Analysis

Generate basic network statistics from the ground truth and automated attribution data, then characterize and compare the two networks using QAP (including an explanation of the substantial limitations of QAP given how sparse these networks are).

Notes:

  • TK
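
As a reference point for the QAP comparison mentioned above, here is a minimal pure-Python sketch of a QAP correlation: correlate the off-diagonal cells of two matrices over the same node set, then re-compute the correlation after repeatedly relabeling the nodes of one matrix (the same permutation applied to rows and columns). This is not the code used for the prelim analysis; the function names, the one-sided p-value, and the toy matrices are all assumptions made for illustration.

```python
import random
from math import sqrt


def offdiag_values(matrix):
    """Return the off-diagonal cells of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def qap_correlation(a, b, permutations=1000, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(a)
    a_values = offdiag_values(a)
    observed = pearson(a_values, offdiag_values(b))
    at_least_as_large = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # relabel the nodes of b: same permutation on rows and columns
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(a_values, offdiag_values(permuted)) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / permutations


# toy 4-person matrices (made up, not project data)
human = [[0, 1, 0, 0], [1, 0, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0]]
automated = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
correlation, p_value = qap_correlation(human, automated, permutations=500)
print("QAP correlation = " + str(correlation) + "; p = " + str(p_value))
```

With matrices that are mostly 0s, most cell pairs are (0, 0) before and after relabeling, so only the few non-zero cells drive the result. That is one way to see the sparseness limitation noted above.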

network builder - test prelim network - original coders

First, we configure Network Builder as we did before, use it to render human and automated networks, and compare the files to those from the previous run. They should be identical.

Configuration of Network Builder:

  • Configuration to generate network files for prelim:

    • Config of "Select Articles" - use defaults, except:

      • People from: 2009-12-06
      • People to: 2009-12-12
      • Publications: "Grand Rapids Press, The"
      • Coders:

        • for human sample: "brianbowe", "jonathanmorgan"
        • for automated: "automated"
      • coder_type filter - only for coder "automated", for now.

        • use the coder_type filter fields to filter automatically coded Article_Data on coder type if you have tried different automated coder types:
        • "Article_Data coder_type Filter Type:" - "Just automated"
        • "coder_type 'Value In' List (comma-delimited):" - Enter the coder types you want included. Examples:

          • OpenCalais v2: "OpenCalais_REST_API_v2"
      • Article Tag List: "prelim_network"

      • Person allow duplicate articles: "No"
    • Configure "Network Settings" - use defaults, except:

      • Download as File?: "Yes"
      • Data Format: "Tab-Delimited Matrix"
      • Data Output Type: Network + Attribute Columns
      • Include Headers: "Yes"
    • Config of "Select People" - use defaults, except:

      • For each, include all people used by either the human or the automated coders, so the networks have the same dimensions. So:
      • Person Query Type: "Custom, defined below"
      • People from: 2009-12-06
      • People to: 2009-12-12
      • Person publications: "Grand Rapids Press, The"
      • Person coders: "automated", "brianbowe", "jonathanmorgan"
      • coder_type filter

        • use the coder_type filter fields to filter automatically coded Article_Data on coder type if you have tried different automated coder types:
        • "Article_Data coder_type Filter Type:" - "Just automated"
        • "coder_type 'Value In' List (comma-delimited):" - Enter the coder types you want included. Examples:

          • OpenCalais v2: "OpenCalais_REST_API_v2"
      • Article Tag List: "prelim_network"

      • Person allow duplicate articles: "Yes"

Resulting files stored in Dropbox (Dropbox/academia/MSU/program_stuff/prelim_paper/data/network_analysis/2017.11.14/network/original_coders/prelim_network):

  • human - original coders - article and person unordered - sourcenet_data-20171114-182817-prelim_network-week-human-original-unordered.tab
  • human - original coders - article ordered (6,4), person unordered - sourcenet_data-20171114-183930-prelim_network-week-human-original-ordered-64.tab
  • human - original coders - article (6,4) and person (2,6,4) ordered - sourcenet_data-20171115-024247-prelim_network-week-human-original-ordered-64-264.tab
  • automated - sourcenet_data-20171114-182942-prelim_network-week-automated.tab

Notes:

  • The human files are almost identical (compared in Kaleidoscope): 3 lines differ. The automated file is quite different, which is evidence that network correlations are not very useful when the matrices contain so many 0s.
  • Whether person is ordered or unordered, configured as it was for this run, the files are identical (YAY! as it should be!).
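
The comparison above was done by eye in a diff tool. A rough scripted equivalent is sketched below: it counts the row positions at which two rendered files differ. It assumes both files list rows in the same order (a positional comparison, not a true diff), and the function names are hypothetical.

```python
def count_differing_lines(lines_a, lines_b):
    """Count positions where two line sequences differ, including any length mismatch."""
    shared = min(len(lines_a), len(lines_b))
    differing = sum(1 for i in range(shared) if lines_a[i] != lines_b[i])
    return differing + abs(len(lines_a) - len(lines_b))


def count_differing_lines_in_files(path_a, path_b):
    """Read two text files and count the line positions that differ."""
    with open(path_a, encoding="utf-8") as file_a, open(path_b, encoding="utf-8") as file_b:
        return count_differing_lines(file_a.read().splitlines(), file_b.read().splitlines())


# toy rows (made up), tab-delimited like the rendered matrices
old_rows = ["0\t1\t0", "1\t0\t2", "0\t2\t0"]
new_rows = ["0\t1\t0", "1\t0\t1", "0\t1\t0"]
print("rows that differ: " + str(count_differing_lines(old_rows, new_rows)))
```

If a row were inserted or removed partway through a file, every later position would count as different, so a count much larger than expected is a hint to fall back to a real diff.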

network builder - test prelim network - new coders

Next, we start with the original configuration in Network Builder, update it to reflect the new coder users and ground_truth, render human and automated networks for the same week, and compare the files to those from the old coders.

Configuration to test with new coders:

  • start with base configuration above.
  • set up the Person so it is using minnesota1, minnesota2, minnesota3, ground_truth, and automated, unordered.
  • in Article:

    • try automated alone.
    • try minnesota1, minnesota2, minnesota3, ground_truth unordered.
    • try ordered, ground_truth, then minnesota1, minnesota2, minnesota3 - 13, 8, 9, 10

Resulting files stored in Dropbox (Dropbox/academia/MSU/program_stuff/prelim_paper/data/network_analysis/2017.11.14/network/new_coders/prelim_network):

  • automated - sourcenet_data-20171115-033413-new-prelim_network-week-automated.tab
  • automated using ordered fields - sourcenet_data-20171115-041030-new-prelim_network-week-automated-all_ordered-person-2.13.8.9.10.tab
  • human - new coders - unordered - sourcenet_data-20171115-035924-new-prelim_network-week-human-unordered.tab
  • human - new coders - ordered 13,8,9,10 - sourcenet_data-20171114-183930-new-prelim_network-week-human-new-ordered-13.8.9.10.tab

Notes:

  • There are differences, but they are about what I'd expect for ground_truth taking precedence consistently - 35 rows different.
  • I tried automated two ways: using the select, and using the ordered list. The files are identical either way.

network builder - test prelim month

Next, we use Network Builder to create prelim_month data.

nodes - full month; ties - full month

Configuration of Network Builder:

  • Configuration to generate network files for prelim:

    • Config of "Select Articles" - fields in bold need to be changed from default values:

      • Start date (YYYY-MM-DD): 2009-12-01
      • End date (YYYY-MM-DD): 2009-12-31
      • Fancy date range: - Empty.
      • Publications: "Grand Rapids Press, The"
      • Coders: None selected.
      • Coder IDs to include, in order of highest to lowest priority:

        • for human sample: 13,8,9,10
        • for automated: 2
      • if automated: Article_Data coder_type Filter Type and coder_type 'Value In' List (comma-delimited):

        • only for coder "automated" (2), for now.
        • use the coder_type filter fields to filter automatically coded Article_Data on coder type if you have tried different automated coder types:

          • Article_Data coder_type Filter Type: - Just automated
          • coder_type 'Value In' List (comma-delimited): - Enter the coder types you want included. Examples:

            • OpenCalais v2: "OpenCalais_REST_API_v2"
      • Topics: None selected.

      • Article Tag List (comma-delimited): - "grp_month"
      • Unique Identifier List (comma-delimited): - Empty.
      • Allow duplicate articles: - "No"
    • Configure "Network Settings" - fields in bold need to be changed from default values:

      • relations - Include source contact types - All selected.
      • relations - Include source capacities: - None selected.
      • relations - Exclude source capacities: - None selected.
      • Download as File? - "Yes"
      • Include render details? - "No"
      • Data Format: - "Tab-Delimited Matrix"
      • Data Output Type: - "Network + Attribute Columns"
      • Network Label: - Empty.
      • Include Headers: - "Yes"
    • Config of "Select People" - fields in bold need to be changed from default values:

      • NOTE: This will be the same for all networks you want to compare (different weeks within a month, compared to the whole month, for instance). For each, get people from articles that are filtered to include all people used by either human or automated, and all the days covered by any of the networks you want to compare. This means you'll have the same dimensions of network (same set of nodes/people) regardless of the particular network you are generating, allowing the matrices that result to be compared.
      • Person Query Type: - "Custom, defined below"
      • People from (YYYY-MM-DD): - 2009-12-01
      • People to (YYYY-MM-DD): - 2009-12-31
      • Fancy person date range: - Empty.
      • Person publications: - "Grand Rapids Press, The"
      • Person coders: - "automated", "minnesota1", "minnesota2", "minnesota3", "ground_truth"
      • Coder IDs to include, in order of highest to lowest priority: - Empty.
      • Article_Data coder_type Filter Type and coder_type 'Value In' List (comma-delimited):

        • NOTE: not just for automated - since this includes all coders, automated and human, you need to always specify the coder type filter if you need it for automated network.
        • use the coder_type filter fields to filter automatically coded Article_Data on coder type if you have tried different automated coder types:

          • Article_Data coder_type Filter Type: - Just automated
          • coder_type 'Value In' List (comma-delimited): - Enter the coder types you want included. Examples:

            • OpenCalais v2: "OpenCalais_REST_API_v2"
      • Person Topics: None

      • Article Tag List (comma-delimited): - "grp_month"
      • Unique Identifier List (comma-delimited): - Empty.
      • Person allow duplicate articles: - "Yes"
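
The "Select People" note above is the key to comparable output: the node list is fixed for the whole month, and only the ties change. A minimal sketch of that idea follows, assuming a symmetric tie-count matrix; the helper name, the person IDs, and the tie lists are made up, and this is not how Network Builder itself is implemented.

```python
def build_matrix(node_ids, ties):
    """Build a symmetric tie-count matrix over a fixed, ordered node list."""
    index = {node_id: position for position, node_id in enumerate(node_ids)}
    size = len(node_ids)
    matrix = [[0] * size for _ in range(size)]
    for person_a, person_b in ties:
        i, j = index[person_a], index[person_b]
        matrix[i][j] += 1
        matrix[j][i] += 1
    return matrix


# same people for every network; different ties per time window (toy values)
month_people = [1, 2, 3, 4]
week_ties = [(2, 4)]
month_ties = [(2, 4), (1, 3), (2, 4)]

week_matrix = build_matrix(month_people, week_ties)
month_matrix = build_matrix(month_people, month_ties)
print("week rows = " + str(len(week_matrix)) + "; month rows = " + str(len(month_matrix)))
```

People with no ties in a given window still get an all-zero row and column, which is what keeps the week and month matrices the same shape.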

full month file output

Resulting files stored in Dropbox (Dropbox/academia/MSU/program_stuff/prelim_paper/data/network_analysis/2017.11.14/network/new_coders/grp_month):

  • human - sourcenet_data-20171115-043102-grp_month-human.tab
  • automated - sourcenet_data-20171205-022551-grp_month-automated.tab

NOTE: To test configuration, render network file for either human or automated, then check it against the corresponding file in Dropbox. If configured correctly, files will be the same.
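
For the check described in the note above, a yes/no answer is enough, so comparing file hashes works. This is a sketch using the standard library's hashlib; the function names are hypothetical and the paths would be the freshly rendered file and the corresponding file in Dropbox.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """True if the two files have exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)
```

This is a byte-for-byte check, so a difference in line endings or a trailing newline also counts as a mismatch.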

nodes - full month; ties - week 1 (2009-12-06 to 2009-12-12)

Then, alter the configuration to output ties for just the week, but with all the people for the entire month (to see how the same matrix compares when populated with a week's worth of data versus a month's).

  • Configuration to generate network files for prelim - use the above grp_month config, except:

    • Config of "Select Articles" - use fields as configured for full month of ties, except:

      • Start date (YYYY-MM-DD): 2009-12-06
      • End date (YYYY-MM-DD): 2009-12-12

week 1 file output

Resulting files stored in Dropbox (Dropbox/academia/MSU/program_stuff/prelim_paper/data/network_analysis/2017.11.14/network/prelim_network/new_coders/grp_month):

  • human - sourcenet_data-20171206-031319-grp_month-human-week1_subset.tab
  • automated - sourcenet_data-20171206-031358-grp_month-automated-week1_subset.tab

nodes - full month; ties - week 2 (2009-12-13 to 2009-12-19)

Then, alter the configuration to output ties for just the week, but with all the people for the entire month (to see how the same matrix compares when populated with a week's worth of data versus a month's).

  • Configuration to generate network files for prelim - use the above grp_month config, except:

    • Config of "Select Articles" - use fields as configured for full month of ties, except:

      • Start date (YYYY-MM-DD): 2009-12-13
      • End date (YYYY-MM-DD): 2009-12-19

week 2 file output

Resulting files stored in Dropbox (Dropbox/academia/MSU/program_stuff/prelim_paper/data/network_analysis/2017.11.14/network/prelim_network/new_coders/grp_month):

  • human - sourcenet_data-20180326-034401-grp_month-human-week2_subset.tab
  • automated - sourcenet_data-20180326-040445-grp_month-automated-week2_subset.tab

nodes - full month; ties - week 3 (2009-12-20 to 2009-12-26)

Then, alter the configuration to output ties for just the week, but with all the people for the entire month (to see how the same matrix compares when populated with a week's worth of data versus a month's).

  • Configuration to generate network files for prelim - use the above grp_month config, except:

    • Config of "Select Articles" - use fields as configured for full month of ties, except:

      • Start date (YYYY-MM-DD): 2009-12-20
      • End date (YYYY-MM-DD): 2009-12-26

week 3 file output

Resulting files stored in Dropbox (Dropbox/academia/MSU/program_stuff/prelim_paper/data/network_analysis/2017.11.14/network/prelim_network/new_coders/grp_month):

  • human - sourcenet_data-20180326-034548-grp_month-human-week3_subset.tab
  • automated - sourcenet_data-20180326-040736-grp_month-automated-week3_subset.tab
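
The three weekly tie windows above are consecutive seven-day ranges starting 2009-12-06. A small sketch for generating the Start date / End date pairs, so they do not have to be typed by hand (the helper name is hypothetical):

```python
import datetime


def week_windows(first_day, week_count):
    """Return (start, end) date pairs for consecutive seven-day windows."""
    windows = []
    for week_index in range(week_count):
        start = first_day + datetime.timedelta(days=7 * week_index)
        end = start + datetime.timedelta(days=6)
        windows.append((start, end))
    return windows


for start, end in week_windows(datetime.date(2009, 12, 6), 3):
    print(start.isoformat() + " to " + end.isoformat())

# prints:
# 2009-12-06 to 2009-12-12
# 2009-12-13 to 2009-12-19
# 2009-12-20 to 2009-12-26
```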

Redo network analysis

Notes on original network analysis are in Evernote: MSU PhD - prelim - analysis - Network Analysis notes

  • network descriptives

    • high-level description of networks
    • // From Prelim Proposal: "I’ll then compare values for more specific network statistics between the two, including network-level aggregate degree and transitivity, and reporter-level degree, average tie weight and transitivity." More details:
    • http://faculty.ucr.edu/~hanneman/nettext/C7_Connection.html
    • for old results and more detailed notes on implementation and interpretation, see Dropbox/academia/MSU/program_stuff/prelim_paper/analysis/analysis_summary.xlsx
    • network-level

      • files

        • python script:

          • context_text/examples/analysis/analysis-person_info.py - calculates per-author information - on shared sources, article counts per author, etc.
        • R scripts:

          • context_text/R/db_connect.r
          • context_text/R/sna/functions-sna.r
          • context_text/R/sna/sna-load_data.r
          • context_text/R/sna/igraph/*
          • context_text/R/sna/statnet/*
      • statnet/sna

        • sna::gden() - graph density
      • igraph

        • igraph::transitivity() - vector of transitivity scores for each node in a graph, plus network-level transitivity score.

          • Q - interpretation?
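
To help with interpreting the two network-level statistics listed above, here is a pure-Python sketch of the textbook definitions for an undirected, binary network: density is the share of possible ties that are present, and global transitivity is the share of two-paths (A-B-C) that are closed by an A-C tie. This is an illustration only. It treats any non-zero cell as a tie, and it is not guaranteed to match what sna::gden() or igraph::transitivity() return under the options used in the R scripts.

```python
from itertools import combinations


def density(adjacency):
    """Share of possible undirected ties that are present (diagonal ignored)."""
    n = len(adjacency)
    possible = n * (n - 1) / 2
    present = sum(1 for i, j in combinations(range(n), 2) if adjacency[i][j])
    return present / possible if possible else 0.0


def global_transitivity(adjacency):
    """Closed two-paths divided by all two-paths (0.0 if there are none)."""
    n = len(adjacency)
    closed = 0
    two_paths = 0
    for center in range(n):
        neighbors = [k for k in range(n) if k != center and adjacency[center][k]]
        for a, b in combinations(neighbors, 2):
            two_paths += 1
            if adjacency[a][b]:
                closed += 1
    return closed / two_paths if two_paths else 0.0


# toy network (made up): a triangle 0-1-2 plus node 3 tied only to node 2
toy = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print("density = " + str(density(toy)))
print("transitivity = " + str(global_transitivity(toy)))
```

In the toy network, 4 of 6 possible ties exist, and 3 of the 5 two-paths are closed, so transitivity is 0.6. Read that as "when two people share a neighbor, they are tied to each other 60% of the time."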

analysis-person_info.py

First, need to figure out context_text/examples/analysis/reliability-build_relations.py

Original file: context_text/examples/analysis/analysis-person_info.py

Moved to:

  • NetworkPersonInfo object: context_analysis/network/network_person_info.py
  • example file: context_analysis/examples/network/network-person_info.py

Try reproducing below with new class - Configure:


In [4]:
%run ../config-coder_index-prelim_month.py


indexing for grp_month/prelim_month initialized at 2018-03-18 01:58:02.409032

In [5]:
%run ../config-coder_index-prelim_week.py


indexing for prelim_network (1 week) initialized at 2018-03-18 01:58:04.341522

And then run the code:


In [7]:
#===============================================================================
# process articles
#===============================================================================

# process articles
my_info_instance.process_articles( tag_list )

# output lists of counts of sources and shared sources by author

# declare variables - looking at data
coder_index_to_data_dict = None
coder_index = -1
coder_data_dict = None
coder_author_id_list = None
coder_author_source_count_list = None
coder_author_shared_count_list = None
coder_author_article_count_list = None
mean_source_count = -1
mean_shared_count = -1
mean_article_count = -1
author_index = -1
shared_count = -1
temp_author_id_list = []
temp_source_count_list = []
temp_shared_count_list = []
temp_article_count_list = []

# for each index, get authors.
coder_index_to_data_dict = my_info_instance.coder_index_to_data_map
        
# loop over the dictionary to process each index.
for coder_index, coder_data_dict in six.iteritems( coder_index_to_data_dict ):

    # get data for coder
    coder_author_id_list = coder_data_dict.get( NetworkPersonInfo.PROP_CODER_AUTHOR_ID_LIST, None )
    coder_author_source_count_list = coder_data_dict.get( NetworkPersonInfo.PROP_CODER_AUTHOR_SOURCE_COUNT_LIST, None )
    coder_author_shared_count_list = coder_data_dict.get( NetworkPersonInfo.PROP_CODER_AUTHOR_SHARED_COUNT_LIST, None )
    coder_author_article_count_list = coder_data_dict.get( NetworkPersonInfo.PROP_CODER_AUTHOR_ARTICLE_COUNT_LIST, None )

    # output
    print( "" )
    print( "================================================================================" )
    print( "Data for Coder index " + str( coder_index ) + ":" )

    print( "" )
    print( "==> All authors" )
    print( "- author ID list = " + str( coder_author_id_list ) )    
    print( "- author source count list = " + str( coder_author_source_count_list ) )    
    print( "- author shared count list = " + str( coder_author_shared_count_list ) )    
    print( "- author article count list = " + str( coder_author_article_count_list ) )    

    # and some computations

    # author count
    print( "- author count = " + str( len( coder_author_id_list ) ) )
    
    # mean source count per author
    mean_source_count = float( sum( coder_author_source_count_list ) ) / len( coder_author_source_count_list )
    print( "- mean source count per author = " + str( mean_source_count ) )
    
    # mean shared count per author
    mean_shared_count = float( sum( coder_author_shared_count_list ) ) / len( coder_author_shared_count_list )
    print( "- mean shared count per author = " + str( mean_shared_count ) )
    
    # mean article count per author
    mean_article_count = float( sum( coder_author_article_count_list ) ) / len( coder_author_article_count_list )
    print( "- mean article count per author = " + str( mean_article_count ) )
    
    # the same, but just for those with shared sources.
    author_index = -1
    temp_author_id_list = []
    temp_source_count_list = []
    temp_shared_count_list = []
    temp_article_count_list = []

    for shared_count in coder_author_shared_count_list:
    
        # increment index
        author_index += 1
        
        # greater than 0?
        if ( shared_count > 0 ):
        
            # yes, add info to temp lists.
            temp_author_id_list.append( coder_author_id_list[ author_index ] )
            temp_source_count_list.append( coder_author_source_count_list[ author_index ] )
            temp_shared_count_list.append( coder_author_shared_count_list[ author_index ] )
            temp_article_count_list.append( coder_author_article_count_list[ author_index ] )
            
        #-- END check to see if shared count > 0 --#
    
    #-- END loop over shared_count_list --#

    print( "" )
    print( "==> Authors with shared sources" )
    print( "- author ID list = " + str( temp_author_id_list ) )    
    print( "- author source count list = " + str( temp_source_count_list ) )    
    print( "- author shared count list = " + str( temp_shared_count_list ) )    
    print( "- author article count list = " + str( temp_article_count_list ) )    

    # and some computations

    # author count
    print( "- author count = " + str( len( temp_author_id_list ) ) )
    
    # means for authors with shared sources - skip if there are none, to avoid dividing by zero.
    if ( len( temp_author_id_list ) > 0 ):

        # mean source count per author with shared sources
        mean_source_count = float( sum( temp_source_count_list ) ) / len( temp_source_count_list )
        print( "- mean source count per author with shared sources = " + str( mean_source_count ) )

        # mean shared count per author with shared sources
        mean_shared_count = float( sum( temp_shared_count_list ) ) / len( temp_shared_count_list )
        print( "- mean shared count per author with shared sources = " + str( mean_shared_count ) )

        # mean article count per author with shared sources
        mean_article_count = float( sum( temp_article_count_list ) ) / len( temp_article_count_list )
        print( "- mean article count per author with shared sources = " + str( mean_article_count ) )

    #-- END check to see if any authors have shared sources --#
    
#-- END loop over coders. --#


Start of update_author_shared_sources():
In update_author_shared_sources: multiple authors ( [3, 84] for source 2731
In update_author_shared_sources: multiple authors ( [598, 1082] for source 2052
In update_author_shared_sources: multiple authors ( [598, 1082] for source 2054
In update_author_shared_sources: multiple authors ( [598, 1082] for source 2055
In update_author_shared_sources: multiple authors ( [598, 1082] for source 2056
In update_author_shared_sources: multiple authors ( [69, 84] for source 345
In update_author_shared_sources: multiple authors ( [377, 69, 23, 437] for source 2094
In update_author_shared_sources: multiple authors ( [66, 23] for source 2096
In update_author_shared_sources: multiple authors ( [598, 1082] for source 2060
In update_author_shared_sources: multiple authors ( [377, 69] for source 100
In update_author_shared_sources: multiple authors ( [46, 161] for source 358
In update_author_shared_sources: multiple authors ( [66, 223, 46, 69, 36, 23] for source 102
In update_author_shared_sources: multiple authors ( [46, 161] for source 360
In update_author_shared_sources: multiple authors ( [46, 161] for source 361
In update_author_shared_sources: multiple authors ( [46, 161] for source 362
In update_author_shared_sources: multiple authors ( [394, 302] for source 2205
In update_author_shared_sources: multiple authors ( [161, 46] for source 162
In update_author_shared_sources: multiple authors ( [161, 46] for source 163
In update_author_shared_sources: multiple authors ( [161, 46] for source 164
In update_author_shared_sources: multiple authors ( [161, 46] for source 165
In update_author_shared_sources: multiple authors ( [161, 46] for source 166
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2222
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2223
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2224
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2225
In update_author_shared_sources: multiple authors ( [36, 599] for source 2226
In update_author_shared_sources: multiple authors ( [46, 3] for source 182
In update_author_shared_sources: multiple authors ( [217, 178] for source 2233
In update_author_shared_sources: multiple authors ( [377, 69, 46, 937, 23, 66, 591] for source 188
In update_author_shared_sources: multiple authors ( [3, 84] for source 2734
In update_author_shared_sources: multiple authors ( [505, 74] for source 2256
In update_author_shared_sources: multiple authors ( [30, 217] for source 2265
In update_author_shared_sources: multiple authors ( [437, 377, 217, 74] for source 218
In update_author_shared_sources: multiple authors ( [36, 223, 178] for source 225
In update_author_shared_sources: multiple authors ( [73, 23] for source 236
In update_author_shared_sources: multiple authors ( [73, 437] for source 237
In update_author_shared_sources: multiple authors ( [36, 74] for source 2315
In update_author_shared_sources: multiple authors ( [69, 387] for source 289
In update_author_shared_sources: multiple authors ( [223, 69, 437] for source 290
In update_author_shared_sources: multiple authors ( [84, 443, 66] for source 308
In update_author_shared_sources: multiple authors ( [178, 336] for source 337
In update_author_shared_sources: multiple authors ( [69, 84, 46, 937, 23, 66, 437] for source 346
In update_author_shared_sources: multiple authors ( [377, 69] for source 2402
In update_author_shared_sources: multiple authors ( [591, 394] for source 368
In update_author_shared_sources: multiple authors ( [29, 69] for source 383
In update_author_shared_sources: multiple authors ( [591, 69] for source 404
In update_author_shared_sources: multiple authors ( [69, 217] for source 439
In update_author_shared_sources: multiple authors ( [66, 69] for source 451
In update_author_shared_sources: multiple authors ( [69, 217] for source 454
In update_author_shared_sources: multiple authors ( [387, 66] for source 474
In update_author_shared_sources: multiple authors ( [69, 30] for source 475
In update_author_shared_sources: multiple authors ( [46, 937, 23, 66] for source 476
In update_author_shared_sources: multiple authors ( [46, 937, 23, 66] for source 477
In update_author_shared_sources: multiple authors ( [46, 937, 23, 66] for source 478
In update_author_shared_sources: multiple authors ( [46, 937, 23, 66] for source 479
In update_author_shared_sources: multiple authors ( [46, 937, 23, 66] for source 480
In update_author_shared_sources: multiple authors ( [178, 161, 437, 66] for source 487
In update_author_shared_sources: multiple authors ( [74, 591] for source 502
In update_author_shared_sources: multiple authors ( [425, 302] for source 510
In update_author_shared_sources: multiple authors ( [425, 302] for source 511
In update_author_shared_sources: multiple authors ( [84, 66] for source 512
In update_author_shared_sources: multiple authors ( [84, 66, 2310] for source 513
In update_author_shared_sources: multiple authors ( [84, 66] for source 514
In update_author_shared_sources: multiple authors ( [84, 66] for source 515
In update_author_shared_sources: multiple authors ( [46, 217, 66, 3] for source 516
In update_author_shared_sources: multiple authors ( [336, 443] for source 521
In update_author_shared_sources: multiple authors ( [73, 217, 66] for source 534
In update_author_shared_sources: multiple authors ( [36, 223] for source 551
In update_author_shared_sources: multiple authors ( [36, 223] for source 552
In update_author_shared_sources: multiple authors ( [36, 223] for source 553
In update_author_shared_sources: multiple authors ( [36, 223] for source 554
In update_author_shared_sources: multiple authors ( [36, 223, 74] for source 555
In update_author_shared_sources: multiple authors ( [13, 161] for source 558
In update_author_shared_sources: multiple authors ( [84, 66] for source 569
In update_author_shared_sources: multiple authors ( [84, 599] for source 572
In update_author_shared_sources: multiple authors ( [84, 178, 599, 66] for source 574
In update_author_shared_sources: multiple authors ( [443, 66] for source 2626
In update_author_shared_sources: multiple authors ( [66, 387] for source 2656
In update_author_shared_sources: multiple authors ( [66, 387] for source 2658
In update_author_shared_sources: multiple authors ( [66, 387] for source 2660
In update_author_shared_sources: multiple authors ( [161, 437, 217] for source 617
In update_author_shared_sources: multiple authors ( [84, 217] for source 664
In update_author_shared_sources: multiple authors ( [66, 443, 599] for source 668
In update_author_shared_sources: multiple authors ( [3, 84] for source 2733
In update_author_shared_sources: multiple authors ( [3, 84] for source 2735
In update_author_shared_sources: multiple authors ( [3, 84] for source 2736
In update_author_shared_sources: multiple authors ( [84, 66] for source 769
In update_author_shared_sources: multiple authors ( [591, 443] for source 137
In update_author_shared_sources: multiple authors ( [443, 66] for source 2934
In update_author_shared_sources: multiple authors ( [46, 437, 217] for source 1059
In update_author_shared_sources: multiple authors ( [46, 3] for source 1080
In update_author_shared_sources: multiple authors ( [377, 437] for source 1109
In update_author_shared_sources: multiple authors ( [443, 66] for source 2630
In update_author_shared_sources: multiple authors ( [377, 217] for source 1203
In update_author_shared_sources: multiple authors ( [73, 1655] for source 1343
In update_author_shared_sources: multiple authors ( [217, 3] for source 1419
In update_author_shared_sources: multiple authors ( [443, 66] for source 2627
In update_author_shared_sources: multiple authors ( [443, 66] for source 2629
In update_author_shared_sources: multiple authors ( [73, 23] for source 1548
In update_author_shared_sources: multiple authors ( [2310, 66] for source 2311
In update_author_shared_sources: multiple authors ( [66, 387] for source 2655
In update_author_shared_sources: multiple authors ( [66, 387] for source 2657
In update_author_shared_sources: multiple authors ( [377, 69, 223] for source 1730
In update_author_shared_sources: multiple authors ( [69, 591] for source 1732
In update_author_shared_sources: multiple authors ( [69, 377, 437, 178] for source 1733
In update_author_shared_sources: multiple authors ( [69, 377] for source 1734
In update_author_shared_sources: multiple authors ( [84, 66] for source 1736
In update_author_shared_sources: multiple authors ( [84, 66] for source 1737
In update_author_shared_sources: multiple authors ( [73, 1655] for source 1778
In update_author_shared_sources: multiple authors ( [437, 217] for source 1861
In update_author_shared_sources: multiple authors ( [460, 332] for source 1892
In update_author_shared_sources: multiple authors ( [437, 66] for source 1980
In update_author_shared_sources: multiple authors ( [29, 437, 46] for source 2027
In update_author_shared_sources: multiple authors ( [3, 84] for source 2731
In update_author_shared_sources: multiple authors ( [1082, 598] for source 2052
In update_author_shared_sources: multiple authors ( [1082, 598] for source 2054
In update_author_shared_sources: multiple authors ( [1082, 598] for source 2056
In update_author_shared_sources: multiple authors ( [69, 84] for source 345
In update_author_shared_sources: multiple authors ( [69, 377, 23, 437] for source 2094
In update_author_shared_sources: multiple authors ( [1082, 598] for source 2060
In update_author_shared_sources: multiple authors ( [377, 69] for source 100
In update_author_shared_sources: multiple authors ( [46, 161] for source 358
In update_author_shared_sources: multiple authors ( [66, 161, 223, 178, 46, 69, 36, 425, 23] for source 102
In update_author_shared_sources: multiple authors ( [46, 161] for source 361
In update_author_shared_sources: multiple authors ( [46, 161] for source 362
In update_author_shared_sources: multiple authors ( [591, 443] for source 137
In update_author_shared_sources: multiple authors ( [394, 302] for source 2205
In update_author_shared_sources: multiple authors ( [46, 161] for source 163
In update_author_shared_sources: multiple authors ( [46, 161] for source 164
In update_author_shared_sources: multiple authors ( [46, 161] for source 165
In update_author_shared_sources: multiple authors ( [46, 161] for source 166
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2222
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2223
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2224
In update_author_shared_sources: multiple authors ( [73, 1655] for source 2225
In update_author_shared_sources: multiple authors ( [36, 599] for source 2226
In update_author_shared_sources: multiple authors ( [46, 3] for source 182
In update_author_shared_sources: multiple authors ( [217, 178] for source 2233
In update_author_shared_sources: multiple authors ( [377, 66, 23, 69, 591] for source 188
In update_author_shared_sources: multiple authors ( [3, 84] for source 2734
In update_author_shared_sources: multiple authors ( [74, 505] for source 2256
In update_author_shared_sources: multiple authors ( [30, 217] for source 2265
In update_author_shared_sources: multiple authors ( [437, 377, 217, 74, 178] for source 218
In update_author_shared_sources: multiple authors ( [73, 23] for source 236
In update_author_shared_sources: multiple authors ( [73, 437] for source 237
In update_author_shared_sources: multiple authors ( [66, 599] for source 250
In update_author_shared_sources: multiple authors ( [36, 74] for source 2315
In update_author_shared_sources: multiple authors ( [69, 387] for source 289
In update_author_shared_sources: multiple authors ( [178, 69, 223, 437] for source 290
In update_author_shared_sources: multiple authors ( [178, 336] for source 337
In update_author_shared_sources: multiple authors ( [69, 84, 66, 23, 437] for source 346
In update_author_shared_sources: multiple authors ( [69, 377] for source 2402
In update_author_shared_sources: multiple authors ( [591, 394] for source 368
In update_author_shared_sources: multiple authors ( [591, 69] for source 404
In update_author_shared_sources: multiple authors ( [69, 217] for source 439
In update_author_shared_sources: multiple authors ( [66, 69] for source 451
In update_author_shared_sources: multiple authors ( [69, 217] for source 454
In update_author_shared_sources: multiple authors ( [387, 66] for source 474
In update_author_shared_sources: multiple authors ( [69, 30] for source 475
In update_author_shared_sources: multiple authors ( [66, 23] for source 476
In update_author_shared_sources: multiple authors ( [66, 23] for source 477
In update_author_shared_sources: multiple authors ( [66, 23] for source 479
In update_author_shared_sources: multiple authors ( [66, 23] for source 480
In update_author_shared_sources: multiple authors ( [178, 161, 437, 66] for source 487
In update_author_shared_sources: multiple authors ( [74, 591] for source 502
In update_author_shared_sources: multiple authors ( [302, 425] for source 510
In update_author_shared_sources: multiple authors ( [302, 425] for source 511
In update_author_shared_sources: multiple authors ( [84, 66] for source 512
In update_author_shared_sources: multiple authors ( [84, 66] for source 514
In update_author_shared_sources: multiple authors ( [84, 66] for source 515
In update_author_shared_sources: multiple authors ( [46, 217, 66, 3] for source 516
In update_author_shared_sources: multiple authors ( [73, 217] for source 534
In update_author_shared_sources: multiple authors ( [223, 36] for source 552
In update_author_shared_sources: multiple authors ( [223, 36] for source 553
In update_author_shared_sources: multiple authors ( [223, 36] for source 554
In update_author_shared_sources: multiple authors ( [223, 36] for source 555
In update_author_shared_sources: multiple authors ( [13, 161] for source 558
In update_author_shared_sources: multiple authors ( [84, 66] for source 569
In update_author_shared_sources: multiple authors ( [84, 599] for source 572
In update_author_shared_sources: multiple authors ( [84, 178, 66] for source 574
In update_author_shared_sources: multiple authors ( [66, 443] for source 2626
In update_author_shared_sources: multiple authors ( [1082, 598] for source 598
In update_author_shared_sources: multiple authors ( [387, 66] for source 2658
In update_author_shared_sources: multiple authors ( [387, 66] for source 2660
In update_author_shared_sources: multiple authors ( [161, 437, 217] for source 617
In update_author_shared_sources: multiple authors ( [443, 66] for source 668
In update_author_shared_sources: multiple authors ( [3, 84] for source 2729
In update_author_shared_sources: multiple authors ( [3, 84] for source 2730
In update_author_shared_sources: multiple authors ( [3, 84] for source 2733
In update_author_shared_sources: multiple authors ( [3, 84] for source 2735
In update_author_shared_sources: multiple authors ( [3, 84] for source 2736
In update_author_shared_sources: multiple authors ( [46, 302] for source 762
In update_author_shared_sources: multiple authors ( [84, 66] for source 769
In update_author_shared_sources: multiple authors ( [1082, 598] for source 2893
In update_author_shared_sources: multiple authors ( [1082, 598] for source 2895
In update_author_shared_sources: multiple authors ( [66, 443] for source 2934
In update_author_shared_sources: multiple authors ( [3, 84] for source 2957
In update_author_shared_sources: multiple authors ( [2310, 66] for source 513
In update_author_shared_sources: multiple authors ( [46, 437, 217] for source 1059
In update_author_shared_sources: multiple authors ( [46, 3] for source 1080
In update_author_shared_sources: multiple authors ( [23, 223] for source 184
In update_author_shared_sources: multiple authors ( [377, 217] for source 1203
In update_author_shared_sources: multiple authors ( [387, 66] for source 2935
In update_author_shared_sources: multiple authors ( [73, 1655] for source 1343
In update_author_shared_sources: multiple authors ( [69, 2614] for source 2619
In update_author_shared_sources: multiple authors ( [217, 3] for source 1419
In update_author_shared_sources: multiple authors ( [66, 443] for source 2627
In update_author_shared_sources: multiple authors ( [66, 443] for source 2629
In update_author_shared_sources: multiple authors ( [66, 443] for source 2630
In update_author_shared_sources: multiple authors ( [73, 23] for source 1548
In update_author_shared_sources: multiple authors ( [2310, 66] for source 2311
In update_author_shared_sources: multiple authors ( [387, 66] for source 2655
In update_author_shared_sources: multiple authors ( [387, 66] for source 2657
In update_author_shared_sources: multiple authors ( [69, 377, 223] for source 1730
In update_author_shared_sources: multiple authors ( [69, 591] for source 1732
In update_author_shared_sources: multiple authors ( [69, 377, 437, 178] for source 1733
In update_author_shared_sources: multiple authors ( [69, 377] for source 1734
In update_author_shared_sources: multiple authors ( [84, 66] for source 1736
In update_author_shared_sources: multiple authors ( [84, 66] for source 1737
In update_author_shared_sources: multiple authors ( [73, 1655] for source 1778
In update_author_shared_sources: multiple authors ( [84, 66, 443] for source 308
In update_author_shared_sources: multiple authors ( [460, 332] for source 1892
In update_author_shared_sources: multiple authors ( [437, 66] for source 1980

Processed 441 Articles.
Processed 882 Article_Data records.

================================================================================
Data for Coder index 1:

==> All authors
- author ID list = [387, 2310, 2567, 394, 652, 13, 654, 3, 46, 23, 2004, 29, 30, 417, 36, 425, 2614, 302, 178, 437, 566, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 482, 591, 336, 84, 598, 599, 217, 223, 736, 2018, 743, 937, 1782, 1655, 332, 505, 703, 637]
- author source count list = [18, 2, 0, 33, 9, 36, 3, 27, 57, 31, 4, 50, 28, 4, 31, 30, 5, 31, 41, 45, 4, 13, 43, 36, 92, 43, 37, 30, 46, 3, 1, 76, 9, 64, 21, 50, 46, 18, 2, 5, 2, 7, 4, 6, 7, 18, 2, 13]
- author shared count list = [7, 2, 0, 2, 0, 1, 0, 9, 22, 12, 0, 2, 2, 0, 9, 2, 0, 3, 6, 13, 0, 5, 9, 10, 37, 19, 12, 10, 5, 1, 0, 6, 2, 19, 5, 4, 13, 9, 0, 0, 0, 7, 0, 6, 1, 1, 0, 0]
- author article count list = [7, 1, 1, 8, 5, 17, 1, 13, 21, 15, 2, 18, 13, 4, 11, 10, 1, 12, 13, 15, 1, 8, 16, 17, 30, 15, 14, 12, 19, 4, 1, 25, 4, 27, 9, 17, 18, 6, 1, 1, 1, 1, 4, 1, 4, 8, 2, 4]
- author count = 48
- mean source count per author = 24.645833333333332
- mean shared count per author = 5.6875
- mean article count per author = 9.541666666666666

==> Authors with shared sources
- author ID list = [387, 2310, 394, 13, 3, 46, 23, 29, 30, 36, 425, 302, 178, 437, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 591, 336, 84, 598, 599, 217, 223, 937, 1655, 332, 505]
- author source count list = [18, 2, 33, 36, 27, 57, 31, 50, 28, 31, 30, 31, 41, 45, 13, 43, 36, 92, 43, 37, 30, 46, 3, 76, 9, 64, 21, 50, 46, 18, 7, 6, 7, 18]
- author shared count list = [7, 2, 2, 1, 9, 22, 12, 2, 2, 9, 2, 3, 6, 13, 5, 9, 10, 37, 19, 12, 10, 5, 1, 6, 2, 19, 5, 4, 13, 9, 7, 6, 1, 1]
- author article count list = [7, 1, 8, 17, 13, 21, 15, 18, 13, 11, 10, 12, 13, 15, 8, 16, 17, 30, 15, 14, 12, 19, 4, 25, 4, 27, 9, 17, 18, 6, 1, 1, 4, 8]
- author count = 34
- mean source count per author with shared sources = 33.088235294117645
- mean shared count per author with shared sources = 8.029411764705882
- mean article count per author = 12.617647058823529

================================================================================
Data for Coder index 2:

==> All authors
- author ID list = [387, 2310, 2567, 394, 652, 13, 654, 3, 46, 23, 2004, 29, 30, 417, 36, 425, 2614, 302, 178, 437, 566, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 482, 591, 336, 84, 598, 599, 217, 223, 736, 2018, 743, 1782, 1655, 332, 505, 703, 637]
- author source count list = [18, 2, 0, 27, 8, 39, 2, 29, 46, 33, 4, 50, 26, 4, 28, 31, 6, 31, 42, 49, 2, 15, 43, 34, 88, 45, 34, 28, 46, 4, 1, 72, 9, 69, 22, 46, 43, 13, 2, 5, 2, 4, 6, 7, 14, 2, 10]
- author shared count list = [7, 2, 0, 2, 0, 1, 0, 12, 13, 11, 0, 0, 2, 0, 7, 3, 1, 4, 8, 10, 0, 7, 8, 9, 35, 19, 11, 10, 4, 1, 0, 6, 1, 20, 7, 3, 11, 8, 0, 0, 0, 0, 6, 1, 1, 0, 0]
- author article count list = [7, 1, 1, 8, 5, 17, 1, 13, 20, 15, 2, 18, 13, 4, 11, 10, 1, 12, 13, 15, 1, 8, 16, 17, 30, 15, 14, 12, 19, 4, 1, 25, 4, 27, 9, 17, 18, 6, 1, 1, 1, 4, 1, 4, 8, 2, 4]
- author count = 47
- mean source count per author = 24.27659574468085
- mean shared count per author = 5.340425531914893
- mean article count per author = 9.702127659574469

==> Authors with shared sources
- author ID list = [387, 2310, 394, 13, 3, 46, 23, 30, 36, 425, 2614, 302, 178, 437, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 591, 336, 84, 598, 599, 217, 223, 1655, 332, 505]
- author source count list = [18, 2, 27, 39, 29, 46, 33, 26, 28, 31, 6, 31, 42, 49, 15, 43, 34, 88, 45, 34, 28, 46, 4, 72, 9, 69, 22, 46, 43, 13, 6, 7, 14]
- author shared count list = [7, 2, 2, 1, 12, 13, 11, 2, 7, 3, 1, 4, 8, 10, 7, 8, 9, 35, 19, 11, 10, 4, 1, 6, 1, 20, 7, 3, 11, 8, 6, 1, 1]
- author article count list = [7, 1, 8, 17, 13, 20, 15, 13, 11, 10, 1, 12, 13, 15, 8, 16, 17, 30, 15, 14, 12, 19, 4, 25, 4, 27, 9, 17, 18, 6, 1, 4, 8]
- author count = 33
- mean source count per author with shared sources = 31.666666666666668
- mean shared count per author with shared sources = 7.606060606060606
- mean article count per author = 12.424242424242424

Results for:

  • grp_month/prelim_month: phd_work/results/network_person_info-grp_month.txt

      Processed 441 Articles.
      Processed 882 Article_Data records.
    
      ================================================================================
      Data for Coder index 1:
    
      ==> All authors
      - author ID list = [387, 2310, 2567, 394, 652, 13, 654, 3, 46, 23, 2004, 29, 30, 417, 36, 425, 2614, 302, 178, 437, 566, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 482, 591, 336, 84, 598, 599, 217, 223, 736, 2018, 743, 937, 1782, 1655, 332, 505, 703, 637]
      - author source count list = [18, 2, 0, 33, 9, 36, 3, 27, 57, 31, 4, 50, 28, 4, 31, 30, 5, 31, 41, 45, 4, 13, 43, 36, 92, 43, 37, 30, 46, 3, 1, 76, 9, 64, 21, 50, 46, 18, 2, 5, 2, 7, 4, 6, 7, 18, 2, 13]
      - author shared count list = [7, 2, 0, 2, 0, 1, 0, 9, 22, 12, 0, 2, 2, 0, 9, 2, 0, 3, 6, 13, 0, 5, 9, 10, 37, 19, 12, 10, 5, 1, 0, 6, 2, 19, 5, 4, 13, 9, 0, 0, 0, 7, 0, 6, 1, 1, 0, 0]
      - author article count list = [7, 1, 1, 8, 5, 17, 1, 13, 21, 15, 2, 18, 13, 4, 11, 10, 1, 12, 13, 15, 1, 8, 16, 17, 30, 15, 14, 12, 19, 4, 1, 25, 4, 27, 9, 17, 18, 6, 1, 1, 1, 1, 4, 1, 4, 8, 2, 4]
      - author count = 48
      - mean source count per author = 24.645833333333332
      - mean shared count per author = 5.6875
      - mean article count per author = 9.541666666666666
    
      ==> Authors with shared sources
      - author ID list = [387, 2310, 394, 13, 3, 46, 23, 29, 30, 36, 425, 302, 178, 437, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 591, 336, 84, 598, 599, 217, 223, 937, 1655, 332, 505]
      - author source count list = [18, 2, 33, 36, 27, 57, 31, 50, 28, 31, 30, 31, 41, 45, 13, 43, 36, 92, 43, 37, 30, 46, 3, 76, 9, 64, 21, 50, 46, 18, 7, 6, 7, 18]
      - author shared count list = [7, 2, 2, 1, 9, 22, 12, 2, 2, 9, 2, 3, 6, 13, 5, 9, 10, 37, 19, 12, 10, 5, 1, 6, 2, 19, 5, 4, 13, 9, 7, 6, 1, 1]
      - author article count list = [7, 1, 8, 17, 13, 21, 15, 18, 13, 11, 10, 12, 13, 15, 8, 16, 17, 30, 15, 14, 12, 19, 4, 25, 4, 27, 9, 17, 18, 6, 1, 1, 4, 8]
      - author count = 34
      - mean source count per author with shared sources = 33.088235294117645
      - mean shared count per author with shared sources = 8.029411764705882
      - mean article count per author = 12.617647058823529
    
      ================================================================================
      Data for Coder index 2:
    
      ==> All authors
      - author ID list = [387, 2310, 2567, 394, 652, 13, 654, 3, 46, 23, 2004, 29, 30, 417, 36, 425, 2614, 302, 178, 437, 566, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 482, 591, 336, 84, 598, 599, 217, 223, 736, 2018, 743, 1782, 1655, 332, 505, 703, 637]
      - author source count list = [18, 2, 0, 27, 8, 39, 2, 29, 46, 33, 4, 50, 26, 4, 28, 31, 6, 31, 42, 49, 2, 15, 43, 34, 88, 45, 34, 28, 46, 4, 1, 72, 9, 69, 22, 46, 43, 13, 2, 5, 2, 4, 6, 7, 14, 2, 10]
      - author shared count list = [7, 2, 0, 2, 0, 1, 0, 12, 13, 11, 0, 0, 2, 0, 7, 3, 1, 4, 8, 10, 0, 7, 8, 9, 35, 19, 11, 10, 4, 1, 0, 6, 1, 20, 7, 3, 11, 8, 0, 0, 0, 0, 6, 1, 1, 0, 0]
      - author article count list = [7, 1, 1, 8, 5, 17, 1, 13, 20, 15, 2, 18, 13, 4, 11, 10, 1, 12, 13, 15, 1, 8, 16, 17, 30, 15, 14, 12, 19, 4, 1, 25, 4, 27, 9, 17, 18, 6, 1, 1, 1, 4, 1, 4, 8, 2, 4]
      - author count = 47
      - mean source count per author = 24.27659574468085
      - mean shared count per author = 5.340425531914893
      - mean article count per author = 9.702127659574469
    
      ==> Authors with shared sources
      - author ID list = [387, 2310, 394, 13, 3, 46, 23, 30, 36, 425, 2614, 302, 178, 437, 1082, 443, 377, 66, 69, 161, 73, 74, 460, 591, 336, 84, 598, 599, 217, 223, 1655, 332, 505]
      - author source count list = [18, 2, 27, 39, 29, 46, 33, 26, 28, 31, 6, 31, 42, 49, 15, 43, 34, 88, 45, 34, 28, 46, 4, 72, 9, 69, 22, 46, 43, 13, 6, 7, 14]
      - author shared count list = [7, 2, 2, 1, 12, 13, 11, 2, 7, 3, 1, 4, 8, 10, 7, 8, 9, 35, 19, 11, 10, 4, 1, 6, 1, 20, 7, 3, 11, 8, 6, 1, 1]
      - author article count list = [7, 1, 8, 17, 13, 20, 15, 13, 11, 10, 1, 12, 13, 15, 8, 16, 17, 30, 15, 14, 12, 19, 4, 25, 4, 27, 9, 17, 18, 6, 1, 4, 8]
      - author count = 33
      - mean source count per author with shared sources = 31.666666666666668
      - mean shared count per author with shared sources = 7.606060606060606
      - mean article count per author = 12.424242424242424
  • prelim_network (1 week): phd_work/results/network_person_info-prelim_network-1week.txt

      Processed 109 Articles.
      Processed 214 Article_Data records.
    
      ================================================================================
      Data for Coder index 1:
    
      ==> All authors
      - author ID list = [66, 387, 69, 73, 74, 23, 332, 13, 591, 336, 3, 84, 302, 599, 217, 29, 30, 223, 161, 36, 425, 46, 178, 937, 505, 443, 460, 394, 377]
      - author source count list = [17, 5, 23, 6, 16, 10, 3, 14, 23, 9, 6, 19, 15, 8, 11, 11, 10, 7, 16, 17, 5, 32, 13, 7, 4, 4, 0, 7, 2]
      - author shared count list = [12, 0, 6, 1, 1, 7, 0, 0, 1, 0, 0, 7, 2, 0, 1, 1, 0, 7, 9, 7, 2, 16, 0, 7, 0, 0, 0, 0, 0]
      - author article count list = [4, 2, 9, 2, 8, 3, 1, 5, 6, 3, 4, 6, 6, 4, 6, 4, 3, 1, 5, 5, 2, 11, 6, 1, 1, 1, 1, 1, 2]
      - author count = 29
      - mean source count per author = 11.03448275862069
      - mean shared count per author = 3.0
      - mean article count per author = 3.896551724137931
    
      ==> Authors with shared sources
      - author ID list = [66, 69, 73, 74, 23, 591, 84, 302, 217, 29, 223, 161, 36, 425, 46, 937]
      - author source count list = [17, 23, 6, 16, 10, 23, 19, 15, 11, 11, 7, 16, 17, 5, 32, 7]
      - author shared count list = [12, 6, 1, 1, 7, 1, 7, 2, 1, 1, 7, 9, 7, 2, 16, 7]
      - author article count list = [4, 9, 2, 8, 3, 6, 6, 6, 6, 4, 1, 5, 5, 2, 11, 1]
      - author count = 16
      - mean source count per author with shared sources = 14.6875
      - mean shared count per author with shared sources = 5.4375
      - mean article count per author = 4.9375
    
      ================================================================================
      Data for Coder index 2:
    
      ==> All authors
      - author ID list = [66, 387, 69, 73, 74, 23, 332, 13, 591, 336, 505, 3, 340, 46, 599, 217, 29, 30, 223, 84, 161, 36, 425, 566, 302, 178, 350, 758, 377, 443, 460, 394]
      - author source count list = [15, 5, 23, 6, 15, 10, 3, 14, 21, 9, 3, 5, 2, 23, 8, 10, 10, 9, 5, 17, 14, 14, 5, 1, 16, 12, 1, 1, 2, 4, 1, 6]
      - author shared count list = [10, 0, 5, 0, 0, 6, 0, 0, 1, 0, 0, 0, 0, 7, 0, 0, 0, 0, 5, 6, 7, 5, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0]
      - author article count list = [4, 2, 9, 2, 8, 3, 1, 5, 6, 3, 1, 4, 1, 10, 4, 6, 4, 3, 1, 6, 5, 5, 2, 1, 6, 6, 1, 1, 2, 1, 1, 1]
      - author count = 32
      - mean source count per author = 9.0625
      - mean shared count per author = 1.75
      - mean article count per author = 3.59375
    
      ==> Authors with shared sources
      - author ID list = [66, 69, 23, 591, 46, 223, 84, 161, 36, 425, 302]
      - author source count list = [15, 23, 10, 21, 23, 5, 17, 14, 14, 5, 16]
      - author shared count list = [10, 5, 6, 1, 7, 5, 6, 7, 5, 2, 2]
      - author article count list = [4, 9, 3, 6, 10, 1, 6, 5, 5, 2, 6]
      - author count = 11
      - mean source count per author with shared sources = 14.818181818181818
      - mean shared count per author with shared sources = 5.090909090909091
      - mean article count per author = 5.181818181818182

R network analysis

Notebooks:

TODO

TODO:

  • TK

DONE:

  • updated forms ArticleSelectForm and PersonSelectForm to include field for "coder_id_priority_list"/"person_coder_id_priority_list".
  • created method NetworkOutput.get_coder_id_list() that:

    • knows about the two places where coder IDs can be set.
    • if prioritized list is present:

      • starts with the prioritized list
      • appends coders from other field who aren't already in the list to the end of it.
      • stores the list in an instance variable inside the object so it can be retrieved easily.
  • updated NetworkOutput.create_query_set() to use get_coder_id_list() method.

  • need to update NetworkOutput.remove_duplicate_article_data() - it is where we choose which Article_Data to omit for an article that has duplicates. It needs to follow the order of the prioritized coder list. Might already do this... Nope.

    • get prioritized list.
    • for first instance of Article_Data for article, store it (related by id, or unique_identifier?)
    • on subsequent Article_Data for article, get index of coder for existing and new.
    • Keep whichever has the lower index.
    • Need to test

      • person-coded articles:

        • 1) output networks from initial prelim (12/6/2009-12/13/2009). Make sure they are the same now as they were then.
        • 2) keep article specs the same, but change person lookup to use ordered ID list. See if this is the same as the files in 1 (it is!). NO NEED: If not, count rows, find and compare rows for some users to see how different they are. Hopefully same contents, different order...?
        • 3) then, switch the article specs to use ordered list and put person specs back to old way, see how this file compares to the others. 3 lines different.
        • look for differences in:

          • number of rows
          • contents of rows
          • IDs of those included
      • automated coder:

        • 1) output networks from initial prelim (12/6/2009-12/13/2009). Make sure they are the same now as they were then.
        • 2) keep article specs the same, but change person lookup to use ordered ID list. See if this is the same as the files in 1 (might not be). If not, count rows, find and compare rows for some users to see how different they are. Hopefully same contents, different order.
        • 3) then, switch the article specs to use ordered list, see how this file compares to the others.
      • as long as the tests above check out, then try out the whole month, with prioritized coder list.

  • need to update NetworkDataOutput and children? Looks like no - all comes down to the remove_duplicate_article_data().

  • make sure the network code can deal with multiple coders, and can prioritize in an order I specify.
  • figure out the old network analysis stuff.

    • get context_text/examples/analysis/analysis-person_info.py moved into context_analysis and working with new index ordering code (might be enough to just extend Reliability_Names_Builder for init and for index specification, then override process_articles()).
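
The coder-priority logic described in the DONE notes above (build one ordered coder ID list, then keep the Article_Data whose coder has the lowest index in that list) can be sketched as follows. This is a minimal illustration, not the actual NetworkOutput code: the function names merge_coder_id_lists and keep_highest_priority, the tuple layout standing in for Article_Data records, and the sample IDs are all assumptions made for the example.

```python
def merge_coder_id_lists(priority_ids, other_ids):
    """Start with the prioritized list, then append any coder IDs from
    the other field that are not already present (hypothetical helper)."""
    merged = []
    for coder_id in list(priority_ids) + list(other_ids):
        if coder_id not in merged:
            merged.append(coder_id)
    return merged


def keep_highest_priority(article_data_rows, coder_id_list):
    """For each article, keep the record whose coder appears earliest
    in coder_id_list (lower index wins).

    article_data_rows: iterable of (article_id, coder_id, article_data_id)
    tuples, a hypothetical stand-in for Article_Data instances.
    Returns a dict mapping article_id to the kept article_data_id.
    """
    kept = {}
    for article_id, coder_id, article_data_id in article_data_rows:
        if coder_id not in coder_id_list:
            # coder is not part of this run, so skip the record.
            continue
        new_index = coder_id_list.index(coder_id)
        existing = kept.get(article_id)
        if existing is None or new_index < existing[0]:
            kept[article_id] = (new_index, article_data_id)
    return {
        article_id: data_id
        for article_id, (index, data_id) in kept.items()
    }


# made-up IDs, for illustration only
coder_ids = merge_coder_id_lists([8, 9], [10, 9, 8])
rows = [(101, 10, 1), (101, 8, 2), (102, 9, 3), (102, 10, 4)]
deduped = keep_highest_priority(rows, coder_ids)
```

Here coder 8 outranks coder 10 for article 101 and coder 9 outranks coder 10 for article 102, so one record per article survives regardless of the order in which the records are encountered.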