sourcenet-to-context-dev
This notebook contains the development and testing code for moving data from the sourcenet project, in context_text, to the context core.
Moved dev work for exporting network data from context to ./render-context-networks-dev.ipynb.
Notes:
Questions:
Do we want an Identifier Type separate from Entity and Relation identifiers? I think we do, so we can specify the entity type(s) a given identifier should be used on.
In [1]:
me = "sourcenet-to-context-dev"
In [2]:
debug_flag = False
In [3]:
import datetime
from django.db.models import Avg, Max, Min, Q
from django.utils.text import slugify
import json
import logging
import six
In [4]:
%pwd
Out[4]:
In [5]:
# current working folder
current_working_folder = "/home/jonathanmorgan/work/django/research/work/phd_work/analysis"
current_datetime = datetime.datetime.now()
current_date_string = current_datetime.strftime( "%Y-%m-%d-%H-%M-%S" )
Configure logging for this notebook's kernel. (If you do not run this cell, you'll get the django application's logging configuration.)
In [6]:
logging_file_name = "{}/logs/{}-{}.log.txt".format( current_working_folder, me, current_date_string )
logging.basicConfig(
    level = logging.DEBUG,
    format = '%(asctime)s - %(levelname)s - %(name)s - %(message)s',
    filename = logging_file_name,
    filemode = 'w' # set to 'a' if you want to append, rather than overwrite each time.
)
print( "Logging initialized, to {}".format( logging_file_name ) )
If you are using a virtualenv, make sure that it is active inside this notebook's kernel.
Since I use a virtualenv, I need to get it activated somehow inside this notebook. One option is to run ../dev/wsgi.py in this notebook, to configure the python environment manually as if you had activated the sourcenet virtualenv. To do this, you'd make a code cell that contains:
%run ../dev/wsgi.py
This is sketchy, however, because it changes the Python environment of whatever kernel is currently running. I'd worry about collisions with the actual Python 3 kernel. A better option is to install the virtualenv as a separate kernel. Steps:
activate your virtualenv:
workon research
in your virtualenv, install the package ipykernel.
pip install ipykernel
use the ipykernel python program to install the current environment as a kernel:
python -m ipykernel install --user --name <env_name> --display-name "<display_name>"
sourcenet example:
python -m ipykernel install --user --name sourcenet --display-name "research (Python 3)"
More details: http://ipython.readthedocs.io/en/stable/install/kernel_install.html
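Once the notebook is switched to the new kernel, a quick check can confirm that it is really running inside the expected environment. The sketch below uses only the standard library; the function name describe_python_environment is hypothetical, not part of sourcenet or context.

```python
import sys


def describe_python_environment():
    """Return basic facts about the interpreter running this kernel."""
    # venv and recent virtualenv releases set sys.base_prefix to the parent
    # interpreter; older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr( sys, "base_prefix", sys.prefix )
    has_real_prefix = hasattr( sys, "real_prefix" )
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": has_real_prefix or ( sys.prefix != base_prefix ),
    }


for key, value in describe_python_environment().items():
    print( "{}: {}".format( key, value ) )
```

If "executable" does not point inside the virtualenv's folder, the notebook is probably still attached to the default kernel.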
First, initialize my dev django project, so I can run code in this notebook that references my django models and can talk to the database using my project's settings.
In [7]:
# init django
django_init_folder = "/home/jonathanmorgan/work/django/research/work/phd_work"
django_init_path = "django_init.py"
if( ( django_init_folder is not None ) and ( django_init_folder != "" ) ):
    # add folder to front of path.
    django_init_path = "{}/{}".format( django_init_folder, django_init_path )
#-- END check to see if django_init folder. --#
In [8]:
%run $django_init_path
In [9]:
# context imports
from context.export.network.filter_spec import FilterSpec
from context.export.network.network_data_request import NetworkDataRequest
from context.models import Entity
from context.models import Entity_Identifier_Type
from context.models import Entity_Identifier
from context.models import Entity_Relation
from context.models import Entity_Type
from context.tests.export.network.test_helper import TestHelper
# context_text imports
from context_text.article_coding.article_coding import ArticleCoder
from context_text.article_coding.article_coding import ArticleCoding
from context_text.article_coding.open_calais_v2.open_calais_v2_article_coder import OpenCalaisV2ArticleCoder
from context_text.collectors.newsbank.newspapers.GRPB import GRPB
from context_text.collectors.newsbank.newspapers.DTNB import DTNB
from context_text.export.to_context_base.export_to_context import ExportToContext
from context_text.models import Article
from context_text.models import Article_Subject
from context_text.models import Newspaper
from context_text.shared.context_text_base import ContextTextBase
In [10]:
my_exporter = ExportToContext()
# no variables to set, yet...
my_exporter.set_article_uuid_id_type_name( ExportToContext.ENTITY_ID_TYPE_ARTICLE_NEWSBANK_ID )
Out[10]:
Create a LoggingHelper instance to use to log debug and also print at the same time.
Preconditions: Must be run after Django is initialized, since python_utilities
is in the django path.
In [11]:
# python_utilities
from python_utilities.logging.logging_helper import LoggingHelper
# init
my_logging_helper = LoggingHelper()
my_logging_helper.set_logger_name( me )
log_message = None
Set up the following entity and relation types in context:
Entity_Identifier_Types:
general:
permalink (fixture ID 5) - CONTEXT_ENTITY_ID_TYPE_PERMALINK
article:
article_archive_identifier (fixture ID 6) - CONTEXT_ENTITY_ID_TYPE_ARTICLE_ARCHIVE_IDENTIFIER
article_sourcenet_id (fixture ID 3) - CONTEXT_ENTITY_ID_TYPE_ARTICLE_SOURCENET_ID
article_newsbank_id (fixture ID 4) - CONTEXT_ENTITY_ID_TYPE_ARTICLE_NEWSBANK_ID
newspaper:
newspaper_sourcenet_id (fixture ID 7) - CONTEXT_ENTITY_ID_TYPE_NEWSPAPER_SOURCENET_ID
newspaper_newsbank_code (fixture ID 8) - CONTEXT_ENTITY_ID_TYPE_NEWSPAPER_NEWSBANK_CODE
organization:
organization_sourcenet_id (fixture ID 9) - CONTEXT_ENTITY_ID_TYPE_ORGANIZATION_SOURCENET_ID
person:
person_sourcenet_id (fixture ID 1) - CONTEXT_ENTITY_ID_TYPE_NAME_PERSON_SOURCENET_ID
person_open_calais_uuid (fixture ID 2) - CONTEXT_ENTITY_ID_TYPE_NAME_PERSON_OPEN_CALAIS_UUID
Entities:
Relations:
from newspaper
from article
through article
Then, export them to JSON fixture files using manage.py / django-admin dumpdata ( https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-dumpdata ) so they can be imported using python manage.py or django-admin loaddata ( https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-loaddata ) rather than having to input them in the admin:
python manage.py dumpdata [app_label[.ModelName] [app_label[.ModelName] ...]] --indent INDENT --output <output_file_path>
python manage.py dumpdata \
--indent 4 \
--output context-sourcenet_entity_and_relation_types.json \
context.Entity_Identifier_Type \
context.Entity_Relation_Type \
context.Entity_Relation_Type_Trait \
context.Entity_Type \
context.Entity_Type_Trait \
context.Trait_Type \
context.Term \
context.Term_Relation \
context.Term_Relation_Type \
context.Vocabulary
No line breaks:
python manage.py dumpdata --indent 4 --output context-sourcenet_entity_and_relation_types.json context.Entity_Identifier_Type context.Entity_Relation_Type context.Entity_Relation_Type_Trait context.Entity_Type context.Entity_Type_Trait context.Trait_Type context.Term context.Term_Relation context.Term_Relation_Type context.Vocabulary
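Since the same long list of model labels has to be typed twice (once with line breaks, once without), it can be less error-prone to assemble the command from a list. This is a minimal sketch using only the standard library; build_dumpdata_command is a hypothetical helper name, and the only flags used are --indent and --output, as in the commands above.

```python
import shlex


def build_dumpdata_command( output_path, model_labels, indent = 4 ):
    """Assemble the dumpdata argument list for the given model labels."""
    command = [ "python", "manage.py", "dumpdata" ]
    command.extend( [ "--indent", str( indent ) ] )
    command.extend( [ "--output", output_path ] )
    command.extend( model_labels )
    return command


# model labels copied from the command above.
type_model_labels = [
    "context.Entity_Identifier_Type",
    "context.Entity_Relation_Type",
    "context.Entity_Relation_Type_Trait",
    "context.Entity_Type",
    "context.Entity_Type_Trait",
    "context.Trait_Type",
    "context.Term",
    "context.Term_Relation",
    "context.Term_Relation_Type",
    "context.Vocabulary",
]

command = build_dumpdata_command( "context-sourcenet_entity_and_relation_types.json", type_model_labels )

# single-line form, safely quoted for pasting into a shell.
command_string = " ".join( shlex.quote( argument ) for argument in command )
print( command_string )
```

The list form can also be passed directly to subprocess.run(), which avoids shell quoting altogether.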
Seed entities and relations for sourcenet from fixture file created above, stored as a fixture in context.
Example JSON:
{
"model": "context.entity_identifier_type",
"pk": 8,
"fields": {
"create_date": "2019-11-01T15:27:29.526Z",
"last_modified": "2019-11-01T15:27:29.526Z",
"name": "newspaper_newsbank_code",
"label": "newspaper_newsbank_code",
"source": "Newsbank",
"notes": "3-letter code assigned to a given paper by NewsBank.",
"type_list": [
7
]
}
},
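A Django fixture file is a JSON list of records shaped like the one above, each with "model", "pk", and "fields". Before seeding anything, it can help to summarize what a fixture contains. The sketch below is illustrative only: summarize_fixture is a hypothetical helper, and the one-record fixture is abridged from the example above.

```python
import json
from collections import Counter

# one-record fixture, abridged from the example above and wrapped in a list.
fixture_text = """
[
    {
        "model": "context.entity_identifier_type",
        "pk": 8,
        "fields": {
            "name": "newspaper_newsbank_code",
            "label": "newspaper_newsbank_code",
            "source": "Newsbank",
            "notes": "3-letter code assigned to a given paper by NewsBank.",
            "type_list": [ 7 ]
        }
    }
]
"""


def summarize_fixture( fixture_records ):
    """Count records per model and collect the name of each record that has one."""
    model_counts = Counter()
    names_by_model = {}
    for record in fixture_records:
        model = record.get( "model" )
        model_counts[ model ] += 1
        name = record.get( "fields", {} ).get( "name" )
        if name is not None:
            names_by_model.setdefault( model, [] ).append( name )
    return model_counts, names_by_model


fixture_records = json.loads( fixture_text )
model_counts, names_by_model = summarize_fixture( fixture_records )
for model, count in model_counts.items():
    print( "{}: {} record(s), names = {}".format( model, count, names_by_model.get( model, [] ) ) )
```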
In [ ]:
# declare variables
path_to_context_fixture = None
json_file = None
fixture_json = None
# read in the fixture file.
path_to_context_fixture = "/home/jonathanmorgan/work/django/research/context/fixtures/context-sourcenet_entities_and_relations.json"
# load the fixture JSON into memory.
with open( path_to_context_fixture ) as json_file:
    # parse the JSON file.
    fixture_json = json.load( json_file )
#-- END open of JSON file to read it into memory. --#
In [ ]:
print( json.dumps( fixture_json, sort_keys = True, indent = 4 ) )
In [ ]:
# declare variables
fixture_item = None
item_model = None
item_fields = None
# declare variables - Entity_Identifier_Type
id_type_model = None
id_type_qs = None
id_type_count = None
id_type_instance = None
id_type_name = None
id_type_label = None
id_type_source = None
id_type_notes = None
id_type_type_list = None
# declare variables - associated Entity_Type s
entity_type_id = None
entity_type_qs = None
entity_type_count = None
entity_type_instance = None
# init
id_type_model = "context.entity_identifier_type"
# create or update Entity_Identifier_Types
# loop over items in fixture JSON
for fixture_item in fixture_json:
    # get model, so we can see what we do.
    item_model = fixture_item.get( "model", None )
    # is it "context.entity_identifier_type"?
    if ( item_model == id_type_model ):
        # it is an identifier type. Retrieve fields.
        item_fields = fixture_item.get( "fields", None )
        id_type_name = item_fields.get( "name", None )
        # do we have a name? (Better have one...)
        if ( id_type_name is not None ):
            print( "\nEntity_Identifier_Type: {}".format( id_type_name ) )
            # does it already exist?
            id_type_qs = Entity_Identifier_Type.objects.all()
            id_type_qs = id_type_qs.filter( name__iexact = id_type_name )
            id_type_count = id_type_qs.count()
            # how many matches
            if ( id_type_count == 0 ):
                # it does not exist. Create.
                # retrieve values
                id_type_label = item_fields.get( "label", None )
                id_type_source = item_fields.get( "source", None )
                id_type_notes = item_fields.get( "notes", None )
                id_type_type_list = item_fields.get( "type_list", [] )
                # store values and save.
                id_type_instance = Entity_Identifier_Type()
                id_type_instance.name = id_type_name
                id_type_instance.label = id_type_label
                id_type_instance.source = id_type_source
                id_type_instance.notes = id_type_notes
                id_type_instance.save()
                # add related entity types.
                for entity_type_id in id_type_type_list:
                    # look up the Entity_Type
                    entity_type_qs = Entity_Type.objects.all()
                    entity_type_qs = entity_type_qs.filter( pk = entity_type_id )
                    entity_type_count = entity_type_qs.count()
                    # how many matches?
                    if ( entity_type_count == 1 ):
                        # one. Associate it.
                        entity_type_instance = entity_type_qs.get()
                        id_type_instance.type_list.add( entity_type_instance )
                    elif ( entity_type_count == 0 ):
                        # no match - move on.
                        print( "no matching Entity_Type for id {}, moving on.".format( entity_type_id ) )
                    elif ( entity_type_count > 1 ):
                        # multiple pk matches - ERROR.
                        print( "ERROR - multiple matching Entity_Type for id {}, moving on.".format( entity_type_id ) )
                    else:
                        # unexpected count (should not be reachable) - ERROR.
                        print( "ERROR - unexpected Entity_Type match count ( {} ) for id {}, moving on.".format( entity_type_count, entity_type_id ) )
                    #-- END check to see if entity type found --#
                #-- END loop over related entity types. --#
                print( "----> ADDED Entity_Identifier_Type for name {}.\n==> JSON:\n{}\n==> instance:\n{}".format( id_type_name, fixture_item, id_type_instance ) )
            elif ( id_type_count == 1 ):
                # exists. Update?
                id_type_instance = id_type_qs.get()
                print( "Entity_Identifier_Type for name {} already exists.\n==> JSON:\n{}\n==> instance:\n{}".format( id_type_name, fixture_item, id_type_instance ) )
            elif ( id_type_count > 1 ):
                # error.
                print( "ERROR - more than one type match ({}) for name {}".format( id_type_count, id_type_name ) )
            else:
                # unexpected error.
                print( "ERROR - name {} does not have 0, 1, or > 1 matches (count = {}).".format( id_type_name, id_type_count ) )
            #-- END check to see how many matches --#
        else:
            # ERROR
            print( "ERROR - no name for Entity_Identifier_Type." )
        #-- END check to see if name. --#
    #-- END check to see if Entity_Identifier_Type --#
#-- END loop over fixture items. --#
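The decision logic in the loop above (create when there are 0 matches, leave alone when there is 1, report an error otherwise) can be tried out without touching the database. The sketch below is a dry run in plain Python, not the exporter's actual code: plan_identifier_type_actions is a hypothetical helper, and the case-insensitive comparison is only an approximation of the name__iexact lookup.

```python
def plan_identifier_type_actions( fixture_json, existing_names ):
    """Sort fixture identifier types into create / exists / error buckets."""
    actions = { "create": [], "exists": [], "error": [] }
    existing_lower = [ name.lower() for name in existing_names ]
    for fixture_item in fixture_json:
        # only look at Entity_Identifier_Type records.
        if fixture_item.get( "model" ) != "context.entity_identifier_type":
            continue
        name = fixture_item.get( "fields", {} ).get( "name" )
        if name is None:
            actions[ "error" ].append( "no name (pk = {})".format( fixture_item.get( "pk" ) ) )
            continue
        # case-insensitive match count, standing in for name__iexact.
        match_count = existing_lower.count( name.lower() )
        if match_count == 0:
            actions[ "create" ].append( name )
        elif match_count == 1:
            actions[ "exists" ].append( name )
        else:
            actions[ "error" ].append( "{} matches for name {}".format( match_count, name ) )
    return actions


# made-up sample data, for illustration only.
sample_fixture = [
    { "model": "context.entity_identifier_type", "pk": 8, "fields": { "name": "newspaper_newsbank_code" } },
    { "model": "context.entity_identifier_type", "pk": 5, "fields": { "name": "permalink" } },
    { "model": "context.entity_identifier_type", "pk": 99, "fields": {} },
    { "model": "context.entity_type", "pk": 7, "fields": { "name": "newspaper" } },
]
planned_actions = plan_identifier_type_actions( sample_fixture, [ "Permalink" ] )
print( planned_actions )
```

Running a plan like this first makes it easier to see what the real loop would create before any rows are saved.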
logic:
for each:
Article processing:
Person processing:
context as entities of type "person", with identifier of type person_sourcenet_id set to their internal django/database "id", and with identifier of type person_open_calais_uuid set to their OpenCalais ID, if they have one. If they have any other types of IDs, add them too, untyped. Once entity is created, store ForeignKey of entity in Person record.
Author/Reporter processing:
Subject/Source processing:
Relations - create the following:
from newspaper
from article
through article
Relations broken out by person type:
For all reporters/authors:
"author" between the article's entity (FROM) and the entity of the person (TO) for each author.
"shared_byline" between the two (it is undirected), THROUGH the article.
For all subjects, including sources:
"subject" between the article's entity (FROM) and the entity of the subject person (TO).
"same_article_subjects" between the two (it is undirected), THROUGH the article.
"mentioned" between each of the article's authors (FROM) and the subject (TO), THROUGH the article.
For all sources:
"source" between the article's entity and the entity for the source person.
"same_article_sources" between the two (it is undirected), THROUGH the article.
"quoted" between each of the article's authors (FROM) and the source (TO), THROUGH the article.
Convenience methods:
Notes:
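The relation rules listed under "Relations broken out by person type" can be sketched as a function that takes one article and its people and returns the relations to create. This is a sketch under stated assumptions, not the ExportToContext implementation: build_article_relations is a hypothetical name, entities are represented by plain strings, "between the two" is read as "between each pair of people of that kind in the article", and undirected relations simply store the pair in list order.

```python
from itertools import combinations


def build_article_relations( article, authors, subjects, sources ):
    """Return ( relation_type, from, to, through ) tuples for one article."""
    # sources are also subjects ("For all subjects, including sources").
    all_subjects = list( subjects )
    all_subjects.extend( source for source in sources if source not in all_subjects )
    relations = []
    # --- reporters/authors ---
    for author in authors:
        relations.append( ( "author", article, author, None ) )
    for first, second in combinations( authors, 2 ):
        relations.append( ( "shared_byline", first, second, article ) )
    # --- subjects, including sources ---
    for subject in all_subjects:
        relations.append( ( "subject", article, subject, None ) )
        for author in authors:
            relations.append( ( "mentioned", author, subject, article ) )
    for first, second in combinations( all_subjects, 2 ):
        relations.append( ( "same_article_subjects", first, second, article ) )
    # --- sources only ---
    for source in sources:
        relations.append( ( "source", article, source, None ) )
        for author in authors:
            relations.append( ( "quoted", author, source, article ) )
    for first, second in combinations( sources, 2 ):
        relations.append( ( "same_article_sources", first, second, article ) )
    return relations


# made-up identifiers, for illustration only.
example_relations = build_article_relations(
    "article_1",
    authors = [ "author_1", "author_2" ],
    subjects = [ "subject_1" ],
    sources = [ "source_1" ],
)
for relation in example_relations:
    print( relation )
```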
Before we do anything else, need to be able to pull back all the articles whose data we want to load into the context store.
In [12]:
# look for publications that have article data:
# - coded by automated coder
# - with coder type of "OpenCalais_REST_API_v2"
# get automated coder
automated_coder_user = ContextTextBase.get_automated_coding_user()
log_message = "{} - Loaded automated user: {}, id = {}".format( datetime.datetime.now(), automated_coder_user, automated_coder_user.id )
my_logging_helper.output_message( log_message, do_print_IN = True, log_level_code_IN = logging.INFO )
In [13]:
# find articles with Article_Data created by the automated user...
article_qs = Article.objects.filter( article_data__coder = automated_coder_user )
# ...and specifically coded using OpenCalais V2...
article_qs = article_qs.filter( article_data__coder_type = OpenCalaisV2ArticleCoder.CONFIG_APPLICATION )
# ...and finally, we just want the distinct articles by ID.
article_qs = article_qs.order_by( "id" ).distinct( "id" )
# count?
article_count = article_qs.count()
log_message = "Found {} articles".format( article_count )
my_logging_helper.output_message( log_message, do_print_IN = True, log_level_code_IN = logging.INFO )
First, make unit tests to test convenience methods added to the following models, in context/tests/models/:
Entity_Identifier_Type - test_Entity_Identifier_Type_model.py
Entity_Identifier - test_Entity_Identifier_model.py
Entity - test_Entity_model.py
Also, move instance creation class methods along with their constants over into "TestHelper" from test_Entity_Identifier_model.py, so they can be re-used across test classes.
To run: python manage.py test context.tests
In test data:
article 21925:
SELECT * FROM context_text_article_data WHERE article_id = 21925;
TODO:
test_entity_model.py
test_export_to_context.py
In [14]:
result = my_exporter.process_articles( article_qs )
Now, we need to filter data in context, so we can test to make sure the data we expect was added.
The first step is to translate the filter criteria for nodes and ties from the existing admin screens into queries against context.
Configuration of Network Builder, from methods-network_analysis-create_network_data.ipynb:
Configuration to generate network files for prelim:
Config of "Select Articles" - fields in bold need to be changed from default values:
Start date (YYYY-MM-DD): 2009-12-01
End date (YYYY-MM-DD): 2009-12-31
Fancy date range: Empty.
Publications: "Grand Rapids Press, The"
Coders: None selected.
Coder IDs to include, in order of highest to lowest priority:
if automated: Article_Data coder_type Filter Type and coder_type 'Value In' List (comma-delimited): use the coder_type filter fields to filter automatically coded Article_Data on coder type if you have tried different automated coder types:
Article_Data coder_type Filter Type: Just automated
coder_type 'Value In' List (comma-delimited): Enter the coder types you want included. Examples:
Topics: None selected.
Article Tag List (comma-delimited): "grp_month"
Unique Identifier List (comma-delimited): Empty.
Allow duplicate articles: "No"
Configure "Network Settings" - fields in bold need to be changed from default values:
relations - Include source contact types: All selected.
relations - Include source capacities: None selected.
relations - Exclude source capacities: None selected.
Download as File? "Yes"
Include render details? "No"
Data Format: "Tab-Delimited Matrix"
Data Output Type: "Network + Attribute Columns"
Network Label: Empty.
Include Headers: "Yes"
Config of "Select People" - fields in bold need to be changed from default values:
Person Query Type: "Custom, defined below"
People from (YYYY-MM-DD): 2009-12-01
People to (YYYY-MM-DD): 2009-12-31
Fancy person date range: Empty.
Person publications: "Grand Rapids Press, The"
Person coders: "automated", "minnesota1", "minnesota2", "minnesota3", "ground_truth"
Coder IDs to include, in order of highest to lowest priority: Empty.
Article_Data coder_type Filter Type and coder_type 'Value In' List (comma-delimited): use the coder_type filter fields to filter automatically coded Article_Data on coder type if you have tried different automated coder types:
Article_Data coder_type Filter Type: Just automated
coder_type 'Value In' List (comma-delimited): Enter the coder types you want included. Examples:
Person Topics: None
Article Tag List (comma-delimited): "grp_month"
Unique Identifier List (comma-delimited): Empty.
Person allow duplicate articles: "Yes"
Below is a JSON file that is just the automated coding for the month from 2009-12-01 through 2009-12-31. Just about all of the complexity of the original screens is possible here, as long as you loaded the entities and ties and all of the needed traits, including some way of adding tags...
{
"output_specification": {
"output_type": "file",
"output_file_path": "./NetworkDataRequest_test_output.txt",
"output_format": "TSV_matrix",
"output_structure": "both_trait_columns",
"output_include_column_headers": true
},
"relation_selection": {
"relation_type_slug_filter_combine_type": "AND",
"relation_type_slug_filters": [
{
"comparison_type": "includes",
"value_list": [ "mentioned", "quoted", "shared_byline" ]
}
],
"relation_trait_filter_combine_type": "AND",
"relation_trait_filters": [
{
"name": "pub_date",
"data_type": "date",
"comparison_type": "in_range",
"value_from": "2009-12-01",
"value_to": "2009-12-31"
},
{
"name": "sourcenet-coder-User-username",
"data_type": "string",
"comparison_type": "includes",
"value_list": [ "automated" ]
},
{
"name": "coder_type",
"data_type": "string",
"comparison_type": "includes",
"value_list": [ "OpenCalais_REST_API_v2" ]
}
],
"entity_type_slug_filter_combine_type": "AND",
"entity_type_slug_filters": [
{
"comparison_type": "includes",
"value_list": [ "person" ],
"relation_roles_list": [ "FROM" ]
},
{
"comparison_type": "includes",
"value_list": [ "person" ],
"relation_roles_list": [ "TO" ]
},
{
"comparison_type": "includes",
"value_list": [ "article" ],
"relation_roles_list": [ "THROUGH" ]
}
],
"entity_trait_filter_combine_type": "AND",
"entity_trait_filters": [
{
"name": "sourcenet-Newspaper-ID",
"data_type": "int",
"comparison_type": "includes",
"value_list": [ 1 ],
"relation_roles_list": [ "THROUGH" ]
}
]
}
}
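To make the meaning of the "relation_trait_filters" section concrete, the sketch below applies filters shaped like the ones above to a plain dictionary of trait values. It is not how NetworkDataRequest implements filtering. The helper names are hypothetical, "in_range" is assumed to be inclusive at both ends, and any combine type other than "AND" is assumed to mean "OR".

```python
import datetime


def trait_matches( filter_spec, trait_values ):
    """Check one trait filter against a dict of trait name --> value."""
    value = trait_values.get( filter_spec[ "name" ] )
    if value is None:
        return False
    comparison_type = filter_spec[ "comparison_type" ]
    if comparison_type == "includes":
        return value in filter_spec[ "value_list" ]
    if comparison_type == "in_range":
        value_from = filter_spec[ "value_from" ]
        value_to = filter_spec[ "value_to" ]
        if filter_spec.get( "data_type" ) == "date":
            # compare as dates, not as strings.
            parse = datetime.date.fromisoformat
            return parse( value_from ) <= parse( value ) <= parse( value_to )
        return value_from <= value <= value_to
    raise ValueError( "unknown comparison_type: {}".format( comparison_type ) )


def relation_matches( filter_list, combine_type, trait_values ):
    """Combine the individual filter results using AND (all) or OR (any)."""
    results = [ trait_matches( filter_spec, trait_values ) for filter_spec in filter_list ]
    return all( results ) if combine_type == "AND" else any( results )


# filters copied from the "relation_trait_filters" section above.
relation_trait_filters = [
    { "name": "pub_date", "data_type": "date", "comparison_type": "in_range", "value_from": "2009-12-01", "value_to": "2009-12-31" },
    { "name": "sourcenet-coder-User-username", "data_type": "string", "comparison_type": "includes", "value_list": [ "automated" ] },
    { "name": "coder_type", "data_type": "string", "comparison_type": "includes", "value_list": [ "OpenCalais_REST_API_v2" ] },
]

# made-up trait values for two relations.
december_relation = { "pub_date": "2009-12-15", "sourcenet-coder-User-username": "automated", "coder_type": "OpenCalais_REST_API_v2" }
january_relation = { "pub_date": "2010-01-01", "sourcenet-coder-User-username": "automated", "coder_type": "OpenCalais_REST_API_v2" }

print( relation_matches( relation_trait_filters, "AND", december_relation ) )
print( relation_matches( relation_trait_filters, "AND", january_relation ) )
```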
To start, try making a fixture of all of the context app from the temp database. This should be entities and ties for the 50 or so articles we coded there.
This includes contents of file "context-sourcenet_entity_and_relation_types.json", so you can just load this one file.
Export to JSON fixture files using manage.py / django-admin dumpdata ( https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-dumpdata ) so they can be imported using python manage.py or django-admin loaddata ( https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-loaddata ) rather than having to input them in the admin:
python manage.py dumpdata [app_label[.ModelName] [app_label[.ModelName] ...]] --indent INDENT --output <output_file_path>
python manage.py dumpdata \
--indent 4 \
--output context-sourcenet_entities_and_relations-full.json \
context
No line breaks:
python manage.py dumpdata --indent 4 --output context-sourcenet_entities_and_relations-full.json context
This excludes contents of file "context-sourcenet_entity_and_relation_types.json", so load that first, then this.
Export to JSON fixture files using manage.py / django-admin dumpdata ( https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-dumpdata ) so they can be imported using python manage.py or django-admin loaddata ( https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-loaddata ) rather than having to input them in the admin:
python manage.py dumpdata [app_label[.ModelName] [app_label[.ModelName] ...]] --indent INDENT --output <output_file_path>
python manage.py dumpdata \
--indent 4 \
--output context-sourcenet_entities_and_relations.json \
--exclude context.Entity_Identifier_Type \
--exclude context.Entity_Relation_Type \
--exclude context.Entity_Relation_Type_Trait \
--exclude context.Entity_Type \
--exclude context.Entity_Type_Trait \
--exclude context.Trait_Type \
--exclude context.Term \
--exclude context.Term_Relation \
--exclude context.Term_Relation_Type \
--exclude context.Vocabulary \
context
No line breaks:
python manage.py dumpdata --indent 4 --output context-sourcenet_entities_and_relations.json --exclude context.Entity_Identifier_Type --exclude context.Entity_Relation_Type --exclude context.Entity_Relation_Type_Trait --exclude context.Entity_Type --exclude context.Entity_Type_Trait --exclude context.Trait_Type --exclude context.Term --exclude context.Term_Relation --exclude context.Term_Relation_Type --exclude context.Vocabulary context
For each type of filter, need to prototype the code to implement the filter.
Figure out which relation roles we are focusing on (combination of FROM, TO, THROUGH), then build out query to properly filter.
In [26]:
# try to just find the entity with the desired identifier.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
entity_qs = entity_qs.filter( entity_identifier__name = "person_sourcenet_id" )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
entity_qs = entity_qs.filter( entity_identifier__uuid = "202" )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
In [27]:
# try to just find the entity with the desired identifier.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
combined_q = Q( entity_identifier__name = "person_sourcenet_id" ) & Q( entity_identifier__uuid = "202" )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
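The two cells above differ in one way that matters for a multi-valued relationship such as entity_identifier. Django's documentation states that conditions inside a single filter() call must be satisfied by the same related row, while conditions in successive filter() calls may be satisfied by different related rows. The in-memory sketch below illustrates that difference with made-up data; it does not run any queries, and the function names are hypothetical.

```python
# each entity has a list of ( identifier name, identifier uuid ) rows (made-up data).
entities = {
    "entity_a": [ ( "person_sourcenet_id", "202" ) ],
    "entity_b": [ ( "person_sourcenet_id", "999" ), ( "person_open_calais_uuid", "202" ) ],
}


def match_single_filter( identifiers, name, uuid ):
    """One filter() call: name and uuid must match on the same identifier row."""
    return any( ( id_name == name ) and ( id_uuid == uuid ) for id_name, id_uuid in identifiers )


def match_chained_filters( identifiers, name, uuid ):
    """Chained filter() calls: each condition may be met by a different row."""
    has_name = any( id_name == name for id_name, _ in identifiers )
    has_uuid = any( id_uuid == uuid for _, id_uuid in identifiers )
    return has_name and has_uuid


single_matches = [ key for key, ids in entities.items() if match_single_filter( ids, "person_sourcenet_id", "202" ) ]
chained_matches = [ key for key, ids in entities.items() if match_chained_filters( ids, "person_sourcenet_id", "202" ) ]
print( "single filter: {}".format( single_matches ) )
print( "chained filters: {}".format( chained_matches ) )
```

Here entity_b only matches the chained form, because its "202" belongs to a different identifier type. For looking up one specific identifier, the combined Q in a single filter() call is the safer form.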
In [28]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# try to find relations where the FROM entity has identifier name "person_sourcenet_id" and id 202.
from_q = Q( relation_from__entity_identifier__name = "person_sourcenet_id" ) & Q( relation_from__entity_identifier__uuid = "202" )
relation_qs = relation_qs.filter( from_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [29]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# try to find relations where the TO entity has identifier name "person_sourcenet_id" and id 202.
to_q = Q( relation_to__entity_identifier__name = "person_sourcenet_id" ) & Q( relation_to__entity_identifier__uuid = "202" )
relation_qs = relation_qs.filter( to_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [30]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# try to find relations where the THROUGH entity has identifier name "person_sourcenet_id" and id 202.
through_q = Q( relation_through__entity_identifier__name = "person_sourcenet_id" ) & Q( relation_through__entity_identifier__uuid = "202" )
relation_qs = relation_qs.filter( through_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [31]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# try to find relations to an Entity in any relation role (FROM, TO, THROUGH), with identifier name "`person_sourcenet_id`" and id 202.
from_q = Q( relation_from__entity_identifier__name = "person_sourcenet_id" ) & Q( relation_from__entity_identifier__uuid = "202" )
to_q = Q( relation_to__entity_identifier__name = "person_sourcenet_id" ) & Q( relation_to__entity_identifier__uuid = "202" )
through_q = Q( relation_through__entity_identifier__name = "person_sourcenet_id" ) & Q( relation_through__entity_identifier__uuid = "202" )
combined_q = from_q | to_q | through_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
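The three Q objects above follow one pattern: the same two identifier lookups, prefixed by the field for each relation role. A small helper can generate the lookup keyword arguments for whichever roles are of interest. This is a sketch in plain Python so it can run without a database. build_role_lookups and ROLE_TO_FIELD are hypothetical names, and the field names are taken from the cells above.

```python
# relation role --> Entity_Relation field name, as used in the cells above.
ROLE_TO_FIELD = {
    "FROM": "relation_from",
    "TO": "relation_to",
    "THROUGH": "relation_through",
}


def build_role_lookups( roles, identifier_name, identifier_uuid ):
    """Return one dict of lookup keyword arguments per requested relation role."""
    lookups = []
    for role in roles:
        field_name = ROLE_TO_FIELD[ role ]
        lookups.append( {
            "{}__entity_identifier__name".format( field_name ): identifier_name,
            "{}__entity_identifier__uuid".format( field_name ): identifier_uuid,
        } )
    return lookups


role_lookups = build_role_lookups( [ "FROM", "TO", "THROUGH" ], "person_sourcenet_id", "202" )
for lookup in role_lookups:
    print( lookup )
```

Each dictionary can then be expanded into a Q object with Q( **lookup ), and the resulting Q objects OR-ed together with | to get the equivalent of combined_q above.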
In [34]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
entity_qs = Entity.objects.filter( pk = 315 )
# try to find relations to the Entity with pk 315, adding one relation role at a time (FROM, then TO, then THROUGH).
from_q = Q( relation_from__in = entity_qs )
combined_q = from_q
test_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
to_q = Q( relation_to__in = entity_qs )
combined_q = combined_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
through_q = Q( relation_through__in = entity_qs )
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [15]:
# try to just find the entity with the desired identifiers.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "46" )
test_value_list.append( "163" )
test_value_list.append( "161" )
test_value_list.append( "164" )
test_value_list.append( "30" )
test_value_list.append( "175" )
combined_q = Q( entity_identifier__name = "person_sourcenet_id" ) & Q( entity_identifier__uuid__in = test_value_list )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {}".format( entity_id ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [16]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# filter using IDs from above cell
entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [24]:
# try to just find the entity with the desired identifiers.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "20202020" )
test_value_list.append( "20202021" )
test_value_list.append( "20202022" )
id_name_q = Q( entity_identifier__name = "person_sourcenet_id" )
id_value_q = Q( entity_identifier__uuid__in = test_value_list )
combined_q = id_name_q & ( ~ id_value_q )
#entity_qs = entity_qs.filter( id_name_q )
#entity_qs = entity_qs.exclude( id_value_q )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
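The negated Q in the cell above deserves a sanity check. My understanding is that Django applies a negated condition across a multi-valued relationship by excluding any entity that has at least one matching related row, rather than testing the same row that satisfied the name condition. That is an assumption worth confirming against the SQL the cell prints. The in-memory sketch below shows what that reading implies, using made-up data and a hypothetical function name.

```python
def match_name_and_not_values( identifiers, name, excluded_uuids ):
    """Has an identifier with the given name, and no identifier at all with an excluded uuid."""
    has_name = any( id_name == name for id_name, _ in identifiers )
    has_excluded = any( id_uuid in excluded_uuids for _, id_uuid in identifiers )
    return has_name and not has_excluded


excluded_uuids = [ "20202020", "20202021", "20202022" ]

# each entity has a list of ( identifier name, identifier uuid ) rows (made-up data).
entities = {
    "entity_a": [ ( "person_sourcenet_id", "20202020" ) ],
    "entity_b": [ ( "person_sourcenet_id", "5" ) ],
    "entity_c": [ ( "person_sourcenet_id", "7" ), ( "person_open_calais_uuid", "20202021" ) ],
    "entity_d": [ ( "person_open_calais_uuid", "abc" ) ],
}

kept_entities = [
    key for key, identifiers in entities.items()
    if match_name_and_not_values( identifiers, "person_sourcenet_id", excluded_uuids )
]
print( kept_entities )
```

Under this reading, entity_c is dropped even though its person_sourcenet_id is not in the excluded list, because one of its other identifiers carries an excluded value. If that is not the intent, the exclusion needs to be tied to the identifier type as well.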
In [25]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# filter using IDs from above cell
entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [26]:
# try to just find the entity with the desired identifiers.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "46" )
test_value_list.append( "163" )
test_value_list.append( "161" )
test_value_list.append( "164" )
test_value_list.append( "30" )
test_value_list.append( "175" )
id_name_q = Q( entity_identifier__name = "person_sourcenet_id" )
id_value_q = Q( entity_identifier__uuid__in = test_value_list )
combined_q = id_name_q & ( ~ id_value_q )
#entity_qs = entity_qs.filter( id_name_q )
#entity_qs = entity_qs.exclude( id_value_q )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [27]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# filter using IDs from above cell
entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
Decide which relation roles to match (any combination of FROM, TO, and THROUGH), then build the query that filters on them.
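The cells in this section repeat one pattern: pick a subset of roles, OR the matching `Q()` objects together, then filter. Below is a minimal sketch of that step as a helper. The name `combine_role_filters` and the dict keyed by role name are hypothetical, not part of the context application. The helper relies only on the `|` operator, so the standalone demo uses plain Python sets as stand-ins for `Q()` objects.

```python
from functools import reduce
import operator


def combine_role_filters( filters_by_role, roles ):

    """
    OR together the filters for the requested roles.

    Hypothetical helper: filters_by_role maps a role name ( "FROM", "TO",
    "THROUGH" ) to any object that supports "|", such as a Django Q().
    """

    # collect the filter for each requested role, in the order requested.
    selected = [ filters_by_role[ role ] for role in roles ]
    if not selected:
        raise ValueError( "at least one relation role is required" )

    # a | b | c ...
    return reduce( operator.or_, selected )


# stand-in data: sets of made-up relation IDs play the part of Q() objects.
demo_filters = {
    "FROM": { 1, 2 },
    "TO": { 2, 3 },
    "THROUGH": { 4 },
}
from_or_to = combine_role_filters( demo_filters, [ "FROM", "TO" ] )
all_roles = combine_role_filters( demo_filters, [ "FROM", "TO", "THROUGH" ] )
print( sorted( from_or_to ) )
print( sorted( all_roles ) )
```

In the notebook, passing `{ "FROM": from_q, "TO": to_q, "THROUGH": through_q }` should yield the same kind of `combined_q` that the cells build by hand with `from_q | to_q | through_q`.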
In [48]:
# find entities that have a "first_name" trait and a trait value of "John" (two chained filters).
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
entity_qs = entity_qs.filter( entity_trait__name = "first_name" )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
entity_qs = entity_qs.filter( entity_trait__value = "John" )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
In [55]:
# find entities with a "first_name" trait whose value is "Larry", combining both conditions in one filter.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
combined_q = Q( entity_trait__name = "first_name" ) & Q( entity_trait__value = "Larry" )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
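The chained `filter()` calls in `In [48]` and the single `filter( combined_q )` call here are not necessarily equivalent. As I understand Django's documented behavior for lookups that span multi-valued relationships, conditions inside one `filter()` call must all hold for the same related row, while conditions in successive `filter()` calls may each be satisfied by a different related row. The sketch below imitates the two behaviors in plain Python, with no Django involved. The entity names and trait rows are made up for illustration.

```python
# each hypothetical entity has a list of ( trait name, trait value ) rows.
entity_to_traits = {
    "entity_a": [ ( "first_name", "Larry" ), ( "last_name", "Smith" ) ],
    "entity_b": [ ( "first_name", "Bob" ), ( "nickname", "Larry" ) ],
}


def match_same_row( traits, name, value ):
    # like one filter( Q( name ) & Q( value ) ): a single trait row must satisfy both.
    return any( ( n == name ) and ( v == value ) for n, v in traits )


def match_any_rows( traits, name, value ):
    # like filter( name ).filter( value ): a different row may satisfy each condition.
    has_name = any( n == name for n, v in traits )
    has_value = any( v == value for n, v in traits )
    return has_name and has_value


same_row = [
    entity for entity, traits in entity_to_traits.items()
    if match_same_row( traits, "first_name", "Larry" )
]
any_rows = [
    entity for entity, traits in entity_to_traits.items()
    if match_any_rows( traits, "first_name", "Larry" )
]
print( "same row: {}".format( same_row ) )
print( "any rows: {}".format( any_rows ) )
```

Here `entity_b` matches only under the chained form, because its "Larry" value belongs to a different trait than `first_name`. Comparing the SQL that the two notebook cells print is a quick way to confirm which behavior each query actually has.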
In [50]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [57]:
# find entities whose "first_name" trait value is in a list of names.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "John" )
test_value_list.append( "Michael" )
test_value_list.append( "Robert" )
test_value_list.append( "Steve" )
test_value_list.append( "Larry" )
trait_name_q = Q( entity_trait__name = "first_name" )
trait_value_q = Q( entity_trait__value__in = test_value_list )
combined_q = trait_name_q & trait_value_q
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [58]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [61]:
# find entities with a "first_name" trait, excluding a list of trait values.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "20202020" )
test_value_list.append( "20202021" )
test_value_list.append( "20202022" )
trait_name_q = Q( entity_trait__name = "first_name" )
trait_value_q = ~ Q( entity_trait__value__in = test_value_list )
combined_q = trait_name_q & trait_value_q
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [62]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [63]:
# find entities with a "first_name" trait, excluding a list of names.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "John" )
test_value_list.append( "Michael" )
test_value_list.append( "Robert" )
test_value_list.append( "Steve" )
test_value_list.append( "Larry" )
trait_name_q = Q( entity_trait__name = "first_name" )
trait_value_q = ~ Q( entity_trait__value__in = test_value_list )
combined_q = trait_name_q & trait_value_q
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [64]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [74]:
# find entities with a "pub_date" trait value within a date range.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
trait_name_q = Q( entity_trait__name = "pub_date" )
left_range_q = Q( entity_trait__value__gte = "2010-02-08" )
right_range_q = Q( entity_trait__value__lte = "2010-02-13" )
combined_q = trait_name_q & left_range_q & right_range_q
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
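The range filter in this cell compares `pub_date` trait values against date literals passed as strings. Assuming the trait value is stored as text (the quoted literals suggest this, but it is an assumption about the model), the comparison is lexicographic. That matches chronological order only when every stored value is a zero-padded `YYYY-MM-DD` string. Here is a small standalone check; the sample values are made up.

```python
import datetime

range_start = "2010-02-08"
range_end = "2010-02-13"

# made-up trait values; the last one is a valid date that is not zero-padded.
sample_values = [ "2010-02-07", "2010-02-08", "2010-02-10", "2010-02-13", "2010-2-9" ]


def to_date( value ):
    # parse "Y-M-D" with or without zero padding.
    year, month, day = [ int( part ) for part in value.split( "-" ) ]
    return datetime.date( year, month, day )


def in_range_as_string( value ):
    # what a text-column __gte / __lte comparison does.
    return range_start <= value <= range_end


def in_range_as_date( value ):
    # what a true date comparison would do.
    return to_date( range_start ) <= to_date( value ) <= to_date( range_end )


string_matches = [ value for value in sample_values if in_range_as_string( value ) ]
date_matches = [ value for value in sample_values if in_range_as_date( value ) ]
print( "string comparison: {}".format( string_matches ) )
print( "date comparison:   {}".format( date_matches ) )
```

The two lists agree on the zero-padded values, but the string comparison misses `2010-2-9`, so it is worth confirming that the import code always writes `pub_date` in the padded form.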
In [75]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
Decide which relation roles to match (any combination of FROM, TO, and THROUGH), then build the query that filters on them.
In [28]:
# find entities of entity type "person".
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
entity_qs = entity_qs.filter( entity_types__entity_type__slug = "person" )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
In [29]:
# find entities of entity type "person", this time using a Q().
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
combined_q = Q( entity_types__entity_type__slug = "person" )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
In [31]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [82]:
# find entities whose entity type is in a list of type slugs.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "person" )
test_value_list.append( "article" )
combined_q = Q( entity_types__entity_type__slug__in = test_value_list )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [83]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | THROUGH
combined_q = from_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO | THROUGH
combined_q = to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nTO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = from_q | to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
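This cell writes out every role combination by hand, whereas the earlier cells skip FROM | THROUGH and TO | THROUGH. Below is a short sketch of generating all seven non-empty combinations so that none is missed. It only produces the role-name subsets and their labels. Pairing each subset with its `Q()` objects is left to the notebook code, and the variable names are illustrative.

```python
from itertools import combinations

role_names = [ "FROM", "TO", "THROUGH" ]

# build every non-empty subset of roles, smallest subsets first.
role_subsets = []
for subset_size in range( 1, len( role_names ) + 1 ):
    for subset in combinations( role_names, subset_size ):
        role_subsets.append( subset )

# labels in the same "A | B" style the cells print.
subset_labels = [ " | ".join( subset ) for subset in role_subsets ]
for label in subset_labels:
    print( label )
```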
In [39]:
# find entities whose entity type is NOT in a list of type slugs.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "peregrine" )
test_value_list.append( "chartreuse" )
test_value_list.append( "bumblebee" )
combined_q = ~ Q( entity_types__entity_type__slug__in = test_value_list )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [40]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [46]:
# find entities whose entity type is NOT in a list of type slugs (list now includes "article").
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "peregrine" )
test_value_list.append( "chartreuse" )
test_value_list.append( "bumblebee" )
test_value_list.append( "article" )
combined_q = ~ Q( entity_types__entity_type__slug__in = test_value_list )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [47]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [48]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
name_q = Q( entity_relation_trait__name = "pub_date" )
value_q = Q( entity_relation_trait__value = "2009-12-07" )
combined_q = name_q & value_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [77]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "20202020" )
test_value_list.append( "20202021" )
test_value_list.append( "20202022" )
name_q = Q( entity_relation_trait__name = "pub_date" )
value_q = Q( entity_relation_trait__value__in = test_value_list )
combined_q = name_q & value_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [79]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "2009-12-07" )
test_value_list.append( "2010-02-08" )
test_value_list.append( "2010-02-13" )
name_q = Q( entity_relation_trait__name = "pub_date" )
value_q = Q( entity_relation_trait__value__in = test_value_list )
combined_q = name_q & value_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [78]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "20202020" )
test_value_list.append( "20202021" )
test_value_list.append( "20202022" )
name_q = Q( entity_relation_trait__name = "pub_date" )
value_q = ~ Q( entity_relation_trait__value__in = test_value_list )
combined_q = name_q & value_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [67]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "2009-12-07" )
test_value_list.append( "2010-02-08" )
test_value_list.append( "2010-02-13" )
name_q = Q( entity_relation_trait__name = "pub_date" )
value_q = ~ Q( entity_relation_trait__value__in = test_value_list )
combined_q = name_q & value_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [80]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
name_q = Q( entity_relation_trait__name = "pub_date" )
left_range_q = Q( entity_relation_trait__value__gte = "2019-12-01" )
right_range_q = Q( entity_relation_trait__value__lte = "2019-12-31" )
combined_q = name_q & left_range_q & right_range_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [72]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
name_q = Q( entity_relation_trait__name = "pub_date" )
left_range_q = Q( entity_relation_trait__value__gte = "2010-02-08" )
right_range_q = Q( entity_relation_trait__value__lte = "2010-02-13" )
combined_q = name_q & left_range_q & right_range_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [49]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
name_q = Q( relation_type__slug = "quoted" )
combined_q = name_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [68]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "quoted" )
test_value_list.append( "mentioned" )
test_value_list.append( "shared_byline" )
name_q = Q( relation_type__slug__in = test_value_list )
combined_q = name_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [69]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "quoted" )
test_value_list.append( "mentioned" )
test_value_list.append( "shared_byline" )
name_q = ~ Q( relation_type__slug__in = test_value_list )
combined_q = name_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
In [90]:
# find entities whose entity type is in a list of type slugs.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "person" )
test_value_list.append( "article" )
combined_q = Q( entity_types__entity_type__slug__in = test_value_list )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [91]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | THROUGH
combined_q = from_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO | THROUGH
combined_q = to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nTO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = from_q | to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
In [92]:
# find entities whose entity type is in a list of type slugs.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
test_value_list = []
test_value_list.append( "newspaper" )
test_value_list.append( "article" )
combined_q = Q( entity_types__entity_type__slug__in = test_value_list )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# loop to retrieve IDs.
entity_id_list = []
for entity in entity_qs:
    # get, print, and store ID.
    entity_id = entity.id
    print( "Entity ID: {} ( {} )".format( entity_id, entity.name ) )
    entity_id_list.append( entity_id )
#-- END loop over matching entities. --#
In [93]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# reuse entity_qs from the cell above (filtering by ID list is left commented out).
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | THROUGH
combined_q = from_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO | THROUGH
combined_q = to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nTO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = from_q | to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
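The cell above writes out each FROM / TO / THROUGH combination by hand. As a rough sketch of the same idea, the combinations can be generated with `itertools.combinations` and OR-ed together with `functools.reduce( operator.or_, ... )`. To keep this runnable without Django, sets of relation IDs stand in for the Q() instances (`|` is set union here; Q() also supports `|`). The IDs and the names `ids_by_role` and `count_by_label` are made up for illustration.

```python
import functools
import itertools
import operator

# hypothetical stand-in data: relation IDs matched by each role's filter.
ids_by_role = {
    "FROM": { 1, 2, 3 },
    "TO": { 3, 4 },
    "THROUGH": { 5 },
}

# for every non-empty combination of roles, OR the matches together.
count_by_label = {}
for combo_size in range( 1, len( ids_by_role ) + 1 ):
    for role_combo in itertools.combinations( ids_by_role, combo_size ):
        # reduce with operator.or_ is the same as writing a | b | c by hand.
        matched_ids = functools.reduce( operator.or_, ( ids_by_role[ role ] for role in role_combo ) )
        count_by_label[ " | ".join( role_combo ) ] = len( matched_ids )

for label, match_count in count_by_label.items():
    print( "{} - result count: {}".format( label, match_count ) )
```

Relation 3 matches both FROM and TO in this made-up data, so the "FROM | TO" count is smaller than the sum of the two separate counts.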
In [45]:
# start with all Entity_Relations.
#test_relation_qs = None
test_relation_qs = Entity_Relation.objects.all()
In [46]:
# try to just find the entity with the desired identifiers.
# two QuerySets, one for just authors, and one for just articles.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# Q for each of article and person entities.
article_entity_q = Q( entity_types__entity_type__slug = "article" )
person_entity_q = Q( entity_types__entity_type__slug = "person" )
# Make a QuerySet for each.
article_entity_qs = entity_qs.filter( article_entity_q )
print( "\narticle entity result count: {} (SQL: {})".format( article_entity_qs.count(), article_entity_qs.query ) )
person_entity_qs = entity_qs.filter( person_entity_q )
print( "\nperson entity result count: {} (SQL: {})".format( person_entity_qs.count(), person_entity_qs.query ) )
In [47]:
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# filter using IDs from above cell
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = person_entity_qs )
to_q = Q( relation_to__in = person_entity_qs )
through_q = Q( relation_through__in = article_entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | THROUGH
combined_q = from_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO | THROUGH
combined_q = to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nTO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = from_q | to_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM & TO & THROUGH
combined_q = from_q & to_q & through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM & TO & THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM & TO & THROUGH - filter
test_qs = relation_qs.filter( from_q )
test_qs = test_qs.filter( to_q )
test_qs = test_qs.filter( through_q )
print( "\nFROM & TO & THROUGH (filter) - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( from_q )
    test_relation_qs = test_relation_qs.filter( to_q )
    test_relation_qs = test_relation_qs.filter( through_q )
    print( "\nend-to-end --> FROM & TO & THROUGH (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [40]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
name_q = Q( entity_relation_trait__name = "pub_date" )
left_range_q = Q( entity_relation_trait__value__gte = "2009-12-01" )
right_range_q = Q( entity_relation_trait__value__lte = "2009-12-31" )
pub_date_q = name_q & left_range_q & right_range_q
test_qs = relation_qs.filter( pub_date_q )
print( "\npub_date result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( pub_date_q )
    print( "\nend-to-end --> pub_date (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
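The pub_date range above compares trait values to date strings with `__gte` and `__lte`. Assuming the trait value is stored as text in zero-padded `YYYY-MM-DD` form (an assumption; the model definition is not shown in this notebook), that works because such strings sort lexicographically in chronological order. A small plain-Python check of that claim, with no Django needed; `in_range` and `to_date` are hypothetical helpers.

```python
import datetime

def in_range( value, value_from, value_to ):
    # plain string comparison, mirroring the >= / <= pair in the range Q()s.
    return value_from <= value <= value_to

def to_date( date_string ):
    # parse a zero-padded ISO date string into a real date.
    return datetime.datetime.strptime( date_string, "%Y-%m-%d" ).date()

samples = [ "2009-11-30", "2009-12-01", "2009-12-15", "2009-12-31", "2010-01-01" ]

# compare as strings.
string_matches = [ s for s in samples if in_range( s, "2009-12-01", "2009-12-31" ) ]

# compare as dates.
date_from = to_date( "2009-12-01" )
date_to = to_date( "2009-12-31" )
date_matches = [ s for s in samples if date_from <= to_date( s ) <= date_to ]

print( "string matches: {}".format( string_matches ) )
print( "same as date matches? {}".format( string_matches == date_matches ) )

# the equivalence breaks if a value is not zero-padded.
unpadded_result = in_range( "2009-12-5", "2009-12-01", "2009-12-31" )
print( "un-padded 2009-12-5 in range? {}".format( unpadded_result ) )
```

The last check is the caveat: a value like "2009-12-5" falls outside the string range even though the date is in December 2009, so the stored values need a consistent format for this filter to be trustworthy.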
In [41]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
test_value_list = []
test_value_list.append( "automated" )
name_q = Q( entity_relation_trait__name = "sourcenet-coder-User-username" )
value_q = Q( entity_relation_trait__value__in = test_value_list )
coder_username_q = name_q & value_q
relation_qs = relation_qs.filter( coder_username_q )
print( "\ncoder_username result count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( coder_username_q )
    print( "\nend-to-end --> coder_username (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [50]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
test_value_list = []
test_value_list.append( "OpenCalais_REST_API_v2" )
name_q = Q( entity_relation_trait__name = "coder_type" )
value_q = Q( entity_relation_trait__value__in = test_value_list )
coder_type_q = name_q & value_q
relation_qs = relation_qs.filter( coder_type_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( coder_type_q )
    print( "\nend-to-end --> coder_type (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [51]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
combined_q = pub_date_q
test_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
combined_q = combined_q & coder_username_q
test_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
combined_q = combined_q & coder_type_q
test_qs = relation_qs.filter( combined_q )
print( "\n\"&\" result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
test_qs = relation_qs.filter( pub_date_q )
test_qs = test_qs.filter( coder_username_q )
test_qs = test_qs.filter( coder_type_q )
print( "\n.filter() result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
combined_q = coder_username_q & coder_type_q
test_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
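The cell above compares combining the trait Q()s with `&` inside one `filter()` against chaining separate `filter()` calls. For multi-valued relations such as the trait rows, Django treats these differently: conditions inside a single `filter()` must all hold for the same related row, while each chained `filter()` may be satisfied by a different related row. The sketch below simulates that difference in plain Python so it runs without Django. The relations, trait rows, and helper names are made up for illustration.

```python
# hypothetical stand-in data: each relation has many trait rows.
relations = [
    { "id": 1, "traits": [
        { "name": "pub_date", "value": "2009-12-07" },
        { "name": "coder_type", "value": "OpenCalais_REST_API_v2" },
    ] },
    { "id": 2, "traits": [
        { "name": "pub_date", "value": "2010-01-03" },
        { "name": "coder_type", "value": "OpenCalais_REST_API_v2" },
    ] },
]

def is_pub_date_match( trait ):
    return ( trait[ "name" ] == "pub_date" ) and ( "2009-12-01" <= trait[ "value" ] <= "2009-12-31" )

def is_coder_type_match( trait ):
    return ( trait[ "name" ] == "coder_type" ) and ( trait[ "value" ] == "OpenCalais_REST_API_v2" )

# like filter( a & b ): a single trait row has to satisfy both conditions.
single_filter_ids = [
    relation[ "id" ] for relation in relations
    if any( is_pub_date_match( t ) and is_coder_type_match( t ) for t in relation[ "traits" ] )
]

# like filter( a ).filter( b ): each condition may match a different trait row.
chained_filter_ids = [
    relation[ "id" ] for relation in relations
    if any( is_pub_date_match( t ) for t in relation[ "traits" ] )
    and any( is_coder_type_match( t ) for t in relation[ "traits" ] )
]

print( "single filter() with &: {}".format( single_filter_ids ) )
print( "chained filter() calls: {}".format( chained_filter_ids ) )
```

No single trait row can be named both "pub_date" and "coder_type", so the `&` form matches nothing here, while the chained form finds relation 1. That is why the end-to-end cells chain `.filter()` calls for trait filters.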
In [52]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
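combined_q = pub_date_q | coder_username_q | coder_type_q
test_qs = relation_qs.filter( combined_q )
print( "\n\"|\" result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )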
In [53]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "quoted" )
test_value_list.append( "mentioned" )
test_value_list.append( "shared_byline" )
relation_type_slug_q = Q( relation_type__slug__in = test_value_list )
combined_q = relation_type_slug_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( relation_type_slug_q )
    print( "\nend-to-end --> relation_type_slug (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [34]:
# try to just find the entity with the desired identifier.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
combined_q = Q( entity_trait__name = "sourcenet-Newspaper-ID" ) & Q( entity_trait__value = "1" )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
In [54]:
debug_flag = False
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# filter using IDs from above cell
#entity_qs = Entity.objects.filter( pk__in = entity_id_list )
# make Q()s
from_q = Q( relation_from__in = entity_qs )
to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# look for Entity_Relations to an Entity in any relation role (FROM, TO, THROUGH) that match Entity QS
# FROM
test_qs = relation_qs.filter( from_q )
print( "\nFROM - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# TO
test_qs = relation_qs.filter( to_q )
print( "\nTO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# THROUGH
test_qs = relation_qs.filter( through_q )
print( "\nTHROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO
combined_q = from_q | to_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# FROM | TO | THROUGH
combined_q = combined_q | through_q
test_qs = relation_qs.filter( combined_q )
print( "\nFROM | TO | THROUGH - result count: {} ( SQL: {} )".format( test_qs.count(), test_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( through_q )
    print( "\nend-to-end --> THROUGH entity_trait (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
{
"comparison_type": "AND",
"filter_type": "AND",
"value_list": [
{
"comparison_type": "includes",
"filter_type": "relation_type_slug",
"relation_roles_list": [
"ALL"
],
"value_list": [
"mentioned",
"quoted",
"shared_byline"
]
},
{
"comparison_type": "AND",
"filter_type": "entity_type_slug",
"value_list": [
{
"comparison_type": "equals",
"filter_type": "entity_type_slug",
"relation_roles_list": [
"FROM"
],
"value": "person"
},
{
"comparison_type": "equals",
"filter_type": "entity_type_slug",
"relation_roles_list": [
"TO"
],
"value": "person"
},
{
"comparison_type": "equals",
"filter_type": "entity_type_slug",
"relation_roles_list": [
"THROUGH"
],
"value": "article"
}
]
},
{
"comparison_type": "AND",
"filter_type": "relation_trait",
"value_list": [
{
"comparison_type": "in_range",
"filter_type": "relation_trait",
"name": "pub_date",
"relation_roles_list": [
"ALL"
],
"value_from": "2009-12-01",
"value_to": "2009-12-31"
},
{
"comparison_type": "includes",
"filter_type": "relation_trait",
"name": "sourcenet-coder-User-username",
"relation_roles_list": [
"ALL"
],
"value_list": [
"automated"
]
},
{
"comparison_type": "includes",
"filter_type": "relation_trait",
"name": "coder_type",
"relation_roles_list": [
"ALL"
],
"value_list": [
"OpenCalais_REST_API_v2"
]
}
]
},
{
"comparison_type": "includes",
"filter_type": "entity_trait",
"name": "sourcenet-Newspaper-ID",
"relation_roles_list": [
"THROUGH"
],
"value_list": [
1
]
}
]
}
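The filter specification above is a tree: a node whose `comparison_type` is "AND" holds child specs in its `value_list`, and the leaves carry the actual comparison ("equals", "includes", "in_range"). A minimal sketch of walking such a tree and listing its leaves follows, using a trimmed copy of the spec. `collect_leaf_filters` is a hypothetical helper, not part of the FilterSpec class, and treating "OR" as an aggregate type is an assumption, since only "AND" appears in this spec.

```python
import json

# trimmed copy of the specification above, kept short for the example.
spec_json = """
{
    "comparison_type": "AND",
    "filter_type": "AND",
    "value_list": [
        {
            "comparison_type": "includes",
            "filter_type": "relation_type_slug",
            "relation_roles_list": [ "ALL" ],
            "value_list": [ "mentioned", "quoted", "shared_byline" ]
        },
        {
            "comparison_type": "AND",
            "filter_type": "entity_type_slug",
            "value_list": [
                {
                    "comparison_type": "equals",
                    "filter_type": "entity_type_slug",
                    "relation_roles_list": [ "FROM" ],
                    "value": "person"
                },
                {
                    "comparison_type": "equals",
                    "filter_type": "entity_type_slug",
                    "relation_roles_list": [ "THROUGH" ],
                    "value": "article"
                }
            ]
        }
    ]
}
"""

# assumption: "OR" nodes would nest the same way "AND" nodes do.
AGGREGATE_TYPES = { "AND", "OR" }

def collect_leaf_filters( spec, depth = 0 ):
    # return a list of ( depth, filter_type, comparison_type, roles ) tuples, one per leaf.
    comparison_type = spec.get( "comparison_type" )
    if comparison_type in AGGREGATE_TYPES:
        # aggregate node: value_list holds child specs, so recurse into each.
        leaf_list = []
        for child_spec in spec.get( "value_list", [] ):
            leaf_list.extend( collect_leaf_filters( child_spec, depth + 1 ) )
        return leaf_list
    # leaf node: value_list (if present) holds plain values, not child specs.
    return [ ( depth, spec.get( "filter_type" ), comparison_type, spec.get( "relation_roles_list", [] ) ) ]

leaf_filters = collect_leaf_filters( json.loads( spec_json ) )
for leaf in leaf_filters:
    print( leaf )
```

Note that `value_list` means two different things depending on the node: child specs under an aggregate, plain values under a leaf. The walker has to check `comparison_type` before deciding how to read it.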
In [71]:
# start with all Entity_Relations.
#test_relation_qs = None
test_relation_qs = Entity_Relation.objects.all()
{
"comparison_type": "includes",
"filter_type": "relation_type_slug",
"relation_roles_list": [
"ALL"
],
"value_list": [
"mentioned",
"quoted",
"shared_byline"
]
}
In [72]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
#entity_qs = Entity.objects.filter( pk = 315 )
test_value_list = []
test_value_list.append( "quoted" )
test_value_list.append( "mentioned" )
test_value_list.append( "shared_byline" )
relation_type_slug_q = Q( relation_type__slug__in = test_value_list )
combined_q = relation_type_slug_q
relation_qs = relation_qs.filter( combined_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( relation_type_slug_q )
    print( "\nend-to-end --> relation_type_slug (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
{
"comparison_type": "AND",
"filter_type": "entity_type_slug",
"value_list": [
{
"comparison_type": "equals",
"filter_type": "entity_type_slug",
"relation_roles_list": [
"FROM"
],
"value": "person"
},
{
"comparison_type": "equals",
"filter_type": "entity_type_slug",
"relation_roles_list": [
"TO"
],
"value": "person"
},
{
"comparison_type": "equals",
"filter_type": "entity_type_slug",
"relation_roles_list": [
"THROUGH"
],
"value": "article"
}
]
}
In [73]:
# try to just find the entity with the desired identifiers.
# two QuerySets, one for just authors, and one for just articles.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
# Q for each of article and person entities.
article_entity_q = Q( entity_types__entity_type__slug = "article" )
person_entity_q = Q( entity_types__entity_type__slug = "person" )
# Make a QuerySet for each.
article_entity_qs = entity_qs.filter( article_entity_q )
print( "\narticle entity result count: {} (SQL: {})".format( article_entity_qs.count(), article_entity_qs.query ) )
person_entity_qs = entity_qs.filter( person_entity_q )
print( "\nperson entity result count: {} (SQL: {})".format( person_entity_qs.count(), person_entity_qs.query ) )
In [74]:
# make Q()s
from_q = Q( relation_from__in = person_entity_qs )
to_q = Q( relation_to__in = person_entity_qs )
through_q = Q( relation_through__in = article_entity_qs )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( from_q )
    test_relation_qs = test_relation_qs.filter( to_q )
    test_relation_qs = test_relation_qs.filter( through_q )
    print( "\nend-to-end --> FROM & TO & THROUGH (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
{
"comparison_type": "AND",
"filter_type": "relation_trait",
"value_list": [
{
"comparison_type": "in_range",
"filter_type": "relation_trait",
"name": "pub_date",
"relation_roles_list": [
"ALL"
],
"value_from": "2009-12-01",
"value_to": "2009-12-31"
},
{
"comparison_type": "includes",
"filter_type": "relation_trait",
"name": "sourcenet-coder-User-username",
"relation_roles_list": [
"ALL"
],
"value_list": [
"automated"
]
},
{
"comparison_type": "includes",
"filter_type": "relation_trait",
"name": "coder_type",
"relation_roles_list": [
"ALL"
],
"value_list": [
"OpenCalais_REST_API_v2"
]
}
]
}
In [75]:
name_q = Q( entity_relation_trait__name = "pub_date" )
left_range_q = Q( entity_relation_trait__value__gte = "2009-12-01" )
right_range_q = Q( entity_relation_trait__value__lte = "2009-12-31" )
pub_date_q = name_q & left_range_q & right_range_q
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( pub_date_q )
    print( "\nend-to-end --> pub_date (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [76]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
test_value_list = []
test_value_list.append( "automated" )
name_q = Q( entity_relation_trait__name = "sourcenet-coder-User-username" )
value_q = Q( entity_relation_trait__value__in = test_value_list )
coder_username_q = name_q & value_q
relation_qs = relation_qs.filter( coder_username_q )
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( coder_username_q )
    print( "\nend-to-end --> coder_username (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [77]:
# start with Entity_Relation QS
relation_qs = Entity_Relation.objects.all()
print( "\nresult count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
test_value_list = []
test_value_list.append( "OpenCalais_REST_API_v2" )
name_q = Q( entity_relation_trait__name = "coder_type" )
value_q = Q( entity_relation_trait__value__in = test_value_list )
coder_type_q = name_q & value_q
relation_qs = relation_qs.filter( coder_type_q )
print( "\ncoder_type result count: {} ( SQL: {} )".format( relation_qs.count(), relation_qs.query ) )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( coder_type_q )
    print( "\nend-to-end --> coder_type (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [78]:
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( pub_date_q )
test_qs = test_qs.filter( coder_username_q )
test_qs = test_qs.filter( coder_type_q )
print( "all 3 together count: {}".format( test_qs.count() ) )
{
"comparison_type": "includes",
"filter_type": "entity_trait",
"name": "sourcenet-Newspaper-ID",
"relation_roles_list": [
"THROUGH"
],
"value_list": [
1
]
}
In [79]:
# try to just find the entity with the desired identifier.
entity_qs = Entity.objects.all()
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
combined_q = Q( entity_trait__name = "sourcenet-Newspaper-ID" ) & Q( entity_trait__value = "1" )
entity_qs = entity_qs.filter( combined_q )
print( "\nresult count: {} (SQL: {})".format( entity_qs.count(), entity_qs.query ) )
In [80]:
# make Q()s
#from_q = Q( relation_from__in = entity_qs )
#to_q = Q( relation_to__in = entity_qs )
through_q = Q( relation_through__in = entity_qs )
# are we doing an end-to-end test?
if ( test_relation_qs is not None ):
    test_relation_qs = test_relation_qs.filter( through_q )
    print( "\nend-to-end --> THROUGH entity_trait (filter) - result count: {} ( SQL: {} )".format( test_relation_qs.count(), test_relation_qs.query ) )
#-- END check to see if we are combining these filters. --#
In [66]:
# load basic file into test instance.
test_instance = TestHelper.load_basic()
#test_instance = TestHelper.load_basic_2()
In [67]:
# filter_specification
selection_filters = test_instance.get_selection_filters()
#print( "base_filter_spec: {}".format( base_filter_spec ) )
# retrieve actual "filter_specification".
filter_specification = selection_filters.get( NetworkDataRequest.PROP_NAME_FILTER_SPECIFICATION, None )
# load it into a FilterSpec.
filter_spec = FilterSpec()
filter_spec.set_filter_spec( filter_specification )
print( "filter_spec:\n{}".format( filter_spec.to_json_string() ) )
print( "- comparison type: {}".format( filter_spec.get_comparison_type() ) )
In [68]:
# build out Q() instances
result_status = test_instance.build_filter_spec_q( filter_spec )
print( "build Q() result: {}".format( result_status ) )
In [69]:
# try filtering.
relation_qs = Entity_Relation.objects.all()
relation_qs = test_instance.filter_relations_by_filter_spec( relation_qs, filter_spec )
print( "relation_qs count: {}".format( relation_qs.count() ) )
In [70]:
relation_qs = None
value_list = None
child_filter_spec_list = None
child_filter_spec_count = None
filter_1 = None
filter_1_q = None
test_qs = None
test_count = None
relation_count = None
# init QS
relation_qs = Entity_Relation.objects.all()
# work through child filter spec list
value_list = filter_spec.get_value_list()
child_filter_spec_list = filter_spec.get_child_filter_spec_list()
child_filter_spec_count = len( child_filter_spec_list )
print( "- Child filter spec count: {}".format( child_filter_spec_count ) )
# filter 1
filter_1 = child_filter_spec_list[ 0 ]
print( "Filter 1: {}\nJSON: {}".format( filter_1, filter_1.to_json_string() ) )
value_1 = value_list[ 0 ]
print( "Value 1: {}".format( value_1 ) )
# Q()
filter_1_q = filter_1.get_my_q()
print( "Q 1: {}".format( filter_1_q ) )
# test Q
test_qs = relation_qs.filter( filter_1_q )
test_count = test_qs.count()
print( "----> test_qs count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_1_q )
relation_count = relation_qs.count()
print( "\n\n- relation_qs count: {}".format( relation_count ) )
In [71]:
#relation_qs = None
current_q = None
test_qs = None
test_count = None
relation_count = None
filter_2 = None
filter_2_child_list = None
filter_2_child_1 = None
filter_2_child_1_q = None
filter_2_child_2 = None
filter_2_child_2_q = None
filter_2_child_3 = None
filter_2_child_3_q = None
# init QS
#relation_qs = Entity_Relation.objects.all()
# filter 2
filter_2 = child_filter_spec_list[ 1 ]
print( "Filter 2:\n{}".format( filter_2.to_json_string() ) )
# Q()
current_q = filter_2.get_my_q()
print( "Q 2: {}".format( current_q ) )
# children
filter_2_child_list = filter_2.get_child_filter_spec_list()
print( filter_2_child_list )
# child 1
filter_2_child_1 = filter_2_child_list[ 0 ]
print( "\n\nFilter 2, child 1:\n{}".format( filter_2_child_1.to_json_string() ) )
# child 1 Q
filter_2_child_1_q = filter_2_child_1.get_my_q()
print( "Q filter_2_child_1_q: {}".format( filter_2_child_1_q ) )
# test child 1 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_1_q )
test_count = test_qs.count()
print( "----> test_qs f2c1q count: {}".format( test_count ) )
# child 2
filter_2_child_2 = filter_2_child_list[ 1 ]
print( "\n\nFilter 2, child 2:\n{}".format( filter_2_child_2.to_json_string() ) )
# child 2 Q
filter_2_child_2_q = filter_2_child_2.get_my_q()
print( "Q filter_2_child_2_q: {}".format( filter_2_child_2_q ) )
# test child 2 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_2_q )
test_count = test_qs.count()
print( "----> test_qs f2c2q count: {}".format( test_count ) )
# child 3
filter_2_child_3 = filter_2_child_list[ 2 ]
print( "\n\nFilter 2, child 3:\n{}".format( filter_2_child_3.to_json_string() ) )
# child 3 Q
filter_2_child_3_q = filter_2_child_3.get_my_q()
print( "Q filter_2_child_3_q: {}".format( filter_2_child_3_q ) )
# test child 3 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_3_q )
test_count = test_qs.count()
print( "----> test_qs f2c3q count: {}".format( test_count ) )
# test all together
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_1_q )
test_qs = test_qs.filter( filter_2_child_2_q )
test_qs = test_qs.filter( filter_2_child_3_q )
test_count = test_qs.count()
print( "\n\n- test_qs ALL count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_2_child_1_q )
relation_qs = relation_qs.filter( filter_2_child_2_q )
relation_qs = relation_qs.filter( filter_2_child_3_q )
relation_count = relation_qs.count()
print( "- relation_qs count: {}".format( relation_count ) )
In [72]:
#relation_qs = None
current_q = None
test_qs = None
test_count = None
relation_count = None
filter_3 = None
filter_3_child_list = None
filter_3_child_1 = None
filter_3_child_1_q = None
filter_3_child_2 = None
filter_3_child_2_q = None
filter_3_child_3 = None
filter_3_child_3_q = None
# init QS
#relation_qs = Entity_Relation.objects.all()
# filter 3
filter_3 = child_filter_spec_list[ 2 ]
print( "Filter 3:\n{}".format( filter_3.to_json_string() ) )
# Q()
current_q = filter_3.get_my_q()
print( "Q 3: {}".format( current_q ) )
# children
filter_3_child_list = filter_3.get_child_filter_spec_list()
print( filter_3_child_list )
# child 1
filter_3_child_1 = filter_3_child_list[ 0 ]
print( "\n\nFilter 3, child 1:\n{}".format( filter_3_child_1.to_json_string() ) )
# child 1 Q
filter_3_child_1_q = filter_3_child_1.get_my_q()
print( "Q filter_3_child_1_q: {}".format( filter_3_child_1_q ) )
# test child 1 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_1_q )
test_count = test_qs.count()
print( "- test_qs f3c1q count: {}".format( test_count ) )
# child 2
filter_3_child_2 = filter_3_child_list[ 1 ]
print( "\n\nFilter 3, child 2:\n{}".format( filter_3_child_2.to_json_string() ) )
# child 2 Q
filter_3_child_2_q = filter_3_child_2.get_my_q()
print( "Q filter_3_child_2_q: {}".format( filter_3_child_2_q ) )
# test child 2 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_2_q )
test_count = test_qs.count()
print( "- test_qs f3c2q count: {}".format( test_count ) )
# child 3
filter_3_child_3 = filter_3_child_list[ 2 ]
print( "\n\nFilter 3, child 3:\n{}".format( filter_3_child_3.to_json_string() ) )
# child 3 Q
filter_3_child_3_q = filter_3_child_3.get_my_q()
print( "Q filter_3_child_3_q: {}".format( filter_3_child_3_q ) )
# test child 3 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_3_q )
test_count = test_qs.count()
print( "- test_qs f3c3q count: {}".format( test_count ) )
# test all together
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_1_q )
test_qs = test_qs.filter( filter_3_child_2_q )
test_qs = test_qs.filter( filter_3_child_3_q )
test_count = test_qs.count()
print( "\n\n- test_qs ALL count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_3_child_1_q )
relation_qs = relation_qs.filter( filter_3_child_2_q )
relation_qs = relation_qs.filter( filter_3_child_3_q )
relation_count = relation_qs.count()
print( "\n\n- relation_qs count: {}".format( relation_count ) )
In [73]:
filter_4 = None
value_4 = None
filter_4_q = None
test_qs = None
test_count = None
relation_count = None
# filter 4
filter_4 = child_filter_spec_list[ 3 ]
print( "Filter 4: {}\n- JSON: {}".format( filter_4, filter_4.to_json_string() ) )
value_4 = value_list[ 3 ]
print( "Value 4: {}".format( value_4 ) )
# Q()
filter_4_q = filter_4.get_my_q()
print( "Q 4: {}".format( filter_4_q ) )
# test Q
test_qs = relation_qs.filter( filter_4_q )
test_count = test_qs.count()
print( "- test_qs count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_4_q )
relation_count = relation_qs.count()
print( "- relation_qs count: {}".format( relation_count ) )
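The four cells above repeat one pattern per child filter: test the filter alone against all relations, then apply it to the running QuerySet and print both counts. A rough sketch of that pattern as a loop follows. To keep it runnable without Django, plain predicates stand in for the Q() instances and lists stand in for QuerySets. `apply_filters_in_sequence` and the sample relations are hypothetical.

```python
def apply_filters_in_sequence( item_list, named_filter_list ):
    # apply each filter in turn, recording ( name, count alone, running count ).
    running_list = list( item_list )
    report = []
    for filter_name, predicate in named_filter_list:
        # how many items match this filter on its own?
        alone_count = len( [ item for item in item_list if predicate( item ) ] )
        # narrow the running result, like relation_qs = relation_qs.filter( q ).
        running_list = [ item for item in running_list if predicate( item ) ]
        report.append( ( filter_name, alone_count, len( running_list ) ) )
    return running_list, report

# hypothetical stand-in relations.
relations = [
    { "type": "quoted", "from": "person", "through": "article" },
    { "type": "mentioned", "from": "person", "through": "article" },
    { "type": "quoted", "from": "organization", "through": "article" },
    { "type": "same_section", "from": "person", "through": "article" },
]

named_filters = [
    ( "relation_type_slug", lambda r: r[ "type" ] in ( "quoted", "mentioned", "shared_byline" ) ),
    ( "FROM is person", lambda r: r[ "from" ] == "person" ),
]

result_list, report = apply_filters_in_sequence( relations, named_filters )
for filter_name, alone_count, running_count in report:
    print( "{}: alone = {}; running = {}".format( filter_name, alone_count, running_count ) )
```

Printing the "alone" count next to the running count makes it easier to spot which filter is responsible when the end-to-end count drops to zero.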
In [74]:
# load basic file into test instance.
#test_instance = TestHelper.load_basic()
test_instance = TestHelper.load_basic_2()
In [75]:
# filter_specification
selection_filters = test_instance.get_selection_filters()
#print( "base_filter_spec: {}".format( base_filter_spec ) )
# retrieve actual "filter_specification".
filter_specification = selection_filters.get( NetworkDataRequest.PROP_NAME_FILTER_SPECIFICATION, None )
# load it into a FilterSpec.
filter_spec = FilterSpec()
filter_spec.set_filter_spec( filter_specification )
print( "filter_spec:\n{}".format( filter_spec.to_json_string() ) )
print( "- comparison type: {}".format( filter_spec.get_comparison_type() ) )
In [76]:
# build out Q() instances
result_status = test_instance.build_filter_spec_q( filter_spec )
print( "build Q() result: {}".format( result_status ) )
In [77]:
# try filtering.
relation_qs = Entity_Relation.objects.all()
relation_qs = test_instance.filter_relations_by_filter_spec( relation_qs, filter_spec )
print( "relation_qs count: {}".format( relation_qs.count() ) )
In [78]:
relation_qs = None
value_list = None
child_filter_spec_list = None
child_filter_spec_count = None
filter_1 = None
filter_1_q = None
test_qs = None
test_count = None
relation_count = None
# init QS
relation_qs = Entity_Relation.objects.all()
# work through child filter spec list
value_list = filter_spec.get_value_list()
child_filter_spec_list = filter_spec.get_child_filter_spec_list()
child_filter_spec_count = len( child_filter_spec_list )
print( "- Child filter spec count: {}".format( child_filter_spec_count ) )
# filter 1
filter_1 = child_filter_spec_list[ 0 ]
print( "Filter 1: {}\nJSON: {}".format( filter_1, filter_1.to_json_string() ) )
value_1 = value_list[ 0 ]
print( "Value 1: {}".format( value_1 ) )
# Q()
filter_1_q = filter_1.get_my_q()
print( "Q 1: {}".format( filter_1_q ) )
# test Q
test_qs = relation_qs.filter( filter_1_q )
test_count = test_qs.count()
print( "----> test_qs count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_1_q )
relation_count = relation_qs.count()
print( "\n\n- relation_qs count: {}".format( relation_count ) )
In [79]:
#relation_qs = None
current_q = None
test_qs = None
test_count = None
relation_count = None
filter_2 = None
filter_2_child_list = None
filter_2_child_1 = None
filter_2_child_1_q = None
filter_2_child_2 = None
filter_2_child_2_q = None
filter_2_child_3 = None
filter_2_child_3_q = None
# init QS
#relation_qs = Entity_Relation.objects.all()
# filter 2
filter_2 = child_filter_spec_list[ 1 ]
print( "Filter 2:\n{}".format( filter_2.to_json_string() ) )
# Q()
current_q = filter_2.get_my_q()
print( "Q 2: {}".format( current_q ) )
# children
filter_2_child_list = filter_2.get_child_filter_spec_list()
print( filter_2_child_list )
# child 1
filter_2_child_1 = filter_2_child_list[ 0 ]
print( "\n\nFilter 2, child 1:\n{}".format( filter_2_child_1.to_json_string() ) )
# child 1 Q
filter_2_child_1_q = filter_2_child_1.get_my_q()
print( "Q filter_2_child_1_q: {}".format( filter_2_child_1_q ) )
# test child 1 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_1_q )
test_count = test_qs.count()
print( "----> test_qs f2c1q count: {}".format( test_count ) )
# child 2
filter_2_child_2 = filter_2_child_list[ 1 ]
print( "\n\nFilter 2, child 2:\n{}".format( filter_2_child_2.to_json_string() ) )
# child 2 Q
filter_2_child_2_q = filter_2_child_2.get_my_q()
print( "Q filter_2_child_2_q: {}".format( filter_2_child_2_q ) )
# test child 2 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_2_q )
test_count = test_qs.count()
print( "----> test_qs f2c2q count: {}".format( test_count ) )
# child 3
filter_2_child_3 = filter_2_child_list[ 2 ]
print( "\n\nFilter 2, child 3:\n{}".format( filter_2_child_3.to_json_string() ) )
# child 3 Q
filter_2_child_3_q = filter_2_child_3.get_my_q()
print( "Q filter_2_child_3_q: {}".format( filter_2_child_3_q ) )
# test child 3 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_3_q )
test_count = test_qs.count()
print( "----> test_qs f2c3q count: {}".format( test_count ) )
# test all together
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_2_child_1_q )
test_qs = test_qs.filter( filter_2_child_2_q )
test_qs = test_qs.filter( filter_2_child_3_q )
test_count = test_qs.count()
print( "\n\n- test_qs ALL count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_2_child_1_q )
relation_qs = relation_qs.filter( filter_2_child_2_q )
relation_qs = relation_qs.filter( filter_2_child_3_q )
relation_count = relation_qs.count()
print( "- relation_qs count: {}".format( relation_count ) )
In [80]:
#relation_qs = None
current_q = None
test_qs = None
test_count = None
relation_count = None
filter_3 = None
filter_3_child_list = None
filter_3_child_1 = None
filter_3_child_1_q = None
filter_3_child_2 = None
filter_3_child_2_q = None
filter_3_child_3 = None
filter_3_child_3_q = None
# init QS
#relation_qs = Entity_Relation.objects.all()
# filter 3
filter_3 = child_filter_spec_list[ 2 ]
print( "Filter 3:\n{}".format( filter_3.to_json_string() ) )
# Q()
current_q = filter_3.get_my_q()
print( "Q filter_3: {}".format( current_q ) )
# children
filter_3_child_list = filter_3.get_child_filter_spec_list()
print( filter_3_child_list )
# child 1
filter_3_child_1 = filter_3_child_list[ 0 ]
print( "\n\nFilter 3, child 1:\n{}".format( filter_3_child_1.to_json_string() ) )
# child 1 Q
filter_3_child_1_q = filter_3_child_1.get_my_q()
print( "Q filter_3_child_1_q: {}".format( filter_3_child_1_q ) )
# test child 1 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_1_q )
test_count = test_qs.count()
print( "- test_qs f3c1q count: {}".format( test_count ) )
# child 2
filter_3_child_2 = filter_3_child_list[ 1 ]
print( "\n\nFilter 3, child 2:\n{}".format( filter_3_child_2.to_json_string() ) )
# child 2 Q
filter_3_child_2_q = filter_3_child_2.get_my_q()
print( "Q filter_3_child_2_q: {}".format( filter_3_child_2_q ) )
# test child 2 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_2_q )
test_count = test_qs.count()
print( "- test_qs f3c2q count: {}".format( test_count ) )
# child 3
filter_3_child_3 = filter_3_child_list[ 2 ]
print( "\n\nFilter 3, child 3:\n{}".format( filter_3_child_3.to_json_string() ) )
# child 3 Q
filter_3_child_3_q = filter_3_child_3.get_my_q()
print( "Q filter_3_child_3_q: {}".format( filter_3_child_3_q ) )
# test child 3 Q
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_3_q )
test_count = test_qs.count()
print( "- test_qs f3c3q count: {}".format( test_count ) )
# test all together
test_qs = Entity_Relation.objects.all()
test_qs = test_qs.filter( filter_3_child_1_q )
test_qs = test_qs.filter( filter_3_child_2_q )
test_qs = test_qs.filter( filter_3_child_3_q )
test_count = test_qs.count()
print( "\n\n- test_qs ALL count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_3_child_1_q )
relation_qs = relation_qs.filter( filter_3_child_2_q )
relation_qs = relation_qs.filter( filter_3_child_3_q )
relation_count = relation_qs.count()
print( "\n\n- relation_qs count: {}".format( relation_count ) )
In [81]:
filter_4 = None
value_4 = None
filter_4_q = None
test_qs = None
test_count = None
relation_count = None
# filter 4
filter_4 = child_filter_spec_list[ 3 ]
print( "Filter 4: {}\n- JSON: {}".format( filter_4, filter_4.to_json_string() ) )
value_4 = value_list[ 3 ]
print( "Value 4: {}".format( value_4 ) )
# Q()
filter_4_q = filter_4.get_my_q()
print( "Q filter_4: {}".format( filter_4_q ) )
# test Q
test_qs = relation_qs.filter( filter_4_q )
test_count = test_qs.count()
print( "- test_qs count: {}".format( test_count ) )
# update QS
relation_qs = relation_qs.filter( filter_4_q )
relation_count = relation_qs.count()
print( "- relation_qs count: {}".format( relation_count ) )
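The filter 2 and filter 3 cells above repeat the same steps for each child spec: get the child's Q(), filter, count. A minimal sketch of folding that into a loop follows. apply_child_filters() is a hypothetical helper, not part of context; it assumes only the get_child_filter_spec_list() and get_my_q() calls used above. FakeSpec and FakeQS are stand-ins so the sketch runs without a database; with real data, pass relation_qs and one of the filter specs instead.

```python
def apply_child_filters( qs, filter_spec, label = "filter" ):

    """
    Hypothetical helper: narrow qs by each child filter spec's Q() in turn,
    printing the running count after each child is applied.
    """

    child_number = 0
    for child_spec in filter_spec.get_child_filter_spec_list():

        child_number += 1
        child_q = child_spec.get_my_q()

        # chain the filter, as in the "update QS" steps above.
        qs = qs.filter( child_q )
        print( "- {} child {} count: {}".format( label, child_number, qs.count() ) )

    return qs


# --- stand-ins so the sketch runs without Django (not real context classes) ---
class FakeSpec:

    def __init__( self, q = None, children = None ):
        self.q = q
        self.children = children or []

    def get_my_q( self ):
        return self.q

    def get_child_filter_spec_list( self ):
        return self.children


class FakeQS:

    def __init__( self, rows ):
        self.rows = list( rows )

    def filter( self, predicate ):
        # a real QuerySet takes a Q(); here a plain callable plays that role.
        return FakeQS( row for row in self.rows if predicate( row ) )

    def count( self ):
        return len( self.rows )


parent_spec = FakeSpec( children = [
    FakeSpec( q = lambda row: row % 2 == 0 ),
    FakeSpec( q = lambda row: row > 4 ),
] )
demo_qs = apply_child_filters( FakeQS( range( 10 ) ), parent_spec, label = "demo" )
```

Unlike the cells above, this sketch does not test each child Q() on its own against Entity_Relation.objects.all(); it only reports the cumulative count as the filters are chained.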
general TODO:
methods to find relations, similar to filter_entities() and lookup_entities() in Entity model class. Include:
from_entity_traits
// to = to_entity_traits
// through = through_entity_traits
either FROM or TO (so undirected search - "I don't care which side")
any of FROM, TO, THROUGH
relation_traits
NOTE for trait matching types (and probably entity identifiers, also):
Tags on Entity and/or Entity_Relation.
make small dummy person and organization classes in context, so I can use them to test Abstract_Entity_Container without needing context_text.
come up with better way to seed entities and relations for sourcenet - store spec in context_text base, then method to create or update all.
This will need to build up a basic object mapping based on foreign keys before it creates anything, then create things in the right order (Entity_Types and related first, then Entity_Identifier_Types, then Relation_Types). Order:
for each type:
To actually load, go in order of types outlined above, creating as you go.
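One way to derive the creation order from the foreign keys, rather than hard-coding it, is a topological sort. The sketch below uses the standard library's graphlib (Python 3.9+). The spec_dependencies mapping is hypothetical: it assumes identifier types and relation types each reference entity types, which would need to be checked against the actual models.

```python
from graphlib import TopologicalSorter

# hypothetical dependency map: each key is a type to create, each value lists
#     the types it holds foreign keys to (so they must be created first).
spec_dependencies = {
    "Entity_Type": [],
    "Entity_Identifier_Type": [ "Entity_Type" ],
    "Entity_Relation_Type": [ "Entity_Type" ],
}

# static_order() yields each type only after everything it depends on.
creation_order = list( TopologicalSorter( spec_dependencies ).static_order() )

for type_name in creation_order:
    print( "create or update: {}".format( type_name ) )
```

If the spec ever contained a circular reference, static_order() would raise graphlib.CycleError, which is a reasonable point to stop and report a bad spec before creating anything.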
abstraction:
make an abstract parent for a type that has associated trait specs (parent to Entity_Type and Entity_Relation_Type).
get_trait_spec().
make an abstract parent for trait containers that have associated types with associated trait specs (parent to Entity and Entity_Relation).
? - make an "Abstract_Relation_Container" abstract class for code related to an instance of a given model resulting in a relation (Article_Data)?
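For the relation-lookup item in the TODO list above, the "either FROM or TO" and "any of FROM, TO, THROUGH" cases come down to matching an entity against a chosen set of roles. A minimal pure-Python sketch of those semantics follows. RelationRow, its field names, and find_relations() are assumptions for illustration only, not the Entity_Relation API; in the ORM the same idea would presumably be expressed by OR-ing Q() objects together.

```python
from dataclasses import dataclass
from typing import Optional


# stand-in for an Entity_Relation row; field names are assumptions.
@dataclass
class RelationRow:
    relation_from: str
    relation_to: str
    relation_through: Optional[ str ] = None


# which attribute each role name maps to.
ROLE_TO_FIELD = {
    "from": "relation_from",
    "to": "relation_to",
    "through": "relation_through",
}


def find_relations( relation_list, entity, roles = ( "from", "to" ) ):

    """Return relations where entity fills ANY of the requested roles."""

    matches = []
    for relation in relation_list:
        if any( getattr( relation, ROLE_TO_FIELD[ role ] ) == entity for role in roles ):
            matches.append( relation )
    return matches


relations = [
    RelationRow( "alice", "bob", "article_1" ),
    RelationRow( "bob", "carol", "article_2" ),
    RelationRow( "carol", "dave", "article_1" ),
]

# undirected: "I don't care which side".
either_side = find_relations( relations, "bob" )

# any of FROM, TO, THROUGH.
any_role = find_relations( relations, "article_1", roles = ( "from", "to", "through" ) )

# directed: FROM only.
directed = find_relations( relations, "bob", roles = ( "from", ) )
```

Passing the roles as a parameter keeps one lookup path for the directed, undirected, and any-role cases instead of three separate methods.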
general TODO DONE:
// take the "create_article_entity()" function and put it in a class for loading sourcenet articles into context
context_text/export/to_context_base/export_to_context.py - class ExportToContext
// build unit test class for this loading class, and add one for creating a fake article, then making an entity out of it. Check entity, traits, and identifiers.
add a unique_identifier_type column to Article model. To set for NewsBank:
--SELECT COUNT( * ) FROM context_text_article;
--SELECT COUNT( * ) FROM context_text_article WHERE archive_source = 'NewsBank';
--SELECT * FROM context_text_article WHERE archive_source != 'NewsBank';
--UPDATE context_text_article SET unique_identifier_type = 'article_newsbank_id' WHERE archive_source = 'NewsBank';
--SELECT COUNT( * ) FROM context_text_article WHERE unique_identifier_type = 'permalink';
--SELECT COUNT( * ) FROM context_text_article WHERE unique_identifier_type = 'article_newsbank_id';
2019.11.07 - // method to find entity - based on type and identifier (accept all the fields that make sense, including optional identifier type instance).
Entity Creation TODO - DONE:
// Build and test basic article entity creation
// Build and test basic newspaper entity creation
// Build and test basic person entity creation
// Build and test basic organization entity creation
Relation Creation (start in export_to_context.py) TODO - DONE (All added 2019.11.25 unless noted otherwise):
filter_relations and lookup_relations with subset of above to serve Entity_Relation creation.
create unit test class for Entity_Relation based on Entity that tests:
instance methods:
set_basic_traits_from_dict()
class methods:
// Entity_Relation.create_entity_relation()
// filter_relations()
lookup_relations()
Entity_Type and Entity_Relation_Type tests:
get_type_for_slug()
// work through entity creation first in export_to_context.py:
create_entities()
create_newspaper_entities()
create_article_entities()
// test cases for each of these in test_export_to_context.
create_entity_container_entity()
create_article_entity()
create_person_entity()
@classmethod make_author_entity_list()
@classmethod make_relation_trait_dict()
@classmethod make_subject_entity_list() (including only sources, and don't include sources in subjects).
create_newspaper_relations()
create_article_relations()
create_relations()
process_articles()
// hook these in to process_articles()
// add traits to some of the relations created in TestHelper.create_test_relations.
update_entity() method on each entity a second time. Should not result in two separate entities.
process_articles() tests to make sure that tags are getting added to each article as they are processed.