Parsing the Causal Biological Network Database

Author: Charles Tapley Hoyt

Estimated Run Time: 1 minute

This notebook outlines the process of parsing the JSON Graph File format used in the Causal Biological Network (CBN) Database.


In [1]:
import json
import requests
import os
import time

import networkx as nx

import pybel
from pybel.constants import CITATION_NAME, CITATION_REFERENCE, CITATION_TYPE
import pybel_tools
from pybel_tools.visualization import to_jupyter

In [2]:
pybel.__version__


Out[2]:
'0.7.2-dev'

In [3]:
pybel_tools.__version__


Out[3]:
'0.1.17-dev'

In [4]:
time.asctime()


Out[4]:
'Thu Aug 10 14:07:57 2017'

Data Acquisition

Data can be downloaded directly from the GetJSONGraphFile endpoint, and are returned in the response as JSON.


In [5]:
res = requests.get("http://causalbionet.com/Networks/GetJSONGraphFile?networkId=hox_2.0_hs").json()
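
The request above sets no timeout and does not check the response status. A minimal sketch of a more defensive download, using only the standard library, might look like the following. The helper names `build_cbn_url` and `fetch_cbn_network` and the 30-second timeout are illustrative assumptions, not part of PyBEL or the CBN service; only the endpoint and the `networkId` parameter come from the cell above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the request in the cell above
CBN_ENDPOINT = "http://causalbionet.com/Networks/GetJSONGraphFile"


def build_cbn_url(network_id):
    """Build the download URL for a CBN network identifier (hypothetical helper)."""
    return "{}?{}".format(CBN_ENDPOINT, urlencode({"networkId": network_id}))


def fetch_cbn_network(network_id, timeout=30):
    """Download a network and decode its JSON body (hypothetical helper).

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a failed
    download surfaces as an exception instead of a malformed dictionary.
    """
    with urlopen(build_cbn_url(network_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_cbn_network("hox_2.0_hs")` would then return the same kind of dictionary that is stored in `res`.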

Parsing

The JGIF structure is traversed edge by edge, and each edge is converted to a BEL statement that is passed directly to a BelParser instance. During normal BEL compilation, this class is used internally and is hidden from the user.


In [6]:
graph = pybel.BELGraph()
parser = pybel.parser.BelParser(graph)

In [7]:
def get_citation(evidence):
    return {
        CITATION_NAME: evidence['citation']['name'],
        CITATION_TYPE: evidence['citation']['type'],
        CITATION_REFERENCE: evidence['citation']['id']
    }

In [8]:
annotation_map = {
    'tissue': 'Tissue',
    'disease': 'Disease',
    'species_common_name': 'Species'
}

In [9]:
species_map = {
    'human': '9606',
    'rat': '10116',
    'mouse': '10090'
}

In [10]:
annotation_value_map = {
    'Species': species_map
}
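
The three dictionaries above can drive the annotation handling generically, rather than spelling out each key by hand. The sketch below shows one way this could work. `normalize_annotations` is a hypothetical helper, the sample input is illustrative, and skipping values that are missing from a value map is an assumption of this sketch rather than documented PyBEL behavior.

```python
annotation_map = {
    'tissue': 'Tissue',
    'disease': 'Disease',
    'species_common_name': 'Species',
}

species_map = {
    'human': '9606',
    'rat': '10116',
    'mouse': '10090',
}

annotation_value_map = {
    'Species': species_map,
}


def normalize_annotations(biological_context):
    """Map a JGIF biological_context dict to BEL annotation keys and values."""
    context = biological_context or {}
    result = {}
    for jgif_key, bel_key in annotation_map.items():
        value = context.get(jgif_key)
        if not value:
            continue  # missing, null, or empty entries carry no information
        value_map = annotation_value_map.get(bel_key)
        if value_map is None:
            result[bel_key] = value  # no value mapping defined for this key
        elif value.lower() in value_map:
            result[bel_key] = value_map[value.lower()]
        # values absent from the mapping are skipped (assumption of this sketch)
    return result


# Illustrative input only; real CBN contexts may contain further keys
example = normalize_annotations({
    'tissue': 'endothelial cells',
    'disease': '',
    'species_common_name': 'Human',
})
print(example)
```

Supporting a new annotation then only requires a new entry in `annotation_map`, plus an entry in `annotation_value_map` if its values need translating.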

In [11]:
for edge in res['graph']['edges']:
    for evidence in edge['metadata']['evidences']:
        # Skip evidences that have no citation
        if not evidence.get('citation'):
            continue

        # Reset the parser's state, then set the citation and evidence text
        parser.control_parser.clear()
        parser.control_parser.citation = get_citation(evidence)
        parser.control_parser.evidence = evidence['summary_text']

        d = {}

        # Fall back to an empty dict when the biological context is missing or null
        annotations = evidence.get('biological_context') or {}

        if annotations.get('tissue'):
            d['Tissue'] = annotations['tissue']

        if annotations.get('disease'):
            d['Disease'] = annotations['disease']

        if annotations.get('species_common_name'):
            # Translate the common species name to its NCBI Taxonomy identifier
            d['Species'] = species_map[annotations['species_common_name'].lower()]

        parser.control_parser.annotations.update(d)

        # Assemble the BEL statement from the edge's source, relation, and target
        bel = '{source} {relation} {target}'.format_map(edge)
        try:
            parser.parseString(bel)
        except Exception as e:
            # Report statements that fail to parse and continue with the rest
            print(e, bel)
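
To make the traversal easier to follow without a network connection or a PyBEL installation, the sketch below isolates the structural part of the loop. It walks a hand-written dictionary that mimics only the JGIF keys used above and yields one BEL statement per cited evidence. The function name `iter_bel_statements` and the toy edge (including its placeholder citation and evidence text) are illustrative assumptions; real CBN files contain more fields.

```python
def iter_bel_statements(jgif):
    """Yield (bel, evidence) pairs for every evidence that has a citation."""
    for edge in jgif['graph']['edges']:
        # format_map ignores extra keys such as 'metadata'
        bel = '{source} {relation} {target}'.format_map(edge)
        for evidence in edge['metadata']['evidences']:
            if not evidence.get('citation'):
                continue  # same filter as the parsing loop
            yield bel, evidence


# Toy structure with placeholder citation and evidence values
toy_jgif = {
    'graph': {
        'edges': [{
            'source': 'p(HGNC:HOXA5)',
            'relation': 'increases',
            'target': 'p(SFAM:"AKT Family", pmod(Ph, Ser, 473))',
            'metadata': {
                'evidences': [
                    {
                        'citation': {'type': 'PubMed', 'name': 'placeholder', 'id': '0'},
                        'summary_text': 'placeholder evidence text',
                    },
                    {'citation': None, 'summary_text': 'dropped: no citation'},
                ],
            },
        }],
    },
}

statements = list(iter_bel_statements(toy_jgif))
for bel, evidence in statements:
    print(bel, '|', evidence['summary_text'])
```

Only the first evidence survives the filter, so the toy edge contributes a single statement.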

Visualization

Finally, the graph is visualized directly in the notebook with pybel_tools.visualization.to_jupyter.


In [12]:
to_jupyter(graph)


Out[12]:

In [17]:
pybel.to_database(graph)

Using PyBEL Functions

This pipeline is implemented directly in PyBEL as pybel.from_cbn_jgif, which takes the loaded JGIF dictionary and returns a graph.


In [13]:
with open(os.path.join(os.environ['BMS_BASE'], 'cbn', 'Human-2.0', 'Hox-2.0-Hs.jgf')) as f:
    graph_jgif_dict = json.load(f)

In [14]:
%%time
graph = pybel.from_cbn_jgif(graph_jgif_dict)


CPU times: user 4.73 s, sys: 69.2 ms, total: 4.79 s
Wall time: 4.8 s

In [15]:
bel_lines = pybel.to_bel_lines(graph)

graph_reloaded = pybel.from_lines(bel_lines)


WARNING:pybel.parser:Line 0000006 - VersionFormatWarning: Version string "2.0" neither is a date like YYYYMMDD nor adheres to semantic versioning
ERROR:pybel.io.line_utils:Missing required document metadata: ContactInfo
WARNING:pybel.parser:Line 0000021 - IllegalAnnotationValueWarning: "normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000023 - IllegalAnnotationValueWarning: "HUVECs" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000025 - MissingAnnotationKeyWarning: "Disease" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000030 - IllegalAnnotationValueWarning: "Cell line" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000031 - IllegalAnnotationValueWarning: "Normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000033 - IllegalAnnotationValueWarning: "HUVEC" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000035 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000048 - IllegalAnnotationValueWarning: "endothelial cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000050 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000057 - IllegalAnnotationValueWarning: "hematopoietic stem cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000059 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000064 - IllegalAnnotationValueWarning: "Cell line" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000065 - IllegalAnnotationValueWarning: "Normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000067 - IllegalAnnotationValueWarning: "HUVEC" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000069 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000090 - IllegalAnnotationValueWarning: "endothelial cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000092 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000104 - UndefinedAnnotationWarning: "SupportingText" is not defined
WARNING:pybel.parser:Line 0000106 - IllegalAnnotationValueWarning: "cell line eoma" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000107 - MissingSupportWarning: Missing evidence; can't add: p(HGNC:HOXA5) increases p(SFAM:"AKT Family", pmod(Ph, Ser, 473))
WARNING:pybel.parser:Line 0000108 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000109 - MissingAnnotationKeyWarning: "SupportingText" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000113 - IllegalAnnotationValueWarning: "Myocytes, Smooth Muscle" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000115 - IllegalAnnotationValueWarning: "vascular tissue" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000117 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000123 - IllegalAnnotationValueWarning: "endothelial cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000125 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000131 - IllegalAnnotationValueWarning: "normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000133 - IllegalAnnotationValueWarning: "mouse embryonic fibroblasts" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000135 - MissingAnnotationKeyWarning: "Disease" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000162 - IllegalAnnotationValueWarning: "vascular smooth muscle cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000164 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000171 - IllegalAnnotationValueWarning: "endothelial cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000173 - MissingAnnotationKeyWarning: "Tissue" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000179 - IllegalAnnotationValueWarning: "normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000181 - IllegalAnnotationValueWarning: "vascular smooth muscle cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000183 - MissingAnnotationKeyWarning: "Disease" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000185 - IllegalAnnotationValueWarning: "normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000187 - IllegalAnnotationValueWarning: "vascular smooth muscle cells" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000189 - MissingAnnotationKeyWarning: "Disease" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000194 - IllegalAnnotationValueWarning: "Stem Cells" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000197 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000202 - IllegalAnnotationValueWarning: "Endothelial Cells" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000205 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000217 - IllegalAnnotationValueWarning: "Cell line" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000218 - IllegalAnnotationValueWarning: "Normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000220 - IllegalAnnotationValueWarning: "HUVEC" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000222 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000227 - IllegalAnnotationValueWarning: "Endothelial Cells" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000230 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Line 0000231 - IllegalAnnotationValueWarning: "Endothelial Cells" is not defined in the Cell annotation
WARNING:pybel.parser:Line 0000234 - MissingAnnotationKeyWarning: "Cell" is not set, so it can't be unset
WARNING:pybel.parser:Added singleton [line 243]: p(HGNC:KDR). Putative error - needs checking.
WARNING:pybel.parser:Added singleton [line 244]: p(HGNC:CDKN1C). Putative error - needs checking.
WARNING:pybel.parser:Added singleton [line 245]: p(HGNC:JUN). Putative error - needs checking.
WARNING:pybel.parser:Added singleton [line 246]: p(HGNC:CDKN1B). Putative error - needs checking.
WARNING:pybel.parser:Added singleton [line 247]: p(HGNC:ITGA5). Putative error - needs checking.
WARNING:pybel.parser:Added singleton [line 248]: p(HGNC:CDKN2D). Putative error - needs checking.
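
A log of this length is easier to triage once it is tallied by warning class. The sketch below is one way to do that with the standard library. The regular expression is an assumption based on the message format shown above, `count_warning_kinds` is a hypothetical helper, and `sample_log` repeats four lines from that output rather than capturing the logger directly.

```python
import re
from collections import Counter

# Matches e.g. "... Line 0000021 - IllegalAnnotationValueWarning: ..."
WARNING_PATTERN = re.compile(r'Line \d+ - (?P<kind>\w+):')

# Four lines copied from the output above
sample_log = '''\
WARNING:pybel.parser:Line 0000021 - IllegalAnnotationValueWarning: "normal" is not defined in the Disease annotation
WARNING:pybel.parser:Line 0000023 - IllegalAnnotationValueWarning: "HUVECs" is not defined in the Tissue annotation
WARNING:pybel.parser:Line 0000025 - MissingAnnotationKeyWarning: "Disease" is not set, so it can't be unset
WARNING:pybel.parser:Added singleton [line 243]: p(HGNC:KDR). Putative error - needs checking.
'''


def count_warning_kinds(log_text):
    """Count log lines per warning class; lines without a class are ignored."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WARNING_PATTERN.search(line)
        if match:
            counts[match.group('kind')] += 1
    return counts


counts = count_warning_kinds(sample_log)
for kind, n in counts.most_common():
    print(kind, n)
```

The singleton messages do not follow the `Line N - Kind:` pattern, so this tally leaves them out; they would need a separate pattern.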

In [16]:
to_jupyter(graph_reloaded)


Out[16]:
