In [103]:
import requests
import json
from tabulate import tabulate

Our list of targets, given as Ensembl gene IDs


In [104]:
targets = ['ENSG00000069696', 'ENSG00000144285']
targets_string = ', '.join('"{0}"'.format(t) for t in targets)

POST our list of targets to the association filter endpoint to find their associations. Set facets to true so that the response includes the facet counts.


In [105]:
url = 'https://www.targetvalidation.org/api/latest/public/association/filter'
headers = {"Accept": "application/json"}
# Let json.dumps build the request body instead of concatenating strings by hand
data = json.dumps({"target": targets, "facets": True})

response = requests.post(url, headers=headers, data=data)
output = response.json()
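
Wrapping the payload construction in a small helper makes it easy to reuse with optional filters, such as the therapeutic_area filter used later in this notebook. This is a minimal sketch: build_filter_payload is a hypothetical name, not part of any client library, and it only assumes the JSON fields that appear in this notebook (target, facets, therapeutic_area).

```python
import json

def build_filter_payload(targets, facets=True, therapeutic_areas=None):
    """Return the JSON request body as a string (hypothetical helper)."""
    payload = {"target": list(targets), "facets": facets}
    # Only add the therapeutic_area filter when one is requested
    if therapeutic_areas:
        payload["therapeutic_area"] = list(therapeutic_areas)
    return json.dumps(payload)

body = build_filter_payload(['ENSG00000069696', 'ENSG00000144285'])
```

The resulting string can be passed as the data argument of requests.post.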

Print out all the JSON returned, just for reference


In [106]:
# print(json.dumps(output, indent=2))

The therapeutic area facets look interesting. Let's iterate through them and collect the counts for display.


In [107]:
therapeuticareas = []

for bucket in output['facets']['therapeutic_area']['buckets']:
    therapeuticareas.append({
            'target_count' : bucket['unique_target_count']['value'], 
            'disease_count' : bucket['unique_disease_count']['value'],
            'therapeutic_area' : bucket['label'],
            'key' : bucket['key']
        })
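
To make the bucket structure explicit, here is a sketch of the same extraction run against a hand-written stand-in for the response. The sample dictionary only mirrors the fields read in the loop above, with values taken from this notebook's output; it is an assumption about the shape, not a full API response, and extract_therapeutic_areas is a hypothetical helper name.

```python
def extract_therapeutic_areas(output):
    """Flatten the therapeutic_area facet buckets into a list of dicts."""
    rows = []
    for bucket in output['facets']['therapeutic_area']['buckets']:
        rows.append({
            'target_count': bucket['unique_target_count']['value'],
            'disease_count': bucket['unique_disease_count']['value'],
            'therapeutic_area': bucket['label'],
            'key': bucket['key'],
        })
    return rows

# Hand-written stand-in for the relevant part of the response (assumed shape)
sample_output = {
    'facets': {
        'therapeutic_area': {
            'buckets': [
                {
                    'key': 'efo_0000508',
                    'label': 'genetic disorder',
                    'unique_target_count': {'value': 2},
                    'unique_disease_count': {'value': 285},
                },
            ]
        }
    }
}

rows = extract_therapeutic_areas(sample_output)
```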

Sort by target count and then disease count, both in descending order


In [108]:
therapeuticareas = sorted(therapeuticareas, key=lambda k: (k['target_count'],k['disease_count']), reverse=True)
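
The sort key is a tuple, and Python compares tuples element by element, so target_count decides the order and disease_count only breaks ties. A small sketch using three rows whose counts are taken from this notebook's table output:

```python
rows = [
    {'therapeutic_area': 'skin disease', 'target_count': 1, 'disease_count': 24},
    {'therapeutic_area': 'hematological system disease', 'target_count': 2, 'disease_count': 6},
    {'therapeutic_area': 'genetic disorder', 'target_count': 2, 'disease_count': 285},
]

# (target_count, disease_count) tuples are compared left to right
ranked = sorted(rows, key=lambda k: (k['target_count'], k['disease_count']), reverse=True)
names = [r['therapeutic_area'] for r in ranked]
# names == ['genetic disorder', 'hematological system disease', 'skin disease']
```

Skin disease has more diseases than hematological system disease, yet it ranks lower because it covers only one of the two targets.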

Use the Python tabulate library to render a pretty table of our extracted therapeutic areas. Note: you may need to run pip install tabulate in your Python environment.


In [109]:
print(tabulate(therapeuticareas, headers="keys", tablefmt="grid"))


+------------------------------+-----------------+-------------+----------------+
| therapeutic_area             |   disease_count | key         |   target_count |
+==============================+=================+=============+================+
| genetic disorder             |             285 | efo_0000508 |              2 |
+------------------------------+-----------------+-------------+----------------+
| phenotype                    |             115 | efo_0000651 |              2 |
+------------------------------+-----------------+-------------+----------------+
| nervous system disease       |              86 | efo_0000618 |              2 |
+------------------------------+-----------------+-------------+----------------+
| eye disease                  |              80 | efo_0003966 |              2 |
+------------------------------+-----------------+-------------+----------------+
| neoplasm                     |              49 | efo_0000616 |              2 |
+------------------------------+-----------------+-------------+----------------+
| metabolic disease            |              38 | efo_0000589 |              2 |
+------------------------------+-----------------+-------------+----------------+
| cardiovascular disease       |              38 | efo_0000319 |              2 |
+------------------------------+-----------------+-------------+----------------+
| endocrine system disease     |              26 | efo_0001379 |              2 |
+------------------------------+-----------------+-------------+----------------+
| reproductive system disease  |              25 | efo_0000512 |              2 |
+------------------------------+-----------------+-------------+----------------+
| skeletal system disease      |              21 | efo_0002461 |              2 |
+------------------------------+-----------------+-------------+----------------+
| muscular disease             |              19 | efo_0002970 |              2 |
+------------------------------+-----------------+-------------+----------------+
| immune system disease        |              15 | efo_0000540 |              2 |
+------------------------------+-----------------+-------------+----------------+
| respiratory system disease   |              10 | efo_0000684 |              2 |
+------------------------------+-----------------+-------------+----------------+
| infectious disease           |               8 | efo_0005741 |              2 |
+------------------------------+-----------------+-------------+----------------+
| hematological system disease |               6 | efo_0005803 |              2 |
+------------------------------+-----------------+-------------+----------------+
| skin disease                 |              24 | efo_0000701 |              1 |
+------------------------------+-----------------+-------------+----------------+
| digestive system disease     |              11 | efo_0000405 |              1 |
+------------------------------+-----------------+-------------+----------------+
| other                        |               2 | other       |              1 |
+------------------------------+-----------------+-------------+----------------+

Let's consider just the top 5 therapeutic areas


In [110]:
therapeuticareas = therapeuticareas[:5]
print(tabulate(therapeuticareas, headers="keys", tablefmt="grid"))


+------------------------+-----------------+-------------+----------------+
| therapeutic_area       |   disease_count | key         |   target_count |
+========================+=================+=============+================+
| genetic disorder       |             285 | efo_0000508 |              2 |
+------------------------+-----------------+-------------+----------------+
| phenotype              |             115 | efo_0000651 |              2 |
+------------------------+-----------------+-------------+----------------+
| nervous system disease |              86 | efo_0000618 |              2 |
+------------------------+-----------------+-------------+----------------+
| eye disease            |              80 | efo_0003966 |              2 |
+------------------------+-----------------+-------------+----------------+
| neoplasm               |              49 | efo_0000616 |              2 |
+------------------------+-----------------+-------------+----------------+

Now identify the top 5 diseases for each of those. Unfortunately the facets give us only the disease codes, not the disease names. If this is the right approach, would it need an API change?


In [111]:
for therapeuticarea in therapeuticareas:
    print("Therapeutic area: " + therapeuticarea['therapeutic_area'])
    # Same query as before, restricted to a single therapeutic area
    data = json.dumps({
        "target": targets,
        "facets": True,
        "therapeutic_area": [therapeuticarea['key']]
    })
    response = requests.post(url, headers=headers, data=data)
    output = response.json()

    diseases = []

    for bucket in output['facets']['disease']['buckets']:
        diseases.append({
            'target_count' : bucket['unique_target_count']['value'],
            'doc_count' : bucket['doc_count'],
            'key' : bucket['key']
        })

    # Sort by target_count, then doc_count (both descending), and take the top 5
    diseases = sorted(diseases, key=lambda k: (k['target_count'],k['doc_count']), reverse=True)
    diseases = diseases[:5]

    print(tabulate(diseases, headers="keys", tablefmt="grid"))
    print("")


Therapeutic area: genetic disorder
+-------------+-----------------+----------------+
|   doc_count | key             |   target_count |
+=============+=================+================+
|           2 | Orphanet_101435 |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_101953 |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_139009 |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_1478   |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_156638 |              2 |
+-------------+-----------------+----------------+

Therapeutic area: phenotype
+-------------+-------------+----------------+
|   doc_count | key         |   target_count |
+=============+=============+================+
|           2 | EFO_0003108 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0003765 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0003843 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0003847 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0005230 |              2 |
+-------------+-------------+----------------+

Therapeutic area: nervous system disease
+-------------+-------------+----------------+
|   doc_count | key         |   target_count |
+=============+=============+================+
|           2 | EFO_0000249 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000289 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000326 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000474 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000677 |              2 |
+-------------+-------------+----------------+

Therapeutic area: eye disease
+-------------+-----------------+----------------+
|   doc_count | key             |   target_count |
+=============+=================+================+
|           2 | EFO_0001365     |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_101435 |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_183601 |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_183616 |              2 |
+-------------+-----------------+----------------+
|           2 | Orphanet_34533  |              2 |
+-------------+-----------------+----------------+

Therapeutic area: neoplasm
+-------------+-------------+----------------+
|   doc_count | key         |   target_count |
+=============+=============+================+
|           2 | EFO_0000305 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000311 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000313 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000326 |              2 |
+-------------+-------------+----------------+
|           2 | EFO_0000565 |              2 |
+-------------+-------------+----------------+
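
When only the top few entries are needed, the sort-then-slice step used in this notebook can alternatively be written with heapq.nlargest from the standard library, which the Python documentation describes as equivalent to sorted(iterable, key=key, reverse=True)[:n]. A sketch on six therapeutic area rows whose counts are taken from the table earlier in this notebook; rank_key is a hypothetical helper name:

```python
import heapq

rows = [
    {'therapeutic_area': 'eye disease', 'target_count': 2, 'disease_count': 80},
    {'therapeutic_area': 'skin disease', 'target_count': 1, 'disease_count': 24},
    {'therapeutic_area': 'genetic disorder', 'target_count': 2, 'disease_count': 285},
    {'therapeutic_area': 'other', 'target_count': 1, 'disease_count': 2},
    {'therapeutic_area': 'phenotype', 'target_count': 2, 'disease_count': 115},
    {'therapeutic_area': 'infectious disease', 'target_count': 2, 'disease_count': 8},
]

def rank_key(row):
    # Rank by target count first, then by disease count
    return (row['target_count'], row['disease_count'])

# Largest three rows under rank_key, in descending order
top3 = heapq.nlargest(3, rows, key=rank_key)
top3_names = [r['therapeutic_area'] for r in top3]
```

For lists as short as the ones here the difference is cosmetic; nlargest mainly saves work when the list is long and n is small.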

