GA4GH RNA Quantification API Example

This example illustrates the methods used to access the rna_quantification_service.

Initialize client

In this step we create a client object which will be used to communicate with the server.


In [1]:
from ga4gh.client import client
c = client.HttpClient("http://1kgenomes.ga4gh.org")

In [2]:
# Obtain a dataset id (see the `1kg_metadata_service` notebook).
dataset = next(c.search_datasets())

Search RNA Quantification Sets Method

This method returns the RNA quantification sets in a dataset. An RNA quantification set associates a group of related RNA quantifications. Note that we use the dataset_id obtained in the same way as in the 1kg_metadata_service notebook.


In [3]:
counter = 0
for rna_quant_set in c.search_rna_quantification_sets(dataset_id=dataset.id):
    if counter > 5:
        break
    counter += 1
    print(" id: {}".format(rna_quant_set.id))
    print(" dataset_id: {}".format(rna_quant_set.dataset_id))
    print(" name: {}\n".format(rna_quant_set.name))


 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iXQ
 dataset_id: WyIxa2dlbm9tZXMiXQ
 name: E-GEUV-1 RNA Quantification
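
The API treats ids as opaque strings, but the ones printed above appear to be unpadded URL-safe base64 encodings of a JSON list. The sketch below decodes two of them for inspection. `decode_compound_id` is a hypothetical helper, not part of the ga4gh client, and the encoding is an implementation detail of this server that may not hold elsewhere.

```python
import base64
import json


def decode_compound_id(encoded_id):
    """Decode an id, assuming unpadded URL-safe base64 of a JSON list."""
    # Restore the '=' padding that base64 decoding requires.
    padded = encoded_id + "=" * (-len(encoded_id) % 4)
    raw = base64.urlsafe_b64decode(padded.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


# Ids copied from the output above.
dataset_parts = decode_compound_id("WyIxa2dlbm9tZXMiXQ")
set_parts = decode_compound_id(
    "WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iXQ")
print(dataset_parts)
print(set_parts)
```

Decoding is useful only for debugging; code that talks to the server should keep passing the ids through unchanged.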

Get RNA Quantification Set by id method

This method obtains a single RNA quantification set by its unique identifier. This id was chosen arbitrarily from the returned results.


In [4]:
single_rna_quant_set = c.get_rna_quantification_set(
    rna_quantification_set_id=rna_quant_set.id)
print(" name: {}\n".format(single_rna_quant_set.name))


 name: E-GEUV-1 RNA Quantification

Search RNA Quantifications

We can list all of the RNA quantifications in an RNA quantification set. The rna_quantification_set_id was chosen arbitrarily from the returned results.


In [5]:
counter = 0
for rna_quant in c.search_rna_quantifications(
        rna_quantification_set_id=rna_quant_set.id):
    if counter > 5:
        break
    counter += 1
    print("RNA Quantification: {}".format(rna_quant.name))
    print(" id: {}".format(rna_quant.id))
    print(" description: {}\n".format(rna_quant.description))
    test_quant = rna_quant


RNA Quantification: HG00104
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMTA0Il0
 description: RNA seq data from lymphoblastoid cell lines in the 1000 Genome Project, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples/

RNA Quantification: HG00103
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMTAzIl0
 description: RNA seq data from lymphoblastoid cell lines in the 1000 Genome Project, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples/

RNA Quantification: HG00102
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMTAyIl0
 description: RNA seq data from lymphoblastoid cell lines in the 1000 Genome Project, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples/

RNA Quantification: HG00101
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMTAxIl0
 description: RNA seq data from lymphoblastoid cell lines in the 1000 Genome Project, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples/

RNA Quantification: HG00100
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMTAwIl0
 description: RNA seq data from lymphoblastoid cell lines in the 1000 Genome Project, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples/

RNA Quantification: HG00099
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5Il0
 description: RNA seq data from lymphoblastoid cell lines in the 1000 Genome Project, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples/
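
The counter-and-break pattern used in these cells can also be written with `itertools.islice`, which stops consuming a lazy iterator after a fixed number of items. The sketch below uses a stand-in generator (`fake_search`, a hypothetical name) in place of a client call so that it runs without a server. With the real client, the same `islice` call would wrap `c.search_rna_quantifications(...)`, assuming the search methods return iterators as the loops above suggest.

```python
from itertools import islice


def fake_search(total):
    """Stand-in for a client search method: lazily yields result names."""
    for i in range(total):
        yield "result-{}".format(i)


# Take at most six results, matching the `counter > 5` cutoff used above.
first_six = list(islice(fake_search(100), 6))
for name in first_six:
    print(name)
```

If the iterator holds fewer than six items, `islice` simply stops early, so no extra length check is needed.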

Get RNA Quantification by Id

As with RNA quantification sets, we can retrieve a single RNA quantification by its id. This id was chosen arbitrarily from the returned results.

The RNA quantification reported contains details of the processing pipeline which include the source of the reads as well as the annotations used.


In [6]:
single_rna_quant = c.get_rna_quantification(
    rna_quantification_id=test_quant.id)
print(" name: {}".format(single_rna_quant.name))
print(" read_ids: {}".format(single_rna_quant.read_group_ids))
print(" annotations: {}\n".format(single_rna_quant.feature_set_ids))


 name: HG00099
 read_ids: []
 annotations: [u'WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyJd']

Search Expression Levels

The feature-level expression data for each RNA quantification is reported as a set of Expression Levels, which the rna_quantification_service lets us search.


In [7]:
def getUnits(unitType):
    # Map the integer unit code to a printable label (index 0 has no label).
    units = ["", "FPKM", "TPM"]
    return units[unitType]


counter = 0
for expression in c.search_expression_levels(
        rna_quantification_id=test_quant.id):
    if counter > 5:
        break
    counter += 1
    print("Expression Level: {}".format(expression.name))
    print(" id: {}".format(expression.id))
    print(" feature: {}".format(expression.feature_id))
    print(" expression: {} {}".format(expression.expression, getUnits(expression.units)))
    print(" read_count: {}".format(expression.raw_read_count))
    print(" confidence_interval: {} - {}\n".format(
            expression.conf_interval_low, expression.conf_interval_high))


Expression Level: ENST00000619216.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiYjUwMzc0ZDItOTVkOC00NzhmLWJjYWYtMTUzZjU3N2E4YmYxIl0
 feature: 
 expression: 2.16668 TPM
 read_count: 1.75
 confidence_interval: 0.0 - 0.0

Expression Level: ENST00000461467.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiYTUwZjFmYmQtZWI0Yy00ODg2LThjYzItZTYyNTEwOTAzN2Y0Il0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjE5OTMxMiJd
 expression: 0.33977 TPM
 read_count: 4.5671
 confidence_interval: 0.0 - 0.0

Expression Level: ENST00000466430.5
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiOTc4YjcwYTgtZDZmNi00MmY1LWEyYmMtZTVjZWRhNjJlY2E0Il0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjM1NDE5MiJd
 expression: 0.140395 TPM
 read_count: 11.7305
 confidence_interval: 0.0 - 0.0

Expression Level: ENST00000471248.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiMTdjMzQ5MGMtNmNkYS00YzBlLWE0YzMtYWJiMjkxYzQ0MjU4Il0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MTk1MDY3MiJd
 expression: 0.259275 TPM
 read_count: 3.81327
 confidence_interval: 0.0 - 0.0

Expression Level: ENST00000610542.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiZTcxMWQzODQtZDMzOC00MDlkLTkzYjYtYWQ5MTM4M2QyNGUxIl0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MTk1MTQ0MCJd
 expression: 0.083744 TPM
 read_count: 1.48721
 confidence_interval: 0.0 - 0.0

Expression Level: ENST00000493797.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiODExNjMwOTgtMjdjNy00MTEwLTlmMGEtMWRkYWUyYWFlYmZjIl0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjA4NjU0NCJd
 expression: 0.119829 TPM
 read_count: 0.589627
 confidence_interval: 0.0 - 0.0
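
The `getUnits` helper above indexes a list, so a unit code of 3 or more would raise an IndexError. A slightly more defensive variant, sketched here under the assumption that codes 0, 1 and 2 keep the meanings implied by that list, uses a dictionary with a fallback label. `unit_name` and `UNIT_NAMES` are hypothetical names.

```python
# Codes taken from the list used in getUnits above: 0 is blank,
# 1 is FPKM, 2 is TPM. Any other code is reported as unknown.
UNIT_NAMES = {0: "", 1: "FPKM", 2: "TPM"}


def unit_name(unit_code):
    """Return a printable unit label, or 'unknown' for unmapped codes."""
    return UNIT_NAMES.get(unit_code, "unknown")


# Expression value copied from the first record in the output above.
print("expression: {} {}".format(2.16668, unit_name(2)))
```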

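It is also possible to restrict the search to specific features with feature_ids, or to request only expression values exceeding a threshold. The cell below passes an empty feature_ids list, which appears to apply no restriction, since its output matches the previous search.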


In [8]:
counter = 0
for expression in c.search_expression_levels(
        rna_quantification_id=test_quant.id, feature_ids=[]):
    if counter > 5:
        break
    counter += 1
    print("Expression Level: {}".format(expression.name))
    print(" id: {}".format(expression.id))
    print(" feature: {}\n".format(expression.feature_id))


Expression Level: ENST00000619216.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiYjUwMzc0ZDItOTVkOC00NzhmLWJjYWYtMTUzZjU3N2E4YmYxIl0
 feature: 

Expression Level: ENST00000461467.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiYTUwZjFmYmQtZWI0Yy00ODg2LThjYzItZTYyNTEwOTAzN2Y0Il0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjE5OTMxMiJd

Expression Level: ENST00000466430.5
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiOTc4YjcwYTgtZDZmNi00MmY1LWEyYmMtZTVjZWRhNjJlY2E0Il0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjM1NDE5MiJd

Expression Level: ENST00000471248.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiMTdjMzQ5MGMtNmNkYS00YzBlLWE0YzMtYWJiMjkxYzQ0MjU4Il0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MTk1MDY3MiJd

Expression Level: ENST00000610542.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiZTcxMWQzODQtZDMzOC00MDlkLTkzYjYtYWQ5MTM4M2QyNGUxIl0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MTk1MTQ0MCJd

Expression Level: ENST00000493797.1
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiODExNjMwOTgtMjdjNy00MTEwLTlmMGEtMWRkYWUyYWFlYmZjIl0
 feature: WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjA4NjU0NCJd
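
To narrow the results to one feature, `feature_ids` would contain that feature's id. The sketch below shows the shape of such a call using names and feature ids copied from the output above. `StubClient` is a hypothetical in-memory stand-in so the example runs offline, and it assumes that `feature_ids` keeps only expression levels whose `feature_id` is in the list; the real server's behaviour is not confirmed here.

```python
from collections import namedtuple

Expression = namedtuple("Expression", ["name", "feature_id"])


class StubClient(object):
    """Hypothetical offline stand-in for the ga4gh HttpClient."""

    def __init__(self, records):
        self._records = records

    def search_expression_levels(self, rna_quantification_id=None,
                                 feature_ids=None):
        # Assumed semantics: an empty or missing list means no filtering.
        for record in self._records:
            if not feature_ids or record.feature_id in feature_ids:
                yield record


# Names and feature ids copied from the output above.
records = [
    Expression("ENST00000619216.1", ""),
    Expression(
        "ENST00000461467.1",
        "WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjE5OTMxMiJd"),
    Expression(
        "ENST00000466430.5",
        "WyIxa2dlbm9tZXMiLCJnZW5jb2RlX3YyNGxpZnQzNyIsIjE0MDUwOTE3MjM1NDE5MiJd"),
]
stub = StubClient(records)
wanted = records[1].feature_id
matches = list(stub.search_expression_levels(
    rna_quantification_id="example", feature_ids=[wanted]))
for expression in matches:
    print("Expression Level: {}".format(expression.name))
```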

Let's look for some highly expressed features by setting a threshold.


In [9]:
counter = 0
for expression in c.search_expression_levels(
        rna_quantification_id=test_quant.id, threshold=1000):
    if counter > 5:
        break
    counter += 1
    print("Expression Level: {}".format(expression.name))
    print(" id: {}".format(expression.id))
    print(" expression: {} {}\n".format(expression.expression, getUnits(expression.units)))


Expression Level: ENST00000234590.8
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiN2I2NDkxMTctN2NhOS00ZWQ5LWJlM2MtM2RmZWI4YTdjOWRiIl0
 expression: 1754.62 TPM

Expression Level: ENST00000374550.7
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiN2NiNWM4ZjEtYjkzMS00MWQ4LWI3YjItZjAxNGZhZDM2YTQ5Il0
 expression: 1750.91 TPM

Expression Level: ENST00000396651.7
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiMmU5ODM0NWUtYjNjYy00ZDhjLWEzZDMtOWQ3MWJmZjA4NWMxIl0
 expression: 1226.12 TPM

Expression Level: ENST00000370321.7
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiMzkxOGU1ODEtZGM0NC00ZjFhLWE0ODMtMzI2NTdkYWZmNTQxIl0
 expression: 1116.39 TPM

Expression Level: ENST00000368567.4
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiZTBkMGQxOWEtODk3Zi00ZmQ4LWI1NDUtNTU4NDViZWZmNzcxIl0
 expression: 3674.61 TPM

Expression Level: ENST00000372360.7
 id: WyIxa2dlbm9tZXMiLCJFLUdFVVYtMSBSTkEgUXVhbnRpZmljYXRpb24iLCJIRzAwMDk5IiwiZmYwMjE4NDYtYTljNy00ZWZlLThlNmMtY2NiNWEyMDM4ZDdlIl0
 expression: 1922.7 TPM
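
The results above are not returned in order of expression, so ranking has to happen on the client side. The sketch below sorts the names and TPM values copied from the output above rather than querying the server.

```python
# (name, expression in TPM) pairs copied from the output above.
high_expression = [
    ("ENST00000234590.8", 1754.62),
    ("ENST00000374550.7", 1750.91),
    ("ENST00000396651.7", 1226.12),
    ("ENST00000370321.7", 1116.39),
    ("ENST00000368567.4", 3674.61),
    ("ENST00000372360.7", 1922.7),
]

# Sort from highest to lowest expression.
ranked = sorted(high_expression, key=lambda pair: pair[1], reverse=True)
for name, tpm in ranked:
    print("{}: {} TPM".format(name, tpm))
```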

