GA4GH 1000 Genomes Metadata Service

This example illustrates how to access the available datasets in a GA4GH server.

Initialize client

In this step we create a client object that will be used to communicate with the server. It is initialized with the server's URL.


In [1]:
from ga4gh.client import client
c = client.HttpClient("http://1kgenomes.ga4gh.org")

We will continue to refer to this client object for accessing the remote server.
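
To give a rough idea of what the client does on our behalf, the sketch below builds (but does not send) the kind of HTTP request that a dataset search may correspond to. The `/datasets/search` path, the POST method, and the `pageSize`/`pageToken` field names are assumptions based on GA4GH schema conventions, not something shown in this notebook, and `build_search_request` is a hypothetical helper that is not part of `ga4gh.client`.

```python
import json


def build_search_request(base_url, page_size=None, page_token=None):
    """Return (method, url, body) for a hypothetical dataset search request."""
    # Assumed endpoint path; consult the schema documentation for the real one.
    url = base_url.rstrip("/") + "/datasets/search"
    payload = {}
    # Paging fields are optional and are only sent when provided (assumed names).
    if page_size is not None:
        payload["pageSize"] = page_size
    if page_token is not None:
        payload["pageToken"] = page_token
    return "POST", url, json.dumps(payload, sort_keys=True)


method, url, body = build_search_request("http://1kgenomes.ga4gh.org", page_size=10)
print("%s %s %s" % (method, url, body))
```

In practice the client object hides these details, so the rest of this example only uses its methods.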

Access the dataset

Here, we issue our first API call to get a listing of the datasets hosted by the server. The API call returns an iterator, which we advance once to get the 1kgenomes dataset.


In [2]:
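# search_datasets() returns an iterator; take its first item.
dataset = next(c.search_datasets())
print(dataset)
# Keep the id so later calls can refer to this dataset.
data_set_id = dataset.id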


id: "WyIxa2dlbm9tZXMiXQ"
name: "1kgenomes"
description: "Variants from the 1000 Genomes project and GENCODE genes annotations"
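
Advancing an iterator that yields nothing raises `StopIteration`, which would happen here if a server hosted no datasets. A minimal sketch of a more defensive pattern follows. `fake_search_datasets` is a hypothetical stand-in for `c.search_datasets()` so that the snippet runs without a server; it yields plain dictionaries whose values are copied from the output above.

```python
def fake_search_datasets():
    """Hypothetical stand-in for c.search_datasets(); yields plain dicts."""
    yield {"id": "WyIxa2dlbm9tZXMiXQ", "name": "1kgenomes"}


def first_or_none(iterator):
    """Return the first item of an iterator, or None if it is empty."""
    return next(iterator, None)


first = first_or_none(fake_search_datasets())
empty = first_or_none(iter([]))

if first is not None:
    print(first["name"])
if empty is None:
    print("no datasets returned")
```

The second argument to `next` is the value returned when the iterator is exhausted, which avoids a `try`/`except` block around the call.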

NOTE:

We can also retrieve an individual dataset directly if we know its id. Using the id field from the output above, we request the dataset that it identifies.

In [3]:
# Fetch the same dataset directly by its id.
dataset_via_get = c.get_dataset(dataset_id=data_set_id)
print(dataset_via_get)


id: "WyIxa2dlbm9tZXMiXQ"
name: "1kgenomes"
description: "Variants from the 1000 Genomes project and GENCODE genes annotations"

We can then use this identifier to point to the same dataset throughout our examples.
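
The identifier is best treated as an opaque string, but it can help to see what it contains. Decoding the value shown above suggests that it is URL-safe base64, with the padding removed, of a JSON list holding the dataset name. This is an observation about this particular id rather than a documented guarantee, and `decode_compound_id` is a hypothetical helper, not part of the client library.

```python
import base64
import json


def decode_compound_id(encoded):
    """Decode an unpadded URL-safe base64 string that holds JSON."""
    # Restore the '=' padding that was stripped from the encoded form.
    padded = encoded + "=" * (-len(encoded) % 4)
    raw = base64.urlsafe_b64decode(padded.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


decoded = decode_compound_id("WyIxa2dlbm9tZXMiXQ")
print(decoded)
```

Code that talks to the server should still pass the id through unchanged rather than build ids by hand.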

For documentation on the service and more information, see:

https://ga4gh-schemas.readthedocs.io/en/latest/schemas/metadata_service.proto.html