In [1]:
# Import deriva modules
from deriva.core import ErmrestCatalog, get_credential
In [2]:
# Connect with the deriva catalog
protocol = 'https'
hostname = 'www.facebase.org'
catalog_number = 1
credential = get_credential(hostname)
catalog = ErmrestCatalog(protocol, hostname, catalog_number, credential)
In [3]:
# Get the path builder interface for this catalog
pb = catalog.getPathBuilder()
In [4]:
path = pb.schemas['isa'].tables['dataset'].path
We could have used the more compact dot-notation to start the same path.
In [5]:
path = pb.isa.dataset.path
In [6]:
print(path.uri)
In [7]:
results = path.entities()
In [8]:
results.fetch()
Out[8]:
A ResultSet behaves like a Python container. For example, we can check the number of rows it holds.
In [9]:
len(results)
Out[9]:
Note: If we had not explicitly called the fetch() method, it would have been called implicitly on the first container operation, such as len(...), list(...), iter(...), or item access with [...].
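The implicit-fetch behavior described in the note can be illustrated without a live catalog. The sketch below is a hypothetical stand-in, not deriva's actual ResultSet: the class name LazyResultSet, the fake_server function, and the fetch_count counter are all assumptions introduced only for illustration. It shows one plausible way a container can defer the fetch until the first container operation.

```python
class LazyResultSet:
    """Hypothetical stand-in for a result set that fetches on first use.

    This is NOT deriva's implementation; it only mimics the behavior
    described in the note above.
    """

    def __init__(self, fetcher):
        self._fetcher = fetcher   # callable returning a list of row dicts
        self._rows = None         # None means "nothing fetched yet"
        self.fetch_count = 0      # illustration only: counts fetch() calls

    def fetch(self, limit=None):
        rows = self._fetcher()
        self._rows = rows if limit is None else rows[:limit]
        self.fetch_count += 1
        return self               # returning self allows chaining

    def _ensure_fetched(self):
        if self._rows is None:
            self.fetch()

    def __len__(self):
        self._ensure_fetched()
        return len(self._rows)

    def __getitem__(self, index):
        self._ensure_fetched()
        return self._rows[index]

    def __iter__(self):
        self._ensure_fetched()
        return iter(self._rows)


def fake_server():
    # Pretend server response: five rows with made-up accession values.
    return [{'accession': 'DS%03d' % i} for i in range(1, 6)]


lazy = LazyResultSet(fake_server)
fetches_before = lazy.fetch_count          # no container operation yet
row_count = len(lazy)                      # first container operation fetches
first_row = lazy[0]                        # reuses the rows already fetched
fetches_after = lazy.fetch_count
limited_count = len(lazy.fetch(limit=3))   # explicit fetch replaces the rows
```

An explicit fetch(limit=3) replaces the cached rows in this sketch, which matches what the notebook shows when len(results) is checked again after results.fetch(limit=3).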
In [10]:
results[9]
Out[10]:
In [11]:
dataset = pb.schemas['isa'].tables['dataset']
print(results[9][dataset.accession.name])
In [12]:
results.fetch(limit=3)
len(results)
Out[12]:
In [13]:
for entity in results:
    print(entity[dataset.accession.name])
In [14]:
from pandas import DataFrame
DataFrame(results)
Out[14]:
It is also possible to fetch only a subset of attributes from the catalog. The attributes(...) method accepts a variable argument list followed by keyword arguments. Each argument must be a Column object from the table's columns container.
To rename the selected attributes, use the alias(...) method on the column object. For example, attributes(table.column.alias('new_name')) will rename table.column to new_name in the entities returned from the server. (It does not change anything in the stored catalog data.)
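To make the effect of selecting and aliasing attributes concrete, here is a minimal local sketch of the same idea applied to plain dictionaries. The project helper, its aliases parameter, and the sample rows (including the owner field) are hypothetical and are not part of deriva; in the notebook the server performs the projection.

```python
def project(rows, columns, aliases=None):
    """Keep only `columns` from each row, renaming any listed in `aliases`.

    Hypothetical helper for illustration; `aliases` maps an original
    column name to the name it should have in the returned rows.
    """
    aliases = aliases or {}
    projected = []
    for row in rows:
        # Build a new dict per row so the input rows are never modified.
        projected.append({aliases.get(name, name): row[name] for name in columns})
    return projected


# Made-up rows standing in for stored catalog data.
stored_rows = [
    {'accession': 'DS001', 'title': 'First dataset', 'released': True, 'owner': 'lab-a'},
    {'accession': 'DS002', 'title': 'Second dataset', 'released': False, 'owner': 'lab-b'},
]

subset = project(
    stored_rows,
    ['accession', 'title', 'released'],
    aliases={'released': 'is_released'},
)
```

The returned rows carry only the three selected attributes, with released appearing as is_released, while stored_rows keeps its original keys. This mirrors the point that aliasing affects only the returned entities.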
In [15]:
results = path.attributes(dataset.accession, dataset.title, dataset.released.alias('is_released')).fetch(limit=5)
In [16]:
list(results)
Out[16]: