In [1]:
import datacube.api
from pprint import pprint
By default, the API uses the database connection configured in the config file.
Details on setting up the config file and database can be found here: http://agdc-v2.readthedocs.org/en/develop/db_setup.html
In [2]:
dc = datacube.api.API()
In [3]:
dc.list_fields()
Out[3]:
The product and platform fields look interesting. Find out more about them:
In [4]:
dc.list_field_values('product')
Out[4]:
In [5]:
dc.list_field_values('platform')
Out[5]:
There are several API calls that describe and provide data in different ways:
get_descriptor() - provides a description of the data for a given query.
get_data() - provides the data as xarray.DataArrays for each variable. This is usually called based on information returned by the get_descriptor call.
get_data_array() - returns an xarray.DataArray n-dimensional object, with the variables stacked along the dimension labelled variables.
get_dataset() - returns an xarray.Dataset object, containing an xarray.DataArray for each variable.
In [6]:
query = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
}
descriptor = dc.get_descriptor(query, include_storage_units=False)
pprint(descriptor)
The query can be restricted to a particular range along a dimension.
For spatial queries, the dimension names should be used. Range values are interpreted as WGS84 coordinates by default.
In [7]:
query = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
    'dimensions': {
        'x': {
            'range': (146.0, 147.0),
        },
        'y': {
            'range': (-42.0, -41.0),
        },
        'time': {
            'range': ((2015, 1, 1), (2017, 1, 2)),
        }
    }
}
pprint(dc.get_descriptor(query, include_storage_units=False))
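The nested dictionary is easy to mistype, so a small helper can assemble it. The sketch below is not part of the datacube API: `build_query` and its parameter names are hypothetical, and it only builds a dictionary in the shape used by the query above.

```python
def build_query(product, platform, x_range, y_range, time_range=None):
    """Assemble a query dict in the shape used by the query above.

    Hypothetical helper, not part of datacube.api.
    """
    dimensions = {
        'x': {'range': tuple(x_range)},
        'y': {'range': tuple(y_range)},
    }
    if time_range is not None:
        # Time bounds are (year, month, day) tuples, as in the example query.
        start, end = time_range
        dimensions['time'] = {'range': (tuple(start), tuple(end))}
    return {
        'product': product,
        'platform': platform,
        'dimensions': dimensions,
    }


query = build_query(
    product='gamma0',
    platform=['ALOS_2', 'SENTINEL_1A'],
    x_range=(146.0, 147.0),
    y_range=(-42.0, -41.0),
    time_range=((2015, 1, 1), (2017, 1, 2)),
)
```

The resulting `query` can be passed to `dc.get_descriptor(query, include_storage_units=False)` in the same way as the hand-written dictionary.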
A coordinate reference system can be provided for the spatial dimensions, either as an EPSG code or a WKT description:
In [8]:
query = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
    'dimensions': {
        'x': {
            'range': (1187756.25, 1284918.75),
            'crs': 'EPSG:3577',
        },
        'y': {
            'range': (-4666481.25, -4548968.75),
            'crs': 'EPSG:3577',
        },
        'time': {
            'range': ((2016, 1, 1), (2017, 1, 1)),
        }
    }
}
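Because the same `crs` value is repeated on both spatial dimensions, it can be attached programmatically. This is a minimal sketch, assuming the query layout shown above; `with_crs` is a hypothetical helper, not a datacube function, and it performs no reprojection, so the range values must already be expressed in the given CRS.

```python
import copy


def with_crs(query, crs, spatial_dims=('x', 'y')):
    """Return a copy of `query` with `crs` set on each spatial dimension.

    Hypothetical helper; it only edits the dictionary and does not
    reproject any coordinates.
    """
    result = copy.deepcopy(query)  # leave the caller's query untouched
    dimensions = result.setdefault('dimensions', {})
    for name in spatial_dims:
        if name in dimensions:
            dimensions[name]['crs'] = crs
    return result


base = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
    'dimensions': {
        'x': {'range': (1187756.25, 1284918.75)},
        'y': {'range': (-4666481.25, -4548968.75)},
        'time': {'range': ((2016, 1, 1), (2017, 1, 1))},
    },
}
projected_query = with_crs(base, 'EPSG:3577')
```

The time dimension is left alone, since a CRS only applies to the spatial axes.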
get_data() retrieves the data, usually as a subset, based on the information provided by the get_descriptor call.
The query takes a similar form to the one used for get_descriptor, with the addition of a variables parameter. If variables is not specified, all variables are returned.
The query also accepts an array_range parameter on a dimension, which selects a subset by array indices rather than by labelled coordinates.
In [10]:
query = {
    'product': 'gamma0',
    'platform': 'ALOS_2',
    'variables': ['hh_gamma0', 'hv_gamma0'],
    'dimensions': {
        'x': {
            'range': (146, 147),
            'array_range': (0, 1),
        },
        'y': {
            'range': (-41, -42),
            'array_range': (0, 1),
        },
        'time': {
            'range': ((2016, 1, 1), (2017, 1, 1))
        }
    }
}
data = dc.get_data(query)
data.keys()
Out[10]:
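The difference between selecting by `range` and by `array_range` can be illustrated without a database. The sketch below uses a plain list as a stand-in for a coordinate axis. The function names are hypothetical, and treating `array_range` as a half-open `(start, stop)` index slice is an assumption rather than documented behaviour; how the API combines the two when both are given is not covered here.

```python
def select_by_label(coords, bounds):
    """Indices of the coordinates whose value lies within `bounds` (inclusive)."""
    low, high = sorted(bounds)  # accept bounds in either order, e.g. (-41, -42)
    return [i for i, value in enumerate(coords) if low <= value <= high]


def select_by_index(coords, array_range):
    """Indices picked by position, assuming a half-open (start, stop) slice."""
    start, stop = array_range
    return list(range(len(coords)))[start:stop]


# Stand-in longitude axis: five cells spaced 0.25 degrees apart.
x_coords = [146.0, 146.25, 146.5, 146.75, 147.0]

by_label = select_by_label(x_coords, (146.2, 146.8))  # chosen by coordinate value
by_index = select_by_index(x_coords, (0, 1))          # chosen by array position
print(by_label, by_index)  # prints: [1, 2, 3] [0]
```

Label-based selection depends on the coordinate values stored with the data, while index-based selection depends only on position along the axis.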
In [12]:
alos2 = dc.get_data_array(product='gamma0', platform='ALOS_2', y=(-41,-42), x=(146,147))
s1a = dc.get_data_array(product='gamma0', platform='SENTINEL_1A', y=(-41,-42), x=(146,147))
In [15]:
dc.get_dataset(product='gamma0', platform='SENTINEL_1A', y=(-41,-42), x=(146,147))
Out[15]: