This is a quick notebook demonstrating the pyroSAR functionality for importing processed SAR scenes into an Open Data Cube.
In [ ]:
from pyroSAR.datacube_util import Product, Dataset
from pyroSAR.ancillary import groupby, find_datasets
In [ ]:
# define a directory containing processed SAR scenes
datadir = '/path/to/some/data'
# define a name for the product YML; this is used for creating a new product in the datacube
yml_product = './product_def.yml'
# define a directory for storing the indexing YMLs; these are used to index the dataset in the datacube
yml_index_outdir = './yml_indexing'
# define a name for the ingestion YML; this is used to ingest the indexed datasets into the datacube
yml_ingest = './ingestion.yml'
# product description
product_name_indexed = 'S1_GRD_index'
product_name_ingested = 'S1_GRD_ingest'
product_type = 'gamma0'
description = 'this is just some test'
# define the units of the dataset measurements (i.e. polarizations)
units = 'backscatter'
# alternatively this could be a dictionary:
# units = {'VV': 'backscatter VV', 'VH': 'backscatter VH'}
ingest_location = './ingest'
In [ ]:
# find pyroSAR files by metadata attributes
files = find_datasets(datadir, recursive=True, sensor=('S1A', 'S1B'), acquisition_mode='IW')
# group the found files by their file basenames
# files with the same basename are considered to belong to the same dataset
grouped = groupby(files, 'outname_base')
In [ ]:
print(len(files))
print(len(grouped))
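As a quick check, a single group can be inspected; assuming that groupby returns a list of lists, each group should hold the file names of all datasets sharing the same basename.
In [ ]:
# optional check (assumption: groupby returns a list of lists of file names);
# inspect the first group to see which files are treated as one dataset
if len(grouped) > 0:
    print(grouped[0])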
In the next step, we create a new product, add the grouped datasets to it, and create YML files for indexing the datasets in the cube.
In [ ]:
# create a new product and add the collected datasets to it
# alternatively, an existing product can be used by providing the corresponding product YML file
# (a minimal sketch of this alternative follows after this cell)
with Product(name=product_name_indexed,
             product_type=product_type,
             description=description) as prod:
    for dataset in grouped:
        with Dataset(dataset, units=units) as ds:
            # add the dataset to the product;
            # this generalizes the metadata of the individual datasets to
            # measurement descriptions, which make up the product definition
            prod.add(ds)
            # parse datacube indexing YMLs from product and dataset metadata
            prod.export_indexing_yml(ds, yml_index_outdir)
    # write the product YML
    prod.write(yml_product)
    # print the product metadata, which is written to the product YML
    print(prod)
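As mentioned in the comment above, an existing product definition can be reused instead of creating a new one. The following is a minimal sketch of this alternative, assuming the product YML written above already exists; further datasets are added to the loaded product and their indexing YMLs are exported in the same way.
In [ ]:
# minimal sketch (assumption): reuse an existing product definition by passing
# its YML file instead of creating a new product, then add further datasets
# and export their indexing YMLs
with Product(yml_product) as prod:
    for dataset in grouped:
        with Dataset(dataset, units=units) as ds:
            prod.add(ds)
            prod.export_indexing_yml(ds, yml_index_outdir)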
Now that we have a YML file for creating a new product and individual YML files for indexing the datasets, we can create a final YML file, which will be used to ingest the indexed datasets into the cube. For this, a new product is created and the files are converted to NetCDF, which is optimised for usage in the cube. The location of these NetCDF files also needs to be defined.
In [ ]:
with Product(yml_product) as prod:
    prod.export_ingestion_yml(yml_ingest, product_name_ingested, ingest_location,
                              chunking={'x': 512, 'y': 512, 'time': 1})
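Once the three YML files exist, they are passed to the Open Data Cube command line client to add the product, index the datasets and run the ingestion. The following cell is a rough sketch of how this could be done from within the notebook; it assumes a configured datacube installation, and the exact commands and options may differ between datacube versions.
In [ ]:
# rough sketch (assumption): hand the generated YML files to the datacube
# command line client; this requires a configured datacube installation and
# the exact commands/options may differ between datacube versions
import glob
import subprocess

# register the product definition in the datacube
subprocess.run(['datacube', 'product', 'add', yml_product], check=True)

# index the individual datasets via the YMLs written to yml_index_outdir
for yml in sorted(glob.glob(yml_index_outdir + '/*.yml')):
    subprocess.run(['datacube', 'dataset', 'add', yml], check=True)

# ingest the indexed datasets into NetCDF storage units as defined in the ingestion YML
subprocess.run(['datacube', 'ingest', '-c', yml_ingest], check=True)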
In [ ]: