sat-search

This notebook is a tutorial on how to use sat-search to search STAC APIs, save the results, and download assets.

Sat-search is built using sat-stac which provides the core Python classes used to represent STAC catalogs: Collection, Item, and Items. It is recommended to review the tutorial on STAC Classes for more information on how to use these objects returned from searching.

Only the search module in sat-search is intended for use as a library, and it contains a single class, Search. The parser module is used for creating a CLI parser, and main contains the main function used in the CLI.

API endpoint: Sat-search uses an endpoint defined by the SATUTILS_API_URL environment variable. This defaults to https://sat-api.developmentseed.org/ but can point to any STAC compatible endpoint.
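
As a rough illustration of the endpoint lookup described above, the sketch below reads SATUTILS_API_URL from an environment mapping and falls back to the default. The helper name get_api_url is hypothetical and is not part of sat-search; the library's own configuration code may differ.

```python
import os

DEFAULT_API_URL = 'https://sat-api.developmentseed.org/'


def get_api_url(environ=None):
    """Return the STAC endpoint to use (hypothetical helper, not sat-search API)."""
    # Use the real process environment unless a mapping is passed in for testing.
    environ = os.environ if environ is None else environ
    return environ.get('SATUTILS_API_URL', DEFAULT_API_URL)


print(get_api_url({}))
print(get_api_url({'SATUTILS_API_URL': 'https://example.com/stac/'}))
```

In practice the variable would be exported in the shell or set through os.environ, presumably before satsearch is imported so that the library sees the new value.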

Initializing a Search object

The first step in performing a search is to create a Search object with all the desired query parameters. Query parameters need to follow the query format defined in the STAC specification, although an abbreviated form is also supported (see below).

Another place to look at the STAC query format is the sat-api docs; see in particular the section on full-featured querying, which is what sat-search uses to POST queries to an API. Any field that can be provided in the searchBody can be provided as a keyword parameter when creating the search. These fields include:

  • bbox: bounding box of the form [minlon, minlat, maxlon, maxlat]
  • intersects: A GeoJSON geometry
  • time: A single date-time, a period string, or a range (separated by /)
  • sort: A list of fields to sort on, each with a direction (ascending or descending)
  • query: Dictionary of properties to query on, supports eq, lt, gt, lte, gte
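
To make the time field concrete, here is a minimal sketch that splits a value into a start and an end date-time. It is an illustration only: the function names are hypothetical, it handles just the two date styles that appear in this tutorial, and it does not handle period strings. The API does its own parsing on the server side.

```python
from datetime import datetime

# Formats tried in order; an assumption covering only the styles used in this tutorial.
FORMATS = ('%Y-%m-%dT%H:%M:%SZ', '%Y-%m-%d')


def parse_datetime(text):
    """Parse one date-time string using the first format that matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError('unrecognized date-time: %r' % text)


def split_time(value):
    """Return (start, end); a single date-time yields start == end."""
    parts = value.split('/')
    if len(parts) == 1:
        parts = parts * 2
    if len(parts) != 2:
        raise ValueError('expected a single date-time or start/end: %r' % value)
    start, end = (parse_datetime(part) for part in parts)
    return start, end


print(split_time('2018-02-12T00:00:00Z/2018-03-18T12:31:12Z'))
```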

Examples of queries are in the sat-api docs, but an example JSON query that would be POSTed might be:

{
  "bbox": [
    -110,
    39.5,
    -105,
    40.5
  ],
  "time": "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z",
  "query": {
    "eo:cloud_cover": {
      "lt": 10
    }
  },
  "sort": [
    {
      "field": "eo:cloud_cover",
      "direction": "desc"
    }
  ]
}
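
The same body can be assembled as a Python dictionary and serialized with the standard json module, which is roughly what a client has to do before POSTing. This is a sketch of the payload only; it sends nothing, and the variable names are illustrative.

```python
import json

# Each top-level key corresponds to one of the search fields listed above.
search_body = {
    'bbox': [-110, 39.5, -105, 40.5],
    'time': '2018-02-12T00:00:00Z/2018-03-18T12:31:12Z',
    'query': {'eo:cloud_cover': {'lt': 10}},
    'sort': [{'field': 'eo:cloud_cover', 'direction': 'desc'}],
}

# Serialize to the JSON text that would form the request body.
payload = json.dumps(search_body, indent=2)
print(payload)
```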

Simple queries

In sat-search, each of the fields in the query is simply provided as a keyword argument:


In [1]:
from satsearch import Search

search = Search(bbox=[-110, 39.5, -105, 40.5])
print('bbox search: %s items' % search.found())

search = Search(time='2018-02-12T00:00:00Z/2018-03-18T12:31:12Z')
print('time search: %s items' % search.found())

search = Search(query={'eo:cloud_cover': {'lt': 10}})
print('cloud_cover search: %s items' % search.found())


bbox search: 6626 items
time search: 254719 items
cloud_cover search: 2241357 items

Complex query

Now we combine all of these filters into a single search. A sort parameter can be passed in the same way to order the results (which will be shown further below).


In [2]:
search = Search(bbox=[-110, 39.5, -105, 40.5],
               time='2018-02-12T00:00:00Z/2018-03-18T12:31:12Z',
               query={'eo:cloud_cover': {'lt': 10}})
print('%s items' % search.found())


39 items

Intersects query

The intersects query works the same way, except a geometry is provided.


In [3]:
geom = {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -66.3958740234375,
          43.305193797650546
        ],
        [
          -64.390869140625,
          43.305193797650546
        ],
        [
          -64.390869140625,
          44.22945656830167
        ],
        [
          -66.3958740234375,
          44.22945656830167
        ],
        [
          -66.3958740234375,
          43.305193797650546
        ]
      ]
    ]
}

search = Search(intersects=geom)
print('intersects search: %s items' % search.found())


intersects search: 2657 items
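
When only a bounding box is at hand, an equivalent GeoJSON polygon for an intersects query can be built with a few lines of plain Python. The helper bbox_to_polygon below is hypothetical (it is not part of sat-search) and assumes the [minlon, minlat, maxlon, maxlat] ordering described earlier.

```python
def bbox_to_polygon(bbox):
    """Convert [minlon, minlat, maxlon, maxlat] to a GeoJSON Polygon dict."""
    minlon, minlat, maxlon, maxlat = bbox
    # A GeoJSON linear ring must be closed: the last position repeats the first.
    ring = [
        [minlon, minlat],
        [maxlon, minlat],
        [maxlon, maxlat],
        [minlon, maxlat],
        [minlon, minlat],
    ]
    return {'type': 'Polygon', 'coordinates': [ring]}


bbox_geom = bbox_to_polygon([-110, 39.5, -105, 40.5])
print(bbox_geom)
```

The resulting dictionary has the same shape as geom above and could be passed as the intersects argument in the same way.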

Alternate search syntax

This all works fine, but the syntax for creating queries is a bit verbose, so sat-search provides a factory function (Search.search) that accepts an alternate syntax and translates it into proper STAC queries. This is the query syntax used by the sat-search CLI.

The keywords accepted by the Search.search function are slightly different:

  • bbox (same)
  • intersects (same)
  • time (same)
  • datetime: this is an alias to 'time'
  • ids: This is a list of IDs to fetch directly. The 'collection' keyword must be provided in this case, and all other keywords are ignored.
  • collection: this can be provided as a property, but this is a shortcut since individual collections are frequently searched on their own.
  • property: used instead of query; uses the alternate syntax
  • sort: uses alternate syntax

The alternate syntax for query and sort uses simple strings and equality symbols.

A typical query is shown below for eo:cloud_cover and collection, along with the alternate versions that use the Search::search() factory function.


In [4]:
query = {
  "eo:cloud_cover": {
    "lt": 10
  },
  "collection": {
    "eq": "landsat-8-l1"
  }
}

search = Search(query=query)
print('%s items found' % search.found())

search = Search.search(property=["eo:cloud_cover<10", "collection=landsat-8-l1"])
print('%s items found' % search.found())

# or use collection shortcut
search = Search.search(collection='landsat-8-l1', property=["eo:cloud_cover<10"])
print('%s items found' % search.found())


679597 items found
679597 items found
679597 items found
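
The translation from property strings to a STAC query dictionary can be sketched as follows. This is not sat-search's actual parser: the function names, the regular expression, and the symbols chosen for lte and gte are assumptions made for illustration.

```python
import re

# Comparison symbol -> STAC query operator. The two-character symbols are
# listed first in the pattern so that '<=' is not read as '<'.
OPERATORS = {'<=': 'lte', '>=': 'gte', '<': 'lt', '>': 'gt', '=': 'eq'}
PATTERN = re.compile(r'^(.+?)(<=|>=|<|>|=)(.+)$')


def to_value(text):
    """Convert to int or float when possible, otherwise keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_properties(expressions):
    """Turn strings such as 'eo:cloud_cover<10' into a STAC query dict."""
    query = {}
    for expr in expressions:
        match = PATTERN.match(expr)
        if match is None:
            raise ValueError('cannot parse property expression: %r' % expr)
        key, symbol, raw = match.groups()
        query.setdefault(key.strip(), {})[OPERATORS[symbol]] = to_value(raw.strip())
    return query


print(parse_properties(['eo:cloud_cover<10', 'collection=landsat-8-l1']))
```

For the two expressions above, the sketch produces the same dictionary as the explicit query passed to Search in the cell.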

Fetching results

The examples above use the Search::found() function, but this only returns the total number of hits by performing a fast query with limit=0 (returns no items). To fetch the actual Items use the Search::items() function. This returns a sat-stac Items object.


In [5]:
search = Search.search(bbox=[-110, 39.5, -105, 40.5],
                       datetime='2018-02-01/2018-02-04',
                       property=["eo:cloud_cover<5"])
print('%s items' % search.found())

items = search.items()
print('%s items' % len(items))
print('%s collections' % len(items._collections))
print(items._collections)

for item in items:
    print(item)


6 items
6 items
2 collections
[landsat-8-l1, sentinel-2-l1c]
LC80340332018034LGN00
LC80340322018034LGN00
S2A_12SWJ_20180202_0
S2A_12SXJ_20180202_0
S2A_12TXK_20180202_0
S2A_12TWK_20180202_0

Limit

The search.items() function takes one argument, limit, which is the maximum number of items that will be returned. Behind the scenes sat-search may make multiple queries to the API, continuing until it reaches either the limit or the total number of hits, whichever is smaller.


In [6]:
items = search.items(limit=2)
print(items.summary())


Items (2):
date                      id                        
2018-02-03                LC80340332018034LGN00     
2018-02-03                LC80340322018034LGN00     
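
The paging behaviour behind limit can be sketched with a stand-in for the API. The code below is not sat-search's implementation: fetch_items, fake_page, and the page size of 4 are all hypothetical. It only shows the stopping rule, which is to keep requesting pages until either the limit or the total number of hits is reached.

```python
def fetch_items(fetch_page, total_hits, limit=None, page_size=4):
    """Collect items page by page until the target count is reached."""
    # Never request more than exist, and never more than the caller asked for.
    target = total_hits if limit is None else min(limit, total_hits)
    items = []
    page = 1
    while len(items) < target:
        batch = fetch_page(page, page_size)
        if not batch:  # defensive stop if the API returns an empty page
            break
        items.extend(batch)
        page += 1
    return items[:target]


# Stand-in for an API holding six items (IDs taken from the output above).
CATALOG = ['LC80340332018034LGN00', 'LC80340322018034LGN00',
           'S2A_12SWJ_20180202_0', 'S2A_12SXJ_20180202_0',
           'S2A_12TXK_20180202_0', 'S2A_12TWK_20180202_0']


def fake_page(page, page_size):
    """Return one page of the stand-in catalog (pages are numbered from 1)."""
    start = (page - 1) * page_size
    return CATALOG[start:start + page_size]


print(len(fetch_items(fake_page, len(CATALOG), limit=2)))
print(len(fetch_items(fake_page, len(CATALOG))))
```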

Returned Items

The returned Items object has several useful functions and is covered in detail in the sat-stac STAC classes tutorial. The Items object contains all the returned Items (Items._items), along with any Collections referenced by those Items (Items._collections) and the search parameters used (Items._search). Below are some examples.


In [7]:
print(items.summary())


Items (2):
date                      id                        
2018-02-03                LC80340332018034LGN00     
2018-02-03                LC80340322018034LGN00     


In [8]:
from satstac import Items

search = Search.search(bbox=[-110, 39.5, -105, 40.5],
               datetime='2018-02-01/2018-02-10',
               property=["eo:cloud_cover<50"],
               collection='landsat-8-l1')
items = search.items()
print(items.summary())

items.save('test.json')
items2 = Items.load('test.json')

print(items2.summary(['date', 'id', 'eo:cloud_cover']))


Items (2):
date                      id                        
2018-02-08                LC80370332018039LGN00     
2018-02-03                LC80340332018034LGN00     

Items (2):
date                      id                        eo:cloud_cover            
2018-02-08                LC80370332018039LGN00     19                        
2018-02-03                LC80340332018034LGN00     36                        


In [9]:
# download a specific asset from all items and put in a directory by date in 'downloads'
filenames = items.download('MTL', path='downloads/${date}')
print(filenames)


['downloads/2018-02-08/LC80370332018039LGN00_MTL.txt', 'downloads/2018-02-03/LC80340332018034LGN00_MTL.txt']
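
The ${date} placeholder in the path is filled in per item. A minimal sketch of that substitution, using Python's string.Template, is shown below. The helper asset_filename, the plain-dict item, and the explicit extension argument are assumptions for illustration; sat-stac's own substitution code may work differently and supports its own set of placeholders.

```python
import posixpath
from string import Template


def asset_filename(item, key, path, extension):
    """Build a download path by filling ${date} from the item (hypothetical helper)."""
    # Template.substitute replaces ${date} with the item's date string.
    directory = Template(path).substitute(date=item['date'])
    return posixpath.join(directory, '%s_%s%s' % (item['id'], key, extension))


# Simplified stand-in for an Item: just the two fields needed here.
item = {'id': 'LC80370332018039LGN00', 'date': '2018-02-08'}
print(asset_filename(item, 'MTL', 'downloads/${date}', '.txt'))
```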

Fetching specific IDs

A STAC API doesn't provide for searching by IDs because items can be referenced directly within their collection (e.g., /collections/landsat-8-l1/items/LC80340332018034LGN00). However, the alternate search syntax in sat-search allows for searching by IDs, as long as the collection is also provided.

To simply get an Items object from a list of IDs, provide the ids and the collection name to the Search::items_by_id() function, or pass them as the ids and collection keywords to Search.search() as shown here:


In [10]:
ids = ['LC80340332018034LGN00', 'LC80340322018034LGN00']
search = Search.search(ids=ids, collection='landsat-8-l1')
items = search.items()

print(items.summary())


Items (2):
date                      id                        
2018-02-03                LC80340332018034LGN00     
2018-02-03                LC80340322018034LGN00
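
Because each item lives at a predictable path within its collection, the URLs behind an ID lookup can be built directly. The sketch below only constructs the URLs and makes no requests. The helper item_url is hypothetical, and the base URL is the default endpoint mentioned at the top of this tutorial.

```python
from urllib.parse import urljoin

API_URL = 'https://sat-api.developmentseed.org/'


def item_url(collection, item_id, api_url=API_URL):
    """Return the direct URL of an item within its collection (hypothetical helper)."""
    # Make sure the base ends with '/' so urljoin appends instead of replacing
    # the last path segment.
    base = api_url.rstrip('/') + '/'
    return urljoin(base, 'collections/%s/items/%s' % (collection, item_id))


for item_id in ['LC80340332018034LGN00', 'LC80340322018034LGN00']:
    print(item_url('landsat-8-l1', item_id))
```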