DRC Roads - Find and Download Scenes

In this notebook we focus on a region in the Dominican Republic of Congo where roads have been built into the forest. In the area of interest (AOI) we are using, the roads were built between September and November 2017.

Because the AOI overlaps the OrthoTile grid cells, but only covers a portion of each OrthoTile, this notebook demonstrates working with OrthoTile strips to identify significantly overlapping strips, accessing OrthoTiles as Cloud-Optimized Geotiffs (COGs), and mosaicing multipls COGs on download. Additionally, this notebook demonstrates use of the planet client downloader to manage activation, download, and mosaicing of multiple scenes across multiple strips.

This notebook does the following:

  • use PS Orthotiles
  • filter by overlap of a set of orthotiles (strips) to an aoi that straddles orthotile grid boundaries
  • take advantage of the Cloud-Optimized Geotiffs (COGs) Planet provides to download only the pixels within the AOI
  • mosaic and crop orthotiles on download to a single strip scene that significantly overlaps AOI
  • use planet client downloader to activate, download, and mosaic multiple scenes across multiple orthotile strips

Keywords: Cloud-optimized geotiff, COG, orthotiles, orthotile strips, mosaicing, search, download, downloader


In [1]:
import datetime
import json
import os
import subprocess

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from planet import api
from planet.api import downloader, filters
import rasterio
from shapely import geometry as sgeom
from shapely.ops import cascaded_union

Define AOI

The AOI is a region in the Democratic Republic of Congo that experiences road development between September and November 2017. It is a rectangle that overlaps orthotile grid cell boundaries. Usually, we would redefine the AOI to be within an Orthotile, but we would lose a lot of context if we limited this AOI to only one Orthotile grid cell.


In [2]:
aoi = {"geometry": {
    "type":"Polygon",
    "coordinates":
        [[
            [25.42429478260258,1.0255377823058893],
            [25.592960813580472,1.0255377823058893],
            [25.592960813580472,1.1196578801254304],
            [25.42429478260258,1.1196578801254304],
            [25.42429478260258,1.0255377823058893]
        ]]}}

item_type = 'PSOrthoTile'

In [3]:
aoi_shape = sgeom.shape(aoi['geometry'])
aoi_shape


Out[3]:

Find Orthotiles that overlap AOI

Build API search request


In [4]:
planet_api_key = os.environ['PL_API_KEY']

# quick check that key is defined
assert planet_api_key, "PL_API_KEY not defined."

client = api.ClientV1(api_key=planet_api_key)

In [5]:
# utility functions for searching for scenes that cover the aoi between given dates
def build_ps_request(aoi, item_type, begin, end):
    search_aoi = aoi['geometry']
    query = filters.and_filter(
        filters.geom_filter(search_aoi),
        filters.range_filter('cloud_cover', lt=5),
        filters.date_range('acquired', gte=begin),
        filters.date_range('acquired', lt=end)
    )

    # build a request for only PlanetScope imagery
    request = filters.build_search_request(
        query, item_types=[item_type]
    )
    
    return request

def get_monthly_stats(client, request):
    stats_request = request.copy()
    stats_request['interval'] = 'month'
    return client.stats(stats_request).get()

begin=datetime.datetime(year=2017,month=9,day=1)
end=datetime.datetime(year=2017,month=12,day=1)
print(json.dumps(get_monthly_stats(client, build_ps_request(aoi, item_type, begin, end)),
      indent=4))


{
    "utc_offset": "+0h",
    "interval": "month",
    "buckets": [
        {
            "count": 50,
            "start_time": "2017-09-01T00:00:00.000000Z"
        },
        {
            "count": 4,
            "start_time": "2017-10-01T00:00:00.000000Z"
        },
        {
            "count": 3,
            "start_time": "2017-11-01T00:00:00.000000Z"
        }
    ]
}

In [6]:
def search_pl_api(client, request, limit=500):
    result = client.quick_search(request)
    
    # note that this returns a generator
    return result.items_iter(limit=limit)

items = list(search_pl_api(client, build_ps_request(aoi, item_type, begin, end)))
print(len(items))
# uncomment below to see entire metadata for a landsat scene
# print(json.dumps(items[0], indent=4))
del items


500

Save scene data for further processing


In [7]:
def items_to_scenes(items):
    item_types = []

    def _get_props(item):
        props = item['properties']
        
        # add data not in properties list
        props.update({
            'thumbnail': item['_links']['thumbnail'],
            'id': item['id'],
            'footprint': item['geometry'],
        })
        return props
    
    scenes = pd.DataFrame(data=[_get_props(i) for i in items])
    
    # convert acquired from string to datetime for processing
    scenes['acquired'] = pd.to_datetime(scenes['acquired'])
    
    return scenes

scenes = items_to_scenes(search_pl_api(client, build_ps_request(aoi, item_type, begin, end)))
scenes.head()


Out[7]:
acquired anomalous_pixels black_fill cloud_cover columns epsg_code footprint grid_cell ground_control gsd ... published rows satellite_id strip_id sun_azimuth sun_elevation thumbnail updated usable_data view_angle
0 2017-11-27 08:01:53.230819 0.00 0.97 0.001 8000 32635 {'coordinates': [[[25.48568122826816, 0.973082... 3539508 True 4.0 ... 2017-11-27T22:00:03.000Z 8000 1033 945671 128.1 52.4 https://tiles.planet.com/data/v1/item-types/PS... 2017-11-28T07:05:01.000Z 0.03 1.0
1 2017-11-27 09:24:48.323689 0.11 0.28 0.107 8000 32635 {'coordinates': [[[25.568817759066434, 1.08080... 3539608 True 3.7 ... 2017-11-27T18:24:03.000Z 8000 0f2b 945827 156.2 65.5 https://tiles.planet.com/data/v1/item-types/PS... 2017-11-28T07:07:56.000Z 0.61 4.0
2 2017-11-26 08:01:21.692018 0.00 0.90 0.005 8000 32635 {'coordinates': [[[25.449984283371272, 1.08072... 3539607 True 4.0 ... 2017-11-27T02:29:11.000Z 8000 1004 943459 128.1 52.5 https://tiles.planet.com/data/v1/item-types/PS... 2017-11-28T07:30:11.000Z 0.09 0.9
3 2017-10-25 08:02:06.021491 0.16 0.03 0.159 8000 32635 {'coordinates': [[[25.27017644110049, 0.863604... 3539507 True 3.9 ... 2017-10-26T14:38:32.000Z 8000 1031 864668 114.7 57.0 https://tiles.planet.com/data/v1/item-types/PS... 2017-10-26T14:38:32.000Z 0.81 4.2
4 2017-10-25 09:30:46.525017 0.19 0.73 0.193 8000 32635 {'coordinates': [[[25.485647821944195, 1.08075... 3539608 True 3.7 ... 2017-10-26T00:04:17.000Z 8000 0f2b 863467 148.9 74.3 https://tiles.planet.com/data/v1/item-types/PS... 2017-12-09T02:38:34.000Z 0.08 4.5

5 rows × 26 columns

Filter to scenes with footprints that overlap AOI

When the API searches for a scene that overlaps a given AOI, it uses the scene extent. However, we are interested in the scene footprint. That is, we don't care if a portion of a scene with no data overlaps the AOI. We want to filter those scenes out.


In [8]:
def aoi_intersection(footprint, aoi):
    aoi_shape = sgeom.shape(aoi['geometry'])
    footprint_shape = sgeom.shape(footprint)
    intersection_shape = aoi_shape.intersection(footprint_shape)

    try:
        intersection_percent = 100 * footprint_shape.area / intersection_shape.area
    except ZeroDivisionError:
        intersection_percent = 0

    data = {'intersection_shape': intersection_shape,
            'intersection_fp_perc': intersection_percent}
    return pd.Series(data=data)

intersections = scenes.footprint.apply(aoi_intersection, args=(aoi,))

scenes_inter = pd.concat([scenes, intersections], axis=1, sort=False)

In [9]:
# filter out scenes with no intersection
scenes_inter = scenes_inter[scenes_inter.intersection_fp_perc > 0]
len(scenes_inter)


Out[9]:
55

Strip Overlap

Because the AOI straddles orthotile grid lines, we focus on the overlap between the AOI and the strip (which is what is cut into orthotiles).

We want to filter to strips that have a significant (80%) overlap

Group scenes by strips


In [10]:
scenes_sid = scenes_inter.groupby(['strip_id'])
print('{} intersecting strips'.format(scenes_sid.ngroups))


25 intersecting strips

In [11]:
def get_date(group):
    dates = set([a.date() for a in group['acquired']])
    assert len(dates) == 1
    return min(dates)

strip_date = scenes_sid.apply(get_date)
strip_date.head()


Out[11]:
strip_id
736994    2017-09-06
739002    2017-09-07
739199    2017-09-07
741219    2017-09-08
741529    2017-09-08
dtype: object

Calculate strip overlap


In [12]:
def get_strip_aoi_inter(group, aoi):
    '''group: data frame with strip id as index'''
    intersections = group['intersection_shape'].tolist()
    intersection_shape = cascaded_union(intersections)
    aoi_shape = sgeom.shape(aoi['geometry'])

    try:
        intersection_percent = 100 * intersection_shape.area / aoi_shape.area
    except ZeroDivisionError:
        intersection_percent = 0 

    data = {'strip_intersection_shape': intersection_shape,
            'strip_intersection_aoi_perc': intersection_percent}
    return pd.Series(data=data)
    

# with help from: https://stackoverflow.com/a/43616001/2344416
strip_aoi_inter = scenes_sid.apply(get_strip_aoi_inter, aoi=aoi)

strip_aoi_inter.head()


Out[12]:
strip_intersection_shape strip_intersection_aoi_perc
strip_id
736994 POLYGON ((25.42429478260258 1.059224489085933,... 34.632402
739002 POLYGON ((25.59296081358047 1.089882591128244,... 8.235740
739199 POLYGON ((25.49464809563758 1.025537782305889,... 100.000000
741219 POLYGON ((25.59296081358047 1.089847379752558,... 52.223891
741529 POLYGON ((25.49464809563758 1.025537782305889,... 99.997564

In [13]:
# what does the distribution of intersection percent of aoi look like?
strip_aoi_inter.hist(bins=10)


Out[13]:
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7fdea5ab1e48>]],
      dtype=object)

Filter to strips that have significant overlap

Here we are defining significant overlap as an overlap of at least 80% of the AOI area.


In [14]:
# add acquisition date information before filtering
strips = strip_aoi_inter.assign(acquisition_date=strip_date)

In [15]:
# filter to strips that have significant overlap
strips_filt = strips[strips.strip_intersection_aoi_perc > 80]
print('{} strips with significant overlap'.format(len(strips_filt)))


3 strips with significant overlap

In [16]:
# what are the collection dates of strips with significant overlap?
strips_filt.acquisition_date


Out[16]:
strip_id
739199    2017-09-07
741529    2017-09-08
758681    2017-09-16
Name: acquisition_date, dtype: object

There are 3 significantly overlapping strips in September, just 1 in October, and 5 in November. Now let's move on to filtering the scene list to scenes in those strips.

Filter to scenes in strips that have significant overlap


In [17]:
overlapping_strip_ids = strips_filt.index.tolist()
overlapping_strip_ids


Out[17]:
['739199', '741529', '758681']

In [18]:
# filter to scenes that are in the resulting strips
overlapping_scenes = scenes[scenes['strip_id'].isin(overlapping_strip_ids)]

In [19]:
print('There are {} OrthoTiles in {} strips that significantly overlap the aoi.'.format(
    len(overlapping_scenes), len(overlapping_strip_ids)))


There are 12 OrthoTiles in 3 strips that significantly overlap the aoi.

In [20]:
# save overlapping scenes dataframe for use in other notebooks
# uncomment to save
# overlapping_scenes_filename = os.path.join('pre-data', 'overlapping-scenes')
# overlapping_scenes.to_pickle(overlapping_scenes_filename)

Create download function

Combine all of the steps above into one function call.


In [21]:
def get_overlapping_scenes(aoi, begin, end, client, item_type, overlap_perc=80):
    # get all scenes with overlapping bounds
    request = build_ps_request(aoi, item_type, begin, end)
    items = search_pl_api(client, request)
    scenes = items_to_scenes(items)
    print('{} scenes with intersecting borders'.format(len(scenes)))
    
    # filter to scenes where the footprint overlaps
    intersections = scenes.footprint.apply(aoi_intersection, args=(aoi,))
    scenes_inter = pd.concat([scenes, intersections], axis=1, sort=False)
    scenes_inter = scenes_inter[scenes_inter.intersection_fp_perc > 0]
    print('{} scenes with intersecting footprints'.format(len(scenes_inter)))
    
    # filter to strips with significant overlap
    scenes_sid = scenes_inter.groupby(['strip_id'])
    strip_aoi_inter = scenes_sid.apply(get_strip_aoi_inter, aoi=aoi)
    print('{} intersecting strips'.format(scenes_sid.ngroups))
    
    strip_date = scenes_sid.apply(get_date)
    strips = strip_aoi_inter.assign(acquisition_date=strip_date)
    strips_filt = strips[strips.strip_intersection_aoi_perc > overlap_perc]
    print('{} strips with significant overlap'.format(len(strips_filt)))
    
    # filter to scenes that are in resulting strips
    overlapping_strip_ids = strips_filt.index.tolist()
    overlapping_scenes = scenes[scenes['strip_id'].isin(overlapping_strip_ids)]
    
    print('There are {} OrthoTiles in {} strips that significantly overlap the aoi'.format(
    len(overlapping_scenes), len(overlapping_strip_ids)))
    return overlapping_scenes

# run again with the same inputs to make sure we get the same results
_ = get_overlapping_scenes(aoi, begin, end, client, item_type)


500 scenes with intersecting borders
55 scenes with intersecting footprints
25 intersecting strips
3 strips with significant overlap
There are 12 OrthoTiles in 3 strips that significantly overlap the aoi

Download and Mosaic Strip Orthotile COGs

Because the AOI crosses over Orthotile grid lines, we need multiple orthotiles from one strip to obtain the image that overlaps the AOI. But the portion of each Orthotile that overlaps the AOI is small relative to the orthotile. Therefore, we only want to download the pixels in the orthotile that overlap the AOI. We will accomplish this by accessing the Orthotiles as Cloud-Optimized Geotiffs (COGs).

This is a variation of the COG activation and download performed in temporal-analysis/crop-temporal.ipynb. For this notebook, we wait until all orthotiles in a strip are activated then we download the COGs together, using gdalwarp to perform the download as well as the mosaicing.

First, we will go through this exercise with one strip, then we will move onto multiple strips.

Define asset type

For this application, we are interested in the analytic product. This is top-of-atmosphere radiance.


In [22]:
asset_type = 'analytic'

Get scenes that compose a strip

Next we need all the ids for scenes that compose a strip.


In [23]:
strip_id = overlapping_strip_ids[0]
print(strip_id)

strip_scenes = scenes[scenes['strip_id'] == strip_id]
strip_scenes


739199
Out[23]:
acquired anomalous_pixels black_fill cloud_cover columns epsg_code footprint grid_cell ground_control gsd ... published rows satellite_id strip_id sun_azimuth sun_elevation thumbnail updated usable_data view_angle
50 2017-09-07 07:58:16.447743 0.05 0.37 0.051 8000 32635 {'coordinates': [[[25.485767058956622, 0.86370... 3539508 True 3.9 ... 2017-09-08T10:11:08.000Z 8000 0f43 739199 80.9 55.3 https://tiles.planet.com/data/v1/item-types/PS... 2017-09-08T10:11:08.000Z 0.57 0.1
51 2017-09-07 07:58:13.342761 0.00 0.72 0.004 8000 32635 {'coordinates': [[[25.4092576216948, 1.0807228... 3539607 True 3.9 ... 2017-09-08T10:11:03.000Z 8000 0f43 739199 81.2 55.4 https://tiles.planet.com/data/v1/item-types/PS... 2017-09-08T10:11:03.000Z 0.27 0.1
52 2017-09-07 07:58:12.825264 0.03 0.18 0.029 8000 32635 {'coordinates': [[[25.48569073728843, 1.080757... 3539608 True 3.9 ... 2017-09-08T10:11:02.000Z 8000 0f43 739199 81.3 55.4 https://tiles.planet.com/data/v1/item-types/PS... 2017-09-08T10:11:02.000Z 0.79 0.1
53 2017-09-07 07:58:16.447743 0.03 0.53 0.031 8000 32635 {'coordinates': [[[25.366993457688604, 0.86364... 3539507 True 3.9 ... 2017-09-08T03:29:35.000Z 8000 0f43 739199 80.9 55.3 https://tiles.planet.com/data/v1/item-types/PS... 2017-09-08T03:29:35.000Z 0.44 0.1

4 rows × 26 columns

Activate Scenes

We use the planet api client downloader to handle activation of the scenes. The downloader handles activation, polling activation status, and (if desired), downloading. Because we are using remote COGs, we do not want to download the scene. However, the downloader can still help us out. It has a cool feature where you can provide it with a function to call when a scene is activated.

In this section, we will provide it with a function that records the scene id and download url and checks to see if all scenes in the strip are activated. The function is actually just a method of a class (Tracker), which maintains a dataset of scene ids and download urls and a list of scene ids in the strip. The method updates the scene id list when it is called by the downloader. Also, it checks to see if all scenes in the strip have been activated. In the future, we will update this part so that when all scenes in a strip are activated, a download and mosaic is triggered.


In [24]:
# create a downloader that will handle scene activation
dl = downloader.create(client)

In [25]:
# This class keeps track of activated scene download urls and the strip scene id's
# It also creates the `on_complete` partial function, which can be called
# by the downloader to update the list of scene download urls and check to see if all
# scenes in the strip are activated
class Tracker(object):
    def __init__(self, strip_scenes):
        self.urls = dict()
        self.strip_scenes = set(strip_scenes)
        
    def get_on_complete(self):
        def on_complete(item, asset):
            self.urls[item['id']] = asset['location']
            print('{}:{}'.format(item['id'], asset['location']))
            if self._got_all_strip_scenes():
                print('strip complete')
        return on_complete
    
    def _got_all_strip_scenes(self):
        return self.strip_scenes.intersection(set(self.urls)) == self.strip_scenes
        

# create the function that keeps track of the download urls and checks to see if all
# scenes in the strip are activated
strip_scene_ids = strip_scenes.id.tolist()
tracker = Tracker(strip_scene_ids)
dl.on_complete = tracker.get_on_complete()

In [26]:
# the downloader works with items, so get item object for each scene id
strip_scene_items = (client.get_item(item_type, sid).get()
                     for sid in strip_scenes.id.tolist())

In [27]:
# use the downloader to activate the scenes and get the download urls
dl.shutdown()
dl.activate(strip_scene_items, [asset_type])


739199_3539508_2017-09-07_0f43:https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJldXA1SmVvVHhUZ2l0SmdmSjc5MU1aWm53SUMzRENsWm12cWNLT2xRQUZad015RUpKU1djSE9FR0R3cWs5VlBKVEY4UkZSQjZ4TndCUGtMUWU3ZUptQT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzgyMCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NTA4XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.e6wY-KJGyVR64WBNYFKYXZIyBck8UX31ebZCMkvOH-nJRBc-W7Ybnvn63Uo3hOfWuUtbDhy-4if6mEnKMtN0mg
739199_3539607_2017-09-07_0f43:https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJkeXFzQW1oMzBkZzRyV3FFSzROeHQ1bCs1ZmRLdm16QWRMWDV4dnRnUW5hUWJKV2JWNExOZVJyaHJ5WUhqREcva0o5cDZ6Q2lZK0JnS1BYaTN5V0d1UT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzgzMCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NjA3XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.gqfX-S5kmD9OosjbPXr_d5Zz3NVtcW189gBAgfcSKOuuKk0TGD4aNkhIpdFBaQNGQQIYuuXI_i97WNVHPX5Qzw
739199_3539608_2017-09-07_0f43:https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIwcjR1eVc1YTFTRUI3b050bXZuZHBCV3FCVHY5d3d1NUYvNUVpY2Q5OUZhQTZxR1hZZFJzajE3ZGZCb01weEU3cVIyQ1R3UG1aQWJGeFFXeUZjYkJDUT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzg0MCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NjA4XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.dzXhliGavuvetNQ6P5VU7ry9g-r_y8aZ7LHfWVMpzLdbT-x2pOofF-Oyw-dLyMrxqT5UaEyAYEMVo-hSKfaQwg
739199_3539507_2017-09-07_0f43:https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJjdjlodTRvZUMrcWFTZ3dlaWQrczlmMTltZ3NLS2RJWHY1amkrSityM1BXRXUydVhMZ2NtN0hwQ2U5a2RVQ2dxc2FoZTFIaDhLSEhweG9RU05aK1J4UT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzg1NiwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NTA3XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.dIV0tAm_0-xMSGi65Z6mcJV0Tsps40_jhVWDkPOP0h-M92M_hQOdZ_lt8FyKgExgVCIIxDTBbFlkE_srFTQC1A
strip complete
Out[27]:
{'activating': 0, 'complete': 4, 'paging': False, 'pending': 0}

Download and mosaic strip scenes

Ok! All scenes in the strip are activated. Now it's time to download them. We are using gdalwarp to download the scenes, and it turns out gdalwarp is also used for mosaicing scenes, so we are going to download and mosaic the scenes all in one step.

Save aoi as geojson file

gdalwarp requires the aoi used to crop the COG be saved to disk. We want the AOI to persist in git for use elsewhere so we save it in the pre-data folder.


In [28]:
def create_save_dir(root_dir='data'):
    save_dir = root_dir

    if not os.path.isdir(save_dir):
        os.makedirs(save_dir)
    return save_dir

In [29]:
def save_geojson_file(aoi_geojson, save_dir):
    filename = os.path.join(save_dir, 'aoi.geojson')
    with open(filename, "w") as f:
        f.write(json.dumps(aoi_geojson))
    return filename

geojson_filename = save_geojson_file(aoi, create_save_dir('pre-data'))
print('wrote AOI to {}'.format(geojson_filename))


wrote AOI to pre-data/aoi.geojson

Define output filename

gdalwarp saves the mosaic to disk and needs a filename for where to save it.

We don't want the mosaic images to persist in git so we save them to the data folder, which is not tracked in git.


In [36]:
def create_output_filename(strip_id, save_dir):
    create_save_dir(save_dir)
    filename = os.path.join(save_dir, strip_id + '_analytic_mosaic.tif')
    return filename

output_file = create_output_filename(strip_id, 'data')
output_file


Out[36]:
'data/739199_analytic_mosaic.tif'

Download and mosaic COGs


In [37]:
(item_ids, download_urls) = zip(*tracker.urls.items())
item_ids


Out[37]:
('739199_3539508_2017-09-07_0f43',
 '739199_3539607_2017-09-07_0f43',
 '739199_3539608_2017-09-07_0f43',
 '739199_3539507_2017-09-07_0f43')

In [38]:
import tempfile

In [39]:
# we use gdalwarp to only download the aoi portion of the COGs and mosaic them in one step
def _gdalwarp(input_filenames, output_filename, options, verbose=False):
    commands = ['gdalwarp'] + options + \
               ['-overwrite'] + \
               input_filenames + \
               [output_filename]
    if verbose: print(' '.join(commands))
    subprocess.check_call(commands)

# lossless compression of an image
def _compress(input_filename, output_filename, verbose=False):
    commands = ['gdal_translate',
                '-co', 'compress=LZW',
                '-co', 'predictor=2',
                input_filename,
                output_filename]
    if verbose: print(' '.join(commands))
    subprocess.check_call(commands)

def download_strip_aoi(download_urls, output_filename, geojson_filename, compress=True, verbose=False):
    vsicurl_urls = ['/vsicurl/' + d for d in download_urls]
    options = [
        '-cutline', geojson_filename,
        '-crop_to_cutline',
    ]
    
    if compress:
        with tempfile.NamedTemporaryFile(suffix='.vrt') as vrt_file:
            options += ['-of', 'vrt']
            _gdalwarp(vsicurl_urls, vrt_file.name, options, verbose=verbose)
            _compress(vrt_file.name, output_filename, verbose=verbose)
    else:
        _gdalwarp(vsicurl_urls, output_filename, options, verbose=verbose)

download_strip_aoi(download_urls, 'data/739199_analytic_mosaic_comp.tif', geojson_filename, verbose=True)
download_strip_aoi(download_urls, output_file, geojson_filename, compress=False, verbose=True)


gdalwarp -cutline pre-data/aoi.geojson -crop_to_cutline -of vrt -overwrite /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJldXA1SmVvVHhUZ2l0SmdmSjc5MU1aWm53SUMzRENsWm12cWNLT2xRQUZad015RUpKU1djSE9FR0R3cWs5VlBKVEY4UkZSQjZ4TndCUGtMUWU3ZUptQT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzgyMCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NTA4XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.e6wY-KJGyVR64WBNYFKYXZIyBck8UX31ebZCMkvOH-nJRBc-W7Ybnvn63Uo3hOfWuUtbDhy-4if6mEnKMtN0mg /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJkeXFzQW1oMzBkZzRyV3FFSzROeHQ1bCs1ZmRLdm16QWRMWDV4dnRnUW5hUWJKV2JWNExOZVJyaHJ5WUhqREcva0o5cDZ6Q2lZK0JnS1BYaTN5V0d1UT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzgzMCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NjA3XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.gqfX-S5kmD9OosjbPXr_d5Zz3NVtcW189gBAgfcSKOuuKk0TGD4aNkhIpdFBaQNGQQIYuuXI_i97WNVHPX5Qzw /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIwcjR1eVc1YTFTRUI3b050bXZuZHBCV3FCVHY5d3d1NUYvNUVpY2Q5OUZhQTZxR1hZZFJzajE3ZGZCb01weEU3cVIyQ1R3UG1aQWJGeFFXeUZjYkJDUT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzg0MCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NjA4XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.dzXhliGavuvetNQ6P5VU7ry9g-r_y8aZ7LHfWVMpzLdbT-x2pOofF-Oyw-dLyMrxqT5UaEyAYEMVo-hSKfaQwg /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJjdjlodTRvZUMrcWFTZ3dlaWQrczlmMTltZ3NLS2RJWHY1amkrSityM1BXRXUydVhMZ2NtN0hwQ2U5a2RVQ2dxc2FoZTFIaDhLSEhweG9RU05aK1J4UT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzg1NiwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NTA3XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.dIV0tAm_0-xMSGi65Z6mcJV0Tsps40_jhVWDkPOP0h-M92M_hQOdZ_lt8FyKgExgVCIIxDTBbFlkE_srFTQC1A /tmp/tmp6ixtap1j.vrt
gdal_translate -co compress=LZW -co predictor=2 /tmp/tmp6ixtap1j.vrt data/739199_analytic_mosaic_comp.tif
gdalwarp -cutline pre-data/aoi.geojson -crop_to_cutline -overwrite /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJldXA1SmVvVHhUZ2l0SmdmSjc5MU1aWm53SUMzRENsWm12cWNLT2xRQUZad015RUpKU1djSE9FR0R3cWs5VlBKVEY4UkZSQjZ4TndCUGtMUWU3ZUptQT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzgyMCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NTA4XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.e6wY-KJGyVR64WBNYFKYXZIyBck8UX31ebZCMkvOH-nJRBc-W7Ybnvn63Uo3hOfWuUtbDhy-4if6mEnKMtN0mg /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJkeXFzQW1oMzBkZzRyV3FFSzROeHQ1bCs1ZmRLdm16QWRMWDV4dnRnUW5hUWJKV2JWNExOZVJyaHJ5WUhqREcva0o5cDZ6Q2lZK0JnS1BYaTN5V0d1UT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzgzMCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NjA3XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.gqfX-S5kmD9OosjbPXr_d5Zz3NVtcW189gBAgfcSKOuuKk0TGD4aNkhIpdFBaQNGQQIYuuXI_i97WNVHPX5Qzw /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIwcjR1eVc1YTFTRUI3b050bXZuZHBCV3FCVHY5d3d1NUYvNUVpY2Q5OUZhQTZxR1hZZFJzajE3ZGZCb01weEU3cVIyQ1R3UG1aQWJGeFFXeUZjYkJDUT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzg0MCwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NjA4XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.dzXhliGavuvetNQ6P5VU7ry9g-r_y8aZ7LHfWVMpzLdbT-x2pOofF-Oyw-dLyMrxqT5UaEyAYEMVo-hSKfaQwg /vsicurl/https://api.planet.com/data/v1/download?token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJjdjlodTRvZUMrcWFTZ3dlaWQrczlmMTltZ3NLS2RJWHY1amkrSityM1BXRXUydVhMZ2NtN0hwQ2U5a2RVQ2dxc2FoZTFIaDhLSEhweG9RU05aK1J4UT09IiwiaXRlbV90eXBlX2lkIjoiUFNPcnRob1RpbGUiLCJ0b2tlbl90eXBlIjoidHlwZWQtaXRlbSIsImV4cCI6MTU0ODgxMzg1NiwiaXRlbV9pZCI6IjczOTE5OV8zNTM5NTA3XzIwMTctMDktMDdfMGY0MyIsImFzc2V0X3R5cGUiOiJhbmFseXRpYyJ9.dIV0tAm_0-xMSGi65Z6mcJV0Tsps40_jhVWDkPOP0h-M92M_hQOdZ_lt8FyKgExgVCIIxDTBbFlkE_srFTQC1A data/739199_analytic_mosaic.tif

Visualize Strip Mosaic Image


In [34]:
# load local visual module
import visual

In [35]:
def load_4band(filename):
    with rasterio.open(filename, 'r') as src:
        # visual band ordering: red, green, blue, alpha
        b, g, r, n = src.read() 

        # NoData value is 0
        mask = b == 0

    return [np.ma.array(band, mask=mask) for band in [b, g, r, n]]

def visualize_4band(bgrn_bands, title='Orthotile Strip Mosaic', figdim=15):
    rgb_bands = [bgrn_bands[i] for i in [2, 1, 0]]
    visual.plot_image(rgb_bands, title=title, figsize=(figdim, figdim))

print(output_file)
visualize_4band(load_4band(output_file))


data/739199_analytic_mosaic.tif

What a beautiful image! It is a little blue due to atmospheric effects. There are a few clouds, but we can clearly see the roads in the forest.

Download and Mosaic Multiple Strip Orthotile COGs

This process is much like the process for downloading and mosaicing a single strip orthotile, but we want to trigger the download as soon as all scenes in a strip are activated. So this will require a bit of a change to the on_complete() function.


In [36]:
class StripDownloader(object):
    def __init__(self, scenes, geojson_file, client, root_dir='data'):
        self.scenes = scenes #pandas DataFrame describing scenes to download
        self.geojson_file = geojson_file
        self.client = client
        self.save_dir = root_dir
        
        self.item_type = 'PSOrthoTile'

        self.urls = dict() # this will be populated by on_complete()
        self.strip_mosaics = [] # this will be populated by on_complete()
        
        self.strip_scenes = self._scenes_to_strip_scenes()
    
    def _scenes_to_strip_scenes(self):
        strip_ids = self.scenes.strip_id.unique()
        print('{} strips'.format(len(strip_ids)))
        
        strip_scenes = dict()
        for sid in strip_ids:
            strip_scenes[sid] = self.scenes[self.scenes['strip_id'] == sid].id.tolist()
        return strip_scenes
        

    def get_on_complete(self, asset_type, verbose=False):
        def on_complete(item, asset):
            download_url = asset['location']
            scene_id = item['id']
            if verbose: print('{}'.format(scene_id))

            strip_id = item['properties']['strip_id']
            if self._completes_strip_scenes(strip_id, scene_id):
                # do this after check that this scene_id completes a strip to avoid
                # a race condition causing scene to be downloaded multiple times
                self.urls[scene_id] = download_url
            
                output_file = self.get_filename(strip_id, asset_type)
                self._download_strip_mosaic(strip_id, output_file, verbose)
                self.strip_mosaics.append(output_file)
            else:
                self.urls[scene_id] = download_url

        return on_complete

    def _completes_strip_scenes(self, strip_id, scene_id):
        activated_scenes_set = set(list(self.urls.keys()) + [scene_id])
        strip_scenes_set = set(self.strip_scenes[strip_id])
        return strip_scenes_set.intersection(activated_scenes_set) == strip_scenes_set
    
    def get_filename(self, strip_id, asset_type):
        if not os.path.isdir(self.save_dir): os.makedirs(self.save_dir)
        
        filename = strip_id + '_' + asset_type + '_mosaic.tif'
        filepath = os.path.join(self.save_dir, filename)
        return filepath
  
    def _download_strip_mosaic(self, strip_id, output_file, verbose):
        scene_ids = self.strip_scenes[strip_id]
        download_urls = [self.urls[scene_id]
                         for scene_id in scene_ids]

        if verbose: print('downloading {} as {}'.format(scene_ids, output_file))
        download_strip_aoi(download_urls, output_file, self.geojson_file, verbose=False)
    
    def run(self, asset_type, overwrite=False, verbose=False):
        # filter scenes by those that already exist
        if not overwrite:
            dl_strip_scenes = self._filter_by_existing_strip_mosaics(self.strip_scenes,
                                                                     asset_type,
                                                                     verbose)
        else:
            dl_strip_scenes = self.strip_scenes

        if len(dl_strip_scenes):
            dl = downloader.create(self.client)
            dl.on_complete = self.get_on_complete(asset_type, verbose=verbose)
            dl.shutdown()
            dl.activate(iter(self._get_items()), [asset_type])
        elif verbose:
            print('Nothing to download')
    
    def _filter_by_existing_strip_mosaics(self, strip_scenes, asset_type, verbose):
        def _strip_mosaic_exists(strip_id):
            strip_mosaic_filename = self.get_filename(strip_id, asset_type)
            
            found = False
            if os.path.isfile(strip_mosaic_filename):
                found = True
                if verbose: print('found {}'.format(strip_mosaic_filename))
            return found
        
        filtered_strip_ids = (s for s in strip_scenes.keys()
                              if not _strip_mosaic_exists(s))
        
        filt_strip_scenes = {sid: strip_scenes[sid] for sid in filtered_strip_ids}
        return filt_strip_scenes
        
    def _get_items(self):
        return [self.client.get_item(self.item_type, sid).get()
                for sid in self.scenes.id.tolist()]

Test downloader with initial strip

This should be pretty quick as we have already activated all of these scenes.


In [37]:
single_strip_downloader = StripDownloader(strip_scenes, geojson_filename, client)
single_strip_downloader.run(asset_type, overwrite=False, verbose=True)


1 strips
found data/739199_analytic_mosaic.tif
Nothing to download

Visualize Strip Mosaic and UDM


In [38]:
visualize_4band(load_4band(single_strip_downloader.get_filename(strip_id, asset_type)))



In [39]:
udm_asset_type = 'udm'
single_strip_downloader.run(udm_asset_type, overwrite=False, verbose=True)


found data/739199_udm_mosaic.tif
Nothing to download

In [40]:
from collections import OrderedDict

In [41]:
# Utility functions for loading a UDM image and identifying binary representation as class labels
def load_udm(udm_filename):
    '''Load single-band bit-encoded UDM as a 2D array.'''
    with rasterio.open(udm_filename, 'r') as src:
        udm = src.read()[0,...]
    return udm

def get_udm_labels(udm):
    '''Get the binary representation of the UDM values'''
    return OrderedDict((v, '{0:08b}'.format(v)) for v in np.unique(udm))

udm_filename = single_strip_downloader.get_filename(strip_id, udm_asset_type)
udm = load_udm(udm_filename)
udm_labels = get_udm_labels(udm)

In [42]:
visual.plot_classified_band(udm, class_labels=udm_labels, title='UDM', figdim=15)


Run downloader on all strips


In [43]:
strip_downloader = StripDownloader(overlapping_scenes, geojson_filename, client)
strip_downloader.run(asset_type, overwrite=False, verbose=True)
strip_downloader.run(udm_asset_type, overwrite=False, verbose=True)


9 strips
found data/943459_analytic_mosaic.tif
found data/904538_analytic_mosaic.tif
found data/915538_analytic_mosaic.tif
found data/883193_analytic_mosaic.tif
found data/879726_analytic_mosaic.tif
found data/863467_analytic_mosaic.tif
found data/758681_analytic_mosaic.tif
found data/741529_analytic_mosaic.tif
found data/739199_analytic_mosaic.tif
Nothing to download
found data/943459_udm_mosaic.tif
found data/904538_udm_mosaic.tif
found data/915538_udm_mosaic.tif
found data/883193_udm_mosaic.tif
found data/879726_udm_mosaic.tif
found data/863467_udm_mosaic.tif
found data/758681_udm_mosaic.tif
found data/741529_udm_mosaic.tif
found data/739199_udm_mosaic.tif
Nothing to download

Visualize Strip Mosaics and UDMs


In [71]:
# Decimate image band arrays for memory conservation
def decimate(arry, num=16):
    return arry[::num, ::num].copy()

def visualize_image_with_udm(strip_downloader, strip_id):
    img_filename = strip_downloader.get_filename(strip_id, asset_type)
    img = load_4band(img_filename)
    visualize_4band([decimate(b) for b in img], title=strip_id, figdim=5)
    
    udm_filename = strip_downloader.get_filename(strip_id, udm_asset_type)
    udm = decimate(load_udm(udm_filename))
    udm_labels = get_udm_labels(udm)
    visual.plot_classified_band(udm, class_labels=udm_labels, title=strip_id + ' UDM', figdim=5)

for strip_id in overlapping_strip_ids:
    visualize_image_with_udm(strip_downloader, strip_id)


We can see from the UDMs associated with the strip mosaics that clouds cover a significant portion of many mosaics, even though we restricted our search to scenes that had less than 5% clouds. To get decent classification, we should filter our strip scenes by the percentage of good UDM pixels in the scene.

Search for Scenes Based on Mosaic Scene Usefulness

In this section, we create a scene search that filters scenes based on the percentage of useful pixels in the resulting mosaic scene, as determined from the Unusable Data Map (UDM), an asset available alongside the images.


In [98]:
class StripSearcher(object):
    def __init__(self, aoi_geojson, begin, end, client, root_dir='data'):
        self.aoi = aoi_geojson # aoi geojson
        self.begin = begin # datetime indicating begin date
        self.end = end # datetime indicating end date
        
        self.client = client
        self.save_dir = root_dir
        
        self.item_type = 'PSOrthoTile'
        
        self.overlapping_scenes = None # this is populated by get_overlapping_scenes()
        self.good_scenes = None # this is populated by get_good_scenes()
    
    def search(self, geojson_filename=None, use_udm=True):
        self.get_overlapping_scenes()
        if use_udm:
            if geojson_filename is None:
                raise('Must specify geojson_filename')

            good_strip_ids = self.get_good_strip_ids(geojson_filename)
            all_strip_ids = self.get_strip_ids(self.overlapping_scenes)
            print('Filtered to {} strips out of {} strips.'.format(len(good_strip_ids), len(all_strip_ids)))
            scenes = self.get_scenes_by_ids(good_strip_ids)
        else:
            scenes = self.overlapping_scenes
        print('{} OrthoTiles found'.format(len(scenes)))
        return scenes
        
    def get_overlapping_scenes(self):
        # get all scenes that fit search
        items = search_pl_api(self.client,
                              build_ps_request(self.aoi, self.item_type, self.begin, self.end))
        scenes = items_to_scenes(items)
        print('{} OrthoTiles were returned from the api search.'.format(len(scenes)))
        
        # filter to scenes where footprint overlaps aoi
        intersections = scenes.footprint.apply(aoi_intersection, args=(self.aoi,))
        scenes_inter = pd.concat([scenes, intersections], axis=1, sort=False)
        scenes_inter = scenes_inter[scenes_inter.intersection_fp_perc > 0]
        print('There are {} OrthoTiles that overlap aoi.'.format(len(scenes_inter)))
        
        # filter to scenes in strips that have significant overlap
        scenes_sid = scenes_inter.groupby(['strip_id'])
        strip_aoi_inter = scenes_sid.apply(get_strip_aoi_inter, aoi=self.aoi)
        strips_filt = strip_aoi_inter[strip_aoi_inter.strip_intersection_aoi_perc > 80]
        
        overlapping_strip_ids = strips_filt.index.tolist()
        overlapping_scenes = scenes[scenes['strip_id'].isin(overlapping_strip_ids)]  
        print('There are {} OrthoTiles in {} strips that significantly overlap the aoi.'.format(
            len(overlapping_scenes), len(overlapping_strip_ids)))
        
        self.overlapping_scenes = overlapping_scenes
    
    def get_overlapping_strips(self):
        return [s for s in self.overlapping_scenes.strip_id.unique().tolist()]

    def get_strip_ids(self, scenes):
        return[s for s in scenes.strip_id.unique().tolist()]
        
    def get_good_strip_ids(self, geojson_filename):
        udm_asset_type = 'udm'
        strip_downloader = StripDownloader(self.overlapping_scenes, geojson_filename, self.client)
        strip_downloader.run(udm_asset_type, overwrite=False, verbose=True)
        
        strip_ids = self.get_overlapping_strips()
        udm_strip_mosaics = [strip_downloader.get_filename(i, udm_asset_type) for i in strip_ids]
        
        strip_quality = [self._is_good_udm(u) for u in udm_strip_mosaics]
        
        good_strips = [sid for (sid, squality) in zip(strip_ids, strip_quality) if squality]
        return good_strips

    def _is_good_udm(self, udm_strip_mosaic):
        udm = load_udm(udm_strip_mosaic)
        good_percent = ((np.size(udm) - np.count_nonzero(udm)) / np.size(udm)) * 100
        return good_percent > 80
    
    def get_scenes_by_ids(self, strip_ids):
        scenes = self.overlapping_scenes.copy()
        return scenes[scenes['strip_id'].isin(strip_ids)]

In [96]:
begin=datetime.datetime(year=2017,month=9,day=1)
end=datetime.datetime(year=2017,month=12,day=1)
strip_searcher = StripSearcher(aoi, begin, end, client)
good_scenes = strip_searcher.search(geojson_filename)


132 OrthoTiles were returned from the api search.
There are 129 OrthoTiles that overlap aoi.
There are 36 OrthoTiles in 9 strips that significantly overlap the aoi.
9 strips
found data/943459_udm_mosaic.tif
found data/904538_udm_mosaic.tif
found data/915538_udm_mosaic.tif
found data/883193_udm_mosaic.tif
found data/879726_udm_mosaic.tif
found data/863467_udm_mosaic.tif
found data/758681_udm_mosaic.tif
found data/741529_udm_mosaic.tif
found data/739199_udm_mosaic.tif
Nothing to download
Filtered to 4 strips out of 9 strips.
16 OrthoTiles found

In [93]:
for strip_id in strip_searcher.get_strip_ids(good_scenes):
    visualize_image_with_udm(strip_downloader, strip_id)


That looks much better! From now on, we will filter by udm when downloading strips


In [ ]: