In [1]:
%matplotlib inline
This notebook is inspired by an IEEE publication titled High-Dimensional Pixel Composites From Earth Observation Time Series authored by Dale Roberts, Norman Mueller, and Alexis McIntyre.
This notebook applies this compositing method to Landsat 7 imagery, displays a rendering of the computed composites, and saves them off to disk for further validation.
Data cube object
A datacube object is your interface with data stored on your data cube system.
In [2]:
import datacube
dc = datacube.Datacube()
Listing available products
Lookup product/platform name to load in your data.
In [3]:
product_listing = dc.list_products()
product_listing
Out[3]:
In [4]:
platform = 'LANDSAT_7'
product = 'ls7_usgs_sr_scene'
Select and visualize
Select and visualize the region you'll be working with.
In [5]:
# Zanzibar, Tanzania
lat = (-6.2238, -6.1267)
lon = (39.2298, 39.2909)
In [6]:
from utils.data_cube_utilities.dc_display_map import display_map
display_map(latitude = lat,longitude = lon)
Out[6]:
In [7]:
from datetime import datetime
date_range = ("2015-01-01", "2015-12-31")
In [8]:
landsat_dataset = dc.load(product = product,\
platform = platform,\
lat = lat,\
lon = lon,\
time = date_range,\
measurements = ['red','green','nir','swir1', 'swir2','blue', 'pixel_qa'])
In [9]:
landsat_dataset
Out[9]:
In [10]:
from functools import reduce
import numpy as np
def ls7_qa_mask(dataset, keys):
land_cover_endcoding = dict( fill = [1],
clear = [66, 130],
water = [68, 132],
shadow = [72, 136],
snow = [80, 112, 144, 176],
cloud = [96, 112, 160, 176, 224],
low_conf = [66, 68, 72, 80, 96, 112],
med_conf = [130, 132, 136, 144, 160, 176],
high_conf= [224]
)
def merge_lists(a, b):
return a.union(set(land_cover_endcoding[b]))
relevant_encodings = reduce(merge_lists, keys,set())
return np.isin(dataset.pixel_qa.values, list(relevant_encodings))
In [11]:
clean_mask_np = ls7_qa_mask(landsat_dataset, ["clear", "water"])
Geometric Medoid Compositing
To compute a Geo-Medoid composite , the geometric medoid algorithm is applied to the time series of every pixel (indexed by
lat,lon).
Every pixel( indexed bytime,lat,lon) in the the time series is treated as an independent observation used in the computation of the geometric medoid.In the case of Landsat7 imagery an observation
<red,green,blue,nir,swir1,swir2>is a vector/point embedded in 6-dimensional feature-space.Formal Definition of a Geometric Medoid
Given a finite set $\mathbb{X}$ of $\mathbb{_p}$-dimensional observation vectors $\mathbb{X} = \{x_1,...,x_n \}$ , the medoid of these observations is given by the following equation [[1]](#hd_medians):
$$ m := argmin_{ x \in \mathbb{X}} \sum_{i=1}^{n}{ \lVert x - x_i\rVert } $$
In [12]:
from utils.data_cube_utilities.dc_mosaic import create_hdmedians_multiple_band_mosaic
medoid_mosaic = create_hdmedians_multiple_band_mosaic( landsat_dataset,
clean_mask = clean_mask_np,
operation = 'medoid')
In [13]:
from utils.data_cube_utilities.plotter_utils import figure_ratio
figsize = figure_ratio(landsat_dataset, fixed_width=12)
medoid_mosaic.swir1.plot(figsize = figsize, cmap = 'magma')
Out[13]:
Geometric Median Compositing
To compute a Geo-Median composite , the geometric Median algorithm is applied to the time series of every pixel (indexed by
lat,lon).
Every pixel( indexed bytime,lat,lon) in the the time series is treated as an independent observation used in the computation of the geometric Median.In the case of Landsat7 imagery an observation
<red,green,blue,nir,swir1,swir2>is a vector/point embedded in 6-dimensional feature-space.Formal Definition of a Geometric Median
Given a finite set $\mathbb{X}$ of $\mathbb{_p}$-dimensional observation vectors $\mathbb{X} = \{ x_1,...,x_n \}$ , the Median of these observations is given by the following equation [[1]](#hd_medians):
$$ \hat{\mu} := argmin_{ x \in \mathbb{R^{_p}}} \sum_{i=1}^{n}{ \lVert x - x_i\rVert } $$Note:
there is a subtle difference between the definition of the geometric median and the medoid: the search space for the solution differs and has the effect that the medoid returns one of the true observations whereas the geometric median can be described as a synthetic (not physically observed) observation.[[1]](#hd_medians)
In [14]:
median_mosaic = create_hdmedians_multiple_band_mosaic(landsat_dataset,
clean_mask = clean_mask_np,
operation = 'median')
In [15]:
figsize = figure_ratio(landsat_dataset, fixed_width=12)
median_mosaic.swir1.plot(figsize = figsize, cmap = 'magma')
Out[15]:
In [16]:
import os
import pathlib
from utils.data_cube_utilities.dc_utilities import write_png_from_xr
png_dir = 'output/pngs'
pathlib.Path(png_dir).mkdir(parents=True, exist_ok=True)
write_png_from_xr('{}/geo_median.png'.format(png_dir), median_mosaic, ['red','green','blue'], scale=(0,4000))
write_png_from_xr('{}/geo_medoid.png'.format(png_dir), medoid_mosaic, ['red','green','blue'], scale=(0,4000))
In [17]:
from utils.data_cube_utilities.import_export import export_xarray_to_netcdf
output_dir = 'output/netcdfs/landsat7'
pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)
geo_median_filepath = output_dir + '/geo_medians_ls7.nc'
export_xarray_to_netcdf(median_mosaic, geo_median_filepath)
geo_medoid_filepath = output_dir + '/geo_medoids_ls7.nc'
export_xarray_to_netcdf(medoid_mosaic, geo_medoid_filepath)
In [18]:
from utils.data_cube_utilities.import_export import export_slice_to_geotiff
output_dir = 'output/geotiffs/landsat7'
pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)
geo_median_filepath = output_dir + '/geo_medians_ls7.tif'
export_slice_to_geotiff(median_mosaic, geo_median_filepath)
geo_medoid_filepath = output_dir + '/geo_medoids_ls7.tif'
export_slice_to_geotiff(medoid_mosaic, geo_medoid_filepath)
In [19]:
!ls -lah output/geotiffs/landsat7/geo*.tif
Dale Roberts 2018. Hdmedians. Github: https://github.com/daleroberts/hdmedians,
Small, C. G. (1990). A survey of multidimensional medians. International Statistical Review/Revue Internationale de Statistique, 263-277.