In [1]:
%matplotlib inline
This notebook is inspired by an IEEE publication titled High-Dimensional Pixel Composites From Earth Observation Time Series authored by Dale Roberts, Norman Mueller, and Alexis McIntyre.
This notebook applies this compositing method to Landsat 7 imagery, displays a rendering of the computed composites, and saves them off to disk for further validation.
Data cube object
A datacube object is your interface with data stored on your data cube system.
In [2]:
import datacube
dc = datacube.Datacube()
Listing available products
Lookup product/platform name to load in your data.
In [3]:
product_listing = dc.list_products()
product_listing
Out[3]:
In [4]:
platform = 'LANDSAT_7'
product = 'ls7_usgs_sr_scene'
Select and visualize
Select and visualize the region you'll be working with.
In [5]:
# Zanzibar, Tanzania
lat = (-6.2238, -6.1267)
lon = (39.2298, 39.2909)
In [6]:
from utils.data_cube_utilities.dc_display_map import display_map
display_map(latitude = lat,longitude = lon)
Out[6]:
In [7]:
from datetime import datetime
date_range = ("2015-01-01", "2015-12-31")
In [8]:
landsat_dataset = dc.load(product = product,\
platform = platform,\
lat = lat,\
lon = lon,\
time = date_range,\
measurements = ['red','green','nir','swir1', 'swir2','blue', 'pixel_qa'])
In [9]:
landsat_dataset
Out[9]:
In [10]:
from functools import reduce
import numpy as np
def ls7_qa_mask(dataset, keys):
land_cover_endcoding = dict( fill = [1],
clear = [66, 130],
water = [68, 132],
shadow = [72, 136],
snow = [80, 112, 144, 176],
cloud = [96, 112, 160, 176, 224],
low_conf = [66, 68, 72, 80, 96, 112],
med_conf = [130, 132, 136, 144, 160, 176],
high_conf= [224]
)
def merge_lists(a, b):
return a.union(set(land_cover_endcoding[b]))
relevant_encodings = reduce(merge_lists, keys,set())
return np.isin(dataset.pixel_qa.values, list(relevant_encodings))
In [11]:
clean_mask_np = ls7_qa_mask(landsat_dataset, ["clear", "water"])
Geometric Medoid Compositing
To compute a Geo-Medoid composite , the geometric medoid algorithm is applied to the time series of every pixel (indexed by
lat,lon
).
Every pixel( indexed bytime,lat,lon
) in the the time series is treated as an independent observation used in the computation of the geometric medoid.In the case of Landsat7 imagery an observation
<red,green,blue,nir,swir1,swir2>
is a vector/point embedded in 6-dimensional feature-space.Formal Definition of a Geometric Medoid
Given a finite set $\mathbb{X}$ of $\mathbb{_p}$-dimensional observation vectors $\mathbb{X} = \{x_1,...,x_n \}$ , the medoid of these observations is given by the following equation [[1]](#hd_medians):
$$ m := argmin_{ x \in \mathbb{X}} \sum_{i=1}^{n}{ \lVert x - x_i\rVert } $$
In [12]:
from utils.data_cube_utilities.dc_mosaic import create_hdmedians_multiple_band_mosaic
medoid_mosaic = create_hdmedians_multiple_band_mosaic( landsat_dataset,
clean_mask = clean_mask_np,
operation = 'medoid')
In [13]:
from utils.data_cube_utilities.plotter_utils import figure_ratio
figsize = figure_ratio(landsat_dataset, fixed_width=12)
medoid_mosaic.swir1.plot(figsize = figsize, cmap = 'magma')
Out[13]:
Geometric Median Compositing
To compute a Geo-Median composite , the geometric Median algorithm is applied to the time series of every pixel (indexed by
lat,lon
).
Every pixel( indexed bytime,lat,lon
) in the the time series is treated as an independent observation used in the computation of the geometric Median.In the case of Landsat7 imagery an observation
<red,green,blue,nir,swir1,swir2>
is a vector/point embedded in 6-dimensional feature-space.Formal Definition of a Geometric Median
Given a finite set $\mathbb{X}$ of $\mathbb{_p}$-dimensional observation vectors $\mathbb{X} = \{ x_1,...,x_n \}$ , the Median of these observations is given by the following equation [[1]](#hd_medians):
$$ \hat{\mu} := argmin_{ x \in \mathbb{R^{_p}}} \sum_{i=1}^{n}{ \lVert x - x_i\rVert } $$Note:
there is a subtle difference between the definition of the geometric median and the medoid: the search space for the solution differs and has the effect that the medoid returns one of the true observations whereas the geometric median can be described as a synthetic (not physically observed) observation.[[1]](#hd_medians)
In [14]:
median_mosaic = create_hdmedians_multiple_band_mosaic(landsat_dataset,
clean_mask = clean_mask_np,
operation = 'median')
In [15]:
figsize = figure_ratio(landsat_dataset, fixed_width=12)
median_mosaic.swir1.plot(figsize = figsize, cmap = 'magma')
Out[15]:
In [16]:
import os
import pathlib
from utils.data_cube_utilities.dc_utilities import write_png_from_xr
png_dir = 'output/pngs'
pathlib.Path(png_dir).mkdir(parents=True, exist_ok=True)
write_png_from_xr('{}/geo_median.png'.format(png_dir), median_mosaic, ['red','green','blue'], scale=(0,4000))
write_png_from_xr('{}/geo_medoid.png'.format(png_dir), medoid_mosaic, ['red','green','blue'], scale=(0,4000))
In [17]:
from utils.data_cube_utilities.import_export import export_xarray_to_netcdf
output_dir = 'output/netcdfs/landsat7'
pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)
geo_median_filepath = output_dir + '/geo_medians_ls7.nc'
export_xarray_to_netcdf(median_mosaic, geo_median_filepath)
geo_medoid_filepath = output_dir + '/geo_medoids_ls7.nc'
export_xarray_to_netcdf(medoid_mosaic, geo_medoid_filepath)
In [18]:
from utils.data_cube_utilities.import_export import export_slice_to_geotiff
output_dir = 'output/geotiffs/landsat7'
pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)
geo_median_filepath = output_dir + '/geo_medians_ls7.tif'
export_slice_to_geotiff(median_mosaic, geo_median_filepath)
geo_medoid_filepath = output_dir + '/geo_medoids_ls7.tif'
export_slice_to_geotiff(medoid_mosaic, geo_medoid_filepath)
In [19]:
!ls -lah output/geotiffs/landsat7/geo*.tif
Dale Roberts 2018. Hdmedians. Github: https://github.com/daleroberts/hdmedians,
Small, C. G. (1990). A survey of multidimensional medians. International Statistical Review/Revue Internationale de Statistique, 263-277.