Inspecting Satellite Imagery using Rasterio

A first look at satellite data with Python

At this point, you will have learned different ways of searching for, filtering, and downloading satellite imagery. Now let's use one of these acquired datasets and dig into it a bit with Python.

Here we're going to use a Python library called Rasterio: you may be familiar with it already, or perhaps with the related C library, GDAL. If you've used Numpy before, working with Rasterio will feel very familiar.


In [1]:
from __future__ import division

import math

import rasterio


# This notebook explores a single 4 band (blue, green, red, NIR) PlanetScope scene in a UTM projection.
image_file = "example.tif"

satdat = rasterio.open(image_file)

Basic details

What can we learn about this satellite image using just Python?


In [2]:
# Minimum bounding box in projected units

print(satdat.bounds)


BoundingBox(left=544170.0, bottom=3759147.0, right=550146.0, top=3766416.0)

In [3]:
# Get dimensions, in map units (using the example GeoTIFF, that's meters)

width_in_projected_units = satdat.bounds.right - satdat.bounds.left
height_in_projected_units = satdat.bounds.top - satdat.bounds.bottom

print("Width: {}, Height: {}".format(width_in_projected_units, height_in_projected_units))


Width: 5976.0, Height: 7269.0

In [4]:
# Number of rows and columns.

print("Rows: {}, Columns: {}".format(satdat.height, satdat.width))


Rows: 2423, Columns: 1992

In [5]:
# This dataset's projection uses meters as distance units.  What are the dimensions of a single pixel in meters?

xres = (satdat.bounds.right - satdat.bounds.left) / satdat.width
yres = (satdat.bounds.top - satdat.bounds.bottom) / satdat.height

print(xres, yres)
print("Are the pixels square: {}".format(xres == yres))


3.0 3.0
Are the pixels square: True

In [6]:
# Get coordinate reference system

satdat.crs


Out[6]:
CRS({'init': 'epsg:32611'})

In [7]:
# Convert pixel coordinates to world coordinates.

# Upper left pixel
row_min = 0
col_min = 0

# Lower right pixel.  Rows and columns are zero indexing.
row_max = satdat.height - 1
col_max = satdat.width - 1

# Transform coordinates with the dataset's affine transformation.
topleft = satdat.transform * (row_min, col_min)
botright = satdat.transform * (row_max, col_max)

print("Top left corner coordinates: {}".format(topleft))
print("Bottom right corner coordinates: {}".format(botright))


Top left corner coordinates: (544170.0, 3766416.0)
Bottom right corner coordinates: (551436.0, 3760443.0)

In [8]:
# All of the metadata required to create an image of the same dimensions, datatype, format, etc. is stored in
# one location.

print(satdat.meta)


{'driver': 'GTiff', 'dtype': 'uint16', 'nodata': 0.0, 'width': 1992, 'height': 2423, 'count': 4, 'crs': CRS({'init': 'epsg:32611'}), 'transform': Affine(3.0, 0.0, 544170.0,
       0.0, -3.0, 3766416.0)}

Bands

So far, we haven't done too much geospatial-raster-specific work yet. Since we know we're inspecting a multispectral satellite image, let's see what we can learn about its bands.


In [9]:
# The dataset reports a band count.
print(satdat.count)

# And provides a sequence of band indexes.  These are one indexing, not zero indexing like Numpy arrays.
print(satdat.indexes)


4
(1, 2, 3, 4)

Because we know we're look at a PlanetScope 4-band analytic satellite image, we can define the bands by their order:


In [10]:
# PlanetScope 4-band band order: BGRN

blue, green, red, nir = satdat.read()

# Or the slightly less efficient:
#    blue = satdat.read(1)
#    green = satdat.read(2)
#    red = satdat.read(3)
#    nir = satdat.read(4)

# Or read the entire dataset into a single 3D array:
#    data = satdat.read()

Pixels

In a raster dataset, each pixel has a value. Pixels are arranged in a grid, and pixels representing equivalent data have the same value:


In [11]:
# Bands are stored as Numpy arrays.

print(type(blue))


<class 'numpy.ndarray'>

In [12]:
# How many dimensions would a single raster band have?  Two dimensions: rows and columns.

print(blue.ndim)


2

In [13]:
# Glimpse at the band's values and datatype.

print(blue)
print(blue.dtype)


[[7261 7137 7087 ... 7015 6868 6891]
 [7180 7076 7166 ... 7387 7391 7431]
 [7424 7436 7443 ... 7497 7713 7760]
 ...
 [7791 7840 8139 ... 7108 7086 7267]
 [7873 8132 8441 ... 7134 7023 7042]
 [8320 8464 8542 ... 6893 6921 6989]]
uint16

In [14]:
# Output a min & max pixel value in each band.

for bidx in satdat.indexes:
    data = satdat.read(bidx)
    print("Band {bidx} min {min} max {max}".format(bidx=bidx, min=data.min(), max=data.max()))


# And an overall min/max for the entire dataset.
    
data = satdat.read()
print("Overall min/max: {} {}".format(data.min(), data.max()))


Band 1 min 3493 max 22058
Band 2 min 2905 max 17846
Band 3 min 2048 max 19242
Band 4 min 1132 max 13004
Overall min/max: 1132 22058

In [15]:
# Let's grab the pixel 2km east and 2km south of the upper left corner

# World coordinates for the desired pixel.
x_coord = satdat.bounds.left - 2000
y_coord = satdat.bounds.top + 2000

# Convert world coordinates to pixel.  World coordinates may not transform precisely to row and column indexes,
# but a Numpy array can only be indexed by integer values.  The 'op' parameter for 'satdat.index()' determines
# how the transformed values are rounded.  In some cases any point falling within a pixel should be considered
# contained, and in other cases only points falling within one portion of the pixels hould be considered contained.
# The 'op' parameter lets users make this decision on their own.  The values must still be cast to integers.
col, row = satdat.index(x_coord, y_coord, op=math.floor)
col = int(col)
row = int(row)


# Now let's look at the value of each band at this pixel
print("Red: {}".format(red[row, col]))
print("Green: {}".format(green[row, col]))
print("Blue: {}".format(blue[row, col]))
print("NIR: {}".format(nir[row, col]))


Red: 3856
Green: 4610
Blue: 5097
NIR: 3208