Chris Holden (ceholden@gmail.com) - https://github.com/ceholden

Chapter 1: Exploring the GDALDataset class

Introduction

One of the most fundamental components of the GDAL raster library is the class object that represents a raster image dataset. That class is called the GDALDataset. You can find the Application Programming Interface (API) references here:

You will find on these API web pages references to class methods (remember: a method is just a function that belongs to a class) suhc as:

  • GetDriver
  • GetRasterBand
  • GetGeoTransform
  • GetProjection
  • GetSubDatasets

These class methods are so called "getter" methods that allow you to access class attributes (remember: class attributes are just variables that belong to the class). When you call the class method, GetDriver, the GDAL dataset will return the image format driver (e.g., ENVI driver, GeoTIFF driver, HDF driver) responsibile for handling the input and output operations for this raster file format. Similarly, the GetGeoTransform method will the transformation that can be used to translate between pixel coordinates and projection coordinates.

Note: the "getter" and "setter" class methods for accessing and settting class properties is not "Pythonic" - these methods exist because the API was originally written for C++ where such methods are normal.

Another suite of class methods allow you to set class attributes. These include:

  • SetGeoTransform
  • SetProjection

which allow you to modify the geographic projection and location of the image.

Module import in Python

Now that we've seen some of how the GDALDataset object encapsulates many of the ideas relevant to the concept of a raster image, let's see how we can implement these ideas in Python.

Before we can get started, we need to tell Python that we will be using functions, classes, and variables from the GDAL Python package. The technical wording for this is that we need to import the GDAL module into our namespace (see Python's documentation on the module system here).

We will do this using some import statements:


In [9]:
# Import the Python 3 print function
from __future__ import print_function

# Import the "gdal" submodule from within the "osgeo" module
from osgeo import gdal

# We can check which version we're running by printing the "__version__" variable
print("GDAL's version is: " + gdal.__version__)
print(gdal)


GDAL's version is: 1.10.1
<module 'osgeo.gdal' from '/usr/lib/python2.7/dist-packages/osgeo/gdal.pyc'>

Once we import the gdal submodule, Python will know where to look on our system for the code that implements the GDAL API. When we want to access classes, variables, or functions within the gdal submodule, we will need to reference the full path, which includes the gdal reference:


In [10]:
# Let's print out the value of the GDAL Byte data type (GDT_Byte)
#     the number printed out is how GDAL keeps track of the various data types
#     this variable, which has a fixed numeric value, is what's called an enumerated type, or enum

# Works
print(gdal.GDT_Byte)
# Doesn't work
print(GDT_Byte)


1
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-10-cb75e61c602f> in <module>()
      6 print(gdal.GDT_Byte)
      7 # Doesn't work
----> 8 print(GDT_Byte)

NameError: name 'GDT_Byte' is not defined

The datatype GDT_Byte doesn't exist in the global namespace. We need to tell Python where to look for it to find it.

With some basic explanation of Python's namespace setup and how it applies to GDAL out of the way, let's get into some examples:

Examples

Open an image

When we open an image in GDAL we create a GDALDataset object. As the name would suggest, we can open an image with the "Open" function within gdal.

We will use an example image provided in this repository for this chapter. This image is a subset of a Landsat 7 image containing the 7 bands on this sensor rearranged in order of wavelength (e.g., Landsat 7's second SWIR channel comes before thermal channel in our stack). The last band in this image is a cloud and cloud shadow mask from Fmask.


In [11]:
# Open a GDAL dataset
dataset = gdal.Open('../example/LE70220491999322EDC01_stack.gtif', gdal.GA_ReadOnly)

print(dataset)


<osgeo.gdal.Dataset; proxy of <Swig Object of type 'GDALDatasetShadow *' at 0x7f452f141e40> >

Now that we have this dataset open, let's explore some of its capabilities.

Image attributes


In [12]:
# How many bands does this image have?
num_bands = dataset.RasterCount
print('Number of bands in image: {n}\n'.format(n=num_bands))

# How many rows and columns?
rows = dataset.RasterYSize
cols = dataset.RasterXSize
print('Image size is: {r} rows x {c} columns\n'.format(r=rows, c=cols))

# Does the raster have a description or metadata?
desc = dataset.GetDescription()
metadata = dataset.GetMetadata()

print('Raster description: {desc}'.format(desc=desc))
print('Raster metadata:')
print(metadata)
print('\n')

# What driver was used to open the raster?
driver = dataset.GetDriver()
print('Raster driver: {d}\n'.format(d=driver.ShortName))

# What is the raster's projection?
proj = dataset.GetProjection()
print('Image projection:')
print(proj + '\n')

# What is the raster's "geo-transform"
gt = dataset.GetGeoTransform()
print('Image geo-transform: {gt}\n'.format(gt=gt))


Number of bands in image: 8

Image size is: 250 rows x 250 columns

Raster description: ../example/LE70220491999322EDC01_stack.gtif
Raster metadata:
{'AREA_OR_POINT': 'Area', 'Band_6': 'band 7 reflectance', 'Band_7': 'band 6 temperature', 'Band_4': 'band 4 reflectance', 'Band_5': 'band 5 reflectance', 'Band_2': 'band 2 reflectance', 'Band_3': 'band 3 reflectance', 'Band_1': 'band 1 reflectance', 'Band_8': 'Band 8'}


Raster driver: GTiff

Image projection:
PROJCS["WGS 84 / UTM zone 15N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-93],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AUTHORITY["EPSG","32615"]]

Image geo-transform: (462405.0, 30.0, 0.0, 1741815.0, 0.0, -30.0)

The first few pieces of information we obtained are fairly straightforward - the raster size, the number of bands, a description, some metadata, and the raster's file format.

The image's projection is formatted in what's known as "Well Known Text". For more information on specific projections and for format conversions among projection description formats (e.g., proj4 string, WKT, ESRI WKT, JSON, etc.) see Spatial Reference.

The last piece of information we accessed is something called a "geotransform". This set of 6 numbers provides all the information required to and from transform pixel and projected coordinates. In this example, the first number (462405) and the fourth number (1741815) are the top left of the upper left pixel of the raster. The pixel size in x and y dimensions of the raster is listed as the second (30) and the sixth (-30) numbers. Since our raster is north up oriented, the third and fifth numbers are 0. For more information on the GDAL data model, visit this web page.

Image raster bands

The GDALDataset object we created contains a lot of useful information but it is not directly used to read in the raster image. Instead we will need to access each of the raster's bands individually using the method GetRasterBand:


In [13]:
# Open the blue band in our image
blue = dataset.GetRasterBand(1)

print(blue)


<osgeo.gdal.Band; proxy of <Swig Object of type 'GDALRasterBandShadow *' at 0x7f452f149180> >

Following our guide of the GDALDataset, let's explore some of the attributes and methods of the GDALRasterBand:


In [14]:
# What is the band's datatype?
datatype = blue.DataType
print('Band datatype: {dt}'.format(dt=blue.DataType))

# If you recall from our discussion of enumerated types, this "3" we printed has a more useful definition for us to use
datatype_name = gdal.GetDataTypeName(blue.DataType)
print('Band datatype: {dt}'.format(dt=datatype_name))

# We can also ask how much space does this datatype take up
bytes = gdal.GetDataTypeSize(blue.DataType)
print('Band datatype size: {b} bytes\n'.format(b=bytes))

# How about some band statistics?
band_max, band_min, band_mean, band_stddev = blue.GetStatistics(0, 1)
print('Band range: {minimum} - {maximum}'.format(maximum=band_max,
                                                 minimum=band_min))
print('Band mean, stddev: {m}, {s}\n'.format(m=band_mean, s=band_stddev))


Band datatype: 3
Band datatype: Int16
Band datatype size: 16 bytes

Band range: 1810.0 - 198.0
Band mean, stddev: 439.015984, 139.716828766

Note that we didn't need to read the image into Python's memory to calculate these statistics - GDAL did all of this for us.

For most applications, however, we will need to use GDAL to read our raster bands into memory. When we load our raster band into memory we will read it into a NumPy 2 dimensional array. NumPy is, "the fundamental package for scientific computing with Python", because it allows us to represent our data in a very memory efficient way.

NumPy arrays are the cornerstone or building block of the rest of the Scientific Python suite of software. Get familiar with them:

In order to read our band into one of these wonderful np.array objects, we will use the ReadAsArray method from our GDALRasterBand object:


In [24]:
help(blue.ReadAsArray)


Help on method ReadAsArray in module osgeo.gdal:

ReadAsArray(self, xoff=0, yoff=0, win_xsize=None, win_ysize=None, buf_xsize=None, buf_ysize=None, buf_obj=None) method of osgeo.gdal.Band instance

The method ReadAsArray takes arguments that allow us to specify a subset of the raster band image using X and Y offsets and sizes. Remember this ability when you want to process large images or are working with a limited amount of memory. In these circumstances, you will run out of memory if you read the entire dataset in at once. Instead, read in a block of some number of columns and rows at one time, perform your computation and store your output, and then chunk through the rest of the image.

For now, we'll just read in the entire image:


In [28]:
blue_data = blue.ReadAsArray()

print(blue_data)
print()
print('Blue band mean is: {m}'.format(m=blue_data.mean()))
print('Size is: {sz}'.format(sz=blue_data.shape))


[[569 526 569 ..., 311 289 311]
 [568 589 568 ..., 267 332 332]
 [546 525 589 ..., 311 311 311]
 ..., 
 [499 543 478 ..., 306 349 372]
 [520 520 543 ..., 328 372 393]
 [543 564 543 ..., 393 414 436]]

Blue band mean is: 439.015984
Size is: (250, 250)

With our data read into a NumPy array, we can print it to console and even perform statistics on it. In addition to helping us store massive amounts of data efficiently, NumPy will help us with some basic linear algebra, numerical operations, and summary statistics.

Let's say we want to read all of our bands into one 3 dimensional (nrow x ncol x nband) dataset. We will begin by initializing a 3 dimensional array. Next, we will loop over all bands in our raster image dataset and read them into our newly allocated 3 dimensional array:


In [42]:
# Initialize a 3d array -- use the size properties of our image for portability!
image = np.zeros((dataset.RasterYSize, dataset.RasterXSize, dataset.RasterCount))

# Loop over all bands in dataset
for b in xrange(dataset.RasterCount):
    # Remember, GDAL index is on 1, but Python is on 0 -- so we add 1 for our GDAL calls
    band = dataset.GetRasterBand(b + 1)
    
    # Read in the band's data into the third dimension of our array
    image[:, :, b] = band.ReadAsArray()

print(image)
print(image.dtype)


[[[  569.   886.   758. ...,   988.  2686.     0.]
  [  526.   886.   758. ...,   988.  2686.     0.]
  [  569.   798.   776. ...,   961.  2686.     0.]
  ..., 
  [  311.   452.   356. ...,   780.  2487.     0.]
  [  289.   452.   297. ...,   647.  2487.     0.]
  [  311.   430.   337. ...,   674.  2436.     0.]]

 [[  568.   886.   815. ...,  1014.  2686.     0.]
  [  589.   929.   853. ...,  1014.  2686.     0.]
  [  568.   907.   910. ...,  1014.  2686.     0.]
  ..., 
  [  267.   430.   318. ...,   674.  2487.     0.]
  [  332.   452.   337. ...,   594.  2487.     0.]
  [  332.   452.   395. ...,   727.  2487.     0.]]

 [[  546.   864.   834. ...,   988.  2637.     0.]
  [  525.   886.   834. ...,   988.  2637.     0.]
  [  589.   928.   891. ...,   988.  2637.     0.]
  ..., 
  [  311.   452.   356. ...,   727.  2487.     0.]
  [  311.   475.   395. ...,   727.  2487.     0.]
  [  311.   475.   356. ...,   753.  2487.     0.]]

 ..., 
 [[  499.   851.   766. ...,  1124.  2785.     0.]
  [  543.   829.   747. ...,  1387.  2735.     0.]
  [  478.   829.   766. ...,  1414.  2735.     0.]
  ..., 
  [  306.   473.   336. ...,   753.  2537.     0.]
  [  349.   494.   375. ...,   753.  2537.     0.]
  [  372.   584.   454. ...,   780.  2587.     0.]]

 [[  520.   829.   766. ...,  1283.  2785.     0.]
  [  520.   829.   786. ...,  1387.  2785.     0.]
  [  543.   829.   805. ...,  1361.  2785.     0.]
  ..., 
  [  328.   473.   336. ...,   568.  2587.     0.]
  [  372.   628.   433. ...,  1045.  2587.     0.]
  [  393.   628.   512. ...,  1018.  2587.     0.]]

 [[  543.   829.   825. ...,  1440.  2785.     0.]
  [  564.   851.   863. ...,  1467.  2785.     0.]
  [  543.   829.   766. ...,  1387.  2785.     0.]
  ..., 
  [  393.   607.   551. ...,  1230.  2686.     0.]
  [  414.   673.   572. ...,  1440.  2686.     0.]
  [  436.   628.   493. ...,  1045.  2637.     0.]]]
float64

One minor tweak we can make is to ensure that we read our GDAL image into a NumPy array of a matching datatype. GDAL has a function which can make this GDAL <-> NumPy translation for us:


In [40]:
dataset.GetRasterBand(1).DataType


Out[40]:
3

In [43]:
from osgeo import gdal_array

# DataType is a property of the individual raster bands
image_datatype = dataset.GetRasterBand(1).DataType

# Allocate our array, but in a more efficient way
image_correct = np.zeros((dataset.RasterYSize, dataset.RasterXSize, dataset.RasterCount),
                 dtype=gdal_array.GDALTypeCodeToNumericTypeCode(image_datatype))

# Loop over all bands in dataset
for b in xrange(dataset.RasterCount):
    # Remember, GDAL index is on 1, but Python is on 0 -- so we add 1 for our GDAL calls
    band = dataset.GetRasterBand(b + 1)
    
    # Read in the band's data into the third dimension of our array
    image_correct[:, :, b] = band.ReadAsArray()

print("Compare datatypes: ")
print("    when unspecified: {dt}".format(dt=image.dtype))
print("    when specified: {dt}".format(dt=image_correct.dtype))


Compare datatypes: 
    when unspecified: float64
    when specified: int16

Much more efficiently done this way -- we're saving 4x the memory!

As you proceed into the next chapter, the last key concept to understand is how to de-allocate memory within Python.

In order to close out your GDAL datasets and to signal that your NumPy arrays can be de-allocated, you can set them to None:


In [47]:
dataset = None

image = None
image_correct = None