IHE Python course, 2017

CHIRPS data for precipitation (worldwide between 50S and 50N latitude)

Never again find yourself without appropriate precipitation data

Thisi is what we've leaned from the presentation by Tim Hessels on March 14.

To put this in practice, we'll download precipitaion data for a groundwater model in Morocco in the Tafilat area near Erfoud (find it in GoogleMaps.


In [17]:
import numpy as np
from pprint import pprint 


def prar(A, ncol=8, maxsize=1000):
    """prints 2D arrays the Matlab 2 (more readable)"""
    if A.size>1000:   # don't try to print a million values, or your pc will hang.      
        print(A)
        return
    n = A.shape[1]
    # print columns in formatted chunks that fit on on line
    for i, Asub in enumerate(np.split(A, range(ncol, n, ncol), axis=1)):
        if Asub.size == 0: Asub=A
        print("columns[{}:{}]".format(i * ncol, i * ncol +Asub.shape[1]))
        for L in Asub:
            print((" {:10.5g}" * len(L)).format(*L))
        print()

CHIRPS (Climate Hazards group Infrared Precipitation with Stations

Download the data files for the desired periods for the whole of Africa from CHIRPS. You can do this with FileZilla a free app for this purpose.

For access to the CHIRPTS data see

http://chg.ucsb.edu/data/chirps/

Next to tiff files one can find png (images) on the sight that can be directly viewed in your browser or imported into any application, whithout any processing. But of course a pictures does not have the original data.

glob (unix-like file handling for python)

Assuming that you have downloaded some files, use glob to get a list of them on your computer.


In [18]:
import glob 

chirps_files = glob.glob('../**/*/*.tif')
pprint(chirps_files)
fname = chirps_files[0]


['../Mar14/chirps/chirps-v2.0.2017.01.1.tif',
 '../Mar14/chirps/chirps-v2.0.2017.01.2.tif',
 '../Mar14/chirps/chirps-v2.0.2017.01.3.tif']

gdal (working with tiff files among others, GIS)

import gdal and check if the file is present by opening it.


In [19]:
import gdal

try: # is the file present?
    g = gdal.Open(fname)
except:
    exception(FileExistsError("Can't open file <{}>".fname))

Get some basic information from the tiff file

Ok, now with g the successfully opended CHIRPS file, get some basic information from that file.


In [20]:
print("\nBasic information on file <{}>\n".format(fname))
print("Driver: ", g.GetDriver().ShortName, '/', g.GetDriver().LongName)
print("Size : ", g.RasterXSize, 'x', g.RasterYSize, 'x', g.RasterCount)
print("Projection :\n", g.GetProjection())
print()
print("\nGeotransform information:\n")
gt = g.GetGeoTransform()
print("Geotransform :", gt)

# assign the individual fields to more recognizable variables
xUL, dx, xRot, yUL, yRot, dy = gt

# get the size of the data and the number of bands in the tiff file (is 1)
Nx, Ny, Nband = g.RasterXSize, g.RasterYSize, g.RasterCount

# show what we've got:
print('Nx = {}\nNy = {}\nxUL = {}\nyUL = {}\ndx  = {}\ndy  = {} <--- Negative !'.format(Nx, Ny, xUL, yUL, dx, dy))


Basic information on file <../Mar14/chirps/chirps-v2.0.2017.01.1.tif>

Driver:  GTiff / GeoTIFF
Size :  1500 x 1600 x 1
Projection :
 GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],AUTHORITY["EPSG","4326"]]


Geotransform information:

Geotransform : (-20.0, 0.05000000074505806, 0.0, 40.0, 0.0, -0.05000000074505806)
Nx = 1500
Ny = 1600
xUL = -20.0
yUL = 40.0
dx  = 0.05000000074505806
dy  = -0.05000000074505806 <--- Negative !

This projection says that it's WGS1984 (same as GoogleEarth and GoogleMaps. Therefore it is in longitude (x) and latitute (y) coordinates. This allows to immediately compute the WGS coordinates (lat/lon) from it, for instance for each pixel/cell center. It's also straightforward to compute the bounding box of this array and plot it in QGIS for instance:


In [21]:
# Bounding box around the tiff data set
tbb = [xUL, yUL + Ny * dy, xUL + Nx * dx, yUL]
print("Boudning box of data in tiff file :", tbb)


Boudning box of data in tiff file : [-20.0, -40.000001192092896, 55.00000111758709, 40.0]

In [22]:
# Generate coordinates for tiff pixel centers
xm = 0.5 * dx + np.linspace(xUL, xUL + Nx * dx, Nx) 
ym = 0.5 * dy + np.linspace(yUL, yUL + Ny * dy, Ny)

Generate a shapefile with a polyline that represents the model boundary

The contour coordinates of the Erfoud/Tafilalet groundwater model happen to be the file ErfoudModelContour.kml. Kml files come from GoogleEarth and are in WGS84 coordinates. It was obtained by digitizing the line directly in Google Earth.

We extract the coordinates from that HTML file and put them in a list of lists, the form needed to inject the coordinates into the shapefile.

Extraction can be done in several ways, for instance with one of the HTML parsers that are available on the internet. However if you look at this file in the internet, it's clear that we may do this in a simple way. Read the file line by line until we find the word "coordinates". Then read the next line, which contains all the coordinates. Then clean that line form tabs, put a comma between each tripple of coordinate values and turn it into a list of lists with each list the x, y an z values of one vertice of the model boundary:


In [23]:
with open('ErfoudModelContour.kml', 'r') as f:
    for s in f:  # read lines from this file
        if s.find('coord') > 0:  # word "coord" bound?
            
            # Then the next line has all coordinates. Read it and clean up.
            pnts_as_str = f.readline().replace(' ',',').replace('\t','').split(',')
            
            # Use a comprehension to put these coordinates in a list, where list[i] has
            # a sublist of the three x, y and z coordinates.
            points = [ [float(p) for p in p3]
                          for p3 in [pnts_as_str[i:i+3]
                                         for i in range(0, len(pnts_as_str), 3)] ]
            break;

# The points
pnts = np.array(points)

# The bounding box
mbb = [np.min(pnts[:,0]), np.min(pnts[:,1]), np.max(pnts[:,0]), np.max(pnts[:,1])]

#pprint(points)
#print(mbb)

Generate the shapefile holding a 3 polygons a) The bonding box around the data in the tiff file, b) the bounding box of around the model contour. 3) the model contour


In [24]:
import shapefile as shp

tb = lambda indices: [tbb[i] for i in indices] # convenience for selecting from tiff bounding box
mb = lambda indices: [mbb[i] for i in indices] # same for selecting from model bounding box

# open a shape file writer objetc
w = shp.Writer(shapeType=shp.POLYGON)

# add the three polylines to w.shapes
# each shape has parts of of which can contain a polyline. We have one polyline, i.e. one part
# in each chape. Therfore parts is a list of one item, which is a list of points of the polyline.
w.poly(parts=[points]) # only one part, therefore, put points inbetween brackets.
w.poly(parts=[[ tb([0, 1]), tb([2, 1]), tb([2, 3]), tb([0, 3]), tb([0, 1])]]) # bbox of tiff file
w.poly(parts=[[ mb([0, 1]), mb([2, 1]), mb([2, 3]), mb([0, 3]), mb([0, 1])]]) # bbox of model

w.field("Id","C", 20)      # Add one field
w.field("Id2", "N")        # Add another field, just to see if it works and how

# Aadd three records to w.records (one for eache shape
w.record("model contour", 1) # each record has two values, a string and a nuber, see fields
w.record("model bbox", 2)
w.record("Tiff bbox", 3)

# save this to a new shapefile
w.save("ErfoudModelContour")

# Change False to True so see the coordinates and the records
if False:
    print()
    for i, sh in enumerate(w.shapes()):
        pprint(sh.points)
        print()
    #w.shapes()[0].points   # check if w knows about these points

    for r in w.records:
        print(r)

# To verify what's been saved read the saved file and show what's in it:
if False:
    s = shp.Reader("ErfoudModelContour")

    for sh in s.shapeRecords():
        pprint(sh.shape.points)
        print(sh.record)

Show shapefile in QGIS

Fire up QGIS and load the shape file. Set its CRS to WGS84 (same coordinates as GoogleMaps, most general LatLon)

Here are the pictures taken van de screen of QGIS after the shapefile was loaded and the label under properties was set tot transparent with solid contour line.

To get the GoogleMaps image, look for it it under web in the main menu.

The first image is zoomed out, so that the location of the model can be seen in the south east of this image. It's in Morocco.

The more detailed image shows the contour of the model and its bounding box. It proves that it works.

The next step is to select the appropriage precipitation data from the CHIRPS file.

Get the precipitation data from the CHIRPS tiff file

The actual data are stored in rasterbands. We saw from the size above, that this file has only one rasterband. Rasterband information is obtained one band at a time. So here we pass band number 1.


In [25]:
A = g.GetRasterBand(1).ReadAsArray()

A[A <- 9000] = 0.  # replace no-dta values by 0

print()
print("min precipitation in mm ", np.min(A))
print("max precipitation in mm ", np.max(A))


min precipitation in mm  0.0
max precipitation in mm  541.912

Select a subarea equal to the bbox of the model contour.


In [26]:
# define a function to get the indices of the center points between the bounding box extents of the model

def between(x, a, b):
    """returns indices of ponts between a and b"""
    I = np.argwhere(np.logical_and(min(a, b) < x, x < max(a, b)))
    return [i[0] for i in I] 

ix = between(xm, mbb[0], mbb[2])
iy = between(ym, mbb[1], mbb[3])

print(ix)
print(iy)


[307, 308, 309, 310, 311, 312]
[168, 169, 170, 171]

Read the data again, but now only the part that covers the model in Marocco:


In [27]:
A = g.GetRasterBand(1).ReadAsArray(xoff=int(ix[0]), yoff=int(iy[0]), win_xsize=len(ix), win_ysize=len(iy))

print("Preciptation on the Erfoud model area in Marocco from file\n{}:\n".format(fname))
prar(A)


Preciptation on the Erfoud model area in Marocco from file
../Mar14/chirps/chirps-v2.0.2017.01.1.tif:

columns[0:6]
     3.2983     3.2105      3.275     3.2667     3.3076     3.5025
      3.544     3.4894     3.5061     3.4976     3.5473     3.7107
     3.9205     4.0261     3.9009     3.6915     3.8585     3.8236
     4.0787     4.2196     4.1013     3.9311     4.0262     3.9142

Just for curiosity, show the size of the area covered and the size resolution of the precipitation data.


In [28]:
# The extent of this area can be obtained from the latiture and longitude together with the radius of the earth.
R = 6371  # km
EWN = R * np.cos(np.pi/180 * mbb[1]) * np.pi/180. *(mbb[2] - mbb[0])
EWS = R * np.cos(np.pi/180 * mbb[3]) * np.pi/180. *(mbb[2] - mbb[0])
NS  = R * np.pi/180 * (mbb[3] - mbb[1])

print("The size of the bounding box in km:")
print("EW along the north boundary : ",EWN)
print("EW along the south boundary : ",EWS)
print("NS                          : ",NS)
print("Size of each tile (the resolution) = {:.3f} x {:.3f} km: ".format(EWN/A.shape[1], NS/A.shape[0]))


The size of the bounding box in km:
EW along the north boundary :  31.9000522861
EW along the south boundary :  31.8330523883
NS                          :  21.8543936582
Size of each tile (the resolution) = 5.317 x 5.464 km: 

It should be clear that the EW resolution depends on the latitude while the NS resolution is constant.

Conclusion

It is now straightforward to get the data of an arbitrary number of periods from the CHIRPS website for the model, in fact for any location covered by CHIRPS on a 5 by b km resolution.