Thisi is what we've leaned from the presentation by Tim Hessels on March 14.
To put this in practice, we'll download precipitaion data for a groundwater model in Morocco in the Tafilat area near Erfoud (find it in GoogleMaps.
In [17]:
import numpy as np
from pprint import pprint
def prar(A, ncol=8, maxsize=1000):
"""prints 2D arrays the Matlab 2 (more readable)"""
if A.size>1000: # don't try to print a million values, or your pc will hang.
print(A)
return
n = A.shape[1]
# print columns in formatted chunks that fit on on line
for i, Asub in enumerate(np.split(A, range(ncol, n, ncol), axis=1)):
if Asub.size == 0: Asub=A
print("columns[{}:{}]".format(i * ncol, i * ncol +Asub.shape[1]))
for L in Asub:
print((" {:10.5g}" * len(L)).format(*L))
print()
Download the data files for the desired periods for the whole of Africa from CHIRPS. You can do this with FileZilla a free app for this purpose.
For access to the CHIRPTS data see
http://chg.ucsb.edu/data/chirps/
Next to tiff files one can find png (images) on the sight that can be directly viewed in your browser or imported into any application, whithout any processing. But of course a pictures does not have the original data.
In [18]:
import glob
chirps_files = glob.glob('../**/*/*.tif')
pprint(chirps_files)
fname = chirps_files[0]
In [19]:
import gdal
try: # is the file present?
g = gdal.Open(fname)
except:
exception(FileExistsError("Can't open file <{}>".fname))
In [20]:
print("\nBasic information on file <{}>\n".format(fname))
print("Driver: ", g.GetDriver().ShortName, '/', g.GetDriver().LongName)
print("Size : ", g.RasterXSize, 'x', g.RasterYSize, 'x', g.RasterCount)
print("Projection :\n", g.GetProjection())
print()
print("\nGeotransform information:\n")
gt = g.GetGeoTransform()
print("Geotransform :", gt)
# assign the individual fields to more recognizable variables
xUL, dx, xRot, yUL, yRot, dy = gt
# get the size of the data and the number of bands in the tiff file (is 1)
Nx, Ny, Nband = g.RasterXSize, g.RasterYSize, g.RasterCount
# show what we've got:
print('Nx = {}\nNy = {}\nxUL = {}\nyUL = {}\ndx = {}\ndy = {} <--- Negative !'.format(Nx, Ny, xUL, yUL, dx, dy))
This projection says that it's WGS1984 (same as GoogleEarth and GoogleMaps. Therefore it is in longitude (x) and latitute (y) coordinates. This allows to immediately compute the WGS coordinates (lat/lon) from it, for instance for each pixel/cell center. It's also straightforward to compute the bounding box of this array and plot it in QGIS for instance:
In [21]:
# Bounding box around the tiff data set
tbb = [xUL, yUL + Ny * dy, xUL + Nx * dx, yUL]
print("Boudning box of data in tiff file :", tbb)
In [22]:
# Generate coordinates for tiff pixel centers
xm = 0.5 * dx + np.linspace(xUL, xUL + Nx * dx, Nx)
ym = 0.5 * dy + np.linspace(yUL, yUL + Ny * dy, Ny)
The contour coordinates of the Erfoud/Tafilalet groundwater model happen to be the file ErfoudModelContour.kml. Kml files come from GoogleEarth and are in WGS84 coordinates. It was obtained by digitizing the line directly in Google Earth.
We extract the coordinates from that HTML file and put them in a list of lists, the form needed to inject the coordinates into the shapefile.
Extraction can be done in several ways, for instance with one of the HTML parsers that are available on the internet. However if you look at this file in the internet, it's clear that we may do this in a simple way. Read the file line by line until we find the word "coordinates". Then read the next line, which contains all the coordinates. Then clean that line form tabs, put a comma between each tripple of coordinate values and turn it into a list of lists with each list the x, y an z values of one vertice of the model boundary:
In [23]:
with open('ErfoudModelContour.kml', 'r') as f:
for s in f: # read lines from this file
if s.find('coord') > 0: # word "coord" bound?
# Then the next line has all coordinates. Read it and clean up.
pnts_as_str = f.readline().replace(' ',',').replace('\t','').split(',')
# Use a comprehension to put these coordinates in a list, where list[i] has
# a sublist of the three x, y and z coordinates.
points = [ [float(p) for p in p3]
for p3 in [pnts_as_str[i:i+3]
for i in range(0, len(pnts_as_str), 3)] ]
break;
# The points
pnts = np.array(points)
# The bounding box
mbb = [np.min(pnts[:,0]), np.min(pnts[:,1]), np.max(pnts[:,0]), np.max(pnts[:,1])]
#pprint(points)
#print(mbb)
Generate the shapefile holding a 3 polygons a) The bonding box around the data in the tiff file, b) the bounding box of around the model contour. 3) the model contour
In [24]:
import shapefile as shp
tb = lambda indices: [tbb[i] for i in indices] # convenience for selecting from tiff bounding box
mb = lambda indices: [mbb[i] for i in indices] # same for selecting from model bounding box
# open a shape file writer objetc
w = shp.Writer(shapeType=shp.POLYGON)
# add the three polylines to w.shapes
# each shape has parts of of which can contain a polyline. We have one polyline, i.e. one part
# in each chape. Therfore parts is a list of one item, which is a list of points of the polyline.
w.poly(parts=[points]) # only one part, therefore, put points inbetween brackets.
w.poly(parts=[[ tb([0, 1]), tb([2, 1]), tb([2, 3]), tb([0, 3]), tb([0, 1])]]) # bbox of tiff file
w.poly(parts=[[ mb([0, 1]), mb([2, 1]), mb([2, 3]), mb([0, 3]), mb([0, 1])]]) # bbox of model
w.field("Id","C", 20) # Add one field
w.field("Id2", "N") # Add another field, just to see if it works and how
# Aadd three records to w.records (one for eache shape
w.record("model contour", 1) # each record has two values, a string and a nuber, see fields
w.record("model bbox", 2)
w.record("Tiff bbox", 3)
# save this to a new shapefile
w.save("ErfoudModelContour")
# Change False to True so see the coordinates and the records
if False:
print()
for i, sh in enumerate(w.shapes()):
pprint(sh.points)
print()
#w.shapes()[0].points # check if w knows about these points
for r in w.records:
print(r)
# To verify what's been saved read the saved file and show what's in it:
if False:
s = shp.Reader("ErfoudModelContour")
for sh in s.shapeRecords():
pprint(sh.shape.points)
print(sh.record)
Fire up QGIS and load the shape file. Set its CRS to WGS84 (same coordinates as GoogleMaps, most general LatLon)
Here are the pictures taken van de screen of QGIS after the shapefile was loaded and the label under properties was set tot transparent with solid contour line.
To get the GoogleMaps image, look for it it under web in the main menu.
The first image is zoomed out, so that the location of the model can be seen in the south east of this image. It's in Morocco.
The more detailed image shows the contour of the model and its bounding box. It proves that it works.
The next step is to select the appropriage precipitation data from the CHIRPS file.
In [25]:
A = g.GetRasterBand(1).ReadAsArray()
A[A <- 9000] = 0. # replace no-dta values by 0
print()
print("min precipitation in mm ", np.min(A))
print("max precipitation in mm ", np.max(A))
Select a subarea equal to the bbox of the model contour.
In [26]:
# define a function to get the indices of the center points between the bounding box extents of the model
def between(x, a, b):
"""returns indices of ponts between a and b"""
I = np.argwhere(np.logical_and(min(a, b) < x, x < max(a, b)))
return [i[0] for i in I]
ix = between(xm, mbb[0], mbb[2])
iy = between(ym, mbb[1], mbb[3])
print(ix)
print(iy)
Read the data again, but now only the part that covers the model in Marocco:
In [27]:
A = g.GetRasterBand(1).ReadAsArray(xoff=int(ix[0]), yoff=int(iy[0]), win_xsize=len(ix), win_ysize=len(iy))
print("Preciptation on the Erfoud model area in Marocco from file\n{}:\n".format(fname))
prar(A)
Just for curiosity, show the size of the area covered and the size resolution of the precipitation data.
In [28]:
# The extent of this area can be obtained from the latiture and longitude together with the radius of the earth.
R = 6371 # km
EWN = R * np.cos(np.pi/180 * mbb[1]) * np.pi/180. *(mbb[2] - mbb[0])
EWS = R * np.cos(np.pi/180 * mbb[3]) * np.pi/180. *(mbb[2] - mbb[0])
NS = R * np.pi/180 * (mbb[3] - mbb[1])
print("The size of the bounding box in km:")
print("EW along the north boundary : ",EWN)
print("EW along the south boundary : ",EWS)
print("NS : ",NS)
print("Size of each tile (the resolution) = {:.3f} x {:.3f} km: ".format(EWN/A.shape[1], NS/A.shape[0]))
It should be clear that the EW resolution depends on the latitude while the NS resolution is constant.