GIS refers to methods of storing, displaying and analyzing geographical information. These methods have become essential in economic analysis (as you have noticed from the reading list for our Ph.D. course on economic growth). For this reason, it is good that you acquaint yourself with these methods. They will prove very useful when doing research, especially for showing the spatial distribution of your variables of interest, constructing new measures, or doing spatial analysis.
There are various specialized GIS programs and packages. ESRI produces ArcGIS, which is the best-known and most commonly used commercial software. It is very easy to use to produce maps and do simple computations. Most universities (including ours) offer it in their computer labs. The main disadvantages are that it requires a computer running Windows, it is costly, and it is extremely slow for computations.
For this reason I always suggest you learn and use open-source alternatives.
There are many open source GIS projects, many of which are supported/gathered at
Here I will give you the basic idea of what you need to install to have a working GIS environment. I assume you have already installed Canopy with all the packages provided by Enthought. Additionally, you will need to install GRASS, QGIS, and GDAL/OGR.
Download the installers from each of these websites and you should be done!
There are various methods of getting these on your computer.
I used to install these using the installers provided by Kyngchaos, but I have moved to HomeBrew, which also lets you install many other GNU projects. To do so, open a terminal window (I recommend getting iTerm2, which is more powerful than the terminal provided by OS X) and run the following command (I think you will need to have Xcode and its command-line utilities installed)
ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
This should install HomeBrew on your system and let you know of any issues. Once you have done so and have a working homebrew installation, you will be able to install packages and programs using the
brew install command. Before doing so, you should run the following commands
brew tap homebrew/science
This will update HomeBrew's formulas to the latest version. Now issue the command
brew install gdal
brew install qgis
This will install the latest version of QGIS and its dependencies. Regretfully the HomeBrew version installed is 1.8, while QGIS is already at version 2.x. To install that version you will need to look for solutions on the web. On OS X Mavericks I used the following instructions, as explained in this unofficial QGIS-2.0 formula.
brew tap osgeo/osgeo4mac
brew tap --repair
brew tap homebrew/science
brew install osgeo/osgeo4mac/qgis-2x
sudo easy_install psycopg2
This should install the required software. I will update this as I know more...
It seems Canopy already has GDAL incorporated, at least so it seems on Mac OS X. But there is a bug that might prevent it from working, unless you add the following line on your
Ok...now that we have a working GIS desktop system, let us talk a little about types of data. All GIS data includes elements with their properties and their geographical information, such as location as determined by e.g. latitude and longitude, address, zip code, etc. For example, an element might be a country, with properties such as GDP per capita and the Gini coefficient, together with its location. Another example might be a restaurant with its menu and prices, services (dine-in, take-out, delivery), area of service and location.
GIS data comes in many different formats, although most of them can be categorized into two types: raster and vector formats.
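To make the distinction concrete, here is a minimal Python sketch of how the two types can be represented. The class names, field names and example values are my own assumptions for illustration; they do not correspond to any particular file format or library.

```python
from dataclasses import dataclass, field


@dataclass
class VectorFeature:
    """One vector feature: a geometry plus a dictionary of properties."""
    geometry_type: str            # "Point", "LineString" or "Polygon"
    coordinates: list             # (longitude, latitude) pairs
    properties: dict = field(default_factory=dict)


@dataclass
class Raster:
    """A raster: a grid of cell values plus what is needed to place it on a map."""
    values: list                  # list of rows; row 0 is the northern edge
    west: float                   # longitude of the left edge, in degrees
    north: float                  # latitude of the top edge, in degrees
    cell_size: float              # cell width and height, in degrees

    def cell_center(self, row, col):
        """Return (longitude, latitude) of the center of cell (row, col)."""
        lon = self.west + (col + 0.5) * self.cell_size
        lat = self.north - (row + 0.5) * self.cell_size
        return lon, lat


# Hypothetical example values, for illustration only.
restaurant = VectorFeature(
    "Point", [(-71.40, 41.83)], {"name": "Example Diner", "delivery": True}
)
grid = Raster(
    values=[[0.0] * 720 for _ in range(360)],  # global grid of 0.5 degree cells
    west=-180.0, north=90.0, cell_size=0.5,
)
print(grid.cell_center(0, 0))
```

In a vector layer the location is stored feature by feature, while in a raster it is implied by a cell's row and column.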
Examples of Raster formats:
Examples of Vector formats:
There are many places to find data. Some useful links are:
See also Wikipedia links
Let us start by creating some simple maps. For this, create a directory called
mydata in your
$HOME directory and download the following datasets and extract them:
Now, open QGIS. You should see something like this
Now we can use various tools available within QGIS to learn about the data, select the data, analyze the data, etc.
Besides the geometrical information that we observe, most Shape files have other information relating to each feature. This information is contained in the Attribute Table, which can be accessed with
Doing so shows you the additional information contained in the shape file. For the GADM file it includes
We can use this information to color our maps, or to select features for further analysis. I will use the
select feature using an expression button
to select all features with Kenya's ISO-3 code.
Sometimes we will want to work with a subset of features, especially if the data we are working with has many features, like the GADM data. To select a subset of the features and only work with those, use the
Layer->Query option in the
Menu as shown below
This opens a window where you can write an expression like the previous one, to select features by their attributes. These are
SQL expressions and have to conform to
SQL's grammar (we will not go into this at this point). Once I execute the same expression as above
QGIS leaves only the features in Kenya for analysis.
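If you have not seen SQL before, the following sketch uses Python's built-in sqlite3 module to show what such a filter does. I assume the expression is `"ISO" = 'KEN'`, as it appears in the gdalwarp command later in these notes. The table name, the other column names and the rows are made up for illustration, and QGIS's own query engine may differ in details.

```python
import sqlite3

# Hypothetical attribute table; only the ISO field name comes from these notes.
rows = [
    ("KEN", "Kenya", "Nairobi"),
    ("KEN", "Kenya", "Coast"),
    ("TZA", "Tanzania", "Arusha"),
]

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE gadm ("ISO" TEXT, "NAME_0" TEXT, "NAME_1" TEXT)')
con.executemany("INSERT INTO gadm VALUES (?, ?, ?)", rows)

# Double quotes name a field; single quotes delimit a text value.
query_filter = "\"ISO\" = 'KEN'"
selected = con.execute(
    'SELECT "NAME_1" FROM gadm WHERE ' + query_filter + ' ORDER BY "NAME_1"'
).fetchall()
con.close()

print(selected)
```

Only the two rows whose ISO field equals KEN survive the filter, which is what the query does to the layer's features.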
Let us zoom into these features by
right-clicking on the name of the layer and choosing
Zoom to Layer Extent. This shows the extent of Kenya alone.
Other important tools can be found in
Layer -> Properties or by
double-clicking on the layer's name
The Vector menu has many tools that can be applied to vector layers. Let's use some of these to create new layers.
Using Vector -> Geometry Tools -> Singleparts to Multipart we can generate a new layer and shape file in which features are aggregated according to some characteristic. Let us use this tool to aggregate the administrative level 2 features to administrative level 0 (country), so that each feature is a country instead of an administrative level 2 unit.
Use the your_country.shp file you created in the previous exercise to compute the centroid for each administrative level 1 unit in your country, and export the layer to the file
Search for your place of birth among the most populated places. Were you born in a populated place? If not, identify the closest most populated place. To do so, use Google or Wikipedia to find the latitude and longitude of your place of birth. Use your mouse and the
coordinate window at the bottom
to search for your location of birth. Better yet, using the
Plugins->Manage and Install Plugins menu, install the
and use it to find your place of birth as shown below
Compute the distance between your place of birth and both the closest centroid and the closest most populated place. If needed find a Plugin that will do the job for you. I recommend installing among many
Make sure you can correctly identify the features. For example using ISO codes, ID numbers, etc. If the shape file does not have an ID identifier, it is best to create one, so that you can correctly identify the features. To do so, use the
Field Calculator by double-clicking on the name of the layer, then choosing
Fields. After that you need to select the pencil icon to enable editing mode.
your_country_places.shp on top of the Ramankutty data. It might look something like this.
Since we do not want to work with all the Ramankutty data, but only with the data for our country, let us
clip the part of the Ramankutty data that belongs to our country. To do this, use the
In the window that comes up, choose a name for the clipped raster
suit_your_country and choose the
save as GeoTiff option, so that your file is saved in GeoTiff format. Then click on the
Mask layer button, make sure
your_country layer is chosen as the mask. If you want, choose a different
No data value. Then click
This will clip the Ramankutty data to the extent of your country of origin. Notice that in the big text box there is a command written, something like
gdalwarp -q -cutline "gadm2.shp|layerid=0|subset=\"ISO\" = 'KEN'" -crop_to_cutline -of GTiff "suit/w001001.adf" GitHub/CompEcon/notebooks/QGIS/suit-KEN.tif
This is the command QGIS uses to create the clip. QGIS is actually calling GDAL to perform this operation. This command line will be very useful when you are planning to use Python or other scripting languages to perform an operation many times. You can do it by hand once, copy the command executed by QGIS, and use it to create an iterable version...more on this later.
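As a first taste of what an iterable version might look like, here is a minimal Python sketch that builds one gdalwarp command per country. It is an adaptation rather than a copy of the QGIS command: I assume gdalwarp's `-cwhere` option for the attribute filter instead of the `|subset=` syntax in the QGIS log, and the output file names and the list of ISO codes are made up.

```python
import shlex
import subprocess


def clip_command(iso, shapefile="gadm2.shp", raster="suit/w001001.adf"):
    """Build the gdalwarp arguments that clip `raster` to the country `iso`."""
    return [
        "gdalwarp", "-q",
        "-cutline", shapefile,                 # polygons used as the cutting mask
        "-cwhere", "ISO = '{}'".format(iso),   # keep only this country's polygons
        "-crop_to_cutline",
        "-of", "GTiff",
        raster,
        "suit-{}.tif".format(iso),             # hypothetical output name
    ]


def clip_countries(iso_codes, run=False):
    """Print one command per country; execute them only if run=True."""
    commands = [clip_command(iso) for iso in iso_codes]
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
        if run:
            subprocess.check_call(cmd)  # requires gdalwarp on the PATH
    return commands


commands = clip_countries(["KEN", "TZA", "UGA"])
```

With `run=False` the function only prints the commands, so you can compare them with what QGIS shows before executing anything.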
You might have to assign a projection to the Suitability Raster. To do so, use the
Raster->Projection->Assign Projection option in the menu.
The output should look like this
Notice that, given the large size of the cells in the raster, the clipping tool creates a lot of measurement error. It might be better to decrease the size of the cells and then clip, so that the clipped raster follows the country's border more closely. Let's try setting the cell size to $5'$ (5 arc-minutes) instead of $0.5^\circ$. To do so,
right-click on the raster name and select
Save as. Then set the Resolution to $5' = (1/12)^\circ \approx 0.08333^\circ$ for both
Now use the clip tool again. Much better! Obviously, the smaller the cell size, the more closely the clipped raster will match the polygon.
Care has to be taken when changing a raster's cell size, since values have to be interpolated. QGIS seems to have taken away the option to choose the interpolation method. Luckily, GDAL can help out. You can use its tools to change the cell size, change the projection, clip, etc. We will see some of these tools in another lecture.
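As a preview, the sketch below builds a gdalwarp command that changes the cell size and states the resampling method explicitly, using gdalwarp's `-tr` (target resolution) and `-r` (resampling method) options. The output file name and the default method are my assumptions; which method is appropriate depends on what the raster measures.

```python
# A few of the resampling methods gdalwarp accepts for -r.
KNOWN_METHODS = {"near", "bilinear", "cubic", "cubicspline", "lanczos"}


def resample_command(src, dst, cell_size=1.0 / 12.0, method="bilinear"):
    """Build the gdalwarp arguments that resample `src` to square cells.

    cell_size is in the units of the raster's CRS (degrees here), so the
    default of 1/12 corresponds to 5 arc-minutes.
    """
    if method not in KNOWN_METHODS:
        raise ValueError("unknown resampling method: {}".format(method))
    size = repr(cell_size)
    return ["gdalwarp", "-tr", size, size, "-r", method, "-of", "GTiff", src, dst]


# "suit-5min.tif" is a hypothetical output name.
cmd = resample_command("suit/w001001.adf", "suit-5min.tif")
print(" ".join(cmd))
# To execute it: import subprocess; subprocess.check_call(cmd)
```

Using `near` keeps the original cell values and simply splits each cell into smaller ones, while `bilinear` smooths values across neighbouring cells.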
Let us use this raster to compute the average suitability in each administrative region. Before doing so, we need to reproject both the raster and the shape file to a projection that ensures areas are correctly taken into account. One such projection is the cylindrical equal-area projection. Right-click on the raster or shape file and select
save as.... Then in the
CRS option choose
Selected CRS and click on
browse and choose the following CRS (or create it if not present by using the
Settings->Custom CRS menu)
+proj=cea +lon_0=0 +lat_ts=0 +x_0=0 +y_0=0 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
Now save the file, adding the suffix
cyl so that you know it is the cylindrical equal-area projected one. Now, select
Raster->Zonal Statistics->Zonal Statistics.
Select the cylindrical versions of the raster and shape files and
suitcyl as the
Output column prefix. Now, in the attribute table you will find three new columns with the prefix
suitcyl that show the
mean suitability in each feature. If you repeat the analysis with the unprojected (non-cyl) versions of the raster and shape files, you will see that the results vary (sometimes significantly). Whenever you do this type of analysis, it is important to make sure you are using the correct projection for the analysis.
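To see why the projection matters, here is a minimal sketch of the intuition. It is not what QGIS does internally. I treat the Earth as a sphere and use two made-up cells. In an unprojected raster, cells far from the equator cover less ground than cells near it, so a simple average over cells gives them too much weight.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def cell_area_km2(lat_south, lat_north, width_deg):
    """Area of a latitude/longitude cell on a sphere, in square kilometers."""
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * math.radians(width_deg) * band


def zonal_means(cells):
    """cells: list of (value, lat_south, lat_north, width_deg) inside one zone.

    Returns (simple mean over cells, area-weighted mean).
    """
    values = [c[0] for c in cells]
    areas = [cell_area_km2(c[1], c[2], c[3]) for c in cells]
    simple = sum(values) / len(values)
    weighted = sum(v * a for v, a in zip(values, areas)) / sum(areas)
    return simple, weighted


# Two hypothetical 0.5 degree cells: one at the equator, one at 60 degrees north.
cells = [(0.2, 0.0, 0.5, 0.5), (0.8, 60.0, 60.5, 0.5)]
simple, weighted = zonal_means(cells)
print(round(cell_area_km2(0.0, 0.5, 0.5)), round(cell_area_km2(60.0, 60.5, 0.5)))
print(simple, round(weighted, 3))
```

The cell at 60 degrees north covers roughly half the area of the one at the equator, so the area-weighted mean moves toward the equatorial value. In an equal-area projection all cells cover the same area, so a simple mean over cells is already area-weighted.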
In the Raster menu you will find other useful tools to work with rasters. Especially useful is the
Raster Calculator, with which you can do computations on one or more rasters.
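Conceptually, a raster calculator applies a formula cell by cell to rasters that share the same grid. The sketch below illustrates this in plain Python. The no-data value, the function name and the example rasters are my assumptions; in practice you would use QGIS, GDAL or an array library rather than nested lists.

```python
NODATA = -9999.0  # hypothetical no-data value


def raster_calc(func, *rasters, nodata=NODATA):
    """Apply func cell by cell to rasters (lists of rows) of equal shape.

    If any input cell is no-data, the output cell is no-data.
    """
    shapes = {(len(r), len(r[0])) for r in rasters}
    if len(shapes) != 1:
        raise ValueError("all rasters must have the same number of rows and columns")
    result = []
    for rows in zip(*rasters):
        out_row = []
        for cells in zip(*rows):
            out_row.append(nodata if nodata in cells else func(*cells))
        result.append(out_row)
    return result


# Made-up 2 x 2 rasters: suitability and a second layer to multiply it with.
suit = [[0.2, 0.8], [NODATA, 0.5]]
other = [[10.0, 10.0], [10.0, 4.0]]

product = raster_calc(lambda s, o: s * o, suit, other)
high = raster_calc(lambda s: 1 if s > 0.4 else 0, suit)
print(product)
print(high)
```

The second call shows the other common use: turning a continuous raster into an indicator for cells above a threshold.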
Now that you have the average agricultural suitability for each administrative region, let's color code it so we can visually observe the regional differences. For this, double-click on the
your_country_cyl layer, which should open a new window in the
style tab of the
Now, click on
Single Symbol and choose
Graduated. This allows us to use colors to show the different values of a field/variable of interest.
Next let us select the field of interest in the attributes table, namely,
suitmean. To do so, click on
column and choose the variable/field of interest, in this case
You can choose how many gradations to draw, e.g. increasing it from 5 to 10; change the color ramp, or the mode of creating the gradation. Try it out. You should get something like this
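The mode matters because it decides where the class boundaries fall. As a rough sketch of the idea (not QGIS's own implementation), the functions below compute the upper bound of each class for two common modes, equal intervals and quantiles, using made-up suitability values.

```python
import math


def equal_interval_breaks(values, n_classes):
    """Upper bounds of n_classes classes of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    return [lo + step * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper bounds of n_classes classes with (roughly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for i in range(1, n_classes + 1):
        last = max(0, math.ceil(i * n / n_classes) - 1)  # last value in class i
        breaks.append(ordered[last])
    return breaks


# Hypothetical mean suitability values for eight regions.
values = [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.9, 1.0]
equal = equal_interval_breaks(values, 4)
quantile = quantile_breaks(values, 4)
print(equal)
print(quantile)
```

With skewed data like this, equal intervals put most regions into the first two classes, while quantiles spread the regions evenly across colors.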