make category map

This tutorial makes a map from a grid-shaped polygon shapefile that has a column named "com", which contains the category of each cell.

The grid consists of 500 m x 500 m cells that cover the 23 special wards of Tokyo.

This tutorial covers the following functions:

mpoly.prepare_map: prepares the matplotlib figure and ax for drawing a map (equal aspect, geometry context, background...)
mpoly.map_shape: simply maps the shapes of the polygons in the shapefile (just like what you get when you add a layer in QGIS/ArcMap)
mpoly.map_category: maps a column by the category each polygon belongs to
mpoly.map_colour: maps a column that contains colour codes (e.g. hex codes)
mpoly.add_border: adds the polygons with no fill colour, that is, just their borders
mpoly.add_label: adds labels (from a column) to the polygons

Let's start mapping!

First, import everything needed for the following steps.



In [1]:

    
import geopandas as gpd # for reading and manipulating shapefiles
import matplotlib.pyplot as plt # for making figure

import colouringmap.mapping_polygon as mpoly # for making maps

# magic line so matplotlib figures are shown inline in the jupyter cell
%matplotlib inline



In [2]:

    
from palettable.colorbrewer.qualitative import Dark2_7 # to get the colormap for more custom manipulation in the last step

Second, read the file and take a look at the attribute table of the shapefile.



In [3]:

    
grid_res = gpd.read_file('data/community_results.shp')
grid_res.head()









    Out[3]:

   com                                           geometry  node  tweets  usercount        xcor       ycor
0   14  POLYGON ((175239.9457184017 3947195.841823581,...     0       1          1  139.939807  35.640542
1   56  POLYGON ((175239.9457767347 3947695.841815081,...     1       0          0  139.939919  35.645048
2    1  POLYGON ((142239.9457464929 3956695.841823446,...    10      35         21  139.576848  35.731640
3   18  POLYGON ((144239.9457266586 3959695.841818351,...   100      40         32  139.599535  35.758373
4    4  POLYGON ((154239.9457194024 3947195.841822605,...  1000    1898        660  139.707733  35.644166

Then, start working with the shapefile. The following shows how to prepare the figure before drawing the map on it.

The first line in the following cell is the standard way to create a matplotlib figure along with an "ax".
The second line prepares the "ax" for mapping. Setting map_context to the shapefile (here grid_res) makes sure the extent of the figure covers the shapefile, so the whole shapefile is shown within the figure.

The figure is just a matplotlib figure and ax, so you can also set the map context manually with something like this:

ax.set_xlim([minx, maxx])
ax.set_ylim([miny, maxy])
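
If the four bounds are not already known, they can be derived from the polygon coordinates. The following is only a minimal sketch in plain Python: map_bounds and the sample cells are hypothetical and not part of colouringmap. With a GeoDataFrame, the same four numbers should be available from its total_bounds attribute.

```python
def map_bounds(polygons, pad=0.0):
    """Return (minx, miny, maxx, maxy) covering every (x, y) vertex, plus optional padding."""
    xs = [x for ring in polygons for x, _ in ring]
    ys = [y for ring in polygons for _, y in ring]
    return min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad

# two hypothetical 500 m x 500 m cells, stacked vertically
cells = [
    [(0.0, 0.0), (500.0, 0.0), (500.0, 500.0), (0.0, 500.0)],
    [(0.0, 500.0), (500.0, 500.0), (500.0, 1000.0), (0.0, 1000.0)],
]
minx, miny, maxx, maxy = map_bounds(cells, pad=50.0)
print(minx, miny, maxx, maxy)
```

The resulting values can then be passed to ax.set_xlim and ax.set_ylim as shown above.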



In [4]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)

The following shows how to create a map with just the shapes of the polygons, all in the same colour.



In [5]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_shape(grid_res, ax, lw=.1, alpha=.7)

map the categories

So, let's try to map the 'com' column to colours.

Use the mpoly.map_category function: pass in the GeoDataFrame, the column name, and the ax.



In [6]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_category(grid_res, 'com', ax)









    



!!!
number of colour is less then number of category
colours will be repeating
!!!

The resulting map is not so good.

The function warns that there are too many categories.
Normally, there are two ways to cope with this problem:

add more colours to the map, which can be done with the colour_group and colour_name parameters;
reduce the number of categories to something less than 7, which is the magic number for the number of colours on a map.

In this case, the number of categories is far too high, as can be seen from the legend. No colormap can support that many categories.
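
To see why the colours repeat, here is a minimal sketch of what happens when categories outnumber colours. It assumes the palette is simply cycled; whether map_category cycles its palette in exactly this way is an assumption, and assign_colours and the three-colour palette are hypothetical.

```python
from itertools import cycle

def assign_colours(categories, palette):
    """Pair each unique category with a palette colour, cycling when the palette runs out."""
    unique = sorted(set(categories))
    mapping = dict(zip(unique, cycle(palette)))
    repeated = len(unique) > len(palette)  # True when some colour is used more than once
    return mapping, repeated

palette = ['red', 'green', 'blue']  # hypothetical 3-colour palette
mapping, repeated = assign_colours([14, 56, 1, 18, 4], palette)
print(mapping)
print(repeated)
```

With five categories and three colours, two categories end up sharing a colour with another one, so they cannot be told apart on the map.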

So, in the following, I reduce the categories to 6 major ones, plus one category for "other".

The first part of the following cell finds the "major" categories, which are determined by how often each "com" value appears.

The second part creates a new column in the attribute table to store the major categories.



In [7]:

    
## find major categories
coms = grid_res.com.tolist() # get the category column from the attribute table
comset = list(set(coms)) # get the unique category ids
comcount = [ coms.count(c) for c in comset ] # count the appearances of each category
comset2 = [ (c,n) for n,c in sorted(zip(comcount,comset), reverse=True) ] # sort the categories by frequency
major = [c[0] for c in comset2[:6]] # 6 major categories
print(major)

## create new column of categories with major/other
collist = []
for c in coms:
    if c in major:
        collist.append(c) # the category id if it is one of the major categories
    else:
        collist.append(-1) # or -1 for the other categories
print(len(collist)==len(grid_res)) # just a check
grid_res['com2'] = collist # put the new column into the attribute table









    



[0, 11, 18, 3, 1, 14]
True
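
The counting and relabelling above can also be sketched with collections.Counter from the standard library. This is an alternative sketch, not the code used in this tutorial: major_categories and the sample ids are hypothetical, and ties between equally frequent categories may be ordered differently than with the sorted(zip(...)) approach.

```python
from collections import Counter

def major_categories(values, n=6, other=-1):
    """Keep the n most frequent categories and relabel everything else as `other`."""
    major = [cat for cat, _ in Counter(values).most_common(n)]
    keep = set(major)
    relabelled = [v if v in keep else other for v in values]
    return major, relabelled

coms = [0, 0, 0, 11, 11, 56, 4]  # hypothetical category ids
major, com2 = major_categories(coms, n=2)
print(major)
print(com2)
```

Counter counts every value in one pass, whereas calling coms.count(c) for each unique category rescans the whole list every time.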

Then, map the major categories, with some modifications (lw, ec, alpha) to make the polygon edges look better.



In [8]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_category(grid_res, 'com2', ax, lw=.1, ec='k', alpha=.7)

Actually, this can also be done by setting the cat_order parameter to the desired category list, like this. (Experimental)



In [9]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_category(grid_res, 'com2', ax, lw=.1, ec='k', alpha=.7, cat_order=[0,1,3,11,14,18])

customize the categories manually

The result is not perfect yet. The colour of the "other" category cannot be controlled, and its colour makes it look just as important as the other categories.

So, let's manually assign colours to the categories, and set the "other" category to something less eye-catching.



In [10]:

    
colors = Dark2_7.mpl_colors # get this colormap from palettable

collist2 = []
for c in grid_res.com.tolist():
    if c in major:
        collist2.append(colors[major.index(c)]) # if the category is a major one, use the colormap
    else:
        collist2.append('lightgrey') # light grey for the "other" categories
assert len(collist2)==len(grid_res) # just a check
grid_res['colour'] = collist2 # assign the new list with color to the attribute table
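
The same lookup can be sketched with a dictionary, which avoids calling major.index() for every row. This is a minimal sketch that runs without palettable: the hex codes are the Dark2 values as commonly published by ColorBrewer (assumed here, not read from Dark2_7), and colour_for is a hypothetical helper.

```python
# Dark2 hex codes as commonly published by ColorBrewer (assumed, not read from palettable)
dark2 = ['#1b9e77', '#d95f02', '#7570b3', '#e7298a', '#66a61e', '#e6ab02', '#a6761d']

major = [0, 11, 18, 3, 1, 14]        # the major categories printed earlier
colour_of = dict(zip(major, dark2))  # category id -> colour; the 7th colour stays unused

def colour_for(cat, fallback='lightgrey'):
    """Return the palette colour for a major category, or the fallback for everything else."""
    return colour_of.get(cat, fallback)

sample = [14, 56, 1, 18, 4]          # 'com' values of the first five rows
print([colour_for(c) for c in sample])
```

Hex strings and RGB tuples are both valid matplotlib colours, so either form can be stored in the colour column.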



In [11]:

    
grid_res.head() # let's take a look









    Out[11]:

   com                                           geometry  node  tweets  usercount        xcor       ycor  com2                                             colour
0   14  POLYGON ((175239.9457184017 3947195.841823581,...     0       1          1  139.939807  35.640542    14  (0.901960784314, 0.670588235294, 0.0078431372549)
1   56  POLYGON ((175239.9457767347 3947695.841815081,...     1       0          0  139.939919  35.645048    -1                                          lightgrey
2    1  POLYGON ((142239.9457464929 3956695.841823446,...    10      35         21  139.576848  35.731640     1              (0.4, 0.650980392157, 0.117647058824)
3   18  POLYGON ((144239.9457266586 3959695.841818351,...   100      40         32  139.599535  35.758373    18   (0.458823529412, 0.439215686275, 0.701960784314)
4    4  POLYGON ((154239.9457194024 3947195.841822605,...  1000    1898        660  139.707733  35.644166    -1                                          lightgrey

This time, map the polygons directly with the colour column. Note that the function is "map_colour()".



In [12]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_colour(grid_res, 'colour', ax, lw=.1, ec='k', alpha=.7)

The map looks better now.

adding meaningful boundaries

But it would be better to also draw the administrative boundaries on the map, to show which colour belongs to which area. So let's add the boundary shapefile.

The following reads the administrative boundaries of the Tokyo 23 special wards.



In [13]:

    
borders = gpd.read_file('data/tokyo_special_ward.shp')
borders.head()









    Out[13]:

   CC_1  CC_2  ENGTYPE4   ENGTYPE_1     ENGTYPE_2  ENGTYPE_3  ENGTYPE_4  ENGTYPE_5  HASC_1  HASC_2  ...  VALIDFR_4  VALIDTO_1  VALIDTO_2  VALIDTO_3  VALIDTO_4             VARNAME_1  VARNAME_2  VARNAME_3  VARNAME_4                                           geometry
0  None  None      None  Metropolis  Special Ward       None       None       None   JP.TK    None  ...       None    Unknown    Present       None    Unknown  Edo|Yedo|Tokio|T┼uio       None       None       None  (POLYGON ((139.7594604492192 35.61920547485357...
1  None  None      None  Metropolis  Special Ward       None       None       None   JP.TK    None  ...       None    Unknown    Present       None    Unknown  Edo|Yedo|Tokio|T┼uio       None       None       None  (POLYGON ((139.756988525391 35.61753082275391,...
2  None  None      None  Metropolis  Special Ward       None       None       None   JP.TK    None  ...       None    Unknown    Present       None    Unknown  Edo|Yedo|Tokio|T┼uio       None       None       None  POLYGON ((139.6250152587891 35.76376342773449,...
3  None  None      None  Metropolis  Special Ward       None       None       None   JP.TK    None  ...       None    Unknown    Present       None    Unknown  Edo|Yedo|Tokio|T┼uio       None       None       None  POLYGON ((139.6917114257814 35.68527603149425,...
4  None  None      None  Metropolis  Special Ward       None       None       None   JP.TK    None  ...       None    Unknown    Present       None    Unknown  Edo|Yedo|Tokio|T┼uio       None       None       None  (POLYGON ((139.7405700683598 35.5415992736817,...

5 rows × 53 columns

Before we add the boundaries to the map, let's check whether the two shapefiles have the same projection.



In [14]:

    
print(borders.crs==grid_res.crs) # check if the two shapefiles have the same projection
print(borders.crs) # check the two projections
print(grid_res.crs)









    



False
{'init': u'epsg:4326'}
{u'lon_0': 138, u'ellps': u'WGS84', u'y_0': 0, u'no_defs': True, u'proj': u'eqdc', u'x_0': 0, u'units': u'm', u'lat_2': 40, u'lat_1': 34, u'lat_0': 0}

It turns out they are not the same, so let's reproject one of them.



In [15]:

    
borders = borders.to_crs(grid_res.crs) # convert the borders projection to the same as the grid_res
print(borders.crs==grid_res.crs) # now check again if the two shapefiles have the same projection









    



True

So they are now in the same projection.
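
The check-then-reproject step can be wrapped in a small helper. The following is a minimal sketch: match_crs is a hypothetical name, and FakeLayer is only a stand-in so the example runs without any data files. It relies only on the crs attribute and to_crs method used above.

```python
def match_crs(layer, reference):
    """Return `layer` reprojected to `reference.crs`, or unchanged if the two CRS already match."""
    if layer.crs == reference.crs:
        return layer
    return layer.to_crs(reference.crs)

class FakeLayer(object):
    """Stand-in for a GeoDataFrame; it only mimics `crs` and `to_crs`."""
    def __init__(self, crs):
        self.crs = crs
    def to_crs(self, crs):
        return FakeLayer(crs)

grid = FakeLayer({'proj': 'eqdc', 'units': 'm'})
wards = FakeLayer({'init': 'epsg:4326'})
wards = match_crs(wards, grid)
print(wards.crs == grid.crs)
```

With real data, the call would look like borders = match_crs(borders, grid_res).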

Now, let's add the administrative boundaries to the map.
Because the boundaries are drawn in black (ec='k'), the grid edge colour is changed to white (ec='w').



In [16]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_colour(grid_res, 'colour', ax, lw=.1, ec='w', alpha=.7)
ax = mpoly.add_border(borders, ax, ec='k', alpha=.4)

The map is better now.

Sometimes, we may need to label the areas with their names. So, let's try to find the name of each administrative area.



In [17]:

    
print(borders['NAME_2'])









    



0         Minato
1      Shinagawa
2         Nerima
3        Shibuya
4            Ota
5       Setagaya
6        Edogawa
7        Chiyoda
8       Shinjuku
9         Bunkyo
10        Adachi
11         Taito
12        Nakano
13      Suginami
14      Itabashi
15          Kita
16          Chuo
17          Koto
18       Arakawa
19    Katsushika
20        Meguro
21       Toshima
22        Sumida
Name: NAME_2, dtype: object

The names seem to be recorded in the "NAME_2" column.

So, let's try to add them to the previous map.



In [23]:

    
fig,ax = plt.subplots(figsize=(7,7))
ax = mpoly.prepare_map(ax, map_context=grid_res)
ax = mpoly.map_colour(grid_res, 'colour', ax, lw=.1, ec='w', alpha=.7)
ax = mpoly.add_border(borders, ax, ec='k', alpha=.4)
ax = mpoly.add_label(borders, ax, 'NAME_2', font_colour='k', font_size=10)









    



NAME_2
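
A label needs an anchor point inside or near its polygon. One common choice is the polygon centroid; how add_label actually positions its labels is not stated in this tutorial, so the following is only an illustrative sketch in plain Python, and polygon_centroid is a hypothetical helper.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as a list of (x, y) vertices."""
    area2 = 0.0  # twice the signed area (shoelace formula)
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # fails with ZeroDivisionError for a degenerate (zero-area) ring
    return cx / (3.0 * area2), cy / (3.0 * area2)

ward = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]  # hypothetical rectangular area
print(polygon_centroid(ward))
```

For concave areas the centroid can fall outside the polygon, which is one reason labels sometimes look misplaced.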

To be honest, this map can be helpful for observation, but it is not very nice-looking.
So, maybe use the previous map for publication, and this one for discussion. XD

That is all in this tutorial.

end of this tutorial...


