HW3 visualisation


In [1]:
import folium
import pandas as pd
import numpy as np
from map_universities import *

Load the topojson with the Swiss cantons and the CSV containing the grants by canton


In [2]:
swiss_canton = 'ch-cantons.topojson.json'
grant_data = pd.read_csv(r'all_canton_grants.csv')

Cantons that did not receive any grant do not appear in the CSV, so we add them to the dataframe with an amount of 0.


In [3]:
#Create a dataframe mapping canton abbreviations to names
#the function cantons() from map_universities.py returns a dict: abbreviation -> name
list_canton = pd.DataFrame.from_dict(cantons(), orient='index')
#get the cantons that don't appear in the grant_data dataframe
not_in_grant_data = list_canton[~list_canton.index.isin(grant_data.canton)]
#Create a new dataframe containing those cantons
not_in_grant_data = pd.DataFrame(not_in_grant_data.index,  columns=['canton'])
not_in_grant_data['amount'] = 0
#concatenate the two dataframes
grant_data = pd.concat([grant_data, not_in_grant_data], ignore_index= True)
grant_data


Out[3]:
canton amount
0 AG 1.261875e+08
1 BE 1.574573e+09
2 BS 1.392498e+09
3 FR 4.590737e+08
4 GE 1.877102e+09
5 GR 3.653832e+07
6 JU 3.479035e+07
7 LU 5.467329e+07
8 NE 4.018976e+08
9 SG 1.107189e+08
10 SH 1.766910e+05
11 SO 4.277191e+07
12 SZ 9.365510e+05
13 TG 4.018981e+06
14 TI 1.152623e+08
15 VD 2.401656e+09
16 VS 2.964409e+07
17 ZG 4.957150e+05
18 ZH 3.661665e+09
19 AR 0.000000e+00
20 UR 0.000000e+00
21 BL 0.000000e+00
22 GL 0.000000e+00
23 OW 0.000000e+00
24 NW 0.000000e+00
25 AI 0.000000e+00
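
As an aside, the same zero-filling can be sketched with `reindex`. This is a minimal illustration on toy data, not the notebook's method: the `grants` values and the `all_cantons` list are made up here, standing in for the CSV contents and the keys of `cantons()`.

```python
import pandas as pd

# Toy stand-ins (assumptions): the real notebook reads all_canton_grants.csv
# and takes the full canton list from cantons().
grants = pd.DataFrame({'canton': ['GE', 'VD', 'ZH'],
                       'amount': [3.0, 5.0, 7.0]})
all_cantons = ['GE', 'VD', 'ZH', 'UR', 'GL']  # hypothetical subset

# Index by canton, then reindex on the full list: cantons missing from
# the grants table are added with an amount of 0.
filled = (grants.set_index('canton')
                .reindex(all_cantons, fill_value=0)
                .rename_axis('canton')
                .reset_index())
print(filled)
```

Note that the rows then follow the order of `all_cantons` rather than the order of the concatenation above.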

We take the base-10 logarithm of each amount, since the non-zero values span several orders of magnitude (from about 1.8e5 to 3.7e9). Cantons with an amount of 0 are left at 0.


In [4]:
def log_function(row):
    if row.amount == 0: return row
    row.amount = np.log10(row.amount)
    return row


grant_data_log = grant_data.apply(log_function,1)
grant_data_log


Out[4]:
canton amount
0 AG 8.101016
1 BE 9.197163
2 BS 9.143795
3 FR 8.661882
4 GE 9.273488
5 GR 7.562749
6 JU 7.541459
7 LU 7.737775
8 NE 8.604115
9 SG 8.044222
10 SH 5.247214
11 SO 7.631159
12 SZ 5.971531
13 TG 6.604116
14 TI 8.061687
15 VD 9.380511
16 VS 7.471938
17 ZG 5.695232
18 ZH 9.563679
19 AR 0.000000
20 UR 0.000000
21 BL 0.000000
22 GL 0.000000
23 OW 0.000000
24 NW 0.000000
25 AI 0.000000
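
The row-wise `apply` above could also be written as a vectorised expression. The sketch below uses toy values (an assumption, not the real grant data) and relies on log10(1) = 0 to keep zero amounts at 0 without ever evaluating log10(0).

```python
import numpy as np
import pandas as pd

amounts = pd.Series([1.0e8, 1.0e5, 0.0])  # toy values, not the real data

# Replace non-positive amounts with 1 before taking the log; since
# log10(1) == 0, zero-grant entries stay at 0 and log10(0) is never computed.
log_amounts = np.log10(amounts.where(amounts > 0, 1.0))
print(log_amounts)
```

As with `log_function`, an amount of exactly 1 would also map to 0; that does not occur in the data shown here.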

In [5]:
swiss_map = folium.Map(location=[46.5966, 7.9761],zoom_start=7)
swiss_map.choropleth(
                geo_path=swiss_canton, 
                topojson='objects.cantons', 
                data=grant_data_log,
                columns=['canton', 'amount'],
                key_on='id',
                threshold_scale=[0,5,6,7,8,9],
                fill_color='YlOrRd', fill_opacity=0.7, line_opacity=0.6,
                legend_name='Grant money received by canton (log10 of CHF)'
            )

swiss_map.save('swiss_map.html')
swiss_map
#The map is not rendered on GitHub. To see it, download swiss_map.html and open it in your browser.


Out[5]:
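
To see how the thresholds `[0,5,6,7,8,9]` partition the log values, here is a small sketch using `np.digitize`. It only illustrates the arithmetic of the thresholds on four values copied from Out[4]; it makes no claim about how folium assigns colours internally.

```python
import numpy as np

thresholds = [0, 5, 6, 7, 8, 9]  # same values as threshold_scale
# log10 amounts for AR, SH, GR and ZH, copied from Out[4]
log_amounts = np.array([0.0, 5.247214, 7.562749, 9.563679])

# digitize returns i such that thresholds[i-1] <= x < thresholds[i];
# values at or above the last threshold get len(thresholds).
bins = np.digitize(log_amounts, thresholds)
print(bins)
```

Since the smallest non-zero log value is above 5, the interval from 0 to 5 contains only the cantons without any grant.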

Bonus


In [6]:
%run map_universities.py

We want to compare the grants received by the areas separated by the Röstigraben (the boundary between the French- and German-speaking parts of Switzerland). To do so, we need to:

1) Map each canton to its main language. Since the topojson only allows one color per canton, we only consider the main language of each canton. This means:
    * Fribourg -> French
    * Valais -> French
    * Bern -> German
    * Graubünden -> German
2) Group by language and sum the grants
3) Join the list of cantons with the list of grants by language
4) Show the map with the color code mapped to the amount by language

In [7]:
grant_rostigraben = grant_data.copy()

Map each canton to its languages. We apply the function get_language (in map_universities.py), which, given a canton abbreviation, returns a list of languages. We did not consider Romansh, since it is only spoken in a small part of Graubünden.


In [8]:
grant_rostigraben['language'] = grant_rostigraben.canton.apply(get_language)
grant_rostigraben.head()


Out[8]:
canton amount language
0 AG 1.261875e+08 [D]
1 BE 1.574573e+09 [D, FR]
2 BS 1.392498e+09 [D]
3 FR 4.590737e+08 [FR, D]
4 GE 1.877102e+09 [FR]

Some cantons have two languages (Fribourg, Bern and Valais). We need to split the list and create a second attribute, language2. The list is sorted with the main language first, so language holds the main language and language2 the second one.


In [9]:
def split_if_two_language(row):
    l = row['language']
    row.language = l[0]
    if(len(l)>1):
        row['language2'] = l[1]
    return row
    
grant_rostigraben = grant_rostigraben.apply(split_if_two_language, 1)

In [10]:
grant_rostigraben.head()


Out[10]:
amount canton language language2
0 1.261875e+08 AG D NaN
1 1.574573e+09 BE D FR
2 1.392498e+09 BS D NaN
3 4.590737e+08 FR FR D
4 1.877102e+09 GE FR NaN
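
The same split can be sketched without a row-wise `apply`, using the `.str` accessor, which also indexes into lists. The toy dataframe below is an assumption that mirrors the shape of Out[8]; it is not the notebook's data.

```python
import pandas as pd

# Toy rows shaped like Out[8] (illustrative only)
df = pd.DataFrame({'canton': ['AG', 'BE', 'FR'],
                   'language': [['D'], ['D', 'FR'], ['FR', 'D']]})

# .str[i] picks element i of each list; a missing position yields NaN.
# language2 is extracted first, because the next line overwrites the lists.
df['language2'] = df['language'].str[1]
df['language'] = df['language'].str[0]
print(df)
```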

Group by language (FR, D, IT) and sum the amounts. Then rename the amount column to amount_by_language.


In [11]:
grant_language =   grant_rostigraben.groupby(by='language', axis=0, as_index=False).sum()
grant_language.rename(columns={'amount':'amount_by_language'}, inplace=True)

Join the table grouped by language with the one containing the grants by canton.


In [12]:
#inner join on the language column (how='inner' is the default for pd.merge)
grant_with_language = pd.merge(grant_rostigraben, grant_language, on='language')
grant_with_language.head()


Out[12]:
amount canton language language2 amount_by_language
0 1.261875e+08 AG D NaN 7.005254e+09
1 1.574573e+09 BE D FR 7.005254e+09
2 1.392498e+09 BS D NaN 7.005254e+09
3 3.653832e+07 GR D NaN 7.005254e+09
4 5.467329e+07 LU D NaN 7.005254e+09
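
The group-then-merge steps can also be collapsed into a single `groupby(...).transform('sum')`, which broadcasts each group's total back onto its rows. This is a sketch on made-up numbers, offered as an alternative rather than as what the notebook does.

```python
import pandas as pd

# Toy data (assumption): a few cantons with small, made-up amounts
df = pd.DataFrame({'canton': ['AG', 'BE', 'GE', 'VD', 'TI'],
                   'amount': [1.0, 2.0, 3.0, 4.0, 5.0],
                   'language': ['D', 'D', 'FR', 'FR', 'IT']})

# transform('sum') returns a Series aligned with df, holding for each row
# the total amount of that row's language group.
df['amount_by_language'] = df.groupby('language')['amount'].transform('sum')
print(df)
```

Unlike the merge, this keeps the rows in their original order.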

In [13]:
rostigraben_map = folium.Map(location=[46.5966, 7.9761],zoom_start=7)
rostigraben_map.choropleth(
                geo_path=swiss_canton, 
                topojson='objects.cantons', 
                data=grant_with_language,
                columns=['canton', 'amount_by_language'],
                key_on='id',
                #threshold_scale=[7,8,9],
                fill_color='YlOrRd', fill_opacity=0.7, line_opacity=0.6,
                legend_name='Grant money received by language (CHF)'
            )

rostigraben_map.save('rostigraben_map.html')
rostigraben_map
#The map is not rendered on GitHub. To see it, download rostigraben_map.html and open it in your browser.


/home/lukas/anaconda3/lib/python3.5/site-packages/ipykernel/__main__.py:10: FutureWarning: 'threshold_scale' default behavior has changed. Now you get a linear scale between the 'min' and the 'max' of your data. To get former behavior, use folium.utilities.split_six.
Out[13]: