05 - Format Personal Information

Starting from the data scraped from MemberCouncil, we want to build a file containing all the relevant information about the deputies we are dealing with, so that it is easy to access and kept in one place.


In [1]:
import pandas as pd
import numpy as np
import glob
import os
# The DataFrames have many columns,
# so we raise the display limit in order to see more of them
pd.options.display.max_columns = 100

We import the Voting data. It is heavy, but we load it in order to know for which deputies we need to get additional data.


In [2]:
dataset_tmp = []
path = '../datas/scrap/Voting'
allFiles = glob.glob(os.path.join(path, 'Session*.csv'))

for file_ in allFiles:
    data_tmp = pd.read_csv(file_,index_col=0)
    dataset_tmp += [data_tmp] 
voting_df = pd.concat(dataset_tmp)
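
The loading pattern above is used again further down for MemberCouncil, so it could be wrapped in a small helper. The sketch below is only an illustration: `load_csv_folder` is a hypothetical name, and the files it reads are toy data written to a temporary folder, not the real scrape. It also fails early when no file matches, since `pd.concat` raises a `ValueError` on an empty list.

```python
import glob
import os
import tempfile

import pandas as pd


def load_csv_folder(path, pattern):
    """Concatenate every CSV in `path` matching `pattern` (hypothetical helper)."""
    files = sorted(glob.glob(os.path.join(path, pattern)))
    if not files:
        # pd.concat raises a ValueError on an empty list, so fail early
        # with a message that names the folder and the pattern.
        raise FileNotFoundError(f"no file matching {pattern!r} in {path!r}")
    frames = [pd.read_csv(f, index_col=0) for f in files]
    return pd.concat(frames)


# Toy demonstration in a temporary folder (made-up rows, not the real data).
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"FirstName": ["Anna"], "LastName": ["Muster"]}).to_csv(
        os.path.join(tmp, "Session1.csv"))
    pd.DataFrame({"FirstName": ["Beat"], "LastName": ["Beispiel"]}).to_csv(
        os.path.join(tmp, "Session2.csv"))
    toy_voting_df = load_csv_folder(tmp, "Session*.csv")

print(toy_voting_df)
```

Sorting the file list is a design choice: `glob.glob` returns files in arbitrary order, so sorting makes the row order of the result reproducible.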

We create an array called names, which contains every unique name found in the Voting dataframe.


In [3]:
voting_df['Name'] = voting_df['FirstName']+' '+voting_df['LastName']

names = voting_df['Name'].unique()

We now load the MemberCouncil dataframe, from which we will extract the information we need on each deputy.


In [4]:
dataset_tmp = []
path = '../datas/scrap/MemberCouncil'
allFiles = glob.glob(os.path.join(path, 'MemberCouncilid*.csv'))

for file_ in allFiles:
    data_tmp = pd.read_csv(file_,index_col=0)
    dataset_tmp += [data_tmp] 
member_df = pd.concat(dataset_tmp)

member_df['Name'] = member_df['FirstName']+' '+member_df['LastName']

We see below that quite a lot of information is available. We'll consider only a few fields:

  • Active
  • Canton
  • Name
  • PartyName
  • PartyAbbreviation
  • DateOfBirth
  • ...

It will be easy to add other fields when needed.


In [5]:
member_df.head()


Out[5]:
Active AdditionalActivity AdditionalMandate BirthPlace_Canton BirthPlace_City Canton CantonAbbreviation CantonName Citizenship Council CouncilAbbreviation CouncilName DateElection DateJoining DateLeaving DateOath DateOfBirth DateOfDeath DateResignation FirstName GenderAsString ID IdPredecessor Language LastName Mandates MaritalStatus MaritalStatusText MilitaryRank MilitaryRankText Modified NumberOfChildren OfficialName ParlGroupAbbreviation ParlGroupFunction ParlGroupFunctionText ParlGroupName ParlGroupNumber Party PartyAbbreviation PartyName PersonIdCode PersonNumber Name
0 False Réducteur TSV de 1971 à 1983 Prés. Féd. Romande des Sociolistes ch., Prés. ... NaN Pompaples 22.0 VD Vaud Sullens (VD),Lutry (VD) 1 CN Conseil national 1995-12-04T00:00:00 1995-12-04T00:00:00 1999-12-05T00:00:00 1995-12-04T00:00:00 1938-03-02T00:00:00 NaN 1999-12-05T00:00:00 Pierre m 1 NaN FR Aguet Conseiller communal de 1965 à 1982, Conseiller... 2.0 marié(e) 5.0 Fourrier 2015-05-17T21:18:19.387 NaN Aguet NaN NaN NaN NaN NaN 12.0 PSS Parti socialiste suisse 2200.0 1 Pierre Aguet
1 False NaN NaN NaN NaN 1.0 ZH Zurich Kreuzlingen (TG),Fällanden (ZH) 1 CN Conseil national 1979-11-26T00:00:00 1979-11-26T00:00:00 1995-12-03T00:00:00 1979-11-26T00:00:00 1928-02-22T00:00:00 NaN 1995-12-03T00:00:00 Heinz m 2 NaN FR Allenspach NaN NaN NaN NaN NaN 2015-05-17T21:18:19.387 NaN Allenspach NaN NaN NaN NaN NaN 15.0 PLR PLR.Les Libéraux-Radicaux 2002.0 2 Heinz Allenspach
2 False Zentralpräsident Schweiz. Skliverband 1985 bis... NaN NaN Hasle 3.0 LU Lucerne Hasle (LU) 1 CN Conseil national 1995-12-04T00:00:00 1995-12-04T00:00:00 1999-12-05T00:00:00 1995-12-04T00:00:00 1931-01-27T00:00:00 NaN 1999-12-05T00:00:00 Manfred m 6 NaN FR Aregger Grossrat 1967 bis 1983 (Präsident 1977) 2.0 marié(e) 7.0 Adjudant sous-officier 2015-05-17T21:18:19.387 5.0 Aregger NaN NaN NaN NaN NaN 15.0 PLR PLR.Les Libéraux-Radicaux 2004.0 6 Manfred Aregger
3 False NaN NaN NaN NaN 2.0 BE Berne Tavannes (BE) 1 CN Conseil national 1979-11-26T00:00:00 1979-11-26T00:00:00 1995-12-03T00:00:00 1979-11-26T00:00:00 1928-03-04T00:00:00 NaN 1995-12-03T00:00:00 Geneviève f 7 NaN FR Aubry NaN NaN NaN NaN NaN 2015-05-17T21:18:19.387 NaN Aubry Geneviève NaN NaN NaN NaN NaN NaN NaN NaN 2005.0 7 Geneviève Aubry
4 False NaN NaN NaN NaN 2.0 BE Berne Siselen (BE),Richterswil (ZH) 1 CN Conseil national 1987-11-30T00:00:00 1987-11-30T00:00:00 1995-12-03T00:00:00 1987-11-30T00:00:00 1947-12-01T00:00:00 NaN 1995-12-03T00:00:00 Rosmarie f 8 NaN FR Bär NaN NaN NaN NaN NaN 2015-05-17T21:18:19.387 NaN Bär NaN NaN NaN NaN NaN NaN NaN NaN 2008.0 8 Rosmarie Bär

In [6]:
member_df.columns


Out[6]:
Index(['Active', 'AdditionalActivity', 'AdditionalMandate',
       'BirthPlace_Canton', 'BirthPlace_City', 'Canton', 'CantonAbbreviation',
       'CantonName', 'Citizenship', 'Council', 'CouncilAbbreviation',
       'CouncilName', 'DateElection', 'DateJoining', 'DateLeaving', 'DateOath',
       'DateOfBirth', 'DateOfDeath', 'DateResignation', 'FirstName',
       'GenderAsString', 'ID', 'IdPredecessor', 'Language', 'LastName',
       'Mandates', 'MaritalStatus', 'MaritalStatusText', 'MilitaryRank',
       'MilitaryRankText', 'Modified', 'NumberOfChildren', 'OfficialName',
       'ParlGroupAbbreviation', 'ParlGroupFunction', 'ParlGroupFunctionText',
       'ParlGroupName', 'ParlGroupNumber', 'Party', 'PartyAbbreviation',
       'PartyName', 'PersonIdCode', 'PersonNumber', 'Name'],
      dtype='object')

In [7]:
# Keep only the deputies who appear in the Voting data
member_df = member_df.loc[member_df['Name'].isin(names)]
# Drop the columns we do not need (the positional axis argument was removed in pandas 2.0)
member_df_final = member_df.drop(columns=['BirthPlace_Canton','Canton','Council','FirstName','LastName','IdPredecessor','Language',
                                          'MaritalStatus','MilitaryRank','Modified','OfficialName','ParlGroupFunction',
                                          'ParlGroupNumber', 'Party'])
# Keep only the date part of DateOfBirth
member_df_final['DateOfBirth'] = pd.to_datetime(member_df_final['DateOfBirth']).dt.date
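
The filter above matches deputies on the concatenated `Name` string, so a deputy who votes but whose name is spelled differently in MemberCouncil is silently dropped. A minimal sketch of a sanity check, on made-up toy frames (the names and the `toy_` variables are assumptions, not part of the real data):

```python
import pandas as pd

# Toy stand-ins for voting_df and member_df (made-up names).
toy_voting = pd.DataFrame({"FirstName": ["Anna", "Beat", "Anna"],
                           "LastName": ["Muster", "Beispiel", "Muster"]})
toy_members = pd.DataFrame({"FirstName": ["Anna", "Carla"],
                            "LastName": ["Muster", "Exemple"]})
for df in (toy_voting, toy_members):
    df["Name"] = df["FirstName"] + " " + df["LastName"]

toy_names = toy_voting["Name"].unique()

# Same filter as in the notebook: keep members who appear in the votes.
matched = toy_members.loc[toy_members["Name"].isin(toy_names)]

# Names that vote but have no row in the member data.
unmatched = sorted(set(toy_names) - set(toy_members["Name"]))
print(unmatched)  # ['Beat Beispiel']
```

An empty `unmatched` list on the real data would mean that every voting deputy has at least one MemberCouncil row.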

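Now we match the values to the ones we have in our database, in order to avoid clashes in the data as much as possible. In the cell below, this means casting Active to a string and replacing missing values with 'Not specified'; the party names themselves are left unchanged.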


In [8]:
# Plain column assignment replaces the boolean column instead of writing strings into it
member_df_final['Active'] = member_df_final['Active'].astype(str)
member_df_final = member_df_final.fillna('Not specified')
member_df_final.head()


Out[8]:
Active AdditionalActivity AdditionalMandate BirthPlace_City CantonAbbreviation CantonName Citizenship CouncilAbbreviation CouncilName DateElection DateJoining DateLeaving DateOath DateOfBirth DateOfDeath DateResignation GenderAsString ID Mandates MaritalStatusText MilitaryRankText NumberOfChildren ParlGroupAbbreviation ParlGroupFunctionText ParlGroupName PartyAbbreviation PartyName PersonIdCode PersonNumber Name
11 False Präsident der Parlamentarischen Gruppe Luft- u... Not specified Illnau-Effretikon ZH Zurich Illnau-Effretikon (ZH) CN Conseil national 2011-10-23T00:00:00 2011-12-05T00:00:00 2015-11-29T00:00:00 2011-12-05T00:00:00 1947-11-26 Not specified 2015-11-29T00:00:00 m 15 Legislative der Gemeinde (Grosser Gemeinderat)... marié(e) -- 3 Not specified Not specified Not specified UDC Union Démocratique du Centre 2270 15 Max Binder
17 False Not specified Präsident der SVP Kanton Zürich: von Februar 1... Laufen am Rheinfall ZH Zurich Schattenhalb (BE),Zurich (ZH),Meilen (ZH),Lü (GR) CN Conseil national 2011-10-23T00:00:00 2011-12-05T00:00:00 2014-05-31T00:00:00 2011-12-05T00:00:00 1940-10-11 Not specified 2014-05-31T00:00:00 m 21 Gemeinderat Meilen: von März 1974 bis März 197... marié(e) Colonel 4 Not specified Not specified Not specified UDC Union Démocratique du Centre 2017 21 Christoph Blocher
22 False Not specified Not specified Soleure SO Soleure Le Petit Lucelle (SO) CN Conseil national 2011-10-23T00:00:00 2011-12-05T00:00:00 2015-11-29T00:00:00 2011-12-05T00:00:00 1951-01-27 Not specified 2015-11-29T00:00:00 m 26 Legislative des Kantons: von 1989 bis 1991 Not specified Major Not specified Not specified Not specified Not specified UDC Union Démocratique du Centre 2272 26 Roland F. Borer
24 False Co-Präsident der Parlamentarischen Gruppe für ... Not specified Affoltern am Albis ZH Zurich Affoltern am Albis (ZH) CN Conseil national 2011-10-23T00:00:00 2011-12-05T00:00:00 2015-11-29T00:00:00 2011-12-05T00:00:00 1947-02-16 Not specified 2015-11-29T00:00:00 m 28 Exekutive der Gemeinde (Gemeinderat): von 1982... marié(e) Not specified 4 Not specified Not specified Not specified UDC Union Démocratique du Centre 2274 28 Toni Bortoluzzi
68 True Vize-Präsident der Parlamentarischen Gruppe fü... Not specified Bâle BS Bâle-Ville Bâle (BS) CN Conseil national 2015-10-18T00:00:00 2015-11-30T00:00:00 Not specified 2015-11-30T00:00:00 1951-01-15 Not specified Not specified m 74 Weiterer Bürgerrat: von 1989 bis 1991, Grosser... marié(e) Capitaine 3 RL Membre Groupe libéral-radical PLD Parti libéral démocrate 2287 74 Christoph Eymann

The list below shows every party that appears among the deputies we consider in a vote. It is more extensive than what we have in the voting fields.


In [9]:
member_df_final.PartyName.unique()


Out[9]:
array(['Union Démocratique du Centre', 'Parti libéral démocrate',
       'Parti socialiste suisse', 'PLR.Les Libéraux-Radicaux', 'La Gauche',
       'Parti bourgeois-démocratique suisse',
       'Parti démocrate-chrétien suisse', 'Parti écologiste suisse',
       'Parti évangélique suisse', 'Lega dei Ticinesi',
       "Parti vert'libéral", 'Alliance verte',
       'Alternative Canton de Zoug', 'Not specified', 'sans parti',
       'Parti chrétien-social', 'Union Démocratique Féderale',
       'Christlich-soziale Partei Obwalden', 'Mouvement Citoyens Genevois',
       'Grüne (Basels starke Alternative)', 'Parti Suisse du travail'], dtype=object)

We export the information of each deputy to its own JSON file. A little trick is needed to turn each row into the dict we want, which is then formatted properly for export to a .json file.


In [10]:
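import json

directory = '../datas/analysis/deputee_names/'
os.makedirs(directory, exist_ok=True)

for deputee in names:
    records = member_df_final.loc[member_df_final.Name == deputee].to_dict(orient='records')
    if not records:
        # This name appears in the votes but not in MemberCouncil: nothing to export
        continue
    # If several rows share the same name, keep the first one
    deputee_info = records[0]
    # json cannot serialise date objects; a missing date is already the string 'Not specified'
    if hasattr(deputee_info['DateOfBirth'], 'isoformat'):
        deputee_info['DateOfBirth'] = deputee_info['DateOfBirth'].isoformat()
    with open(os.path.join(directory, deputee + '_info.json'), 'w') as f:
        json.dump(deputee_info, f)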
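
To check that an exported file can be read back, here is a minimal round-trip sketch. It writes a made-up record to a temporary folder rather than to the real output directory; the record content is an assumption, and only the `_info.json` suffix and the ISO date format are taken from the cell above.

```python
import datetime
import json
import os
import tempfile

# Made-up record shaped like one exported row (not real data).
record = {
    "Name": "Anna Muster",
    "DateOfBirth": datetime.date(1970, 1, 2).isoformat(),
    "PartyAbbreviation": "Not specified",
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, record["Name"] + "_info.json")
    with open(path, "w") as f:
        json.dump(record, f)
    with open(path) as f:
        loaded = json.load(f)

# The date is stored as an ISO string, so it has to be parsed again on load.
birth_date = datetime.date.fromisoformat(loaded["DateOfBirth"])
print(loaded["Name"], birth_date)
```

Any code consuming the real files would need the same parsing step, and would have to handle the 'Not specified' placeholder before calling `fromisoformat`.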