Split Vote and Voting

Splits the Voting DataFrame by each single vote that was taken in .csv files. Each file will have two different IDs together: the one from the law that is voted, and the other from the iteration of the law.

1. Loading the data we will need: vote and voting fields



In [14]:

    
import pandas as pd
import glob
import os
import numpy as np
import matplotlib.pyplot as plt
from helpers import load_vote_df, load_voting,format_voting_session, split_df_dict
%matplotlib inline
%load_ext autoreload
%autoreload 2

# There's a lot of columns in the DF. 
# Therefore, we add this option so that we can see more columns
pd.options.display.max_columns = 100









    



The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload



In [2]:

    
voting_df = format_voting_session(load_voting())









    



Entries in the DataFrame (1713854, 26)

2. Map name of the law to an ID

Allows to load a light file, which will serve as link between the initial visualisation and the law specific.

Aside: Removing the not relevant laws from the Vote file

We see here a disparity, as in our Voting file, the smallest IdVote is 6392, which removes a third of the laws we actually have. This is due to the scraping not retrieving everything, but this is not a bad thing, as it removes the oldest entries, which are not the most relevant ones.



In [3]:

    
voting_df.IdVote.sort_values().unique()









    Out[3]:





array([ 6392,  6393,  6394, ..., 17979, 17981, 17983])



In [15]:

    
vote_df = load_vote_df()
vote_df.head()









    



Entries in the DataFrame (11182, 16)






    Out[15]:






  
    
      
      VoteEnd
      BillTitle
      BusinessTitle
      Subject
      MeaningNo
      MeaningYes
      BusinessShortNumber
      ID
      IdLegislativePeriod
      IdSession
    
  
  
    
      0
      2003-12-01
      Loi fédérale sur l'octroi d'aides financières ...
      Fondation Bibliomedia. Aide financière 2004-2007
      Gesamtabstimmung
      NaN
      NaN
      3.054
      1
      47
      4701
    
    
      1
      2003-12-01
      Arrêté fédéral concernant le plafond de dépens...
      Fondation Bibliomedia. Aide financière 2004-2007
      Gesamtabstimmung
      NaN
      NaN
      3.054
      2
      47
      4701
    
    
      2
      2003-12-01
      Loi fédérale sur l'octroi d'aides financières ...
      Musée suisse des transports. Aide financière 2...
      Vote sur l'ensemble
      NaN
      NaN
      3.055
      4
      47
      4701
    
    
      3
      2003-12-01
      Arrêté fédéral concernant le plafond de dépens...
      Musée suisse des transports. Aide financière 2...
      Vote sur l'ensemble
      NaN
      NaN
      3.055
      5
      47
      4701
    
    
      4
      2003-12-01
      Arrêté fédéral relatif au mandat de prestation...
      EPF. Mandat de prestations pour les années 200...
      Vote sur l'ensemble
      NaN
      NaN
      3.045
      6
      47
      4701

Indeed, we see here that here, the ID, which corresponds to the IdVote on the Voting file, starts from 1, but the subject is about the oldest votes. We will now generate an "epurated" version of the Vote dataframe, with only the votes which have a counterpart in the Voting file.

Moreover, we know that there are some empty BillTitle entries, so we want to put the BusinessTitle as BillTitle to be able to handle the subject generally.



In [16]:

    
vote_df = vote_df.loc[vote_df.ID>=6392]

# Setting the entries with a null BillTitle to have BusinessTitle as their entry
vote_df.loc[vote_df.BillTitle.isnull(),'BillTitle'] =  vote_df.loc[vote_df.BillTitle.isnull(),'BusinessTitle']
    
directory = '../../datas/treated_data/Vote/'
if not os.path.exists(directory):
    os.makedirs(directory)
    
vote_df.to_csv(directory+'legiid_47-50.csv')

End of the aside



In [17]:

    
vote_df = format_voting_session(vote_df)
#Filling the NaN with some text so the javascript does not crash later on.
vote_df = vote_df.fillna('Not specified')
vote_df.head()









    Out[17]:






  
    
      
      VoteEnd
      BillTitle
      BusinessTitle
      Subject
      MeaningNo
      MeaningYes
      BusinessShortNumber
      ID
      IdLegislativePeriod
      IdSession
      SessionName
      Date
    
  
  
    
      2609
      2009-09-07
      Arrêté fédéral concernant la contribution de l...
      Réduction des disparités économiques et social...
      Entrer en matière
      Proposition de la minorité Stamm (ne pas entre...
      Proposition de la majorité (entrer en matière)
      9.055
      6392
      48
      4811
      Session d'automne 2009
      2009-09-07
    
    
      2610
      2009-09-07
      Arrêté fédéral concernant la contribution de l...
      Réduction des disparités économiques et social...
      Art. 1 (frein aux dépenses)
      Not specified
      Not specified
      9.055
      6393
      48
      4811
      Session d'automne 2009
      2009-09-07
    
    
      2611
      2009-09-07
      Arrêté fédéral concernant la contribution de l...
      Réduction des disparités économiques et social...
      Art. 2bis
      Proposition de la minorité Stamm
      Proposition de la majorité
      9.055
      6394
      48
      4811
      Session d'automne 2009
      2009-09-07
    
    
      2612
      2009-09-07
      Arrêté fédéral concernant la contribution de l...
      Réduction des disparités économiques et social...
      Art. 2ter
      Proposition de la minorité Schlüer
      Proposition de la majorité
      9.055
      6395
      48
      4811
      Session d'automne 2009
      2009-09-07
    
    
      2613
      2009-09-07
      Arrêté fédéral concernant la contribution de l...
      Réduction des disparités économiques et social...
      Art. 2quater
      Proposition de la minorité Fehr Hans-Jürg
      Proposition de la majorité
      9.055
      6396
      48
      4811
      Session d'automne 2009
      2009-09-07

First of all, we associate a unique ID to each BillTitle. To do so, we take as ID the one that a given Bill has at the last time it appears in our DataFrame.



In [18]:

    
def map_BillTitle_Vote(vote_df):
    df_link = vote_df[['BillTitle','ID']]
    df_link.columns=['BillTitle','ID_Bill']
    df_link = df_link.drop_duplicates(['BillTitle'], keep = 'last')
    df_link.set_index('BillTitle',inplace=True)
    vote_df = vote_df.join(df_link, on='BillTitle')
    
    return df_link, vote_df

df_link,vote_df = map_BillTitle_Vote(vote_df)

directory = '../../datas/analysis/'
if not os.path.exists(directory):
    os.makedirs(directory)
    
df_link.to_csv(directory+'map_bill_ID.csv')

We must first have a table in which we can link the BillTitle to the corresponding file, which will contain all the informations regarding a certain BillTitle.



In [19]:

    
bills_dict = split_df_dict(vote_df, 'ID_Bill')









    



Number of unique entries in ID_Bill : 3415



In [20]:

    
directory = '../../datas/analysis/bill_link/'
if not os.path.exists(directory):
    os.makedirs(directory)

for ID_Bill, df in bills_dict.items():
    df.to_csv(directory+'bill_'+str(ID_Bill)+'.csv')

3. Map the ID of a given iteration of a law to the votes on the subject

The last task that we have to do is linking the ID of a given iteration of the law (in the file in the bill_link folder) to the votes that happened on the subject. We simply need to group each voting by its IdVote and export it to a single file.



In [21]:

    
voting_df.head()









    Out[21]:






  
    
      
      Name
      BillTitle
      Decision
      BusinessShortNumber
      ParlGroupName
      ParlGroupCode
      IdVote
      IdSession
      VoteEnd
      SessionName
      Date
    
  
  
    
      0
      Adèle Thorens Goumaz
      Loi fédérale concernant la redevance pour l’ut...
      2
      12.018
      Groupe écologiste
      G
      12188
      4907
      2013-03-04
      Session de printemps 2013
      2013-03-04
    
    
      1
      Ada Marra
      Loi fédérale concernant la redevance pour l’ut...
      2
      12.018
      Groupe socialiste
      S
      12188
      4907
      2013-03-04
      Session de printemps 2013
      2013-03-04
    
    
      2
      Tiana Angelina Moser
      Loi fédérale concernant la redevance pour l’ut...
      2
      12.018
      Groupe vert'libéral
      GL
      12188
      4907
      2013-03-04
      Session de printemps 2013
      2013-03-04
    
    
      3
      Natalie Rickli
      Loi fédérale concernant la redevance pour l’ut...
      1
      12.018
      Groupe des Paysans, Artisans et Bourgeois
      V
      12188
      4907
      2013-03-04
      Session de printemps 2013
      2013-03-04
    
    
      4
      Barbara Schmid-Federer
      Loi fédérale concernant la redevance pour l’ut...
      3
      12.018
      Groupe conservateur-catholique
      CE
      12188
      4907
      2013-03-04
      Session de printemps 2013
      2013-03-04

Splitting the voting DataFrame by the Idvote field in order to get many subdataframes.



In [22]:

    
voting_dict = split_df_dict(voting_df, 'IdVote')









    



Number of unique entries in IdVote : 8573

Before exporting it, we remove a lot of redundant columns. Indeed, as we will access this file through the .csv in the bill_link folder, we will already have access to all the information about the vote itself. We just need to have what each member voted.



In [24]:

    
directory = '../../datas/analysis/bill_voting/'
if not os.path.exists(directory):
    os.makedirs(directory)

for IdVote, df in voting_dict.items():
    df = df.drop(['BillTitle','BusinessShortNumber', 'IdSession', 'VoteEnd', 'SessionName', 'Date'],axis=1)
    df.to_csv(directory+'voting_'+str(IdVote)+'.csv')

	VoteEnd	BillTitle	BusinessTitle	Subject	MeaningNo	MeaningYes	BusinessShortNumber	ID	IdLegislativePeriod	IdSession
0	2003-12-01	Loi fédérale sur l'octroi d'aides financières ...	Fondation Bibliomedia. Aide financière 2004-2007	Gesamtabstimmung	NaN	NaN	3.054	1	47	4701
1	2003-12-01	Arrêté fédéral concernant le plafond de dépens...	Fondation Bibliomedia. Aide financière 2004-2007	Gesamtabstimmung	NaN	NaN	3.054	2	47	4701
2	2003-12-01	Loi fédérale sur l'octroi d'aides financières ...	Musée suisse des transports. Aide financière 2...	Vote sur l'ensemble	NaN	NaN	3.055	4	47	4701
3	2003-12-01	Arrêté fédéral concernant le plafond de dépens...	Musée suisse des transports. Aide financière 2...	Vote sur l'ensemble	NaN	NaN	3.055	5	47	4701
4	2003-12-01	Arrêté fédéral relatif au mandat de prestation...	EPF. Mandat de prestations pour les années 200...	Vote sur l'ensemble	NaN	NaN	3.045	6	47	4701

	VoteEnd	BillTitle	BusinessTitle	Subject	MeaningNo	MeaningYes	BusinessShortNumber	ID	IdLegislativePeriod	IdSession	SessionName	Date
2609	2009-09-07	Arrêté fédéral concernant la contribution de l...	Réduction des disparités économiques et social...	Entrer en matière	Proposition de la minorité Stamm (ne pas entre...	Proposition de la majorité (entrer en matière)	9.055	6392	48	4811	Session d'automne 2009	2009-09-07
2610	2009-09-07	Arrêté fédéral concernant la contribution de l...	Réduction des disparités économiques et social...	Art. 1 (frein aux dépenses)	Not specified	Not specified	9.055	6393	48	4811	Session d'automne 2009	2009-09-07
2611	2009-09-07	Arrêté fédéral concernant la contribution de l...	Réduction des disparités économiques et social...	Art. 2bis	Proposition de la minorité Stamm	Proposition de la majorité	9.055	6394	48	4811	Session d'automne 2009	2009-09-07
2612	2009-09-07	Arrêté fédéral concernant la contribution de l...	Réduction des disparités économiques et social...	Art. 2ter	Proposition de la minorité Schlüer	Proposition de la majorité	9.055	6395	48	4811	Session d'automne 2009	2009-09-07
2613	2009-09-07	Arrêté fédéral concernant la contribution de l...	Réduction des disparités économiques et social...	Art. 2quater	Proposition de la minorité Fehr Hans-Jürg	Proposition de la majorité	9.055	6396	48	4811	Session d'automne 2009	2009-09-07

	Name	BillTitle	Decision	BusinessShortNumber	ParlGroupName	ParlGroupCode	IdVote	IdSession	VoteEnd	SessionName	Date
0	Adèle Thorens Goumaz	Loi fédérale concernant la redevance pour l’ut...	2	12.018	Groupe écologiste	G	12188	4907	2013-03-04	Session de printemps 2013	2013-03-04
1	Ada Marra	Loi fédérale concernant la redevance pour l’ut...	2	12.018	Groupe socialiste	S	12188	4907	2013-03-04	Session de printemps 2013	2013-03-04
2	Tiana Angelina Moser	Loi fédérale concernant la redevance pour l’ut...	2	12.018	Groupe vert'libéral	GL	12188	4907	2013-03-04	Session de printemps 2013	2013-03-04
3	Natalie Rickli	Loi fédérale concernant la redevance pour l’ut...	1	12.018	Groupe des Paysans, Artisans et Bourgeois	V	12188	4907	2013-03-04	Session de printemps 2013	2013-03-04
4	Barbara Schmid-Federer	Loi fédérale concernant la redevance pour l’ut...	3	12.018	Groupe conservateur-catholique	CE	12188	4907	2013-03-04	Session de printemps 2013	2013-03-04