IDH (Human Development Index)

Sources

Municipalities: http://www.pnud.org.br/atlas/ranking/Ranking-IDHM-Municipios-2010.aspx

States: http://pt.wikipedia.org/wiki/Lista_de_unidades_federativas_do_Brasil_por_IDH



In [34]:

    
%matplotlib inline
import pandas as pd
import requests as req
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import ttest_ind, ttest_rel

np.set_printoptions(precision=3)

Loading IDH-M data from Wikipedia



In [35]:

    
url = 'http://pt.wikipedia.org/wiki/Lista_de_unidades_federativas_do_Brasil_por_IDH'



In [36]:

    
html_text = req.get(url).text



In [37]:

    
table = pd.read_html(html_text, attrs={"class":"wikitable"})[0]



In [38]:

    
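def idh_format(value):
    # The scraped cells hold the index in thousandths (e.g. 824); rescale to the 0-1 range.
    # The parameter is named `value` to avoid shadowing the built-in `str`.
    return float(value) / 1000.0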
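
As a quick illustration of the conversion, here is a minimal sketch. It assumes, as the division by 1000 implies, that the scraped cells arrive as whole numbers of thousandths; the sample inputs are illustrative and were not read from the page.

```python
def idh_format(value):
    """Rescale an index given in thousandths (e.g. 824) to the 0-1 range."""
    return float(value) / 1000.0

# Illustrative inputs (assumed shape of the scraped cells);
# float() accepts both strings and integers.
samples = ["824", 783, "631"]
converted = [idh_format(s) for s in samples]
print(converted)
```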

Preprocessing the IDH-M data



In [39]:

    
"""
  IDH-M bands:
  0.800 - 1     (very high) - idh_level = 0
  0.700 - 0.799 (high)      - idh_level = 0
  0.600 - 0.699 (medium)    - idh_level = 1
  0.500 - 0.599 (low)       - idh_level = 2
  0 - 0.499     (very low)  - idh_level = 3
"""
def idh_level(x):
    if x >= 0.7:
        return 0
    elif 0.6 <= x < 0.7:
        return 1
    elif 0.5 <= x < 0.6:
        return 2
    elif 0.0 <= x < 0.5:
        # The whole "very low" band maps to level 3, as documented above.
        return 3
    else:
        raise ValueError("Invalid IDH-M value: %r" % x)
    
"""
  Below the 2000 median       = "BAIXO" (low)
  At or above the 2000 median = "ALTO" (high)
"""
def idh_level2(x):
    if x >= table[4][2:].apply(idh_format).median():
        return "ALTO"
    else:
        return "BAIXO"
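
The median split can be sketched on its own, computing the threshold once rather than on every call. This is only an illustration: the five I2000 values are a subset of the state table shown further down, and `median_level` is a hypothetical helper name, not part of the notebook.

```python
from statistics import median

# Five I2000 values from the state table (an illustrative subset).
i2000 = {
    "Distrito Federal": 0.725,
    "Roraima": 0.598,
    "Rio Grande do Norte": 0.552,
    "Ceará": 0.541,
    "Alagoas": 0.471,
}

# Compute the threshold once instead of inside the classifier.
threshold = median(i2000.values())

def median_level(x, threshold=threshold):
    """Return "ALTO" at or above the threshold, "BAIXO" below it."""
    return "ALTO" if x >= threshold else "BAIXO"

levels = {state: median_level(v) for state, v in i2000.items()}
for state, level in levels.items():
    print(state, level)
```

Note that a state sitting exactly on the median is labelled "ALTO", which is why Rio Grande do Norte (0.552) lands in the upper group in the notebook's table.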



In [40]:

    
idhm_df = pd.DataFrame({
    u'Estado': table[2][2:].tolist(),
    u'I2010': table[3][2:].apply(idh_format).tolist(),
    u'I2000': table[4][2:].apply(idh_format).tolist(),
})
idhm_df["Ratio"] = idhm_df["I2010"]/idhm_df["I2000"]
idhm_df["idh_level_2000"] = idhm_df["I2000"].apply(idh_level2)
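
To make the `Ratio` column concrete, here is a small worked example using the Distrito Federal figures from the table in this notebook. A ratio above 1 means the index grew between 2000 and 2010; the variable names are illustrative.

```python
# Distrito Federal values from the state table.
i2000_df = 0.725
i2010_df = 0.824

ratio = i2010_df / i2000_df
growth_pct = (ratio - 1.0) * 100.0

print("Ratio = %.6f (growth of about %.1f%%)" % (ratio, growth_pct))
```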



In [41]:

    
idhm_df









    Out[41]:

    Estado               I2000  I2010  Ratio     idh_level_2000
0   Distrito Federal     0.725  0.824  1.136552  ALTO
1   São Paulo            0.702  0.783  1.115385  ALTO
2   Santa Catarina       0.674  0.774  1.148368  ALTO
3   Rio de Janeiro       0.664  0.761  1.146084  ALTO
4   Paraná               0.650  0.749  1.152308  ALTO
5   Rio Grande do Sul    0.664  0.746  1.123494  ALTO
6   Espírito Santo       0.640  0.740  1.156250  ALTO
7   Goiás                0.615  0.735  1.195122  ALTO
8   Minas Gerais         0.624  0.731  1.171474  ALTO
9   Mato Grosso do Sul   0.613  0.729  1.189233  ALTO
10  Mato Grosso          0.601  0.725  1.206323  ALTO
11  Amapá                0.577  0.708  1.227036  ALTO
12  Roraima              0.598  0.707  1.182274  ALTO
13  Tocantins            0.525  0.699  1.331429  BAIXO
14  Rondônia             0.537  0.690  1.284916  BAIXO
15  Rio Grande do Norte  0.552  0.684  1.239130  ALTO
16  Ceará                0.541  0.682  1.260628  BAIXO
17  Amazonas             0.515  0.674  1.308738  BAIXO
18  Pernambuco           0.544  0.673  1.237132  BAIXO
19  Sergipe              0.518  0.665  1.283784  BAIXO
20  Acre                 0.517  0.663  1.282398  BAIXO
21  Bahia                0.512  0.660  1.289062  BAIXO
22  Paraíba              0.506  0.658  1.300395  BAIXO
23  Piauí                0.484  0.646  1.334711  BAIXO
24  Pará                 0.518  0.646  1.247104  BAIXO
25  Maranhão             0.476  0.639  1.342437  BAIXO
26  Alagoas              0.471  0.631  1.339703  BAIXO

Descriptive analysis



In [42]:

    
idhm_df.describe()



In [43]:

    
idhm_df[["I2000","I2010","Ratio"]].hist(bins=15)

[Figure: histograms of I2000, I2010, and Ratio, 15 bins each]

Hypothesis testing

Is the mean difference between the 2000 and 2010 IDH values statistically significant?



In [44]:

    
ttest_rel(idhm_df['I2000'], idhm_df['I2010'])









    Out[44]:





(-24.939064182558965, 1.1042795294978112e-19)
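
For readers who want to see what the paired test computes, here is a minimal standard-library sketch. It assumes the usual paired t statistic, the mean of the per-state differences divided by its standard error, and uses the I2000 and I2010 values copied from the table in this notebook. It should agree closely with the statistic reported in Out[44].

```python
import math
from statistics import mean, stdev

# I2000 and I2010 per state, in the same row order as the state table.
i2000 = [0.725, 0.702, 0.674, 0.664, 0.650, 0.664, 0.640, 0.615, 0.624,
         0.613, 0.601, 0.577, 0.598, 0.525, 0.537, 0.552, 0.541, 0.515,
         0.544, 0.518, 0.517, 0.512, 0.506, 0.484, 0.518, 0.476, 0.471]
i2010 = [0.824, 0.783, 0.774, 0.761, 0.749, 0.746, 0.740, 0.735, 0.731,
         0.729, 0.725, 0.708, 0.707, 0.699, 0.690, 0.684, 0.682, 0.674,
         0.673, 0.665, 0.663, 0.660, 0.658, 0.646, 0.646, 0.639, 0.631]

# Paired differences, in the same order as ttest_rel(I2000, I2010).
diffs = [a - b for a, b in zip(i2000, i2010)]
n = len(diffs)

# t = mean(d) / (sample std(d) / sqrt(n)), with n - 1 degrees of freedom.
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
print("n = %d, t = %.3f, df = %d" % (n, t_stat, n - 1))
```

The sign is negative only because the 2000 values are passed first; every state improved over the decade.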



In [45]:

    
import scipy  
import scikits.bootstrap as bootstrap
  
# compute 95% confidence intervals around the mean  
CIs00 = bootstrap.ci(data=idhm_df["I2000"])  
CIs10 = bootstrap.ci(data=idhm_df["I2010"])
CIsR  = bootstrap.ci(data=idhm_df["Ratio"])

print("IDHM 2000 mean 95% confidence interval. Low={0:.3f}\tHigh={1:.3f}".format(*tuple(CIs00)))
print("IDHM 2010 mean 95% confidence interval. Low={0:.3f}\tHigh={1:.3f}".format(*tuple(CIs10)))
print("IDHM ratio mean 95% confidence interval. Low={0:.3f}\tHigh={1:.3f}".format(*tuple(CIsR)))









    



IDHM 2000 mean 95% confidence interval. Low=0.550	High=0.604
IDHM 2010 mean 95% confidence interval. Low=0.688	High=0.725
IDHM ratio mean 95% confidence interval. Low=1.204	High=1.258
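
The idea behind these intervals can be sketched without `scikits.bootstrap`. The example below uses the simple percentile method, which is not necessarily the method that library applies, so its bounds may differ slightly from the ones printed above. `percentile_bootstrap_ci` and its parameters are hypothetical names introduced for illustration.

```python
import random
from statistics import mean

# I2000 per state, copied from the state table.
i2000 = [0.725, 0.702, 0.674, 0.664, 0.650, 0.664, 0.640, 0.615, 0.624,
         0.613, 0.601, 0.577, 0.598, 0.525, 0.537, 0.552, 0.541, 0.515,
         0.544, 0.518, 0.517, 0.512, 0.506, 0.484, 0.518, 0.476, 0.471]

def percentile_bootstrap_ci(data, stat=mean, n_resamples=5000, alpha=0.05, seed=0):
    """Resample with replacement and take quantiles of the resampled statistic."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    stats = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    low = stats[int((alpha / 2) * n_resamples)]
    high = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high

low, high = percentile_bootstrap_ci(i2000)
print("IDHM 2000 mean 95%% CI (percentile method). Low=%.3f\tHigh=%.3f" % (low, high))
```

Passing `stat=median` (from the `statistics` module) would give the analogous interval for the median.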



In [46]:

    
# np.median replaces scipy.median, which recent SciPy releases no longer provide.
CIs00 = bootstrap.ci(data=idhm_df["I2000"], statfunction=np.median)
CIs10 = bootstrap.ci(data=idhm_df["I2010"], statfunction=np.median)
CIsR  = bootstrap.ci(data=idhm_df["Ratio"], statfunction=np.median)

print("IDHM 2000 median 95% confidence interval. Low={0:.3f}\tHigh={1:.3f}".format(*tuple(CIs00)))
print("IDHM 2010 median 95% confidence interval. Low={0:.3f}\tHigh={1:.3f}".format(*tuple(CIs10)))
print("IDHM ratio median 95% confidence interval. Low={0:.3f}\tHigh={1:.3f}".format(*tuple(CIsR)))









    



IDHM 2000 median 95% confidence interval. Low=0.518	High=0.613
IDHM 2010 median 95% confidence interval. Low=0.665	High=0.729
IDHM ratio median 95% confidence interval. Low=1.171	High=1.282

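At the 5% significance level, the tests above give strong evidence that it is: the paired t-test yields p ≈ 1.1e-19, and the bootstrap confidence intervals for 2000 and 2010 do not overlap, for either the mean or the median.

Building each party's share of the administration of each state of the federation.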



In [47]:

    
state_parties = pd.DataFrame({"Estado":idhm_df.Estado,"PT":np.repeat(0.0,27),"PSDB":np.repeat(0.0,27),"Outros":np.repeat(0.0,27)})



In [48]:

    
# Columns: state, PT, PSDB, Outros (others). Each row's shares should sum to 1.
st_pa = np.array([
        [u"Distrito Federal", 0.0, 0.0, 1.0],
        [u"São Paulo", 0.0, 0.925, 0.075],
        [u"Santa Catarina", 0.0, 0.0, 1.0],
        [u"Rio de Janeiro", 0.4, 0.0, 0.6],
        [u"Paraná", 0.0, 0.0, 1.0],
        [u"Rio Grande do Sul", 0.2, 0.4, 0.4],
        [u"Espírito Santo", 0.0, 0.2, 0.8],
        [u"Goiás", 0.0, 0.6, 0.4],
        [u"Minas Gerais", 0.0, 0.8, 0.2],
        [u"Mato Grosso do Sul", 0.0, 0.6, 0.4],
        [u"Mato Grosso", 0.0, 0.2, 0.8],
        [u"Amapá", 0.075, 0.0, 0.925],
        [u"Roraima", 0.275, 0.4, 0.325], # double check
        [u"Tocantins", 0.0, 0.2, 0.8], 
        [u"Rondônia", 0.0, 0.4, 0.6],
        [u"Rio Grande do Norte", 0.0, 0.0, 1.0],
        [u"Ceará", 0.6, 0.0, 0.4],
        [u"Amazonas", 0.0, 0.0, 1.0],
        [u"Pernambuco", 0.0, 0.0, 1.0],
        [u"Sergipe", 0.4, 0.2, 0.4],
        [u"Acre", 1.0, 0.0, 0.0],
        [u"Bahia", 0.4, 0.0, 0.6],
        [u"Paraíba", 0.0, 0.55, 0.45],
        [u"Piauí", 0.8, 0.0, 0.2],
        [u"Pará", 0.4, 0.6, 0.0],
        [u"Maranhão", 0.0, 0.0, 1.0],
        [u"Alagoas", 0.0, 0.0, 1.0],
       ])



In [49]:

    
np.float64(st_pa[:,1:]).sum(axis=1)









    Out[49]:





array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])



In [50]:

    
np.float64(st_pa[:,1:]).sum(axis=0)









    Out[50]:





array([  4.55 ,   6.075,  16.375])



In [51]:

    
state_parties_df = pd.DataFrame({"Estado":st_pa[:,0],"PSDB":np.float64(st_pa[:,2]),"PT":np.float64(st_pa[:,1]),"Outros":np.float64(st_pa[:,3])})



In [52]:

    
state_parties_df









    Out[52]:

    Estado               Outros  PSDB   PT
0   Distrito Federal     1.000   0.000  0.000
1   São Paulo            0.075   0.925  0.000
2   Santa Catarina       1.000   0.000  0.000
3   Rio de Janeiro       0.600   0.000  0.400
4   Paraná               1.000   0.000  0.000
5   Rio Grande do Sul    0.400   0.400  0.200
6   Espírito Santo       0.800   0.200  0.000
7   Goiás                0.400   0.600  0.000
8   Minas Gerais         0.200   0.800  0.000
9   Mato Grosso do Sul   0.400   0.600  0.000
10  Mato Grosso          0.800   0.200  0.000
11  Amapá                0.925   0.000  0.075
12  Roraima              0.325   0.400  0.275
13  Tocantins            0.800   0.200  0.000
14  Rondônia             0.600   0.400  0.000
15  Rio Grande do Norte  1.000   0.000  0.000
16  Ceará                0.400   0.000  0.600
17  Amazonas             1.000   0.000  0.000
18  Pernambuco           1.000   0.000  0.000
19  Sergipe              0.400   0.200  0.400
20  Acre                 0.000   0.000  1.000
21  Bahia                0.600   0.000  0.400
22  Paraíba              0.450   0.550  0.000
23  Piauí                0.200   0.000  0.800
24  Pará                 0.000   0.600  0.400
25  Maranhão             1.000   0.000  0.000
26  Alagoas              1.000   0.000  0.000



In [53]:

    
df = idhm_df.merge(state_parties_df, on="Estado")
df









    Out[53]:

    Estado               I2000  I2010  Ratio     idh_level_2000  Outros  PSDB   PT
0   Distrito Federal     0.725  0.824  1.136552  ALTO            1.000   0.000  0.000
1   São Paulo            0.702  0.783  1.115385  ALTO            0.075   0.925  0.000
2   Santa Catarina       0.674  0.774  1.148368  ALTO            1.000   0.000  0.000
3   Rio de Janeiro       0.664  0.761  1.146084  ALTO            0.600   0.000  0.400
4   Paraná               0.650  0.749  1.152308  ALTO            1.000   0.000  0.000
5   Rio Grande do Sul    0.664  0.746  1.123494  ALTO            0.400   0.400  0.200
6   Espírito Santo       0.640  0.740  1.156250  ALTO            0.800   0.200  0.000
7   Goiás                0.615  0.735  1.195122  ALTO            0.400   0.600  0.000
8   Minas Gerais         0.624  0.731  1.171474  ALTO            0.200   0.800  0.000
9   Mato Grosso do Sul   0.613  0.729  1.189233  ALTO            0.400   0.600  0.000
10  Mato Grosso          0.601  0.725  1.206323  ALTO            0.800   0.200  0.000
11  Amapá                0.577  0.708  1.227036  ALTO            0.925   0.000  0.075
12  Roraima              0.598  0.707  1.182274  ALTO            0.325   0.400  0.275
13  Tocantins            0.525  0.699  1.331429  BAIXO           0.800   0.200  0.000
14  Rondônia             0.537  0.690  1.284916  BAIXO           0.600   0.400  0.000
15  Rio Grande do Norte  0.552  0.684  1.239130  ALTO            1.000   0.000  0.000
16  Ceará                0.541  0.682  1.260628  BAIXO           0.400   0.000  0.600
17  Amazonas             0.515  0.674  1.308738  BAIXO           1.000   0.000  0.000
18  Pernambuco           0.544  0.673  1.237132  BAIXO           1.000   0.000  0.000
19  Sergipe              0.518  0.665  1.283784  BAIXO           0.400   0.200  0.400
20  Acre                 0.517  0.663  1.282398  BAIXO           0.000   0.000  1.000
21  Bahia                0.512  0.660  1.289062  BAIXO           0.600   0.000  0.400
22  Paraíba              0.506  0.658  1.300395  BAIXO           0.450   0.550  0.000
23  Piauí                0.484  0.646  1.334711  BAIXO           0.200   0.000  0.800
24  Pará                 0.518  0.646  1.247104  BAIXO           0.000   0.600  0.400
25  Maranhão             0.476  0.639  1.342437  BAIXO           1.000   0.000  0.000
26  Alagoas              0.471  0.631  1.339703  BAIXO           1.000   0.000  0.000



In [54]:

    
sns.set()
sns.pairplot(df, hue="idh_level_2000", size=2.5)









[Figure: pair plot of the numeric columns of df, colored by idh_level_2000]



In [62]:

    
sns.coefplot("center(I2010) ~ PT + PSDB + Outros + C(idh_level_2000)", df, palette="Set1");

Impact by party or by IDH-M level in 2000



In [56]:

    
sns.coefplot("center(Ratio) ~ center(PT) + center(PSDB) + center(Outros) + C(idh_level_2000)", df, palette="Set1");
sns.coefplot("center(Ratio) ~ center(PT) + center(PSDB) + C(idh_level_2000)", df, palette="Set1");



In [57]:

    
from statsmodels.formula.api import ols



In [58]:

    
formula = "center(Ratio) ~ center(PT) + center(PSDB) + center(Outros) + C(idh_level_2000)"
model = ols(formula, df).fit()
model.summary()









    Out[58]:

OLS Regression Results

  Dep. Variable:       center(Ratio)     R-squared:             0.786
  Model:                    OLS          Adj. R-squared:        0.758
  Method:              Least Squares     F-statistic:           28.17
  Date:              Wed, 22 Apr 2015    Prob (F-statistic):  7.03e-08
  Time:                  01:11:52        Log-Likelihood:       53.727
  No. Observations:           27         AIC:                  -99.45
  Df Residuals:               23         BIC:                  -94.27
  Df Model:                    3
  Covariance Type:       nonrobust

                                coef    std err        t    P>|t|   [95.0% Conf. Int.]
  Intercept                  -0.0608      0.010   -6.047    0.000     -0.082    -0.040
  idh_level_2000[T.BAIXO]     0.1263      0.015    8.309    0.000      0.095     0.158
  center(PT)                 -0.0089      0.019   -0.463    0.648     -0.048     0.031
  center(PSDB)               -0.0130      0.018   -0.719    0.479     -0.050     0.024
  center(Outros)              0.0218      0.014    1.600    0.123     -0.006     0.050

  Omnibus:         1.722    Durbin-Watson:         1.385
  Prob(Omnibus):   0.423    Jarque-Bera (JB):      1.085
  Skew:           -0.142    Prob(JB):              0.581
  Kurtosis:        2.060    Cond. No.           1.14e+16

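No significant difference between the parties could be observed: none of the party coefficients is significant at the 5% level (p = 0.648 for PT, 0.479 for PSDB, 0.123 for Outros).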
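
One caveat when reading the regression summary above: the three party shares sum to 1 in every state, so after centering they sum to 0 and any one of them is a linear combination of the other two. This is consistent with `Df Model: 3` for four listed regressors and with the very large condition number. A minimal sketch of that dependency, using a handful of rows from the share table (the subset is illustrative):

```python
# (PT, PSDB, Outros) shares for a few states, copied from the share table.
shares = {
    "São Paulo": (0.0, 0.925, 0.075),
    "Rio Grande do Sul": (0.2, 0.4, 0.4),
    "Acre": (1.0, 0.0, 0.0),
    "Piauí": (0.8, 0.0, 0.2),
    "Alagoas": (0.0, 0.0, 1.0),
}

rows = list(shares.values())
n = len(rows)

# Column means, then centered values (centering subtracts each column's mean).
col_means = [sum(r[j] for r in rows) / n for j in range(3)]
centered = [[r[j] - col_means[j] for j in range(3)] for r in rows]

# Each centered row sums to (approximately) zero, so
# center(Outros) = -(center(PT) + center(PSDB)): the columns are collinear.
row_sums = [sum(r) for r in centered]
print(row_sums)
```

Dropping one of the three share columns, as the second `coefplot` formula does, removes the redundancy.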

Model with pooling

Is there a difference among the states at or above the 2000 median IDH-M?



In [59]:

    
sns.coefplot("center(Ratio) ~ center(PSDB) + center(PT) + center(Outros)",df[df.idh_level_2000=="ALTO"], palette="Set1");
sns.coefplot("center(Ratio) ~ center(PSDB) + center(PT)", df[df.idh_level_2000=="ALTO"], palette="Set1");
sns.coefplot("center(Ratio) ~ center(PT) + center(Outros)", df[df.idh_level_2000=="ALTO"], palette="Set1");
sns.coefplot("center(Ratio) ~ center(PSDB) + center(Outros)", df[df.idh_level_2000=="ALTO"], palette="Set1");









    



---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-59-e45466c70f43> in <module>()
----> 1 sns.coefplot("center(Ratio) ~ center(PSDB) + center(PT) + center(Outros)",df[df.idh_level_2000==0], palette="Set1");
      2 sns.coefplot("center(Ratio) ~ center(PSDB) + center(PT)", df[df.idh_level_2000==0], palette="Set1");
      3 sns.coefplot("center(Ratio) ~ center(Outros) + center(PSDB)", df[df.idh_level_2000==0], palette="Set1");
      4 sns.coefplot("center(Ratio) ~ center(PSDB) + center(Outros)", df[df.idh_level_2000==0], palette="Set1");

C:\Python27\lib\site-packages\seaborn\linearmodels.pyc in coefplot(formula, data, groupby, intercept, ci, palette)
   1302     alpha = 1 - ci / 100
   1303     if groupby is None:
-> 1304         coefs = sf.ols(formula, data).fit().params
   1305         cis = sf.ols(formula, data).fit().conf_int(alpha)
   1306     else:

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\model.pyc in from_formula(cls, formula, data, subset, *args, **kwargs)
    148         kwargs.update({'missing_idx': missing_idx,
    149                        'missing': missing})
--> 150         mod = cls(endog, exog, *args, **kwargs)
    151         mod.formula = formula
    152 

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\regression\linear_model.pyc in __init__(self, endog, exog, missing, hasconst, **kwargs)
    689                  **kwargs):
    690         super(OLS, self).__init__(endog, exog, missing=missing,
--> 691                                   hasconst=hasconst, **kwargs)
    692         if "weights" in self._init_keys:
    693             self._init_keys.remove("weights")

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\regression\linear_model.pyc in __init__(self, endog, exog, weights, missing, hasconst, **kwargs)
    584             weights = weights.squeeze()
    585         super(WLS, self).__init__(endog, exog, missing=missing,
--> 586                                   weights=weights, hasconst=hasconst, **kwargs)
    587         nobs = self.exog.shape[0]
    588         weights = self.weights

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\regression\linear_model.pyc in __init__(self, endog, exog, **kwargs)
     89     """
     90     def __init__(self, endog, exog, **kwargs):
---> 91         super(RegressionModel, self).__init__(endog, exog, **kwargs)
     92         self._data_attr.extend(['pinv_wexog', 'wendog', 'wexog', 'weights'])
     93 

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\model.pyc in __init__(self, endog, exog, **kwargs)
    184 
    185     def __init__(self, endog, exog=None, **kwargs):
--> 186         super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
    187         self.initialize()
    188 

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\model.pyc in __init__(self, endog, exog, **kwargs)
     58         hasconst = kwargs.pop('hasconst', None)
     59         self.data = self._handle_data(endog, exog, missing, hasconst,
---> 60                                       **kwargs)
     61         self.k_constant = self.data.k_constant
     62         self.exog = self.data.exog

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\model.pyc in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
     82 
     83     def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
---> 84         data = handle_data(endog, exog, missing, hasconst, **kwargs)
     85         # kwargs arrays could have changed, easier to just attach here
     86         for key in kwargs:

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\data.pyc in handle_data(endog, exog, missing, hasconst, **kwargs)
    564     klass = handle_data_class_factory(endog, exog)
    565     return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
--> 566                  **kwargs)

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\data.pyc in __init__(self, endog, exog, missing, hasconst, **kwargs)
     73 
     74         # this has side-effects, attaches k_constant and const_idx
---> 75         self._handle_constant(hasconst)
     76         self._check_integrity()
     77         self._cache = resettable_cache()

C:\Python27\lib\site-packages\statsmodels-0.6.1-py2.7-win-amd64.egg\statsmodels\base\data.pyc in _handle_constant(self, hasconst)
     91             # detect where the constant is
     92             check_implicit = False
---> 93             const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze()
     94             self.k_constant = const_idx.size
     95 

ValueError: zero-size array to reduction operation maximum which has no identity
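
The traceback above shows the filter `df.idh_level_2000==0`, whereas the cell source uses `=="ALTO"`, so the error most likely comes from an earlier run of the cell. Because `idh_level_2000` holds the strings "ALTO" and "BAIXO", comparing it with 0 matches no rows, and fitting a model on an empty subset would fail in the way shown. A plain-Python sketch of the same mismatch (pandas compares element by element in the same way; the three rows are illustrative):

```python
rows = [
    {"Estado": "São Paulo", "Ratio": 1.115385, "idh_level_2000": "ALTO"},
    {"Estado": "Acre", "Ratio": 1.282398, "idh_level_2000": "BAIXO"},
    {"Estado": "Goiás", "Ratio": 1.195122, "idh_level_2000": "ALTO"},
]

# A string label never equals the integer 0, so this selects nothing.
by_int = [r for r in rows if r["idh_level_2000"] == 0]

# Comparing against the label actually stored in the column works.
by_label = [r for r in rows if r["idh_level_2000"] == "ALTO"]

print(len(by_int), len(by_label))
```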



In [ ]:

    
formula = "scale(Ratio) ~ PT + PSDB"
model = ols(formula, df[df.idh_level_2000=="ALTO"]).fit()
model.summary()

No difference could be identified in this case either.

Is there a difference, then, among the states below the 2000 median IDH-M?



In [ ]:

    
sns.coefplot("center(Ratio) ~ center(PSDB) + center(PT) + center(Outros)",df[df.idh_level_2000=="BAIXO"], palette="Set1");
sns.coefplot("center(Ratio) ~ center(PSDB) + center(PT)", df[df.idh_level_2000=="BAIXO"], palette="Set1");
sns.coefplot("center(Ratio) ~ center(PT) + center(Outros)", df[df.idh_level_2000=="BAIXO"], palette="Set1");
sns.coefplot("center(Ratio) ~ center(PSDB) + center(Outros)", df[df.idh_level_2000=="BAIXO"], palette="Set1");



In [ ]:

    
formula = "scale(Ratio) ~ PT + PSDB"
model = ols(formula, df[df.idh_level_2000=="BAIXO"]).fit()
model.summary()

Again, no statistically significant difference could be identified.

Does it matter whether the state is allied with the federal government or not?

TBD



Output of idhm_df.describe():

           I2000      I2010      Ratio
count  27.000000  27.000000  27.000000
mean    0.576407   0.704519   1.230795
std     0.072960   0.049284   0.072885
min     0.471000   0.631000   1.115385
25%     0.517500   0.664000   1.163862
50%     0.552000   0.699000   1.237132
75%     0.632000   0.737500   1.286989
max     0.725000   0.824000   1.342437
