climMAPcore Validation (Aguascalientes)

In this exercise we generate several statistical indicators to validate the output of the climMAPcore application.

Procedure

The values compared are the daily station value and the corresponding value from the climMAPcore application. The data set is in the data folder, in the file dataFromAguascalientesClimmapcore.csv, which includes the following fields:

  • Station : station number
  • State : state
  • Lat : latitude
  • Long : longitude
  • Year : year
  • Month : month
  • Day : day
  • Rain : station precipitation
  • Hr : station relative humidity
  • Tpro : station mean temperature
  • RainClimmapcore : climMAPcore precipitation (WRF model)
  • HrClimmapcore : climMAPcore relative humidity (WRF model)
  • TproClimmapcore : climMAPcore mean temperature (WRF model)
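Before computing anything, it can help to confirm that the file contains the expected columns and to assemble a proper date from Year, Month, and Day. The following is a minimal sketch, not part of the original notebook: the helper name `add_date_column` is hypothetical, the column names are assumed to be those displayed by `data.head()` in this notebook, and the two sample rows are retyped by hand from that output.

```python
import pandas as pd

# Assumption: column names as displayed by data.head() in this notebook.
EXPECTED_COLUMNS = [
    "Station", "State", "Lat", "Long", "Year", "Month", "Day",
    "Rain", "Hr", "Tpro",
    "RainClimmapcore", "HrClimmapcore", "TproClimmapcore",
]


def add_date_column(frame):
    """Hypothetical helper: validate columns and add a datetime 'Date' column."""
    missing = [c for c in EXPECTED_COLUMNS if c not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    out = frame.copy()
    # Year/Month/Day are stored as floats (2017.0), so cast to int first.
    parts = out[["Year", "Month", "Day"]].astype(int)
    parts.columns = ["year", "month", "day"]
    out["Date"] = pd.to_datetime(parts)
    return out


# Two rows retyped from the data.head() output, for illustration only.
sample = pd.DataFrame({
    "Station": [22.0, 22.0],
    "State": ["AGS", "AGS"],
    "Lat": [21.7853, 21.7853],
    "Long": [21.7853, 21.7853],
    "Year": [2017.0, 2017.0],
    "Month": [1.0, 1.0],
    "Day": [1.0, 2.0],
    "Rain": [0.0, 0.0],
    "Hr": [55.49, 46.33],
    "Tpro": [14.18, 14.26],
    "RainClimmapcore": [0.090005, 0.000000],
    "HrClimmapcore": [58.330972, 53.417499],
    "TproClimmapcore": [13.649469, 13.828637],
})

dated = add_date_column(sample)
print(dated[["Date", "Hr", "HrClimmapcore"]])
```

With a real date column, the station and application series can be plotted or resampled over time instead of being grouped only by month number.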

In [27]:
# libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.formula.api as sm
%matplotlib inline
plt.style.use('grayscale')

In [28]:
# read the file
data = pd.read_csv('../data/dataFromAguascalientesClimmapcore.csv')

In [29]:
# check its contents
data.head()


Out[29]:
   Station  State      Lat     Long    Year  Month  Day  Rain     Hr   Tpro  RainClimmapcore  HrClimmapcore  TproClimmapcore
0     22.0    AGS  21.7853  21.7853  2017.0    1.0  1.0   0.0  55.49  14.18         0.090005      58.330972        13.649469
1     22.0    AGS  21.7853  21.7853  2017.0    1.0  2.0   0.0  46.33  14.26         0.000000      53.417499        13.828637
2     22.0    AGS  21.7853  21.7853  2017.0    1.0  3.0   0.0  32.51  15.43         0.000000      39.286445        14.939110
3     22.0    AGS  21.7853  21.7853  2017.0    1.0  4.0   0.0  25.60  17.01         0.000000      39.291486        15.209583
4     22.0    AGS  21.7853  21.7853  2017.0    1.0  5.0   0.0  28.03  16.75         0.000000      31.613730        15.769464

In [30]:
# difference (station minus climMAPcore) for precipitation, relative humidity, and mean temperature
data['diffRain'] = data['Rain'] - data['RainClimmapcore']
data['diffHr'] = data['Hr'] - data['HrClimmapcore']
data['diffTpro'] = data['Tpro'] - data['TproClimmapcore']

In [31]:
# check the contents
data.head()


Out[31]:
   Station  State      Lat     Long    Year  Month  Day  Rain     Hr   Tpro  RainClimmapcore  HrClimmapcore  TproClimmapcore   diffRain      diffHr  diffTpro
0     22.0    AGS  21.7853  21.7853  2017.0    1.0  1.0   0.0  55.49  14.18         0.090005      58.330972        13.649469  -0.090005   -2.840972  0.530531
1     22.0    AGS  21.7853  21.7853  2017.0    1.0  2.0   0.0  46.33  14.26         0.000000      53.417499        13.828637   0.000000   -7.087499  0.431363
2     22.0    AGS  21.7853  21.7853  2017.0    1.0  3.0   0.0  32.51  15.43         0.000000      39.286445        14.939110   0.000000   -6.776445  0.490890
3     22.0    AGS  21.7853  21.7853  2017.0    1.0  4.0   0.0  25.60  17.01         0.000000      39.291486        15.209583   0.000000  -13.691486  1.800417
4     22.0    AGS  21.7853  21.7853  2017.0    1.0  5.0   0.0  28.03  16.75         0.000000      31.613730        15.769464   0.000000   -3.583730  0.980536
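The difference columns are the raw material for the usual validation indicators. As a minimal sketch (the helper `error_summary` is hypothetical and not part of the original notebook), the mean difference (bias), mean absolute error, and root mean squared error can be computed as follows. The example uses only the five Hr rows displayed by `data.head()`, so the numbers it prints describe those five days and not the full data set.

```python
import numpy as np
import pandas as pd


def error_summary(observed, modeled):
    """Hypothetical helper: summary indicators for modeled vs observed values."""
    observed = np.asarray(observed, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    diff = observed - modeled  # same sign convention as diffHr: station minus model
    return pd.Series({
        "bias": diff.mean(),                  # mean difference
        "mae": np.abs(diff).mean(),           # mean absolute error
        "rmse": np.sqrt((diff ** 2).mean()),  # root mean squared error
    })


# The five Hr rows shown by data.head(), retyped by hand.
hr_station = [55.49, 46.33, 32.51, 25.60, 28.03]
hr_model = [58.330972, 53.417499, 39.286445, 39.291486, 31.613730]

summary = error_summary(hr_station, hr_model)
print(summary)
```

On the full frame the same helper could be called as `error_summary(data['Hr'], data['HrClimmapcore'])`, and likewise for Tpro and Rain.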

In [32]:
# histogram of Hr differences
data['diffHr'].hist()


Out[32]:
<matplotlib.axes._subplots.AxesSubplot at 0x121331358>

In [33]:
# monthly means; select the numeric columns before averaging so the text column State is not included
data.groupby('Month')[['Hr','HrClimmapcore']].mean()


Out[33]:
              Hr  HrClimmapcore
Month
1.0    41.450787      44.815389
2.0    36.156996      41.821895
3.0    37.344165      45.491191
4.0    27.901539      33.474755
5.0    29.969440      35.266573
6.0    47.800255      53.295624

In [34]:
# plot the monthly means as a bar chart
data.groupby('Month')[['Hr','HrClimmapcore']].mean().plot.bar()


Out[34]:
<matplotlib.axes._subplots.AxesSubplot at 0x121f72470>

In [35]:
# histogram of Tpro differences
data['diffTpro'].hist()


Out[35]:
<matplotlib.axes._subplots.AxesSubplot at 0x1219d8400>

In [36]:
# monthly means; select the numeric columns before averaging so the text column State is not included
data.groupby('Month')[['Tpro','TproClimmapcore']].mean()


Out[36]:
            Tpro  TproClimmapcore
Month
1.0    13.485769        12.241899
2.0    15.096891        13.478516
3.0    17.353890        15.635514
4.0    19.495078        17.539529
5.0    22.461708        20.859032
6.0    22.240676        21.143828

In [37]:
# plot the monthly means as a bar chart
data.groupby('Month')[['Tpro','TproClimmapcore']].mean().plot.bar()


Out[37]:
<matplotlib.axes._subplots.AxesSubplot at 0x12271fac8>

In [38]:
# histogram of Rain differences
data['diffRain'].hist()


Out[38]:
<matplotlib.axes._subplots.AxesSubplot at 0x122899748>

In [39]:
# monthly means; select the numeric columns before averaging so the text column State is not included
data.groupby('Month')[['Rain','RainClimmapcore']].mean()


Out[39]:
           Rain  RainClimmapcore
Month
1.0    0.014991         0.029041
2.0    0.081513         0.195703
3.0    0.292979         0.644869
4.0    0.031078         0.094193
5.0    0.126376         0.709989
6.0    1.480784         3.068844

In [40]:
# plot the monthly means as a bar chart
data.groupby('Month')[['Rain','RainClimmapcore']].mean().plot.bar()


Out[40]:
<matplotlib.axes._subplots.AxesSubplot at 0x122a041d0>
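The monthly rain means in Out[39] show climMAPcore above the station in every month. One way to make the size of that gap explicit is the ratio of the two means. The sketch below retypes the values from Out[39] by hand; the column name `ratio` is introduced only for illustration and is not part of the original notebook.

```python
import pandas as pd

# Monthly means copied from Out[39].
monthly_rain = pd.DataFrame(
    {
        "Rain": [0.014991, 0.081513, 0.292979, 0.031078, 0.126376, 1.480784],
        "RainClimmapcore": [0.029041, 0.195703, 0.644869, 0.094193, 0.709989, 3.068844],
    },
    index=pd.Index([1, 2, 3, 4, 5, 6], name="Month"),
)

# Ratio above 1 means the application's monthly mean exceeds the station's.
monthly_rain["ratio"] = monthly_rain["RainClimmapcore"] / monthly_rain["Rain"]
print(monthly_rain.round(3))
```

A ratio of means is sensitive to months with very little rain, so it is best read alongside the absolute differences.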

Linear Regression


In [41]:
# libraries: seaborn as sns
import seaborn as sns

In [42]:
# Hr
sns.lmplot(x='Hr', y='HrClimmapcore', data=data, col='Month', aspect=0.6, height=8)  # `size` was renamed to `height` in seaborn 0.9


Out[42]:
<seaborn.axisgrid.FacetGrid at 0x122456ba8>

In [43]:
# Tpro
sns.lmplot(x='Tpro', y='TproClimmapcore', data=data, col='Month', aspect=0.6, height=8)  # `size` was renamed to `height` in seaborn 0.9


Out[43]:
<seaborn.axisgrid.FacetGrid at 0x1233f2358>

In [44]:
# Rain
sns.lmplot(x='Rain', y='RainClimmapcore', data=data, col='Month', aspect=0.6, height=8)  # `size` was renamed to `height` in seaborn 0.9


Out[44]:
<seaborn.axisgrid.FacetGrid at 0x124104a20>

In [45]:
# Rain, second-order polynomial regression
sns.lmplot(x='Rain', y='RainClimmapcore', data=data, col='Month', aspect=0.6, height=8, order=2)  # `size` was renamed to `height` in seaborn 0.9


Out[45]:
<seaborn.axisgrid.FacetGrid at 0x124d91fd0>
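Passing `order=2` asks `lmplot` to draw a second-degree polynomial instead of a straight line, but the plot does not report the coefficients. To inspect them, `numpy.polyfit` can be called directly. This is a sketch on synthetic data (the arrays `x` and `y` are made up for illustration), and it is not meant as a reproduction of seaborn's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
# Synthetic target: a known quadratic plus a little noise.
y = 0.5 * x**2 - 1.0 * x + 2.0 + rng.normal(scale=0.1, size=x.size)

# Coefficients are returned highest degree first: [a, b, c] for a*x^2 + b*x + c.
coeffs = np.polyfit(x, y, deg=2)
fitted = np.polyval(coeffs, x)

print(coeffs)
```

With the notebook's data, the equivalent call would use `data['Rain']` and `data['RainClimmapcore']`, one month at a time if the per-month curves are wanted.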

Linear regression with p-value and Pearson r


In [46]:
# Hr
sns.jointplot(x="Hr", y="HrClimmapcore", data=data, kind="reg")  # keyword x/y; positional form is deprecated in newer seaborn


Out[46]:
<seaborn.axisgrid.JointGrid at 0x125b4d8d0>

In [47]:
# Tpro
sns.jointplot(x="Tpro", y="TproClimmapcore", data=data, kind="reg")  # keyword x/y; positional form is deprecated in newer seaborn


Out[47]:
<seaborn.axisgrid.JointGrid at 0x124078780>

In [48]:
# Rain
sns.jointplot(x="Rain", y="RainClimmapcore", data=data, kind="reg", color="k")  # keyword x/y; positional form is deprecated in newer seaborn


Out[48]:
<seaborn.axisgrid.JointGrid at 0x1266d1550>
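Depending on the seaborn version, `jointplot` may not print the Pearson coefficient and its p-value on the figure. They can be computed separately with `scipy.stats.pearsonr`. The sketch below assumes SciPy is installed and uses only the five Hr rows displayed by `data.head()`, retyped by hand, so the result describes those five days and not the full data set.

```python
import numpy as np
from scipy.stats import pearsonr

# The five Hr rows shown by data.head(), retyped by hand.
hr_station = np.array([55.49, 46.33, 32.51, 25.60, 28.03])
hr_model = np.array([58.330972, 53.417499, 39.286445, 39.291486, 31.613730])

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = pearsonr(hr_station, hr_model)

print(f"pearson r = {r:.3f}, p = {p_value:.3g}")
```

On the full frame the call would be `pearsonr(data['Hr'], data['HrClimmapcore'])`, provided neither column contains missing values.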

OLS Regression


In [49]:
# Hr
result = sm.ols(formula='HrClimmapcore ~ Hr', data=data).fit()

In [50]:
print(result.params)


Intercept    8.961574
Hr           0.908297
dtype: float64

In [51]:
print(result.summary())


                            OLS Regression Results                            
==============================================================================
Dep. Variable:          HrClimmapcore   R-squared:                       0.735
Model:                            OLS   Adj. R-squared:                  0.735
Method:                 Least Squares   F-statistic:                 1.706e+04
Date:                Tue, 24 Oct 2017   Prob (F-statistic):               0.00
Time:                        11:28:47   Log-Likelihood:                -21078.
No. Observations:                6154   AIC:                         4.216e+04
Df Residuals:                    6152   BIC:                         4.217e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      8.9616      0.273     32.862      0.000       8.427       9.496
Hr             0.9083      0.007    130.614      0.000       0.895       0.922
==============================================================================
Omnibus:                       12.704   Durbin-Watson:                   1.087
Prob(Omnibus):                  0.002   Jarque-Bera (JB):               12.784
Skew:                           0.109   Prob(JB):                      0.00168
Kurtosis:                       2.949   Cond. No.                         113.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

In [52]:
# Tpro
result = sm.ols(formula='TproClimmapcore ~ Tpro', data=data).fit()

In [53]:
print(result.params)


Intercept    0.677357
Tpro         0.879475
dtype: float64

In [54]:
print(result.summary())


                            OLS Regression Results                            
==============================================================================
Dep. Variable:        TproClimmapcore   R-squared:                       0.855
Model:                            OLS   Adj. R-squared:                  0.855
Method:                 Least Squares   F-statistic:                 3.634e+04
Date:                Tue, 24 Oct 2017   Prob (F-statistic):               0.00
Time:                        11:28:47   Log-Likelihood:                -10994.
No. Observations:                6154   AIC:                         2.199e+04
Df Residuals:                    6152   BIC:                         2.201e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.6774      0.087      7.806      0.000       0.507       0.847
Tpro           0.8795      0.005    190.640      0.000       0.870       0.889
==============================================================================
Omnibus:                      108.862   Durbin-Watson:                   0.478
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              160.688
Skew:                           0.197   Prob(JB):                     1.28e-35
Kurtosis:                       3.687   Cond. No.                         88.9
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

In [55]:
# Rain
result = sm.ols(formula='RainClimmapcore ~ Rain', data=data).fit()

In [56]:
print(result.params)


Intercept    0.587440
Rain         0.604637
dtype: float64

In [57]:
print(result.summary())


                            OLS Regression Results                            
==============================================================================
Dep. Variable:        RainClimmapcore   R-squared:                       0.190
Model:                            OLS   Adj. R-squared:                  0.190
Method:                 Least Squares   F-statistic:                     1441.
Date:                Tue, 24 Oct 2017   Prob (F-statistic):          1.75e-283
Time:                        11:28:47   Log-Likelihood:                -14737.
No. Observations:                6154   AIC:                         2.948e+04
Df Residuals:                    6152   BIC:                         2.949e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.5874      0.034     17.152      0.000       0.520       0.655
Rain           0.6046      0.016     37.960      0.000       0.573       0.636
==============================================================================
Omnibus:                     7397.878   Durbin-Watson:                   1.200
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          1297078.169
Skew:                           6.293   Prob(JB):                         0.00
Kurtosis:                      73.001   Cond. No.                         2.19
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
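Each fitted line has the form model = Intercept + slope * station, so it can be used to see where the application tends to read above or below the station. The sketch below plugs in the coefficients printed by `result.params` for Hr and Tpro; the helper names `predict_model_value` and `crossover` are hypothetical and not part of the original notebook.

```python
def predict_model_value(station_value, intercept, slope):
    """Expected climMAPcore value for a station value under the fitted line."""
    return intercept + slope * station_value


def crossover(intercept, slope):
    """Station value at which the fitted line predicts model == station."""
    return intercept / (1.0 - slope)


# Coefficients copied from the result.params outputs in this notebook.
HR = {"intercept": 8.961574, "slope": 0.908297}
TPRO = {"intercept": 0.677357, "slope": 0.879475}

hr_at_50 = predict_model_value(50.0, **HR)
hr_cross = crossover(**HR)
tpro_cross = crossover(**TPRO)

print(hr_at_50, hr_cross, tpro_cross)
```

When the intercept is positive and the slope is below 1, as in both fits here, the fitted line lies above the 1:1 line for station values below the crossover and below it for values above the crossover. This is consistent with the monthly means: the application's Hr means are higher than the station's, and its Tpro means are lower.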

Seaborn histograms


In [58]:
# Hr
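sns.histplot(data['diffHr'], kde=True, stat="density", color="k")  # distplot is deprecated; histplot with a KDE overlay is the closest replacement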


Out[58]:
<matplotlib.axes._subplots.AxesSubplot at 0x126c739b0>

In [59]:
# Tpro
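sns.histplot(data['diffTpro'], kde=True, stat="density", color="k")  # distplot is deprecated; histplot with a KDE overlay is the closest replacement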


Out[59]:
<matplotlib.axes._subplots.AxesSubplot at 0x126ceaba8>

In [60]:
# Rain
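sns.histplot(data['diffRain'], kde=True, stat="density", color="k")  # distplot is deprecated; histplot with a KDE overlay is the closest replacement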


Out[60]:
<matplotlib.axes._subplots.AxesSubplot at 0x126f269e8>
