The Pst class

The pst_handler module contains the Pst class for dealing with pest control files. It relies heavily on pandas to deal with tabular sections, such as parameters, observations, and prior information.


In [3]:
from __future__ import print_function
import os
import numpy as np
import pyemu
from pyemu import Pst

We need to pass the name of a pest control file to instantiate the class. The class instance (or object) is assigned to the variable p.


In [4]:
pst_name = os.path.join("..", "..", "examples", "henry","pest.pst")
p = Pst(pst_name)

Now all of the relevant parts of the pest control file are attributes of the object. For example, the parameter_data, observation data, and prior information are available as pandas dataframes.


In [5]:
p.parameter_data.head()


Out[5]:
parnme partrans parchglim parval1 parlbnd parubnd pargp scale offset dercom extra
parnme
global_k global_k log factor 200.0 150.00 250.00 m 1.0 0.0 1 NaN
mult1 mult1 log factor 1.0 0.75 1.25 m 1.0 0.0 1 NaN
mult2 mult2 log factor 1.0 0.50 2.00 m 1.0 0.0 1 NaN
kr01c01 kr01c01 log factor 1.0 0.10 10.00 p 1.0 0.0 1 NaN
kr01c02 kr01c02 log factor 1.0 0.10 10.00 p 1.0 0.0 1 NaN

In [6]:
p.observation_data.head()


Out[6]:
obsnme obsval weight obgnme extra
obsnme
h_obs01_1 h_obs01_1 0.051396 152.1458 head NaN
h_obs01_2 h_obs01_2 0.022156 0.0000 head NaN
h_obs02_1 h_obs02_1 0.046879 152.1458 head NaN
h_obs02_2 h_obs02_2 0.020853 0.0000 head NaN
h_obs03_1 h_obs03_1 0.036584 152.1458 head NaN

In [7]:
p.prior_information.head()


Out[7]:
equation obgnme pilbl weight
pilbl
mult1 1.0 * log(mult1) = 0.000000 regul_m mult1 1.0
kr01c01 1.0 * log(kr01c01) = 0.0 regul_p kr01c01 1.0
kr01c02 1.0 * log(kr01c02) = 0.0 regul_p kr01c02 1.0
kr01c03 1.0 * log(kr01c03) = 0.0 regul_p kr01c03 1.0
kr01c04 1.0 * log(kr01c04) = 0.0 regul_p kr01c04 1.0

A residual file (.rei or res) can also be passed to the resfile argument at instantiation to enable some simple residual analysis and weight adjustments. If the residual file is in the same directory as the pest control file and has the same base name, it will be accessed automatically:


In [8]:
p.res.head()


Out[8]:
name group measured modelled residual weight
name
h_obs01_1 h_obs01_1 head 0.051396 0.080402 -0.029006 152.1458
h_obs01_2 h_obs01_2 head 0.022156 0.036898 -0.014742 0.0000
h_obs02_1 h_obs02_1 head 0.046879 0.069121 -0.022241 152.1458
h_obs02_2 h_obs02_2 head 0.020853 0.034311 -0.013458 0.0000
h_obs03_1 h_obs03_1 head 0.036584 0.057722 -0.021138 152.1458

The pst class has some @decorated convience methods related to the residuals allowing the user to access the values and print in a straightforward way.


In [9]:
print(p.phi,p.phi_components)


1855.6874378297073 {'conc': 197.05822096106502, 'head': 1658.6292168686423, 'regul_m': 0.0, 'regul_p': 0.0}

Some additional @decorated convience methods:


In [10]:
print(p.npar,p.nobs,p.nprior)


603 75 601

In [11]:
print(p.par_groups,p.obs_groups)


['m', 'p'] ['conc', 'head']

Printing the attribue type shows that some are returned as lists and others single values.


In [14]:
print(type(p.par_names)) # all parameter names
print(type(p.adj_par_names)) # adjustable parameter names
print(type(p.obs_names)) # all observation names
print(type(p.nnz_obs_names)) # non-zero weight observations
print(type(p.phi))  # float value that is the weighted total objective function


<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'float'>

The "control_data" section of the pest control file is accessible in the Pst.control_data attribute:


In [15]:
print('jacupdate = {0}'.format(p.control_data.jacupdate))
print('numlam = {0}'.format(p.control_data.numlam))
p.control_data.numlam = 100
print('numlam has been changed to --> {0}'.format(p.control_data.numlam))


jacupdate = 999
numlam = 10
numlam has been changed to --> 100

The Pst class also exposes a method to get a new Pst instance with a subset of parameters and or obseravtions. Note this method does not propogate prior information to the new instance:


In [16]:
pnew = p.get(p.par_names[:10],p.obs_names[-10:])
print(pnew.prior_information)


                            equation   obgnme    pilbl  weight      names
pilbl                                                                    
mult1    1.0 * log(mult1) = 0.000000  regul_m    mult1     1.0    [mult1]
kr01c01     1.0 * log(kr01c01) = 0.0  regul_p  kr01c01     1.0  [kr01c01]
kr01c02     1.0 * log(kr01c02) = 0.0  regul_p  kr01c02     1.0  [kr01c02]
kr01c03     1.0 * log(kr01c03) = 0.0  regul_p  kr01c03     1.0  [kr01c03]
kr01c04     1.0 * log(kr01c04) = 0.0  regul_p  kr01c04     1.0  [kr01c04]
kr01c05     1.0 * log(kr01c05) = 0.0  regul_p  kr01c05     1.0  [kr01c05]
kr01c06     1.0 * log(kr01c06) = 0.0  regul_p  kr01c06     1.0  [kr01c06]
kr01c07     1.0 * log(kr01c07) = 0.0  regul_p  kr01c07     1.0  [kr01c07]

You can also write a pest control file with altered parameters, observations, and/or prior information:


In [17]:
pnew.write("test.pst")


C:\Python27\Anaconda\envs\py36\lib\site-packages\pandas\core\indexing.py:337: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[key] = _infer_fill_value(value)
C:\Python27\Anaconda\envs\py36\lib\site-packages\pandas\core\indexing.py:517: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s

Some other methods in Pst include:


In [18]:
# add preferred value regularization with weights proportional to parameter bounds
pyemu.utils.helpers.zero_order_tikhonov(pnew)
pnew.prior_information.head()


Out[18]:
equation obgnme pilbl weight
pilbl
global_k 1.0 * log(global_k) = 2.301030E+00 regulm global_k 4.507576
mult1 1.0 * log(mult1) = 0.000000E+00 regulm mult1 4.507576
mult2 1.0 * log(mult2) = 0.000000E+00 regulm mult2 1.660964
kr01c01 1.0 * log(kr01c01) = 0.000000E+00 regulp kr01c01 0.500000
kr01c02 1.0 * log(kr01c02) = 0.000000E+00 regulp kr01c02 0.500000

In [19]:
# add preferred value regularization with unity weights
pyemu.utils.helpers.zero_order_tikhonov(pnew,parbounds=False)
pnew.prior_information.head()


Out[19]:
equation obgnme pilbl weight
0 1.0 * log(global_k) = 2.301030E+00 regulm global_k 1.0
1 1.0 * log(mult1) = 0.000000E+00 regulm mult1 1.0
2 1.0 * log(mult2) = 0.000000E+00 regulm mult2 1.0
3 1.0 * log(kr01c01) = 0.000000E+00 regulp kr01c01 1.0
4 1.0 * log(kr01c02) = 0.000000E+00 regulp kr01c02 1.0

Some more res functionality


In [20]:
# adjust observation weights to account for residual phi components
#pnew = p.get()
print(p.phi, p.nnz_obs, p.phi_components)
p.adjust_weights_resfile()
print(p.phi, p.nnz_obs, p.phi_components)


1855.6874378297073 36 {'conc': 197.05822096106502, 'head': 1658.6292168686423}
36.0 36 {'conc': 15.000000000000004, 'head': 21.0}

adjust observation weights by an arbitrary amount by groups:


In [21]:
print(p.phi, p.nnz_obs, p.phi_components)
grp_dict = {"head":100}
p.adjust_weights(obsgrp_dict=grp_dict)
print(p.phi, p.nnz_obs, p.phi_components)


36.0 36 {'conc': 15.000000000000004, 'head': 21.0}
114.99999999999997 36 {'conc': 15.000000000000004, 'head': 99.99999999999997}

adjust observation weights by an arbitrary amount by individual observations:


In [22]:
print(p.phi, p.nnz_obs, p.phi_components)
obs_dict = {"h_obs01_1":25}
p.adjust_weights(obs_dict=obs_dict)
print(p.phi, p.nnz_obs, p.phi_components)


114.99999999999997 36 {'conc': 15.000000000000004, 'head': 99.99999999999997}
138.82580701146588 36 {'conc': 15.000000000000004, 'head': 123.82580701146588}
C:\Python27\Anaconda\envs\py36\lib\site-packages\pyemu-0.4-py3.6.egg\pyemu\pst\pst_handler.py:1586: FutureWarning: 'name' is both a column name and an index level.
Defaulting to column but this will raise an ambiguity error in a future version
C:\Python27\Anaconda\envs\py36\lib\site-packages\pyemu-0.4-py3.6.egg\pyemu\pst\pst_handler.py:1587: FutureWarning: 'obsnme' is both a column name and an index level.
Defaulting to column but this will raise an ambiguity error in a future version

setup weights inversely proportional to the observation values


In [23]:
p.adjust_weights_resfile()
print(p.phi, p.nnz_obs, p.phi_components)
p.proportional_weights(fraction_stdev=0.1,wmax=20.0)
print(p.phi, p.nnz_obs, p.phi_components)


36.0 36 {'conc': 14.999999999999998, 'head': 21.0}
222.96996562151 36 {'conc': 194.30909581453392, 'head': 28.66086980697609}