System of non-linear equations - Coupled chemical equilibria

Author: Björn Dahlgren, Applied Physcial Chemistry, KTH Royal Insitiute of Technology

In this example we will study the equilibria between aqueous cupric ions and ammonia. We will use ChemPy which is a Python package collecting functions and classes useful for chemistry related problems. We will also make use of SymPy for manipulating and inspecting the formulae encountered.


In [ ]:
from collections import defaultdict
from chempy import atomic_number
from chempy.chemistry import Species, Equilibrium
from chempy.equilibria import EqSystem, NumSysLin, NumSysLog, NumSysSquare
from IPython.display import Latex, display
import matplotlib.pyplot as plt
%matplotlib inline
def show(s):  # convenience function
    display(Latex('$'+s+'$'))
import sympy; import chempy; print('SymPy: %s, ChemPy: %s' % (sympy.__version__, chempy.__version__))

Let's define our species with names and composition, ChemPy can parse chemical formulae:


In [ ]:
NH3_complexes = ['CuNH3+2', 'Cu(NH3)2+2', 'Cu(NH3)3+2', 'Cu(NH3)4+2', 'Cu(NH3)5+2']
OH_complexes = ['Cu2(OH)2+2', 'Cu(OH)3-', 'Cu(OH)4-2']
substances = [
    Species.from_formula(n) for n in ['H+', 'OH-', 'NH4+', 'NH3', 'H2O', 'Cu+2'] + 
        NH3_complexes + OH_complexes + ['Cu(OH)2(s)']
] #, CuOHp, CuOH2,
substance_names = [s.name for s in substances]

Let's see how the species are pretty-printed:


In [ ]:
show(', '.join([s.latex_name for s in substances]))

Let's define some initial concentrations. We will consider different amount of added ammonia in 10 mM solutions of $Cu^{2+}$:


In [ ]:
init_conc = defaultdict(float, {'H+': 1e-7, 'OH-': 1e-7, 'NH4+': 0,
                                'NH3': 1.0, 'Cu+2': 1e-2, 'H2O': 55.5})

Now, let us define the equilibria, data are from course material at Applied Physcial Chemistry, KTH Royal Insitiute of Technology.


In [ ]:
H2O_c = init_conc['H2O']
w_autop = Equilibrium({'H2O': 1}, {'H+': 1, 'OH-': 1}, 10**-14/H2O_c)
NH4p_pr = Equilibrium({'NH4+': 1}, {'H+': 1, 'NH3': 1}, 10**-9.26)
CuOH2_s = Equilibrium({'Cu(OH)2(s)': 1}, {'Cu+2': 1, 'OH-': 2}, 10**-18.8)
CuOH_B3 = Equilibrium({'Cu(OH)2(s)': 1, 'OH-': 1}, {'Cu(OH)3-': 1}, 10**-3.6)
CuOH_B4 = Equilibrium({'Cu(OH)2(s)': 1, 'OH-': 2}, {'Cu(OH)4-2': 1}, 10**-2.7)
Cu2OH2 = Equilibrium({'Cu+2': 2, 'H2O': 2}, {'Cu2(OH)2+2': 1, 'H+': 2}, 10**-10.6 / H2O_c**2)
CuNH3_B1 = Equilibrium({'CuNH3+2': 1}, {'Cu+2': 1, 'NH3': 1}, 10**-4.3)
CuNH3_B2 = Equilibrium({'Cu(NH3)2+2': 1}, {'Cu+2': 1, 'NH3': 2}, 10**-7.9)
CuNH3_B3 = Equilibrium({'Cu(NH3)3+2': 1}, {'Cu+2': 1, 'NH3': 3}, 10**-10.8)
CuNH3_B4 = Equilibrium({'Cu(NH3)4+2': 1}, {'Cu+2': 1, 'NH3': 4}, 10**-13.0)
CuNH3_B5 = Equilibrium({'Cu(NH3)5+2': 1}, {'Cu+2': 1, 'NH3': 5}, 10**-12.4)
equilibria = w_autop, NH4p_pr, CuNH3_B1, CuNH3_B2, CuNH3_B3, CuNH3_B4, CuNH3_B5, Cu2OH2, CuOH2_s, CuOH_B3, CuOH_B4

Let's see if we can print equilibria in a human-readable form:


In [ ]:
show(', '.join([s.latex_name for s in substances]))
show('~')
from math import log10
for eq in equilibria:
    ltx = eq.latex(dict(zip(substance_names, substances)))
    show(ltx + '~'*(80-len(ltx)) + 'lgK = {0:12.5g}'.format(log10(eq.param)))  # latex table would be better...

To keep our numerical treatment as simple as possible we will try to avoid representing $Cu(OH)_2(s)$ explicitly (which is present in the three last equilibria). This is becuase the system of equations change when precipitation sets in.

However, we do want to keep the two last equilibria, therefore we rewrite those using the dissolution equilibria them only using dissolved species:


In [ ]:
new_eqs = CuOH2_s - CuOH_B3, CuOH2_s - CuOH_B4
[str(_) for _ in new_eqs]

Now it's time to exclude the precipitate species and replace the last three equilibria with our two new ones:


In [ ]:
#skip_subs, skip_eq = (4, 4) # (0, 0), (1, 1), (3, 3), (4, 4), (11, 9)
skip_subs, skip_eq = (1, 3)
simpl_subs = substances[:-skip_subs]
simpl_eq = equilibria[:-skip_eq] + new_eqs
simpl_c0 = {k.name: init_conc[k.name] for k in simpl_subs}

From the law of mass action we can from the equilbria and from the preservation of mass and charge formulate a non-linear system of equations:


In [ ]:
import sympy as sp
import numpy as np
sp.init_printing()
eqsys = EqSystem(simpl_eq, simpl_subs)
x, i, Ks = sp.symarray('x', eqsys.ns), sp.symarray('i', eqsys.ns), sp.symarray('K', eqsys.nr)
params = np.concatenate((i, Ks))
numsys_lin = NumSysLin(eqsys, backend=sp)
numsys_lin.f(x, params)

It turns out that the success of the numerical root finding process for above system of equations is terribly sensitive on the choice of the initial guess. We therefore reformulate the equations in terms of the logarithm of the concentrations:


In [ ]:
numsys_log = NumSysLog(eqsys, backend=sp)
f = numsys_log.f(x, params)
f

We can take a peek on the jacobian of this vector:


In [ ]:
sp.Matrix(1, len(f), lambda _, q: f[q]).jacobian(x)

The preservation equations of mass and charge actually contain a redundant equation, so currently our system is over-determined:


In [ ]:
len(f), eqsys.ns

We could cast the preservation equations into reduced row echelon form (which would remove one equation), but for now we'll leave this be and rely on the Levenberg-Marquardt algorithm to solve our problem in a least-squares sense. (Levenberg-Marquardt uses QR-decomposition internally for which it is acceptable to have overdetermined systems).

Let's solve the equations for our inital concentrations:


In [ ]:
C, sol, sane = eqsys.root(simpl_c0, NumSys=NumSysLog)
assert sol['success'] and sane
C

Great, let's now vary the initial concentration of $NH_3$ and plot the equilibrium concentrations of our species:


In [ ]:
import numpy as np
plt.figure(figsize=(20,8))
NH3_varied = np.logspace(-4, 0, 200)
Cout_logC, extra, success = eqsys.roots(
    simpl_c0, NH3_varied, 'NH3', NumSys=NumSysLog, plot_kwargs={'latex_names': True})
all(success), sum([nfo['nfev'] for nfo in extra['info']]), sum([nfo['njev'] for nfo in extra['info']])
_ = plt.gca().set_ylim([1e-10, 60])

But the above diagram is only true if we are below the solubility limit of our neglected $\rm Cu(OH)_2(s)$.

Let's plot the solubility product in the same diagram:


In [ ]:
sol_prod = Cout_logC[:, eqsys.as_substance_index('Cu+2')]*Cout_logC[:, eqsys.as_substance_index('OH-')]**2
plt.figure(figsize=(20,6))
plt.loglog(NH3_varied, sol_prod, label='[$%s$][$%s$]$^2$' % (eqsys.substances['Cu+2'].latex_name,
                                                             eqsys.substances['OH-'].latex_name))
plt.loglog(NH3_varied, Cout_logC[:, eqsys.as_substance_index('H+')], ls=':',
           label='[$%s$]' % eqsys.substances['H+'].latex_name)
plt.loglog([NH3_varied[0], NH3_varied[-1]], [10**-18.8, 10**-18.8], 'k--', label='$K_{sp}(Cu(OH)_2(s))$')
plt.xlabel('[$NH_3$]')
_ = plt.legend()

We see that for a ammonia concentration exceeding ~500-600 mM we would not precipitate $Cu(OH)_2(s)$ even though our pH is quite high (almost 12).

We have solved the above system of equations for the logarithm of the concentrations. How big are our absolute and relative errors compared to the linear system? Let's plot them:


In [ ]:
fig, axes = plt.subplots(1, 2, figsize=(20, 8))
eqsys.plot_errors(Cout_logC, simpl_c0, NH3_varied, 'NH3', axes=axes)

Not bad. So the problem is essentially solved. But let's say that it is very important to know the exact position of the intersection for the solutbility limit. We can locate it using the secant method:


In [ ]:
from scipy.optimize import newton
convergence = []
def sol_lim(c_NH3):
    c0 = simpl_c0.copy()
    c0['NH3'] = c_NH3
    C, sol, sane = eqsys.root(c0, NumSys=NumSysLog)
    assert sol['success'] and sane
    prod = C[eqsys.as_substance_index('Cu+2')]*C[eqsys.as_substance_index('OH-')]**2
    discrepancy = prod/10**-18.8 - 1
    convergence.append(discrepancy)
    return discrepancy
Climit_NH3 = newton(sol_lim, 0.5)
convergence = np.array(convergence).reshape((len(convergence), 1))
print(convergence)
Climit_NH3

For fun, let's see what the equation system looks like if we canonicalize it by transforming the equations for equibliria and the equations for the preservation relations to their respective reduced row echelon form:


In [ ]:
numsys_log_rref = NumSysLog(eqsys, rref_equil=True, rref_preserv=True, backend=sp)
rf = numsys_log_rref.f(x, params)
rf

So the Jacobian should be considerably more diagonally dominant now:


In [ ]:
sp.Matrix(1, len(rf), lambda _, q: rf[q]).jacobian(x)

And let's see if this system converges as well:


In [ ]:
plt.figure(figsize=(20,8))
out = eqsys.roots(simpl_c0, NH3_varied, 'NH3', rref_equil=True,
                  rref_preserv=True, NumSys=NumSysLog, plot_kwargs={'latex_names': True})
_ = plt.gca().set_ylim([1e-10, 60])

Sinvce version 0.2.0 of chempy there is support for automatic reformulation of the system of equations when precipitation occurs:


In [ ]:
full_eqsys = EqSystem(equilibria[:-2] + new_eqs, substances)
full_numsys_log_rref = NumSysLog(full_eqsys, rref_equil=False, rref_preserv=False, precipitates=[False], backend=sp)
full_x, full_i, full_Ks = sp.symarray('x', full_eqsys.ns), sp.symarray('i', full_eqsys.ns), sp.symarray('K', full_eqsys.nr)
full_rf = full_numsys_log_rref.f(full_x, np.concatenate((full_i, full_Ks)))
full_rf

In [ ]:
def solve_and_plot_full(NumSys, plot_kwargs={'latex_names': True}, **kwargs):
    plt.figure(figsize=(18, 7))
    result = full_eqsys.roots(init_conc, NH3_varied, 'NH3', NumSys=NumSys, plot_kwargs=plot_kwargs or {}, **kwargs)
    
    try:
        cur_val = None
        for idx, condition in enumerate([nfo['intermediate_info'][0]['conditions'] for nfo in result[1]['info']]):
            any_precip = condition != (False,)*len(condition)
            if cur_val is None:
                if any_precip:
                    onset = idx
            elif cur_val == any_precip:
                pass
            else:
                if any_precip:
                    onset = idx
                else:
                    plt.axvspan(NH3_varied[onset], NH3_varied[idx], facecolor='gray', alpha=0.1)
                    onset = None
            cur_val = any_precip
        if onset is not None:
            plt.axvspan(NH3_varied[onset], NH3_varied[-1], facecolor='gray', alpha=0.1)
    except KeyError:
        pass
    return result

In [ ]:
xout_log, sols_log, sane_log = solve_and_plot_full(NumSysLog, solver='scipy', tol=1e-10, conditional_maxiter=30,
                                                   rref_equil=True, rref_preserv=True, method='lm')
plt.gca().set_ylim([1e-10, 60])
plt.gca().set_xscale('log')
plt.gca().set_yscale('log')

We see that the numerical solution is not perfect (it could probably be improved by scaling the components individually). But the principle is clear: we can solve the solve non-linear system of equations using this method.

Let's see if we can gain some more insight here:


In [ ]:
def sum_species(x, species, substance_names, weights=None):
    accum = np.zeros(x.shape[0])
    if weights is None:
        weights = [1]*len(substance_names)
    for idx in map(substance_names.index, species):
        accum += x[:, idx]*weights[idx]
    return accum

def plot_groups(varied, x):
    substance_names = list(full_eqsys.substances.keys())
    weights = [s.composition.get(atomic_number('Cu'), ) for s in full_eqsys.substances.values()]
    amines = sum_species(x, NH3_complexes, substance_names, weights)
    hydroxides = sum_species(x, OH_complexes, substance_names, weights)
    free = sum_species(x, ['Cu+2'], substance_names, weights)
    precip = sum_species(x, ['Cu(OH)2(s)'], substance_names, weights)
    
    tot = amines + hydroxides + free + precip

    plt.figure(figsize=(13.7, 7))
    plt.plot(NH3_varied, tot, label='tot')
    plt.plot(NH3_varied, amines, label='amines')
    plt.plot(NH3_varied, hydroxides, label='hydroxides')
    plt.plot(NH3_varied, free, label='free')
    plt.plot(NH3_varied, precip, label='precip')
    plt.legend(loc='best')
    plt.gca().set_xscale('log')

plot_groups(NH3_varied, xout_log)

Without any precipitates (we force the system to not precipitate any solids, applicable for short timescales for example):


In [ ]:
xout_static, sols_static, sane_static = solve_and_plot_full(
    NumSysLog, solver='scipy', tol=1e-12, # , NumSysSquare
    neqsys_type='static_conditions',
    rref_equil=True, rref_preserv=True,
    precipitates=(False,), method='lm', plot_kwargs=None
)
plt.gca().set_xscale('log')
plt.gca().set_yscale('log')
plt.gca().set_ylim([1e-10, 60])

plot_groups(NH3_varied, xout_static)