Root finding

There are a number of papers, seemingly following on from Xiao, Dunford and Capel, on modelling the $I, V$ (current versus voltage) curve of a PV cell. A typical model is

\begin{equation} I = I_{\text{ph}} - I_0 \left[ \exp \left( \frac{V + I \times R_s}{a} - 1 \right) - 1 \right] - \frac{V + I \times R_s}{R_p}. \end{equation}

The parameters $I_{\text{ph}}, I_0, a, R_s, R_p$ all need to be determined. To do this we'll need five pairs of $I, V$ values, determined by experiment.

Even given those values the equation still can't be solved algebraically: it's nonlinear. Finding values that solve this equation is the realm of nonlinear root finding, and is crucial for a range of numerical methods.

One-dimensional case

Let's assume we know the values of four of the parameters,

\begin{equation} I_{\text{ph}} = 3.5, \quad I_0 = 3.2, \quad a = 20, \quad R_p = 300. \end{equation}

We'll also assume that a voltage $V = 10$ gives a current $I = 3$. So we have to solve the equation

\begin{align} f(R_s) &= I_{\text{ph}} - I_0 \left[ \exp \left( \frac{V + I \times R_s}{a} - 1 \right) - 1 \right] - \frac{V + I \times R_s}{R_p} - I \\ &= 3.5 - 3.2 \left[ \exp \left( \frac{10 + 3 R_s}{20} - 1 \right) - 1 \right] - \frac{10 + 3 R_s}{300} - 3\\ &= 0. \end{align}

Where this equation is satisfied we have the value of $R_s$.

Let's check that there is a solution, by plotting values of $f$ for different possible values of $R_s$:


In [ ]:
from __future__ import division
import numpy
from matplotlib import pyplot
%matplotlib notebook

In [ ]:
def f(Rs):
    # Residual of the model at V=10, I=3, as a function of Rs: the root gives Rs.
    return 3.5 - 3.2*(numpy.exp((10.0 + 3.0*Rs)/20.0 - 1.0) - 1.0) - (10.0 + 3.0*Rs)/300.0 - 3.0

Rs = numpy.linspace(0, 10)
pyplot.figure(figsize=(10,6))
pyplot.plot(Rs, f(Rs))
pyplot.xlabel(r"$R_s$")
pyplot.ylabel(r"$f$")
pyplot.show()

We see that there is a solution at around $R_s = 4$. We want to find it precisely.

Bisection

It's clear from the figure that there's a solution somewhere in this interval: the curve is continuous, and takes different signs at the two ends.


In [ ]:
print("When Rs=0, f(Rs)={}".format(f(0)))
print("When Rs=10, f(Rs)={}".format(f(10)))

We could pick the point in the middle, where $R_s=5$, and check its value:


In [ ]:
print("When Rs=5, f(Rs)={}".format(f(5)))

So the sign of $f$ changes between $R_s = 0$ and $R_s = 5$, which means the root lies in this half of the interval:


In [ ]:
pyplot.figure(figsize=(10,6))
pyplot.plot(Rs, f(Rs))
pyplot.xlim(0, 5)
pyplot.ylim(-0.5, 2)
pyplot.xlabel(r"$R_s$")
pyplot.ylabel(r"$f$")
pyplot.show()

This suggests an algorithm - the bisection algorithm.

  1. Pick $R_s^{\text{L}}$ and $R_s^{\text{R}}$ such that $f(R_s^{\text{L}}) \times f(R_s^{\text{R}}) < 0$.
  2. Compute the midpoint $R_s^{\text{M}} = \tfrac{1}{2}(R_s^{\text{L}} + R_s^{\text{R}})$.
  3. Compute the function $f(R_s^{\text{M}})$.
    1. If $f(R_s^{\text{L}}) \times f(R_s^{\text{M}}) < 0$ then the root is between the left point and the midpoint. So set $R_s^{\text{R}} = R_s^{\text{M}}$.
    2. Otherwise the root is between the midpoint and the right point. So set $R_s^{\text{L}} = R_s^{\text{M}}$.
  4. Repeat from 2 until the value of $f(R_s^{\text{M}})$ is sufficiently small.

Let's implement that, requiring that $| f(R_s^{\text{M}}) | < 10^{-12}$.


In [ ]:
RsL = 0
RsR = 10
fL = f(RsL)
fR = f(RsR)
max_iterations = 100
tolerance = 1e-12
fM = 1
iteration = 0
while (abs(fM) > tolerance) and (iteration < max_iterations):
    iteration += 1
    RsM = 0.5*(RsL + RsR)
    fM = f(RsM)
    if fL*fM < 0:
        RsR = RsM
        fR = fM
    else:
        RsL = RsM
        fL = fM
print("Determined Rs={}, with f(Rs)={}, after {} iterations".format(RsM, fM, iteration))

We can write this as a function:


In [ ]:
def bisection(f, RsL, RsR, tolerance = 1e-12, max_iterations = 100):
    fL = f(RsL)
    fR = f(RsR)
    assert fL*fR < 0, "Require initial interval to be such that f(RsL)*f(RsR) < 0!"
    fM = 1
    iteration = 0
    while (abs(fM) > tolerance) and (iteration < max_iterations):
        iteration += 1
        RsM = 0.5*(RsL + RsR)
        fM = f(RsM)
        if fL*fM < 0:
            RsR = RsM
            fR = fM
        else:
            RsL = RsM
            fL = fM
    return RsM, fM, iteration

RsM, fM, iteration = bisection(f, 0, 10)
print("Determined Rs={}, with f(Rs)={}, after {} iterations".format(RsM, fM, iteration))

Newton's method

Bisection is safe - given a valid initial bracket, it will always converge. However, it has two problems. The first is that it's slow, requiring tens or hundreds of function evaluations (which could be expensive) to find the root. The second is that it's very difficult to generalize to multiple dimensions.

Instead we use Newton's method. This starts from a guess for $R_s$, $R_s^{(0)}$. It then updates this guess, computing

\begin{equation} R_s^{(k+1)} = R_s^{(k)} - \frac{f \left( R_s^{(k)} \right)}{f' \left( R_s^{(k)} \right)}. \end{equation}

This computes the tangent to the curve $f(R_s)$ at the guess $R_s^{(k)}$, then moves along a straight line with this slope until it intersects the horizontal axis: that intersection becomes the new guess.
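
Equivalently, the update follows from approximating $f$ by its tangent at the current guess and asking where that approximation vanishes:

\begin{equation} 0 = f \left( R_s^{(k)} \right) + f' \left( R_s^{(k)} \right) \left( R_s^{(k+1)} - R_s^{(k)} \right) \quad \implies \quad R_s^{(k+1)} = R_s^{(k)} - \frac{f \left( R_s^{(k)} \right)}{f' \left( R_s^{(k)} \right)}. \end{equation}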

Newton's method is much faster. Its disadvantages are that it isn't safe - it's not guaranteed to converge - and there's the additional effort of computing the derivative.

Let's check how fast it is, starting from a guess of 5:


In [ ]:
def df(Rs):
    # Derivative of f with respect to Rs.
    return -3.2*3.0/20.0*numpy.exp((10.0 + 3.0*Rs)/20.0 - 1.0) - 3.0/300.0
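
As a quick sanity check (this cell is not in the original notebook; the step size $10^{-6}$ and the trial point $R_s = 5$ are arbitrary choices), we can compare the analytic derivative against a central finite difference; the two should agree closely:


In [ ]:
# Illustrative sanity check: compare the analytic derivative with a
# central finite difference approximation at a trial point.
h = 1e-6
Rs_test = 5.0
fd_approx = (f(Rs_test + h) - f(Rs_test - h)) / (2.0*h)
print("Analytic df={}, finite difference={}".format(df(Rs_test), fd_approx))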

In [ ]:
Rs_guess = 5
f_guess = f(Rs_guess)
iteration = 0
while (abs(f_guess) > tolerance) and (iteration < max_iterations):
    iteration += 1
    df_guess = df(Rs_guess)
    Rs_guess = Rs_guess - f_guess / df_guess
    f_guess = f(Rs_guess)
print("Determined Rs={}, with f(Rs)={}, after {} iterations".format(Rs_guess, f_guess, iteration))

The huge improvement in speed is extremely important when solving complex problems. However, the choice of initial guess is crucial to the method converging to the right answer (or at all!).

Again, we can rewrite this as a function:


In [ ]:
def newton(f, df, Rs_guess, tolerance=1e-12, max_iterations=100):
    f_guess = f(Rs_guess)
    iteration = 0
    while (abs(f_guess) > tolerance) and (iteration < max_iterations):
        iteration += 1
        df_guess = df(Rs_guess)
        Rs_guess = Rs_guess - f_guess / df_guess
        f_guess = f(Rs_guess)
    return Rs_guess, f_guess, iteration

Rs_value, f_value, iteration = newton(f, df, 5)
print("Determined Rs={}, with f(Rs)={}, after {} iterations".format(Rs_value, f_value, iteration))

Multi-dimensional case

In the more general case we want to solve for all the parameters. We're going to assume that the previous calculation has given us $R_s = 4.165$, and that $R_p = 300$ is also known, so we only need to find $I_{\text{ph}}, I_0, a$. This means we'll need three pieces of information: three experimental measurements $(I^{[k]}, V^{[k]})$. We'll assume that they are

\begin{align} V^{[1]} &= 0, & I^{[1]} & = 3.97, \\ V^{[2]} &= 10, & I^{[2]} & = 3.01, \\ V^{[3]} &= 20, & I^{[3]} & = 1.88. \end{align}

This means we can write down three equations, one for each measurement. We'll write them as a vector equation

\begin{equation} {\bf f} \left( {\bf P} \right) = {\bf 0} \end{equation}

where ${\bf P} = \left( I_{\text{ph}}, I_0, a \right)$ are the parameters to be determined. The components of the vector equation each have the form

\begin{equation} f_k \left( {\bf P} \right) = I_{\text{ph}} - I_0 \left[ \exp \left( \frac{V^{[k]} + I^{[k]} \times R_s}{a} - 1 \right) - 1 \right] - \frac{V^{[k]} + I^{[k]} \times R_s}{R_p} - I^{[k]}. \end{equation}

We can write this down as a single function:


In [ ]:
parameters = {
    "Vk" : [0, 10, 20],
    "Ik" : [3.97, 3.01, 1.88],
    "Rs" : 4.165,
    "Rp" : 300
    }

def f_vector(p, parameters):
    # Residuals of the PV model for each measurement (Vk[k], Ik[k]),
    # as a function of the unknown parameters p = (Iph, I0, a).
    Vk = parameters["Vk"]
    Ik = parameters["Ik"]
    Rs = parameters["Rs"]
    Rp = parameters["Rp"]
    Iph, I0, a = p
    f = numpy.zeros_like(p)
    for k in range(len(Vk)):
        f[k] = Iph - I0*(numpy.exp((Vk[k] + Ik[k]*Rs)/a - 1.0) - 1.0) - (Vk[k] + Ik[k]*Rs)/Rp - Ik[k]
    return f
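
As a quick usage sketch (this cell is not in the original notebook), we can evaluate the residual vector at a trial parameter set, here the values used in the one-dimensional section:


In [ ]:
# Illustrative call: residuals at a trial parameter vector p = (Iph, I0, a).
p_trial = numpy.array([3.5, 3.2, 20.0])
print("Residuals f(p)={}".format(f_vector(p_trial, parameters)))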

We now need Newton's method applied to a vector function. The generalization of the derivative of a scalar function of a scalar to a vector function of a vector is the Jacobian matrix

\begin{equation} \frac{\partial {\bf f}}{\partial {\bf p}} = J = \begin{pmatrix} \frac{\partial f_1}{\partial p_1} & \dots & \frac{\partial f_1}{\partial p_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial p_1} & \dots & \frac{\partial f_n}{\partial p_n} \end{pmatrix}. \end{equation}

We roughly think of Newton's method as going from

\begin{equation} p^{(k+1)} = p^{(k)} - \frac{f \left( p^{(k)} \right)}{f' \left( p^{(k)} \right)} \end{equation}

to

\begin{equation} {\bf p}^{(k+1)} = {\bf p}^{(k)} - \frac{{\bf f} \left( {\bf p}^{(k)} \right)}{J \left( {\bf p}^{(k)} \right)}. \end{equation}

However, this equation makes no sense: dividing a vector by a matrix doesn't mean anything. What we really want to do is solve the "equivalent" linear system

\begin{equation} J \left( {\bf p}^{(k)} \right) \cdot \left( {\bf p}^{(k+1)} - {\bf p}^{(k)} \right) = -{\bf f} \left( {\bf p}^{(k)} \right). \end{equation}

By defining the vector ${\bf c}^{(k)} = {\bf p}^{(k+1)} - {\bf p}^{(k)}$, Newton's method becomes

\begin{align} J \cdot {\bf c}^{(k)} & = -{\bf f}^{(k)}, \\ {\bf p}^{(k+1)} &= {\bf p}^{(k)} + {\bf c}^{(k)}. \end{align}

Digression

The equation $J \cdot {\bf c}^{(k)} = -{\bf f}^{(k)}$ is an example of a linear system: a known matrix multiplies an unknown vector to match the known vector on the right hand side. The linear system problem is a fundamental building block of numerical methods, and there are lots of algorithms to solve it. We'll use the simple numpy.linalg.solve here: be aware that when the matrix gets large (bigger than, say, $1000 \times 1000$), much better algorithms will be required.

Of course, analytically we would solve this by inverting the matrix. In numerical methods you should never invert the matrix: it's expensive and prone to nasty problems. In nearly all cases, you actually want to solve a linear system.
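
As a toy illustration (the numbers here are made up and unrelated to the PV problem), numpy.linalg.solve finds the unknown vector in $A \, {\bf c} = {\bf b}$ directly, without ever forming $A^{-1}$:


In [ ]:
# Toy linear system (illustrative numbers only): solve A c = b directly,
# rather than computing the inverse of A.
A = numpy.array([[3.0, 1.0], [1.0, 2.0]])
b = numpy.array([9.0, 8.0])
c = numpy.linalg.solve(A, b)
print("Solution c={}".format(c))
print("Residual A.dot(c) - b = {}".format(A.dot(c) - b))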

Implementation

In this case all components of our vector function have the same form, so computing the Jacobian matrix is straightforward, if tedious:

\begin{align} \frac{\partial f_k}{\partial p_1} & = \frac{\partial f_k}{\partial I_{\text{ph}}} \\ &= 1, \\ \frac{\partial f_k}{\partial p_2} & = \frac{\partial f_k}{\partial I_0} \\ &= 1 - \exp \left( \frac{V^{[k]} + I^{[k]} \times R_s}{a} - 1 \right), \\ \frac{\partial f_k}{\partial p_3} & = \frac{\partial f_k}{\partial a} \\ &= I_0 \frac{V^{[k]} + I^{[k]} \times R_s}{a^2} \exp \left( \frac{V^{[k]} + I^{[k]} \times R_s}{a} - 1 \right). \end{align}

In [ ]:
def jacobian(p, parameters):
    # Jacobian of f_vector: row k holds the partial derivatives of f_k
    # with respect to the parameters (Iph, I0, a).
    Vk = parameters["Vk"]
    Ik = parameters["Ik"]
    Rs = parameters["Rs"]
    Rp = parameters["Rp"]
    Iph, I0, a = p
    J = numpy.zeros((len(p), len(p)))
    for k in range(len(Vk)):
        J[k,0] = 1.0
        J[k,1] = 1.0 - numpy.exp((Vk[k] + Ik[k]*Rs)/a - 1.0)
        J[k,2] = I0*(Vk[k] + Ik[k]*Rs)/a**2*numpy.exp((Vk[k] + Ik[k]*Rs)/a - 1.0)
    return J

And now we can try Newton's method. We'll start from an initial guess given by the parameter values used in the one-dimensional section ($I_{\text{ph}} = 3.5$, $I_0 = 3.2$, $a = 20$). We'll check for convergence by measuring the norm of ${\bf f}$ using numpy.linalg.norm.


In [ ]:
from numpy.linalg import norm, solve
p_guess = numpy.array([3.5, 3.2, 20])
f_guess = f_vector(p_guess, parameters)
iteration = 0
while (norm(f_guess) > tolerance) and (iteration < max_iterations):
    iteration += 1
    J_guess = jacobian(p_guess, parameters)
    c_guess = solve(J_guess, -f_guess)
    p_guess = p_guess + c_guess
    f_guess = f_vector(p_guess, parameters)
print("Determined parameters={}, with |f(p)|={}, after {} iterations".format(p_guess, norm(f_guess), iteration))

Again it converges very rapidly, and again it's very sensitive to the initial guess.

Black box methods

In Python, the standard solvers are in the scipy.optimize library. The standard solver for one-dimensional problems is brentq, which is safe (like bisection) and converges rapidly (although not quite as fast as Newton's method). For multi-dimensional problems the root function is the standard. The multi-dimensional solvers can be accelerated by providing the Jacobian. All are strongly dependent on the initial guess, or interval.
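
For example, here is a minimal sketch using the functions defined above (it assumes scipy is installed; the bracket $[0, 10]$ and the initial guess are the ones used earlier in this notebook):


In [ ]:
# Sketch: black-box solvers from scipy.optimize applied to the problems above.
from scipy.optimize import brentq, root

# One-dimensional problem: brentq needs a bracketing interval.
Rs_root = brentq(f, 0, 10)
print("brentq gives Rs={}".format(Rs_root))

# Multi-dimensional problem: root takes an initial guess and, optionally,
# the analytic Jacobian.
solution = root(f_vector, numpy.array([3.5, 3.2, 20.0]), args=(parameters,), jac=jacobian)
print("root gives parameters={} (success: {})".format(solution.x, solution.success))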