Automatic Differentiation

This example demonstrates automatic differentiation using both an operator overloading method and a source code transformation method. The function we will use is reasonably complex: it is a routine from my blade element momentum code (available here). The full code is much more involved, but we will focus on just one function to keep things simpler. The actual routine is written in Fortran, but I've converted it to Python for this demonstration. The function has several inputs and three outputs. For simplicity we will focus on just the derivatives of the first output, although all are available.


In [1]:
from math import pi
import numpy as np
from math import sin, cos, acos, exp, sqrt


def inductionFactors(r, chord, Rhub, Rtip, phi, cl, cd, B,
        Vx, Vy, useCd, hubLoss, tipLoss, wakerotation):
    """Computes induction factors and residual error at a given location
    on the blade.  Full details on inputs/outputs omitted here."""

    sigma_p = B/2.0/pi*chord/r
    sphi = sin(phi)
    cphi = cos(phi)

    # resolve into normal and tangential forces
    if not useCd:
        cn = cl*cphi
        ct = cl*sphi
    else:
        cn = cl*cphi + cd*sphi
        ct = cl*sphi - cd*cphi

    # Prandtl's tip and hub loss factor
    Ftip = 1.0
    if tipLoss:
        factortip = B/2.0*(Rtip - r)/(r*abs(sphi))
        Ftip = 2.0/pi*acos(exp(-factortip))

    Fhub = 1.0
    if hubLoss:
        factorhub = B/2.0*(r - Rhub)/(Rhub*abs(sphi))
        Fhub = 2.0/pi*acos(exp(-factorhub))

    F = Ftip * Fhub

    # bem parameters
    k = sigma_p*cn/4.0/F/sphi/sphi
    kp = sigma_p*ct/4.0/F/sphi/cphi

    # compute axial induction factor
    if phi > 0.0:  # momentum/empirical

        # update axial induction factor
        if k <= 2.0/3.0:  # momentum state
            a = k/(1+k)

        else:  # Glauert(Buhl) correction

            g1 = 2.0*F*k - (10.0/9-F)
            g2 = 2.0*F*k - (4.0/3-F)*F
            g3 = 2.0*F*k - (25.0/9-2*F)

            if abs(g3) < 1e-6:  # avoid singularity
                a = 1.0 - 1.0/2.0/sqrt(g2)
            else:
                a = (g1 - sqrt(g2)) / g3


    else:  # propeller brake region (a and ap not directly used but update anyway)

        if k > 1.0:
            a = k/(k-1.0)
        else:
            a = 0.0  # dummy value

    # compute tangential induction factor
    ap = kp/(1.0-kp)

    if not wakerotation:
        ap = 0.0
        kp = 0.0

    # error function
    lambda_r = Vy/Vx
    if phi > 0:  # momentum/empirical
        fzero = sphi/(1.0-a) - cphi/lambda_r*(1.0-kp)
    else:  # propeller brake region
        fzero = sphi*(1.0-k) - cphi/lambda_r*(1.0-kp)

    return fzero, a, ap

Just for convenience, we are going to wrap this function. Our wrapper will take two inputs: x and params. The x vector will contain all the variables that we want to take derivatives with respect to. The params are parameters that do not change during a simulation, so we don't need derivatives with respect to them (these are things like B, the number of blades, and various boolean options like useCd).


In [2]:
# wrap function
def function(x, params):
    
    # unpack variables
    r, chord, Rhub, Rtip, phi, cl, cd, Vx, Vy = x
    B, useCd, hubLoss, tipLoss, wakerotation = params

    # call the original function
    return inductionFactors(r, chord, Rhub, Rtip, phi, cl, cd, B,
        Vx, Vy, useCd, hubLoss, tipLoss, wakerotation)

Now we set up some values for the inputs and parameters and put them into a variable array and a parameter array. We also save n, the number of variables in x.


In [3]:
# setup inputs
r = 0.5
chord = 0.1
Rhub = 0.1
Rtip = 1.0
phi = 0.2
cl = 0.3
cd = 0.002
B = 3
Vx = 1.0
Vy = 5.0
useCd = True
hubLoss = True
tipLoss = True
wakerotation = True

x = np.array([r, chord, Rhub, Rtip, phi, cl, cd, Vx, Vy])
params = np.array([B, useCd, hubLoss, tipLoss, wakerotation])

n = len(x)

First, let's find the derivatives of the first output (fzero) with respect to x using finite differencing. We already know how to do this.


In [4]:
# ------ finite difference --------
output, a, ap = function(x, params)  # we are ignoring the other outputs, although we could easily get their derivatives as well

g_fd = np.zeros(n)  # initialize gradient vector for finite difference

for i in range(n):  # iterate across all vars

    # step size
    step = 1e-6*x[i]  

    # take a step
    xplus = np.copy(x)  
    xplus[i] += step

    output_plus, a, ap = function(xplus, params)

    g_fd[i] = (output_plus - output) / step

Now let's compute exact derivatives using automatic differentiation (AD). We will use the algopy module in Python, which is an operator overloading method. If you are using Matlab there aren't any AD methods built in, but you can find some third-party tools.

To use algopy properly we need to import overloaded versions of the functions we are using (sin, cos, etc.). These overloaded versions keep track of partial derivatives through the chain rule.


In [5]:
# You can ignore this.  I'm just suppressing a printed FutureWarning about a change to something internal to algopy
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

In [6]:
from algopy import UTPM  # just the name of the algorithm (stands for univariate Taylor propagation of matrices)
from algopy import sin, cos, exp, sqrt  # overloaded versions of functions we use
from algopy import arccos as acos  # need to rename because we've been using the math name (acos), whereas numpy/algopy use arccos

# create an algopy version of x
x_algopy = UTPM.init_jacobian(x)

# create an algopy version of outputs
output, a, ap = function(x_algopy, params)

# extract the gradients
g_ad_oo = UTPM.extract_jacobian(output)  # could call again for the other outputs

That's it! We now have numerically exact derivatives of our output with respect to all inputs in x. (We will show accuracy comparisons at the end.)
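If we also wanted the derivatives of the other two outputs, we could call extract_jacobian on them as well; a minimal sketch using the objects computed above:

# gradients of the other outputs w.r.t. x (same pattern as for fzero)
da_dx_oo = UTPM.extract_jacobian(a)
dap_dx_oo = UTPM.extract_jacobian(ap)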

algopy is pretty easy to use, but I rarely use it myself for two main reasons. 1) It's very slow. Operator overloading methods are already slow, and pure Python itself is slow, so for large problems there is a significant slowdown. 2) algopy can handle most things in numpy, but it's not as versatile and powerful as some other tools. The tool I use most frequently for AD is Tapenade, which works on Fortran and C code. This works out well because Python can call Fortran and C code fairly easily. I use Tapenade because 1) Fortran is fast and callable from Python, so we can move computational bottlenecks to Fortran and still have an easy-to-use wrapper in Python; Tapenade also uses a source code transformation method, which keeps the AD part fast as well. And 2) Tapenade is powerful and can handle forward and reverse mode, vectorized modes, loops, etc.

The Fortran version of the function at the beginning is reproduced below:

subroutine inductionFactors(r, chord, Rhub, Rtip, phi, cl, cd, B, &
    Vx, Vy, useCd, hubLoss, tipLoss, wakerotation, &
    fzero, a, ap)

    implicit none

    integer, parameter :: dp = kind(0.d0)

    ! in
    real(dp), intent(in) :: r, chord, Rhub, Rtip, phi, cl, cd
    integer, intent(in) :: B
    real(dp), intent(in) :: Vx, Vy
    logical, intent(in) :: useCd, hubLoss, tipLoss, wakerotation
    !f2py logical, optional, intent(in) :: useCd = 1, hubLoss = 1, tipLoss = 1, wakerotation = 1

    ! out
    real(dp), intent(out) :: fzero, a, ap

    ! local
    real(dp) :: pi, sigma_p, sphi, cphi, lambda_r
    real(dp) :: factortip, Ftip, factorhub, Fhub
    real(dp) :: k, kp, cn, ct, F
    real(dp) :: g1, g2, g3


    ! constants
    pi = 3.1415926535897932_dp
    sigma_p = B/2.0_dp/pi*chord/r
    sphi = sin(phi)
    cphi = cos(phi)

    ! resolve into normal and tangential forces
    if ( .not. useCd ) then
        cn = cl*cphi
        ct = cl*sphi
    else
        cn = cl*cphi + cd*sphi
        ct = cl*sphi - cd*cphi
    end if

    ! Prandtl's tip and hub loss factor
    Ftip = 1.0_dp
    if ( tipLoss ) then
        factortip = B/2.0_dp*(Rtip - r)/(r*abs(sphi))
        Ftip = 2.0_dp/pi*acos(exp(-factortip))
    end if

    Fhub = 1.0_dp
    if ( hubLoss ) then
        factorhub = B/2.0_dp*(r - Rhub)/(Rhub*abs(sphi))
        Fhub = 2.0_dp/pi*acos(exp(-factorhub))
    end if

    F = Ftip * Fhub

    ! bem parameters
    k = sigma_p*cn/4.0_dp/F/sphi/sphi
    kp = sigma_p*ct/4.0_dp/F/sphi/cphi

    ! compute axial induction factor
    if (phi > 0) then  ! momentum/empirical

        ! update axial induction factor
        if (k <= 2.0_dp/3.0) then  ! momentum state
            a = k/(1+k)

        else  ! Glauert(Buhl) correction

            g1 = 2.0_dp*F*k - (10.0_dp/9-F)
            g2 = 2.0_dp*F*k - (4.0_dp/3-F)*F
            g3 = 2.0_dp*F*k - (25.0_dp/9-2*F)

            if (abs(g3) < 1e-6_dp) then  ! avoid singularity
                a = 1.0_dp - 1.0_dp/2.0/sqrt(g2)
            else
                a = (g1 - sqrt(g2)) / g3
            end if

        end if

    else  ! propeller brake region (a and ap not directly used but update anyway)

        if (k > 1) then
            a = k/(k-1)
        else
            a = 0.0_dp  ! dummy value
        end if

    end if

    ! compute tangential induction factor
    ap = kp/(1-kp)

    if (.not. wakerotation) then
        ap = 0.0_dp
        kp = 0.0_dp
    end if

    ! error function
    lambda_r = Vy/Vx
    if (phi > 0) then  ! momentum/empirical
        fzero = sphi/(1-a) - cphi/lambda_r*(1-kp)
    else  ! propeller brake region
        fzero = sphi*(1-k) - cphi/lambda_r*(1-kp)
    end if

end subroutine inductionFactors

We then run this code through Tapenade, and it creates a source-code-transformed version that computes derivatives in addition to the function values (I've used forward mode in this case). It's not pretty to look at, but it's automatically generated. If we change the original source, we would need to regenerate it.

!        Generated by TAPENADE     (INRIA, Tropics team)
!  Tapenade 3.9 (r5096) - 24 Feb 2014 16:54
!
!  Differentiation of inductionfactors in forward (tangent) mode:
!   variations   of useful results: ap fzero a
!   with respect to varying inputs: r rtip rhub chord phi cd cl
!                vx vy
!   RW status of diff variables: r:in rtip:in ap:out rhub:in chord:in
!                fzero:out phi:in cd:in cl:in vx:in vy:in a:out

SUBROUTINE INDUCTIONFACTORS_DV(r, chord, rhub, rtip, phi, cl, cd, b, &
  vx, vy, usecd, hubloss, tiploss, wakerotation, &
  rd, chordd, rhubd, rtipd, phid, cld, cdd, vxd, vyd, &
  fzero, a, ap, fzerod, ad, apd, nbdirs)
!  Hint: nbdirsmax should be the maximum number of differentiation directions
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp=KIND(0.d0)
! in
  REAL(dp), INTENT(IN) :: r, chord, rhub, rtip, phi, cl, cd
  REAL(dp), DIMENSION(nbdirs), INTENT(IN) :: rd, chordd, rhubd, rtipd&
& , phid, cld, cdd
  INTEGER, INTENT(IN) :: b
  REAL(dp), INTENT(IN) :: vx, vy
  REAL(dp), DIMENSION(nbdirs), INTENT(IN) :: vxd, vyd
  LOGICAL, INTENT(IN) :: usecd, hubloss, tiploss, wakerotation
  INTEGER, intent(in) :: nbdirs
!f2py logical, optional, intent(in) :: useCd = 1, hubLoss = 1, tipLoss = 1, wakerotation = 1
! out
  REAL(dp), INTENT(OUT) :: fzero, a, ap
  REAL(dp), DIMENSION(nbdirs), INTENT(OUT) :: fzerod, ad, apd
! local
  REAL(dp) :: pi, sigma_p, sphi, cphi, lambda_r
  REAL(dp), DIMENSION(nbdirs) :: sigma_pd, sphid, cphid, lambda_rd
  REAL(dp) :: factortip, ftip, factorhub, fhub
  REAL(dp), DIMENSION(nbdirs) :: factortipd, ftipd, factorhubd, fhubd
  REAL(dp) :: k, kp, cn, ct, f
  REAL(dp), DIMENSION(nbdirs) :: kd, kpd, cnd, ctd, fd
  REAL(dp) :: g1, g2, g3
  REAL(dp), DIMENSION(nbdirs) :: g1d, g2d, g3d
  INTRINSIC KIND
  INTRINSIC SIN
  INTRINSIC COS
  INTRINSIC ABS
  INTRINSIC EXP
  INTRINSIC ACOS
  INTRINSIC SQRT
  REAL(dp) :: arg1
  REAL(dp), DIMENSION(nbdirs) :: arg1d
  REAL(dp) :: result1
  REAL(dp), DIMENSION(nbdirs) :: result1d
  INTEGER :: nd
  REAL(dp) :: abs1d(nbdirs)
  REAL(dp) :: abs0d(nbdirs)
  REAL(dp) :: abs2
  REAL(dp) :: abs1
  REAL(dp) :: abs0
! constants
  pi = 3.1415926535897932_dp
  DO nd=1,nbdirs
    sigma_pd(nd) = (b*chordd(nd)*r/(2.0_dp*pi)-b*chord*rd(nd)/(2.0_dp*pi&
&     ))/r**2
    sphid(nd) = phid(nd)*COS(phi)
    cphid(nd) = -(phid(nd)*SIN(phi))
  END DO
  sigma_p = b/2.0_dp/pi*chord/r
  sphi = SIN(phi)
  cphi = COS(phi)
! resolve into normal and tangential forces
  IF (.NOT.usecd) THEN
    DO nd=1,nbdirs
      cnd(nd) = cld(nd)*cphi + cl*cphid(nd)
      ctd(nd) = cld(nd)*sphi + cl*sphid(nd)
    END DO
    cn = cl*cphi
    ct = cl*sphi
  ELSE
    DO nd=1,nbdirs
      cnd(nd) = cld(nd)*cphi + cl*cphid(nd) + cdd(nd)*sphi + cd*sphid(nd&
&       )
      ctd(nd) = cld(nd)*sphi + cl*sphid(nd) - cdd(nd)*cphi - cd*cphid(nd&
&       )
    END DO
    cn = cl*cphi + cd*sphi
    ct = cl*sphi - cd*cphi
  END IF
! Prandtl's tip and hub loss factor
  ftip = 1.0_dp
  IF (tiploss) THEN
    IF (sphi .GE. 0.) THEN
      DO nd=1,nbdirs
        abs0d(nd) = sphid(nd)
      END DO
      abs0 = sphi
    ELSE
      DO nd=1,nbdirs
        abs0d(nd) = -sphid(nd)
      END DO
      abs0 = -sphi
    END IF
    factortip = b/2.0_dp*(rtip-r)/(r*abs0)
    arg1 = EXP(-factortip)
    DO nd=1,nbdirs
      factortipd(nd) = (b*(rtipd(nd)-rd(nd))*r*abs0/2.0_dp-b*(rtip-r)*(&
&       rd(nd)*abs0+r*abs0d(nd))/2.0_dp)/(r*abs0)**2
      arg1d(nd) = -(factortipd(nd)*EXP(-factortip))
      IF (arg1 .EQ. 1.0 .OR. arg1 .EQ. (-1.0)) THEN
        result1d(nd) = 0.0
      ELSE
        result1d(nd) = -(arg1d(nd)/SQRT(1.0-arg1**2))
      END IF
      ftipd(nd) = 2.0_dp*result1d(nd)/pi
    END DO
    result1 = ACOS(arg1)
    ftip = 2.0_dp/pi*result1
  ELSE
    DO nd=1,nbdirs
      ftipd(nd) = 0.0
    END DO
  END IF
  fhub = 1.0_dp
  IF (hubloss) THEN
    IF (sphi .GE. 0.) THEN
      DO nd=1,nbdirs
        abs1d(nd) = sphid(nd)
      END DO
      abs1 = sphi
    ELSE
      DO nd=1,nbdirs
        abs1d(nd) = -sphid(nd)
      END DO
      abs1 = -sphi
    END IF
    factorhub = b/2.0_dp*(r-rhub)/(rhub*abs1)
    arg1 = EXP(-factorhub)
    DO nd=1,nbdirs
      factorhubd(nd) = (b*(rd(nd)-rhubd(nd))*rhub*abs1/2.0_dp-b*(r-rhub)&
&       *(rhubd(nd)*abs1+rhub*abs1d(nd))/2.0_dp)/(rhub*abs1)**2
      arg1d(nd) = -(factorhubd(nd)*EXP(-factorhub))
      IF (arg1 .EQ. 1.0 .OR. arg1 .EQ. (-1.0)) THEN
        result1d(nd) = 0.0
      ELSE
        result1d(nd) = -(arg1d(nd)/SQRT(1.0-arg1**2))
      END IF
      fhubd(nd) = 2.0_dp*result1d(nd)/pi
    END DO
    result1 = ACOS(arg1)
    fhub = 2.0_dp/pi*result1
  ELSE
    DO nd=1,nbdirs
      fhubd(nd) = 0.0
    END DO
  END IF
  f = ftip*fhub
  DO nd=1,nbdirs
    fd(nd) = ftipd(nd)*fhub + ftip*fhubd(nd)
! bem parameters
    kd(nd) = ((((sigma_pd(nd)*cn+sigma_p*cnd(nd))*f/4.0_dp-sigma_p*cn*fd&
&     (nd)/4.0_dp)*sphi/f**2-sigma_p*cn*sphid(nd)/(4.0_dp*f))/sphi-&
&     sigma_p*cn*sphid(nd)/(4.0_dp*f*sphi))/sphi**2
    kpd(nd) = ((((sigma_pd(nd)*ct+sigma_p*ctd(nd))*f/4.0_dp-sigma_p*ct*&
&     fd(nd)/4.0_dp)*sphi/f**2-sigma_p*ct*sphid(nd)/(4.0_dp*f))*cphi/&
&     sphi**2-sigma_p*ct*cphid(nd)/(4.0_dp*f*sphi))/cphi**2
  END DO
  k = sigma_p*cn/4.0_dp/f/sphi/sphi
  kp = sigma_p*ct/4.0_dp/f/sphi/cphi
! compute axial induction factor
  IF (phi .GT. 0) THEN
! momentum/empirical
! update axial induction factor
    IF (k .LE. 2.0_dp/3.0) THEN
      DO nd=1,nbdirs
! momentum state
        ad(nd) = (kd(nd)*(1+k)-k*kd(nd))/(1+k)**2
      END DO
      a = k/(1+k)
    ELSE
      DO nd=1,nbdirs
! Glauert(Buhl) correction
        g1d(nd) = 2.0_dp*(fd(nd)*k+f*kd(nd)) + fd(nd)
        g2d(nd) = 2.0_dp*(fd(nd)*k+f*kd(nd)) - (4.0_dp/3-f)*fd(nd) + fd(&
&         nd)*f
        g3d(nd) = 2.0_dp*(fd(nd)*k+f*kd(nd)) + 2*fd(nd)
      END DO
      g1 = 2.0_dp*f*k - (10.0_dp/9-f)
      g2 = 2.0_dp*f*k - (4.0_dp/3-f)*f
      g3 = 2.0_dp*f*k - (25.0_dp/9-2*f)
      IF (g3 .GE. 0.) THEN
        abs2 = g3
      ELSE
        abs2 = -g3
      END IF
      IF (abs2 .LT. 1e-6_dp) THEN
        result1 = SQRT(g2)
        DO nd=1,nbdirs
! avoid singularity
          IF (g2 .EQ. 0.0) THEN
            result1d(nd) = 0.0
          ELSE
            result1d(nd) = g2d(nd)/(2.0*SQRT(g2))
          END IF
          ad(nd) = result1d(nd)/2.0/result1**2
        END DO
        a = 1.0_dp - 1.0_dp/2.0/result1
      ELSE
        result1 = SQRT(g2)
        DO nd=1,nbdirs
          IF (g2 .EQ. 0.0) THEN
            result1d(nd) = 0.0
          ELSE
            result1d(nd) = g2d(nd)/(2.0*SQRT(g2))
          END IF
          ad(nd) = ((g1d(nd)-result1d(nd))*g3-(g1-result1)*g3d(nd))/g3**&
&           2
        END DO
        a = (g1-result1)/g3
      END IF
    END IF
  ELSE IF (k .GT. 1) THEN
! propeller brake region (a and ap not directly used but update anyway)
    DO nd=1,nbdirs
      ad(nd) = (kd(nd)*(k-1)-k*kd(nd))/(k-1)**2
    END DO
    a = k/(k-1)
  ELSE
! dummy value
    a = 0.0_dp
    DO nd=1,nbdirs
      ad(nd) = 0.0
    END DO
  END IF
  DO nd=1,nbdirs
! compute tangential induction factor
    apd(nd) = (kpd(nd)*(1-kp)+kp*kpd(nd))/(1-kp)**2
  END DO
  ap = kp/(1-kp)
  IF (.NOT.wakerotation) THEN
    ap = 0.0_dp
    kp = 0.0_dp
    DO nd=1,nbdirs
      apd(nd) = 0.0
      kpd(nd) = 0.0
    END DO
  END IF
  DO nd=1,nbdirs
! error function
    lambda_rd(nd) = (vyd(nd)*vx-vy*vxd(nd))/vx**2
  END DO
  lambda_r = vy/vx
  IF (phi .GT. 0) THEN
    DO nd=1,nbdirs
! momentum/empirical
      fzerod(nd) = (sphid(nd)*(1-a)+sphi*ad(nd))/(1-a)**2 - (cphid(nd)*&
&       lambda_r-cphi*lambda_rd(nd))*(1-kp)/lambda_r**2 + cphi*kpd(nd)/&
&       lambda_r
    END DO
    fzero = sphi/(1-a) - cphi/lambda_r*(1-kp)
  ELSE
    DO nd=1,nbdirs
! propeller brake region
      fzerod(nd) = sphid(nd)*(1-k) - sphi*kd(nd) - (cphid(nd)*lambda_r-&
&       cphi*lambda_rd(nd))*(1-kp)/lambda_r**2 + cphi*kpd(nd)/lambda_r
    END DO
    fzero = sphi*(1-k) - cphi/lambda_r*(1-kp)
  END IF
END SUBROUTINE INDUCTIONFACTORS_DV

We now build this Fortran code into a shared library that I called _bem. I will skip the details because this isn't our focus, but this is fairly easy to do. We can now access this function from Python by importing the library.
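For reference, one way to do this is with f2py (which ships with numpy). From the command line, something like the following builds a Python-importable module named _bem; the source file names here are just placeholders for wherever you saved the original and differentiated routines:

f2py -c -m _bem inductionfactors.f90 inductionfactors_dv.f90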

This AD routine lets us compute derivatives along combinations of the inputs if we want, but generally we just want each derivative separately. There are nine inputs in x, and we are using the array version that lets us compute multiple derivatives simultaneously. We will set:

dr = [1, 0, 0, 0, 0, 0, 0, 0, 0]

dchord = [0, 1, 0, 0, 0, 0, 0, 0, 0]

and so on ...

With these seed vectors, the first entry of doutput_dx will be doutput_dr, the second doutput_dchord, and so on.


In [7]:
from _bem import inductionfactors_dv

# get derivative of each input
I = np.eye(n)
dr = I[0, :]
dchord = I[1, :]
dRhub = I[2, :]
dRtip = I[3, :]
dphi = I[4, :]
dcl = I[5, :]
dcd = I[6, :]
dVx = I[7, :]
dVy = I[8, :]


fzero, a, ap, doutput_dx, da_dx, dap_dx = inductionfactors_dv(r, chord, Rhub, Rtip,
    phi, cl, cd, B, Vx, Vy, dr, dchord, dRhub, dRtip, dphi, dcl, dcd, dVx, dVy)

# rename the gradient
g_ad_sc = doutput_dx

This approach is a little more work, but it's much faster and not difficult once you've done it a few times.

How do we check that we did it correctly? Comparing against finite difference is ok, but the best way is to compare against complex step if possible because it is also exact.
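As a quick reminder of why complex step gives machine-precision derivatives: for a real analytic function, a Taylor expansion with a purely imaginary step gives

$$f(x + ih) = f(x) + ih\,f'(x) - \frac{h^2}{2}f''(x) + \ldots$$

so $\mathrm{Im}[f(x + ih)]/h = f'(x) + \mathcal{O}(h^2)$, and because there is no subtraction of nearly equal numbers we can make $h$ absurdly small (like the $10^{-30}$ used below) without any loss of precision.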

To compute the gradients using complex step we need to import complex versions of our functions. We will also need to redefine the absolute value function, because the complex version of absolute value$^1$ is the square root of the sum of the squares of the real and imaginary parts, which is not what we want. You can find more details on functions that need to be overloaded in complex step here.

$^1$ Recall that absolute value is not differentiable at 0, and is generally best to avoid if possible. In this particular code it is fine because I know that the argument will never be zero. It can be negative or positive depending on the operating conditions, but will never cross over to zero.


In [8]:
# import complex versions
from cmath import sin, cos, acos, exp, sqrt

# redefine absolute value
def c_abs(x):
    if x.real < 0:
        return -x
    else:
        return x

abs = c_abs

# initialize
g_cs = np.zeros(n)

# iterate across entries in x
for i in range(n):
    step_complex = 1e-30  # take a really small step

    # new xvalue: x + ih
    xcomplex = np.copy(x).astype(complex)
    xcomplex[i] += complex(0.0, step_complex)

    # call function
    output_complex, a_complex, ap_complex = function(xcomplex, params)

    # compute gradient
    g_cs[i] = output_complex.imag / step_complex

Let's see how we did. We will compare our errors relative to complex step. First finite differencing:


In [9]:
from __future__ import print_function

print('error_fd =', (g_fd - g_cs)/g_cs)


error_fd = [ -1.03264668e-06  -1.14468953e-09  -1.00000000e+00  -7.63506801e-06
   2.14031553e-07   9.28095620e-10   1.10214382e-05  -6.52480239e-11
  -1.00017976e-06]

The errors are pretty small, except for the third entry, which is really bad. If we use a different step size for that one entry, we can do a little better. It turns out the function is very insensitive to Rhub at this point, so getting an accurate gradient with FD is difficult.
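For example, back in the finite-difference loop (In [4]) we could special-case that entry with a larger step. The 1e-4 below is just an illustrative choice; some experimentation is generally needed to find a good step size:

# inside the for-loop over i in In [4]: use a larger step for the
# insensitive Rhub entry (index 2) to reduce subtractive cancellation
if i == 2:
    step = 1e-4*x[i]
else:
    step = 1e-6*x[i]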

Let's now look at the two AD methods: one with operator overloading and one with source code transformation.


In [10]:
print('error_ad_oo =', (g_ad_oo - g_cs)/g_cs)
print('error_ad_sc =', (g_ad_sc - g_cs)/g_cs)


error_ad_oo = [ -0.00000000e+00  -6.03768182e-16   1.49255469e-16   2.91419976e-16
   0.00000000e+00  -3.39625499e-16  -4.00715101e-14   1.42608657e-16
   0.00000000e+00]
error_ad_sc = [ -0.00000000e+00  -4.52826136e-16   3.43287579e-15   7.28549941e-16
  -1.32329359e-16  -2.26416999e-16  -2.92046599e-14  -0.00000000e+00
  -1.78260821e-16]

The error is basically machine zero for all entries.
