Symbolic Differentiation vs Automatic Differentiation

Consider the function below that, at least computationally, is very simple.


In [1]:
from math import sin, cos

def func(x):
    y = x
    for i in range(30):
        y = sin(x + y)

    return y

We can compute a derivative symbolically, but it is of course horrendous (see below). Think of how much worse it would be if we chose a function with products, more dimensions, or iterated more than 20 times.


In [2]:
from sympy import diff, Symbol, sin
from __future__ import print_function

x = Symbol('x')
dexp = diff(func(x), x)
print(dexp)


(((((((((((((((((((((((((((((2*cos(2*x) + 1)*cos(x + sin(2*x)) + 1)*cos(x + sin(x + sin(2*x))) + 1)*cos(x + sin(x + sin(x + sin(2*x)))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(2*x))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x)))))))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))))))))))))) + 1)*cos(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(x + sin(2*x))))))))))))))))))))))))))))))

We can now evaluate the expression.


In [7]:
xpt = 0.1

dfdx = dexp.subs(x, xpt)

print('dfdx =', dfdx)


dfdx = 1.91770676038667

Let's compare with automatic differentiation using operator overloading:


In [8]:
from algopy import UTPM, sin

x_algopy = UTPM.init_jacobian(xpt)
y_algopy = func(x_algopy)
dfdx = UTPM.extract_jacobian(y_algopy)
    
print('dfdx =', dfdx)


dfdx = [ 1.91770676]

Let's also compare to AD using a source code transformation method (I used Tapenade in Fortran)


In [9]:
def funcad(x):
    xd = 1.0
    yd = xd
    y = x
    for i in range(30):
        yd = (xd + yd)*cos(x + y)
        y = sin(x + y)
    return yd

dfdx = funcad(xpt)

print('dfdx =', dfdx)


dfdx = 1.91770676039

For a simple expression like this, symbolic differentiation is long but actually works reasonbly well, and both will give a numerically exact answer. But if we change the loop to 100 (go ahead and try this) or add other complications, the symbolic solver will fail. However, automatic differentiation will continue to work without issue (see the simple source code transformation version). Furthermore, if we add other dimensions to the problem, symbolic differentiation quickly becomes costly as lots of computations get repeated, whereas automatic differentiation is able to reuse a lot of calculations.