Tangent

Source-to-Source Debuggable Derivatives in Pure Python

Tangent generates derivatives as readable Python source code, so you can finally read your automatic derivative code just like the rest of your program. Tangent is useful to researchers and students who not only want to write their models in Python, but also want to read and debug the automatically generated derivative code without sacrificing speed and flexibility.

How

Under the hood, tangent.grad grabs the source code of the Python function you pass it (using inspect.getsource, which is available in the Python standard library), converts the source code into an abstract syntax tree (AST) using ast.parse (also built into the Python standard library), and walks the syntax tree in reverse order.
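
As a rough sketch of those first two steps (not Tangent's actual implementation, just the standard-library calls it builds on), grabbing a function's source and parsing it into an AST looks like this:

import ast
import inspect

def f(x):
    a = x * x
    b = x * a
    c = a + b
    return c

source = inspect.getsource(f)   # the function's own source text
tree = ast.parse(source)        # parsed into an abstract syntax tree
fn_def = tree.body[0]           # the FunctionDef node for f

# visit the statements of f in reverse order, as reverse-mode autodiff does
for stmt in reversed(fn_def.body):
    print(ast.dump(stmt))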


In [1]:
import tangent
import tensorflow as tf

In [12]:
def f(x):
    a = x * x
    b = x * a
    c = a + b
    return c

In [14]:
df = tangent.grad(f)

In [15]:
df


Out[15]:
<function tangent_e89f.dfdx>
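
Because Tangent is source-to-source, df is an ordinary Python function whose body you can read. tangent.grad also accepts a verbose flag, e.g. df = tangent.grad(f, verbose=1), to print the generated derivative source (see Tangent's documentation for the exact options).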

In [16]:
df(33)


Out[16]:
3333.0
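
As a quick sanity check, f(x) = x**2 + x**3, so its derivative is 2*x + 3*x**2, which at x = 33 gives 66 + 3267 = 3333, matching the output above:

assert df(33) == 2 * 33 + 3 * 33 ** 2   # 3333.0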

Forward mode

Reverse-mode autodiff, or backpropagation, generates efficient derivatives for the kinds of functions we use in machine learning, where there are usually many (perhaps millions of) input variables and only a single output (our loss). When the opposite is true and there are many more outputs than inputs, reverse mode is not an efficient algorithm, because it has to be run once per output variable. A less famous algorithm, forward-mode autodiff, only has to be run once per input variable. Tangent supports forward-mode autodiff as well.


In [17]:
forward_df = tangent.grad(f, mode='forward')

In [20]:
forward_df(33, dx=1)


Out[20]:
3333

In [21]:
forward_df(33, dx=2)


Out[21]:
6666

In [22]:
forward_df(33, dx=-1)


Out[22]:
-3333
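
The dx argument is the seed, i.e. the tangent of the input: forward mode computes a directional derivative, so the result scales linearly with dx, which is why dx=2 and dx=-1 return 2 * 3333 and -1 * 3333 above.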

Hessian-vector products

To take higher-order derivatives, you can use any combination of forward- and reverse-mode autodiff in Tangent. This works because the code Tangent produces can itself be fed back in as input. The autodiff literature recommends calculating HVPs in a “Forward-over-Reverse” style: first apply reverse-mode autodiff to the function, then apply forward mode to the result.
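
This works because a Hessian-vector product is the directional derivative of the gradient: H(x) v equals the derivative of ∇f(x + ε v) with respect to ε at ε = 0. Differentiating the reverse-mode gradient once in forward mode along a direction v therefore yields H v without ever forming the full Hessian.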


In [2]:
def f(x):
    a = x * x * x
    b = a * x ** 2.0
    return tf.reduce_sum(b)

In [8]:
hvp = tangent.grad(tangent.grad(f, mode='reverse'), mode='forward')
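
Assuming the generated function follows the same seed-keyword convention as the scalar forward-mode example above (a sketch of the calling convention, not a guarantee about the exact generated signature), evaluating the Hessian-vector product of f at a point x along a direction v would look roughly like:

x = tf.random_normal([10])
v = tf.random_normal([10])
Hv = hvp(x, dx=v)   # the tensor H(x) @ v, same shape as x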

In [31]:
def f(W, x):
    h1 = tf.matmul(x, W)
    h2 = tf.tanh(h1)
    out = tf.reduce_sum(h2)
    return out

In [32]:
dfdW = tangent.grad(f)

In [33]:
dfdW


Out[33]:
<function tangent_9d80.dfdW>

In [40]:
W = tf.Variable(tf.zeros([100, 10]))
x = tf.Variable(tf.zeros([10, 100]))

In [41]:
dfdW(W, x)


Out[41]:
<tf.Tensor 'MatMul_6:0' shape=(100, 10) dtype=float32>
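
Since out is a scalar, the gradient has the same shape as W, namely (100, 10). The output above is a symbolic tensor, which indicates graph-mode TensorFlow 1.x; under that assumption, getting numeric values would require running it in a session, roughly:

grad_W = dfdW(W, x)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad_W))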
