PyAutoDiff: automatic differentiation for NumPy

We are excited to host a guest post about a new tool that is freely available to the Python community. Welcome, Jeremiah Lowin, Chief Scientist of the Lowin Data Company, to the growing pool of Data Community DC bloggers.

We are very excited to announce an early release of PyAutoDiff, a library that brings automatic differentiation to NumPy, among other useful features. A quickstart guide is available here.

Autodiff can compute gradients (or derivatives) with a simple decorator:

from autodiff import gradient

def f(x):
    return x ** 2

@gradient
def g(x):
    return x ** 2

print(f(10.0))  # 100.0
print(g(10.0))  # 20.0
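
For readers who want to convince themselves that 20.0 is the right answer, here is a minimal sketch that checks the derivative of x ** 2 at 10.0 numerically. It does not use autodiff at all; the helper name central_difference and the step size h are assumptions made for this illustration.

```python
def f(x):
    return x ** 2

def central_difference(fn, x, h=1e-6):
    # Symmetric finite-difference estimate of d(fn)/dx at x.
    # Hypothetical helper for illustration; not part of autodiff.
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

estimate = central_difference(f, 10.0)

# The analytic derivative of x ** 2 is 2 * x, i.e. 20.0 at x = 10.0.
print(abs(estimate - 20.0) < 1e-4)
```

A finite-difference estimate like this is only approximate and needs two function evaluations per input dimension, which is part of the appeal of computing gradients automatically instead.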

More broadly, autodiff leverages Theano's powerful symbolic engine to compile NumPy functions, allowing features like mathematical optimization, GPU acceleration, and of course automatic differentiation. Autodiff is compatible with any NumPy operation that has a Theano equivalent and fully supports multidimensional arrays. It also gracefully handles many Python constructs (though users should be very careful with control flow tools like if/else and loops!).
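
To see why control flow deserves care, consider a toy tracer written in plain Python. This is not how autodiff is implemented; the Traced class and trace helper are hypothetical and exist only to illustrate one common pitfall of recording operations from a single concrete call: a Python if is decided by the value seen at trace time, so only one branch ends up in the recorded expression.

```python
class Traced(object):
    """Hypothetical traced value: a concrete number plus a record of how it was built."""

    def __init__(self, value, expr):
        self.value = value
        self.expr = expr

    def __neg__(self):
        # Negation is recorded in the expression string.
        return Traced(-self.value, "(-%s)" % self.expr)

    def __gt__(self, other):
        # The comparison returns a plain bool, so the branch decision
        # itself never appears in the recorded expression.
        return self.value > other

def absolute(x):
    if x > 0:
        return x
    return -x

def trace(fn, value):
    # Run fn once on a traced input and return the recorded expression.
    return fn(Traced(value, "x")).expr

print(trace(absolute, 3.0))
print(trace(absolute, -3.0))
```

The two calls record different expressions for the same function. Anything compiled from the first trace would behave like the identity, which is wrong for negative inputs.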

In addition to the @gradient decorator, users can apply @function to compile functions without altering their return values. Compiled functions can automatically take advantage of Theano's optimizations and available GPUs, though users should note that GPU computations are only supported for float32 dtypes. Other decorators, classes, and high-level functions are available; see the docs for more information.

Autodiff can also trace NumPy objects through multiple functions. It can then compile symbolic representations of all of the traced operations (or their gradients) -- even with respect to objects that were purely local to the functions' scope.

import numpy as np
from autodiff import Symbolic, tag

# -- a vanilla function
def f1(x):
    return x + 2

# -- a function referencing a global variable
y = np.random.random(10)
def f2(x):
    return x * y

# -- a function with a local variable
def f3(x):
    z = tag(np.ones(10), 'local_var')
    return (x + z) ** 2

# -- create a general symbolic tracer
x = np.random.random(10)
tracer = Symbolic()

# -- trace the three functions
out1 = tracer.trace(f1, x)
out2 = tracer.trace(f2, out1)
out3 = tracer.trace(f3, out2)

# -- compile a function representing f(x, y, z) = out3
new_fn = tracer.compile_function(inputs=[x, y, 'local_var'],
                                 outputs=out3)

assert np.allclose(new_fn(x, y, np.ones(10)), f3(f2(f1(x))))

One of the original motivations for autodiff was working with SVMs that were defined purely in NumPy. The following example (also available at autodiff/examples/svm.py) fits an SVM to random data, using autodiff to compute parameter gradients for SciPy's L-BFGS-B solver:

import numpy as np
from autodiff.optimize import fmin_l_bfgs_b

rng = np.random.RandomState(1)

# -- create some fake data
x = rng.rand(10, 5)
y = 2 * (rng.rand(10) > 0.5) - 1
l2_regularization = 1e-4

# -- define the loss function
def loss_fn(weights, bias):
    margin = y * (np.dot(x, weights) + bias)
    loss = np.maximum(0, 1 - margin) ** 2
    l2_cost = 0.5 * l2_regularization * np.dot(weights, weights)
    loss = np.mean(loss) + l2_cost
    return loss

# -- call optimizer
w_0, b_0 = np.zeros(5), np.zeros(())
w, b = fmin_l_bfgs_b(loss_fn, init_args=(w_0, b_0))

final_loss = loss_fn(w, b)

assert np.allclose(final_loss, 0.7229)
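
To appreciate what autodiff saves you from, here is a sketch of the same loss with its gradient derived by hand and checked against finite differences, using only NumPy. The functions manual_gradient and numeric_gradient, the test point, and the step size are assumptions made for this illustration; they are not part of autodiff or of the example script.

```python
import numpy as np

rng = np.random.RandomState(1)

# -- same fake data as in the example above
x = rng.rand(10, 5)
y = 2 * (rng.rand(10) > 0.5) - 1
l2_regularization = 1e-4

def loss_fn(weights, bias):
    margin = y * (np.dot(x, weights) + bias)
    loss = np.maximum(0, 1 - margin) ** 2
    return np.mean(loss) + 0.5 * l2_regularization * np.dot(weights, weights)

def manual_gradient(weights, bias):
    # -- hand-derived gradient of the squared hinge loss (illustrative helper)
    margin = y * (np.dot(x, weights) + bias)
    slack = np.maximum(0, 1 - margin)
    # derivative of mean(slack ** 2) with respect to each example's score
    d_score = -2.0 * slack * y / len(y)
    grad_w = np.dot(x.T, d_score) + l2_regularization * weights
    grad_b = np.sum(d_score)
    return grad_w, grad_b

def numeric_gradient(weights, bias, h=1e-6):
    # -- central finite differences, one coordinate at a time
    grad_w = np.zeros_like(weights)
    for i in range(len(weights)):
        step = np.zeros_like(weights)
        step[i] = h
        grad_w[i] = (loss_fn(weights + step, bias)
                     - loss_fn(weights - step, bias)) / (2 * h)
    grad_b = (loss_fn(weights, bias + h) - loss_fn(weights, bias - h)) / (2 * h)
    return grad_w, grad_b

# -- compare the two at an arbitrary test point
w, b = rng.randn(5) * 0.1, 0.05
gw, gb = manual_gradient(w, b)
nw, nb = numeric_gradient(w, b)
print(np.allclose(gw, nw, atol=1e-5), np.allclose(gb, nb, atol=1e-5))
```

Every change to the loss function would require re-deriving manual_gradient by hand, which is exactly the bookkeeping that automatic differentiation removes.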

Some members of the scientific community will recall that James Bergstra began the PyAutoDiff project a year ago in an attempt to unify NumPy's imperative style with Theano's functional syntax. James successfully demonstrated the project's utility, and this version builds on and extends that foundation. Standing on the shoulders of giants, indeed!

Please note that autodiff remains under active development and features may change. The library has been performing well in internal testing, but we're sure that users will find new and interesting ways to break it. Please file any bugs you may find!