Flow accumulation: comparison between different implementations.

Note: this notebook only compares Numpy vs. Numba implementations of the Braun & Willet's (2013) method for ordering the mesh nodes (single flow). See the Braun & Willet's paper for a detailled description of the algorithm.


In [1]:
import sys

import numpy
import numba

import landlab
from landlab import RasterModelGrid
from landlab.components.flow_routing.route_flow_dn import FlowRouter
from landlab.components.flow_accum.flow_accum_bw import make_ordered_node_array
from landlab.components.flow_accum.flow_accum_bw import _DrainageStack

%matplotlib inline

In [2]:
print numpy.__version__
print numba.__version__
print landlab.__version__


1.9.0
0.14.0
0.1.5

Implementation

The Numpy implementation is the one currently implemented in the landlab package v0.1.5 (https://github.com/landlab/). Note that the stack building is still implemented in pure python.


In [3]:
def order_nodes_numpy(receivers, baselevels):
    """Numpy implementation of node ordering."""
    
    # number of donors
    np = receivers.size
    nd = numpy.zeros(np, dtype=numpy.int)
    max_index = numpy.max(receivers)
    nd[:(max_index + 1)] = numpy.bincount(receivers)

    # donors temp list
    delta = numpy.zeros(np+1, dtype=int)
    delta.fill(np)
    delta[-2::-1] -= numpy.cumsum(nd[::-1])
    
    # donors array (non vectorized)
    w = numpy.zeros(np, dtype=int)
    D = numpy.zeros(np, dtype=int)
    for i in range(np):
        ri = receivers[i]
        D[delta[ri]+w[ri]] = i
        w[ri] += 1
    
    # build stacks
    dstack = _DrainageStack(delta, D)
    add_it = dstack.add_to_stack
    for k in baselevels:
        add_it(k)
    return dstack.s

Numba implementation below.

Recursive functions are not yet supported by Numba (in non-python mode), but hopefully it will be allowed in a next release. Consequently, the implementation of recursive stack building is still pure-python.

Note that, when using very large model grids with specific configurations, the maximum allowed recursion depth may be reached while building the stack, although this limit is configurable:

import sys
sys.setrecursionlimit(10000)  # or any other number

To address this issue, a non-recursive, jitted alternative is also tested here. The algorithm is less efficient than the recursive method, but the jitted function can be executed much faster compared to pure-python code.


In [4]:
#@numba.njit  # recursion not yet supported by numba
def _build_stack(n, D, delta, s, inc):
    """Stack building (recursive)."""
    s[inc] = n
    for i in range(delta[n], delta[n + 1]):
        m = D[i]
        if m != n:
            inc = _build_stack(m, D, delta, s, inc + 1)
    return inc


@numba.njit
def _build_stack_nr(b, r, D, delta, s, stack_id, inc):
    """
    A non-recursive, less efficient stack building method.
    """
    # init
    n_prev = -1
    
    # add base level
    n = b
    s[inc] = n
    stack_id[n] = b
    
    # main loop (walk the network upstream and downstream)
    while n != n_prev:
        n_prev = n
        mv_upstream = False
        for i in range(delta[n], delta[n + 1]):
            m = D[i]
            if stack_id[m] != b:
                inc += 1
                s[inc] = m
                stack_id[m] = b
                mv_upstream = True
                break
        if mv_upstream:
            n = m     # move upstream
        else:
            n = r[n]  # otherwise move downstream
    
    return inc
            

@numba.njit
def _calculate_temp_arrays(r, nd, delta, D, w):
    np = r.size
    
    # number of donors (nd)
    for i in range(np):
        nd[r[i]] += 1
    
    # donors positions in D (delta)
    delta[np] = np
    for i in range(np - 1, -1, -1):
        delta[i] = delta[i + 1] - nd[i]
    
    # donors ids flattened (D)
    for i in range(np):
        ri = r[i]
        D[delta[ri] + w[ri]] = i
        w[ri] += 1

   
def order_nodes_numba(receivers, baselevels, mode='recursive',
                      out_stack_ids=False):
    """Numba implementation of node ordering."""
    
    # temp arrays
    np = receivers.size
    nd = numpy.zeros(np, dtype=int)
    delta = numpy.zeros(np+1, dtype=int)
    D = numpy.zeros(np, dtype=int)
    w = numpy.zeros(np, dtype=int)
    _calculate_temp_arrays(receivers, nd, delta, D, w)
    del nd, w
    
    # build stacks
    s = numpy.zeros(np, dtype=int)
    stack_id = numpy.zeros(np, dtype=int) - 1
    if mode == 'recursive':
        bstack_func = lambda b, inc: _build_stack(
            b, D, delta, s, inc
        )
    elif mode == 'non-recursive':
        bstack_func = lambda b, inc: _build_stack_nr(
            b, receivers, D, delta, s, stack_id, 0
        )
    inc = 0
    for b in baselevels:
        inc = bstack_func(b, inc) + 1
    
    if out_stack_ids and mode != 'recursive':
        return (s, stack_id)
    else:
        return s

Testing

Testing is based on the example case presented in Braun & Willet (2013).


In [5]:
receivers = numpy.array([2, 5, 2, 7, 5, 5, 6, 5, 7, 8]) - 1
expected_order = numpy.array([5, 2, 1, 3, 6, 7, 4, 9, 8, 10]) - 1

print (order_nodes_numpy(receivers, [4]) == expected_order).all()
print (order_nodes_numba(receivers, [4]) == expected_order).all()
print (order_nodes_numba(receivers, [4], mode='non-recursive') == expected_order).all()


True
True
True

Benchmarking


In [6]:
sys.setrecursionlimit(10000)

Create the raster model grid


In [7]:
rgrid = RasterModelGrid(1000, 1000)

# using a random perturbed planar topography (diagonal oriented)
max_elevation = 50.
elevation = rgrid.node_y + rgrid.node_x 
elevation /= elevation.max() / max_elevation
#elevation += numpy.random.rand(elevation.size) / 1e5
rgrid.add_field('node', 'planet_surface__elevation', elevation)

# first node is the unique outlet
rgrid.set_closed_boundaries_at_grid_edges(True, True, True, True)
rgrid.set_fixed_value_boundaries(0)

Get D8 single flow receivers


In [8]:
frouter = FlowRouter(rgrid)
rgrid = frouter.route_flow(grid=rgrid)
receivers = rgrid['node']['flow_receiver']

receivers


Out[8]:
array([     0,      1,      2, ..., 999997, 999998, 999999])

Compare execution times: numpy vs. numba (pure-python recursive and jitted non-recursive) vs. elevation sorting (using numpy)


In [9]:
%timeit order_nodes_numpy(receivers, [0])


1 loops, best of 3: 4.3 s per loop

In [10]:
%timeit order_nodes_numba(receivers, [0])


1 loops, best of 3: 2.41 s per loop

In [11]:
%timeit order_nodes_numba(receivers, [0], mode='non-recursive')


10 loops, best of 3: 145 ms per loop

In [12]:
%timeit numpy.argsort(rgrid['node']['planet_surface__elevation'])


10 loops, best of 3: 73.7 ms per loop

In [13]: