Benchmarking Thinc layers with a custom benchmark layer

This notebook shows how to write a benchmark layer that can wrap any layer(s) in your network and log the execution times of the initialization, forward pass and backward pass. The benchmark layer can also be mapped to an operator like @ to make it easy to add debugging to your network.


In [ ]:
!pip install "thinc>=8.0.0a0" ml_datasets

To log the results, we first set up a custom logger using Python's logging module. You could also just print the stats instead, but using logging is cleaner: it lets other users modify the logger's behavior more easily, separates the logs from other output, and lets you write them to a file (e.g. if you're benchmarking several layers during training). The following logging config will output the date and time, the name of the logger and the logged results.


In [1]:
import logging

logger = logging.getLogger("thinc:benchmark")
if not logger.hasHandlers():  # prevent Jupyter from adding multiple loggers
    formatter = logging.Formatter('%(asctime)s %(name)s  %(message)s', datefmt="%Y-%m-%d %H:%M:%S")
    handler = logging.StreamHandler()
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)

Here's a minimalistic time logger that can be initialized with the name of a given layer and can track several events (e.g. "forward" and "backward"). When the TimeLogger.end method is called, the output is formatted nicely and the elapsed time is logged with the logger name and a colored label.


In [2]:
from timeit import default_timer
from wasabi import color


class TimeLogger:
    def __init__(self, name):
        self.colors = {"forward": "green", "backward": "blue"}
        self.name = name
        self.timers = {}
        
    def start(self, name):
        self.timers[name] = default_timer()
        
    def end(self, name):
        result = default_timer() - self.timers[name]
        label = f"{name.upper():<8}"
        label = color(label, self.colors.get(name), bold=True)
        logger.debug(f"{self.name:<12} | {label} | {result:.6f}")
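Stripped of the logging and coloring, the start/end pattern is just two calls to default_timer keyed by the event name. Here's a self-contained sketch of what the two methods do (the sleep stands in for real work):

```python
from timeit import default_timer
import time

timers = {}

# start("forward"): record the current timestamp under the event name
timers["forward"] = default_timer()
time.sleep(0.01)  # stand-in for the layer's actual work

# end("forward"): the elapsed time is now minus the recorded timestamp
elapsed = default_timer() - timers["forward"]
print(f"{'demo':<12} | {'FORWARD':<8} | {elapsed:.6f}")
```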

The benchmark function now wraps the initialization, forward pass and backward pass of the given layer and logs their execution times. It then returns a Thinc Model instance with the custom forward function and a custom init function. We'll also allow setting a custom name to make it easier to tell multiple wrapped benchmark layers apart.


In [3]:
from thinc.api import Model
        
def benchmark(layer, name=None):
    name = name if name is not None else layer.name
    t = TimeLogger(name)
    
    def init(model, X, Y):
        t.start("init")
        result = layer.initialize(X, Y)
        t.end("init")
        return result
        
    def forward(model, X, is_train):
        t.start("forward")
        layer_Y, layer_callback = layer(X, is_train=is_train)
        t.end("forward")
        
        def backprop(dY):
            t.start("backward")
            result = layer_callback(dY)
            t.end("backward")
            return result
        
        return layer_Y, backprop
        
    return Model(f"benchmark:{layer.name}", forward, init=init)

Usage examples

Using the benchmark layer as a function

We can now wrap one or more layers (including nested layers) with the benchmark function. This is the original model:

model = chain(Linear(1), Linear(1))

In [4]:
import numpy
from thinc.api import chain, Linear

X = numpy.zeros((1, 2), dtype="f")

model = benchmark(chain(benchmark(Linear(1)), Linear(1)), name="outer")
model.initialize(X=X)
Y, backprop = model(X, is_train=False)
dX = backprop(Y)


2020-01-22 00:09:50 thinc:benchmark  linear       | INIT     | 0.000236
2020-01-22 00:09:50 thinc:benchmark  linear       | FORWARD  | 0.000424
2020-01-22 00:09:50 thinc:benchmark  outer        | INIT     | 0.004570
2020-01-22 00:09:50 thinc:benchmark  linear       | FORWARD  | 0.000116
2020-01-22 00:09:50 thinc:benchmark  outer        | FORWARD  | 0.016403
2020-01-22 00:09:50 thinc:benchmark  linear       | BACKWARD | 0.000239
2020-01-22 00:09:50 thinc:benchmark  outer        | BACKWARD | 0.003203

Using the benchmark layer as an operator

Alternatively, we can also use Model.define_operators to map benchmark to an operator like @. The left argument of the operator is the first argument passed into the function (the layer) and the right argument is the second argument (the name). The following example wraps the whole network (two chained Linear layers) in a benchmark layer named "outer", and the first Linear layer in a benchmark layer named "first".


In [5]:
from thinc.api import Model

with Model.define_operators({">>": chain, "@": benchmark}):
    model = (Linear(1) @ "first" >> Linear(1)) @ "outer"
    
model.initialize(X=X)
Y, backprop = model(X, is_train=True)
dX = backprop(Y)


2020-01-22 00:09:55 thinc:benchmark  first        | INIT     | 0.000315
2020-01-22 00:09:55 thinc:benchmark  first        | FORWARD  | 0.000106
2020-01-22 00:09:55 thinc:benchmark  outer        | INIT     | 0.009043
2020-01-22 00:09:55 thinc:benchmark  first        | FORWARD  | 0.000104
2020-01-22 00:09:55 thinc:benchmark  outer        | FORWARD  | 0.003850
2020-01-22 00:09:55 thinc:benchmark  first        | BACKWARD | 0.000133
2020-01-22 00:09:55 thinc:benchmark  outer        | BACKWARD | 0.010088

Using the benchmark layer during training


In [7]:
from thinc.api import Model, chain, Relu, Softmax, Adam

n_hidden = 32
dropout = 0.2

with Model.define_operators({">>": chain, "@": benchmark}):
    model = (
        Relu(nO=n_hidden, dropout=dropout) @ "relu1"
        >> Relu(nO=n_hidden, dropout=dropout) @ "relu2"
        >> Softmax()
    )

train_X = numpy.zeros((5, 784), dtype="f")
train_Y = numpy.zeros((5, 10), dtype="f")

model.initialize(X=train_X[:5], Y=train_Y[:5])
optimizer = Adam(0.001)
for i in range(10):
    for X, Y in model.ops.multibatch(8, train_X, train_Y, shuffle=True):
        Yh, backprop = model.begin_update(X)
        backprop(Yh - Y)
        model.finish_update(optimizer)


2020-01-22 00:10:04 thinc:benchmark  relu1        | INIT     | 0.001160
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000258
2020-01-22 00:10:04 thinc:benchmark  relu2        | INIT     | 0.000322
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000193
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000360
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000551
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000665
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.001340
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000253
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.000772
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000252
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000265
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000196
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.000274
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000170
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000223
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000364
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.001056
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000542
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000320
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000119
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.001592
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000245
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000169
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000166
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.014438
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000822
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000175
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000062
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.000636
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000184
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000149
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000090
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.000233
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000171
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000162
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000087
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.000285
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000176
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000160
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000088
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.001057
2020-01-22 00:10:04 thinc:benchmark  relu1        | FORWARD  | 0.000239
2020-01-22 00:10:04 thinc:benchmark  relu2        | FORWARD  | 0.000166
2020-01-22 00:10:04 thinc:benchmark  relu2        | BACKWARD | 0.000090
2020-01-22 00:10:04 thinc:benchmark  relu1        | BACKWARD | 0.000238
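As the output above shows, logging every single forward and backward pass gets noisy once a layer runs many times during training. A hypothetical variant (not part of Thinc) that accumulates the timings per event and reports summary statistics on demand could look like this:

```python
from timeit import default_timer
import statistics


class AggregatingTimeLogger:
    """Collects per-event timings and reports totals instead of logging each call."""

    def __init__(self, name):
        self.name = name
        self.starts = {}
        self.records = {}

    def start(self, event):
        # Record the current timestamp under the event name
        self.starts[event] = default_timer()

    def end(self, event):
        # Append the elapsed time to the list of recorded timings for this event
        elapsed = default_timer() - self.starts[event]
        self.records.setdefault(event, []).append(elapsed)

    def summary(self):
        # Per-event call count, total and mean duration
        return {
            event: {
                "calls": len(times),
                "total": sum(times),
                "mean": statistics.mean(times),
            }
            for event, times in self.records.items()
        }


t = AggregatingTimeLogger("relu1")
for _ in range(100):
    t.start("forward")
    sum(range(1000))  # stand-in for the layer's work
    t.end("forward")
print(t.summary()["forward"]["calls"])  # 100
```

Dropping this in for TimeLogger inside the benchmark wrapper would keep the training loop quiet and let you inspect t.summary() once training is done.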