What's the fuzz all about?

Randomized data generation for robust testing

  • Moritz Gronbach, Blue Yonder
  • EuroPython 2015, Bilbao, Spain

About me and why I want to talk about this

Predictive Analytics

(picture CC by Nancy Nance)

Predictive Analytics


In [ ]:
import secret_algorithms

def create_pipeline():
    pipeline = []
    pipeline.append(TimeSeriesProcessor())
    pipeline.append(WeatherData())
    pipeline.append(secret_algorithms.SuperModel())
    return Pipeline(pipeline)

Big Data

Big Complex Data

If things go wrong in our code...

Actually the reality is closer to this

For general happiness...

  • need to check as many edge cases as possible
    • before going into production
    • even the cases we don't think about! ## Dynamic unit testing can help!

What is dynamic unit testing?

Property-based testing + Fuzzing

  • Test cases are generated automatically
    • Fuzzing
    • Parameter Templates
  • Function behaviour is checked for
    • crashes
    • timeouts
    • universal properties

Static Testing

  • test cases provided by the user
  • function behaviour precisely defined by the user

Dynamic and Static

Attributes of tests

  • Precision
    • How closely is the expected behaviour defined?
  • Case Coverage
    • What proportion of the input space is covered?

Does case coverage matter?

Does it really matter if we check 5 out of 2^64 cases, or 5000 out of 2^64 cases?

Often, yes!

  • Let's say there is a numerical instability in your algorithm
  • Only one percent of all inputs are affected
  • Probability to detect this instability using five cases
    • 1 - 0.99^5, about 0.05
  • Probability to detect this instability using 5000 cases
    • 1 - 0.99^5000, nearly 1

Dynamic tests help you find case classes you didn't think about

Static and Dynamic

  • Static
    • high precision, low case coverage*
  • Dynamic
    • low precision, higher case coverage*

* usually approximately true

  • Static and Dynamic testing complement each other
  • Uncertainty principle of unit testing: can't have both high precision and high case coverage

Dynamic Testing in Python

  • We use hypothesis
    • QuickCheck-style testing for Python
    • stable, but in ongoing development
    • a lot of innovative features

Let's do an example!


In [2]:
from math import sqrt


def fib(n):
    """Computes the n-th Fibonacci number.
    fib(0) == fib(1) == 1
    fib(n) == fib(n - 1) + fib(n - 2)
    1, 1, 2, 3, 5, 8, ..."""
    sqrt_5 = sqrt(5)
    p = (1 + sqrt_5) / 2
    q = 1/p
    return int((p**n + q**n) / sqrt_5 + 0.5)

print('Defined Fibonacci function!')


Defined Fibonacci function!

In [3]:
def test_fib():
    assert(fib(1) == 1)
    assert(fib(2) == 1)
    assert(fib(3) == 2)
    assert(fib(6) == 8)
    assert(fib(50) == 12586269025)
    print("Tests passed!")

test_fib()


Tests passed!

In [4]:
from hypothesis import given
from hypothesis.strategies import integers
from hypothesis import Settings, Verbosity


# settings to increase chances of a smooth presentation
Settings.default.derandomize = True
Settings.default.max_iterations = 50
Settings.default.timeout = 20
Settings.database = None

In [5]:
@given(integers(min_value=3))
def test_fib_recurrence(n):
    assert(fib(n) == fib(n - 1) + fib(n - 2))
    
test_fib_recurrence()


Falsifying example: test_fib_recurrence(n=71)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-b0b088d19f2f> in <module>()
      3     assert(fib(n) == fib(n - 1) + fib(n - 2))
      4 
----> 5 test_fib_recurrence()

/home/DataScience/ep15/medbot/.hypothesis/eval_source/hypothesis_temporary_module_ea6bbcced2e373e81f10b114ad0149bfd6a75eed.py in test_fib_recurrence(n)
      3 def accept(f):
      4     def test_fib_recurrence(n=not_set):
----> 5         return f(n=n)
      6     return test_fib_recurrence

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.pyc in wrapped_test(*arguments, **kwargs)
    521                 test_runner(reify_and_execute(
    522                     search_strategy, falsifying_template, test,
--> 523                     print_example=True
    524                 ))
    525 

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.pyc in default_executor(function)
     24 
     25 def default_executor(function):
---> 26     return function()
     27 
     28 

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.pyc in run()
    338                 )
    339             )
--> 340         return test(*args, **kwargs)
    341     return run
    342 

<ipython-input-5-b0b088d19f2f> in test_fib_recurrence(n)
      1 @given(integers(min_value=3))
      2 def test_fib_recurrence(n):
----> 3     assert(fib(n) == fib(n - 1) + fib(n - 2))
      4 
      5 test_fib_recurrence()

AssertionError: 

In [6]:
@given(integers(min_value=3),
      settings=Settings(verbosity=Verbosity.verbose))
def test_fib_recurrence(n):
    assert(fib(n) == fib(n - 1) + fib(n - 2))
    
test_fib_recurrence()


Trying example: test_fib_recurrence(n=805)
Traceback (most recent call last):
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 494, in is_template_example
    always_print=settings.max_shrinks <= 0
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.py", line 26, in default_executor
    return function()
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 340, in run
    return test(*args, **kwargs)
  File "<ipython-input-6-eeb8c5a7a39b>", line 4, in test_fib_recurrence
    assert(fib(n) == fib(n - 1) + fib(n - 2))
AssertionError

Trying example: test_fib_recurrence(n=3)
Trying example: test_fib_recurrence(n=4)
Trying example: test_fib_recurrence(n=5)
Trying example: test_fib_recurrence(n=7)
Trying example: test_fib_recurrence(n=11)
Trying example: test_fib_recurrence(n=19)
Trying example: test_fib_recurrence(n=35)
Trying example: test_fib_recurrence(n=67)
Trying example: test_fib_recurrence(n=131)
Traceback (most recent call last):
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 494, in is_template_example
    always_print=settings.max_shrinks <= 0
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.py", line 26, in default_executor
    return function()
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 340, in run
    return test(*args, **kwargs)
  File "<ipython-input-6-eeb8c5a7a39b>", line 4, in test_fib_recurrence
    assert(fib(n) == fib(n - 1) + fib(n - 2))
AssertionError

Trying example: test_fib_recurrence(n=68)
Trying example: test_fib_recurrence(n=69)
Trying example: test_fib_recurrence(n=99)
Traceback (most recent call last):
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 494, in is_template_example
    always_print=settings.max_shrinks <= 0
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.py", line 26, in default_executor
    return function()
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 340, in run
    return test(*args, **kwargs)
  File "<ipython-input-6-eeb8c5a7a39b>", line 4, in test_fib_recurrence
    assert(fib(n) == fib(n - 1) + fib(n - 2))
AssertionError

Trying example: test_fib_recurrence(n=70)
Trying example: test_fib_recurrence(n=83)
Traceback (most recent call last):
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 494, in is_template_example
    always_print=settings.max_shrinks <= 0
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.py", line 26, in default_executor
    return function()
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 340, in run
    return test(*args, **kwargs)
  File "<ipython-input-6-eeb8c5a7a39b>", line 4, in test_fib_recurrence
    assert(fib(n) == fib(n - 1) + fib(n - 2))
AssertionError

Trying example: test_fib_recurrence(n=75)
Trying example: test_fib_recurrence(n=42)
Trying example: test_fib_recurrence(n=59)
Trying example: test_fib_recurrence(n=71)
Traceback (most recent call last):
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 494, in is_template_example
    always_print=settings.max_shrinks <= 0
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.py", line 26, in default_executor
    return function()
  File "/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.py", line 340, in run
    return test(*args, **kwargs)
  File "<ipython-input-6-eeb8c5a7a39b>", line 4, in test_fib_recurrence
    assert(fib(n) == fib(n - 1) + fib(n - 2))
AssertionError

Trying example: test_fib_recurrence(n=50)
Trying example: test_fib_recurrence(n=53)
Trying example: test_fib_recurrence(n=55)
Trying example: test_fib_recurrence(n=62)
Trying example: test_fib_recurrence(n=65)
Trying example: test_fib_recurrence(n=66)
Successfully shrunk example 4 times
Falsifying example: test_fib_recurrence(n=71)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-6-eeb8c5a7a39b> in <module>()
      4     assert(fib(n) == fib(n - 1) + fib(n - 2))
      5 
----> 6 test_fib_recurrence()

/home/DataScience/ep15/medbot/.hypothesis/eval_source/hypothesis_temporary_module_ea6bbcced2e373e81f10b114ad0149bfd6a75eed.py in test_fib_recurrence(n)
      3 def accept(f):
      4     def test_fib_recurrence(n=not_set):
----> 5         return f(n=n)
      6     return test_fib_recurrence

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.pyc in wrapped_test(*arguments, **kwargs)
    521                 test_runner(reify_and_execute(
    522                     search_strategy, falsifying_template, test,
--> 523                     print_example=True
    524                 ))
    525 

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.pyc in default_executor(function)
     24 
     25 def default_executor(function):
---> 26     return function()
     27 
     28 

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.pyc in run()
    338                 )
    339             )
--> 340         return test(*args, **kwargs)
    341     return run
    342 

<ipython-input-6-eeb8c5a7a39b> in test_fib_recurrence(n)
      2       settings=Settings(verbosity=Verbosity.verbose))
      3 def test_fib_recurrence(n):
----> 4     assert(fib(n) == fib(n - 1) + fib(n - 2))
      5 
      6 test_fib_recurrence()

AssertionError: 

What happened?

  • Sampling
    • hypothesis samples integers until it finds a falsifying example
  • Shrinking
    • hypothesis tries to simplify the falsifying example
    • here: simplest means smallest integer

Example II: Departure from Math-Wonderland


In [10]:
from urllib import quote

def test_quote():
    assert(quote('') == '')
    s = 'abc def'
    expected = 'abc%20def'
    assert(quote(s) == expected)
    print("Tests passed!")
    
test_quote()


Tests passed!

In [11]:
from urllib import unquote
from hypothesis.strategies import text

@given(text())
def test_quote_unquote(s):
    assert unquote(quote(s)) == s
    
test_quote_unquote()


Falsifying example: test_quote_unquote(s=u'\x80')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-11-f1f9c7ff0856> in <module>()
      6     assert unquote(quote(s)) == s
      7 
----> 8 test_quote_unquote()

/home/DataScience/ep15/medbot/.hypothesis/eval_source/hypothesis_temporary_module_2607385f242d85b52317e4703feaed49e1c72972.py in test_quote_unquote(s)
      3 def accept(f):
      4     def test_quote_unquote(s=not_set):
----> 5         return f(s=s)
      6     return test_quote_unquote

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.pyc in wrapped_test(*arguments, **kwargs)
    521                 test_runner(reify_and_execute(
    522                     search_strategy, falsifying_template, test,
--> 523                     print_example=True
    524                 ))
    525 

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/executors/executors.pyc in default_executor(function)
     24 
     25 def default_executor(function):
---> 26     return function()
     27 
     28 

/home/DataScience/virtualenv/local/lib/python2.7/site-packages/hypothesis/core.pyc in run()
    338                 )
    339             )
--> 340         return test(*args, **kwargs)
    341     return run
    342 

<ipython-input-11-f1f9c7ff0856> in test_quote_unquote(s)
      4 @given(text())
      5 def test_quote_unquote(s):
----> 6     assert unquote(quote(s)) == s
      7 
      8 test_quote_unquote()

/usr/lib/python2.7/urllib.pyc in quote(s, safe)
   1266     if not s.rstrip(safe):
   1267         return s
-> 1268     return ''.join(map(quoter, s))
   1269 
   1270 def quote_plus(s, safe=''):

KeyError: u'\x80'

In [ ]:
from urllib import unquote
from hypothesis.strategies import text
import string

@given(text(alphabet=string.printable))
def test_quote_unquote(s):
    assert unquote(quote(s)) == s
    
test_quote_unquote()

Departure from Math-Wonderland

  • Handling impure* objects can be challenging in assertions and parameter templates, for example:
    • String encodings
    • Datetimes

* lacking mathematical elegance

Departure from Math-Wonderland

  • Hypothesis has built-in assistance for many common cases
    • Alphabets for string generation
    • Hypothesis-datetime package
    • Hypothesis-fakefactory package

Summary

  • Dynamic tests increase case coverage at the cost of precision
  • Dynamic and static complement each other - none is a replacement for the other
  • For math-heavy code, dynamic tests are really awesome
  • For impure objects, sometimes difficult (but tools exist & steadily improve)

Out of Scope

  • Custom strategies
  • Fuzzing for unstructured and mildly structured data:
    • text
    • images

Thanks for Listening!