Week 7 : Lab Part B

Test, log, debug and optimize

CS1P - University of Glasgow - John H. Williamson - 2019/2020 -- Python 3.x

In [14]:
## RUN THIS CELL FIRST!
%load_ext autoreload
%autoreload 2

from utils.tick import tick
from utils.test_utils import should_pass, should_fail, test_dups, report_execution_time
from utils.test_utils import defuse, begin_defusal, detonate, short_hash
from utils.complexity import time_complexity            
from IPython.core.debugger import set_trace
print("All loaded OK")


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
All loaded OK

Part B

Image source: Alexandre Dulaunoy CC-SA-2.0 via flickr.com

Defusing bombs is a risky business. In this section, you will have to defuse a "logic bomb" using the debugging tools you now know.

B.1 Careful testing: using assert

Part of the defusal exercise creating number sequences to type on the numeric input pad that can be used to "fuzz" the decoder mechanism. The bomb has some built-in protection against random attacks, which will trip an emergency detonation if it ever sees consecutive sequences of digits.

The code below is supposed to remove consecutive "runs" (repeated elements) from a list of digits in the range 1-9, so that there are never more than one occurrence of an element next to each other. For example,

[1, 1, 2, 3, 5, 6, 6, 6, 9, 1, 2]  

should become

[1, 2, 3, 5, 6, 9, 1, 2]

The order of elements should be preserved. Note that 1 can appear twice -- the elements are not supposed to be unique but just to have all consecutive repeats reduced to a single element.

Any list elements that are not integers in the range 1-9 should be ignored in repetition checks and removed from the output.

This code is not a correct implementation; the "tests" below are not sufficient.

Task: Write a set of assert statements to thoroughly test the code. Find edge cases where the code does not work, and fix all of them.


In [65]:
def is_acceptable(x):
    if x is None:
        return False
    if not isinstance(x, int):
        return False
    if x < 1 or x > 9:
        return False
    return True


def remove_duplicates(l):
    l = [x for x in l if is_acceptable(x)]
    
    if len(l) < 2:
        return l
    
    result = []
    
    # Process most the elements
    for ix in range(1, len(l)):
        if l[ix-1] != l[ix]:
            result.append(l[ix-1])
            
    # Process the last element
    if len(result) == 0 or l[-1] != result[-1]:
        result.append(l[-1])
        
    return result

#hope these tests are enough!        
assert remove_duplicates([1,2,3]) == [1,2,3]
assert remove_duplicates([1,2,2]) == [1,2]
assert remove_duplicates([1, 1, 2, 2, 5, 6, 6, 6, 6, 9, 1, 2]) == [1, 2, 5, 6, 9, 1, 2]
assert remove_duplicates([]) == []
assert remove_duplicates([1]) == [1]
assert remove_duplicates([3, 3, 3]) == [3]
assert remove_duplicates([2, 2, 3, 3]) == [2, 3]
assert remove_duplicates([5,"b",5,5])==[5]
assert remove_duplicates([None])==[]
assert remove_duplicates([None, None])==[]

In [66]:
## Tests
## you won't see what caused the error
## you have to write *your* own tests!
with tick():
    test_dups(remove_duplicates)


✓ Correct


In [680]:
# Solution goes here

B.2 Getting the key

The bomb has been planted. To start the defusal, you require a defusal key, a particular code that will give you the result you need. However, the bombers are devious and have locked the code so it can't be edited directly. The code below loads and runs the code, calling the function get_key() to get the key. It doesn't work without the right variable definitions defined in advance.

Task Using the ipdb debugger (which will start automatically when you run the cell below), work out what the code is doing, and set appropriate global variables such that the test below passes.

REMEMBER: you must quit the debugger with q each time you are done. Don't just hit stop or re-run the cell!


In [6]:
## Define variables here
# they are all positive integers
master_key = 4
roman = 1
gaul = 7
egyptian = 13
greek = 5

In [ ]:
# Solution goes here

In [7]:
## Then run this
## *you cannot edit this cell*
codes = ["Rome", "Byzantium", "Carthage", "Italica", "Thebes", "Babylon"]


def stepr(x):
    j = 9
    while x > 0:
        x = x - roman
        j = j - 1
    return roman + x, codes[j]


def lemur(x):
    return [x, x + greek, x - egyptian]


def get_key():
    z, banjo = stepr(master_key)
    z += gaul
    z = lemur(z)
    z += [banjo]
    print(z)
    assert z == [8, 13, -5, "Babylon"]
    # if you get here, you've solved the puzzle
    return short_hash(z)


set_trace()  # enter the debugger
key = get_key()


--Return--
None
> <ipython-input-7-4aafc69d07c3>(29)<module>()
     26     return short_hash(z)
     27 
     28 
---> 29 set_trace()  # enter the debugger
     30 key = get_key()

ipdb> c
[8, 13, -5, 'Babylon']

In [8]:
## Tests
with tick():    
    print(key, short_hash(key))
    assert short_hash(key) == "dbf33d76"


5176c5b3 dbf33d76

✓ Correct

B.2 Checking the inputs

The next stage involves feeding carefully chosen test data into the bomb's decoder. You must make sure this data is valid before it is applied to the decoder or you risk unexpected explosions. Each defusal "tool" is wrapped up as a function. You have to implement functions that will test these functions work right before the bomb squad apply them to the device.

Write functions that define tests that test and raise an AssertionError if the function they are passed does not satisfy the requirements given. For example, we might write a function to test if another function correctly removes spaces from a string:


In [9]:
# this is the testing function
def test_remove_spaces(fn):
    assert fn("test me") == "testme"
    assert fn("") == ""
    assert fn("   ")==""
    assert fn("  - - ")=="--"

In [10]:
## Test that the test works and spaces never get to the bomb
def remove_spaces_bad(s):
    if " " in s:
        return s[:s.index(" ")] + s[s.index(" ")+1:]
        
def remove_spaces_good(s):
    return s.replace(" ", "")

with tick():    
    should_fail(test_remove_spaces, remove_spaces_bad)
    should_pass(test_remove_spaces, remove_spaces_good)


✓ Correct

Write similar testing functions like test_remove_spaces() for each of the following problems, and verify that your testing function behaves correctly.

Reversing a string


In [15]:
def test_reverse(fn):
    assert(fn("abcde") == "edcba")
    assert(fn("ABCDE") == "EDCBA")

In [16]:
# Solution goes here

In [17]:
# test_reverse() should test if the function passed 
# reverses the argument given

def reverse_good(s):
    return s[::-1]

def reverse_bad(s):
    return s

def reverse_bad_3(s):
    return "".join(reversed(s.lower()))

def reverse_bad_2(s):
    ls = list(s)
    for i in range(len(s)//2):
        ls[i], ls[len(s)-i] = ls[len(s)-i], ls[i] 
    return "".join(ls)

with tick():
    should_fail(test_reverse, reverse_bad)
    should_fail(test_reverse, reverse_bad_2)
    should_fail(test_reverse, reverse_bad_3)
    should_pass(test_reverse, reverse_good)


✓ Correct

Uppercasing a string


In [71]:
def test_upper(fn):
    assert(fn("abcde") == "ABCDE")
    assert(fn("ABCDE") == "ABCDE")
    assert(fn("ABCDx") == "ABCDX")
    assert(fn("AbCd1") == "ABCD1")
    assert(fn("aaaaBBBB") == "AAAABBBB")
    assert(fn("") == "")
    assert(fn("a") == "A")

In [72]:
# Solution goes here

In [73]:
# test_upper() should test if the function passed uppercases the string given

def upper_good(s):
    return s.upper()

def upper_bad(s):
    upper = ''
    for ch in s:
        upper += chr(ord(ch)-32)
    return upper

def upper_bad_2(s):
    return s[0].upper() + s[1:].upper() 

with tick():
    should_pass(test_upper, upper_good)
    should_fail(test_upper, upper_bad)
    should_fail(test_upper, upper_bad_2)


✓ Correct

Interleaving two lists a,b, but only including as many elements as the shorter of the two lists

interleave([1,2,3], [4,5,6,7]) = [1,4,2,5,3,6]


In [65]:
def test_interleave(fn):
    assert(fn([1, 2, 3], [4, 5, 6, 7]) == [1, 4, 2, 5, 3, 6])
    assert(fn([4, 5, 6, 7], [1, 2, 3]) == [4, 1, 5, 2, 6, 3])
    assert(fn([], []) == [])
    assert(fn([1, 2], [3, 4]) == [1, 3, 2, 4])
    assert(fn([1], []) == [])
    assert(fn([], []) == [])

In [66]:
# Solution goes here

In [67]:
# test_interleave() should test if the function passed interleaves according to the rule above

import itertools
def interleave_good(a, b):    
    return list(itertools.chain(*zip(a,b)))

def interleave_bad(a, b):
    interleaved = []
    while len(a)>0:
        interleaved.append(a.pop(0))
        interleaved.append(b.pop(0))
    return interleaved

def interleave_bad_2(a, b):
    if len(b)>0:
        return [a[0], b[0]] + interleave_bad_2(a[1:], b[1:])
    else:
        return []

def interleave_bad_3(a, b):
    w = len(a)
    if a!=b and len(a)/len(b)!=1:
        w = min(len(a), len(b))
    interleaved = []
    for i in range(w):
        interleaved.append(a[i])
        interleaved.append(b[i])
    return interleaved

with tick():
    should_pass(test_interleave, interleave_good)
    should_fail(test_interleave, interleave_bad)
    should_fail(test_interleave, interleave_bad_2)
    should_fail(test_interleave, interleave_bad_3)


✓ Correct

B.4 A race against time

At the last stage of defusal, there is a timelock that gives you only a few seconds to enter the final unlock code.

You have to find a way to search for the final defusal code quickly enough that you find the key before the bomb relocks itself. The code below will defuse the final stage of the bomb, but it is too slow to complete before the timelock re-engages.

Use %%timeit to time the execution of this code. Find a way to make this code (much) faster, that still passes the test.


In [116]:
def factorise(number):
    candidates =[]    
    for i in range(2, number+1):
        # is it divisible? if so append it
        if number%i==0:
            candidates.append(i)
            
    return candidates

def crack_the_code_slow():
    # This code will generate test sequences for the defuser.
    #
    # We know that the pattern involves factorising one of 100
    # secret numbers that we have in the file secret_numbers.txt
    #
    # The code to be generated is the digits of each unique prime
    # factor of one of the numbers, all joined into one string
    #
    # We just don't know which one produces the right code, so
    # we can test them all.    
    digit_sequence = ""
    # test all 15 numbers
    for i in range(15):
        print(i, end=" ")
        # read in th numbers
        with open("secret_numbers.txt") as f:
            for j, line in enumerate(f):
                # find the right line for this number
                if i==j:
                    n = int(line) # convert to integer
                    
        primes = []
        
        # compute the prime numbers
        for i in range(10_000):
            candidates = factorise(i)
            # is it prime (has no factors other than 1 and itself?)
            if len(candidates)==1:
                primes = primes + [i]
        
        # now factor the number
        factors = factorise(n)
        
        # filter to keep only the primes
        prime_factors = []
        for factor in factors:
            # only keep prime factors, otherwise reject them
            if factor in primes:
                prime_factors.append(factor)
                
        # now build the string up        
        for factor in prime_factors:
            # add the string version of each digit
            digit_sequence = digit_sequence + str(factor)
            print("Digits: ", digit_sequence)
        
    return digit_sequence

print(crack_the_code_slow())


0 Digits:  2
Digits:  237
Digits:  23789
1 Digits:  237897
Digits:  23789711
Digits:  2378971143
2 Digits:  23789711432
Digits:  237897114323
Digits:  2378971143237
3 Digits:  23789711432373
Digits:  2378971143237397
4 Digits:  23789711432373977211
5 
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-116-b94b11c9de0c> in <module>
     57     return digit_sequence
     58 
---> 59 print(crack_the_code_slow())

<ipython-input-116-b94b11c9de0c> in crack_the_code_slow()
     34         # compute the prime numbers
     35         for i in range(10_000):
---> 36             candidates = factorise(i)
     37             # is it prime (has no factors other than 1 and itself?)
     38             if len(candidates)==1:

<ipython-input-116-b94b11c9de0c> in factorise(number)
      3     for i in range(2, number+1):
      4         # is it divisible? if so append it
----> 5         if number%i==0:
      6             candidates.append(i)
      7 

KeyboardInterrupt: 

In [130]:
def factorise(n):
    i = 2
    factors = []
    while i * i <= n:
        if n % i:
            i += 1
        else:
            n //= i
            factors.append(i)
    if n > 1:
        factors.append(n)
    return factors


def crack_the_code_fast():
    digit_sequence = ""
    
    numbers = []
    with open("secret_numbers.txt") as f:
        numbers = [int(line.strip()) for line in f]
            
    digits = []
    for i in range(15):
        print(i, end=" ")      
        prime_factors = factorise(numbers[i])
        for x in prime_factors:
            digits.append(str(x))
    
    return "".join(digits)

crack_the_code_fast()


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
Out[130]:
'23130323350922231313352112223712287924603217311111192746123162756832027233729333337'

In [131]:
# note: it takes a long time to run the 
# validation test, so experiment here before moving on
crack_the_code_fast()


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
Out[131]:
'23130323350922231313352112223712287924603217311111192746123162756832027233729333337'

In [132]:
## Tests
import random
with open("secret_numbers.txt", "w") as f:
    for i in range(15):
        print("{:d}".format(random.randint(100, 10000)), file=f)
                                   
assert crack_the_code_fast() == crack_the_code_slow()            
# check it is fast enough
report_execution_time(crack_the_code_fast)    

print("All right!")


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Digits:  3
Digits:  35
Digits:  3523
Digits:  352323
1 Digits:  35232311
Digits:  35232311829
2 Digits:  352323118293
Digits:  3523231182937
Digits:  352323118293737
3 Digits:  3523231182937372
Digits:  35232311829373723623
4 Digits:  352323118293737236232
Digits:  352323118293737236232367
5 Digits:  3523231182937372362323672143
6 Digits:  35232311829373723623236721432
Digits:  352323118293737236232367214325
Digits:  352323118293737236232367214325151
7 Digits:  35232311829373723623236721432515119
Digits:  3523231182937372362323672143251511931
8 Digits:  352323118293737236232367214325151193119
Digits:  352323118293737236232367214325151193119271
9 Digits:  3523231182937372362323672143251511931192719643
10 Digits:  35232311829373723623236721432515119311927196432
Digits:  352323118293737236232367214325151193119271964323
Digits:  3523231182937372362323672143251511931192719643235
Digits:  3523231182937372362323672143251511931192719643235283
11 Digits:  35232311829373723623236721432515119311927196432352832
Digits:  3523231182937372362323672143251511931192719643235283223
Digits:  3523231182937372362323672143251511931192719643235283223167
12 Digits:  35232311829373723623236721432515119311927196432352832231671747
13 Digits:  3523231182937372362323672143251511931192719643235283223167174731
Digits:  3523231182937372362323672143251511931192719643235283223167174731307
14 Digits:  35232311829373723623236721432515119311927196432352832231671747313071487
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
Speed ratio: 160127.8x faster
More than 5000x faster, good job!
All right!

In [522]:
# Solution goes here

In [532]:
# Solution goes here

C: Extended problems (OPTIONAL)

These extended problems are optional for students who are keen to learn more. If you've finished the whole lab and want to explore these ideas in more depth, these problems and resources are intended to help you do that.

## You do not need to make any attempt at any of this section to receive a tick!

C.1 The tracer (optional extension)

The key only decativates part of the bomb. You have to make sure you keep the key parts of the mechanism working and disable everything else.

In this task, you have to "trace" some terribly written code, by using logging (print) statements to track what is going on.

It is nearly impossible to see what is going on at first glance, due to complicated structure and terrible naming. Carefully add your own print statements log the behaviour and work out what is happening in this code, step by step. Don't try and use the debugger in this code or you will set off the bomb.

To get the program to defuse, you must:

  • call the trigger() function with particular arguments such that it returns True, and any function marked @detonate(True) is never called.
  • comment out (i.e. remove the definition of) any function that can never be called under any circumstance ("dead functions");

You will only avoid detonation if you do both of these things correctly. @detonate(False) indicates that the function does not directly detonate the bomb. @detonate(True) will detonate the bomb if called.


In [683]:
begin_defusal()  # leave this line as it is

@detonate(True)
def c0():
    a0 = 0
    for a1 in range(100):
        a0 = a0 + a1
    print(a0*a1)


@detonate(True)
def c1():
    a1 = 8
    a0 = ["fluffy", "boom"] * a1
    return a0


@detonate(False)
def c3(p0, p0_):  
    if len(p0)%2==1:
        p0 = p0_ + [1]
    return p0

@detonate(False)
def c5(p0, p1):
    if p0 == p1:
        return p0
    else:
        return p0 + p1

@detonate(False)
def c6(p0, p1=""):
    if len(p0) == 0:
        p2 = len(set(p1))
        print(p1, p2)
        if len(p1) == 8 and p2 == len(p1) // 2:
            return True
        else:
            return False
    if p0[0] in "aeiou":
        return c6(p0[1:], p1)
    else:
        return c6(p0[1:], p1 + p0[0])

@detonate(True)
def c10(p0, p1):
    return p1 * p0

@detonate(True)
def c11(p0):
    if len(p0) < 2:
        return p0
    else:
        return c11(p0[len(p0) // 2 :]) + c11(p0[0 : len(y) // 2])

@detonate(False)
def c12(p0):
    return p0

@detonate(False)
def trigger(x, y):

    if type(y) == type([]):
        c11(y)
    if type(y) == type(""):
        y = c12(y)
    else:
        y = c5(x, y)

    a0 = ""
    for a1 in range(x[0], x[1]):
        a0 += chr(a1 + ord("A"))

    if "h" in a0:
        c1()
    elif "G" in a0:
        a2 = c5(y, y[::-1])
    assert "Q" in a0

    for a1 in y:
        if a1.lower() in a0.lower():
            c1()

    a2 = c6(a2)
    return a2

In [684]:
# you must specify values for x and y
x = None
y = None

In [685]:
# Solution goes here

In [686]:
with tick():
    assert trigger(x, y)
    defuse("92a98a01")

C.2 Timing varieties

The code below generates "sorted triangles" of strings. For example, given the string

"banjo murderer"

It will produce the list of strings

['',
 ' ',
 ' a',
 ' ab',
 ' abd',
 ' abde',
 ' abdee',
 ' abdeej',
 ' abdeejm',
 ' abdeejmn',
 ' abdeejmno',
 ' abdeejmnor',
 ' abdeejmnorr',
 ' abdeejmnorrr',
 ' abdeejmnorrru']

Use plot_complexity() along with random_string (defined below) to verify that this code take $O(N^3)$ time to run. Hint: you will have to define another, very simple function, that takes a single parameter n to work with plot_complexity.

Think about how to modify it to run in each of these complexities, and write a function that does each one, and plot the complexity graph to verify the runtime:

  • O(N^2) time sort_tri_nsqr()
  • O(N^2 log N) time sort_tri_nlogn_sqr()
  • O(N log N) time sort_tri_nlogn() (harder)

Do not try and modify my_sort(s). Instead, consider replacing it entirely the built-in Python sort (which runs in O(N log N) time by default).


In [ ]:
def my_sort(s):
    # DO NOT MODIFY THIS FUNCTION
    result = ""
    for steps in range(len(s)):
        # find smallest character
        min_ch = chr(255)
        for test_ch in s:
            # is this a new smaller character?
            if test_ch < min_ch:
                min_ch = test_ch
        result += min_ch # store smallest character
        s = s.replace(min_ch, "", 1) # remove character from s
    return result

my_sort("hello")

In [ ]:
# This is a very bad way to solve this
# problem. But it works!
def sort_tri_cube(s):    
    tri = []
    for i in range(len(s)+1):
        s_sorted = my_sort(s)
        result = ""
        for j in range(i):
            result += s_sorted[j]
        tri += [result]        
    return tri

sort_tri_cube("banjo murderer")

In [ ]:
import string, random

def random_string(n):
    return "".join([random.choice(string.printable) for i in range(n)])   

sort_tri_cube(random_string(10))

In [ ]:
# Solution goes here