Resources

binary search algorithm

I have been reading the book Grokking Algorithms by Aditya Bhargava.

While the examples in the book are given in Python2, the concepts the book is presenting are language agnostic.

These blog posts are notes taken from exploration of the algorithms and ideas from the book.

Functional programming

Goal: Express a binary search in a functional style in Python.

(I'm not sure I succeeded!)

Noting this here thinking that I could use reduce to write a functional binary search function.

  • Free download: http://www.oreilly.com/programming/free/files/functional-programming-python.pdf

    These reduce() expressions are awkward, but they also illustrate how powerful the function is in its generality: anything that can be computed from a sequence of successive elements can (if awkwardly) be expressed as a reduction.

    The function functools.reduce() is very general, very powerful, and very subtle to use to its full power. It takes successive items of an iterable, and combines them in some way.

  • From Python documentation: https://docs.python.org/3.5/library/functools.html

    Apply function of two arguments cumulatively to the items of sequence, from left to right, so as to reduce the sequence to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the sequence. If the optional initializer is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty. If initializer is not given and sequence contains only one item, the first item is returned.

NB interesting example in functional style.

Interesting example of recursion.

def identity_print(x):
    print(x)
    return x

# returns if identity_print returns 'quit'
# or calls identity_print
echo_FP = lambda: identity_print(input("FP -- "))=='quit' or echo_FP()
echo_FP()

In [1]:
(  # colors for logging display
    BLUE,
    GREEN,
    YELLOW,
    RED,
) = (
    'blue',
    'green',
    'yellow',
    'red',
)

In [2]:
# demonstration of reduce
# Thinking I could use with binary search
from functools import reduce
reduce(lambda x, y: (x, y), range(1, 10), 0)


Out[2]:
(((((((((0, 1), 2), 3), 4), 5), 6), 7), 8), 9)

In [3]:
reduce(lambda x, y: (x, y), reversed(range(1, 10)), 0)


Out[3]:
(((((((((0, 9), 8), 7), 6), 5), 4), 3), 2), 1)

In [4]:
import os
# dynamically get an objects attr without typing all the dots
# return join of path of os
reduce(getattr, ('path', 'join',), os)


Out[4]:
<function posixpath.join>

Binary search functional style.

sorted_random_numbers creates test data


In [5]:
from random import choice

def sorted_random_numbers(limit=50):
    """Create a sorted collection of pseudorandom numbers of pseudorandom length with a minimum length of 10.
    
    :limit: max length of list returned
    """
    chosen = []
    
    def random_number():
        number = None
        while number not in chosen:
            number = choice(range(1000))
            if number:
                chosen.append(number)
        return number
        
    
    return sorted(random_number() for _ in range(10, choice(range(11, limit))))

In [6]:
number_lists = [sorted_random_numbers() for _ in range(10)]
# choose a random item to find in the list
lists_with_item = [(item, choice(item)) for item in number_lists]

echo outputs colorized text to logger


In [7]:
import click

def echo(message, color):
    """Display a colored message."""
    logger.info(click.style(''.join((' ', message)), fg=color))

set up logging


In [8]:
import logging
import importlib

importlib.reload(logging)  # IPython starts logging so have to reload

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)
echo('Logger is working.', YELLOW)


INFO:__main__: Logger is working.

display test data


In [9]:
# view of the test data
SPACE = ' '
for list_, needle in lists_with_item:
    echo(SPACE.join(("List to search:", repr(list_))), GREEN)
    echo(SPACE.join(("Item to find:", str(needle))), RED)


INFO:__main__: List to search: [36, 113, 138, 142, 157, 248, 302, 384, 408, 540, 559, 585, 672, 715, 795, 849, 875, 955, 987]
INFO:__main__: Item to find: 36
INFO:__main__: List to search: [5, 44, 59, 61, 69, 70, 72, 145, 201, 231, 255, 292, 310, 317, 349, 355, 373, 382, 417, 448, 492, 501, 507, 529, 540, 618, 654, 678, 679, 704, 724, 837, 852, 870, 890, 894, 901]
INFO:__main__: Item to find: 890
INFO:__main__: List to search: [5, 27, 66, 118, 157, 166, 194, 205, 215, 225, 379, 389, 466, 487, 523, 596, 653, 686, 714, 732, 771, 796, 800, 828, 831, 867, 873, 925]
INFO:__main__: Item to find: 487
INFO:__main__: List to search: [25, 74, 85, 94, 230, 231, 268, 313, 373, 402, 503, 544, 566, 648, 664, 672, 693, 693, 707, 847, 857, 899, 899, 914, 928, 948, 964, 970]
INFO:__main__: Item to find: 544
INFO:__main__: List to search: [48, 110, 154, 234, 234, 439, 451, 455, 463, 620, 683, 717, 803, 841, 847, 896, 972, 983]
INFO:__main__: Item to find: 154
INFO:__main__: List to search: [24, 30, 112, 120, 146, 163, 197, 213, 215, 217, 223, 223, 232, 253, 254, 342, 358, 369, 375, 375, 379, 498, 498, 555, 561, 625, 715, 738, 782, 809, 845, 864, 874, 876, 896, 953, 970, 978, 985]
INFO:__main__: Item to find: 342
INFO:__main__: List to search: [9, 39, 81, 95, 142, 165, 196, 232, 272, 300, 361, 519, 585, 649, 749, 873, 892, 931, 986, 995]
INFO:__main__: Item to find: 272
INFO:__main__: List to search: [144, 256, 266, 463, 470, 505, 661, 751, 761, 775, 843, 936, 946, 949, 979]
INFO:__main__: Item to find: 505
INFO:__main__: List to search: [44, 87, 93, 93, 113, 133, 205, 278, 286, 304, 315, 332, 376, 378, 390, 405, 406, 460, 466, 483, 516, 547, 583, 607, 639, 640, 648, 668, 727, 788, 835, 873, 902, 936, 949, 972]
INFO:__main__: Item to find: 949
INFO:__main__: List to search: [58, 572, 882, 908]
INFO:__main__: Item to find: 882

In [10]:
def imperative_binary_search(list_, item):
    """Binary search algorithm in an imperative style.
    
    Inspired by Grokking Algorithms: http://bit.ly/grokking-algorithms.
    :list_: collection through which to search
    :item: item for which to search in the collection
    """
    if not list_:
        raise ValueError('List is empty.')
    low = 0
    high = len(list_) - 1
    
    while low <= high:
        mid = (low + high) // 2  # python3 integer division
        guess = list_[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid - 1
        else:
            low = mid + 1
    raise ValueError("{} not found.".format(item))

In [11]:
lists, items = zip(*lists_with_item)

assert [list_[imperative_binary_search(list_, item)] 
        for list_, item 
        in lists_with_item] \
       == list(items)
echo('{} successfully found items {}'.format(imperative_binary_search, items), GREEN)


INFO:__main__: <function imperative_binary_search at 0x7f659cab6f28> successfully found items (36, 890, 487, 544, 154, 342, 272, 505, 949, 882)

test that error is thrown for empty list


In [12]:
lists_with_item_empty_list = lists_with_item[:]
lists_with_item_empty_list.append(([], None))
try:
    assert [list_[imperative_binary_search(list_, item)] 
            for list_, item 
            in lists_with_item_empty_list]\
           == list(items)
except ValueError as e:
    error = repr(e)
    assert error == "ValueError('List is empty.',)"
    echo(repr(e), YELLOW)
    echo(repr(lists_with_item_empty_list[-1]), YELLOW)


INFO:__main__: ValueError('List is empty.',)
INFO:__main__: ([], None)

test that error is thrown for item not found


In [13]:
lists_with_item_not_found = lists_with_item[:]
lists_with_item_not_found.append(([1, 3], 2))
try:
    assert [list_[imperative_binary_search(list_, item)] 
            for list_, item 
            in lists_with_item_not_found]\
           == list(items)
except ValueError as e:
    error = repr(e)
    assert error == "ValueError('2 not found.',)"
    echo(repr(e), YELLOW)
    echo(repr(lists_with_item_not_found[-1]), YELLOW)


INFO:__main__: ValueError('2 not found.',)
INFO:__main__: ([1, 3], 2)

refactor the imperative function into lambdas


In [14]:
# calculate the midpoint index of list
mid = lambda list_: len(list_) // 2

In [15]:
# split list into left and right halves using [:] slicing
splits = lambda list_: (list_[:mid(list_)], list_[mid(list_) + 1:])

In [16]:
guess = lambda list_: list_[mid(list_)]

In [17]:
# Is the guess the item?
is_item = lambda list_, item: guess(list_) == item

In [18]:
# return the left half if guess is > item else return the right half
sublist = lambda item, list_: splits(list_)[0] if guess(list_) > item else splits(list_)[1]

In [19]:
# check list
def error_check(list_, message): 
    if not list_:
        raise ValueError(message)

Not sure how to make this into a pure function.


In [20]:
def search(item, list_):
    error_check(list_, "List is empty.")
    while not is_item(list_, item):
        list_ = sublist(item, list_)
        error_check(list_, "Item not found.")
    return True

In [21]:
# test substrate
list_ = sorted_random_numbers()
item = choice(list_)
echo(str(list_), BLUE)
echo(str(item), BLUE)


INFO:__main__: [91, 132, 147, 176, 233, 261, 270, 281, 288, 290, 313, 324, 334, 351, 372, 416, 460, 495, 583, 610, 634, 646, 648, 657, 689, 702, 767, 809, 870, 871, 872, 883, 925, 949, 978]
INFO:__main__: 872

In [22]:
found = search(item, list_)
for item in (item, list_, found):
    echo(repr(item), GREEN)


INFO:__main__: 872
INFO:__main__: [91, 132, 147, 176, 233, 261, 270, 281, 288, 290, 313, 324, 334, 351, 372, 416, 460, 495, 583, 610, 634, 646, 648, 657, 689, 702, 767, 809, 870, 871, 872, 883, 925, 949, 978]
INFO:__main__: True

In [30]:
tests = tuple(search(item, list_) for list_, item in lists_with_item)
echo(repr(tests), GREEN)
assert all(tests)


INFO:__main__: (True, True, True, True, True, True, True, True, True, True)

In [25]:
try:
    search(0, list_)
except ValueError as e:
    echo("ValueError expected.", GREEN)
    echo(repr(ValueError(e)), GREEN)


INFO:__main__: ValueError expected.
INFO:__main__: ValueError(ValueError('Item not found.',),)

In [26]:
try:
    search(0, list())
except ValueError as e:
    echo("ValueError expected.", GREEN)
    echo(repr(ValueError(e)), GREEN)


INFO:__main__: ValueError expected.
INFO:__main__: ValueError(ValueError('List is empty.',),)