In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -u -d -v


Sebastian Raschka 
last updated: 2017-07-24 

CPython 3.6.1
IPython 6.0.0

Bubble Sort

Quick note about Bubble sort

I don't want to go into the details of sorting algorithms here, but there is a great report,
"Sorting in the Presence of Branch Prediction and Caches - Fast Sorting on Modern Computers" by Paul Biggar and David Gregg, which describes and analyzes elementary sorting algorithms in detail (see chapter 4).

And for a quick reference, this website has a nice animation of this algorithm.

Long story short: the worst-case time complexity of the Bubble sort algorithm, in Big-O notation, is
$\Rightarrow \pmb O(n^2)$
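
To make the quadratic growth concrete, here is a minimal sketch that counts comparisons. It assumes a textbook variant whose inner loop shrinks by one element per pass and has no early exit; the helper name `bubblesort_count` is hypothetical and not part of this notebook. For this variant the count is $n(n-1)/2$ for any input of length $n$.

```python
def bubblesort_count(values):
    """Bubble-sort a copy of `values`; return (sorted_copy, number_of_comparisons)."""
    items = list(values)
    n = len(items)
    comparisons = 0
    for i in range(n):
        # After pass i, the largest i + 1 items sit in their final positions,
        # so the inner loop can stop one element earlier each time.
        for j in range(n - i - 1):
            comparisons += 1
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items, comparisons


for n in (10, 100, 1000):
    _, comparisons = bubblesort_count(range(n, 0, -1))  # reversed input
    print(n, comparisons, n * (n - 1) // 2)
```

Increasing the input size tenfold increases the comparison count roughly a hundredfold, which is what $O(n^2)$ predicts.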



Bubble sort implemented in (C)Python


In [4]:
def python_bubblesort(a_list):
    """ Bubblesort in Python for list objects (sorts in place)."""
    length = len(a_list)
    for i in range(length):
        for j in range(1, length):
            if a_list[j] < a_list[j-1]:
                a_list[j-1], a_list[j] = a_list[j], a_list[j-1]
    return a_list


Below is an improved version that quits early if no further swap is needed.


In [5]:
def python_bubblesort_improved(a_list):
    """ Bubblesort in Python for list objects (sorts in place)."""
    length = len(a_list)
    swapped = 1
    for i in range(length):
        if swapped: 
            swapped = 0
            for ele in range(length-i-1):
                if a_list[ele] > a_list[ele + 1]:
                    temp = a_list[ele + 1]
                    a_list[ele + 1] = a_list[ele]
                    a_list[ele] = temp
                    swapped = 1
    return a_list
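
To see what the early exit buys, the following sketch counts how many passes the improved logic makes over the list. The function `bubblesort_passes` is a hypothetical helper written for this illustration; it follows the same swapped-flag idea as the implementation above but returns the pass count instead of the list.

```python
def bubblesort_passes(a_list):
    """Bubble sort with early exit (in place); returns the number of passes made."""
    length = len(a_list)
    passes = 0
    swapped = True
    for i in range(length):
        if not swapped:
            break  # the previous pass made no swap, so the list is sorted
        swapped = False
        passes += 1
        for ele in range(length - i - 1):
            if a_list[ele] > a_list[ele + 1]:
                a_list[ele], a_list[ele + 1] = a_list[ele + 1], a_list[ele]
                swapped = True
    return passes


print(bubblesort_passes(list(range(10))))         # already sorted input
print(bubblesort_passes(list(range(10, 0, -1))))  # reversed input
```

An already-sorted list needs a single pass, whereas a reversed list needs one pass per element, so the benefit depends heavily on how ordered the input already is.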

Verifying that all implementations work correctly


In [6]:
import random
import copy
random.seed(4354353)

l = [random.randint(1,1000) for num in range(1, 1000)]
l_sorted = sorted(l)
for f in [python_bubblesort, python_bubblesort_improved]:
    assert(l_sorted  == f(copy.copy(l)))
print('Bubblesort works correctly')


Bubblesort works correctly

Performance comparison


In [7]:
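# small list
# Note: both functions sort in place, so every %timeit run after the first
# receives an already-sorted list; the timings below mostly reflect sorted input.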

l_small = [random.randint(1,100) for num in range(1, 100)]
l_small_cp = copy.copy(l_small)

%timeit python_bubblesort(l_small)
%timeit python_bubblesort_improved(l_small_cp)


100 loops, best of 3: 1.42 ms per loop
The slowest run took 95.97 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 24.8 µs per loop

In [8]:
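# larger list
# Note: both functions sort in place, so after the first run %timeit
# measures already-sorted input.

l_large = [random.randint(1,10000) for num in range(1, 10000)]
l_large_cp = copy.copy(l_large)

%timeit python_bubblesort(l_large)
%timeit python_bubblesort_improved(l_large_cp)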


1 loop, best of 3: 19.5 s per loop
The slowest run took 7804.31 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 2.45 ms per loop
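
Because both functions sort in place, repeated timing runs on the same list mostly measure already-sorted input, which likely explains the "slowest run" warnings above. One possible way to time unsorted input on every run is to copy the list inside the timed call, as in this sketch. It uses the standard-library `timeit` module instead of the `%timeit` magic, and `bubblesort_early_exit` is a hypothetical stand-in for the improved implementation; the measured time then includes the cost of the copy.

```python
import copy
import random
import timeit


def bubblesort_early_exit(a_list):
    """Bubble sort with early exit (sorts in place and returns the list)."""
    length = len(a_list)
    for i in range(length):
        swapped = False
        for ele in range(length - i - 1):
            if a_list[ele] > a_list[ele + 1]:
                a_list[ele], a_list[ele + 1] = a_list[ele + 1], a_list[ele]
                swapped = True
        if not swapped:
            break
    return a_list


random.seed(0)
data = [random.randint(1, 100) for _ in range(100)]
original = copy.copy(data)
runs = 100

# Copy inside the timed call so that every run starts from unsorted input.
unsorted_time = min(timeit.repeat(
    lambda: bubblesort_early_exit(copy.copy(data)), number=runs, repeat=3)) / runs

# For comparison: the best case, where the input is already sorted.
presorted = sorted(data)
sorted_time = min(timeit.repeat(
    lambda: bubblesort_early_exit(presorted), number=runs, repeat=3)) / runs

print('unsorted input: %.1f µs per run' % (unsorted_time * 1e6))
print('sorted input:   %.1f µs per run' % (sorted_time * 1e6))
```

Absolute numbers depend on the machine and Python version, so no timings are quoted here.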