In [ ]:
%%HTML
<style>
.container { width:100% } 
</style>

An Iterative Implementation of Merge Sort

The function $\texttt{sort}(L)$ sorts the list $L$ in place using merge sort. It takes advantage of the fact that, in Python, lists are stored internally as arrays. The function sort is a wrapper for the function merge_sort. Its sole purpose is to allocate the auxiliary array A, which has the same size as the array holding L.


In [ ]:
def sort(L):
    A = L[:]  # A is a copy of L
    mergeSort(L, A)

The function mergeSort is called with 2 arguments.

  • The first parameter $\texttt{L}$ is the list that is to be sorted.
  • The second parameter $\texttt{A}$ is used as an auxiliary array. This array is needed as temporary storage and is required to have the same size as the list $\texttt{L}$.

In [ ]:
def mergeSort(L, A):
    n = 1
    while n < len(L):
        k = 0
        while n * k + n < len(L):
            top = min(n * k + 2 * n, len(L))
            merge(L, n * k, n * k + n, top, A)
            k += 2    
        n *= 2

The function merge takes five arguments.

  • L is a list,
  • start is an integer such that $\texttt{start} \in \{0, \cdots, \texttt{len}(L)-1 \}$,
  • middle is an integer such that $\texttt{middle} \in \{0, \cdots, \texttt{len}(L)-1 \}$,
  • end is an integer such that $\texttt{end} \in \{0, \cdots, \texttt{len}(L)-1 \}$,
  • A is a list of the same length as L.

Furthermore, the indices start, middle and end have to satisfy the following: $$ 0 \leq \texttt{start} < \texttt{middle} < \texttt{end} \leq \texttt{len}(L) $$ The function assumes that the sublists L[start:middle] and L[middle:end] are already sorted. The function merges these sublists so that when the call returns the sublist L[start:end] is sorted. The last argument A is used as auxiliary memory.


In [ ]:
def merge(L, start, middle, end, A):
    A[start:end] = L[start:end]
    idx1 = start
    idx2 = middle
    i    = start
    while idx1 < middle and idx2 < end:
        if A[idx1] <= A[idx2]:
            L[i]  = A[idx1]
            idx1 += 1
        else:
            L[i]  = A[idx2]
            idx2 += 1
        i += 1
    if idx1 < middle:
        L[i:end] = A[idx1:middle]
    if idx2 < end:
        L[i:end] = A[idx2:end]

Testing


In [ ]:
import random as rnd
from collections import Counter

In [ ]:
def demo():
    L = [ rnd.randrange(1, 100) for n in range(1, 20) ]
    print("L = ", L)
    S = L[:]
    sort(S)
    print("S = ", S)
    print(Counter(L))
    print(Counter(S))
    print(Counter(L) == Counter(S))

In [ ]:
demo()

The function isOrdered(L) checks that the list L is sorted in ascending order.


In [ ]:
def isOrdered(L):
    for i in range(len(L) - 1):
        assert L[i] <= L[i+1]

The function sameElements(L, S) returns Trueif the lists L and S contain the same elements and, furthermore, each element $x$ occurring in L occurs in S the same number of times it occurs in L.


In [ ]:
def sameElements(L, S):
    assert Counter(L) == Counter(S)

The function $\texttt{testSort}(n, k)$ generates $n$ random lists of length $k$, sorts them, and checks whether the output is sorted and contains the same elements as the input.


In [ ]:
def testSort(n, k):
    for i in range(n):
        L = [ rnd.randrange(2*k) for x in range(k) ]
        oldL = L[:]
        sort(L)
        isOrdered(L)
        sameElements(oldL, L)
        print('.', end='')
    print()
    print("All tests successful!")

In [ ]:
%%time
testSort(100, 20000)

In [ ]:
k = 1000000
L = [ rnd.randrange(2*k) for x in range(k) ]

In [ ]:
%%time
sort(L)

In [ ]:
L = [ rnd.randrange(1000) for x in range(k) ]

In [ ]:
%%time
L = sort(L)

In [ ]:
help(sorted)

In [ ]:
6.5 / 0.309

In [ ]: