Sort algorithms

Divided and Conquer Sort

divide and conquer (D&C) is an algorithm design paradigm based on multi-branched recursion. A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.

Quick sort

QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the given array around the picked pivot. There are many different versions of quickSort that pick pivot in different ways.

Reference: http://www.geeksforgeeks.org/quick-sort/

Merge sort

This is not the in place sort as quick sort. need extra space to sort.

Reference: http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/mergeSort.htm

Pro and cons of quick sort and merge sort

For Merge sort worst case is $O(n*log(n))$, for Quick sort: $O(n^2)$. For other cases (avg, best) both have $O(n*log(n))$. However Quick sort is space constant where Merge sort depends on the structure you're sorting.


In [15]:
import timeit

def time_decorator(func):
    """ This for test the run time """
    start = timeit.default_timer()

    def warpper(*args, **kwargs):
        return func(*args, **kwargs)

    stop = timeit.default_timer()

    print('*** Run time: %.4f milli seconds' % ((stop - start)*1000.))

    return warpper

In [16]:
## Quick sort (pick the last as pivot, value)

def partition(arr: list, low: int, high: int):
    """"""
    open_index = low         # the open card! 
    pivot_value = arr[high]  # pivot value, not index here
    
    for i in range(low, high):  # the last one the the pivot, no need to change

        if arr[i] <= pivot_value:
            arr[open_index], arr[i] = arr[i], arr[open_index]
            open_index += 1

    arr[open_index], arr[high] = arr[high], arr[open_index]

    return open_index

@time_decorator
def quick_sort(arr: list, low: int, high: int):
    """"""
    if low < high:  # termination
        # Divide
        pivot = partition(arr, low, high)
        # Conquer
        quick_sort(arr, low, pivot-1)
        quick_sort(arr, pivot, high)

## test
arr = [2,3,5,6,1,3,7,99, 10, 200, 10000]
quick_sort(arr, 0, len(arr)-1)
print(arr)


*** Run time: 0.0008 milli seconds
[1, 2, 3, 3, 5, 6, 7, 10, 99, 200, 10000]

In [42]:
## Quick sort (random pick pivot value)
from random import randint

def random_partition(arr: list, low: int, high: int):
    """"""
    open_index = low
    random_pivot_value = arr[randint(low, high-1)]
    # put pivot value at the end of the arr
    arr[random_pivot_value], arr[high] = arr[high], arr[random_pivot_value]
    # the following is as same as last pivot case....
    
    for i in range(low, high):
        
        if arr[i] <= random_pivot_value:
            arr[open_index], arr[i] = arr[i], arr[open_index]
            open_index += 1

    arr[open_index], arr[high] = arr[high], arr[open_index]
    
    return open_index  # this the next divide point, less, less, partition, large, large

@time_decorator
def random_quick_sort(arr: list, low: int, high: int):
    """"""
    if low < high:
        pivot = random_partition(arr, low, high)
        
        random_quick_sort(arr, low, pivot-1)
        random_quick_sort(arr, pivot+1, high)

## test
arr = [2,3,5,6,1,3,7,99, 10, 200,10000]
quick_sort(arr, 0, len(arr)-1)
print(arr)  # this is a in place sort


*** Run time: 0.0005 milli seconds
[1, 2, 3, 3, 5, 6, 7, 10, 99, 200, 10000]

In [47]:
## Merge sort

def merge(left: list, right: list):
    """ Conquer part """
    if not len(left) or not len(right):  # just a number??
        return left or right

    result = []
    i, j = 0, 0

    while (len(result) < len(left) + len(right)):
        
        # put the smaller one at the left hand of the list
        if left[i] < right[j]:
            result.append(left[i])
            i += 1

        else:
            result.append(right[j])
            j += 1
        
        # until i or j reach the boundary of one part, and put the other part at
        # the end of the result.
        if i == len(left) or j == len(right):
            result.extend(left[i:] or right[j:])
            break

    return result


@time_decorator
def merge_sort(arr: list):
    """"""
    if len(arr) < 2:
        return arr

    middle = len(arr)//2
    
    # divide
    left = merge_sort(arr[:middle])
    right = merge_sort(arr[middle:])

    return merge(left, right)  # merge /conquer

# test
arr = [2,3,5,6,1,3,7,99, 10, 200,10000]
print(merge_sort(arr))  # not a in place sort!!


*** Run time: 0.0005 milli seconds
[1, 2, 3, 3, 5, 6, 7, 10, 99, 200, 10000]

In [57]:
## Application of a merge sort
def merge(arr: list, left: list, right: list):
    """"""
    if not len(left) or not len(right):  # just a number??
        return left or right

    count = 0
    i, j = 0, 0  # i is left init pointer, j is the right init pointer

    result = []
    
    print('Sub-array passed in: %s' % arr)
    print('Left: %s, right: %s' % (left, right))
    
    while(len(result) < len(left)+len(right)):
         
        print(i, j)

        if left[i] < right[j]:  # right order
            result.append(left[i])
            i += 1

        else:
            result.append(right[j])
            j += 1
            count += 1

        if i == len(left) or j == len(right):
            result.extend(left[i:] or right[j:])  # ????
            break

    print('Sub-array after sort: %s' % result)

    return count


def count_iversion(arr: list):
    """
    In an array, , the elements at indices  and  (where ) form an inversion if . In other       words, inverted elements  and  are considered to be "out of order". To correct an       

    inversion, we can swap adjacent elements.

    For example, consider . It has two inversions:  and . To sort the array, we must   

    perform the following two swaps to correct the inversions:

    Given  datasets, print the number of inversions that must be swapped to sort each 

    dataset on a new line.
    """
    length = len(arr)
    if length <= 1:  # terimate case
        return 0

    mid = length // 2
    left = arr[: mid]
    right = arr[mid: ]

    print('Divide to %s and %s' %(left, right))
    
    return count_iversion(left) + count_iversion(right) + merge(arr, left, right)


# test
arr = [1, 1, 1, 2, 2]
print(count_iversion(arr))  # TODO: still problem here!!! Not sorted at all......


Divide to [1, 1] and [1, 2, 2]
Divide to [1] and [1]
Sub-array passed in: [1, 1]
Left: [1], right: [1]
0 0
Sub-array after sort: [1, 1]
Divide to [1] and [2, 2]
Divide to [2] and [2]
Sub-array passed in: [2, 2]
Left: [2], right: [2]
0 0
Sub-array after sort: [2, 2]
Sub-array passed in: [1, 2, 2]
Left: [1], right: [2, 2]
0 0
Sub-array after sort: [1, 2, 2]
Sub-array passed in: [1, 1, 1, 2, 2]
Left: [1, 1], right: [1, 2, 2]
0 0
0 1
1 1
Sub-array after sort: [1, 1, 1, 2, 2]
3

In [ ]: