Quicksort

Question 1

The file quickSort.txt contains all of the integers between 1 and 10,000 (inclusive, with no repeats) in unsorted order. The integer in the ith row of the file gives you the ith entry of an input array.

Your task is to compute the total number of comparisons used to sort the given input file by QuickSort. As you know, the number of comparisons depends on which elements are chosen as pivots, so we'll ask you to explore three different pivoting rules.

You should not count comparisons one-by-one. Rather, when there is a recursive call on a subarray of length m, you should simply add m−1 to your running total of comparisons. (This is because the pivot element is compared to each of the other m−1 elements in the subarray in this recursive call.)
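A minimal sketch of this accounting, assuming first-element pivoting and an in-place partition loop; the name count_comparisons and the three-element inputs are illustrative and not part of the assignment:

```python
def count_comparisons(values):
    """Sort a copy of `values` with first-element-pivot QuickSort and
    return the tally obtained by adding m-1 per recursive call."""
    a = list(values)

    def sort(lo, hi):
        m = hi - lo + 1              # length of the current subarray
        if m <= 1:
            return 0                 # nothing to compare
        pivot = a[lo]
        boundary = lo + 1            # a[lo+1:boundary] holds elements < pivot
        for k in range(lo + 1, hi + 1):
            if a[k] < pivot:
                a[boundary], a[k] = a[k], a[boundary]
                boundary += 1
        p = boundary - 1             # final position of the pivot
        a[lo], a[p] = a[p], a[lo]
        # The pivot was compared with the other m-1 elements in this call.
        return (m - 1) + sort(lo, p - 1) + sort(p + 1, hi)

    return sort(0, len(a) - 1)


print(count_comparisons([3, 1, 2]))
print(count_comparisons([2, 1, 3]))
```

The loop body runs exactly m−1 times per call, so the m−1 shortcut and a one-by-one counter should agree.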

WARNING: The Partition subroutine can be implemented in several different ways, and different implementations can give you differing numbers of comparisons. For this problem, you should implement the Partition subroutine exactly as it is described in the video lectures (otherwise you might get the wrong answer).

DIRECTIONS FOR THIS PROBLEM:

For the first part of the programming assignment, you should always use the first element of the array as the pivot element.


In [353]:
COUNT = 0

array = [5, 4, 10, 2, 6, 9]
# array = [5, 4, 10, 2, 6, 9, 60, 59, 58, 56]
# array = [2148, 9058, 7742, 3153, 6324, 609, 7628, 5469, 7017, 504]

# Read one integer per line; the with-block closes the file automatically.
with open("quickSort.txt", 'r') as fp:
    array = [int(x.strip()) for x in fp]
#array = array[:100]

#print ("START", array)

def quicksort(i, j):
    #print ("quicksort",i, j)
    if j-i < 1:
        return 0
    
    # Count comparisons
    k = partition(i, j)
    
    countleft = quicksort( i, k-1)
    countright = quicksort( k+1, j)

    count = countleft +  countright + (j-i)

    #print(array)
    #print ("countleft ,  countright, i, j", countleft ,  countright, i, j)
    return count

    
def swap(i, j):
    #print("swap", i, j, array)

    # Exchange the two entries in place.
    array[i], array[j] = array[j], array[i]

def partition(i, j):
    global COUNT
    
    #print ("Partition", i, j, array[i:j+1])
    
    START_IDX = i+1
    END_IDX = j    
    PIVOT_IDX = i
        
    pivot = array[PIVOT_IDX]
    s = i + 1
        
    #print (START_IDX, END_IDX+1)
    for k in range(START_IDX, END_IDX+1): 
        COUNT += 1 
        #print (k)

        if array[k] < pivot:    
            swap(s, k)            
            s+=1  
            
    swap(PIVOT_IDX, s-1)  
    
    #print ("DONE", array)
    return s-1


myCount = quicksort(0, len(array)-1)            
#print (array)
print ("myCount", myCount)
print ("COUNT", COUNT)


myCount 162085
COUNT 162085

Question 2

GENERAL DIRECTIONS AND HOW TO GIVE US YOUR ANSWER:

See the first question.

DIRECTIONS FOR THIS PROBLEM:

Compute the number of comparisons (as in Problem 1), always using the final element of the given array as the pivot element. Again, be sure to implement the Partition subroutine exactly as it is described in the video lectures.

Recall from the lectures that, just before the main Partition subroutine, you should exchange the pivot element (i.e., the last element) with the first element.
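The exchange step can be sketched as follows, assuming a partition routine that always treats the first element of the subarray as the pivot; the helper name partition_with_pivot and its pivot_idx parameter are hypothetical, introduced only for illustration:

```python
def partition_with_pivot(a, lo, hi, pivot_idx):
    """Partition a[lo..hi] around a[pivot_idx]; return the pivot's final index."""
    # Move the chosen pivot to the front so the usual routine applies unchanged.
    a[lo], a[pivot_idx] = a[pivot_idx], a[lo]
    pivot = a[lo]
    boundary = lo + 1                # a[lo+1:boundary] holds elements < pivot
    for k in range(lo + 1, hi + 1):
        if a[k] < pivot:
            a[boundary], a[k] = a[k], a[boundary]
            boundary += 1
    # Put the pivot between the smaller and the larger elements.
    a[lo], a[boundary - 1] = a[boundary - 1], a[lo]
    return boundary - 1


data = [5, 4, 10, 2, 6, 9]
# Use the last element (9) as the pivot.
final_pos = partition_with_pivot(data, 0, len(data) - 1, len(data) - 1)
print(final_pos, data)
```

Passing lo as pivot_idx reduces this to the first-element rule of Question 1, since the initial exchange then swaps an element with itself.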


In [350]:
COUNT = 0

array = [5, 4, 10, 2, 6, 9]
# array = [5, 4, 10, 2, 6, 9, 60, 59, 58, 56]
# array = [2148, 9058, 7742, 3153, 6324, 609, 7628, 5469, 7017, 504]

# Read one integer per line; the with-block closes the file automatically.
with open("quickSort.txt", 'r') as fp:
    array = [int(x.strip()) for x in fp]
#array = array[:100]

#print ("START", array)

def quicksort(i, j):
    #print ("quicksort",i, j)
    if j-i < 1:
        return 0
    
    # Count comparisons
    k = partition(i, j)
    
    countleft = quicksort( i, k-1)
    countright = quicksort( k+1, j)

    count = countleft +  countright + (j-i)

    #print(array)
    #print ("countleft ,  countright, i, j", countleft ,  countright, i, j)
    return count

    
def swap(i, j):
    #print("swap", i, j, array)

    # Exchange the two entries in place.
    array[i], array[j] = array[j], array[i]

def partition(i, j):
    global COUNT
    
    #print ("Partition", i, j, array[i:j+1])
    swap(i, j)
    
    START_IDX = i+1
    END_IDX = j    
    PIVOT_IDX = i
        
    pivot = array[PIVOT_IDX]
    s = i + 1
        
    #print (START_IDX, END_IDX+1)
    for k in range(START_IDX, END_IDX+1): 
        COUNT += 1 
        #print (k)

        if array[k] < pivot:    
            swap(s, k)            
            s+=1  
            
    swap(PIVOT_IDX, s-1)  
    
    #print ("DONE", array)
    return s-1


myCount = quicksort(0, len(array)-1)            
#print (array)
print ("myCount", myCount)
print ("COUNT", COUNT)


myCount 164123
COUNT 164123

Question 3

GENERAL DIRECTIONS AND HOW TO GIVE US YOUR ANSWER:

See the first question.

DIRECTIONS FOR THIS PROBLEM:

Compute the number of comparisons (as in Problem 1), using the "median-of-three" pivot rule. [The primary motivation behind this rule is to do a little bit of extra work to get much better performance on input arrays that are nearly sorted or reverse sorted.]

In more detail, you should choose the pivot as follows. Consider the first, middle, and final elements of the given array. (If the array has odd length it should be clear what the "middle" element is; for an array with even length 2k, use the kth element as the "middle" element. So for the array 4 5 6 7, the "middle" element is the second one, 5, and not 6!) Identify which of these three elements is the median (i.e., the one whose value is in between the other two), and use this as your pivot. As discussed in the first and second parts of this programming assignment, be sure to implement Partition exactly as described in the video lectures (including exchanging the pivot element with the first element just before the main Partition subroutine).

EXAMPLE: For the input array 8 2 4 5 7 1 you would consider the first (8), middle (4), and last (1) elements; since 4 is the median of the set {1,4,8}, you would use 4 as your pivot element.
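The example can be checked with a short sketch. The function name median_of_three_index is hypothetical, and sorting the three candidates is just one convenient way to pick the median. For a two-element subarray the middle index coincides with the first, and this sketch returns the first index; either choice gives the same m−1 tally.

```python
def median_of_three_index(a, lo, hi):
    """Return the index of the median among a[lo], a[mid], a[hi]."""
    # For even length 2k this is the kth element; for odd length, the center.
    mid = lo + (hi - lo) // 2
    # Sort (value, index) pairs and take the middle pair's index.
    candidates = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return candidates[1][1]


example = [8, 2, 4, 5, 7, 1]
idx = median_of_three_index(example, 0, len(example) - 1)
print(idx, example[idx])
```

For the array 4 5 6 7 the same function considers 4, 5, and 7 and returns the index of 5, matching the even-length rule stated above.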

SUBTLE POINT: A careful analysis would keep track of the comparisons made in identifying the median of the three candidate elements. You should NOT do this. That is, as in the previous two problems, you should simply add m−1 to your running total of comparisons every time you recurse on a subarray with length m.


In [348]:
COUNT = 0

array = [5, 4, 10, 2, 6, 9]
# array = [5, 4, 10, 2, 6, 9, 60, 59, 58, 56]
array = [2148, 9058, 7742, 3153, 6324, 609, 7628, 5469, 7017, 504]

# Read one integer per line; the with-block closes the file automatically.
with open("quickSort.txt", 'r') as fp:
    array = [int(x.strip()) for x in fp]

#array = array

print ("START")

def quicksort(i, j):
    #print ("quicksort",i, j)
    if j-i < 1:
        return 0
    
    # Count comparisons
    k = partition(i, j)
    
    countleft = quicksort( i, k-1)
    countright = quicksort( k+1, j)

    count = countleft +  countright + (j-i)

    #print(array)
    #print ("countleft ,  countright, i, j", countleft ,  countright, i, j)
    return count

    
def swap(i, j):
    #print("swap", i, j, array)

    # Exchange the two entries in place.
    array[i], array[j] = array[j], array[i]

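def find_median(i,j):
    """Return the index (i, m, or j) of the median of the first, middle, and last elements."""
    n = j - i + 1 # number of elements in array[i..j]
    # Middle index: the kth element for even n = 2k, the exact center for odd n.
    m = i + (n - 1) // 2

    #print (array[i:j+1], array[m], i, m, j)
    
    # Values are distinct, so strict comparisons identify the median.
    # When m == i (subarray of length 2) both tests fail and j is returned.
    if (array[m] > array[i] and array[m] < array[j]) or (array[m] > array[j] and array[m] < array[i]): 
        return m
    elif (array[i] > array[m] and array[i] < array[j]) or (array[i] > array[j] and array[i] < array[m]): 
        return i
    else:
        return j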


def partition(i, j):
    global COUNT
    
    #print ("Partition", i, j, array[i:j+1])
    m = find_median(i,j)
    #print("swap", i, m, array)
    swap(m, i)
    
    
    START_IDX = i+1
    END_IDX = j    
    PIVOT_IDX = i
        
    pivot = array[PIVOT_IDX]
    s = i + 1
        
    #print (START_IDX, END_IDX+1)
    for k in range(START_IDX, END_IDX+1): 
        COUNT += 1 
        #print (k)

        if array[k] < pivot:    
            swap(s, k)            
            s+=1  
            
    swap(PIVOT_IDX, s-1)  
    
    #print ("DONE", array)
    return s-1


myCount = quicksort(0, len(array)-1)            
#print (array)
print ("myCount", myCount)
print ("COUNT", COUNT)


START
myCount 138382
COUNT 138382
