Consider a finite collection of orderable elements. Re-arranging that collection, so that the collection is completely ordered is called sorting. There are many techiniques to sort a collection. Following are some of the comparision based Sorting Algorithms.
Before looking at the analysis part, we shall examine the Language in built methods to sorting
sorted(collection,reverse = False[,key])
This function takes an iterable as argument, and returns it in sorted form based on key
. If key
is not given, sorting is done according to default comparision rules. Let's see the examples and understand the working of sorted()
. If reverse
is True
, reversed collection is returned after sorting.
In [1]:
x = list(range(10))
import random
random.shuffle(x)
In [2]:
x
Out[2]:
In [3]:
sorted(x)
Out[3]:
In [4]:
import math
y = sorted(x,key = lambda x: math.sin(x)) # Sort elements of x in increasing order of their sines
y
Out[4]:
In [5]:
[math.sin(i) for i in y]
Out[5]:
Note how the elements of sin(y)
are in increasing order.
In [2]:
from openanalysis.sorting import SortingAlgorithm,SortAnalyzer
import numpy as np # for doing vstack()
SortingAlgorithm
is the base class providing the standards to implement sorting algorithms, SortAnalyzer
visualizes and analyses the algorithm
SortingAlgorithm
classAny sorting algorithm, which has to be implemented, has to be derived from this class. Now we shall see data members and member functions of this class.
name
- Name of the Sorting Algorithmcount
- Holds the number of basic operations performedhist_arr
- A 2D numpy
array, holding the instances of array, as exchange is performed__init__(self, name):
- Initializes algorithm with a name
sort(self, array, visualization):
- The base sorting function. Sets count
to 0. array
is 1D numpy
array, visualization
is a bool
indicating whether array
has to be vstack
ed into hist_arr
Now we shall implement the class BubbleSort
In [7]:
class BubbleSort(SortingAlgorithm): # Derived from SortingAlgorithm
def __init__(self):
SortingAlgorithm.__init__(self, "Bubble Sort") # Initializing with name
def sort(self, array, visualization=False): # MUST have this signature
SortingAlgorithm.sort(self, array, visualization) # sets self.count to 0
for i in range(0, array.size): # Not len(array)
exch = False
for j in range(0, array.size - i - 1):
self.count += 1 # Increment self.count after each basic operation
if array[j] > array[j + 1]:
array[j], array[j + 1] = array[j + 1], array[j]
exch = True
if visualization:
self.hist_array = np.vstack([self.hist_array, array]) # Save the current state to hist_array
if not exch:
break
if visualization:
self.hist_array = np.vstack([self.hist_array, array]) # Save the final state to hist_array
SortAnalyzer
classThis class provides the visualization and analysis methods. Let's see its methods in detail
__init__(self, sorter):
Initializes visualizer with a Sorting Algorithm. sorter
is a class, which is derived from SortingAlgorithm
visualize(self, num=100, save=False):
Visualizes the given algorithm with a randomly shuffled array.num
size of randomly shuffled arraysave
is True
means animation is saved in output/
analyze(self, maxpts=1000):
maxpts
in the steps of 100, and counting the number of
basic operationsmaxpts
- Upper bound on size of elements chosen for analysing efficiency
In [8]:
bubble_visualizer = SortVisualizer(BubbleSort)
In [9]:
bubble_visualizer.efficiency()
As you can see in the above plot, BubbleSort
takes $\mathcal{O}(n)$ time on best case and $\mathcal{O}(n^2)$ time on both avarage and worst cases
You can call the visualize
function as shown below and see the 'mp4' file saved at output/
folder
bubble_visualizer.visualize(save=True)
compare(algs)
algs
is a list of classes derived from SortingAlgorithm
. It performs tests and plots the bar graph comparing the number of basic operations performed by each algorithm.
class
if sorting could be done using a functionWe have just seen how BubbleSort
is implemented. Every sorting algorithm is not as simple as BubbleSort
. QuickSort
and MergeSort
needs several auxiliary methods to work with. If they are scattered throughout the code, they decrease the readability. So it is better to pack everything in a class.
You can see more examples at Github