Sorting is the process of placing elements from a collection in some kind of order. This suggests that sorting is an important area of study in computer science and sorting also provides instructive examples of how algorithms and data structures work together, and of how the correct choice of data structure can make dramatic improvements in execution speed and memory usage.

An understanding of **recursion** allows us to consider more sophisticated procedures for sorting, notably quick sort, one of the fastest sorting algorithms. An understanding of recursive algorithms also leads also to the recognition that data structures themselves can be recursive. This insight is illustrated through an analysis of the heap sort algorithm, in which a recursive data structure known as a **heap** is used for the purposes of sorting.

Suppose we have a collection, $C$, of items of data, (and we limit our discussion to dealing mostly with linear structures). Sorting $C$ means rearranging (reordering) its data items, based on an **ordering property** of each item, into ascending or descending order (and a reordering of a sequence has exactly the same items as the original sequence, but in a different order).

If it is sorted into ascending order then if item $A$ comes before item $B$ it must also be true that $A \leq B$, according to the particular ordering property we are using. If $C$ is sorted in descending order then we would have $A \geq B$.

A formal, mathematical statement of sorting in, say, *ascending order* would run as follows:

Given a sequence of $n$ items $(x1, x2, …, xn)$, find a reordering $(x'_1, x'_2, ..., x'_n)$ such that $x'_1 \leq x'_2 \leq ... \leq x'_n$.

This leads us into writing our specification for *ascending* sort:

Name: | Sort |
---|---|

Inputs: | A sequence of elements $C = \{c_1, c_2, c_3, ..., c_n\}$ |

Outputs: | A re-ordering of $C$: $C' = \{c'_1, c'_2, c'_3, ..., c'_n\}$ |

Preconditions: | All $c_i$ in $C$ must have the same ordering property |

Postcondition: | $c'_1\leq c'_2\leq c'_3\leq ... \leq c'_n$ |

and we continue, in general terms, but conduct our discussion in terms of ascending sort for simplicity.

To sort a small number of items, a complex sorting method may be more trouble than it is worth and the overhead may be too high – a simple, rough-and-ready, unsystematic method may suffice. We often refer to such simple strategies as **naive sorting** or **straight sorting** methods.

Sorting a large number of items can take a substantial amount of computing resources and the efficiency of a sorting algorithm becomes material, and every significant improvement can matter. Two different units of computation are commonly considered when measuring the complexity and evaluating the overall efficiency of a sorting algorithm:

*Comparison*, the most commonly used unit, where we determine if one item is larger or smaller than another; and*Swap*, if, during sorting, two values are found to be out of order, exchanging the positions of the two items is a swap. This exchange is costly.

This algorithm iterates several times over the list. On the first pass it ‘bubbles’ the largest item up to its correct place, through a series of swaps. On the next pass it does the same with the next largest item, and so on, but the length of the section of the sequence over which further comparisons and swaps are made are reduced by 1 containing comparison to the unsorted portion. In this way, the items at the end, which are in their correct positions, no long need to be compared (as they have been sorted).

```
In [23]:
```def bubbleSort(alist):
for passnum in range(len(alist)-1,0,-1):
for i in range(passnum):
if alist[i]>alist[i+1]:
temp = alist[i]
alist[i] = alist[i+1]
alist[i+1] = temp
# print(alist) TO SEE EACH STEP
# Note the use of three-step assignment rather than simulatenous assignment, i.e. a, b = b, a.
alist = [54,26,93,17,77,31,44,55,20]
bubbleSort(alist)
print(alist)

```
```

The next piece of code shows this modification, which is often referred to as the **short bubble**.

```
In [24]:
```def shortBubbleSort(alist):
exchanges = True
passnum = len(alist)-1
while passnum > 0 and exchanges:
exchanges = False
for i in range(passnum):
if alist[i]>alist[i+1]:
exchanges = True
temp = alist[i]
alist[i] = alist[i+1]
alist[i+1] = temp
# print(alist)
passnum = passnum-1
alist=[20,30,40,90,50,60,70,80,100,110]
shortBubbleSort(alist)
print(alist)

```
```

**the hares and the tortoises**:

The item 99 (a hare), located at the beginning of the list, makes its way speedily up to its place at the end. By contrast, 1 (a tortoise) is situated at the end of the list and crawls down to its correct place at the beginning with tortuous slowness.

In bubble sort, once the largest item in the sequence is encountered, it moves to its place at the end in a single pass. Larger items near the beginning of the sequence are the hares and are quickly swapped up to the end in this way. Smaller items near the end of the sequence are the tortoises, which trudge down to their places at the beginning only after all n − 1 iterations have occurred.

As with any algorithm, the actual input data that bubble sort is working on may seriously affect its performance. A worst-case scenario for bubble sort is that in which the input list it is given is already sorted, but in reverse order. In such a reverse-sorted list, tortoises will predominate.

Can anything be done about this? Many have tried, and the bubble sort algorithm has been subjected to many tweaks over the years; but analysis shows that the bubble sort and its minor improvements are inferior to both the **insertion** and the **selection sorts**; and in fact the bubblesort has hardly anything to recommend it except its catchy name.

```
In [ ]:
``````
#### Bubble Sort
```