September 30, 2016: sync spec representation
September {23, 26}, 2016: refactoring toward class-based code
September 22, 2016: Quicksort theory, average cases
In [1]:
%run "../src/start_session.py"
%run "../src/recurrences.py"
The Quicksort algorithm depends on the distribution of the keys in the input vector. For what follow we assume to have a probability space $\Omega = D_n$, where $D_n$ is the set of permutation of length $n$ without repetition over $\{1,\ldots,n\}$. We focus on the simpler variant where the pivot is chosen as the right-most key. In the following table, we study the behavior of an application to the vector $\left ( 20, 25, 7, 3, 30, 8, 41, 18\right)$. $$ \begin{array}{cccccccccc} \hline 20 & 25 & 7 & 3 & 30 & 8 & 41 & 18 & & \\ \uparrow i & & & & & \uparrow j & & \uparrow pivot & \rightarrow & \{20, 41, 8\} \\ \hline 8 & 25 & 7 & 3 & 30 & 20 & 41 & 18 & & \\ & \uparrow i & & \uparrow j & & & & \uparrow pivot & \rightarrow & \{25, 30, 3\} \\ \hline 8 & 3 & 7 & 25 & 30 & 20 & 41 & 18 & & \\ & & \uparrow j & \uparrow i & & & & \uparrow pivot & \rightarrow & \{7, 25, 7\} \\ \hline 8 & 3 & 7 & 18 & 30 & 20 & 41 & 25 & & \\ & & & \uparrow pivot & & & & & \rightarrow & recursion \\ \hline \end{array} $$
Observe that in order to move the $pivot$ element in its final position, it is necessary for two keys ($7, 25$) to be checked twice against the $pivot$, moreover when the second of those checks happens, indexes $i$ and $j$ overlapped at some time such that $j < i$ eventually holds.
Hence, given a vector of length $n$, the number of checks performed before recurring on left and right partitions is $(n-1) + 2$, where $n-1$ appears because the $pivot$ element isn't indexed neither with $i$ nor with $j$. We analyze the number of performed checks by cases:
We explain the average case in the following dedicated section.
To study this case we have to consider all elements of $\Omega$ (recall that $w \in \Omega \rightarrow (w[i]\in \{1,\ldots,n\}) \wedge (\forall i\not =j: w[i]\not=w[j]) \wedge length(w) = n$). Let $j$ be the $pivot$ element, hence a generic $w$ will have this structure: $ w = (C_{j-1} \quad C_{n-j} \quad j)$, where $C_k$ is a vector of length $k$. Assume that the random variable $X =$ "$ j \in \lbrace 1,\ldots,n \rbrace \text{ is the $pivot$ element in } w$" is uniformly distributed, formally: $$ \mathbb{P}\left(w\in\Omega: w[n]=j \right) = \frac{(n-1)!}{n!} = \frac{1}{n} $$
Our goal here is to build a function $C(n)$ which counts the average number of checks during an execution of the algorithm given an input vector of length $n$. In order to do that observe that every key $j \in \{1,\ldots,n\}$ can be the $pivot$, so the following holds: $$ C(n) = (n+1) + \frac{1}{n}\sum_{j=1}^{n}{C(j-1) + C(n-j)} $$
Observing the sum when $j$ runs: $$ \begin{split} j=1 &\rightarrow C(0) + C(n-1) \\ j=2 &\rightarrow C(1) + C(n-2) \\ \ldots& \\ j=n-1 &\rightarrow C(n-2) + C(1) \\ j=n &\rightarrow C(n-1) + C(0) \\ \end{split} $$
Hence we can rewrite and manipulate: $$ \begin{split} C(n) &= (n+1) + \frac{2}{n}\sum_{j=0}^{n-1}{C(j)}\\ nC(n) &= n(n+1) + 2\sum_{j=0}^{n-1}{C(j)} \end{split} $$
Subtract the previous $(n-1)$ term to both members: $$ \begin{split} nC(n) -(n-1)C(n-1) &= n(n+1) + 2\sum_{j=0}^{n-1}{C(j)} -\left((n-1)((n-1)+1) + 2\sum_{j=0}^{(n-1)-1}{C(j)}\right) \\ % nC(n) -(n-1)C(n-1) &= n(n+1) + 2\sum_{j=0}^{n-1}{C(j)} -n(n-1) - 2\sum_{j=0}^{n-2}{C(j)} \\ &= n(n+1-(n-1)) + 2C(n-1) \\ &= 2(n + C(n-1)) \\ \end{split} $$
Getting $nC(n) = 2n + (n+1)C(n-1)$, divide both member by $n(n+1)$: $$ \begin{split} \frac{C(n)}{n+1} = \frac{2}{n+1} + \frac{C(n-1)}{n} \end{split} $$
Matching $A(n) = b(n) + A(n-1)$, where $A(n) = \frac{C(n)}{n+1} $ and $b(n) = \frac{2}{n+1} $. Unfolding $A(n-1)$ and fixing $C(0) = 0$: $$ \begin{split} \frac{C(n)}{n+1} &= \frac{2}{n+1} + \frac{C(n-1)}{n} = \frac{2}{n+1} + \frac{2}{n} + \frac{C(n-2)}{n-1}\\ &= \frac{2}{n+1} + \frac{2}{n} + \ldots + \frac{2}{3} + \frac{2}{2} + \frac{C(0)}{1}\\ &= \frac{2}{n+1} + \frac{2}{n} + \ldots + \frac{2}{3} + 1\\ &= 2\left(\frac{1}{n+1} + \frac{1}{n} + \ldots + \frac{1}{3}\right) + 1\\ &= 2\left(\frac{1}{n+1} + \frac{1}{n} + \ldots + \frac{1}{3}\right) +2\frac{1}{2} + 2 + 1 -2\frac{1}{2} - 2\\ &= 2\left(\frac{1}{n+1} + \frac{1}{n} + \ldots + \frac{1}{3} + \frac{1}{2} + 1\right) -2 \\ &= 2(H_{n+1}-1) \end{split} $$
where we recognized the $(n+1)$-th harmonic number $H_{n}=\left(\frac{1}{n+1} + \frac{1}{n} + \ldots + \frac{1}{3}+ \frac{1}{2}+ 1\right)$ to rewrite: $C(n) = 2(n+1)(H_{n+1}-1)$. Now recall the asymptotic approximation $H_n \sim ln(n) + \gamma$, so $C(n) \in O(n\log n)$.
From a practical point of view when a sorting problem is approached with the Quicksort algorithm, to avoid the worst case $O(n^2)$ number of checks, it is sufficient to do one of the following actions before starting the sorting process:
Each one of these two tricks requires linear time in the dimension of the input vector and allows to work with $O(n\log n)$ number of checks, on average.
In [2]:
c = IndexedBase('c')
checks_recurrence_spec=recurrence_spec(recurrence_eq=Eq(c[n]/(n+1), 2/(n+1) + c[n-1]/n),
recurrence_symbol=c,
variables=[n])
checks_recurrence_spec
Out[2]:
Function do_unfolding_steps allow us to perform unfolding or unrolling on occurrences of the inductively defined symbol. Doing $n$ steps of unfolding is the same to say to use the main recurrence defined above as a rewriting rule for occurrences, on the rhs, that pattern match with the one on the lhs.
In [3]:
unfolded = checks_recurrence_spec.unfold(depth=4)
In [4]:
unfolded
Out[4]:
In [5]:
unfolded.factor(c[n-5])
Out[5]:
In [7]:
num = (10*n**5 - 80*n**4 + 190*n**3 - 100*n**2 -92*n +48)
den = expand(n*(n-4)*(n-3)*(n-2)*(n-1))
one = simplify(num/den)
one
Out[7]:
In [58]:
num = n**5 -5*n**4 + 5*n**3 + 5*n**2 -6*n
another = simplify(num/den)
another
Out[58]:
In [60]:
Eq(unfolded.recurrence_eq.lhs, one + another*c[n-5])
Out[60]:
If desired, we can base the previous equation such that a new equation is produced containing the very first term of the sequence, in our case $c_{0}$.
In [6]:
unfolded.instantiate(strategy=raw(substitutions={n:8}))
Out[6]:
Using the previous idea, finding a value for $n$ that causes the very first item to appear, then we can substitute in the entire specification.
In [7]:
instantiated_spec = unfolded.instantiate(strategy=based(arity=unary_indexed()))
instantiated_spec
Out[7]:
If we attach a value to $c_{0}$ and we put it in the term cache, then it is possible to fully instantiate the spec.
In [8]:
instantiated_spec.subsume(additional_terms={c[0]:Integer(0)})
Out[8]:
Moreover, if we want to skip all little steps and perform a batch study for a variable number of steps, then we provide an higher-order operator ipython_latex, which produces a nice representation for a set of unfoldings
In [9]:
ipython_latex_description(checks_recurrence_spec, depths=range(10), arity=unary_indexed())
Out[9]:
Our goal here is to build a function $S(n)$ which counts the average number of swaps during an execution of the algorithm given an input vector of length $n$.
The recurrence for the average number of swaps satisfies the following equation: $$ S(n) = \frac{n-2}{6} + \frac{1}{n} \sum_{j=1}^{n}{S(j-1) + S(n-j)} $$ We develop the proof in two stages, first studying equation (1): $$ \mathbb{E} \left[K_j \right] = \frac{(j-1)(n-j)}{n-1} $$ where $K_j$ is a random variable that depends on $j$, then equation (2): $$ \frac{n-2}{6} = \frac{1}{n}\sum_{j=1}^{n}{ \mathbb{E} \left[K_j \right] } $$ In what follow, assume to work over a probability space $\Omega$ as defined in the previous section, distributed uniformly.
Proof of equation (1)
Let $j \in \left \lbrace 1,\ldots,n \right\rbrace $ be the $pivot$ element,
and let $K_j: \Omega \rightarrow \mathbb{R}$ be a random variable
such that $K_j = s \mathbb{1}_{\lbrace w[n]=j \rbrace}$, where $s$ is
the number of swaps performed by the partitioning phase given
an input vector $w$ of length $n$. To better understand, $K_j$ satisfies
the following implications:
$$
\begin{split}
w[n] = j &\rightarrow K_j(w) = s \\
w[n] \not= j &\rightarrow K_j(w) = 0 \\
\end{split}
$$
The probability to have $k$ swaps when $j$ is the $pivot$ element is:
$$
\mathbb{P}\left(K_j = k \right) = \frac{{{n-j}\choose{k}}
{{j-1}\choose{k}} (n-j)! (j-1)! }{(n-1)!}
$$
We can justify the above formula in the following steps:
Now we study the mean of the variable $K_j$: $$ \mathbb{E} \left[ K_j \right] = \sum_{k \geq 0}{k \mathbb{P}\left( K_j = k \right) } $$ Using ${{n}\choose{m}} = \frac{n!}{m!(n-m)!} $, we can rewrite: $$ \mathbb{P}\left(K_j = k \right) = {{n-j}\choose{k}} {{j-1}\choose{k}} \frac{(n-j)! (j-1)! }{(n-1)!} = {{n-j}\choose{k}} {{j-1}\choose{k}} {{n-1}\choose{j-1}}^{-1} $$ We put the previous rewrite of $\mathbb{P}\left(K_j = k \right)$ into the definition of $\mathbb{E} \left[ K_j \right]$: $$ \mathbb{E} \left[ K_j \right] = \sum_{k \geq 0}{k \mathbb{P}\left( K_j = k \right) } = \sum_{k \geq 0}{k \frac{{{n-j}\choose{k}} {{j-1}\choose{k}}}{{{n-1}\choose{j-1}}}} $$
Using the following rewrite for ${{j-1}\choose{k}}$: $$ {{j-1}\choose{k}} = \frac{(j-1)!}{k!(j-1-k)!} = \frac{(j-1)}{k} \frac{(j-2)!}{(k-1)!(j-1-k)!} = \frac{(j-1)}{k}{{j-2}\choose{k-1}} $$
and ${{n}\choose{m}} = {{n}\choose{n-m}}$ implies ${{j-2}\choose{k-1}} = {{j-2}\choose{j-2 -(k-1)}} = {{j-2}\choose{j -k-1}}$, then: $$ \begin{split} \mathbb{E} \left[ K_j \right] &= \sum_{k \geq 0}{k \mathbb{P}\left( K_j = k \right) } = \frac{1}{{{n-1}\choose{j-1}}} \sum_{k \geq 0}{k {{n-j}\choose{k}} {{j-1}\choose{k}}} \\ &= \frac{1}{{{n-1}\choose{j-1}}} \sum_{k \geq 0}{k {{n-j}\choose{k}} \frac{j-1}{k}{{j-2}\choose{j-k-1}}} = \frac{j-1}{{{n-1}\choose{j-1}}} \sum_{k \geq 0}{ {{n-j}\choose{k}} {{j-2}\choose{j-k-1}}} \end{split} $$
Now we recognize the Vandermonde result: $$ \sum_{k \geq 0}{{{r}\choose{k}} {{s}\choose{n-k}} } = {{r + s}\choose{n}} $$ which can be proved directly because the ipergeometric distribution has exactly the same structure, and being a distribution, it sum up to 1. So we use this result applying it to $\sum_{k \geq 0}{ {{n-j}\choose{k}} {{j-2}\choose{j-k-1}}} = {{n-2}\choose{j-1}} $, obtaining: $$ \begin{split} \mathbb{E} \left[ K_j \right] &= \frac{j-1}{{{n-1}\choose{j-1}}} {{n-2}\choose{j-1}} = \frac{(j-1)(n-2)!(j-1)!(n-j)!}{(n-1)!(j-1)!(n-j-1)!}=\\ &=\frac{(j-1)(n-2)!(j-1)!(n-j)(n-j-1)!} {(n-1)(n-2)!(j-1)!(n-j-1)!}= \frac{(j-1)(n-j)}{n-1} \end{split} $$ $\blacksquare$
Proof of equation (2)
Let us start with : $$ \begin{split} \frac{1}{n}\sum_{j=1}^{n}{\mathbb{E} \left[K_j \right] } &= \frac{1}{n} \sum_{j=1}^{n}{\frac{(j-1)(n-j)}{n-1}} = \frac{1}{n(n-1)} \sum_{j=1}^{n}{(j-1)(n-j)}=\\ &=\frac{1}{n(n-1)} \sum_{j=1}^{n}{(j(1+n)-j^2-n)} \\ &=\frac{1}{n(n-1)} \left( (n+1)\sum_{j=1}^{n}{j} - \sum_{j=1}^{n}{j^2} -n \sum_{j=1}^{n}{1} \right)\\ &=\frac{1}{n(n-1)}\left( \frac{n(n+1)^2}{2} - \frac{n(n+1)(2n+1)}{6} - n^2 \right) = \frac{n-2}{6} \end{split} $$ as the following check confirm last simplification, by factoring the lhs:
In [8]:
n = symbols('n')
denom, first, second, third = n*(n-1), (n*(n+1)**2)/2, (n*(n+1)*(2*n+1))/6, n**2
expr_to_prove = (first-second-third)/denom
Eq(expr_to_prove, factor(expr_to_prove))
Out[8]:
$\blacksquare$
We are now ready to solve the main recurrence using the same strategy for the average number of checks, fixing $S(0) = S(1) = S(2) = 0$: $$ \begin{split} nS(n) - (n-1)S(n-1) &= \frac{2n-3}{6} + 2S(n-1)\\ \frac{S(n)}{n+1} &= \frac{S(n-1)}{n} + \frac{2n -3}{6n(n+1)} = % \frac{S(2)}{3} + \sum_{k=3}^{n}{ \frac{2k-3}{6k(k+1)} } \end{split} $$ Decomposing the general term of the summation $ \frac{2k-3}{6k(k+1)}$ in partial fractions yield: $$- \frac{1}{6 x - 6} + \frac{1}{2 x}$$ and integrating over $x$ yield: $$\frac{1}{2} \log{\left (x \right )} - \frac{1}{6} \log{\left (x - 1 \right )}$$ hence $S(n) \in O(n \log n)$.
In [10]:
s = IndexedBase('s')
swaps_recurrence_spec=recurrence_spec(recurrence_eq=Eq(s[n]/(n+1),s[n-1]/n + (2*n-3)/(6*n*(n+1))),
recurrence_symbol=s,
variables=[n])
swaps_recurrence_spec
Out[10]:
In [11]:
unfolded = swaps_recurrence_spec.unfold(depth=4)
In [12]:
unfolded
Out[12]:
In [13]:
instantiated = unfolded.instantiate(strategy=based(arity=unary_indexed()))
instantiated
Out[13]:
In [14]:
instantiated.subsume(additional_terms={})
Out[14]:
In [15]:
ipython_latex_description(swaps_recurrence_spec, depths=range(10), arity=unary_indexed())
Out[15]:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.