In [3]:
%pylab inline
import pandas as pd
import numpy as np
from __future__ import division
import itertools
import matplotlib.pyplot as plt
import seaborn as sns
import logging
logger = logging.getLogger()
For optimization problems, two common strategies:
divide-and-conquer:
partition the problem into disjoint subproblems, solve them recursively, and then combine their solutions.
dynamic programming:
applies when the subproblems overlap, so that a divide-and-conquer algorithm would solve the same subproblem repeatedly.
We develop a dynamic-programming algorithm in four steps:
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution, typically bottom-up.
4. Construct an optimal solution from the computed information.
Given:
a rod of length $n$ and a table of prices $p_i$ for $i = 1, 2, \dotsc, n$.
Problem:
determine the maximum revenue $r_n$ obtainable by cutting up the rod and selling the pieces.
Solution:
$$r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i}), \qquad r_0 = 0$$
Implementation:
In [4]:
price = np.array([1, 5, 8, 9, 10, 17, 17, 20, 24, 30])
print('length i: {}'.format(list(range(1, len(price)+1))))
print('price p_i: {}'.format(price))

def bottom_up_cut_rod(p_, n):
    """Solve the cut-rod problem by bottom-up dynamic programming."""
    assert len(p_) >= n, 'n should be at most the max length {}'.format(len(p_))
    p = np.insert(p_, 0, 0)       # set price_0 = 0, so p[i] is the price of a piece of length i
    r = np.zeros(n+1)             # r[k] = max revenue for a rod of length k
    for k in range(1, n+1):
        q = -np.inf
        for m in range(1, k+1):   # first piece has length m
            q = max(q, p[m] + r[k-m])
        r[k] = q
    return r[-1]

print('max price: {}'.format([bottom_up_cut_rod(price, n) for n in range(1, len(price)+1)]))
Given a chain $<A_1, A_2, \cdots, A_n>$ of $n$ matrices, where for $i = 1, 2, \cdots, n$, matrix $A_i$ has dimension $p_{i-1} \times p_i$.
Problem: fully parenthesize the product $A_1 A_2 \cdots A_n$ in a way that minimizes the number of scalar multiplications.
solutions:
Exhaustively checking all possible parenthesizations takes $\Omega(2^n)$ time, since the number of parenthesizations $P(n)$ satisfies:
\begin{equation}
P(n) = \begin{cases}
1 & \quad \text{if } n = 1 \\
\sum_{k=1}^{n-1} P(k) P(n-k) & \quad \text{if } n \geq 2 \\
\end{cases}
\end{equation}
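As a quick sanity check of this recurrence, a minimal sketch (the function name num_parens is chosen here for illustration; it assumes np from the imports above):

def num_parens(n):
    """Count the full parenthesizations of an n-matrix chain via the recurrence above."""
    P = np.zeros(n+1, dtype=np.int64)
    P[1] = 1
    for i in range(2, n+1):
        P[i] = sum(P[k] * P[i-k] for k in range(1, i))  # split into first k and last i-k matrices
    return P[n]

print([num_parens(n) for n in range(1, 11)])
# [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862] -- the Catalan numbers C_{n-1}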
Applying dynamic programming: $O(n^3)$ time.
define $A_{i \dotso j} = A_i A_{i+1} \dotsm A_j$
The structure of an optimal parenthesization.
Suppose $A_{i \dotso k} A_{k+1 \dotso j}$ is an optimal solution of $<A_i, \dotsm, A_j>$, where $i \leq k < j$.
Then $A_{i \dotso k}$ must be an optimal solution of $<A_i, \dotsm, A_k>$, and likewise $A_{k+1 \dotso j}$ of $<A_{k+1}, \dotsm, A_j>$.
Proof (cut and paste):
if there were a cheaper parenthesization $\hat{A}_{i \dotso k}$ with $Cost(\hat{A}_{i \dotso k}) < Cost(A_{i \dotso k})$,
then $\hat{A}_{i \dotso k} A_{k+1 \dotso j}$ would cost less than $A_{i \dotso k} A_{k+1 \dotso j}$, contradicting its optimality.
A recursive solution.
Let $m[i,j]$ be the minimum number of scalar multiplications needed to compute the matrix $A_{i \dotso j}$.
\begin{equation}
m[i,j] = \begin{cases}
0 & \quad \text{if } i = j \\
\min_{i \leq k < j} \{m[i,k] + m[k+1,j] + p_{i-1} p_k p_j\} & \quad \text{if } i < j
\end{cases}
\end{equation}
Computing the optimal costs.
$m[i,j]$ depends on the smaller subproblems $m[i,k]$ and $m[k+1,j]$ for all $i \leq k < j$.
Hence, the algorithm should fill in the table $m$ in a manner that corresponds to solving the parenthesization problem on matrix chains of increasing length.
Constructing an optimal solution.
Construct a table $s$ whose rows are indexed by $1, \dotsm, n-1$ and whose columns are indexed by $2, \dotsm, n$.
Each entry $s[i,j]$ records the optimal split point $k$ for $A_{i \dotso j}$, as used in the sketch below.
In [5]:
#todo
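A minimal bottom-up sketch of the recurrence for $m[i,j]$, filling the tables over chains of increasing length (the names matrix_chain_order and print_optimal_parens are chosen here for illustration; p is the dimension array, so $A_i$ has dimension $p[i-1] \times p[i]$):

def matrix_chain_order(p):
    """Fill table m (minimum scalar multiplications) and table s (optimal split)
    bottom-up, over chains of increasing length."""
    n = len(p) - 1                                # number of matrices
    m = np.zeros((n+1, n+1))
    s = np.zeros((n+1, n+1), dtype=int)
    for length in range(2, n+1):                  # chain length
        for i in range(1, n-length+2):
            j = i + length - 1
            m[i, j] = np.inf
            for k in range(i, j):                 # try every split point
                q = m[i, k] + m[k+1, j] + p[i-1]*p[k]*p[j]
                if q < m[i, j]:
                    m[i, j], s[i, j] = q, k
    return m, s

def print_optimal_parens(s, i, j):
    """Recover the parenthesization recorded in table s."""
    if i == j:
        return 'A{}'.format(i)
    k = s[i, j]
    return '({} {})'.format(print_optimal_parens(s, i, k),
                            print_optimal_parens(s, k+1, j))

p = [30, 35, 15, 5, 10, 20, 25]                   # classic six-matrix example
m, s = matrix_chain_order(p)
print(m[1, 6])                                    # 15125.0
print(print_optimal_parens(s, 1, 6))              # ((A1 (A2 A3)) ((A4 A5) A6))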
Two keys: optimal substructure and overlapping subproblems.
A solution to the problem consists of making a choice.
Suppose that, for a given problem, you are given the choice that leads to an optimal solution.
Given the choice, you determine which subproblems ensue and how to best characterize the resulting space of subproblems.
By using "cut-and-paste" technique, you show that the solutions to the subproblems used within an optimal solution to the problem must themselves be optimal.
To characterize the space of subproblems, a good rule of thumb says to try to keep the space as simple as possible and then expand it as necessary.
Optimal substructure varies across problem domains in two ways:
how many subproblems an optimal solution to the original problem uses.
how many choices we have in determining which subproblem(s) to use in an optimal solution.
the running time of a dynamic-programming algorithm depends on the product of two factors:
the number of subproblems overall.
how many choices we look at for each subproblem.
Dynamic programming often uses optimal substructure in a bottom-up fashion.
The cost of the problem solution is usually the subproblem costs plus a cost that is directly attributable to the choice itself.
Be careful NOT to assume that the optimal substructure applies when it does not.
e.g., the unweighted shortest-path problem has optimal substructure, while the unweighted longest-simple-path problem does not.
subproblems are independent:
The solution to one subproblem does not affect the solution to another subproblem of the same problem.
in other words, the subproblems do not share resources.
To reconstruct an optimal solution, store the choice we made in each subproblem in a table.
Memoization: maintain an entry in a table for the solution to each subproblem, filled in as each subproblem is first solved.
In general practice, we prefer bottom-up to top-down memoized algorithms. But if some subproblems need not be solved at all, the memoized solution has the advantage of solving only the subproblems that are definitely required.
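For example, a memoized top-down version of the cut-rod solver (a minimal sketch; the name memoized_cut_rod is chosen here for illustration, reusing the price array defined earlier):

def memoized_cut_rod(p_, n):
    """Top-down cut-rod with memoization: each subproblem is solved on demand
    and cached, so subproblems that never arise are never computed."""
    p = np.insert(p_, 0, 0)   # set price_0 = 0
    r = {}                    # memo table: rod length -> max revenue
    def solve(k):
        if k not in r:
            r[k] = 0 if k == 0 else max(p[m] + solve(k-m) for m in range(1, k+1))
        return r[k]
    return solve(n)

print(memoized_cut_rod(price, 10))  # 30, matching bottom_up_cut_rod(price, 10)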
Steps for the longest-common-subsequence (LCS) problem:
Characterizing the optimal substructure:
Let $X = <x_1, x_2, \dotsc, x_m>$ and $Y = <y_1, y_2, \dotsc, y_n>$ be sequences, and let $Z = <z_1, z_2, \dotsc, z_k>$ be any LCS of $X$ and $Y$.
If $x_m = y_n$, then $z_k = x_m = y_n$, and $Z_{k-1}$ is an LCS of $X_{m-1}$ and $Y_{n-1}$.
If $x_m \neq y_n$, then $z_k \neq x_m$ implies that $Z$ is an LCS of $X_{m-1}$ and $Y$.
If $x_m \neq y_n$, then $z_k \neq y_n$ implies that $Z$ is an LCS of $X$ and $Y_{n-1}$.
A recursive solution.
\begin{equation}
c[i,j] = \begin{cases}
0 \quad & \text{if } i = 0 \text{ or } j = 0 \\
c[i-1,j-1] + 1 \quad & \text{if } i, j > 0 \text{ and } x_i = y_j \\
\max(c[i,j-1], c[i-1,j]) \quad & \text{if } i, j > 0 \text{ and } x_i \neq y_j
\end{cases}
\end{equation}
Computing the length of an LCS: see the code sketch below.
Constructing an LCS.
$b[i,j]$ points to the table entry corresponding to the optimal subproblem solution chosen when computing $c[i,j]$.
Improvements: eliminate the $b$ table and reconstruct an LCS from $c$ directly;
if only the length is needed, keep only two rows of table $c$ at a time.
In [6]:
#todo
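A minimal sketch following the recurrence for $c[i,j]$ (the names lcs_length and backtrack are chosen here for illustration; instead of a separate $b$ table, it reconstructs an LCS from $c$ directly, per the first improvement above):

def lcs_length(X, Y):
    """Fill table c bottom-up: c[i, j] = length of an LCS of X[:i] and Y[:j]."""
    m, n = len(X), len(Y)
    c = np.zeros((m+1, n+1), dtype=int)   # row 0 and column 0 stay 0
    for i in range(1, m+1):
        for j in range(1, n+1):
            if X[i-1] == Y[j-1]:
                c[i, j] = c[i-1, j-1] + 1
            else:
                c[i, j] = max(c[i, j-1], c[i-1, j])
    return c

def backtrack(c, X, Y, i, j):
    """Reconstruct one LCS from table c alone (no b table needed)."""
    if i == 0 or j == 0:
        return ''
    if X[i-1] == Y[j-1]:
        return backtrack(c, X, Y, i-1, j-1) + X[i-1]
    if c[i, j-1] >= c[i-1, j]:
        return backtrack(c, X, Y, i, j-1)
    return backtrack(c, X, Y, i-1, j)

X, Y = 'ABCBDAB', 'BDCABA'
c = lcs_length(X, Y)
print('{} {}'.format(c[-1, -1], backtrack(c, X, Y, len(X), len(Y))))  # 4 BDAB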
Given:
a sequence $K = <k_1, k_2, \dotsc, k_n>$ of $n$ distinct keys in sorted order. For each $k_i$, we have a probability $p_i$ that a search will be for $k_i$. We also have $n+1$ "dummy keys" $<d_0, d_1, \dotsc, d_n>$ representing values not in $K$, with a probability $q_i$ for each $d_i$.
We want to build a binary search tree $T$ for $K$, and define the expected cost of a search in $T$ as:
\begin{align}
E[cost] &= \sum_{i=1}^n (depth_T(k_i) + 1) p_i + \sum_{i=0}^n (depth_T(d_i) + 1) q_i \\
&= 1 + \sum_{i=1}^n depth_T(k_i) p_i + \sum_{i=0}^n depth_T(d_i) q_i
\end{align}
Problem:
We wish to build a tree $T$ whose expected search cost is smallest.
Optimal substructure
If an optimal binary search tree $T$ has a subtree $T'$ containing keys $k_i, \dotsc, k_j$, then this subtree $T'$ must be optimal as well for the subproblem with keys $k_i, \dotsc, k_j$ and dummy keys $d_{i-1}, \dotsc, d_j$.
A recursive solution
define: $w[i,j] = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l$
\begin{equation}
e[i,j] = \begin{cases}
q_{i-1} \quad & \text{if } j = i - 1 \\
\min_{i \leq r \leq j} \{e[i,r-1] + e[r+1,j] + w[i,j]\} \quad & \text{if } i \leq j
\end{cases}
\end{equation}
Computing the expected search cost: to avoid recomputing $w[i,j]$ from scratch, update it incrementally via
$$w[i,j] = w[i,j-1] + p_j + q_j$$
In [7]:
#todo
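A minimal sketch of the recurrence for $e[i,j]$ (the name optimal_bst is chosen here for illustration; p[1..n] and q[0..n] follow the definitions above, with p[0] unused):

def optimal_bst(p, q, n):
    """Fill tables e (expected cost), w (probability sums), and root
    (optimal root r) for subtrees of increasing size."""
    e = np.zeros((n+2, n+1))
    w = np.zeros((n+2, n+1))
    root = np.zeros((n+1, n+1), dtype=int)
    for i in range(1, n+2):
        e[i, i-1] = w[i, i-1] = q[i-1]    # empty subtree: just the dummy key d_{i-1}
    for length in range(1, n+1):          # subtree size
        for i in range(1, n-length+2):
            j = i + length - 1
            e[i, j] = np.inf
            w[i, j] = w[i, j-1] + p[j] + q[j]   # the incremental update above
            for r in range(i, j+1):             # try every root
                t = e[i, r-1] + e[r+1, j] + w[i, j]
                if t < e[i, j]:
                    e[i, j], root[i, j] = t, r
    return e, w, root

p = [0, 0.15, 0.10, 0.05, 0.10, 0.20]     # classic five-key example
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
e, w, root = optimal_bst(p, q, 5)
print(e[1, 5])                            # 2.75 (expected search cost)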