Problem 0

You will get 0 for the whole pset unless you finish this problem :)

  • Read the homework rules carefully. If you do not follow them, you will likely be penalized.

You were invited to Piazza, where you can find the announcement about the course project. In case you did not get an invitation from Piazza to your @skoltech.ru email, ask a TA to set you up there.

  • Register in Piazza with your @skoltech.ru email.
  • Write a private post to the TAs in Piazza describing your background in mathematics (e.g., courses you took and enjoyed).

In [ ]:

Problem 1 (Python demo)

40 pts

Data preparation

  • First of all, download the $\verb|.wav|$ file with a StarCraft sound from here. Load it in Python and play it using the functions from the demo.

Our next goal is to process this signal by multiplying it by a special type of matrix (a convolution operation) that smooths the signal.

  • Before processing this file, let us estimate what size of matrix we can afford. Let $N$ be the size of the signal. Estimate analytically how much memory (in megabytes) is required to store a dense square matrix of size $N\times N$, print this number, and make sure such a matrix fits in your main memory. Cut the signal so that you do not hit swap (overflow of the main memory). Note: cut the signal by taking every $p$-th element of the array: signal[::p].
  • Write a function gen_toeplitz($N,\alpha$) that outputs the matrix $T$: $$T_{ij} = \sqrt{\frac{\alpha}{\pi}}e^{-\alpha (i-j)^2}, \quad i,j=1,\dots,N$$ as a numpy array. Avoid using loops or lists! The function scipy.linalg.toeplitz might be helpful for this task. Note: matrices that depend only on the difference of indices, $T_{ij} \equiv T_{i-j}$, are called Toeplitz. Toeplitz matrix-by-vector multiplication is a convolution, since it can be written as $y_i = \sum_{j=1}^N T_{i-j} x_j$. Convolutions can be computed faster than in $\mathcal{O}(N^2)$ operations using the fast Fourier transform (covered later in the course; no need to implement it here). A minimal sketch of these two steps is given after this list.
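
If it helps to fix the shape of the code, here is a minimal sketch of the two bullets above. It assumes the file is read with scipy.io.wavfile (the demo may provide its own loader); the file name and the decimation step p are placeholders:

    import numpy as np
    from scipy.io import wavfile
    from scipy.linalg import toeplitz

    # placeholder file name and decimation step; adjust to your setup
    rate, s = wavfile.read('starcraft_sound.wav')
    p = 20
    signal = s[::p].astype(np.float64)          # keep every p-th sample
    N = signal.shape[0]
    # a dense float64 matrix of size N x N takes 8 * N**2 bytes
    print('Memory for a dense %d x %d matrix: %.1f MB' % (N, N, 8.0 * N**2 / 2**20))

    def gen_toeplitz(N, alpha):
        # T_ij = sqrt(alpha/pi) * exp(-alpha*(i-j)**2) depends only on |i-j|,
        # so the first column defines the whole (symmetric) Toeplitz matrix
        c = np.sqrt(alpha / np.pi) * np.exp(-alpha * np.arange(N)**2)
        return toeplitz(c)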

Convolution

  • Multiply the matrix $T$ by your signal (for matvec operations use the $\verb|numpy|$ package). Plot the first $100$ points of the result and the first $100$ points of your signal on the same figure. Do the same plots for $\alpha = \frac{1}{5}$ and $\alpha = \frac{1}{100}$ using plt.subplots in matplotlib. Each subplot should contain the first $100$ points of the initial and convolved signals for some $\alpha$. Make sure that the results look like a smoothed initial signal.
  • Play the resulting signal. In order to do so, cast your array to $\verb|int16|$ before playing:
    your_array = your_array.astype(np.int16)
    
    To play it correctly you should also scale the frequency, which is one of the inputs of wavPlayer. Note that a signal with too small an amplitude cannot be played. A sketch of this step is given after this list.
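
A minimal sketch of the convolution and playback step (gen_toeplitz, signal, rate and p come from the previous part; wavPlayer stands for whatever playback helper the demo provides):

    import numpy as np
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    for ax, alpha in zip(axes, [1.0 / 5, 1.0 / 100]):
        T = gen_toeplitz(len(signal), alpha)
        convolved = T.dot(signal)            # convolution as a matrix-by-vector product
        ax.plot(signal[:100], label='initial')
        ax.plot(convolved[:100], label='convolved')
        ax.set_title(r'$\alpha = %g$' % alpha)
        ax.legend()
    plt.show()

    # cast to int16 before playing and rescale the sampling frequency by p
    # wavPlayer(convolved.astype(np.int16), rate // p)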

Deconvolution

Given the convolved signal $y$ and the initial signal $x$, our goal now is to recover $x$ by solving the system $$ y = Tx. $$ To do so we will run the iterative process $$ x_{k+1} = x_{k} - \tau_k (Tx_k - y), \quad k=1,2,\dots $$ starting from the zero vector $x_0$. There are different ways to define the parameters $\tau_k$; different choices lead to different methods (e.g., Richardson iteration, Chebyshev iteration, etc.). This topic will be covered in detail later in the course.

To get some intuition why this process converges to the solution of $Tx=y$, note that if $x_k$ converges to some limit $x$, then so does $x_{k+1}$, and taking $k\to \infty$ we arrive at $x = x - \tau (Tx - y)$; hence $x$ is a solution of $Tx = y$.

Another important point is that the iterative process requires only matrix-vector products $Tx_k$ at each iteration instead of the whole matrix. In this problem we nevertheless work with the full matrix, but keep in mind that the convolution can be computed efficiently without storing the whole matrix. A minimal sketch of such an iteration with a fixed step follows.
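
To make the scheme concrete, here is a minimal sketch with a fixed step $\tau$; choosing the optimal $\tau_k$ is exactly the task in the bullets below:

    import numpy as np

    def simple_iteration(T, y, num_iter, tau):
        # x_{k+1} = x_k - tau * (T x_k - y), starting from the zero vector
        x = np.zeros_like(y, dtype=np.float64)
        for k in range(num_iter):
            r = T.dot(x) - y        # residual r_k = T x_k - y
            x = x - tau * r
        return x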

  • For each $k$ choose the parameter $\tau_k$ such that the residual $r_{k+1}=Tx_{k+1} - y$ is as small as possible (line search with search direction $r_k$): $$ \|Tx_{k+1} - y\|_2 \to \min_{\tau_k}, $$ where the minimizer is found analytically. The answer to this bullet is the derivation of $\tau_k$; $\tau_k$ should be expressed in terms of the residual $r_k = T x_k - y$.

  • Write a function iterative(N, num_iter, y, $\alpha$) that outputs a numpy array accuracy of relative errors $\big\{\frac{\|x_{k+1} - x\|_2}{\|x\|_2}\big\}$ after num_iter iterations, using $\tau_k$ from the previous task. Set num_iter=1000, x=s[::20] and make a convergence plot for $\alpha = \frac{1}{2}$ and $\alpha = \frac{1}{5}$. Note: the only loop you are allowed to use here is the loop over $k$.

  • Set x=s[::20], num_iter=1000 and $\alpha=\frac{1}{5}$. Explain what happens to the convergence if you add small random noise of amplitude $10^{-3}\max(x)$ to $y$. The answer to this question should be an explanation supported by plots and/or tables.


In [ ]:

Problem 2 (Theoretical tasks)

Unitary invariance

20 pts

  • Prove that the second vector norm $\|\cdot\|_2$ is unitary invariant: $\|Ux\|_2 = \|x\|_2$ holds for any unitary matrix $U$ and any vector $x$.
  • Prove that the spectral matrix norm $\|\cdot\|_2$ is unitary invariant: $\|UAV\|_2 \equiv \|A\|_2$ holds for any unitary matrices $U$ and $V$ and for any matrix $A$.
  • Prove that $\|A\|^2_F = \text{trace}(AA^*)$ and that $\text{trace}(BC) \equiv \text{trace}(CB)$.
  • Prove that the Frobenius norm is unitary invariant. (A quick numerical sanity check of these statements is sketched after this list; it does not replace the proofs.)
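
Before proving these statements you may want to check them numerically. The following sketch draws random unitary factors from a QR decomposition; it is only a sanity check, not a proof:

    import numpy as np

    n = 50
    U, _ = np.linalg.qr(np.random.randn(n, n) + 1j * np.random.randn(n, n))
    V, _ = np.linalg.qr(np.random.randn(n, n) + 1j * np.random.randn(n, n))
    A = np.random.randn(n, n) + 1j * np.random.randn(n, n)
    x = np.random.randn(n) + 1j * np.random.randn(n)

    print(abs(np.linalg.norm(U.dot(x)) - np.linalg.norm(x)))                      # vector 2-norm
    print(abs(np.linalg.norm(U.dot(A).dot(V), 2) - np.linalg.norm(A, 2)))         # spectral norm
    print(abs(np.linalg.norm(U.dot(A).dot(V), 'fro') - np.linalg.norm(A, 'fro'))) # Frobenius norm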

SVD

15 pts

  • Prove that $\|A\|_2 = \sigma_1(A)$ and $\|A\|_F = \sqrt{\sigma_1^2(A) + \dots + \sigma_r^2(A)}$.
  • The matrix norm $\|A\|_* = \sigma_1(A) + \dots + \sigma_r(A)$ is called the nuclear norm and plays an important role in low-rank matrix approximation. Check whether it is unitary invariant.
    Bonus: Show that $\|\cdot\|_*$ is a norm.
  • Find the distance, in the spectral and Frobenius norms, between a nonsingular matrix and the closest singular matrix. (A numerical illustration of the norm/SVD relations above follows.)
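
Again, a purely numerical illustration of the first bullet (not a proof):

    import numpy as np

    A = np.random.randn(40, 60)
    sigma = np.linalg.svd(A, compute_uv=False)      # singular values in decreasing order
    print(abs(np.linalg.norm(A, 2) - sigma[0]))                       # spectral norm vs sigma_1
    print(abs(np.linalg.norm(A, 'fro') - np.sqrt(np.sum(sigma**2))))  # Frobenius norm vs sqrt(sum sigma_i^2)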

In [ ]:

Problem 3 (Strassen algorithm)

15 pts

  • What is the complexity of naive matrix-matrix multiplication? What is the complexity of the Strassen algorithm? Can the complexity of matrix-matrix multiplication be asymptotically smaller than $\mathcal{O}(n^2)$? Why?

It might be a good idea not to carry the recursion in the Strassen algorithm down to the bottom level. Sometimes only several levels of the recursion are used. This helps to reduce the constant in front of $n^3$.

  • Find analytically the constant in front of $n^3$ after $2$ levels of recursion in the Strassen algorithm and compare it with the constant of the naive multiplication. Note: additions and multiplications of numbers are assumed to have the same computational cost. (A standard way to set up the flop count is recalled after this bullet.)
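
For the flop count it may help to recall the standard bookkeeping; this only sets up the recurrence, working out the constant is the task above. Naive multiplication of two $n\times n$ matrices costs $n^3$ multiplications and $n^2(n-1)$ additions, i.e. $2n^3 - n^2$ flops in total, while one step of Strassen (in its original formulation) replaces such a product by $7$ products and $18$ additions of $(n/2)\times(n/2)$ blocks, so the cost $M(n)$ satisfies $$ M(n) = 7\,M\!\left(\frac{n}{2}\right) + 18\left(\frac{n}{2}\right)^2. $$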

In [ ]:

Problem 4 (SVD compression)

10 pts

  • Compute the SVD of the Lena image and plot its singular values.
  • Compress the image by truncating the singular values. Plot the compressed images for $r=5,20,50,512$ ($r$ is the number of remaining singular values). Specify the compression rate in the title of each image. Note: use plt.subplots and arrange the images in a $2\times 2$ array. (A sketch of the truncation step follows.)
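
A minimal sketch of the truncation step, assuming the grayscale image has already been loaded into a 2D numpy array img; the compression-rate formula below (full image size over the size of the rank-$r$ factors) is one common convention:

    import numpy as np
    import matplotlib.pyplot as plt

    U, s, Vt = np.linalg.svd(img, full_matrices=False)

    fig, axes = plt.subplots(2, 2, figsize=(8, 8))
    for ax, r in zip(axes.flat, [5, 20, 50, 512]):
        approx = U[:, :r].dot(np.diag(s[:r])).dot(Vt[:r, :])   # rank-r approximation
        rate = img.size / float(r * (img.shape[0] + img.shape[1] + 1))
        ax.imshow(approx, cmap='gray')
        ax.set_title('r = %d, compression rate %.1f' % (r, rate))
        ax.axis('off')
    plt.show()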

In [ ]:

Problem 5 (Bonus)

  • A norm is called absolute if $\|x\|=\|\,|x|\,\|$ holds for any vector $x$, where $x=(x_1,\dots,x_n)^T$ and $|x| = (|x_1|,\dots, |x_n|)^T$. Give an example of a norm which is not absolute.
  • Write a function HOSVD(n, epsilon) that implements the Higher-Order SVD (HOSVD) algorithm in 3D and returns the core and the 3 HOSVD factors (epsilon is the relative accuracy in the Frobenius norm between the approximated and initial tensors). Give the ranks of the 3D Hilbert tensor $$ a_{ijk} = \frac{1}{i+j+k + 1}, \quad i,j,k = 0,\dots, n-1 $$ computed with $10^{-5}$ accuracy. Details can be found here in Figure 4.3. (A loop-free construction of the Hilbert test tensor is sketched below.)
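
The HOSVD routine itself is left to you; the following sketch only builds the 3D Hilbert test tensor without loops (the helper name is made up):

    import numpy as np

    def hilbert_tensor(n):
        # a_{ijk} = 1 / (i + j + k + 1),  i, j, k = 0, ..., n-1
        i, j, k = np.indices((n, n, n))
        return 1.0 / (i + j + k + 1)

    a = hilbert_tensor(50)
    print(a.shape)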

In [ ]: