Suppose we are given a data signal which consists of several nearly mono-components.
Can we recover the mono-components?
The answer is YES, with an efficient algorithm using EVDs of Hankel matrices.
Mono-component recovery can be successfully applied to audio signals.
The reader should be familiar to elementary concepts about signals, and with linear algebra concepts, particularly EVD and its properties and algorithms.
The reader should be able to decompose given signal into mono-components using EVD methods.
For more details see P. Jain and R. B. Pachori, An iterative approach for decomposition of multi-component non-stationary signals based on eigenvalue decomposition of the Hankel matrix.
Credits: The first Julia implementation was derived in A. M. Bačak, Master's Thesis.
Let $x\in\mathbb{R}^{m}$, denote a signal with $N$ samples.
Assume $x$ is composed of $L$ stationary mono-components:
$$ x=\sum\limits_{k=1}^L x^{(k)}, $$where
$$ x^{(k)}_i=A_k \cos(2\pi f_k i +\theta_k)\quad i=1,2,\ldots,m. $$Here:
$f_k=\displaystyle\frac{F_k}{F}$ is the normalized frequency of $x^{(k)}$,
$F$ is the sampling frequency of $x$ in Hz,
$F_k$ is the sampling frequency of $x^{(k)}$,
$A_k$ is the amplitude of $x^{(k)}$, and
$\theta_k$ is the phase of $x^{(k)}$.
We assume that $F_k< F_{k+1}$ for $k=1,2,\ldots,n-1$, and $F>2F_n$.
A Hankel matrix is a (real) square matrix with constant values along the skew-diagonals. More precisely, let $a\in\mathbb{R}^{2n-1}$. An $n\times n$ matrix $H\equiv H(a)$ for which $H_{ij}=H_{i+1,j-1}=a_{i+j-1}$ is a Hankel matrix.
Let $x$ be a signal with $2n-1$ samples composed of $L$ stationary mono-components.
Let $H$ be an $n\times n$ Hankel matrix corresponding to $x$ and let $H=U\Lambda U^T$ be its EVD (Hankel matrix is symmetric) with $\lambda_1\leq \lambda_2 \leq \cdots \leq \lambda_n$.
Smilarly, let $H_k$ be the $n\times n$ Hankel matrix corresponding to the $k$-th component $x^{(k)}$ and let $H_k=U_k\Lambda_k U_k^T$ be its EVD.
$H=\sum\limits_{k=1}^{L} H_k$.
$H_k=\lambda_k U_{:,k}U_{:,k}^T + \lambda_{n-k+1} U_{:,n-k+1}U_{:,n-k+1}^T$.
In [1]:
using Plots
using SpecialMatrices
In [2]:
# Small Hankel matrix
a=collect(1:11)
Hankel(a)
Out[2]:
In [2]:
# Create the signal
n=160
N=2*n-1
F = 6400
L = 3
A = [3, 2, 1]
Fk= [200, 320, 160]
θ = [pi/2, pi/4, 0]
x = zeros(N)
for k=1:3
for i=1:N
x[i]+=A[k]*cos(2*pi*Fk[k]*i/F+θ[k])
end
end
plot(x,xlabel="Number of samples N", ylabel="Amplitude")
Out[2]:
In [3]:
F/Fk[1]
Out[3]:
In [4]:
# FFT indicates that there are three components
# The plot of abs.(y[1:end]) is symmetric arround midpoint
using FFTW
y=fft(x)
plot(log10.(abs.(y))[1:100])
Out[4]:
In [5]:
x
Out[5]:
In [6]:
# Let us decompose the signal
H=Hankel(x)
Out[6]:
In [7]:
using LinearAlgebra
λ,U=eigen(Matrix(H))
λ
Out[7]:
We see that the three smallest and the three largest eigenvalues come in pairs and define the three mono-components.
The ratios of the moduli of the eigenvalues correspond to the ratios of the amplitudes of the mono-components.
In [10]:
# Form the three matrices
Hcomp=Array{Any}(undef,3)
for k=1:L
Hcomp[k]=λ[k]*U[:,k]*U[:,k]' + λ[end-k+1]*U[:,end-k+1]*U[:,end-k+1]'
end
In [11]:
# Compare the first matrix with the Hankel matrix of the first mono-component
x₁ = zeros(N)
l=1
for i=1:N
x₁[i]+=A[l]*cos(2*pi*Fk[l]*i/F+θ[l])
end
In [12]:
H₁=Hankel(x₁)
eigvals(Matrix(H₁)), norm(Hcomp[1]-H₁)
Out[12]:
In [13]:
# Now we reconstruct the mono-components from the skew-diagonal elements of Hcomp
xcomp=Array{Array{Float64}}(undef,L)
z=Array{Float64}(undef,N)
for k=1:L
z[1:2:N]=diag(Hcomp[k])
z[2:2:N]=diag(Hcomp[k],1)
xcomp[k]=copy(z)
end
In [14]:
xaxis=collect(1:N)
plot([xcomp[1],xcomp[2],xcomp[3]])
Out[14]:
Several outer eigenvalues pairs of Hankel matrices can be computed using Lanczos method. If the multiplication $Hx$ is performed using Fast Fourier Transform, this EVD computation is very fast.
A Toeplitz matrix is a (real) square matrix with constant values along the diagonals. More precisely, let
$$ a=(a_{-(n-1)},a_{-(n-2)},\ldots,a_{-1},a_0,a_1,\ldots,a_{n-1})\in\mathbb{R}^{2n-1}. $$An $n\times n$ matrix $T\equiv T(a)$ for which $T_{ij}=T_{i+1,j+1}=a_{i-j}$ is a Toeplitz matrix.
A circulant matrix is a Toeplitz matrix where each column is rotated one element downwards relative to preceeding column.
More precisely, let $a\in\mathbb{R}^{n}$. An $n\times n$ matrix $C\equiv C(a)=T(a,a_{1:n-1})$ is a Circulant matrix.
A rotation matrix is an identity matrix rotated 90 degrees to the right (or left).
A Fourier matrix is Vandermonde matrix
$$ F_n=V(1,\omega_n,\omega_n^2,\ldots, \omega_n^{n-1}), $$where $\omega_n=exp(2\pi i/n)$ is the $n$-th root of unity (see the notebook Eigenvalue Decomposition - Definitions and Facts).
In [15]:
C=Circulant([1,2,3,4,5])
Out[15]:
In [16]:
TC=Toeplitz([2,3,4,5,1,2,3,4,5])
Out[16]:
In [17]:
T=Toeplitz([1,2,3,4,5,6,7,8,9])
Out[17]:
In [18]:
H₁=Hankel([1,2,3,4,5,6,7,8,9])
Out[18]:
In [19]:
Vandermonde([6,2,3,4,5])
Out[19]:
For more details see G. H. Golub and C. F. Van Loan, Matrix Computations, p. 202, and the references therein
Hankel matrix is the product of a Toeplitz matrix and the rotation matrix.
Circulant matrix is normal and, thus, unitarily diagonalizable, with the eigenvalue decomposition $$ C(a)=U\mathop{\mathrm{diag}}(F_n^* a)U^*, $$ where $U=\displaystyle\frac{1}{\sqrt{n}} F_n$. The product $F_n^* a$ can be computed by the Fast Fourier Transform(FFT).
Given $a,x\in\mathbb{R}^n$, the product $y=C(a)x$ can be computed using FFT as follows: \begin{align*} \tilde x&=F_n^*x\\ \tilde a&=F_n^*a\\ \tilde y&=\tilde x.* \tilde a\\ y&= F_n^{-*} \tilde y. \end{align*}
Toeplitz matrix of order $n$ can be embedded in a circulant matrix of order $2n-1$: if $a\in\mathbb{R}^{2n-1}$, then $$ T(a)=[C([a_{n+1:2n-1};a_{1:n}])]_{1:n,1:n}. $$
Further, let $x\in\mathbb{R}^{n}$ and let $\bar x\in\mathbb{R}^{2n-1}$ be equal to $x$ padded with $n-1$ zeros.Then $$ T(a)x=[C([a_{n+1:2n-1};a_{1:n}])\bar x]_{1:n}. $$
Fact 1 implies $H(a)x=(T(a)J)x=T(a)(Jx)$.
In [20]:
# Fact 1
eye(n)=Matrix{Int64}(I,n,n)
J=rotl90(eye(5))
T*J
Out[20]:
In [21]:
J
Out[21]:
In [22]:
rotl90(T)
Out[22]:
In [23]:
# Fact 1
Matrix(T)*J
Out[23]:
In [24]:
# Fact 2
import Random
Random.seed!(467)
a=rand(-8:8,6)
n=length(a)
C=Circulant(a)
ω=exp(2*pi*im/n)
v=[ω^k for k=0:n-1]
F=Vandermonde(v)
U=F/sqrt(n)
λ=Matrix(F)'*a
Out[24]:
In [25]:
a
Out[25]:
In [26]:
fft(a)
Out[26]:
In [27]:
eigvals(Matrix(C))
Out[27]:
In [28]:
v
Out[28]:
In [29]:
F
Out[29]:
In [30]:
C
Out[30]:
In [31]:
# Residual
norm(Matrix(C)*U-U*Diagonal(λ))
Out[31]:
In [32]:
?fft;
In [33]:
# Check fft
norm(λ-fft(a))
Out[33]:
Fact 3 - Circulant() x vector, as implemented in the package SpecialMatrices.jl
function *(C::Circulant{T},x::Vector{T}) where T
xt=fft(x)
vt=fft(C.c)
yt=vt.*xt
real(ifft(yt))
end
Similarly, mul!()
function mul!(y::StridedVector{T},C::Circulant{T},x::StridedVector{T}) where T
xt=fft(x)
vt=fft(C.c)
yt=ifft(vt.*xt)
if T<: Int
map!(round,y,yt)
elseif T<: Real
map!(real,y,yt)
else
copy!(y,yt)
end
return y
end
In [34]:
x=rand(-9:9,n)
Out[34]:
In [35]:
@which C*x
Out[35]:
In [36]:
[Matrix(C)*x C*x mul!(similar(x),C,x)]
Out[36]:
In [37]:
# Fact 4 - Embedding Toeplitz() into Circulant()
n=5
a=rand(-6:6,2*n-1)
T=Toeplitz(a)
Out[37]:
In [38]:
C=Circulant([a[n:2*n-1];a[1:n-1]])
Out[38]:
In [39]:
# Fact 5 - Toeplitz() x vector
x=rand(-6:6,n)
Out[39]:
In [40]:
[Matrix(T)*x T*x mul!(similar(x),T,x)]
Out[40]:
In [43]:
# Fact 6 - Hankel() x vector
H₁=Hankel(a)
Out[43]:
In [44]:
[Matrix(H₁)*x H₁*x mul!(similar(x),H₁,x)]
Out[44]:
Given a Hankel matrix $H$, the Lanczos method can be applied by defining a function (linear map) which returns the product $Hx$ for any vector $x$. This approach uses the package LinearMaps.jl and is described in the notebook Symmetric Eigenvalue Decomposition - Lanczos Method notebook.
The computation is very fast and allocates little extra space.
In [45]:
# import Pkg; Pkg.add("LinearMaps")
In [46]:
using LinearMaps
In [47]:
n=size(H,1)
f(x)=mul!(similar(x),H,x)
Out[47]:
In [48]:
H
Out[48]:
In [49]:
A=LinearMap(f,n,issymmetric=true)
Out[49]:
In [50]:
size(A)
Out[50]:
In [51]:
@time eigvals(Matrix(H));
In [52]:
# import Pkg; Pkg.add("Arpack")
In [53]:
using Arpack
In [55]:
# Run twice
@time λA,UA=eigs(A, nev=6, which=:LM)
Out[55]:
Let $x\in\mathbb{R}^{m}$, denote a signal with $N$ samples.
Assume $x$ is composed of $L$ non-stationary mono-components:
$$ x=\sum\limits_{k=1}^L x^{(k)}, $$where
$$ x^{(k)}_i=A_k \cos(2\pi f_k i +\theta_k),\quad i=1,2,\ldots,m. $$Assume that the normalized frequencies $f_k=\displaystyle\frac{F_k}{F}$, the sampling frequencies $F_k$, the amplitudes $A_k$, and the phases $\theta_k$, all sightly change in time.
Let $H\equiv H(x)$ be the Hankel matrix of $x$. The eigenpair of $(\lambda,u)$ of $H$ is significant if $|\lambda|> \tau \cdot \sigma(H)$. Here $\sigma(H)$ is the spectral radius of $H$, and $\tau$ is the significant threshold percentage chosen by the user depending on the type of the problem.
The following algorithm decomposes the signal $x$:
Each tone has its fundamental frequency (mono-component). However, musical instruments produce different overtones (harmonics) which are near integer multiples of the fundamental frequency. Due to construction of resonant boxes, these frequencies slightly vary in time, and their amplitudes are contained in a time varying envelope.
Tones produces by musical instruments are nice examples of non-stationary signals. We shall decompose the note A4 played on piano.
For manipulation of recordings, we are using package WAV.jl. Another package with similar functionality is the package AudioIO.jl.
In [56]:
# import Pkg; Pkg.add("WAV")
In [57]:
using WAV
In [58]:
varinfo(WAV)
Out[58]:
In [59]:
# Load a signal
Signal = wavread("files/piano_A41.wav")
Out[59]:
In [60]:
typeof(Signal)
Out[60]:
In [61]:
s=Signal[1]
Fs=Signal[2]
Out[61]:
In [62]:
# Play the signal
wavplay(s,Fs)
In [63]:
# Plot the signal
plot(s)
Out[63]:
In [64]:
# Plot in time scale
t=range(0,stop=length(s)/Fs,length=length(s))
plot(t,s,xlabel="Time")
Out[64]:
In [65]:
# Total time and number of samples
t[end], length(s)
Out[65]:
In [66]:
# Detail
plot(s[1:800])
Out[66]:
Let us visualize the signal in detail.
In [67]:
k=500
plot(collect(k:k+1000),s[k:k+1000])
Out[67]:
In [68]:
# Last part of the signal is just noise, so we read a
# shorter signal. N must be odd.
Signal = wavread("files/piano_A41.wav",100001)
Out[68]:
In [69]:
s=Signal[1];
In [70]:
# wavplay(s,Fs)
In [71]:
# File to play on Windows
wavwrite(Signal[1],"files/piano_A41_short.wav",Fs=Signal[2])
In [72]:
# Plot the short signal in time
t=range(0,stop=length(s)/Fs,length=length(s))
plot(t,s)
Out[72]:
In [73]:
# Check the signal with FFT
# Notice 3 stronger harmonics and six more weaker ones
fs=abs.(fft(s))
plot(t,fs)
Out[73]:
In [74]:
# Details
l=10000
plot(t[1:l],fs[1:l])
Out[74]:
In [75]:
# Form the Hankel matrix
# IMPORTANT - Do not try to display H - it is a 50001 x 50001 matrix.
H=Hankel(vec(s));
In [76]:
size(H), H[100,200]
Out[76]:
In [77]:
@time fft(s);
In [78]:
# We are looking for 20 eigenvalue pairs
n=size(H,1)
f(x)=mul!(similar(x),H,x)
A=LinearMap(f,n,issymmetric=true)
size(A)
Out[78]:
In [79]:
@time λ,U=eigs(A, nev=40, which=:LM)
Out[79]:
In [80]:
# Count the eigenvalue pairs (+-) larger than the 10% of the maximum
τ=0.1
L=round(Int,(sum(abs.(λ).>(τ*maximum(abs,λ)))/2))
Out[80]:
At this point, the implementation using full matrices is rather obvious. However, we cannot do that, due to large dimension. Recall, the task is to define Hankel matrices $H_k$ for $k=1,\ldots,L$, from the signal obtained by averaging the skew-diagonals of the matrices
$$ H_k=\lambda_k U_{:,k}U_{:,k}^T + \lambda_{n-k+1} U_{:,n-k+1}U_{:,n-k+1}^T, $$without actually forming the matrices.
This is a nice programming excercise which can be solved using $\cdot$ products.
In [81]:
function myaverages(λ::T, u::Vector{T}) where T
n=length(u)
x=Array{Float64}(undef,2*n-1)
# Average lower diagonals
for i=1:n
x[i]=dot(u[1:i],reverse(u[1:i]))/i
end
for i=2:n
x[n+i-1]=dot(u[i:n],reverse(u[i:n]))/(n-i+1)
end
λ*x
end
Out[81]:
In [82]:
# A small test
u=[1,2,3,4,5]
u*u'
Out[82]:
In [83]:
myaverages(1,u)
Out[83]:
We now execute the first step of the algorithm from the above Fact.
Notice that eigs()
returns the eigenvalues arranged by the absoulte value, so the consecutive
pairs define the $i$-th signal. The computation of averages is long - it requires $O(n^2)$
operations and takes several minutes.
In [84]:
# This step takes 7 minutes, so we skip it
# xcomp=Array(Array{Float64},L)
# for k=1:L
# xcomp[k]=myaverages(λ[2*k-1],U[:,2*k-1])+myaverages(λ[2*k],U[:,2*k])
# end
Can we do without averaging?
The function myaverages()
is very slow - 7 minutes, compared to 5 seconds for the eigenvalue computation.
The simplest option is to disregard the averages, and use the first column and the last row of the underlying matrix, as in definition of Hankel matrices, which we do next. Smarter approach might be to use small random samples to compute the averages.
Let us try the simple approach for the fundamental frequency. (See also the notebook Examples in Signal Decomposition.ipynb.)
In [85]:
xcomp=Array{Array{Float64}}(undef,L)
for k=1:L
k1=2*k-1
k2=2*k
xsimple=[(λ[k1]*U[1,k1])*U[:,k1]; (λ[k1]*U[n,k1])*U[2:n,k1]]
xsimple+=[(λ[k2]*U[1,k2])*U[:,k2]; (λ[k2]*U[n,k2])*U[2:n,k2]]
xcomp[k]=xsimple
end
Let us look and listen to what we got:
In [86]:
typeof(xcomp[1])
Out[86]:
In [87]:
k=7
plot(t,xcomp[k])
Out[87]:
In [88]:
k=4
plot(t[1:1000],xcomp[k][1:1000])
Out[88]:
In [89]:
# Plot short first parts of FFTs of and display frequency
l=10000
k=11
fs=abs.(fft(xcomp[k]))
m,ind=findmax(fs[1:l])
println("Frequency = ", ind*Fs/length(fs) ," Hz, Amplitude = ", m)
plot(t[1:l],fs[1:l])
Out[89]:
We see that all xcomp[k]
are clean mono-components - see
Physics of Music - Notes:
1 = 440 Hz (A4)
2 = 880 Hz (2*440,+octave,A5)
3 = 1320 Hz (3*440,+quint,E6)
4 = 440 Hz
5 = 880 Hz
6 = 2200 Hz (5*440,++major terza, C#7)
7 = 2640 Hz (6*440,++quint,E7)
8 = 440 Hz
9 = 2200 Hz
10 = 1760 Hz (4*440,++octave,A6)
11 = 2640 Hz
N.B. Some mono-components are repeated, and they should be grouped by adding components with absolute weighted correlation larger than some prescribed threshold.
In [90]:
# wavplay(xcomp[2],Fs)
In [91]:
wavplay(sum([xcomp[i] for i=1:11]),Fs)
In [92]:
# On Windows, store the mono-components
for i=1:11
wavwrite(xcomp[i],"files/comp$i.wav",Fs=Signal[2])
end
In [93]:
wavwrite(sum([xcomp[i] for i=1:11]),"files/compsum.wav",Fs=Signal[2])