With this exercise you will get a better understanding of some of the Fourier theorems and of some useful properties of the DFT. You will write code to implement and verify several properties of the DFT that are discussed in the lectures. You will also learn to use the dftModel.py
module of sms-tools, which contains the basic python functions implementing the DFT. There are five parts in the exercise: 1) Minimize energy spread in DFT of sinusoids, 2) Optimal zero-padding, 3) Symmetry properties of the DFT, 4) Suppressing frequency components using DFT model, and 5) FFT size and zero-padding.
DFT of sinusoids: When a real sinusoid has an integer number of cycles in $N$ samples, the frequency of the sinusoid exactly matches one of the bin frequencies in an $N$ point DFT. Hence the DFT spectrum of the sinusoid has a value of zero at every DFT bin except at the two bins that match the frequency of the sinusoid. Otherwise, the energy of the sinusoid is spread over all the bins. When there are multiple sinusoids, the equations extend to each sinusoid.
\begin{eqnarray} x[n]&=&A_{0}\cos\left(2\pi k_{0}n/N\right)=\frac{A_{0}}{2}{\textstyle e}^{j2\pi k_{0}n/N}+\frac{A_{0}}{2}{\textstyle e}^{-j2\pi k_{0}n/N}\\ X[k] &=& \frac{A_0}{2} \,\,\, \mathrm{for} \,\,\, k = k_0, -k_0; \,\,\,\, 0 \,\,\,\mathrm{otherwise} \end{eqnarray}Zero-padding: Zero-padding a signal is done by adding zeros at the end of the signal. If we perform zero-padding to a signal before computing its DFT, the resulting spectrum will be an interpolated version of the spectrum of the original signal. In most implementations of the DFT (including the FFT algorithms) when the DFT size is larger than the length of the signal, zero-padding is implicitly done.
Zero phase windowing: Zero phase windowing of a frame of a signal puts the centre of the signal at the zero time index for DFT computation. By moving the centre of the frame to zero index by a circular shift, the computed DFT will not have the phase offset which would have otherwise been introduced (recall that a shift of the signal causes the DFT to be multiplied by a complex exponential, which keeps the magnitude spectrum intact but changes the phase spectrum). When used in conjunction with zero-padding, zero phase windowing is also useful for the creation of a frame of length of power of 2 for FFT computation (fftbuffer
).
If the length of the signal $x$ is $M$ and the required DFT size is $N$, the zero phase windowed version of the signal, dftbuffer
, for DFT computation can be obtained by (works for both even and odd $M$):
hM1 = floor((M+1)/2)
hM2 = floor(M/2)
dftbuffer = zeros(N)
dftbuffer[:hM1] = x[hM2:]
dftbuffer[-hM2:] = x[:hM2]
Real, even and odd signals: A signal is real when it does not have any imaginary component, and all sounds are real signals. A signal $x$ is even if $x[n] = x[-n]$, and odd if $x[n] = -x[-n]$. For a signal of length $M$ (and $M$ is odd), in the context of a zero phase windowed signal and its DFT, the signal is even if $x[n] = x[M-n]$ and odd if $x[n] = -x[M-n]$, $1 \leq n \leq M-1$. The DFT properties show that for real input signals, the magnitude spectrum is even and the phase spectrum is odd. Furthermore, when the input signal is both real and even, the DFT is real valued, with an even magnitude spectrum and imaginary component equal to zero. In summary, if $x$ is an input signal of length $M$ ($M$ is odd) and $X = \mathrm{DFT}(x,M)$, then for $1 \leq k \leq M-1$
If $x$ is real, $\left|X[k]\right| = \left|X[M-k]\right|$ and $\boldsymbol{<}\!X[k] = -\boldsymbol{<}\!X[M-k]$
If $x$ is real and even, $\left|X[k]\right| = \left|X[M-k]\right|$ and $\mathrm{imag}(X[k]) = 0$
Positive half of the DFT spectrum: Audio signals are real signals. Due to the symmetry properties of the DFT of a real signal, it is sufficient to store only one half of the magnitude and phase spectra. To save on both storage and computation, we will just store just the half spectrum when possible.
From an $N$ point DFT ($N$ even), we can obtain the positive half of the spectrum by considering only the first $(N/2)+1$ samples of the DFT. We can compute the magnitude spectrum of the positive half (in dB) as $m_X = 20\log_{10}\left|X[0:(N/2)+1]\right|$, where $X$ is the DFT of the input.
Filtering: Filtering involves selectively suppressing certain frequencies present in the signal. Filtering is often performed in the time domain by the convolution of the input signal with the impulse response of a filter. The same operation can also be done in the DFT domain using the properties of DFT, by multiplying the DFT of the input signal by the DFT of the impulse response of the filter. In this assignment, we will consider a very simple illustrative filter that suppresses some frequency components by setting some DFT coefficients to zero. It is to be noted that the convolution operation here is circular convolution with a period $N$, the size of the DFT.
If $x_1[n] \Leftrightarrow X_1[k]$ and $x_2[n] \Leftrightarrow X_2[k]$, $x_1[n] * x_2[n] \Longleftrightarrow X_1[k]\,X_2[k]$
Given an input signal consisting of two sinusoids, the function minimize_energy_spread_dft()
should select the first M
samples from the signal and return the positive half of the DFT magnitude spectrum (in dB), such that it has only two non-zero values.
M
is to be calculated as the smallest positive integer for which the positive half of the DFT magnitude spectrum has only two non-zero values. To get the positive half of the spectrum, first compute the M
point DFT of the input signal (for this you can use the fft()
function of scipy.fftpack
). Consider only the first (M/2)+1
samples of the DFT, computing the magnitude spectrum of the positive half (in dB) as mX = 20*log10(abs(X[:M/2+1]))
, where X
is the DFT of the input signal.
The input arguments to this function are the input signal x
(of length W >= M) consisting of two sinusoids of frequency f1
and f2
, the sampling frequency fs
and the value of frequencies f1
and f2
. The function should return the positive half of the magnitude spectrum mX
. For this question, you can assume the input frequencies f1
and f2
to be positive integers and factors of fs
, and that M
is even.
Due to the precision of the FFT computation, the zero values of the DFT are not zero but very small values < 1e-12 (or -240 dB) in magnitude. For practical purposes, all values with absolute value less than 1e-6 (or -120 dB) can be considered to be zero.
HINT: The DFT magnitude spectrum of a sinusoid has only one non-zero value (in the positive half of the DFT spectrum) when its frequency coincides with one of the DFT bin frequencies. This happens when the DFT size (M
in this question) contains exactly an integer number of periods of the sinusoid. Since the signal in this question consists of two sinusoids, this condition should hold true for each of the sinusoids, so that the DFT magnitude spectrum has only two non-zero values, one per sinusoid.
M
can be computed as the Least Common Multiple (LCM) of the sinusoid periods (in samples). The LCM of two numbers x
, y
can be computed as: x*y/gcd(x,y)
, where gcd denotes the greatest common divisor.
In [ ]:
from scipy.fftpack import fft, fftshift
import numpy as np
from math import gcd, ceil, floor
import sys
sys.path.append('../software/models/')
from dftModel import dftAnal, dftSynth
from scipy.signal import get_window
import matplotlib.pyplot as plt
In [ ]:
# E3 - 1.1: Complete the function minimize_energy_spread_dft()
def minimize_energy_spread_dft(x, fs, f1, f2):
""" From a signal with two sinusoids compute its magnitude spectrum having only two non-zero value.
Args:
x (np.array): input signal
fs (float): sampling frequency in Hz
f1 (float): frequency of first sinusoid component in Hz
f2 (float): frequency of second sinusoid component in Hz
Returns:
np.array: positive half of magnitude spectrum (in dB)
"""
### Your code here
Test cases for minimize_energy_spread_dft()
:
Test case 1: For an input signal x
sampled at fs = 10000
Hz that consists of sinusoids of frequencies f1 = 80
Hz and f2 = 200
Hz, you need to select M = 250
samples of the signal to meet the required condition. In this case, output mX
is 126 samples in length and has non-zero values at bin indices 2 and 5 (corresponding to the frequency values of 80 and 200 Hz, respectively). You can create a test signal x
by generating and adding two sinusoids of the given frequencies.
Test case 2: For an input signal x
sampled at fs = 48000
Hz that consists of sinusoids of frequencies f1 = 300
Hz and f2 = 800
Hz, you need to select M = 480
samples of the signal to meet the required condition. In this case, output mX
is 241 samples in length and has non-zero values at bin indices 3 and 8 (corresponding to the frequency values of 300 and 800 Hz, respectively). You can create a test signal x
by generating and adding two sinusoids of the given frequencies.
In [ ]:
# E3 - 1.2: Compute and plot the two input signals proposed above, call the function minimize_energy_spread_dft(),
# and plot the output magnitude spectra
### Your code here
The function test_real_even()
should check if the input signal is real and even using the symmetry properties of its DFT. The function will return the result of this test, the zerophase windowed version of the input signal (dftbuffer
), and its DFT.
Given an input signal x
of length M
, do a zero phase windowing of x
without any zero-padding. Then compute the M
point DFT of the zero phase windowed signal and use the symmetry of the computed DFT to test if the input signal x
is real and even. Return the result of the test, the dftbuffer
computed, and the DFT of the dftbuffer
.
The input argument is a signal x
of length M
. The output is a tuple with three elements
(isRealEven, dftbuffer, X)
, where isRealEven
is a boolean variable which is True
if x
is real and even, else False
. dftbuffer
is the M
length zero phase windowed version of x
. X
is the M
point DFT of the dftbuffer
.
To make the problem easier, we will use odd length input sequence in this question (M
is odd).
Due to the precision of the FFT computation, the zero values of the DFT are not zero but very small values < 1e-12 in magnitude. For practical purposes, all values with absolute value less than 1e-6 can be considered to be zero. Use an error tolerance of 1e-6 to compare if two floating point arrays are equal.
Caveat: Use the imaginary part of the spectrum instead of the phase to check if the input signal is real and even.
In [ ]:
# E3 - 2.1: Complete the function test_real_even()
def test_real_even(x):
"""check if x is real and even using the symmetry properties of its DFT.
Args:
x (np.array): input signal of length M (M is odd)
Returns:
tuple including:
isRealEven (boolean): True if input x is real and even, and False otherwise
dftbuffer (np.array): M point zero phase windowed version of x
X (np.array): M point DFT of dftbuffer
"""
### Your code here
Test cases for test_real_even()
:
Test case 1: If x = np.array([2, 3, 4, 3, 2])
, which is a real and even signal (after zero phase windowing), the function returns
(True, array([ 4., 3., 2., 2., 3.]), array([14.0000+0.j, 2.6180+0.j,
0.3820+0.j, 0.3820+0.j, 2.6180+0.j])) (values are approximate)
Test case 2: If x = np.array([1, 2, 3, 4, 1, 2, 3])
, which is not an even signal (after zero phase windowing), the function returns
(False, array([ 4., 1., 2., 3., 1., 2., 3.]), array([ 16.+0.j,
2.+0.69j, 2.+3.51j, 2.-1.08j, 2.+1.08j, 2.-3.51j, 2.-0.69j])) (values are approximate)
To get a more realistic example use a longer input signal and plot the real and imaginary parts of the output spectrum X
. For example, use x = get_window('hanning', 51, fftbins=False)
, which is real an even, and plot x
and the real and imaginary part of the spectrum X
.
In [ ]:
# E3 - 2.2: Plot the input signal proposed above (window signal), call the function test_real_even(),
# and plot its output spectrum (real and imaginary)
### Your code here
Given a signal as input, the function supress_freq_dft_model()
should suppress the frequency components <= 70Hz using the DFT. It should teturn the filtered signal in the time domain.
Use the DFT to implement a very basic form of frequency domain filtering. Use the functions dftAnal()
and dftSynth()
provided in the dftModel.py
module.
Use dftAnal()
to obtain the magnitude spectrum (in dB) and phase spectrum of the audio signal. Set the values of the magnitude spectrum that correspond to frequencies <= 70 Hz to -120dB (there may not be a bin corresponding exactly to 70Hz, choose the nearest bin of equal or higher frequency, e.g., using np.ceil()
).
Use dftSynth()
to synthesize the filtered output signal. Then return the filtered signal.
Use a hamming window to smooth the signal. Hence, do not forget to scale the output signals by the sum of the window values (as done in software/models_interface/dftModel_function.py
).
Please note that this question is just for illustrative purposes and filtering is not usually done this way - such sharp cutoffs introduce artifacts in the output.
The input is a M
length signal x
, sampling frequency is fs
and the FFT size N
. The output is the filtered signal.
In [ ]:
# E3 - 3.1: Complete the function suppress_freq_dft_model()
def suppress_freq_dft_model(x, fs, N):
"""
Args:
x (np.array): input signal of length N (odd)
fs (float): sampling frequency (Hz)
N (int): FFT size
Returns:
np.array: output signal with filtering (N samples long)
"""
N = len(x)
w = get_window('hamming', N)
outputScaleFactor = sum(w)
### Your code here
Test case for the function suppress_freq_dft_model()
:
Test case 1: For an input signal with 40Hz, 100Hz, 200Hz, 1000Hz components, the output should only contain 100Hz, 200Hz and 1000Hz components.
Test case 2: For an input signal with 23Hz, 36Hz, 230Hz, 900Hz, 2300Hz components, the output should only contain 230Hz, 900Hz and 2300Hz components.
To understand the effect of filtering, you can plot the magnitude spectra of the input and output signals superposed.
In [ ]:
# E3 - 3.2: Compute the input signals proposed above and plot their magnitude spectra (x-axis in Hz),
# call the function suppress_freq_dft_model(), and plot the magnitude spectra of the output signals
### Your code here
The function zp_fft_size_expt()
should take an input signal, compute three different magnitude spectra (with different parameters) and return them.
This function should provide some insights into the effects window-size, FFT-size, and zero-padding on the spectrum of a signal.
The input signal should be of size 512 samples, the sampling rate should be 1000Hz, and the analysis window used should be hamming. The three set of analysis parameters should be:
Use dftAnal()
to obtain the positive half of the magnitude spectrum (in dB). Return the 3 magnitude spectra in dB.
In [ ]:
# E3 - 4.1: Complete the function zp_fft_size_expt()
def zp_fft_size_expt(x, window_size=[256, 512, 256], FFT_size=[256, 512, 512]):
"""compute magnitude spectra of x with different window sizes and FFT sizes.
Args:
x (np.array): input signal (512 samples long)
Returns:
list with magnitude spectra (np.array)
"""
### Your code here
Test cases for the function zp_fft_size_expt()
:
Test case 1: Use as input x = .2*np.cos(2*np.pi*300*n)+.2*np.cos(2*np.pi*600*n)
where n=np.arange(512)/fs
and the sampling rate fs=1000
. Use the default arguments for window_size
and FFT_size
. Call the function with mag_spectra = zp_fft_size_expt(x)
To understand better, plot the output of dftAnal()
for each case on a common frequency axis with different colors. You will see that mag_spectra[2]
is the interpolated version of mag_spectra[0]
(zero-padding leads to interpolation of the DFT). You will also observe that the 'mainlobe' of the magnitude spectrum in mag_spectra[1]
will be narrower than that in mag_spectra[0]
and mag_spectra[2]
. This shows that having a longer window leads to a narrower mainlobe with better frequency resolution and less spreading of the energy of the sinusoid.
In [ ]:
# E3 - 4.2: Compute, plot, and play the input signal proposed above, call the function zp_fft_size_expt(), and plot
# the outputs
### Your code here
In [ ]:
# E3 - 4.3: Explain the results of Part 4. If we were to estimate the frequency of the sinusoid using its DFT,
# a first principles approach is to choose the frequency value of the bin corresponding to the maximum in the
# DFT magnitude spectrum. If you were to take this approach, which of the magnitude spectra will give you a
# better estimate of the frequency of the sinusoid? Comment and discuss.
"""
"""