In [1]:
from bregman.suite import *
import sparseapprox as S
Analyze 10 s of a natural scene (a field recording) on a log-frequency scale (constant-Q transform), then plot the resulting time-frequency representation and its time-averaged constant-Q spectrum.
In [2]:
x,sr,fmt = wavread('chernobyl.wav', last=10*44100) # load the first 10 s of the sound file (assuming 44.1 kHz)
x = x.mean(1) # take the average of the channels
F = LogFrequencySpectrum(x, nhop=4096, nfft=16384, wfft=8192, npo=24) # constant-Q transform
F.feature_plot(dbscale=1) # plot the transform
figure(); title('Average Constant-Q Spectrum')
plot((F.X**2).sum(1)) # power summed over time (the time average up to a constant factor)
Out[2]:
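To make the `npo=24` and `nhop=4096` settings concrete, here is a minimal standalone sketch (Python 3, no bregman required) of how log-frequency band centers are spaced and how many hops fit into the 10 s excerpt. The lowest frequency `f_min` and the octave count are hypothetical values chosen for illustration, not bregman's defaults, and the exact number of frames bregman produces may differ slightly depending on padding.

```python
def cq_center_frequencies(f_min, n_octaves, bands_per_octave):
    """Geometrically spaced band centers: f_k = f_min * 2**(k / bands_per_octave)."""
    n_bands = n_octaves * bands_per_octave
    return [f_min * 2.0 ** (k / bands_per_octave) for k in range(n_bands)]

# f_min and n_octaves are assumptions for illustration; npo=24 matches the cell above.
freqs = cq_center_frequencies(f_min=55.0, n_octaves=7, bands_per_octave=24)
octave_ratio = freqs[24] / freqs[0]   # 24 bands span exactly one octave

# Time resolution implied by nhop=4096 at a 44.1 kHz sample rate.
sample_rate = 44100
n_samples = 10 * sample_rate
n_hops = n_samples // 4096            # number of whole hops in the excerpt
hop_seconds = 4096 / sample_rate      # spacing between analysis frames

print(len(freqs), octave_ratio, n_hops, round(hop_seconds, 4))
```

With 24 bands per octave, adjacent bands are a quarter-tone apart, and each analysis frame advances by a little under a tenth of a second.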
Invert the constant-Q transform to an audio signal, using the inverse constant-Q transform followed by the inverse short-time Fourier transform.
In [3]:
xh = F.inverse()
play(balance_signal(xh))
Learn sparse codes from data using dictionary learning on 16x16 patches of the constant-Q transform.
In [4]:
SS = S.SparseApproxSpectrum(n_components=9, patch_size=(16,16)) # learn 9 components from 16x16 patches
SS.extract_codes(F.X, standardize=1) # Use standardized patches
SS.plot_codes(cbar=1,cmap=cm.hot) # show the learned codes
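The idea of cutting a spectrogram into patches and standardizing them before dictionary learning can be sketched in plain Python (Python 3). This is an illustration only: the helper names `extract_patches` and `standardize` are hypothetical, the toy matrix stands in for `F.X`, and the `sparseapprox` module's actual extraction and standardization may differ.

```python
import math

def extract_patches(X, patch_rows, patch_cols):
    """Collect every overlapping patch_rows x patch_cols window of X, flattened row by row."""
    patches = []
    for r in range(len(X) - patch_rows + 1):
        for c in range(len(X[0]) - patch_cols + 1):
            patches.append([X[r + i][c + j]
                            for i in range(patch_rows)
                            for j in range(patch_cols)])
    return patches

def standardize(patch, eps=1e-12):
    """Shift a patch to zero mean and scale it to (approximately) unit variance."""
    mean = sum(patch) / len(patch)
    var = sum((v - mean) ** 2 for v in patch) / len(patch)
    return [(v - mean) / (math.sqrt(var) + eps) for v in patch]

# Toy 3x3 "spectrogram"; the notebook uses 16x16 patches of F.X instead.
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]
patches = extract_patches(X, 2, 2)
z = [standardize(p) for p in patches]
print(len(patches), patches[0])
```

Standardizing each patch removes differences in overall level, so the learned components describe spectro-temporal shape rather than loudness.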
Reconstruct the constant-Q spectrogram using each learned patch basis. Do this for each patch separately.
In [5]:
# Reconstruct the spectra from sparse dictionary
SS.reconstruct_individual_spectra(plotting=1)
Out[5]:
In [10]:
# Reconstruct signal from approximated dictionary atom spectrum
x_hat = F.inverse(SS.X_hat_l[0]) # <- change the reconstructed patch index here
feature_plot(SS.X_hat,dbscale=0,normalize=1)
feature_plot(F.X,dbscale=0,normalize=1)
play(balance_signal(F.x_hat))
In [7]:
# Make random phase map for signal reconstruction
Phi_hat=rand(*F.STFT.shape)*2*pi-pi
print(Phi_hat.shape) # one random phase in [-pi, pi) per STFT bin
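The cell above draws one phase value per STFT bin, uniformly between -pi and pi. As a rough standalone illustration (Python 3) of what pairing a magnitude spectrum with such a phase map means, the sketch below builds complex bins from toy magnitudes and random phases. The helper names are hypothetical, and this is not a description of how `F.inverse` uses `Phi_hat` or `pvoc` internally.

```python
import cmath
import math
import random

def random_phase_map(n_rows, n_cols, rng):
    """One phase per bin, drawn uniformly between -pi and pi."""
    return [[rng.uniform(-math.pi, math.pi) for _ in range(n_cols)]
            for _ in range(n_rows)]

def apply_phases(magnitudes, phases):
    """Combine magnitudes and phases into complex bins: m * exp(1j * phi)."""
    return [[cmath.rect(m, p) for m, p in zip(m_row, p_row)]
            for m_row, p_row in zip(magnitudes, phases)]

rng = random.Random(0)                   # fixed seed so the sketch is repeatable
magnitudes = [[1.0, 0.5], [2.0, 0.0]]    # toy magnitude spectrum (bins x frames)
phases = random_phase_map(2, 2, rng)
spectrum = apply_phases(magnitudes, phases)
print(abs(spectrum[0][0]), abs(spectrum[1][0]))
```

The magnitudes are unchanged by the phase assignment; only the phase relationships between bins and frames are randomized, which is why a random-phase reconstruction keeps the spectral envelope but can sound different from the original.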
In [9]:
# Reconstruct signal from random phases and approximated dictionary atom spectrum
x_hat = F.inverse(SS.X_hat_l[0], Phi_hat=Phi_hat, pvoc=1) # <- change the reconstructed patch index here
play(balance_signal(F.x_hat))