Computational Spectromorphology

Michael A. Casey, Dartmouth College

License:
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
http://creativecommons.org/licenses/by-nc/4.0/


In [2]:
from bregman.suite import *
import glob

In [4]:
# Load a 60s segment of electroacoustic music (Bernard Parmegiani)
x,sr,fmt=wavread('IncidencesRésonances.wav')
print x.shape[0]/float(sr), 'secs'  # duration from the file's own sample rate
play(x)


60.0 secs
Period size is 64 , Buffer size is 22050

Constant-Q Spectrum


In [5]:
F = LogFrequencySpectrum(x,nfft=4096,wfft=4096,nhop=2048)
F.feature_plot(dbscale=1)
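
As a rough illustration of what a constant-Q (log-frequency) axis means, the sketch below computes geometrically spaced bin centre frequencies and the resulting Q factor using NumPy only. It does not call bregman. The lowest frequency (62.5 Hz) and 12 bins per octave are assumed values chosen for illustration, the 95 bins match the feature shape printed later in this notebook, and the function names are hypothetical.

```python
import numpy as np

def log_freq_centers(f_lo, bins_per_octave, n_bins):
    """Centre frequencies spaced by a constant ratio 2**(1/bins_per_octave)."""
    return f_lo * 2.0 ** (np.arange(n_bins) / float(bins_per_octave))

def constant_q(bins_per_octave):
    """Q = centre frequency / bandwidth, identical for every bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)

# Assumed parameters (not read from bregman): 62.5 Hz lowest bin, 12 bins per octave.
centers = log_freq_centers(62.5, 12, 95)
q = constant_q(12)
```

Because the ratio between neighbouring bins is fixed, a pitch transposition becomes a vertical shift along this axis, which is what makes frequency-shift-invariant factorization meaningful on this representation.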


Non-Negative Matrix Factorization

2D Convolutive Probabilistic Latent Component Analysis (2DPLCA)


In [30]:
help(F.separate)


Help on method separate in module bregman.features_base:

separate(self, cls=<class 'bregman.plca.PLCA'>, n=5, play_flag=False, **kwargs) method of bregman.features.LogFrequencySpectrum instance
    Features may be separated into auditory streams using probabilistic latent component analysis (PLCA).
    Use the following functions to process features into separate auditory information streams, corresponding to underlying auditory objects in the mixed input signal.
    
    separate is a meta function to call class methods on the passed class.
    ::
        cls  # The PLCA class to use (PLCA, SIPLCA, SIPLCA2, etc.) [plca.PLCA]
        n    # The number of PLCA components to initialize
        optional kwargs:
        x    # original analyzed audio signal (for play_components)
        kwargs # win=(shifts,len) for frequency and time shift invariance, 
        alphaW # Dirichlet prior on W basis
        alphaZ # Dirichlet prior on Z
        alphaH # Dirichlet prior on H
        betaW # Entropic prior on W basis
        betaZ # Entropic prior on Z
        betaH # Entropic prior on H

2D (Frequency) Shift-Invariant Matrix Factorization

$$ X = (WZ) \otimes H^T $$
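
One way to read the factorization above: each component k has a spectro-temporal patch W[:,k,:] that is copied to every frequency shift r and every time position, weighted by H[k,r,t] and by Z[k]. The NumPy sketch below spells this out with zero-padded shifts. It is an illustration under assumed array layouts (W as frequency x component x patch frame, H as component x frequency shift x time, chosen to match the shapes printed later in this notebook), not bregman's SIPLCA2 implementation, and the helper names are hypothetical.

```python
import numpy as np

def shift_zero(a, n, axis):
    """Shift a 2-D array by n >= 0 steps along axis, zero-filling the vacated part."""
    out = np.zeros_like(a)
    if n == 0:
        out[...] = a
    elif axis == 0:
        out[n:, :] = a[:-n, :]
    else:
        out[:, n:] = a[:, :-n]
    return out

def reconstruct_2d(W, Z, H):
    """Sum over components, frequency shifts r and patch frames tau."""
    n_freq, n_comp, n_win = W.shape
    _, n_shift, n_time = H.shape
    X_hat = np.zeros((n_freq, n_time))
    for k in range(n_comp):
        for r in range(n_shift):
            for tau in range(n_win):
                # Column tau of the patch, moved up the frequency axis by r bins.
                basis = shift_zero(W[:, k, tau:tau + 1], r, axis=0)
                # Activation row for shift r, delayed by tau frames.
                act = shift_zero(H[k, r:r + 1, :], tau, axis=1)
                # Broadcasting (n_freq, 1) * (1, n_time) gives an outer product.
                X_hat += Z[k] * basis * act
    return X_hat

# Toy input with the same shapes as W, Z, H in this notebook.
W_toy = np.zeros((95, 1, 20)); W_toy[10, 0, 3] = 1.0
H_toy = np.zeros((1, 12, 134)); H_toy[0, 5, 40] = 2.0
X_toy = reconstruct_2d(W_toy, np.array([1.0]), H_toy)
```

In the toy input a single patch cell at frequency bin 10, patch frame 3 is activated at shift 5 and time 40, so all of its energy lands at bin 15, frame 43.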


In [35]:
# Use 2D Shift-invariant PLCA, 1 component, window with 12 shifts and 20 frames
F.separate(plca.SIPLCA2, n=1, win=(12,20))
F.normalize_components()
W, Z, H, Xhat = F.w, F.z, F.h, F.X_hat

In [36]:
print "shapes of components W,Z,H: ", W.shape, Z.shape, H.shape
print " reconstruction / original: ", Xhat[0].shape, F.X.shape


shapes of components W,Z,H:  (95, 1, 20) (1,) (1, 12, 134)
 reconstruction / original:  (95, 134) (95, 134)

In [37]:
for k in range(len(Z)):
    fig,axes = subplots(1, 2, figsize=(12.75,4))
    im=axes[0].imshow(20*log10(clip(W[:,k,:],0.0001,1)/W[:,k,:].max()),aspect='auto',origin='lower',cmap=cm.jet)
    axes[0].set_title('Spectro-Temporal Basis Function')
    colorbar(im,ax=axes[0])

    im=axes[1].imshow(20*log10(clip(H[k],0.000001,1)/H[k].max()), aspect='auto',origin='lower',cmap=cm.jet)
    #feature_plot(sin(U),cmap=cm.hot,nofig=1)
    axes[1].set_title('Time-Frequency Shift Matrix')
    colorbar(im,ax=axes[1])

    fig.subplots_adjust(wspace=0.1)
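
Both panels in the loop above apply the same decibel scaling relative to the panel maximum. A minimal helper for that step might look like the following. The name `to_db` and the `floor_db` parameter are hypothetical, and unlike the cell above it normalizes by the peak before clipping, so the floor is always expressed relative to the maximum.

```python
import numpy as np

def to_db(a, floor_db=-80.0):
    """Scale a non-negative array to dB relative to its peak, floored at floor_db."""
    a = np.asarray(a, dtype=float)
    peak = a.max()
    if peak <= 0:
        raise ValueError("array must contain at least one positive value")
    floor = 10.0 ** (floor_db / 20.0)   # linear amplitude of the dB floor
    return 20.0 * np.log10(np.clip(a / peak, floor, 1.0))

db_demo = to_db([1.0, 0.1, 0.0])
```

A floor of -80 dB corresponds to the 0.0001 clip used for W, and -120 dB to the 0.000001 clip used for H.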



In [38]:
for k in range(len(Xhat)):
    #subplot(ceil(len(Xhat)/3.),3,k+1)
    feature_plot(Xhat[k], dbscale=1, normalize=1)
    x_hat = F.inverse(Xhat[k])
    play(balance_signal(x_hat))


Period size is 64 , Buffer size is 22050

In [13]:
F.play_components()


Period size is 32 , Buffer size is 22050
Period size is 32 , Buffer size is 22050
Period size is 32 , Buffer size is 22050
Period size is 32 , Buffer size is 22050
