Feature Extraction

Load, plot, and play simpleLoop.wav:


In [1]:
from essentia.standard import MonoLoader
# MonoLoader downmixes to mono and resamples to 44100 Hz by default.
x = MonoLoader(filename='simpleLoop.wav')()
print(x.shape)


(132300,)

In [2]:
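# The bare names used from here on (arange, plot, hamming, array, imshow, ...)
# come from the pylab namespace, e.g. via %pylab inline or this import.
from pylab import *
t = arange(len(x))/44100.0  # time axis in seconds at the 44100 Hz sample rate
plot(t, x)
xlabel('Time (seconds)')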


Out[2]:
<matplotlib.text.Text at 0x3f49450>

In [3]:
from IPython.display import Audio
Audio(x, rate=44100)


Windowing

Before extracting features from a frame of audio, we first multiply the frame by a window function, such as the Hamming window plotted below. The window tapers the frame toward its edges, which reduces the spectral leakage caused by the abrupt cut at the frame boundaries.


In [4]:
plot(hamming(101))


Out[4]:
[<matplotlib.lines.Line2D at 0x569e510>]
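
To make the multiplication concrete, here is a minimal plain-Python sketch. It assumes the textbook Hamming formula w[n] = 0.54 - 0.46 cos(2 pi n / (N - 1)), which is the one numpy.hamming uses; Essentia's Windowing may use slightly different coefficients or normalization. The helper names hamming_window_values and apply_window are hypothetical, chosen only for illustration.

```python
import math

def hamming_window_values(n_points):
    """Textbook Hamming window: 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    if len(frame) != len(window):
        raise ValueError('frame and window must have the same length')
    return [s * w for s, w in zip(frame, window)]

window = hamming_window_values(101)
# A constant frame makes the taper easy to see: the windowed frame is the window.
windowed = apply_window([1.0] * 101, window)
print(windowed[0], windowed[50], windowed[100])
```

With this formula the samples at the frame edges are scaled to about 0.08, while the center sample passes at full amplitude.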

MFCCs

Mel-frequency cepstral coefficients (MFCCs) are a set of features that describe the coarse overall shape of a spectrum but not the fine harmonic structure. MFCCs are often used to describe the timbre of a musical signal.
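
The "coarse shape" idea can be sketched as follows. MFCCs are commonly obtained by taking the logarithm of mel-band energies, applying a discrete cosine transform (DCT), and keeping only the first few coefficients. The sketch below assumes an unnormalized DCT-II and the natural logarithm; Essentia's MFCC may scale or compress differently. The band energies are made-up numbers, not output from simpleLoop.wav, and cepstral_coefficients is a hypothetical helper.

```python
import math

def cepstral_coefficients(band_energies, n_coeffs):
    """Unnormalized DCT-II of the log band energies, truncated to n_coeffs."""
    log_e = [math.log(e) for e in band_energies]
    n = len(log_e)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(log_e))
            for k in range(n_coeffs)]

# Illustrative (made-up) energies for 8 mel bands.
bands = [4.0, 3.0, 2.5, 2.0, 1.2, 0.8, 0.5, 0.3]
quiet = cepstral_coefficients(bands, 4)
loud = cepstral_coefficients([10.0 * e for e in bands], 4)

# Only coefficient 0 reacts to the overall level; the others describe shape.
print(quiet)
print(loud)
```

Scaling every band by the same factor shifts only coefficient 0, which is one reason the 0th MFCC is often left out when comparing timbre.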


In [5]:
from essentia.standard import MFCC, Spectrum, Windowing, FrameGenerator
hamming_window = Windowing(type='hamming')
spectrum = Spectrum()  # we just want the magnitude spectrum
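mfcc = MFCC()  # returns (mel band energies, MFCC coefficients)

# Keep only the coefficients (index 1) for each windowed frame.
mfccs = array([mfcc(spectrum(hamming_window(frame)))[1]
               for frame in FrameGenerator(x, frameSize=1024, hopSize=512)])
print(mfccs.shape)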


(260, 13)

Display MFCCs over time:


In [6]:
imshow(mfccs[:,1:].T, origin='lower', aspect='auto', interpolation='nearest') # Skip the 0th MFCC
yticks(range(12), range(1,13)) # Label rows with MFCC indices 1 to 12
ylabel('MFCC Index')
xlabel('Frame Index')


Out[6]:
<matplotlib.text.Text at 0x569ecd0>

Centroid

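In Essentia, the Centroid algorithm computes the centroid of whatever array it receives: pass it a time-domain frame to get the temporal centroid, or a magnitude spectrum to get the spectral centroid.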
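
For the spectral case, the centroid is the magnitude-weighted mean frequency of the frame. A minimal plain-Python sketch of that textbook definition follows. It assumes the bins are evenly spaced from 0 up to max_freq; spectral_centroid is a hypothetical helper, not Essentia's implementation.

```python
def spectral_centroid(magnitudes, max_freq):
    """Magnitude-weighted mean frequency, bins evenly spaced over [0, max_freq]."""
    total = sum(magnitudes)
    if total == 0 or len(magnitudes) < 2:
        return 0.0  # silent or degenerate frame: no meaningful centroid
    step = max_freq / (len(magnitudes) - 1)  # spacing between bins in Hz
    return sum(k * step * m for k, m in enumerate(magnitudes)) / total

# Energy concentrated in low bins gives a low centroid, and vice versa.
dark = spectral_centroid([1.0, 0.5, 0.0, 0.0, 0.0], 22050.0)
bright = spectral_centroid([0.0, 0.0, 0.0, 0.5, 1.0], 22050.0)
print(dark, bright)
```

A frame whose energy sits in the low bins yields a low centroid, and one dominated by high bins yields a high centroid, which is why the measure is often read as a rough indicator of brightness.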


In [7]:
from essentia.standard import Centroid, Spectrum, Windowing, FrameGenerator
hamming_window = Windowing(type='hamming')
spectrum = Spectrum()  # we just want the magnitude spectrum
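centroid = Centroid(range=22050)  # range = Nyquist frequency, which scales the result to Hz

# One spectral centroid per windowed frame.
centroids = array([centroid(spectrum(hamming_window(frame)))
                   for frame in FrameGenerator(x, frameSize=2048, hopSize=1024)])

plot(centroids)
ylabel('Spectral Centroid (Hz)')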
xlabel('Frame Index')


Out[7]:
<matplotlib.text.Text at 0x56b2050>