In [1]:
%matplotlib inline
from matplotlib import pyplot as plt
import numpy as np
import re
Author: Tiago F. Tavares - 2016
This document contains a demonstration of a low-level feature extraction process. The main hypothesis behind this method is that perceptual audio characteristics are related to low-level spectral features. Therefore, we first calculate a spectrogram for the audio file. Then, for each spectrogram frame, we calculate a descriptive feature. In the following example, we will calculate the spectral flux, which is the sum of the positive differences between two consecutive spectrogram frames:
In [49]:
import mir3.modules.tool.wav2spectrogram as spec
import mir3.modules.features.flux as flux
# Calculate spectrogram
converter = spec.Wav2Spectrogram()
s = converter.convert(open("examples/157447__nengisuls__solo-loops-2.wav", "rb"), window_length=2048, dft_length=2048,
                      window_step=1024, spectrum_type='magnitude', save_metadata=True)
f = flux.Flux()
track = f.calc_track(s) # Feature track
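To make the definition of spectral flux concrete, the sketch below computes it with plain NumPy. It is an illustration of the definition given above, not the code inside `flux.Flux`. The function name `spectral_flux` and the `(n_bins, n_frames)` array layout are assumptions made for this example.

```python
import numpy as np

def spectral_flux(spectrogram):
    """Sum of the positive bin-wise differences between consecutive frames.

    `spectrogram` is assumed to be a 2-D magnitude array shaped
    (n_bins, n_frames); this layout is an assumption of the sketch.
    """
    spectrogram = np.asarray(spectrogram, dtype=float)
    # Difference between each frame and the previous one, per bin.
    delta = np.diff(spectrogram, axis=1)
    # Keep only increases in magnitude (half-wave rectification).
    positive = np.maximum(delta, 0.0)
    # One flux value per pair of consecutive frames.
    return positive.sum(axis=0)

# Toy spectrogram with 2 bins and 3 frames.
toy = np.array([[1.0, 3.0, 2.0],
                [0.0, 0.0, 5.0]])
toy_flux = spectral_flux(toy)
```

Note that this sketch returns one value fewer than the number of frames, because the first frame has no predecessor.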
The feature calculation generates a feature track. Like the spectrogram, it contains both data and metadata regarding the calculation. We can use this information to plot the feature track, as:
In [34]:
def plot_feature_track(s, scale=None, dim=0, size=(3.45,2.0)):
    if s.data.ndim > 1:
        d = s.data[:,dim]
    else:
        d = s.data
    min_y = np.min(d)
    max_y = np.max(d)
    min_time = 0
    max_time = float(len(d)) / s.metadata.sampling_configuration.ofs
    ylabel = s.metadata.feature.split()[dim]
    if scale is not None:
        ylabel += ' ('
        ylabel += str(scale)
        ylabel += ')'
    x_axis = np.array(range(len(d))) / \
        float(s.metadata.sampling_configuration.ofs)
    im = plt.plot(x_axis, d)
    plt.xlabel('Time (s)')
    plt.ylabel(ylabel)
    fig = plt.gcf()
    width_inches = size[0]#/80.0
    height_inches = size[1]#/80.0
    fig.set_size_inches( (width_inches, height_inches) )
    plt.show()
In [35]:
plot_feature_track(track, size=(15.45,5.0))
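The plotting function converts frame indices to seconds by dividing them by `sampling_configuration.ofs`. The sketch below isolates that step. It assumes that `ofs` is the number of feature frames per second, which is how the plotting code uses it. The helper name `frame_times` is hypothetical.

```python
import numpy as np

def frame_times(n_frames, frame_rate):
    """Time stamp, in seconds, of each frame of a feature track.

    `frame_rate` plays the role of `sampling_configuration.ofs`,
    assumed here to be expressed in frames per second.
    """
    return np.arange(n_frames) / float(frame_rate)

# Four frames at two frames per second.
times = frame_times(4, 2.0)
```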
An important metadata member is .feature. It contains the label of the feature stored in the feature track. In this case, the _0_1024 suffix indicates that the feature was calculated using bins 0 to 1024 of the spectrogram.
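If we need the bin range as numbers, the suffix can be read back from the label with a regular expression. The sketch below assumes that the label always ends in `_<first bin>_<last bin>`. The label `Flux_0_1024` and the helper `parse_bin_range` are hypothetical and only serve as an illustration.

```python
import re

def parse_bin_range(label):
    """Return (first_bin, last_bin) read from a label such as 'Flux_0_1024'.

    Returns None when the label does not end in two underscore-separated
    integers. The label format is an assumption of this sketch.
    """
    match = re.search(r'_(\d+)_(\d+)$', label)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# 'Flux_0_1024' is a hypothetical label used for illustration.
bins = parse_bin_range('Flux_0_1024')
```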
PyMIR3 also provides tools for differentiating feature tracks. The features.diff module also generates a feature track, whose feature name is equal to the input feature name preceded by a diff_ indicator:
In [51]:
import mir3.modules.features.diff as diff
d = diff.Diff()
dtrack = d.calc_track(track)
plot_feature_track(dtrack, size=(15.45,5.0))
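As a rough picture of what differentiation does, the sketch below takes the first-order difference of a track along time and builds the new label by adding the diff_ prefix. It is not the code of `diff.Diff`. In particular, the real module may pad its output to keep the original length, whereas this sketch returns one frame fewer. The helper `differentiate` and the label `Flux_0_1024` are hypothetical.

```python
import numpy as np

def differentiate(values, label):
    """First-order difference of a feature track along the time axis.

    Returns the differentiated data and a label prefixed with 'diff_'.
    The output has one frame fewer than the input (no padding).
    """
    values = np.asarray(values, dtype=float)
    delta = np.diff(values, axis=0)
    return delta, 'diff_' + label

# 'Flux_0_1024' is a hypothetical label used for illustration.
delta, delta_label = differentiate([1.0, 4.0, 9.0], 'Flux_0_1024')
```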
Texture windows are sliding windows of approximately 1 s in which feature statistics (mean and variance) are calculated. The low-pass characteristic of a texture window smooths frame-level fluctuations, so the resulting statistics describe the audio at the auditory texture level. The texture window estimator also generates a feature track. However, the generated track has two dimensions, holding the mean and the variance of the feature within the texture window:
In [59]:
import mir3.modules.tool.to_texture_window as tex
reload(tex)
# Texture window
t = tex.ToTextureWindow().to_texture(track, 40)
plot_feature_track(t, dim=0, size=(15.45,5.0))
plot_feature_track(t, dim=1, size=(15.45,5.0))
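The idea behind the texture window can be sketched as a sliding mean and variance over a one-dimensional feature track. This is an illustration of the concept, not the code of `ToTextureWindow`. The hop of one frame and the handling of the track edges are assumptions, and the helper name `texture_window` is hypothetical.

```python
import numpy as np

def texture_window(values, window_length):
    """Sliding mean and variance of a 1-D feature track.

    Returns an array shaped (n_windows, 2): column 0 holds the mean and
    column 1 the variance of each window. Windows advance one frame at
    a time and only complete windows are kept (assumptions of this sketch).
    """
    values = np.asarray(values, dtype=float)
    n_windows = len(values) - window_length + 1
    if n_windows < 1:
        raise ValueError('track is shorter than the texture window')
    out = np.empty((n_windows, 2))
    for i in range(n_windows):
        frame = values[i:i + window_length]
        out[i, 0] = frame.mean()
        out[i, 1] = frame.var()
    return out

stats = texture_window([1.0, 2.0, 3.0, 4.0], 2)
```

In the call above, the window spans 40 frames. With a window step of 1024 samples, and assuming a 44.1 kHz recording, this corresponds to 40 * 1024 / 44100, or about 0.93 s.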
You may want to join all feature tracks from a particular file into a single object, for example to save them all at once or to perform further calculations on all of them together. PyMIR3 provides the Join class to perform such operations:
In [69]:
import mir3.modules.features.join as join
j = join.Join()
all_tracks = j.join([track, dtrack])
plot_feature_track(all_tracks, dim=0, size=(15.45,5.0))
plot_feature_track(all_tracks, dim=1, size=(15.45,5.0))
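Conceptually, joining places each input track in its own column and concatenates the labels. The plotting function above reads one label per dimension with `.feature.split()`, so the sketch below separates labels with whitespace. Truncating all tracks to the shortest one is an assumption of this sketch, and the helper `join_tracks` and the labels are hypothetical.

```python
import numpy as np

def join_tracks(tracks, labels):
    """Stack feature tracks as columns and join their labels with spaces.

    Tracks are truncated to the shortest length so that they align frame
    by frame; this alignment policy is an assumption of the sketch.
    """
    columns = [np.asarray(t, dtype=float).reshape(len(t), -1) for t in tracks]
    n_frames = min(c.shape[0] for c in columns)
    data = np.hstack([c[:n_frames] for c in columns])
    return data, ' '.join(labels)

# Hypothetical labels, used for illustration only.
joined, joined_label = join_tracks(
    [[1.0, 2.0, 3.0], [10.0, 20.0]],
    ['Flux_0_1024', 'diff_Flux_0_1024'])
```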
In [74]:
print "Feature name:", t.metadata.feature
print "Input feature name:", t.metadata.input_metadata.feature
print "Input filename:", t.metadata.input_metadata.input_metadata.input.name
If a feature track was built from many inputs, then .input_metadata will be a list:
In [73]:
print "Feature name:", all_tracks.metadata.feature
print "Input feature name:", all_tracks.metadata.input_metadata[0].feature
print "Input filename:", all_tracks.metadata.input_metadata[0].input_metadata.input.name
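Because .input_metadata may be either a single object or a list, code that walks back to the original audio file has to handle both cases. The sketch below does so recursively. It uses a stand-in `Meta` class instead of real PyMIR3 metadata objects, and both `Meta` and `input_filenames` are hypothetical. The only attribute names taken from the cells above are `input_metadata`, `input` and `name`.

```python
class Meta(object):
    """Stand-in for a metadata object; not a PyMIR3 class."""
    def __init__(self, **attributes):
        self.__dict__.update(attributes)

def input_filenames(metadata):
    """Collect the names of all files at the root of a metadata chain."""
    # The root of the chain holds the opened input file.
    if hasattr(metadata, 'input'):
        return [metadata.input.name]
    parents = getattr(metadata, 'input_metadata', None)
    if parents is None:
        return []
    # A track built from many inputs stores a list of metadata objects.
    if not isinstance(parents, (list, tuple)):
        parents = [parents]
    names = []
    for parent in parents:
        names.extend(input_filenames(parent))
    return names

# Toy chain: joined track -> two feature tracks -> one spectrogram each.
spectrogram_meta = Meta(input=Meta(name='a.wav'))
flux_meta = Meta(feature='Flux_0_1024', input_metadata=spectrogram_meta)
diff_meta = Meta(feature='diff_Flux_0_1024', input_metadata=flux_meta)
joined_meta = Meta(input_metadata=[flux_meta, diff_meta])
names = input_filenames(joined_meta)
```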