In [4]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

import base_features
import pitch_features
import feature_transforms
import utils

CATCHY feature extraction

Outline

This notebook shows how to compute features for a set of pre-segmented audio files.

Extracting catchy features from a folder of such files involves three steps:

1. Base feature extraction

Here, basic, familiar feature time series are extracted. The toolbox currently implements (wrappers for) MFCC, chroma, melody and perceptual feature extraction.

This part of the toolbox relies on a lot of external code, but it's also easy to work around: if you want to use other features, just save them to a set of csv files (one per song section; see below) in some folder (one per feature).

2. Pitch descriptor extraction

This part computes mid-level pitch descriptors from the chroma and/or melody information computed in step 1. It is essentially an implementation of several kinds of audio bigram descriptors.

See also [1] and [2].
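To give a flavour of the idea, the sketch below computes a toy audio bigram descriptor: a normalised co-occurrence matrix of chroma activations between frames a fixed lag apart. This is a simplified illustration of the general concept in [1], not the toolbox's own implementation, and the function name is hypothetical.

```python
import numpy as np

def chroma_bigram_sketch(chroma, lag=1):
    """Toy 'audio bigram' descriptor: co-occurrence of chroma activations
    between frames t and t+lag, normalised to sum to 1.

    A simplified illustration of the audio bigram idea, not CATCHY's
    actual pitch_features code. `chroma` is a (frames, 12) array.
    """
    a, b = chroma[:-lag], chroma[lag:]
    mat = a.T @ b                      # (12, 12) co-occurrence counts
    total = mat.sum()
    return mat / total if total else mat
```

For example, a chroma sequence that steps through the twelve pitch classes one at a time yields mass only on the (i, i+1) transitions.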

3. Feature transforms

Compute 'first' and 'second order' aggregates of any of the features computed in step 1 and step 2.

See [2].

[1] Van Balen, J., Wiering, F., & Veltkamp, R. (2015). Audio Bigrams as a Unifying Model of Pitch-based Song Description. In Proc. 11th International Symposium on Computer Music Multidisciplinary Research (CMMR). Plymouth, United Kingdom.

[2] Van Balen, J., Burgoyne, J. A., Bountouridis, D., Müllensiefen, D., & Veltkamp, R. (2015). Corpus Analysis Tools for Computational Hook Discovery. In Proc. 16th International Society for Music Information Retrieval Conference (pp. 227–233). Malaga, Spain.

Dataset

Let's import some audio data and see how all of this works.

The CATCHY toolbox was designed for the analysis of a corpus of song sections.

CATCHY therefore requires data to be represented as a Python dictionary of song section paths, grouped by song id.

utils.dataset_from_dir() builds such a dictionary from a folder of audio files named songid-sectionid.ext, where ext can be wav or mp3.
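A minimal sketch of what such a dictionary-building step looks like is given below. This is an illustrative reimplementation based on the naming convention just described, not the actual code of utils.dataset_from_dir(); the function name and defaults are assumptions.

```python
import os
from collections import defaultdict

def dataset_from_dir_sketch(audio_dir, extensions=('.wav', '.mp3')):
    """Group 'songid-sectionid.ext' files into {songid: [section paths]}.

    Illustrative sketch of the behaviour described above, not the
    toolbox's own utils.dataset_from_dir().
    """
    dataset = defaultdict(list)
    for fname in sorted(os.listdir(audio_dir)):
        base, ext = os.path.splitext(fname)
        if ext.lower() not in extensions or '-' not in base:
            continue
        song_id = base.split('-')[0]   # everything before the first '-'
        dataset[song_id].append(os.path.join(audio_dir, fname))
    return dict(dataset)
```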


In [1]:
audio_dir = '../Cogitch/Audio/Eurovision/'



In [6]:
euro_dict = utils.dataset_from_dir(audio_dir)

Base features

Basic feature time series can be extracted using the base_features module. The function compute_and_write() provides a convenient wrapper around most of the functionality in this module, reading audio and computing a set of basic, useful features.

The results will be written to a set of csv files in data_dir.

Currently, a directory must be created for each of the features.
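Those directories can be created up front, for example as below. The feature names listed here are assumptions about what base_features.compute_and_write() writes (they match the feature names used later in this notebook); adjust to taste.

```python
import os

def make_feature_dirs(data_dir,
                      features=('mfcc', 'hpcp', 'melody',
                                'loudness', 'sharpness', 'roughness')):
    """Create one subdirectory per base feature under data_dir.

    The default feature names are assumptions, not taken from the
    CATCHY source; pass your own tuple if yours differ.
    """
    for feature in features:
        os.makedirs(os.path.join(data_dir, feature), exist_ok=True)

# e.g. make_feature_dirs('../Cogitch/Data/Eurovision/')
```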


In [ ]:
data_dir = '../Cogitch/Data/Eurovision/'

In [7]:
# base_features.compute_and_write(audio_dir, data_dir)

Pitch Features

The pitch_features module provides code to compute, from the variable-length base features computed above, fixed-sized melody and harmony descriptors for each of the song sections.

pitch_features.compute_and_write() again provides a high-level wrapper function. The features it should compute must be provided as a dictionary of (feature_function, parameters) tuples, keyed by a feature name of your choice.

The results are again stored in a set of csv files, in directories named after the provided feature names.


In [ ]:
pitch_features.melody_dir = data_dir + 'melody/'
pitch_features.chroma_dir = data_dir + 'hpcp/'

In [8]:
features = {'pitchhist3': (pitch_features.get_pitchhist3, {}),
            'pitchhist3_int': (pitch_features.get_pitchhist3, {'intervals': True}),
            'chromahist3': (pitch_features.get_chromahist3, {}),
            'chromahist3_int': (pitch_features.get_chromahist3, {'intervals': True}),
            'harmonisation': (pitch_features.get_harmonisation, {}),
            'harmonisation_int': (pitch_features.get_harmonisation, {'intervals': True}) }

# pitch_features.compute_and_write(data_dir, features=features)

Feature Transforms

The feature_transforms module allows you to compute first- and second-order features based on any of the features above. The transforms to be applied must be passed to the compute() function using a special syntax. The syntax states a feature, a reference corpus, and an aggregation function.

From the doc string:

- feature name and aggregates are separated by dots, e.g. 'mfcc.entropy'
- feature name is first and contains no dots
- first order and second order aggregates are separated by one of 2 keywords:
    'corpus' or 'song'

Ex.:
>>> parse_features('loudness.mean.song.pdf.log')
('loudness', ['mean'], ['song', 'pdf', 'log'])

The above shows how the transform names are read. In the example:

`loudness.mean.song.pdf.log` 

computes the log of the probability density function of the distribution of the loudness feature's mean within the song (i.e., across the song's sections).
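The parsing rule quoted from the doc string can be sketched as follows. This is an illustrative reimplementation mirroring the documented behaviour of parse_features, not the toolbox's own code; the function name is hypothetical.

```python
def parse_features_sketch(name, keywords=('corpus', 'song')):
    """Split 'feature.agg1...KEYWORD.agg2...' into
    (feature, first_order_aggregates, second_order_aggregates).

    Mirrors the behaviour documented for
    feature_transforms.parse_features; illustrative sketch only.
    """
    parts = name.split('.')
    feature, rest = parts[0], parts[1:]
    for i, part in enumerate(rest):
        if part in keywords:           # keyword starts the second-order part
            return feature, rest[:i], rest[i:]
    return feature, rest, []           # no keyword: first-order only
```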

The result is returned in a Pandas dataframe.


In [ ]:
feature_transforms.data_dir = data_dir

The above tells the module where to look for base features.

Below, a set of tested first- and second-order features is computed for the full dataset.


In [ ]:
features = [
'harmonisation_int.corpus.information',
'harmonisation_int.corpus.tau',
'harmonisation_int.song.information',
'harmonisation_int.song.tau',
'harmonisation.normentropy.minlog',
'harmonisation.normentropy.minlog.corpus.pdf.rank.logit',
'harmonisation.normentropy.minlog.song.pdf.rank.logit',
'chromahist3_int.corpus.information',
'chromahist3_int.corpus.tau',
'chromahist3_int.song.information',
'chromahist3_int.song.tau',
'chromahist3.normentropy.minlog',
'chromahist3.normentropy.minlog.corpus.pdf.rank.logit',
'chromahist3.normentropy.minlog.song.pdf.rank.logit',
'loudness.mean',
'loudness.mean.corpus.pdf.rank.logit',
'loudness.mean.song.pdf.rank.logit',
'loudness.std',
'loudness.std.corpus.pdf.rank.logit',
'loudness.std.song.pdf.rank.logit',
'pitchhist3_int.corpus.information',
'pitchhist3_int.corpus.tau',
'pitchhist3_int.song.information',
'pitchhist3_int.song.tau',
'pitchhist3.normentropy.minlog',
'pitchhist3.normentropy.minlog.corpus.pdf.rank.logit',
'pitchhist3.normentropy.minlog.song.pdf.rank.logit',
'mfcc.mean.corpus.indeppdf.rank.logit',
'mfcc.mean.song.indeppdf.rank.logit',
'mfcc.totvar.log',
'mfcc.totvar.log.corpus.pdf.rank.logit',
'mfcc.totvar.log.song.pdf.rank.logit',
'melody.mean',
'melody.mean.corpus.pdf.rank.logit',
'melody.mean.song.pdf.rank.logit',
'melody.std.log',
'melody.std.log.corpus.pdf.rank.logit',
'melody.std.log.song.pdf.rank.logit',
'roughness.mean.log',
'roughness.mean.log.corpus.pdf.rank.logit',
'roughness.mean.log.song.pdf.rank.logit',
'sharpness.mean',
'sharpness.mean.corpus.pdf.rank.logit',
'sharpness.mean.song.pdf.rank.logit']

data = feature_transforms.compute(euro_dict, features)

Output

Finally, the data is written to a single csv file for use in another notebook or in R.


In [22]:
# data.hist(figsize=(28,21));
data.to_csv('euro_features.csv', index=None)
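As a quick round-trip sanity check, the csv written this way loads straight back into pandas (or into R via read.csv). The sketch below uses a tiny stand-in dataframe and a demo filename, since the real `data` depends on the audio corpus above.

```python
import pandas as pd

# Stand-in for the real `data` frame (column name chosen to match the
# transform syntax used above; the values are made up for illustration).
df = pd.DataFrame({'loudness.mean': [0.1, 0.2]},
                  index=['song1-a', 'song1-b'])
df.to_csv('euro_features_demo.csv', index=False)   # drop the index column
reloaded = pd.read_csv('euro_features_demo.csv')
```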