Example: use BarBitURythme to encode drum rhythms

This example illustrates how BarBitURythme can be used to

  • encode rhythm sequences of a signal.

That encoding will later be used to train BarBitURythme statistical models and generate sequences of similar style.

We set up the Python machinery that will be used. Since barbiturythme is not on Python's module search path (sys.path), we add its location.


In [1]:
import os, sys
# barbiturythme lives next to the current working directory, one level up
sys.path.append( os.path.join(os.path.dirname(os.getcwd()), 'barbiturythme') )

In [2]:
import numpy as np
import subprocess
import barbiturythme
from barbiturythme import bbur_gene

We choose the location and name of the data folder and files, and create a bbur_gene.bbur_io instance to read and write data files.

  • io.data_folder_name is the name of a folder in ../data
  • io.data_file_name is the name of the files in ../data/data_folder_name
  • io.n_freq_int is the number of percussive instruments encoded in those files, e.g. Roland-808 kick drum, snare and hihat

In [3]:
io = bbur_gene.bbur_io()
io.data_folder_name = '4bars_4beats_4subs_3bands'
io.data_file_name = 'hh_00' 
io.n_freq_int = 3

Then the class barbiturythme.BarBit will be used to read .wav files and encode (some of) their content into machine-readable data.


In [4]:
bb = barbiturythme.BarBit()
bb.read( '/home/freinque/tracks/atlanta_trap.wav' )


Warning: Here, depending on which program and options generated your .wav file, you may get the 'chunk not understood' warning. It doesn't seem to alter the signal reading, though.
/usr/lib/python2.7/dist-packages/scipy/io/wavfile.py:121: WavFileWarning: chunk not understood
  warnings.warn("chunk not understood", WavFileWarning)

We set up the parameters of the standard spectral analysis procedure. Roughly speaking, we will

  • scan the signal through a 'magnifying glass' of (time) width bb.window_size, moving it in steps of (time) length bb.hop_size,
  • look into the 'magnifying glass' with filters that only let the frequencies in the intervals bb.freq_bounds pass, and
  • calculate the increase of intensity in each of them from one step to the next with bb.calc_odf().

Here, we use the frequency interval [20,120] (in Hz) to get the bass drum (this seems standard), [4000,6000] to catch the Roland-808 snare and [13000,14000] to see the hihat. If you don't know the main frequencies of a percussive instrument, I suggest a Fourier transform (over a tight window containing a hit of the instrument) using the Analyse > Plot Spectrum tool in Audacity.
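To make the three steps above concrete, here is a minimal sketch of a band-limited onset detection function written with plain numpy. It is not BarBit's implementation: the function name band_flux_odf, the Hann window and the use of summed spectral magnitudes are all assumptions made for illustration, and the smoothing controlled by bb.smooth_windows is not modelled.

```python
import numpy as np

def band_flux_odf(signal, sampling_rate, window_size, hop_size, freq_bounds):
    """Per-band increase of spectral magnitude from one hop to the next.

    Hypothetical helper, for illustration only (not part of barbiturythme).
    """
    window = np.hanning(window_size)
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sampling_rate)
    n_frames = 1 + (len(signal) - window_size) // hop_size
    band_magnitude = np.zeros((len(freq_bounds), n_frames))
    for t in range(n_frames):
        # the 'magnifying glass': one windowed slice of the signal
        frame = signal[t * hop_size : t * hop_size + window_size] * window
        magnitude = np.abs(np.fft.rfft(frame))
        for b, (low, high) in enumerate(freq_bounds):
            # the 'filter': keep only the bins inside [low, high] Hz
            in_band = (freqs >= low) & (freqs <= high)
            band_magnitude[b, t] = magnitude[in_band].sum()
    # keep increases only; decreases are set to zero
    return np.maximum(np.diff(band_magnitude, axis=1), 0.0)

# Toy input: a one-second signal, silent until a 60 Hz tone starts at sample 8192.
sampling_rate = 44100
n = np.arange(sampling_rate)
onset_sample = 8192
tone = np.sin(2 * np.pi * 60 * (n - onset_sample) / float(sampling_rate))
signal = np.where(n >= onset_sample, tone, 0.0)

odf = band_flux_odf(signal, sampling_rate, 2048, 512, [[20, 120], [4000, 6000]])
```

On this toy signal the first band (bass range) reacts strongly when the tone starts, while the second band stays close to zero.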


In [5]:
bb.window_size = 1024*2
bb.hop_size = 512
bb.set_freq_bounds([[20,120],[4000,6000],[13000,14000]])
bb.smooth_windows = [1,10,1]
start_time = 10
duration = 15
bb.calc_odf( start_time, start_time + duration )


Warning: hop size (512) is not a factor of signal size (661500)

We plot the intensity increase in each frequency interval.


In [6]:
bb.plot_odf()


We adjust the levels at which we consider that each instrument is played and use bb.calc_onsets() to determine at which steps each instrument is played.


In [7]:
bb.min_thresholds = [0.3,0.5,0.2]
bb.median_windows = [5,9,13]
bb.peak_sizes = [0.05,1,0.02]
bb.calc_onsets()
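As a rough illustration of thresholded peak picking, the sketch below marks a step as an onset when the detection function has a local maximum there that exceeds both a fixed minimum and the median of the surrounding steps. This is only a guess at the role of bb.min_thresholds and bb.median_windows; the function pick_onsets is hypothetical, bb.peak_sizes is not modelled, and bb.calc_onsets() may well work differently.

```python
import numpy as np

def pick_onsets(odf, min_threshold, median_window):
    """Indices of local maxima above an adaptive threshold (illustrative only)."""
    odf = np.asarray(odf, dtype=float)
    half = median_window // 2
    onsets = []
    for t in range(1, len(odf) - 1):
        start, stop = max(0, t - half), min(len(odf), t + half + 1)
        # threshold: the larger of the fixed minimum and the local median
        threshold = max(min_threshold, np.median(odf[start:stop]))
        is_peak = odf[t] >= odf[t - 1] and odf[t] > odf[t + 1]
        if is_peak and odf[t] > threshold:
            onsets.append(t)
    return onsets

toy_odf = [0.0, 0.1, 0.9, 0.1, 0.0, 0.2, 0.25, 0.2, 0.0, 0.8, 0.1]
toy_onsets = pick_onsets(toy_odf, min_threshold=0.3, median_window=5)
print(toy_onsets)  # [2, 9]
```

The small bump around index 6 is a local maximum but stays below the minimum threshold of 0.3, so it is not reported.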

We plot the results. If they don't make us happy (i.e. what we hear and what we see do not coincide), we can re-execute the above cells with different parameters.


In [8]:
bb.plot_onsets()


At this point, our BarBit doesn't know the tempo of our track, so it can't really find its bars (i.e. measures) or multi-bar structures (e.g. choruses, etc.). If you know the tempo for sure, just set bb.bpm to it.

Otherwise, bbur_gene.bpm_finder() provides a decent (although very simple) way of estimating it. Here, we use the onsets (i.e. the moments when the instrument is hit) of the snare and multiply the result by 2, since the snare is played every two beats.


In [9]:
bpmf = bbur_gene.bpm_finder()
bpmf.gcd = bb.hop_size
bpmf.sampling_rate = bb.sampling_rate
bpmf.set_onsets( bb.onsets[1] )

In [10]:
bpmf.calc_bpms()


Highest values at lags:  [[29.000000000000004, 1.3351473922902495], [29.000000000000004, 1.3235374149659864], [29.000000000000004, 1.3119274376417234], [29.000000000000004, 1.3003174603174603], [29.000000000000004, 1.2887074829931973], [29.000000000000004, 1.2770975056689342], [29.000000000000004, 1.2654875283446712], [28.600000000000001, 1.3467573696145125], [25.700000000000003, 2.5541950113378684], [25.400000000000002, 1.253877551020408], [24.700000000000003, 2.5658049886621317], [22.600000000000001, 2.6354648526077096], [22.600000000000001, 2.6238548752834467], [22.600000000000001, 2.6122448979591835], [22.600000000000001, 2.6006349206349206], [22.600000000000001, 2.589024943310658], [22.600000000000001, 2.5774149659863945], [21.500000000000004, 2.647074829931973], [18.5, 1.3583673469387756], [16.5, 2.5425850340136056]]
corresponding to bpm values of:  [44.93885869565217, 45.333059210526315, 45.73423672566371, 46.142578125, 46.558277027027025, 46.98153409090909, 47.41255733944954, 44.55145474137931, 23.490767045454547, 47.8515625, 23.384473981900452, 22.76638215859031, 22.867118362831857, 22.968750000000004, 23.0712890625, 23.17474775784753, 23.279138513513512, 22.666529605263158, 44.17067307692307, 23.598030821917806]

In [11]:
bb.bpm = 2.*bpmf.bpms[3]
bb.bpm


Out[11]:
92.28515625
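The numbers printed above are consistent with the relation bpm = 60 / lag, with the lag expressed in seconds. The sketch below reproduces the selected value under two assumptions inferred from the outputs rather than read from the library: a sampling rate of 44100 Hz (661500 samples for 15 seconds) and a lag of 112 hops of 512 samples (about 1.3003 seconds). The helper lag_to_bpm is hypothetical.

```python
def lag_to_bpm(lag_in_hops, hop_size, sampling_rate):
    """Tempo of an event that repeats every `lag_in_hops` hops (illustrative only)."""
    # lag in seconds is lag_in_hops * hop_size / sampling_rate; bpm is 60 / lag
    return 60.0 * sampling_rate / (lag_in_hops * hop_size)

snare_bpm = lag_to_bpm(112, 512, 44100)  # assumed values, see above
print(snare_bpm)        # 46.142578125
print(2.0 * snare_bpm)  # 92.28515625
```

The snare repeats at about 46 bpm, and doubling that gives the tempo of the track.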

We set the parameters of the bar encoding.

  • bb.beats_per_bar is the number of beats in each bar (the upper number of the track's time signature),
  • bb.n is the length of the sequences (in bars) that we want to encode,
  • bb.subs_per_beat is the time resolution that we want to use in the encoding (i.e. number of subdivisions per beat).

In [12]:
bb.subs_per_beat = 4
bb.beats_per_bar = 4
bb.n = 4

bb.encode_onsets() then does what its name says.


In [13]:
bb.encode_onsets()

The result is a collection of 4bars (sequences made of 4 concatenated bars) in integer (bb.n_bars_int) and binary (bb.n_bars_bin) forms. For instance:


In [14]:
bb.n_bars_bin[0][0]


Out[14]:
array([[0, 0, 0],
       [0, 0, 0],
       [1, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 1, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [1, 1, 1],
       [0, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 1, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [1, 1, 1],
       [0, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 1, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [1, 1, 1],
       [0, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 1, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [1, 1, 1],
       [0, 0, 1]])

where the third element being [1,0,1] means that at the third time subdivision, the first and third instruments (kick and hihat) are played, the fourth element being [0,0,1] means that at the fourth time subdivision, the third instrument (hihat) is played, etc. The integer form looks like


In [15]:
bb.n_bars_int[0][0]


Out[15]:
array([0, 0, 5, 4, 4, 4, 6, 4, 4, 4, 4, 4, 4, 4, 7, 4, 5, 5, 5, 4, 4, 4, 6,
       4, 4, 4, 4, 4, 4, 4, 7, 4, 5, 5, 5, 4, 4, 4, 6, 4, 4, 4, 4, 4, 4, 4,
       7, 4, 5, 5, 5, 4, 4, 4, 6, 4, 4, 4, 4, 4, 4, 4, 7, 4])

where 5 stands for 1*2**0 + 0*2**1 + 1*2**2, etc.
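The correspondence between the two forms can be checked with a few lines of numpy. The helpers bits_to_int and int_to_bits are hypothetical and only mirror the arithmetic above; they are not part of barbiturythme.

```python
import numpy as np

def bits_to_int(rows):
    """Turn rows of 0/1 flags (one per instrument) into one integer per row."""
    rows = np.asarray(rows)
    weights = 2 ** np.arange(rows.shape[-1])  # 1, 2, 4, ...
    return rows.dot(weights)

def int_to_bits(codes, n_instruments):
    """Inverse of bits_to_int: recover the 0/1 flags of each instrument."""
    codes = np.asarray(codes)
    return (codes[..., None] >> np.arange(n_instruments)) & 1

example_rows = [[0, 0, 0], [1, 0, 1], [0, 0, 1], [0, 1, 1], [1, 1, 1]]
example_codes = bits_to_int(example_rows)
print(example_codes)  # [0 5 4 6 7]
```

With three instruments the codes range from 0 (silence) to 7 (kick, snare and hihat together).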

bb doesn't know where exactly to set the 'time 0' moment. We help a bit, but this could be mostly automated. subdiv_shift is our preferred time shift.


In [16]:
subdiv_shift = int(( bb.sampling_rate*60.0/(bb.bpm*bb.hop_size) ) / bb.subs_per_beat) #time shift, in subdivs
observations = np.array( bb.n_bars_int[57*subdiv_shift][:] )
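For reference, the expression for subdiv_shift can be evaluated by hand. The sketch below assumes a sampling rate of 44100 Hz (inferred from the 661500 samples read for 15 seconds, not taken from bb) and the bpm found above. Dimensionally, the expression gives the number of hops that fit in one time subdivision.

```python
sampling_rate = 44100   # assumed, inferred from the hop-size warning above
hop_size = 512
bpm = 92.28515625
subs_per_beat = 4

# hops per beat = (samples per beat) / (samples per hop)
hops_per_beat = sampling_rate * 60.0 / (bpm * hop_size)
# hops per subdivision, truncated to an integer as in the cell above
hops_per_subdiv = int(hops_per_beat / subs_per_beat)

print(hops_per_beat)    # 56.0
print(hops_per_subdiv)  # 14
```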

In [17]:
print(observations)


[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 4 4 4 6]
 [4 4 4 4 4 4 4 7 4 5 5 5 4 4 4 6 4 4 4 4 4 4 4 7 4 5 5 5 4 4 4 6 4 4 4 4 4
  4 4 7 4 5 5 5 4 4 4 6 4 4 4 4 4 4 4 7 4 5 5 5 4 4 4 6]
 [4 4 4 4 4 4 4 7 4 5 5 5 4 4 4 6 4 4 4 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

And we add the desired 4bars to the data file.


In [18]:
io.append_data( [observations[1]] )
