Introduction

The goal of this notebook is to explore a method of time-stretching an existing audio track so that its rhythmic pulses expand or contract along the Fibonacci sequence, using Euclidean rhythms as the basis for the modification. For lack of a better term, let's call it Fibonacci stretch.

Inspiration for this came initially from Vijay Iyer's article on Fibonacci numbers and musical rhythm, as well as his trio's renditions of "Mystic Brew" and "Human Nature"; we'll use the original version of the latter as an example throughout. We'll also touch upon Godfried Toussaint's work on Euclidean rhythms and Bjorklund's algorithm, both of which are intimately related to the method.

A sneak peek at the final result

This is where we'll end up:


In [ ]:
import IPython.display as ipd
ipd.Audio("../data/out_humannature_90s_stretched.mp3", rate=44100)

You can also jump to Part 6 for more audio examples.

Part 1 - Representing rhythm as symbolic data

1.1 Rhythms as arrays

The main musical element we're going to play with here is rhythm (in particular, rhythmic ratios and their similarities). The base rhythm that we're going to focus on is the tresillo ("three"-side) of the son clave pattern, which sounds like this:


In [ ]:
ipd.Audio("../data/tresillo_rhythm.mp3", rate=44100)

...and looks something like this in Western music notation:

We can convert that into a sequence of bits, with each 1 representing an onset, and 0 representing a rest (similar to the way a sequencer works). Doing so yields this:

[1 0 0 1 0 0 1 0]

...which we can conveniently store as an array in Python. Actually, this is a good time to start diving directly into code. First, let's import all the Python libraries we need:


In [ ]:
%matplotlib inline
import math  # Standard library imports

# External libraries
import IPython.display as ipd
import librosa, librosa.display
import numpy as np
import matplotlib.pyplot as plt

import pardir; pardir.pardir()  # Allow imports from parent directory
import bjorklund  # Fork of Brian House's implementation of Bjorklund's algorithm: https://github.com/brianhouse/bjorklund

import fibonaccistretch  # Functions pertaining specifically to Fibonacci stretch; much of what we'll use here

Briefly: we're using IPython.display to do audio playback, librosa for the bulk of the audio processing and manipulation (namely time-stretching), numpy to represent data, and matplotlib to plot the data.

Here's our list of bits encoding the tresillo sequence in Python (we'll use numpy arrays for consistency with later sections, where we'll deal with both audio signals and plotted visualizations):


In [ ]:
tresillo_rhythm = np.array([1, 0, 0, 1, 0, 0, 1, 0])
print(tresillo_rhythm)

Note that both the music notation and the array are symbolic representations of the rhythm; the rhythm is abstracted so that there is no information about tempo, dynamics, timbre, or other musical information. All we have is the temporal relationship between each note in the sequence (as well as the base assumption that the notes are evenly spaced).

Let's hear (and visualize) an example of how this rhythm sounds in more concrete terms:


In [ ]:
# Generate tresillo clicks
sr = 44100
tresillo_click_interval = 0.25 # in seconds
tresillo_click_times = np.array([i * tresillo_click_interval for i in range(len(tresillo_rhythm))
                                 if tresillo_rhythm[i] != 0])
tresillo_clicks = librosa.clicks(times=tresillo_click_times, click_freq=2000.0, sr=sr) # Generate clicks according to the rhythm

# Plot clicks and click times
plt.figure(figsize=(8, 2))
librosa.display.waveplot(tresillo_clicks, sr=sr)
plt.vlines(tresillo_click_times + 0.005, -1, 1, color="r") # Add tiny offset so the first line shows up
plt.xticks(np.arange(0, 1.75, 0.25))

# Render clicks as audio
ipd.Audio(tresillo_clicks, rate=sr)

1.2 Rhythmic properties

To work with rhythm in an analytical fashion, we'll need to define some properties of a given rhythmic sequence.

Let's define pulses as the number of onsets in a sequence (i.e. the number of 1s as opposed to 0s), and steps as the total number of elements in the sequence:


In [ ]:
tresillo_num_pulses = np.count_nonzero(tresillo_rhythm)
tresillo_num_steps = len(tresillo_rhythm)
print("The tresillo rhythm has {} pulses and {} steps".format(tresillo_num_pulses, tresillo_num_steps))

We can listen to the pulses and steps together:


In [ ]:
# Generate the clicks
tresillo_pulse_clicks, tresillo_step_clicks = fibonaccistretch.generate_rhythm_clicks(tresillo_rhythm, tresillo_click_interval)
tresillo_pulse_times, tresillo_step_times = fibonaccistretch.generate_rhythm_times(tresillo_rhythm, tresillo_click_interval)

# Tresillo as an array
print(tresillo_rhythm)

# Tresillo audio, plotted
plt.figure(figsize=(8, 2))
librosa.display.waveplot(tresillo_pulse_clicks + tresillo_step_clicks, sr=sr)
plt.vlines(tresillo_pulse_times + 0.005, -1, 1, color="r")
plt.vlines(tresillo_step_times + 0.005, -0.5, 0.5, color="r")

# Tresillo as audio
ipd.Audio(tresillo_pulse_clicks + tresillo_step_clicks, rate=44100)

You can follow along with the printed array and hear that every 1 corresponds to a pulse click, while every position, 1 or 0 alike, gets a step click.

In addition, let's define pulse lengths as the number of steps that each pulse lasts:


In [ ]:
tresillo_pulse_lengths = fibonaccistretch.calculate_pulse_lengths(tresillo_rhythm)
print("Tresillo pulse lengths: {}".format(tresillo_pulse_lengths))

Note that the tresillo rhythm's pulse lengths all fall along the Fibonacci sequence. This allows us to do some pretty fun things, as we'll see in a bit. But first let's take a step back.

Part 2 - Fibonacci rhythms

2.1 Fibonacci numbers

The Fibonacci sequence is the integer sequence in which each value is the sum of the two preceding values, starting from 0 and 1. We can define a function in Python that gives us the nth Fibonacci number:


In [ ]:
fibonaccistretch.fibonacci??
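
The ?? magic displays the module's source, which isn't captured in this export; as a stand-in, here's a sketch of an iterative version (assuming the convention fibonacci(0) = 0, fibonacci(1) = 1):


In [ ]:
def fibonacci_sketch(n):
    """Return the nth Fibonacci number (a sketch; assumes fib(0) = 0, fib(1) = 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fibonacci_sketch(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]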

And the first 20 numbers in the sequence are:


In [ ]:
first_twenty_fibs = np.array([fibonaccistretch.fibonacci(n) for n in range(20)])
plt.figure(figsize=(16,1))
plt.scatter(first_twenty_fibs, np.zeros(20), c="r")
plt.axis("off")
print(first_twenty_fibs)

The Fibonacci sequence is closely linked to the golden ratio in many ways, including the fact that as we go up the sequence, the ratio between successive numbers gets closer and closer to the golden ratio. (If you're interested, Vijay Iyer's article Strength in numbers: How Fibonacci taught us how to swing goes into this in more depth.)

Below is a plot of Fibonacci number ratios in red, and the golden ratio as a constant in blue. You can see how the Fibonacci ratios converge to the golden ratio:


In [ ]:
# Calculate and plot Fibonacci number ratios
phi = (1 + math.sqrt(5)) / 2 # Golden ratio; 1.61803398875...
fibs_ratios = np.array([first_twenty_fibs[i] / float(max(1, first_twenty_fibs[i-1])) for i in range(2,20)])
plt.plot(np.arange(len(fibs_ratios)), fibs_ratios, "r")

# Plot golden ratio as a constant
phis = np.empty(len(fibs_ratios))
phis.fill(phi)
plt.xticks(np.arange(len(fibs_ratios)))
plt.xlabel("Fibonacci index (denotes i for ith Fibonacci number)")
plt.ylabel("Ratio between ith and (i-1)th Fibonacci number")
plt.plot(np.arange(len(phis)), phis, "b", alpha=0.5)

We can also use the golden ratio to find the index of a Fibonacci number:


In [ ]:
fibonaccistretch.find_fibonacci_index??

fib_n = 21
fib_i = fibonaccistretch.find_fibonacci_index(fib_n)
assert(fibonaccistretch.fibonacci(fib_i) == fib_n)
print("{} is the {}th Fibonacci number".format(fib_n, fib_i))

2.2 Using Fibonacci numbers to manipulate rhythms

Recall our tresillo rhythm:


In [ ]:
plt.figure(figsize=(8, 2))
plt.vlines(tresillo_pulse_times + 0.005, -1, 1, color="r")
plt.vlines(tresillo_step_times + 0.005, -0.5, 0.5, color="r", alpha=0.5)
plt.yticks([])

print("Tresillo rhythm sequence: {}".format(tresillo_rhythm))
print("Tresillo pulse lengths: {}".format(tresillo_pulse_lengths))

We might classify it as a Fibonacci rhythm, since every one of its pulse lengths is a Fibonacci number. If we wanted to expand that rhythm along the Fibonacci sequence, what would that look like?

An intuitive (and, as it turns out, musically satisfying) method would be to take every pulse length and simply replace it with the Fibonacci number that follows it. So in our example, the 3s become 5s, and the 2 becomes 3.


In [ ]:
expanded_pulse_lengths = fibonaccistretch.fibonacci_expand_pulse_lengths(tresillo_pulse_lengths)
print("Expanded tresillo pulse lengths: {}".format(expanded_pulse_lengths))

We'll also want to be able to contract rhythms along the Fibonacci sequence (i.e. choose numbers in decreasing order instead of increasing order), as well as specify how many Fibonacci numbers away we want to end up.

We can generalize this expansion and contraction into a single function that can scale pulse lengths:


In [ ]:
# Note that `scale_amount` determines the direction and magnitude of the scaling.
# If `scale_amount` > 0, it corresponds to a rhythmic expansion.
# If `scale_amount` < 0, it corresponds to a rhythmic contraction.
# If `scale_amount` == 0, the original scale is maintained and no changes are made.

print("Tresillo pulse lengths:                 {}".format(tresillo_pulse_lengths))
print("Tresillo pulse lengths expanded by 1:   {}".format(fibonaccistretch.fibonacci_scale_pulse_lengths(tresillo_pulse_lengths, scale_amount=1)))
print("Tresillo pulse lengths expanded by 2:   {}".format(fibonaccistretch.fibonacci_scale_pulse_lengths(tresillo_pulse_lengths, scale_amount=2)))
print("Tresillo pulse lengths contracted by 1: {}".format(fibonaccistretch.fibonacci_scale_pulse_lengths(tresillo_pulse_lengths, scale_amount=-1)))

Of course, once we have these scaled pulse lengths, we'll want to be able to convert them back into rhythms, in our original array format:


In [ ]:
# Scale tresillo rhythm by a variety of factors and plot the results
for scale_factor, color in [(0, "r"), (1, "g"), (2, "b"), (-1, "y")]:
    scaled_rhythm = fibonaccistretch.fibonacci_scale_rhythm(tresillo_rhythm, scale_factor)
    scaled_pulse_indices = np.array([p_i for p_i, x in enumerate(scaled_rhythm) if x > 0])
    scaled_step_indices = np.arange(len(scaled_rhythm))

    plt.figure(figsize=(8, 1))
    if scale_factor > 0:
        plt.title("Tresillo rhythm expanded by {}: {}".format(abs(scale_factor), scaled_rhythm), loc="left")
    elif scale_factor < 0:
        plt.title("Tresillo rhythm contracted by {}: {}".format(abs(scale_factor), scaled_rhythm), loc="left")
    else:  # scale_factor == 0, so the rhythm is unaltered
        plt.title("Tresillo rhythm: {}".format(scaled_rhythm), loc="left")
    plt.vlines(scaled_pulse_indices, -1, 1, color=color)
    plt.vlines(scaled_step_indices, -0.5, 0.5, color=color, alpha=0.5)
    plt.xticks(np.arange(0, plt.xlim()[1], 1))
    plt.yticks([])
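
Converting scaled pulse lengths back into a rhythm array is just the inverse of calculating pulse lengths; a sketch of how fibonacci_scale_rhythm might do the final step:


In [ ]:
def pulse_lengths_to_rhythm_sketch(pulse_lengths):
    # Each pulse becomes a 1 followed by (length - 1) rests
    return np.concatenate([[1] + [0] * (length - 1) for length in pulse_lengths])

print(pulse_lengths_to_rhythm_sketch([5, 5, 3]))  # [1 0 0 0 0 1 0 0 0 0 1 0 0]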

This is exactly the kind of rhythmic expansion and contraction that the Vijay Iyer Trio explore in their renditions of "Mystic Brew" and "Human Nature (Trio Extension)".

Next up, let's begin working with some actual audio!

Part 3 - Mapping rhythm to audio

Part of the beauty of working with rhythms in a symbolic fashion is that once we set things up, we can apply them to any existing audio track.

To properly map the relationship between a rhythmic sequence and an audio representation of a piece of music, we'll have to do some feature extraction, that is, teasing out specific attributes of the music by analyzing the audio signal.

Our goal is to create a musically meaningful relationship between our symbolic rhythmic data and the audio track we want to manipulate.

3.1 Estimating tempo

First we'll load up our source audio file. For this example we'll work with Michael Jackson's "Human Nature", off of his 1982 album Thriller:


In [ ]:
# Load input audio file
filename = "../data/humannature_30s.mp3"
y, sr = librosa.load(filename, sr=sr)

plt.figure(figsize=(16,4))
librosa.display.waveplot(y, sr=sr)

ipd.Audio(y, rate=sr)

An important feature we want to extract from the audio is tempo, from which we can derive the time interval between beats (and hence steps). Let's estimate it using the librosa.beat.tempo method (which requires us to first compute onset strengths, i.e. estimates of where note events begin):


In [ ]:
tempo = fibonaccistretch.estimate_tempo(y, sr)
print("Tempo (calculated): {}".format(tempo))
tempo = 93.0 # Hard-coded from prior knowledge
print("Tempo (hard-coded): {}".format(tempo))

(We can see that the estimated tempo differs by roughly 1 BPM from the tempo we've hard-coded from prior knowledge. Automatic feature extraction tools and algorithms often require a fair bit of fine-tuning, so we can improve our results by supplying some user-defined parameters, especially when using the tools out of the box as we are here; `hop_length` and `tempo` are two such parameters in this case. However, the more parameters we define manually, the less flexible our overall system becomes, so it's a tradeoff between accuracy and robustness.)
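
For reference, here's a sketch of how such an estimate might be obtained (an assumption about what estimate_tempo does internally, using librosa's onset-strength-based estimator):


In [ ]:
def estimate_tempo_sketch(y, sr, hop_length=512):
    # Compute an onset strength envelope, then estimate a global tempo from it
    onset_env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop_length)
    return librosa.beat.tempo(onset_envelope=onset_env, sr=sr, hop_length=hop_length)[0]

print(estimate_tempo_sketch(y, sr))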

3.2 From tempo to beats

From the tempo we can calculate the times of every beat in the song (assuming the tempo is consistent, which in this case it is):


In [ ]:
beat_times = fibonaccistretch.calculate_beat_times(y, sr, tempo)
print("First 10 beat times (in seconds): {}".format(beat_times[:10]))

And let's listen to our extracted beats with the original audio track:


In [ ]:
# Listen to beat clicks (i.e. a metronome)
beat_clicks = librosa.clicks(times=beat_times, sr=sr, length=len(y))

# Plot waveform and beats
plt.figure(figsize=(16,4))
librosa.display.waveplot(y, sr=sr)
plt.vlines(beat_times, -0.25, 0.25, color="r")

ipd.Audio(y + beat_clicks, rate=sr)

3.3 From beats to measures

In order to map our tresillo rhythm to the audio in a musically meaningful way, we'll need to group beats into measures. From listening to the above example we can hear that every beat corresponds to a quarter note; thus, we'll set beats_per_measure to 4:


In [ ]:
beats_per_measure = 4

Using beats_per_measure we can calculate the times for the start of each measure:


In [ ]:
# Work in samples from here on
beat_samples = librosa.time_to_samples(beat_times, sr=sr)
measure_samples = fibonaccistretch.calculate_measure_samples(y, beat_samples, beats_per_measure)

print("First 10 measure samples: {}".format(measure_samples[:10]))

Note that we're working in samples now, as this is the unit that the audio data is actually stored in; when we loaded up the audio track, we essentially read in a large array of samples. The sample rate, which we defined as sr, tells us how many samples there are per second.

Thus, it's a simple matter to convert samples to times whenever we need to:


In [ ]:
measure_times = librosa.samples_to_time(measure_samples, sr=sr)
print("First 10 measure times (in seconds): {}".format(measure_times[:10], sr=sr))

We can visualize, and listen to, the measure and beat markers along with the original waveform:


In [ ]:
# Add clicks, then plot and listen
plt.figure(figsize=(16, 4))
librosa.display.waveplot(y, sr=sr)

plt.vlines(measure_times, -1, 1, color="r")
plt.vlines(beat_times, -0.5, 0.5, color="r")

measure_clicks = librosa.clicks(times=measure_times, sr=sr, click_freq=3000.0, length=len(y))
ipd.Audio(y + measure_clicks + beat_clicks, rate=sr)

3.4 Putting it all together: mapping symbolic rhythms to audio signals

With our knowledge of the song's tempo, beats, and measures, we can start bringing our symbolic rhythms into audio-land. Again, let's work with our trusty tresillo rhythm:


In [ ]:
print("Tresillo rhythm: {}\n"
      "{} pulses, {} steps".format(tresillo_rhythm, tresillo_num_pulses, tresillo_num_steps))

For this example, we want the rhythm to last an entire measure as well, so we'll set steps_per_measure to be the number of steps in the rhythm (in this case, 8):


In [ ]:
steps_per_measure = tresillo_num_steps
steps_per_measure

With these markers in place, we can now overlay the tresillo rhythm onto each measure and listen to the result:


In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(tresillo_rhythm, y, measure_samples, sr=sr)

The clicks for measures, pulses, and steps overlap at certain points. You can hear this, since each click type is at a different frequency, but it can be hard to tell visually in the figure above. We can make the overlaps more apparent by plotting each set of clicks in a different color.

In the figure below, each measure is denoted by a tall red line, each pulse by a medium green line, and each step by a short blue line.


In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(tresillo_rhythm, y, measure_samples, sr=sr,
                                           click_colors={"measure": "r",
                                                         "pulse": "g",
                                                         "step": "b"})

You can hear that the tresillo rhythm's pulses line up with the harmonic rhythm of "Human Nature"; generally, we want to pick rhythms and audio tracks that have at least some kind of musical relationship.

(We could actually try to estimate rhythmic patterns based on onsets and tempo, but that's for another time.)

Part 4 - Time-stretching audio

Now that we've put the symbolic rhythm and source audio together, we're ready to begin manipulating the audio and doing some actual stretching!

4.1 Target rhythms

First, we'll define the target rhythm that we want the audio to be mapped to:


In [ ]:
original_rhythm = tresillo_rhythm
target_rhythm = fibonaccistretch.fibonacci_scale_rhythm(original_rhythm, 1) # "Fibonacci scale" original rhythm by a factor of 1
print("Original rhythm: {}\n"
      "Target rhythm:   {}".format(original_rhythm, target_rhythm))

4.2 Pulse ratios

Given an original rhythm and target rhythm, we can compute their pulse ratios, that is, the ratio between each of their pulses:


In [ ]:
pulse_ratios = fibonaccistretch.calculate_pulse_ratios(original_rhythm, target_rhythm)
print("Pulse ratios: {}".format(pulse_ratios))

4.3 Modifying measures by time-stretching

Since we're treating our symbolic rhythms as having the duration of one measure, it makes sense to start by modifying a single measure.

Basically what we want to do is: for each pulse, get the audio chunk that maps to that pulse, and time-stretch it based on our calculated pulse ratios.

Below is an implementation of just that. It's a bit long, but that's mostly due to having to define several properties to do with rhythm and audio. The core idea, of individually stretching the pulses, remains the same:


In [ ]:
fibonaccistretch.modify_measure??
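
In sketch form (not the module's exact code, and covering only the naive "timestretch" path), the core loop might look like this:


In [ ]:
def modify_measure_sketch(measure_data, original_rhythm, target_rhythm):
    orig_lengths = fibonaccistretch.calculate_pulse_lengths(original_rhythm)
    targ_lengths = fibonaccistretch.calculate_pulse_lengths(target_rhythm)
    orig_steps = float(len(original_rhythm))
    targ_steps = float(len(target_rhythm))

    chunks = []
    pulse_start = 0.0
    for orig_len, targ_len in zip(orig_lengths, targ_lengths):
        # Slice out the samples this pulse spans in the original measure
        start = int(round(pulse_start / orig_steps * len(measure_data)))
        end = int(round((pulse_start + orig_len) / orig_steps * len(measure_data)))
        chunk = measure_data[start:end]

        # Stretch so the pulse occupies its target fraction of the measure;
        # rate > 1 shortens the chunk, rate < 1 lengthens it
        rate = (orig_len / orig_steps) / (targ_len / targ_steps)
        chunks.append(librosa.effects.time_stretch(chunk, rate))
        pulse_start += orig_len
    return np.concatenate(chunks)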

You'll notice that in the part where we choose stretch methods, there's a function called euclidean_stretch that we haven't defined yet. We'll get to it in Part 5; for now, let's keep it in the back of our minds and hear what our modification method sounds like when applied to the first measure of "Human Nature":


In [ ]:
first_measure_data = y[measure_samples[0]:measure_samples[1]]
first_measure_modified = fibonaccistretch.modify_measure(first_measure_data,
                                                         original_rhythm, target_rhythm,
                                                         stretch_method="timestretch")
ipd.Audio(first_measure_modified, rate=sr)

It doesn't sound like there's much difference between the stretched version and the original, does it?

4.4 Modifying an entire track by naively time-stretching each pulse

To get a better sense, let's apply the modification to the entire audio track:


In [ ]:
# Modify the track using naive time-stretch
y_modified, measure_samples_modified = fibonaccistretch.modify_track(y, measure_samples,
                                                    original_rhythm, target_rhythm,
                                                    stretch_method="timestretch")

plt.figure(figsize=(16,4))
librosa.display.waveplot(y_modified, sr=sr)
ipd.Audio(y_modified, rate=sr)

Listening to the whole track, the only perceptible difference is that the last two beats of each measure are slightly faster. If we look at the pulse ratios again:


In [ ]:
pulse_ratios = fibonaccistretch.calculate_pulse_ratios(original_rhythm, target_rhythm)
print(pulse_ratios)

... we can see that this makes sense, as we're time-stretching the first two pulses by the same amount, and then time-stretching the last pulse by a different amount.

(Note that while we're expanding our original rhythm along the Fibonacci sequence, this actually corresponds to a contraction when time-stretching. This is because we want to maintain the original tempo, so we're trying to fit more steps into the same timespan.)

4.5 Overlaying target rhythm clicks

We can get some more insight if we sonify the target rhythm's clicks and overlay it onto our modified track:


In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(target_rhythm, y_modified, measure_samples, sr)

This gets to the heart of the problem: when we time-stretch an entire pulse this way, we retain the original pulse's internal rhythm, essentially creating a polyrhythm in the target pulse's step (i.e. metrical) structure. Even though we're time-stretching each pulse, we don't hear a difference because everything within the pulse gets time-stretched by the same amount.

Part 5 - Euclidean stretch

Listening to the rendered track in Part 4.5, you can hear that aside from the beginning of each measure and pulse, the musical onsets in the modified track don't really line up with the target rhythm's clicks at all. Thus, without the clicks, we have no way to identify the target rhythm, even though that's what we were using as the basis of our stretch method!

So how do we remedy this?

5.1 Subdividing pulses

We dig deeper. That is, we can treat each pulse as a rhythm of its own and subdivide it accordingly; after all, each pulse comprises multiple steps.


In [ ]:
print("Original rhythm: {}\n"
      "Target rhythm:   {}".format(original_rhythm, target_rhythm))

Looking at the first pulses of the original rhythm and target rhythm, we want to turn

[1 0 0]

into

[1 0 0 0 0].

To accomplish this, we'll turn to the concept of Euclidean rhythms.

5.2 Generating Euclidean rhythms using Bjorklund's algorithm

A Euclidean rhythm is a rhythm that can be generated using the Euclidean algorithm, which computes the greatest common divisor of two numbers.


In [ ]:
fibonaccistretch.euclid??

gcd = fibonaccistretch.euclid(8, 12)
print("Greatest common divisor of 8 and 12 is {}".format(gcd))

The concept of Euclidean rhythms was first introduced by Godfried Toussaint in his 2005 paper The Euclidean Algorithm Generates Traditional Musical Rhythms.

The algorithm for generating these rhythms is actually Bjorklund's algorithm, first described by E. Bjorklund in his 2003 paper The Theory of Rep-Rate Pattern Generation in the SNS Timing System, which grew out of work on the timing system of the Spallation Neutron Source (SNS) particle accelerator. Here we use Brian House's Python implementation of Bjorklund's algorithm; you can find the source code on GitHub.

It turns out that our tresillo rhythm is an example of a Euclidean rhythm. We can generate it by plugging in the number of pulses and steps into Bjorklund's algorithm:


In [ ]:
print(np.array(bjorklund.bjorklund(pulses=3, steps=8)))

5.3 Using Euclidean rhythms to subdivide pulses

Say we want to stretch a pulse [1 0 0] so that it resembles another pulse [1 0 0 0 0]:


In [ ]:
original_pulse = np.array([1,0,0])
target_pulse = np.array([1,0,0,0,0])

We want to know how much to stretch each subdivision. To do this, we'll convert these single pulses into rhythms of their own. First, we'll treat each step in the original pulse as an onset:


In [ ]:
original_pulse_rhythm = np.ones(len(original_pulse), dtype="int")
print(original_pulse_rhythm)

And as mentioned before, we'll use Bjorklund's algorithm to generate the target pulse's rhythm. The trick here is to use the number of steps in the original pulse as the number of pulses for the target pulse rhythm (hence the conversion to onsets earlier):


In [ ]:
target_pulse_rhythm = np.array(bjorklund.bjorklund(pulses=len(original_pulse), steps=len(target_pulse)))
print(target_pulse_rhythm)

You might have noticed that this rhythm is exactly the same as the rhythm produced by contracting the tresillo rhythm along the Fibonacci sequence by a factor of 1:


In [ ]:
print(fibonaccistretch.fibonacci_scale_rhythm(tresillo_rhythm, -1))

And it's true that there is some significant overlap between Euclidean rhythms and Fibonacci rhythms. The advantage of working with Euclidean rhythms here is that they work with any number of pulses and steps, not just ones that are Fibonacci numbers.

To summarize:


In [ ]:
print("In order to stretch pulse-to-pulse {} --> {}\n"
      "we subdivide and stretch rhythms   {} --> {}".format(original_pulse, target_pulse,
                                                            original_pulse_rhythm, target_pulse_rhythm))

The resulting pulse ratios are:


In [ ]:
print(fibonaccistretch.calculate_pulse_ratios(original_pulse_rhythm, target_pulse_rhythm))

... which doesn't intuitively look like it would produce something any different from what we tried before. However, we might perceive a greater difference because:

a) we're working on a more granular temporal level (subdivisions of pulses as opposed to measures), and

b) we're adjusting an equally-spaced rhythm (e.g. [1 1 1]) to one that's not necessarily equally spaced (e.g. [1 0 1 0 1]).

5.4 The Euclidean stretch algorithm

With all this in mind, we can now implement Euclidean stretch:


In [ ]:
fibonaccistretch.euclidean_stretch??
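
In sketch form (reusing the hypothetical modify_measure_sketch from Part 4.3, so again an approximation rather than the module's exact code), stretching a single pulse works like this:


In [ ]:
def euclidean_stretch_sketch(pulse_data, original_pulse_length, target_pulse_length):
    # Treat every step of the original pulse as an onset...
    original_subrhythm = np.ones(original_pulse_length, dtype=int)
    # ...generate the target pulse's internal rhythm with Bjorklund's algorithm...
    target_subrhythm = np.array(bjorklund.bjorklund(pulses=original_pulse_length,
                                                    steps=target_pulse_length))
    # ...then stretch subdivision-by-subdivision, exactly as we stretched
    # pulse-by-pulse at the measure level
    return modify_measure_sketch(pulse_data, original_subrhythm, target_subrhythm)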

Let's take a listen to how it sounds:


In [ ]:
# Modify the track
y_modified, measure_samples_modified = fibonaccistretch.modify_track(y, measure_samples,
                                                                     original_rhythm, target_rhythm,
                                                                     stretch_method="euclidean")

plt.figure(figsize=(16,4))
librosa.display.waveplot(y_modified, sr=sr)
ipd.Audio(y_modified, rate=sr)

Much better! With clicks:


In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(target_rhythm, y_modified, measure_samples, sr)

As you can hear, the modified track's rhythm is in line with the clicks, and sounds noticeably different from the original song. This is a pretty good place to end up!

Part 6 - Fibonacci stretch: implementation and examples

6.1 Implementation

Here's an end-to-end implementation of Fibonacci stretch. A lot of the default parameters have been set to the ones we've been using in this notebook, although of course you can pass in your own:


In [ ]:
fibonaccistretch.fibonacci_stretch_track??
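
Roughly speaking, it ties together everything from Parts 3 through 5. A sketch of the chain, using the module functions we've already met (the real function's signature and defaults may differ):


In [ ]:
def fibonacci_stretch_track_sketch(filename, tempo, stretch_factor=1,
                                   beats_per_measure=4,
                                   original_rhythm=np.array([1,0,0,1,0,0,1,0]),
                                   target_rhythm=None):
    y, sr = librosa.load(filename, sr=44100)
    if target_rhythm is None:
        # Default target: the original rhythm, Fibonacci-scaled
        target_rhythm = fibonaccistretch.fibonacci_scale_rhythm(original_rhythm,
                                                                stretch_factor)
    beat_times = fibonaccistretch.calculate_beat_times(y, sr, tempo)
    beat_samples = librosa.time_to_samples(beat_times, sr=sr)
    measure_samples = fibonaccistretch.calculate_measure_samples(y, beat_samples,
                                                                 beats_per_measure)
    y_modified, _ = fibonaccistretch.modify_track(y, measure_samples,
                                                  original_rhythm, target_rhythm,
                                                  stretch_method="euclidean")
    return ipd.Audio(y_modified, rate=sr)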

Now we can simply feed the function a path to an audio file (as well as any parameters we want to customize).

This is the exact method that's applied to the sneak peek at the final result up top. The only difference is that we use a 90-second excerpt rather than our original 30-second one:


In [ ]:
# "Human Nature" stretched by a factor of 1 using default parameters
fibonaccistretch.fibonacci_stretch_track("../data/humannature_90s.mp3",
                        stretch_factor=1,
                        tempo=93.0)

And indeed we get the exact same result.

6.2 Examples: customizing stretch factors

Now that we have a function to easily stretch tracks, we can begin playing around with some of the parameters.

Here's the 30-second "Human Nature" excerpt again, only this time it's stretched by a factor of 2 instead of 1:


In [ ]:
# "Human Nature" stretched by a factor of 2
fibonaccistretch.fibonacci_stretch_track("../data/humannature_30s.mp3",
                        tempo=93.0,
                        stretch_factor=2,
                        overlay_clicks=True)

As mentioned in part 2.2, we can contract rhythms as well using negative numbers as our stretch_factor. Let's try that with "Chan Chan" by the Buena Vista Social Club:


In [ ]:
# "Chan Chan" stretched by a factor of -1
fibonaccistretch.fibonacci_stretch_track("../data/chanchan_30s.mp3",
                        stretch_factor=-1,
                        tempo=78.5)

(Note that although we do end up with a perceptible difference (the song now sounds like it's in 7/8), it should actually sound like it's in 5/8, since [1 0 0 1 0 0 1 0] is getting compressed to [1 0 1 0 1]. This is a quirk in the Euclidean stretch implementation that still needs fixing.)

6.3 Examples: customizing original and target rhythms

In order to get musically meaningful results we generally want to supply parameters that make musical sense with our input audio (although it can certainly be interesting to try with parameters that don't!). One of the parameters that makes the most difference in results is the rhythm sequence used to represent each measure.

Here's Chance the Rapper's verse from DJ Khaled's "I'm the One", with a custom original_rhythm that matches the bassline of the song:


In [ ]:
# "I'm the One" stretched by a factor of 1
fibonaccistretch.fibonacci_stretch_track("../data/imtheone_cropped_chance_60s.mp3",
                        tempo=162,
                        original_rhythm=np.array([1,0,0,0,0,1,0,0]),
                        stretch_factor=1)

We can define a custom target rhythm as well. In addition, neither original_rhythm nor target_rhythm has to be a Fibonacci rhythm for the stretch algorithm to work (although in this implementation they do both need the same number of pulses).

Let's try that out with the same verse, going from an original rhythm with 8 steps (i.e. in 4/4 meter) to a target rhythm with 10 steps (i.e. in 5/4 meter):


In [ ]:
# "I'm the One" in 5/4
fibonaccistretch.fibonacci_stretch_track("../data/imtheone_cropped_chance_60s.mp3",
                        tempo=162,
                        original_rhythm=np.array([1,0,0,0,0,1,0,0]),
                        target_rhythm=np.array([1,0,0,0,0,1,0,0,0,0]),
                        overlay_clicks=True)

As another example, we can give a swing feel to the first movement of Mozart's "Eine kleine Nachtmusik" (K. 525), as performed by A Far Cry:


In [ ]:
# "Eine kleine Nachtmusik" with a swing feel
fibonaccistretch.fibonacci_stretch_track("../data/einekleinenachtmusik_30s.mp3",
                        tempo=130,
                        original_rhythm=np.array([1,0,1,1]),
                        target_rhythm=np.array([1,0,0,1,0,1]))

It works pretty decently until around 0:09, at which point the assumption of a metronomically consistent tempo breaks down. (This is one of the biggest weaknesses with the current implementation, and is something I definitely hope to work on in the future.)

Let's also hear what "Chan Chan" sounds like in 5/4:


In [ ]:
# "Chan Chan" in 5/4
fibonaccistretch.fibonacci_stretch_track("../data/chanchan_30s.mp3",
                        tempo=78.5,
                        original_rhythm=np.array([1,0,0,1,0,0,0,0]),
                        target_rhythm=np.array([1,0,0,0,0,1,0,0,0,0])) # Also interesting to try with [1,0,1]

6.4 Examples: customizing input beats per measure

We can also work with source audio in other meters. For example, Frank Ocean's "Pink + White" is in 6/8. Here I've stretched it into 4/4 using the rhythm of the bassline, but you can uncomment the other supplied parameters (or supply your own!) to hear how they sound as well:


In [ ]:
# "Pink + White" stretched by a factor of 1
fibonaccistretch.fibonacci_stretch_track("../data/pinkandwhite_30s.mp3",
                        beats_per_measure=6,
                        tempo=160,
                        
                        # 6/8 to 4/4 using bassline rhythm
                        original_rhythm=np.array([1,1,1,1,0,0]),
                        target_rhythm=np.array([1,1,1,0,1,0,0,0]),
                        
                        # 6/8 to 4/4 using half notes
                        # original_rhythm=np.array([1,0,0,1,0,0]),
                        # target_rhythm=np.array([1,0,0,0,1,0,0,0]),
                        
                        # 6/8 to 10/8 (5/4) using Fibonacci stretch factor of 1
                        # original_rhythm=np.array([1,0,0,1,0,0]),
                        # stretch_factor=1,
                        
                        overlay_clicks=True)

Part 7 - Final thoughts

This notebook started out as an answer to the question "What if we applied rhythmic expansion methods, based on the Fibonacci sequence, to actual audio tracks?"

It quickly grew into more than that, and we now have a working implementation of what I've dubbed Fibonacci stretch. Along the way I've come to a few conclusions:

7.1 Fibonacci stretch as a creative tool

I think there's certainly a case to be made for Fibonacci stretch as an interesting and useful means of musical transformation; it's rooted in mathematical processes that have been shown to produce interesting artistic output, and Part 6 shows that Fibonacci stretch itself can produce musically interesting results.

However, it has its limits as well, the main one being that the Fibonacci sequence grows exponentially (with the base of the exponential equal to the golden ratio). This means that above a certain stretch_factor value, Fibonacci stretch starts to feel somewhat impractical. For example, stretching [1 0 0 1 0 0 1 0] by a factor of 6 gives us a target rhythm with 144 steps, which isn't something we can easily perceive when crammed into the space of our original 8 steps.

One solution to this imperceptibility is to allow for the length of the modified measure to change as well. For example, a 2-second measure with 8 steps (4 steps/second) could be stretched into a 24-second measure with 144 steps (6 steps/second). The longer time span might yield more interesting results, but it might also obscure the relationship between the modified track and the original track.

7.2 Rhythm perception

I gleaned a lot of insight from experimenting with different parameters. As shown in Part 2.1, the greater the stretch_factor, the closer the pulse-length ratios get to the golden ratio; with regards to rhythm perception, I found that this made the resulting rhythm sound more and more "natural", in an almost uncanny-valley manner. This relates back to the limitations of Fibonacci stretch as a creative tool as well. Perhaps it would be worth examining the space of results that lie between stretch factors of 1 and 3, as that seems to be where the most musically interesting rhythmic shifts occur.

Similarly, plugging in different patterns for the original_rhythm and target_rhythm parameters yielded differing results, with some pairs seeming more closely related than others. It's possible that there are underlying rhythmic principles that more clearly explain how the relationship between the original and target rhythms affects how we perceive the stretched result.

In addition, the initially disappointing results of using naive time-stretch also indicate the importance of considering different perceptual levels of rhythm. We explored a solution using subdivision of pulses, but haven't dealt with perceptual levels of rhythm in a rigorous manner at all.

7.3 Implementation improvements

The current implementation of Fibonacci stretch works well enough, but it also leaves a lot to be desired.

One of the biggest issues is the handling of tempo. Firstly, the built-in tempo detection almost never gives us the correct value, which is why we've had to pass in tempo values manually for the examples. Looking into alternate tempo detection methods could mitigate this. More importantly, the current implementation doesn't allow for variable tempo, which is a real problem with tracks that weren't recorded to a metronome. Simply using a dynamic tempo estimate (which librosa.beat.tempo is capable of) would go a long way toward improving the quality of the modified tracks.

This implementation of Fibonacci stretch also doesn't work too well when stretch_factor < 0. As with the "Chan Chan" example in Part 6.2, we don't get the results we expect. It might just be a quirk in the step length conversion in euclidean_stretch(), but it's also possible that an additional level of subdivision is needed.

We might also want to explore using onset detection to improve the actual time-stretching process. Instead of choosing stretch regions purely based on where they fall in relation to the symbolic rhythm parameters, we could define regions in accordance with the sample indices of detected onsets as well, which could yield more natural-sounding output.

7.4 Future directions

There are a ton of avenues to explore further, some of which we've touched upon. For example, some of the most interesting results came from stretching a track so that the performance was converted from one meter to another (e.g. 6/8 to 5/4). Sometimes this occurred as a happy coincidence of Fibonacci stretch, but most of the time we had to pass in custom original_rhythm and target_rhythm parameters. I think it would be worthwhile to explore a version of the stretch method that could convert meter A to meter B without explicitly defining the rhythm patterns.

On a related note, exploring the possibilities of Euclidean rhythms, outside of their relationship to Fibonacci rhythms, could be worthwhile as well. In addition, we could allow each measure to be stretched using different parameters by passing in a list of arguments, each corresponding to a measure or group of measures. This would allow for more flexibility and creative freedom when using Fibonacci stretch.

Finally, I could see this stretch method being used in the context of a digital audio workstation (DAW), modifying both a project's symbolic meter data and a track's raw audio signal. If embedded directly into a DAW, this would open up possibilities for rapid rhythmic experimentation in a music production setting.

All in all, I hope this notebook presents an insightful exploration of how the Fibonacci sequence can be applied to sample-level audio manipulation of rhythmic relationships!