by David Su
This notebook and its associated code are also available on GitHub.
The goal of this notebook is to investigate and explore a method of time-stretching an existing audio track such that its rhythmic pulses become expanded or contracted along the Fibonacci sequence, using Euclidean rhythms as the basis for modification. For lack of a better term, let's call it Fibonacci stretch.
Inspiration for this came initially from Vijay Iyer's article on Fibonacci numbers and musical rhythm as well as his trio's renditions of "Mystic Brew" and "Human Nature"; we'll use the original version of the latter as an example throughout. We'll also touch upon Godfried Toussaint's work on Euclidean rhythms and Bjorklund's algorithm, both of which are intimately related.
In [ ]:
import IPython.display as ipd
ipd.Audio("../data/out_humannature_90s_stretched.mp3", rate=44100)
You can also jump to Part 6 for more audio examples.
In [ ]:
ipd.Audio("../data/tresillo_rhythm.mp3", rate=44100)
...and looks something like this in Western music notation:
We can convert that into a sequence of bits, with each 1 representing an onset and each 0 representing a rest (similar to the way a sequencer works). Doing so yields this:
[1 0 0 1 0 0 1 0]
...which we can conveniently store as a list in Python. Actually, this is a good time to start diving directly into code. First, let's import all the Python libraries we need:
In [ ]:
%matplotlib inline
# Standard library imports
import math

# External libraries
import IPython.display as ipd
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt
import pardir; pardir.pardir() # Allow imports from parent directory
import bjorklund # Fork of Brian House's implementation of Bjorklund's algorithm https://github.com/brianhouse/bjorklund
import fibonaccistretch # Functions pertaining specifically to Fibonacci stretch; much of what we'll use here
Briefly: we're using IPython.display to do audio playback, librosa for the bulk of the audio processing and manipulation (namely time-stretching), numpy to represent data, and matplotlib to plot the data.
Here's our list of bits encoding the tresillo sequence in Python (we'll use numpy arrays for consistency with later sections, where we deal with both audio signals and plotted visualizations):
In [ ]:
tresillo_rhythm = np.array([1, 0, 0, 1, 0, 0, 1, 0])
print(tresillo_rhythm)
Note that both the music notation and the array are symbolic representations of the rhythm; the rhythm is abstracted so that there is no information about tempo, dynamics, timbre, or any other musical attribute. All we have is the temporal relationship between each note in the sequence (as well as the base assumption that the notes are evenly spaced).
Let's hear (and visualize) an example of how this rhythm sounds in more concrete terms:
In [ ]:
# Generate tresillo clicks
sr = 44100
tresillo_click_interval = 0.25 # in seconds
tresillo_click_times = np.array([i * tresillo_click_interval for i in range(len(tresillo_rhythm))
if tresillo_rhythm[i] != 0])
tresillo_clicks = librosa.clicks(times=tresillo_click_times, click_freq=2000.0, sr=sr) # Generate clicks according to the rhythm
# Plot clicks and click times
plt.figure(figsize=(8, 2))
librosa.display.waveplot(tresillo_clicks, sr=sr)
plt.vlines(tresillo_click_times + 0.005, -1, 1, color="r") # Add tiny offset so the first line shows up
plt.xticks(np.arange(0, 1.75, 0.25))
# Render clicks as audio
ipd.Audio(tresillo_clicks, rate=sr)
In [ ]:
tresillo_num_pulses = np.count_nonzero(tresillo_rhythm)
tresillo_num_steps = len(tresillo_rhythm)
print("The tresillo rhythm has {} pulses and {} steps".format(tresillo_num_pulses, tresillo_num_steps))
We can listen to the pulses and steps together:
In [ ]:
# Generate the clicks
tresillo_pulse_clicks, tresillo_step_clicks = fibonaccistretch.generate_rhythm_clicks(tresillo_rhythm, tresillo_click_interval)
tresillo_pulse_times, tresillo_step_times = fibonaccistretch.generate_rhythm_times(tresillo_rhythm, tresillo_click_interval)
# Tresillo as an array
print(tresillo_rhythm)
# Tresillo audio, plotted
plt.figure(figsize=(8, 2))
librosa.display.waveplot(tresillo_pulse_clicks + tresillo_step_clicks, sr=sr)
plt.vlines(tresillo_pulse_times + 0.005, -1, 1, color="r")
plt.vlines(tresillo_step_times + 0.005, -0.5, 0.5, color="r")
# Tresillo as audio
ipd.Audio(tresillo_pulse_clicks + tresillo_step_clicks, rate=44100)
You can follow along with the printed array and hear that every 1 corresponds to a pulse, and every 0 to a step.
In addition, let's define pulse lengths as the number of steps that each pulse lasts:
In [ ]:
tresillo_pulse_lengths = fibonaccistretch.calculate_pulse_lengths(tresillo_rhythm)
print("Tresillo pulse lengths: {}".format(tresillo_pulse_lengths))
Note that the tresillo rhythm's pulse lengths all fall along the Fibonacci sequence. This allows us to do some pretty fun things, as we'll see in a bit. But first let's take a step back.
In [ ]:
fibonaccistretch.fibonacci??
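(If you're reading this outside the notebook and can't see the displayed source, a minimal version of this helper might look something like the sketch below; it's an assumption about the implementation, using the convention fibonacci(0) == 0 and fibonacci(1) == 1.)
In [ ]:
def fibonacci_sketch(n):
    # Iteratively compute the nth Fibonacci number, starting from 0 and 1
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a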
And the first 20 numbers in the sequence are:
In [ ]:
first_twenty_fibs = np.array([fibonaccistretch.fibonacci(n) for n in range(20)])
plt.figure(figsize=(16,1))
plt.scatter(first_twenty_fibs, np.zeros(20), c="r")
plt.axis("off")
print(first_twenty_fibs)
The Fibonacci sequence is closely linked to the golden ratio in many ways, including the fact that as we go up the sequence, the ratio between successive numbers gets closer and closer to the golden ratio. (If you're interested, Vijay Iyer's article Strength in numbers: How Fibonacci taught us how to swing goes into this in more depth.)
Below is a plot of Fibonacci number ratios in red, and the golden ratio as a constant in blue. You can see how the Fibonacci ratios converge to the golden ratio:
In [ ]:
# Calculate and plot Fibonacci number ratios
phi = (1 + math.sqrt(5)) / 2 # Golden ratio; 1.61803398875...
fibs_ratios = np.array([first_twenty_fibs[i] / float(max(1, first_twenty_fibs[i-1])) for i in range(2,20)])
plt.plot(np.arange(len(fibs_ratios)), fibs_ratios, "r")
# Plot golden ratio as a constant
phis = np.empty(len(fibs_ratios))
phis.fill(phi)
plt.xticks(np.arange(len(fibs_ratios)))
plt.xlabel("Fibonacci index (denotes i for ith Fibonacci number)")
plt.ylabel("Ratio between ith and (i-1)th Fibonacci number")
plt.plot(np.arange(len(phis)), phis, "b", alpha=0.5)
We can also use the golden ratio to find the index of a Fibonacci number:
In [ ]:
fibonaccistretch.find_fibonacci_index??
fib_n = 21
fib_i = fibonaccistretch.find_fibonacci_index(fib_n)
assert(fibonaccistretch.fibonacci(fib_i) == fib_n)
print("{} is the {}th Fibonacci number".format(fib_n, fib_i))
Recall our tresillo rhythm:
In [ ]:
plt.figure(figsize=(8, 2))
plt.vlines(tresillo_pulse_times + 0.005, -1, 1, color="r")
plt.vlines(tresillo_step_times + 0.005, -0.5, 0.5, color="r", alpha=0.5)
plt.yticks([])
print("Tresillo rhythm sequence: {}".format(tresillo_rhythm))
print("Tresillo pulse lengths: {}".format(tresillo_pulse_lengths))
We might classify it as a Fibonacci rhythm, since every one of its pulse lengths is a Fibonacci number. If we wanted to expand that rhythm along the Fibonacci sequence, what would that look like?
An intuitive (and, as it turns out, musically satisfying) method would be to take every pulse length and simply replace it with the Fibonacci number that follows it. So in our example, the 3s become 5s, and the 2 becomes 3.
In [ ]:
expanded_pulse_lengths = fibonaccistretch.fibonacci_expand_pulse_lengths(tresillo_pulse_lengths)
print("Expanded tresillo pulse lengths: {}".format(expanded_pulse_lengths))
We'll also want to be able to contract rhythms along the Fibonacci sequence (i.e. choose numbers in decreasing order instead of increasing order), as well as specify how many Fibonacci numbers away we want to end up.
We can generalize this expansion and contraction into a single function that can scale pulse lengths:
In [ ]:
# Note that `scale_amount` determines the direction and magnitude of the scaling.
# If `scale_amount` > 0, it corresponds to a rhythmic expansion.
# If `scale_amount` < 0, it corresponds to a rhythmic contraction.
# If `scale_amount` == 0, the original scale is maintained and no changes are made.
print("Tresillo pulse lengths: {}".format(tresillo_pulse_lengths))
print("Tresillo pulse lengths expanded by 1: {}".format(fibonaccistretch.fibonacci_scale_pulse_lengths(tresillo_pulse_lengths, scale_amount=1)))
print("Tresillo pulse lengths expanded by 2: {}".format(fibonaccistretch.fibonacci_scale_pulse_lengths(tresillo_pulse_lengths, scale_amount=2)))
print("Tresillo pulse lengths contracted by 1: {}".format(fibonaccistretch.fibonacci_scale_pulse_lengths(tresillo_pulse_lengths, scale_amount=-1)))
Of course, once we have these scaled pulse lengths, we'll want to be able to convert them back into rhythms, in our original array format:
In [ ]:
# Scale tresillo rhythm by a variety of factors and plot the results
for scale_factor, color in [(0, "r"), (1, "g"), (2, "b"), (-1, "y")]:
scaled_rhythm = fibonaccistretch.fibonacci_scale_rhythm(tresillo_rhythm, scale_factor)
scaled_pulse_indices = np.array([p_i for p_i,x in enumerate(scaled_rhythm) if x > 0 ])
scaled_step_indices = np.array([s_i for s_i in range(len(scaled_rhythm))])
scaled_pulse_ys = np.empty(len(scaled_pulse_indices))
scaled_pulse_ys.fill(0)
scaled_step_ys = np.empty(len(scaled_step_indices))
scaled_step_ys.fill(0)
# plt.figure(figsize=(len([scaled_rhythm])*0.5, 1))
plt.figure(figsize=(8, 1))
if scale_factor > 0:
plt.title("Tresillo rhythm expanded by {}: {}".format(abs(scale_factor), scaled_rhythm), loc="left")
elif scale_factor < 0:
plt.title("Tresillo rhythm contracted by {}: {}".format(abs(scale_factor), scaled_rhythm), loc="left")
else: # scale_factor == 0, which means rhythm is unaltered
plt.title("Tresillo rhythm: {}".format(scaled_rhythm), loc="left")
# plt.scatter(scaled_pulse_indices, scaled_pulse_ys, c=color)
# plt.scatter(scaled_step_indices, scaled_step_ys, c="k", alpha=0.5)
# plt.grid(True)
plt.vlines(scaled_pulse_indices, -1, 1, color=color)
plt.vlines(scaled_step_indices, -0.5, 0.5, color=color, alpha=0.5)
plt.xticks(np.arange(0, plt.xlim()[1], 1))
plt.yticks([])
# plt.xticks(np.linspace(0, 10, 41))
This is exactly the kind of rhythmic expansion and contraction that the Vijay Iyer Trio explore in their renditions of "Mystic Brew" and "Human Nature (Trio Extension)".
Next up, let's begin working with some actual audio!
Part of the beauty of working with rhythms in a symbolic fashion is that once we set things up, we can apply them to any existing audio track.
To properly map the relationship between a rhythmic sequence and an audio representation of a piece of music, we'll have to do some feature extraction, that is, teasing out specific attributes of the music by analyzing the audio signal.
Our goal is to create a musically meaningful relationship between our symbolic rhythmic data and the audio track we want to manipulate.
First we'll load up our source audio file. For this example we'll work with Michael Jackson's "Human Nature", off of his 1982 album Thriller:
In [ ]:
# Load input audio file
filename = "../data/humannature_30s.mp3"
y, sr = librosa.load(filename, sr=sr)
plt.figure(figsize=(16,4))
librosa.display.waveplot(y, sr=sr)
ipd.Audio(y, rate=sr)
An important feature we want to extract from the audio is tempo (which determines the time interval between steps). Let's estimate it using the librosa.beat.tempo method (which requires us to first detect onsets, i.e. the beginnings of notes and other musical events):
In [ ]:
tempo = fibonaccistretch.estimate_tempo(y, sr)
print("Tempo (calculated): {}".format(tempo))
tempo = 93.0 # Hard-coded from prior knowledge
print("Tempo (hard-coded): {}".format(tempo))
In [ ]:
beat_times = fibonaccistretch.calculate_beat_times(y, sr, tempo)
print("First 10 beat times (in seconds): {}".format(beat_times[:10]))
And let's listen to our extracted beats with the original audio track:
In [ ]:
# Listen to beat clicks (i.e. a metronome)
beat_clicks = librosa.clicks(times=beat_times, sr=sr, length=len(y))
# Plot waveform and beats
plt.figure(figsize=(16,4))
librosa.display.waveplot(y, sr=sr)
plt.vlines(beat_times, -0.25, 0.25, color="r")
ipd.Audio(y + beat_clicks, rate=sr)
In [ ]:
beats_per_measure = 4
Using beats_per_measure we can calculate the times for the start of each measure:
In [ ]:
# Work in samples from here on
beat_samples = librosa.time_to_samples(beat_times, sr=sr)
measure_samples = fibonaccistretch.calculate_measure_samples(y, beat_samples, beats_per_measure)
print("First 10 measure samples: {}".format(measure_samples[:10]))
Note that we're working in samples now, as this is the unit that the audio data is actually stored in; when we loaded up the audio track, we essentially read in a large array of samples. The sample rate, which we defined as sr, tells us how many samples there are per second.
Thus, it's a simple matter to convert samples to times whenever we need to:
In [ ]:
measure_times = librosa.samples_to_time(measure_samples, sr=sr)
print("First 10 measure times (in seconds): {}".format(measure_times[:10], sr=sr))
We can visualize, and listen to, the measure and beat markers along with the original waveform:
In [ ]:
# Add clicks, then plot and listen
plt.figure(figsize=(16, 4))
librosa.display.waveplot(y, sr=sr)
plt.vlines(measure_times, -1, 1, color="r")
plt.vlines(beat_times, -0.5, 0.5, color="r")
measure_clicks = librosa.clicks(times=measure_times, sr=sr, click_freq=3000.0, length=len(y))
ipd.Audio(y + measure_clicks + beat_clicks, rate=sr)
In [ ]:
print("Tresillo rhythm: {}\n"
"{} pulses, {} steps".format(tresillo_rhythm, tresillo_num_pulses, tresillo_num_steps))
For this example, we want the rhythm to last an entire measure as well, so we'll set steps_per_measure to be the number of steps in the rhythm (in this case, 8):
In [ ]:
steps_per_measure = tresillo_num_steps
steps_per_measure
With these markers in place, we can now overlay the tresillo rhythm onto each measure and listen to the result:
In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(tresillo_rhythm, y, measure_samples, sr=sr)
The clicks for measures, pulses, and steps overlap with each other at certain points. You can hear this, since each type of click is at a different frequency, but it can be hard to tell visually in the figure above. We can make it more apparent by plotting each set of clicks in a different color.
In the below figure, each measure is denoted by a large red line, each pulse by a medium green line, and each step by a small blue line.
In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(tresillo_rhythm, y, measure_samples, sr=sr, click_colors={"measure": "r",
"pulse": "g",
"step": "b"})
You can hear that the tresillo rhythm's pulses line up with the harmonic rhythm of "Human Nature"; generally, we want to pick rhythms and audio tracks that have at least some kind of musical relationship.
(We could actually try to estimate rhythmic patterns based on onsets and tempo, but that's for another time.)
In [ ]:
original_rhythm = tresillo_rhythm
target_rhythm = fibonaccistretch.fibonacci_scale_rhythm(original_rhythm, 1) # "Fibonacci scale" original rhythm by a factor of 1
print("Original rhythm: {}\n"
"Target rhythm: {}".format(original_rhythm, target_rhythm))
In [ ]:
pulse_ratios = fibonaccistretch.calculate_pulse_ratios(original_rhythm, target_rhythm)
print("Pulse ratios: {}".format(pulse_ratios))
Since we're treating our symbolic rhythms as having the duration of one measure, it makes sense to start by modifying a single measure.
Basically what we want to do is: for each pulse, get the audio chunk that maps to that pulse, and time-stretch it based on our calculated pulse ratios.
Below is an implementation of just that. It's a bit long, but that's mostly due to having to define several properties to do with rhythm and audio. The core idea, of individually stretching the pulses, remains the same:
In [ ]:
fibonaccistretch.modify_measure??
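If you're reading this outside the notebook, here's a condensed sketch of that core loop. It's a hypothetical reimplementation, not the actual source: it assumes both rhythms have the same number of pulses and that the modified measure keeps its original duration, consistent with the note later on about maintaining the original tempo.
In [ ]:
def modify_measure_sketch(measure_data, original_rhythm, target_rhythm, target_num_samples=None):
    # Time-stretch each pulse of one measure so that its onsets follow target_rhythm
    orig_lengths = calculate_pulse_lengths_sketch(original_rhythm)
    targ_lengths = calculate_pulse_lengths_sketch(target_rhythm)
    n_in = len(measure_data)
    n_out = n_in if target_num_samples is None else target_num_samples
    # Sample boundaries of each pulse on the original and target step grids
    orig_bounds = np.round(np.append(0, np.cumsum(orig_lengths)) * n_in / float(len(original_rhythm))).astype(int)
    targ_bounds = np.round(np.append(0, np.cumsum(targ_lengths)) * n_out / float(len(target_rhythm))).astype(int)
    chunks = []
    for i in range(len(orig_lengths)):
        chunk = measure_data[orig_bounds[i]:orig_bounds[i + 1]]
        target_len = targ_bounds[i + 1] - targ_bounds[i]
        rate = len(chunk) / float(target_len)  # rate > 1 speeds the chunk up (makes it shorter)
        chunks.append(librosa.effects.time_stretch(chunk, rate=rate))
    return np.concatenate(chunks)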
You'll notice that in the part where we choose stretch methods, there's a function called euclidean_stretch
that we haven't defined. We'll get to that in just a second! For now, let's just keep that in the back of our heads, and not worry about it too much, so that we can hear what our modification method sounds like when applied to the first measure of "Human Nature":
In [ ]:
first_measure_data = y[measure_samples[0]:measure_samples[1]]
first_measure_modified = fibonaccistretch.modify_measure(first_measure_data,
original_rhythm, target_rhythm,
stretch_method="timestretch")
ipd.Audio(first_measure_modified, rate=sr)
In [ ]:
# Modify the track using naive time-stretch
y_modified, measure_samples_modified = fibonaccistretch.modify_track(y, measure_samples,
original_rhythm, target_rhythm,
stretch_method="timestretch")
plt.figure(figsize=(16,4))
librosa.display.waveplot(y_modified, sr=sr)
ipd.Audio(y_modified, rate=sr)
Listening to the whole track, the only perceptible difference is that the last two beats of each measure are slightly faster. If we look at the pulse ratios again:
In [ ]:
pulse_ratios = fibonaccistretch.calculate_pulse_ratios(original_rhythm, target_rhythm)
print(pulse_ratios)
... we can see that this makes sense, as we're time-stretching the first two pulses by the same amount, and then time-stretching the last pulse by a different amount.
(Note that while we're expanding our original rhythm along the Fibonacci sequence, this actually corresponds to a contraction when time-stretching. This is because we want to maintain the original tempo, so we're trying to fit more steps into the same timespan.)
In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(target_rhythm, y_modified, measure_samples, sr)
This gets to the heart of the problem: when we time-stretch an entire pulse this way, we retain the original pulse's internal rhythm, essentially creating a polyrhythm in the target pulse's step (i.e. metrical) structure. Even though we're time-stretching each pulse, we don't hear a difference because everything within the pulse gets time-stretched by the same amount.
Listening to the rendered track in Part 4.5, you can hear that aside from the beginning of each measure and pulse, the musical onsets in the modified track don't really line up with the target rhythm's clicks at all. Thus, without the clicks, we have no way to identify the target rhythm, even though that's what we were using as the basis of our stretch method!
So how do we remedy this?
In [ ]:
print("Original rhythm: {}\n"
"Target rhythm: {}".format(original_rhythm, target_rhythm))
Looking at the first pulses of the original rhythm and the target rhythm, we want to turn [1 0 0] into [1 0 0 0 0].
To accomplish this, we'll turn to the concept of Euclidean rhythms.
In [ ]:
fibonaccistretch.euclid??
gcd = fibonaccistretch.euclid(8, 12)
print("Greatest common divisor of 8 and 12 is {}".format(gcd))
The concept of Euclidean rhythms was first introduced by Godfried Toussaint in his 2004 paper The Euclidean Algorithm Generates Traditional Musical Rhythms.
The algorithm for generating these rhythms is actually Bjorklund's algorithm, first described by E. Bjorklund in his 2003 paper The Theory of Rep-Rate Pattern Generation in the SNS Timing System, which deals with neutron accelerators in nuclear physics. Here we use Brian House's Python implementation of Bjorklund's algorithm; you can find the source code on GitHub.
It turns out that our tresillo rhythm is an example of a Euclidean rhythm. We can generate it by plugging in the number of pulses and steps into Bjorklund's algorithm:
In [ ]:
print(np.array(bjorklund.bjorklund(pulses=3, steps=8)))
In [ ]:
original_pulse = np.array([1,0,0])
target_pulse = np.array([1,0,0,0,0])
We want to know how much to stretch each subdivision. To do this, we'll convert these single pulses into rhythms of their own. First, we'll treat each step in the original pulse as an onset:
In [ ]:
original_pulse_rhythm = np.ones(len(original_pulse), dtype="int")
print(original_pulse_rhythm)
And as mentioned before, we'll use Bjorklund's algorithm to generate the target pulse's rhythm. The trick here is to use the number of steps in the original pulse as the number of pulses for the target pulse rhythm (hence the conversion to onsets earlier):
In [ ]:
target_pulse_rhythm = np.array(bjorklund.bjorklund(pulses=len(original_pulse), steps=len(target_pulse)))
print(target_pulse_rhythm)
You might have noticed that this rhythm is exactly the same as the rhythm produced by contracting the tresillo rhythm along the Fibonacci sequence by a factor of 1:
In [ ]:
print(fibonaccistretch.fibonacci_scale_rhythm(tresillo_rhythm, -1))
And it's true that there is some significant overlap between Euclidean rhythms and Fibonacci rhythms. The advantage of working with Euclidean rhythms here is that they work with any number of pulses and steps, not just ones that are Fibonacci numbers.
To summarize:
In [ ]:
print("In order to stretch pulse-to-pulse {} --> {}\n"
"we subdivide and stretch rhythms {} --> {}".format(original_pulse, target_pulse,
original_pulse_rhythm, target_pulse_rhythm))
The resulting pulse ratios are:
In [ ]:
print(fibonaccistretch.calculate_pulse_ratios(original_pulse_rhythm, target_pulse_rhythm))
... which doesn't intuitively look like it would produce anything different from what we tried before. However, we might perceive a greater difference because:
a) we're working on a more granular temporal level (subdivisions of pulses as opposed to measures), and
b) we're adjusting an equally-spaced rhythm (e.g. [1 1 1]) to one that's not necessarily equally spaced (e.g. [1 0 1 0 1]).
In [ ]:
fibonaccistretch.euclidean_stretch??
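For readers outside the notebook, here's a rough sketch of the idea behind euclidean_stretch, reusing the hypothetical modify_measure_sketch from earlier (the real implementation presumably also rescales each pulse to its target share of the measure; this sketch only shows the subdivision step):
In [ ]:
def euclidean_stretch_sketch(pulse_data, original_pulse, target_pulse):
    # Every step of the original pulse becomes an onset of a new sub-rhythm...
    original_sub_rhythm = np.ones(len(original_pulse), dtype=int)
    # ...whose onsets are spread as evenly as possible across the target pulse's steps
    target_sub_rhythm = np.array(bjorklund.bjorklund(pulses=len(original_pulse),
                                                     steps=len(target_pulse)))
    # Then each subdivision is time-stretched individually, just like each pulse was before
    return modify_measure_sketch(pulse_data, original_sub_rhythm, target_sub_rhythm)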
Let's take a listen to how it sounds:
In [ ]:
# Modify the track
y_modified, measure_samples_modified = fibonaccistretch.modify_track(y, measure_samples,
original_rhythm, target_rhythm,
stretch_method="euclidean")
plt.figure(figsize=(16,4))
librosa.display.waveplot(y_modified, sr=sr)
ipd.Audio(y_modified, rate=sr)
Much better! With clicks:
In [ ]:
fibonaccistretch.overlay_rhythm_onto_audio(target_rhythm, y_modified, measure_samples, sr)
As you can hear, the modified track's rhythm is in line with the clicks, and sounds noticeably different from the original song. This is a pretty good place to end up!
In [ ]:
fibonaccistretch.fibonacci_stretch_track??
Now we can simply feed the function a path to an audio file (as well as any parameters we want to customize).
This is the exact method that's applied to the sneak peek at the final result up top. The only difference is that we use a 90-second excerpt rather than our original 30-second one:
In [ ]:
# "Human Nature" stretched by a factor of 1 using default parameters
fibonaccistretch.fibonacci_stretch_track("../data/humannature_90s.mp3",
stretch_factor=1,
tempo=93.0)
And indeed we get the exact same result.
In [ ]:
# "Human Nature" stretched by a factor of 2
fibonaccistretch.fibonacci_stretch_track("../data/humannature_30s.mp3",
tempo=93.0,
stretch_factor=2,
overlay_clicks=True)
In [ ]:
# "Chan Chan" stretched by a factor of -1
fibonaccistretch.fibonacci_stretch_track("../data/chanchan_30s.mp3",
stretch_factor=-1,
tempo=78.5)
(Note that although we do end up with a perceptible difference (the song now sounds like it's in 7/8), it should actually sound like it's in 5/8, since [1 0 0 1 0 0 1 0] is getting compressed to [1 0 1 0 1]. This is an implementation detail of the Euclidean stretch method that I need to fix.)
In order to get musically meaningful results we generally want to supply parameters that make musical sense with our input audio (although it can certainly be interesting to try with parameters that don't!). One of the parameters that makes the most difference in results is the rhythm sequence used to represent each measure.
Here's Chance the Rapper's verse from DJ Khaled's "I'm the One", with a custom original_rhythm
that matches the bassline of the song:
In [ ]:
# "I'm the One" stretched by a factor of 1
fibonaccistretch.fibonacci_stretch_track("../data/imtheone_cropped_chance_60s.mp3",
tempo=162,
original_rhythm=np.array([1,0,0,0,0,1,0,0]),
stretch_factor=1)
We can define a custom target rhythm as well. In addition, neither original_rhythm nor target_rhythm has to be a Fibonacci rhythm for the stretch algorithm to work (although with this implementation they do both have to have the same number of pulses).
Let's try that out with the same verse, going from an original rhythm with 8 steps (i.e. in 4/4 meter) to a target rhythm with 10 steps (i.e. in 5/4 meter):
In [ ]:
# "I'm the One" in 5/4
fibonaccistretch.fibonacci_stretch_track("../data/imtheone_cropped_chance_60s.mp3",
tempo=162,
original_rhythm=np.array([1,0,0,0,0,1,0,0]),
target_rhythm=np.array([1,0,0,0,0,1,0,0,0,0]),
overlay_clicks=True)
As another example, we can give a swing feel to the first movement of Mozart's "Eine kleine Nachtmusik" (K. 525), as performed by A Far Cry:
In [ ]:
# "Eine kleine Nachtmusik" with a swing feel
fibonaccistretch.fibonacci_stretch_track("../data/einekleinenachtmusik_30s.mp3",
tempo=130,
original_rhythm=np.array([1,0,1,1]),
target_rhythm=np.array([1,0,0,1,0,1]))
It works pretty decently until around 0:09, at which point the assumption of a metronomically consistent tempo breaks down. (This is one of the biggest weaknesses of the current implementation, and is something I definitely hope to work on in the future.)
Let's also hear what "Chan Chan" sounds like in 5/4:
In [ ]:
# "Chan Chan" in 5/4
fibonaccistretch.fibonacci_stretch_track("../data/chanchan_30s.mp3",
tempo=78.5,
original_rhythm=np.array([1,0,0,1,0,0,0,0]),
target_rhythm=np.array([1,0,0,0,0,1,0,0,0,0])) # Also interesting to try with [1,0,1]
We can also work with source audio in other meters. For example, Frank Ocean's "Pink + White" is in 6/8. Here I've stretched it into 4/4 using the rhythm of the bassline, but you can uncomment the other supplied parameters (or supply your own!) to hear how they sound as well:
In [ ]:
# "Pink + White" stretched by a factor of 1
fibonaccistretch.fibonacci_stretch_track("../data/pinkandwhite_30s.mp3",
beats_per_measure=6,
tempo=160,
# 6/8 to 4/4 using bassline rhythm
original_rhythm=np.array([1,1,1,1,0,0]),
target_rhythm=np.array([1,1,1,0,1,0,0,0]),
# 6/8 to 4/4 using half notes
# original_rhythm=np.array([1,0,0,1,0,0]),
# target_rhythm=np.array([1,0,0,0,1,0,0,0]),
# 6/8 to 10/8 (5/4) using Fibonacci stretch factor of 1
# original_rhythm=np.array([1,0,0,1,0,0]),
# stretch_factor=1,
overlay_clicks=True)
This notebook started out as an answer to the question "What if we applied rhythmic expansion methods, based on the Fibonacci sequence, to actual audio tracks?"
It quickly grew into more and more, and we now have a working implementation of what I've dubbed Fibonacci stretch. Along the way I've come to a few conclusions that I'll go over:
I think there's certainly a case to be made for Fibonacci stretch as an interesting and useful means of musical transformation; it's rooted in mathematical processes that have been shown to produce interesting artistic output, and Part 6 shows that Fibonacci stretch itself can produce musically interesting results.
However, it has its limits as well, the main one being that the Fibonacci sequence grows exponentially (with the ratio between successive terms approaching the golden ratio). This means that above a certain stretch_factor value, Fibonacci stretch starts to feel somewhat impractical. For example, stretching [1 0 0 1 0 0 1 0] by a factor of 6 gives us a target rhythm with 144 steps, which isn't something we can easily perceive when crammed into the space of our original 8 steps.
One solution to this imperceptibility is to allow for the length of the modified measure to change as well. For example, a 2-second measure with 8 steps (4 steps/second) could be stretched into a 24-second measure with 144 steps (6 steps/second). The longer time span might yield more interesting results, but it might also obscure the relationship between the modified track and the original track.
I gleaned a lot of insight from experimenting with different parameters. As shown in Part 2.2, the greater the stretch_factor, the closer the ratios between pulse lengths get to the golden ratio; with regards to rhythm perception, I found that this made the resulting rhythm sound more and more "natural", in an almost uncanny-valley manner. This relates back to the limitations of Fibonacci stretch as a creative tool as well. Perhaps it would be worth examining the space of results that lie between stretch factors of 1 and 3, as those seem to be where the most musically interesting rhythmic shifts occur.
Similarly, plugging in different patterns for the original_rhythm and target_rhythm parameters yielded differing results, with some pairings seeming more closely related than others. It's possible that there are some underlying rhythmic principles that more clearly explain how the relationship between the original and target rhythms affects how we perceive the stretched result.
In addition, the initially disappointing results of using naive time-stretch also indicate the importance of considering different perceptual levels of rhythm. We explored a solution using subdivision of pulses, but haven't dealt with perceptual levels of rhythm in a rigorous manner at all.
The current implementation of Fibonacci stretch works well enough, but it also leaves a lot to be desired.
One of the biggest issues is the handling of tempo. Firstly, the built-in tempo detection almost never gives us the correct value, which is why for the examples we've had to pass in tempo values manually. Looking into alternate tempo detection methods could mitigate this. More importantly, however, the current implementation doesn't allow for variable tempo, which is a real problem for tracks that weren't recorded to a metronome. Simply using a dynamic tempo estimate (which librosa.beat.tempo is capable of) would go a long way toward improving the quality of the modified tracks.
This implementation of Fibonacci stretch also doesn't work too well when stretch_factor < 0. As with the "Chan Chan" example in Part 6.2, we don't get the results we expect. It might just be a quirk in the step length conversion in euclidean_stretch(), but it's also possible that an additional level of subdivision is needed.
We might also want to explore using onset detection to improve the actual time-stretching process. Instead of choosing stretch regions purely based on where they fall in relation to the symbolic rhythm parameters, we could define regions in accordance with the sample indices of detected onsets as well, which could yield more natural sounding output.
There are a ton of avenues to explore further, some of which we've touched upon. For example, some of the most interesting results came from stretching a track so that the performance was converted from one meter to another (e.g. 6/8 to 5/4). Sometimes this occurred as a happy coincidence of Fibonacci stretch, but most of the time we had to pass in custom original_rhythm and target_rhythm parameters. I think it would be worthwhile to explore a version of the stretch method that could convert meter A to meter B without explicitly defining the rhythm patterns.
On a related note, exploring the possibilities of Euclidean rhythms, outside of their relationship to Fibonacci rhythms, could be worthwhile as well. In addition, we could allow each measure to be stretched using different parameters by passing in a list of arguments, each corresponding to a measure or group of measures. This would allow for more flexibility and creative freedom when using Fibonacci stretch.
Finally, I could see this stretch method being used in the context of a digital audio workstation (DAW), with both the symbolic meter data for a project and the raw audio signal for a track being modified. If embedded directly into a DAW, this would open up possibilities for rapid rhythmic experimentation in a music production setting.
All in all, I hope this notebook presents an insightful exploration of how the Fibonacci sequence can be applied to sample-level audio manipulation of rhythmic relationships!