Audio effects and playback with Librosa and IPython Notebook

This notebook demonstrates how to do audio effects processing with librosa in an IPython notebook. You will need IPython 2.0 or later.

By the end of this notebook, you'll know how to do the following:

  • Play audio in the browser
  • Apply effects such as harmonic/percussive source separation, time stretching, and pitch shifting
  • Decompose and reconstruct audio signals with non-negative matrix factorization
  • Visualize spectrogram data

In [ ]:
from __future__ import print_function

In [ ]:
import librosa
import librosa.display
import IPython.display
import numpy as np

In [ ]:
import matplotlib.pyplot as plt
import matplotlib.style as ms
ms.use('seaborn-muted')
%matplotlib inline

In [ ]:
# Load the example track
y, sr = librosa.load(librosa.util.example_audio_file())

In [ ]:
# Play it back!
IPython.display.Audio(data=y, rate=sr)

In [ ]:
# How about separating harmonic and percussive components?
y_h, y_p = librosa.effects.hpss(y)

In [ ]:
# Play the harmonic component
IPython.display.Audio(data=y_h, rate=sr)

In [ ]:
# Play the percussive component
IPython.display.Audio(data=y_p, rate=sr)

In [ ]:
# Pitch shifting?  Let's gear-shift up by a perfect fifth (7 semitones)
y_shift = librosa.effects.pitch_shift(y, sr=sr, n_steps=7)

IPython.display.Audio(data=y_shift, rate=sr)
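As a sanity check on what a 7-semitone shift means: each semitone multiplies every frequency by the twelfth root of two, so a perfect fifth comes out very close to the 3/2 ratio from just intonation:

```python
# Shifting by n semitones scales frequencies by 2**(n/12).
# +7 semitones (a perfect fifth) is roughly a 1.5x frequency ratio:
ratio = 2.0 ** (7.0 / 12.0)
print(round(ratio, 3))  # close to 3/2
```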

In [ ]:
# Or time-stretching?  Let's slow it down to half speed
y_slow = librosa.effects.time_stretch(y, rate=0.5)

IPython.display.Audio(data=y_slow, rate=sr)

In [ ]:
# How about something more advanced?  Let's decompose a spectrogram with NMF, and then resynthesize an individual component
D = librosa.stft(y)

# Separate the magnitude and phase
S, phase = librosa.magphase(D)

# Decompose with non-negative matrix factorization (NMF)
components, activations = librosa.decompose.decompose(S, n_components=8, sort=True)

In [ ]:
# Visualize the components and activations, just for fun

plt.figure(figsize=(12,4))

plt.subplot(1,2,1)
librosa.display.specshow(librosa.power_to_db(components**2.0, ref=np.max), y_axis='log')
plt.xlabel('Component')
plt.ylabel('Frequency')
plt.title('Components')

plt.subplot(1,2,2)
librosa.display.specshow(activations, x_axis='time')
plt.xlabel('Time')
plt.ylabel('Component')
plt.title('Activations')

plt.tight_layout()

In [ ]:
print(components.shape, activations.shape)

In [ ]:
# Play back the reconstruction
# Reconstruct the full spectrogram from the product of all components and activations
D_k = components.dot(activations)

# invert the stft after putting the phase back in
y_k = librosa.istft(D_k * phase)

# And playback
print('Full reconstruction')

IPython.display.Audio(data=y_k, rate=sr)

In [ ]:
# Resynthesize.  How about we isolate just the first (lowest) component?
k = 0

# Reconstruct a spectrogram by the outer product of component k and its activation
D_k = np.multiply.outer(components[:, k], activations[k])

# invert the stft after putting the phase back in
y_k = librosa.istft(D_k * phase)

# And playback
print('Component #{}'.format(k))

IPython.display.Audio(data=y_k, rate=sr)
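Summing the rank-1 outer products over every k recovers exactly the full `components.dot(activations)` reconstruction. A quick NumPy check with made-up shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
components = rng.random((16, 4))    # (n_freq, n_components)
activations = rng.random((4, 20))   # (n_components, n_frames)

full = components.dot(activations)
rank1_sum = sum(np.multiply.outer(components[:, k], activations[k])
                for k in range(components.shape[1]))

print(np.allclose(full, rank1_sum))  # the rank-1 pieces sum to the full product
```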

In [ ]:
# Resynthesize.  How about we isolate a middle-frequency component?
k = len(activations) // 2  # integer division, so k is a valid index under Python 3

# Reconstruct a spectrogram by the outer product of component k and its activation
D_k = np.multiply.outer(components[:, k], activations[k])

# invert the stft after putting the phase back in
y_k = librosa.istft(D_k * phase)

# And playback
print('Component #{}'.format(k))

IPython.display.Audio(data=y_k, rate=sr)

In [ ]:
# Resynthesize.  How about we isolate just the last (highest) component?
k = -1

# Reconstruct a spectrogram by the outer product of component k and its activation
D_k = np.multiply.outer(components[:, k], activations[k])

# invert the stft after putting the phase back in
y_k = librosa.istft(D_k * phase)

# And playback
print('Component #{}'.format(k))

IPython.display.Audio(data=y_k, rate=sr)