In [1]:
import stanford_mir; stanford_mir.init()

Jupyter Audio Basics

Audio Libraries

We will mainly use two libraries for audio acquisition and playback:

1. librosa

librosa is a Python package for music and audio processing by Brian McFee. A large portion was ported from Dan Ellis's Matlab audio processing examples.

2. IPython.display.Audio

IPython.display.Audio lets you play audio directly in an IPython notebook.

Included Audio Data

This GitHub repository includes many short audio excerpts for your convenience.

Here are the files currently in the audio directory:


In [2]:
ls audio


125_bounce.wav                  jangle_pop.mp3
58bpm.wav                       latin_groove.mp3
README.md                       oboe_c6.wav
brahms_hungarian_dance_5.mp3    prelude_cmaj.wav
busta_rhymes_hits_for_days.mp3  simple_loop.wav
c_strum.wav                     simple_piano.wav
clarinet_c6.wav                 sir_duke_piano_fast.mp3
classic_rock_beat.wav           sir_duke_piano_slow.mp3
conga_groove.wav                sir_duke_trumpet_fast.mp3
drum_samples/                   sir_duke_trumpet_slow.mp3
funk_groove.mp3                 tone_440.wav

Reading Audio

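Use librosa.load to load an audio file into an audio array. It returns both the audio array and the sample rate. By default, librosa.load converts the signal to mono and resamples it to 22050 Hz; pass sr=None to keep the file's native sample rate: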


In [3]:
import librosa
x, sr = librosa.load('audio/simple_loop.wav')

If you receive an error with librosa.load, you may need to install ffmpeg.

Display the length of the audio array and sample rate:


In [4]:
print(x.shape)
print(sr)


(49613,)
22050
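
Together, these two values give the duration of the clip: the number of samples divided by the sample rate. A minimal sketch using the numbers printed above (the variable names n_samples and duration are illustrative, not part of the notebook):

```python
n_samples = 49613   # x.shape[0] from the output above
sr = 22050          # sample rate in Hz (samples per second)

duration = n_samples / sr   # length of the clip in seconds
print(round(duration, 2))   # 2.25
```

With the audio loaded, librosa.get_duration(y=x, sr=sr) should report the same value.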

Visualizing Audio

In order to display plots inside the Jupyter notebook, run the following commands, preferably at the top of your notebook:


In [5]:
%matplotlib inline
import matplotlib.pyplot as plt
import librosa.display

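Plot the audio array using librosa.display.waveplot (in librosa 0.10 and later, waveplot has been removed in favor of librosa.display.waveshow, which accepts the same x and sr arguments):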


In [6]:
plt.figure(figsize=(14, 5))
librosa.display.waveplot(x, sr=sr)


Out[6]:
<matplotlib.collections.PolyCollection at 0x10ccd5320>

Display a spectrogram using librosa.display.specshow:


In [7]:
X = librosa.stft(x)
Xdb = librosa.amplitude_to_db(abs(X))
plt.figure(figsize=(14, 5))
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')


Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x10cd6dfd0>
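
To read the spectrogram axes, it helps to know the shape of the matrix X. The sketch below works through the arithmetic, assuming librosa.stft is called with its documented defaults (n_fft=2048, hop_length=n_fft // 4, center=True); if your installed version uses different defaults, the numbers will differ.

```python
n_samples = 49613          # length of x for simple_loop.wav
sr = 22050                 # sample rate in Hz

# Assumed librosa.stft defaults
n_fft = 2048               # samples per analysis window
hop_length = n_fft // 4    # 512 samples between successive frames

n_bins = 1 + n_fft // 2                  # frequency bins from 0 Hz to sr/2
n_frames = 1 + n_samples // hop_length   # frames when the signal is center-padded

print((n_bins, n_frames))                # (1025, 97)
print(round(sr / n_fft, 2))              # 10.77, spacing between bins in Hz
print(round(hop_length / sr, 4))         # 0.0232, spacing between frames in seconds
```

Each row of Xdb therefore covers roughly 10.8 Hz, and each column advances by roughly 23 ms.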

Playing Audio

IPython.display.Audio

Using IPython.display.Audio, you can play an audio file:


In [8]:
import IPython.display as ipd
ipd.Audio('audio/conga_groove.wav') # load a local WAV file


Out[8]:

Audio can also accept a NumPy array. Let's synthesize a pure tone at 440 Hz:


In [9]:
import numpy
sr = 22050 # sample rate
T = 2.0    # seconds
t = numpy.linspace(0, T, int(T*sr), endpoint=False) # time variable
x = 0.5*numpy.sin(2*numpy.pi*440*t)                # pure sine wave at 440 Hz

Listen to the audio array:


In [10]:
ipd.Audio(x, rate=sr) # play a NumPy array


Out[10]:
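
The synthesis step above generalizes to any frequency and duration. Here is a minimal sketch that wraps it in a helper; the function name sine_tone and its default arguments are hypothetical and chosen only for illustration:

```python
import numpy as np

def sine_tone(freq_hz, duration_s, sr=22050, amplitude=0.5):
    """Return a pure sine tone as a 1-D float array."""
    n_samples = int(duration_s * sr)
    t = np.arange(n_samples) / sr          # time of each sample in seconds
    return amplitude * np.sin(2 * np.pi * freq_hz * t)

tone = sine_tone(440, 2.0)
print(tone.shape)   # (44100,)
```

The resulting array can be passed to ipd.Audio(tone, rate=22050) in the same way as x.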

Writing Audio

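librosa.output.write_wav saves a NumPy array to a WAV file. Note that the librosa.output module was removed in librosa 0.8; on newer versions, use soundfile.write('audio/tone_440.wav', x, sr) instead.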


In [11]:
librosa.output.write_wav('audio/tone_440.wav', x, sr)
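
If neither librosa.output nor an extra audio library is available, a WAV file can also be written with Python's standard wave module. The following is a minimal sketch under a few assumptions: the signal is mono, its values lie in [-1, 1], and 16-bit PCM is acceptable (this differs from the call above, which writes the array as given). The helper name write_wav_pcm16 is hypothetical, and the file is written to a temporary directory so that nothing in audio/ is overwritten.

```python
import os
import tempfile
import wave

import numpy as np

def write_wav_pcm16(path, samples, sr):
    """Write a mono float array with values in [-1, 1] as a 16-bit PCM WAV file."""
    clipped = np.clip(np.asarray(samples, dtype=float), -1.0, 1.0)
    pcm = (clipped * 32767).astype('<i2')   # little-endian 16-bit integers
    with wave.open(path, 'wb') as f:
        f.setnchannels(1)     # mono
        f.setsampwidth(2)     # 2 bytes = 16 bits per sample
        f.setframerate(sr)
        f.writeframes(pcm.tobytes())

# Same 440 Hz tone as above
sr = 22050
t = np.arange(int(2.0 * sr)) / sr
x = 0.5 * np.sin(2 * np.pi * 440 * t)

out_path = os.path.join(tempfile.mkdtemp(), 'tone_440_copy.wav')
write_wav_pcm16(out_path, x, sr)

# Read the header back to confirm what was written
with wave.open(out_path, 'rb') as f:
    print(f.getframerate(), f.getnframes())   # 22050 44100
```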