Using Audio in IPython

Audio Libraries

We will mainly use three libraries for audio acquisition and playback: IPython.display.Audio, essentia.standard, and librosa.

Introduced in IPython 2.0, IPython.display.Audio lets you play audio directly in an IPython notebook.

Essentia is an open-source library for audio analysis and music information retrieval from the Music Technology Group at Universitat Pompeu Fabra. Although Essentia is written in C++, we will use the Python bindings for Essentia.

librosa is a Python package for music and audio processing by Brian McFee. A large portion was ported from Dan Ellis's Matlab audio processing examples.

Retrieving Audio

To download a file onto your local machine (or Vagrant box) in Python, you can use urllib.urlretrieve (in Python 3, this function is urllib.request.urlretrieve):



In [1]:

import urllib
urllib.urlretrieve('https://ccrma.stanford.edu/workshops/mir2014/audio/simpleLoop.wav', filename='simpleLoop.wav')

Out[1]:

('simpleLoop.wav', <httplib.HTTPMessage instance at 0x3629b48>)
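The call above uses the Python 2 location of urlretrieve. As a minimal sketch, assuming you want the same download to work on both Python 2 and Python 3 and to skip files that are already present, a small helper could look like the following. The name fetch is hypothetical and not part of urllib.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve          # Python 2


def fetch(url, filename):
    """Download url to filename unless the file already exists (hypothetical helper)."""
    if not os.path.exists(filename):
        urlretrieve(url, filename)
    return filename

# Usage (performs a network request the first time it is run):
# fetch('https://ccrma.stanford.edu/workshops/mir2014/audio/simpleLoop.wav', 'simpleLoop.wav')
```

Because the helper returns early when the file exists, rerunning a notebook does not download the same audio again.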

To check that the file downloaded successfully, list the files in the working directory:



In [2]:

%ls *.wav

default.wav  out.wav  simpleLoop.wav

Visit https://ccrma.stanford.edu/workshops/mir2014/audio/ for more audio files.

If you only want to listen to, and not manipulate, a remote audio file, use IPython.display.Audio instead. (See Playing Audio.)

Reading Audio

`essentia.standard.MonoLoader`

MonoLoader reads an audio file and, if necessary, downmixes it to a single channel, which is the format we will use most often during this workshop. MonoLoader also resamples the audio to a sampling frequency of your choice (default: 44100 Hz):



In [3]:

from essentia.standard import MonoLoader
# The first call configures the loader; the second call reads the file.
audio = MonoLoader(filename='simpleLoop.wav')()
audio.shape

Out[3]:

(132300,)
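At the default sampling rate, the loaded signal lasts 132300 / 44100 = 3.0 seconds. As a cross-check that does not depend on Essentia, Python's standard wave module can read the header of a WAV file. The helper name wav_info is hypothetical, and this sketch assumes the file is uncompressed PCM, which wave requires. Because MonoLoader resamples, the rate stored in the file header may differ from 44100 Hz.

```python
import contextlib
import wave


def wav_info(path):
    """Return (channels, sample_rate, n_frames, duration_seconds) from a PCM WAV header."""
    with contextlib.closing(wave.open(path, 'rb')) as w:
        channels = w.getnchannels()
        rate = w.getframerate()
        frames = w.getnframes()
    return channels, rate, frames, frames / float(rate)


# Duration of the signal loaded above: number of samples / sampling rate.
duration = 132300 / 44100.0
```

Calling wav_info('simpleLoop.wav') would report the file's native channel count and sampling rate before any downmixing or resampling.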



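In [4]:

# Assumes pylab mode (e.g. %pylab inline), which provides arange, plot, and xlabel.
N = len(audio)
t = arange(0, N)/44100.0  # time of each sample in seconds (44100 Hz is the MonoLoader default)
plot(t, audio)
xlabel('Time (seconds)')

Out[4]:

<matplotlib.text.Text at 0x40f4bd0>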

For more control over the audio acquisition process, you may want to use AudioLoader instead.

`librosa.load`



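In [5]:

import librosa
# By default, librosa.load resamples to 22050 Hz and downmixes to mono; pass sr=None to keep the native rate.
x, fs = librosa.load('simpleLoop.wav')
print(x.shape)
print(fs)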

Playing Audio

`IPython.display.Audio`

Using IPython.display.Audio, you can play a local audio file or a remote audio file:



In [6]:

from IPython.display import Audio
Audio('https://ccrma.stanford.edu/workshops/mir2014/audio/CongaGroove-mono.wav') # remote WAV file

Out[6]:

In [7]:

Audio('simpleLoop.wav')  # local WAV file

Out[7]:

Audio can also accept a NumPy array:



In [8]:

fs = 44100 # sampling frequency
T = 1.5    # seconds
t = numpy.linspace(0, T, int(T*fs), endpoint=False) # time variable
x = numpy.sin(2*numpy.pi*440*t)                # pure sine wave at 440 Hz
Audio(x, rate=fs)

Out[8]:

SoX

To play audio from the command line, we recommend SoX (included in the stanford-mir Vagrant box).

$ play simpleLoop.wav

Visualizing Audio

plot is the simplest way to plot time-domain signals:



In [9]:

T = 0.001    # seconds
fs = 44100   # sampling frequency
t = numpy.linspace(0, T, int(T*fs), endpoint=False) # time variable
x = numpy.sin(2*numpy.pi*3000*t)                    # pure sine wave at 3000 Hz
plot(t, x)
xlabel('Time (seconds)')

Out[9]:

<matplotlib.text.Text at 0x5810cd0>
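To see why this plot looks coarse, it helps to count samples. The following sketch repeats the arithmetic in plain Python using the values from the cell above; the variable names n_samples and samples_per_period are introduced here for illustration. One millisecond at 44100 Hz holds only 44 samples, and a 3000 Hz sine has roughly 14.7 samples per period.

```python
import math

fs = 44100      # sampling frequency (Hz)
T = 0.001       # duration (seconds)
f0 = 3000       # sine frequency (Hz)

n_samples = int(T * fs)              # T*fs = 44.1, truncated to 44 samples
samples_per_period = fs / float(f0)  # samples in one cycle of the sine

# A sine sampled exactly 1/fs apart: x[n] = sin(2*pi*f0*n/fs)
x = [math.sin(2 * math.pi * f0 * n / float(fs)) for n in range(n_samples)]
```

Note that T*fs is not an integer here, so linspace(0, T, int(T*fs), endpoint=False) spaces its 44 points T/44 apart, which differs slightly from the true sampling period 1/fs used in this sketch.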

specgram is a Matplotlib tool for computing and displaying spectrograms.



In [10]:

S, freqs, bins, im = specgram(x, NFFT=1024, Fs=fs, noverlap=512)
xlabel('Time')
ylabel('Frequency')

Out[10]:

<matplotlib.text.Text at 0x5f4f4d0>
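The NFFT and noverlap arguments determine the resolution of the spectrogram. As a rough sketch, assuming conventional framing with no padding at the signal edges, the hop size, frequency resolution, and output dimensions follow from simple arithmetic. The helper name specgram_dims is hypothetical and is not part of Matplotlib.

```python
def specgram_dims(n_samples, fs, nfft=1024, noverlap=512):
    """Return (hop, freq_resolution_hz, n_freq_bins, n_frames) for a real-valued signal."""
    hop = nfft - noverlap                # samples between the starts of adjacent frames
    freq_resolution = fs / float(nfft)   # spacing between frequency bins (Hz)
    n_freq_bins = nfft // 2 + 1          # one-sided spectrum from 0 Hz to fs/2
    # Count frames that fit entirely inside the signal. In this sketch, a signal
    # shorter than nfft is treated as a single (zero-padded) frame.
    n_frames = max(1, 1 + (n_samples - nfft) // hop)
    return hop, freq_resolution, n_freq_bins, n_frames


# Example: a 1.5-second signal at 44100 Hz (66150 samples).
hop, df, n_bins, n_frames = specgram_dims(n_samples=66150, fs=44100)
```

With these settings, each frame advances by 512 samples and the frequency bins are about 43 Hz apart. A larger NFFT gives finer frequency resolution at the cost of coarser time resolution. Note that the 1 ms signal x from the previous cell has only 44 samples, fewer than NFFT, so a longer signal gives a more informative spectrogram.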

Writing Audio

`essentia.standard.MonoWriter`

MonoWriter can write a NumPy array to a WAV file. Note: the array must have type int, single, or complex64.



In [11]:

from essentia.standard import MonoWriter
noise = 0.1*randn(44100)
MonoWriter(filename='noise1.wav')(single(noise))
%ls *.wav

default.wav  noise1.wav  out.wav  simpleLoop.wav

`librosa.output.write_wav`

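librosa.output.write_wav also saves a NumPy array to a WAV file and is slightly easier to use. (The librosa.output module was removed in librosa 0.8; with newer versions, use soundfile.write instead.)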



In [12]:

import librosa
noise = 0.1*randn(44100)
librosa.output.write_wav('noise2.wav', noise, 44100)
%ls *.wav

default.wav  noise1.wav  noise2.wav  out.wav  simpleLoop.wav
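If neither Essentia nor librosa is available, Python's standard wave module can also write 16-bit PCM WAV files. The following is a minimal sketch, not a replacement for the functions above. The helper name write_wav16 is hypothetical, and the code assumes a mono signal of floats in the range [-1, 1], which it clips and scales to 16-bit integers.

```python
import contextlib
import math
import struct
import wave


def write_wav16(filename, samples, fs):
    """Write a mono sequence of floats in [-1, 1] as a 16-bit PCM WAV file."""
    # Clip to [-1, 1], then scale to the signed 16-bit range.
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    with contextlib.closing(wave.open(filename, 'wb')) as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 2 bytes = 16 bits per sample
        w.setframerate(fs)
        w.writeframes(struct.pack('<%dh' % len(ints), *ints))


# Example signal: one second of a 440 Hz sine wave at half amplitude.
fs = 44100
tone = [0.5 * math.sin(2 * math.pi * 440 * n / float(fs)) for n in range(fs)]
```

For example, write_wav16('tone.wav', tone, fs) would write the one-second tone to the working directory, where it can be played with Audio('tone.wav').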