The Simulated 13dB Miracle


The audio component's error levels are measured and compared to the masking threshold from the psychoacoustic model, producing a Noise-to-Mask Ratio (NMR) in each critical band. The NMR reveals how close to audibility the component's distortions are.
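
As a rough illustration, a per-band NMR can be expressed as the ratio (in dB) of the error energy to the masking threshold in each critical band. The sketch below is a minimal illustration, not the full ISO psychoacoustic model; the band energies and thresholds are made-up numbers.

import numpy as np

def nmr_db(noise_energy, mask_threshold):
    # Noise-to-Mask Ratio per critical band, in dB.
    # NMR > 0 dB: the error exceeds the mask and may be audible.
    # NMR < 0 dB: the error stays below the mask and should be inaudible.
    noise_energy = np.asarray(noise_energy, dtype=float)
    mask_threshold = np.asarray(mask_threshold, dtype=float)
    return 10.0 * np.log10(noise_energy / mask_threshold)

# Made-up energies for three critical bands: the last band exceeds its mask by ~13 dB
print(nmr_db([1e-6, 4e-5, 2e-4], [1e-5, 1e-5, 1e-5]))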

The NMR technique can be illustrated by the so-called "13dB Miracle." A short selection of linearly coded music was played as a reference, then played again with its signal-to-noise ratio degraded to only 13.6dB. The sound was grossly distorted. The music was played a third time, again with a signal-to-noise ratio of 13.6dB, but this time the noise was almost inaudible.

The difference in sound quality between these signals was vast. In the good-sounding signal, the noise was spectrally distributed so as to be masked by the correctly coded music signal. The poor-sounding signal had exactly the same amount of noise, but the noise was flat in amplitude and static in level.

Simulated 13dB Miracle

To get a feel for the 13dB miracle, let us run a simulation.

The plan is as follows.

  • Take a mono .wav file, encode it with MP3 at 64 kb/s, decode it back to .wav, and denote the original and coded files as A and B, respectively
  • Delay-align A and B (a sketch of this step follows the list)
  • Compute the difference signal (A - B)
  • Generate white noise of the same power as (A - B)
  • Add that white noise to A => C
  • A simulated 13dB miracle: B vs. C!
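
Before the main script, here is a hedged sketch of the delay-alignment step using cross-correlation. The un-aligned input filenames below are assumptions; MP3 encoding and decoding typically insert a constant delay, which this removes before the two signals are trimmed to a common length.

import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate

# Assumed filenames for the un-aligned original and decoded-MP3 signals
rate_a, a = wavfile.read('./samples/saxophone-mono.wav')
rate_b, b = wavfile.read('./samples/saxophone-mono-64kbps.wav')
a = a.astype(np.float64)
b = b.astype(np.float64)

# Lag that maximizes the cross-correlation; positive lag means B lags A
lag = int(np.argmax(correlate(b, a, mode='full'))) - (len(a) - 1)

# Drop the leading delay and trim both signals to the same length
if lag >= 0:
    a_al, b_al = a, b[lag:]
else:
    a_al, b_al = a[-lag:], b
n = min(len(a_al), len(b_al))
a_al, b_al = a_al[:n], b_al[:n]

# Save the aligned versions used in the rest of this section
wavfile.write('./samples/saxophone-mono-align.wav', rate_a, a_al.astype(np.int16))
wavfile.write('./samples/saxophone-mono-64kbps-align.wav', rate_b, b_al.astype(np.int16))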

Let's examine the code that would carry out this plan.


In [10]:
import numpy as np
from scipy.io import wavfile
from IPython.display import Audio

# Load the original signal (A) and the coded signal (B), already delay-aligned
rate1, dat1 = wavfile.read('./samples/saxophone-mono-align.wav')
rate2, dat2 = wavfile.read('./samples/saxophone-mono-64kbps-align.wav')

# Trim to a common length and work in floating point to avoid int16 overflow
n = min(len(dat1), len(dat2))
a = dat1[:n].astype(np.float64)
b = dat2[:n].astype(np.float64)

# Measure the difference between the original signal and the coded signal
diff = a - b
noisepower = np.mean(diff ** 2)

# Generate white noise with the same power as the difference signal
noise = np.random.normal(0, np.sqrt(noisepower), n)

# Generate signal C = A + white noise, clipped back to the int16 range
dat3 = np.clip(a + noise, -32768, 32767).astype(np.int16)

# Write signals B and C to .wav files
wavfile.write("./samples/signal-b.wav", rate2, dat2[:n])
wavfile.write("./samples/signal-c.wav", rate2, dat3)

Let us hear the miracle!

By comparing Signals B and C, we can hear the effect of noise shaping: both contain the same noise power, but in Signal B the noise is spectrally shaped to hide under the music and is barely audible, while in Signal C the flat white noise is plainly audible.


In [11]:
Audio(open("./samples/signal-b.wav").read())


Out[11]:

In [12]:
Audio(open("./samples/signal-c.wav").read())


Out[12]:

Exercise: What made the miracle happen?

1) In the simulated example, we encoded the sound at 64 kb/s. What would happen if we encoded it at 32 kb/s or 128 kb/s instead? Would the effect of noise shaping become more pronounced at higher bitrates? (A sketch for setting up this bitrate sweep follows these questions.)

2) In the simulated example, we used the simple psychoacoustic model provided in the MP3 encoder. Would a more sophisticated psychoacoustic model make the effect of noise shaping more pronounced?
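
For question 1, the sweep over bitrates could be set up as sketched below. This assumes the LAME command-line encoder is installed; the filenames are assumptions, and the alignment and B-vs-C steps from earlier would then be repeated for each bitrate.

import subprocess

src = './samples/saxophone-mono.wav'
for kbps in (32, 64, 128):
    mp3 = './samples/saxophone-mono-%dkbps.mp3' % kbps
    dec = './samples/saxophone-mono-%dkbps.wav' % kbps
    subprocess.run(['lame', '-b', str(kbps), src, mp3], check=True)  # encode to MP3
    subprocess.run(['lame', '--decode', mp3, dec], check=True)       # decode back to .wav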