This algorithm implements the “true-peak” level meter as descripted in the second annex of the ITU-R BS.1770-2[1] or the ITU-R BS.1770-4[2] (default).
Note: the parameters 'blockDC' and 'emphatise' work only when 'version' is set to 2.
References:
[1] Series, B. S. (2011). Recommendation ITU-R BS.1770-2. Algorithms to
measure audio programme loudness and true-peak audio level,
https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-2-201103-S!!PDF-E.
pdfe
[2] Series, B. S. (2011). Recommendation ITU-R BS.1770-4. Algorithms
to measure audio programme loudness and true-peak audio level,
https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-4-201510-I!!PDF-E.
pdf
In [1]:
import essentia.standard as es
import numpy as np
import matplotlib
matplotlib.use('nbagg')
import matplotlib.pyplot as plt
import ipywidgets as wg
from IPython.display import Audio
from essentia import array as esarr
plt.rcParams["figure.figsize"] =(9, 5)
The following widget demonstrates two intersample detection techniques:
The accuracy of both methods can be assessed in real-time by shifting the sampling points in a Sinc function and evaluating the error produced by both systems.
In [5]:
# Parameters
duration = 10 # s
fs = 1 # hz
k = 1. # amplitude
oversamplingFactor = 4 # factor of oversampling for the real signal
nSamples = fs * duration
time = np.arange(-nSamples/2, nSamples/2,
2 ** -oversamplingFactor, dtype='float')
samplingPoints = time[::2 ** oversamplingFactor]
def shifted_sinc(x, k, offset):
xShifted = x - offset
y = np.zeros(len(xShifted))
for idx, i in enumerate(xShifted):
if not i:
y[idx] = k
else:
y[idx] = (k * np.sin(np.pi * i) / (np.pi * i))
return y
def resampleStrategy(y, fs, quality=0, oversampling=4):
yResample = es.Resample(inputSampleRate=fs,
outputSampleRate=fs*oversampling,
quality=quality)(y.astype(np.float32))
tResample = np.arange(np.min(samplingPoints), np.max(samplingPoints)
+ 1, 1. / (fs * oversampling))
tResample = tResample[:len(yResample)]
# getting the stimated peaks
yResMax = np.max(yResample)
tResMax = tResample[np.argmax(yResample)]
return yResample, tResample, yResMax, tResMax
def parabolicInterpolation(y, threshold=.6):
# todo plot the parabol maybe
positions, amplitudes = es.PeakDetection(threshold=threshold)\
(y.astype(np.float32))
pos = int(positions[0] * (len(y-1)))
a = y[pos - 1]
b = y[pos]
c = y[pos + 1]
tIntMax = samplingPoints[pos] + (a - c) / (2 * (a - 2 * b + c))
yIntMax = b - ((a - b) ** 2) / (8 * (a - 2 * b + c))
return tIntMax, yIntMax
def process():
## Processing
# "real" sinc
yReal = shifted_sinc(time, k, offset.value)
# sampled sinc
y = shifted_sinc(samplingPoints, k, offset.value)
# Resample strategy
yResample, tResample, yResMax, tResMax = \
resampleStrategy(y, fs, quality=0, oversampling=4)
# Parabolic Interpolation extrategy
tIntMax, yIntMax = parabolicInterpolation(y)
## Plotting
ax.clear()
plt.title('Interpeak detection estrategies')
ax.grid(True)
ax.grid(xdata=samplingPoints)
ax.plot(time, yReal, label='real signal')
yRealMax = np.max(yReal)
sampledLabel = 'sampled signal. Error:{:.3f}'\
.format(np.abs(np.max(y) - yRealMax))
ax.plot(samplingPoints, y, label=sampledLabel, ls='-.',
color='r', marker='x', markersize=6, alpha=.7)
ax.plot(tResample, yResample, ls='-.',
color='y', marker='x', alpha=.7)
resMaxLabel = 'Resample Peak. Error:{:.3f}'\
.format(np.abs(yResMax - yRealMax))
ax.plot(tResMax, yResMax, label= resMaxLabel,
color='y', marker = 'x', markersize=12)
intMaxLabel = 'Interpolation Peak. Error:{:.3f}'\
.format(np.abs(yIntMax - yRealMax))
ax.plot(tIntMax, yIntMax, label= intMaxLabel,
marker = 'x', markersize=12)
fig.legend()
fig.show()
In [6]:
# matplotlib.use('TkAgg')
offset = wg.FloatSlider()
offset.max = 1
offset.min = -1
offset.step = .1
display(offset)
fig, ax = plt.subplots()
process()
def on_value_change(change):
process()
offset.observe(on_value_change, names='value')
As it can be seen from the widget, the oversampling strategy generates a smaller error in most of the cases.
The ITU-R BS.1770 recommentation proposess the following signal chain based on the oversampling strategy:
-12.04dB --> x4 oversample --> LowPass --> abs() --> 20 * log10() --> +12.04dB
In our implementation, the gain control is suppressed from the chain as in not required when working with float point values, and the result is returned in natural units as it can be converted to dB as a postprocessing step. Here we can see an example.
In [4]:
fs = 44100.
eps = np.finfo(np.float32).eps
audio_dir = '../../audio/'
audio = es.MonoLoader(filename='{}/{}'.format(audio_dir,
'recorded/distorted.wav'),
sampleRate=fs)()
times = np.linspace(0, len(audio) / fs, len(audio))
peakLocations, output = es.TruePeakDetector(version=2)(audio)
oversampledtimes = np.linspace(0, len(output) / (fs*4), len(output))
random_indexes = [1, 300, 1000, 3000]
figu, axes = plt.subplots(len(random_indexes))
plt.subplots_adjust(hspace=.9)
for idx, ridx in enumerate(random_indexes):
l0 = axes[idx].axhline(0, color='r', alpha=.7, ls = '--')
l1 = axes[idx].plot(times, 20 * np.log10(np.abs(audio + eps)))
l2 = axes[idx].plot(oversampledtimes, 20 * np.log10(output + eps), alpha=.8)
axes[idx].set_xlim([peakLocations[ridx] / fs - .0002, peakLocations[ridx] / fs + .0002])
axes[idx].set_ylim([-.15, 0.15])
axes[idx].set_title('Clipping peak located at {:.2f}s'.format(peakLocations[ridx] / (fs*4)))
axes[idx].set_ylabel('dB')
figu.legend([l0, l1[-1], l2[-1]], ['Dynamic range limit', 'Original signal', 'Resampled signal'])
plt.show()
The resampled signal is closer to the analog signal once it goes through the DAC. The part of this signal peaking above 1 may produce additional distortion if the analog amplifier doesn't have enough headroom to handle this additional amplitude. As this type of distortion depends entirely on the design of the playback hardware, it is always recommended to produce music so the true peak positions never overcome the dynamic range [-1, 1].
this is an explanation of the most relevant parameters of the algorithm
oversamplingFactor. How many times to resample the signal
version. Version of the recommendation to use. Read the references to understand the differences
blockDC and emphatise. Optional post-processing on the 2nd version of the recommendation. blockDC is a very low-frequency high pass filter and emphatise is a high-shelf filter.
In [ ]: