MFCC feature extraction using Yaafe in pyannote.features
In [1]:
from pyannote.features.audio.yaafe import YaafeMFCC
wav = '/Volumes/data/tvseries/TheBigBangTheory/wav/TheBigBangTheory.Season01.Episode01.en.wav'
mfcc = YaafeMFCC().extract(wav)
Ground-truth annotation using pyannote.core data structures
In [2]:
from pyannote.core import Annotation, Segment
In [3]:
reference = Annotation()
reference[Segment(325.000,329.110)] = 'sheldon'
reference[Segment(330.430, 331.770)] = 'penny'
reference[Segment(332.540, 333.680)] = 'leonard'
reference[Segment(334.110, 336.270)] = 'penny'
reference[Segment(336.380, 336.580)] = 'leonard'
reference[Segment(337.050, 339.980)] = 'penny'
reference[Segment(340.550, 342.190)] = 'sheldon'
reference
Out[3]:
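To make the structure of this annotation concrete, here is a minimal plain-Python sketch that stores the same segments in a dictionary and sums the speech time per speaker. It does not use pyannote.core; the names `reference_segments` and `speech_time_per_label` are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Plain-Python stand-in for the annotation above:
# (start, end) in seconds -> speaker label.
reference_segments = {
    (325.000, 329.110): 'sheldon',
    (330.430, 331.770): 'penny',
    (332.540, 333.680): 'leonard',
    (334.110, 336.270): 'penny',
    (336.380, 336.580): 'leonard',
    (337.050, 339.980): 'penny',
    (340.550, 342.190): 'sheldon',
}


def speech_time_per_label(segments):
    """Sum segment durations (in seconds) for each label."""
    totals = defaultdict(float)
    for (start, end), label in segments.items():
        totals[label] += end - start
    return dict(totals)


totals = speech_time_per_label(reference_segments)
for label in sorted(totals):
    print("{0}: {1:.2f} s".format(label, totals[label]))
```

This excerpt is short and unbalanced across speakers, which is worth keeping in mind when reading the clustering results further down.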
In this example, we assume that the exact segmentation is known; only the speaker labels are unknown.
In [4]:
input_segmentation = reference.anonymize_tracks()
input_segmentation
Out[4]:
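Conceptually, this step keeps the segment boundaries and discards the speaker identities. The plain-Python sketch below assumes that anonymizing means giving every segment its own placeholder label; the function `anonymize` and the `track{i}` naming are hypothetical and may differ from what pyannote.core actually produces.

```python
def anonymize(segments):
    """Return a copy where every segment gets its own placeholder label.

    `segments` maps (start, end) tuples to speaker labels.
    """
    ordered = sorted(segments)  # chronological order
    return {segment: 'track{0}'.format(i) for i, segment in enumerate(ordered)}


labelled = {
    (325.000, 329.110): 'sheldon',
    (330.430, 331.770): 'penny',
    (332.540, 333.680): 'leonard',
    (334.110, 336.270): 'penny',
}

anonymous = anonymize(labelled)
for segment in sorted(anonymous):
    print(segment, anonymous[segment])
```

After this step there are as many labels as segments, which is the usual starting point for agglomerative clustering: one cluster per segment.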
Let us initialize the BIC clustering algorithm from pyannote.algorithms.
In [5]:
from pyannote.algorithms.clustering.bic import BICClustering
# covariance_type can be 'full' or 'diag'
bicClustering = BICClustering(covariance_type='full', penalty_coef=1.2)
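To give an idea of what the two parameters control, here is a minimal sketch of the textbook ΔBIC criterion between two clusters, each modelled by a single Gaussian with diagonal covariance. This is an assumption about the general technique, not a copy of the pyannote.algorithms implementation: the helper names `diag_log_det` and `delta_bic` are hypothetical, and the sign convention may differ from the library's.

```python
import math


def diag_log_det(frames):
    """Log-determinant of the diagonal maximum-likelihood covariance.

    `frames` is a list of equal-length feature vectors.
    """
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(var)
    return total


def delta_bic(x, y, penalty_coef=1.0):
    """Positive values favor two separate Gaussians, negative values favor merging."""
    n_x, n_y = len(x), len(y)
    n = n_x + n_y
    dim = len(x[0])
    gain = 0.5 * (n * diag_log_det(x + y)
                  - n_x * diag_log_det(x)
                  - n_y * diag_log_det(y))
    # A diagonal Gaussian has `dim` means and `dim` variances.
    penalty = 0.5 * (2 * dim) * math.log(n)
    return gain - penalty_coef * penalty


# Toy 2-dimensional "features": a and b come from the same region, c is far away.
a = [[0.0, 1.0], [0.2, 1.1], [-0.1, 0.9], [0.1, 1.2]]
b = [[0.1, 1.2], [-0.1, 0.9], [0.2, 1.1], [0.0, 1.0]]
c = [[5.0, -3.0], [5.2, -3.1], [4.9, -2.9], [5.1, -3.2]]

same_speaker = delta_bic(a, b, penalty_coef=1.2)
different_speaker = delta_bic(a, c, penalty_coef=1.2)
print(same_speaker < 0, different_speaker > 0)
```

In this formulation, a larger penalty coefficient makes merging more likely, and a full covariance matrix would have more free parameters than a diagonal one, hence a larger penalty term.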
We now apply BIC clustering.
In [6]:
hypothesis = bicClustering(input_segmentation, feature=mfcc)
hypothesis
Out[6]:
We can also analyse the behavior of BIC clustering, step by step.
In [7]:
bicClustering.initialize(input_segmentation, feature=mfcc)
In [8]:
# internal similarity matrix
bicClustering.matrix
Out[8]:
In [9]:
iterations = bicClustering.iterate(feature=mfcc)
In [10]:
current_hypothesis = next(iterations)
current_hypothesis
Out[10]:
In [11]:
bicClustering.matrix
Out[11]:
In [12]:
current_hypothesis = next(iterations)
current_hypothesis
Out[12]:
In [13]:
bicClustering.matrix
Out[13]:
In [14]:
try:
    next(iterations)
except StopIteration:
    print("Reached stopping criterion.")
In [15]:
hypothesis = bicClustering.finalize(feature=mfcc)
hypothesis
Out[15]:
In [16]:
reference
Out[16]:
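The step-by-step interface used above (one merge per call to `next`, then an exception once nothing more should be merged) can be mimicked with an ordinary Python generator. The sketch below is a toy: it clusters 1-D points and stops on a distance threshold between cluster means instead of a BIC criterion. The function `agglomerate` and its `max_distance` parameter are hypothetical.

```python
def agglomerate(points, max_distance):
    """Yield the cluster list after each merge.

    The generator ends (so next() raises StopIteration) when the two
    closest clusters are further apart than `max_distance`.
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the closest means.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                mean_i = sum(clusters[i]) / len(clusters[i])
                mean_j = sum(clusters[j]) / len(clusters[j])
                distance = abs(mean_i - mean_j)
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        distance, i, j = best
        if distance > max_distance:
            return  # stopping criterion reached
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        yield [list(c) for c in clusters]


steps = agglomerate([1.0, 1.2, 5.0, 5.1, 9.0], max_distance=1.0)
history = list(steps)
for clusters in history:
    print(clusters)
```

Calling `next(steps)` instead of `list(steps)` would return one intermediate clustering at a time, which is the pattern followed in the cells above.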
Let's evaluate the result numerically using the metrics available in pyannote.metrics.
In [17]:
from pyannote.metrics.diarization import DiarizationErrorRate, DiarizationPurity, DiarizationCoverage
der = DiarizationErrorRate()
purity = DiarizationPurity()
coverage = DiarizationCoverage()
In [18]:
p = purity(reference, hypothesis)
c = coverage(reference, hypothesis)
d = der(reference, hypothesis)
print("Purity {p:.1f}% / Coverage {c:.1f}% / Diarization error rate {d:.1f}%".format(p=100*p, c=100*c, d=100*d))
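To help interpret purity and coverage, here is a minimal plain-Python sketch of how the two are commonly defined. It assumes no overlapping speech and is not the pyannote.metrics implementation; the names `toy_purity`, `toy_coverage`, `toy_reference` and `toy_hypothesis` are hypothetical.

```python
from collections import defaultdict


def overlap(a, b):
    """Duration (in seconds) shared by two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def toy_purity(reference, hypothesis):
    """Fraction of hypothesis time explained by each cluster's dominant reference label.

    Both arguments are lists of (start, end, label) tuples.
    """
    shared = defaultdict(lambda: defaultdict(float))
    total = 0.0
    for h_start, h_end, cluster in hypothesis:
        total += h_end - h_start
        for r_start, r_end, label in reference:
            shared[cluster][label] += overlap((h_start, h_end), (r_start, r_end))
    return sum(max(by_label.values()) for by_label in shared.values()) / total


def toy_coverage(reference, hypothesis):
    """Coverage is purity with the roles of reference and hypothesis swapped."""
    return toy_purity(hypothesis, reference)


# Speaker A talks twice; the hypothesis puts each segment in its own cluster.
toy_reference = [(0.0, 10.0, 'A'), (10.0, 20.0, 'B'), (20.0, 30.0, 'A')]
toy_hypothesis = [(0.0, 10.0, 'c1'), (10.0, 20.0, 'c2'), (20.0, 30.0, 'c3')]

print("purity   {0:.3f}".format(toy_purity(toy_reference, toy_hypothesis)))
print("coverage {0:.3f}".format(toy_coverage(toy_reference, toy_hypothesis)))
```

In this toy case every cluster contains a single speaker, so purity is perfect, but speaker A is split over two clusters, which lowers coverage. Stopping the clustering too early tends to produce this pattern, while merging too much has the opposite effect.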