Note that this excerpt contains only the raw code - the book is rich with additional explanations and illustrations. If you find this content useful, please consider supporting the work by buying the book!
Although OpenCV does not provide an implementation of agglomerative hierarchical clustering, it is a popular algorithm that should, by all means, belong to our machine learning repertoire.
We start out by generating 10 random data points, just like in the previous figure:
In [1]:
from sklearn.datasets import make_blobs
X, y = make_blobs(n_samples=10, random_state=100)
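As a quick sanity check (an aside that is not part of the original code), we can verify the shape of the returned arrays: X contains the 2D coordinates of the 10 data points, whereas y holds the ground-truth blob labels, which we deliberately ignore during clustering:
print(X.shape)  # (10, 2): 10 data points with 2 features each
print(y.shape)  # (10,): ground-truth labels, unused in the clustering step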
Using scikit-learn's familiar estimator API, we import the AgglomerativeClustering algorithm from the cluster module and specify the desired number of clusters:
In [2]:
from sklearn import cluster
agg = cluster.AgglomerativeClustering(n_clusters=3)
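As a side note (our own addition, not covered in the excerpt), AgglomerativeClustering also accepts a linkage parameter that controls how the distance between two clusters is measured when deciding which ones to merge; scikit-learn defaults to 'ward', but other criteria can be selected in the same way. A minimal sketch of such a variant, which we will not use further:
agg_complete = cluster.AgglomerativeClustering(n_clusters=3, linkage='complete')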
Fitting the model to the data and retrieving the resulting cluster labels is done in a single step via the fit_predict method:
In [3]:
labels = agg.fit_predict(X)
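With only 10 data points, it is instructive to inspect the result directly (this check is our own addition): labels is an array of 10 integers between 0 and 2, one cluster index per data point, and the same values are also stored in the fitted estimator's labels_ attribute:
print(labels)       # one cluster index (0, 1, or 2) per data point
print(agg.labels_)  # identical to the array returned by fit_predict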
We can generate a scatter plot where every data point is colored according to the predicted label:
In [4]:
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
plt.figure(figsize=(10, 6))
plt.scatter(X[:, 0], X[:, 1], c=labels, s=100)
Out[4]:
[Figure: scatter plot of the 10 data points, colored according to their predicted cluster labels]
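If you run the code outside of a Jupyter notebook (an assumption on our part; the excerpt relies on the inline plotting backend), you would additionally need to open the figure window explicitly:
plt.show()  # only needed outside the notebook's inline plotting mode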
That's it! This marks the end of another wonderful adventure.