K-means is a clustering algorithm that places K cluster centroids in the feature space and, using an appropriate distance function, iteratively assigns each example to the closest centroid and then moves each centroid to the mean of the examples assigned to it.
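The two alternating steps can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the scikit-learn implementation: the function name `kmeans_sketch`, the random initialization, the fixed number of iterations and the Euclidean distance are all assumptions made for the example.

```python
import numpy as np

def kmeans_sketch(X, k, n_iter=10, seed=0):
    """Minimal K-means (Lloyd's algorithm) with Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct examples chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: index of the closest centroid for each example.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned examples.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated groups of 2-D points (toy data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids, labels = kmeans_sketch(X, k=2)
```

On this toy data the two groups end up in different clusters, with centroids at the mean of each group.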
In the following example we will make use of K-means clustering to reduce the number of colors contained in an image stored using 24-bit RGB encoding.
The RGB color model is an additive color model in which red, green and blue light are added together in various ways to reproduce a broad array of colors. In a 24-bit encoding, each pixel is represented as three 8-bit unsigned integers (ranging from 0 to 255) that specify the red, green and blue intensity values, resulting in a total of 256*256*256=16,777,216 possible colors.
To compress the image, we will reduce this number to 16, give each of the 16 colors an index, and store for each pixel only the index of its color. This significantly decreases the space occupied by the image, at the cost of some computational effort.
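For a 128x128 image:
original: 128 x 128 px * 24 bits/px = 393,216 bits
compressed: 16 colors * 24 bits/color + 128 x 128 px * 4 bits/px = 65,920 bits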
Note that we won't implement the K-means algorithm directly, since we are primarily interested in showing its application in a common scenario; instead, we'll delegate it to the scikit-learn library.
In [165]:
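# scipy.misc.imread was removed in SciPy 1.2, so the image is loaded with Pillow
import numpy as np
from PIL import Image
pic = np.array(Image.open('media/irobot.png').convert('RGB'))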
In [166]:
import matplotlib.pyplot as plt
%matplotlib inline
plt.imshow(pic)
Out[166]:
The image is stored in a 3-dimensional matrix, where the first and second dimensions represent the pixel location on the 2-dimensional plane and the third dimension holds the RGB intensities:
In [167]:
pic.shape
Out[167]:
In [168]:
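# Flatten the image into a (number of pixels, 3) matrix: one RGB row per pixel
w = pic.shape[0]  # number of rows
h = pic.shape[1]  # number of columns
X = pic.reshape((w*h,3))
X.shape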
Out[168]:
In [169]:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=16)
kmeans.fit(X)
Out[169]:
We can verify that each pixel has been assigned to a cluster:
In [170]:
kmeans.labels_
Out[170]:
In [171]:
import numpy as np
np.unique(kmeans.labels_)
Out[171]:
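A complementary check is to count how many pixels fall into each cluster. The sketch below uses a hypothetical toy array (`toy_labels`) as a stand-in for `kmeans.labels_`, so that it can be run on its own.

```python
import numpy as np

# Hypothetical stand-in for kmeans.labels_: one cluster index per pixel.
toy_labels = np.array([0, 2, 2, 1, 0, 2])
n_clusters = 3

# counts[j] is the number of pixels assigned to cluster j.
counts = np.bincount(toy_labels, minlength=n_clusters)
print(counts.tolist())  # [2, 1, 3]
```

The counts always sum to the number of pixels, since each pixel belongs to exactly one cluster.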
We can also inspect the cluster centroids:
In [172]:
kmeans.cluster_centers_
Out[172]:
Note that cluster centroids are computed as the mean of the features, so they are generally non-integer values, which cannot be represented in a 24-bit encoding of the colors (three 8-bit unsigned integers ranging from 0 to 255). We round them down with a floor operation and cast them to 8-bit unsigned integers, so that matplotlib interprets them as RGB intensities in the 0-255 range:
In [173]:
import numpy as np
plt.imshow(np.floor(kmeans.cluster_centers_.reshape((1,16,3))).astype(np.uint8))
Out[173]:
In [176]:
labels = kmeans.labels_
# Floor the centroids and cast to uint8 so that they are valid 8-bit RGB intensities
clusters = np.floor(kmeans.cluster_centers_).astype(np.uint8)
The data contained in clusters and labels define the compressed image and should be stored in a proper format in order to actually realize the data compression:
clusters: 16 clusters * 24 bits/cluster
labels: (width x height) px * 4 bits/px
To reconstruct the image, we assign the RGB values of the cluster centroids to the pixels and reshape the matrix to its original form:
In [177]:
# Assigning RGB to clusters and reshaping
pic_recovered = clusters[labels,:].reshape((w,h,3))
In [181]:
plt.imshow(pic_recovered)
Out[181]:
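The reconstruction relies on NumPy integer-array indexing: indexing the palette with the label array returns one palette row per label. A minimal sketch on a hypothetical 2x2 image with a 3-color palette (`toy_clusters` and `toy_labels` are made-up names and values) shows the mechanism:

```python
import numpy as np

# Hypothetical palette of 3 colours (one RGB row per cluster).
toy_clusters = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8)
# One palette index per pixel of a 2x2 image, flattened row by row.
toy_labels = np.array([0, 1, 2, 0])

# Indexing the palette with the label array yields one RGB row per pixel;
# reshaping restores the (rows, columns, 3) image layout.
toy_image = toy_clusters[toy_labels].reshape((2, 2, 3))
print(toy_image[1, 0].tolist())  # [0, 0, 255]
```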
At the cost of a deterioration in color quality, the space occupied by the image will be significantly smaller. We can compare the original and the compressed image in the following figure:
In [182]:
fig, axes = plt.subplots(nrows=1, ncols=2,figsize=(10,5))
axes[0].imshow(pic)
axes[1].imshow(pic_recovered)
Out[182]:
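To actually obtain the space saving, the labels have to be stored with 4 bits each rather than as full integers. One possible approach, sketched here under the assumption of an even number of pixels, is to pack two labels into each byte. The helpers `pack_labels` and `unpack_labels` are hypothetical names introduced for this example, and the labels are random stand-ins for `kmeans.labels_`.

```python
import numpy as np

def pack_labels(labels):
    """Pack 4-bit labels (values 0..15) two per byte; assumes an even count."""
    labels = np.asarray(labels, dtype=np.uint8)
    # High nibble holds the even-positioned label, low nibble the odd one.
    return (labels[0::2] << 4) | labels[1::2]

def unpack_labels(packed):
    """Inverse of pack_labels: recover one label per pixel."""
    packed = np.asarray(packed, dtype=np.uint8)
    labels = np.empty(packed.size * 2, dtype=np.uint8)
    labels[0::2] = packed >> 4
    labels[1::2] = packed & 0x0F
    return labels

# Hypothetical labels for a 128x128 image (random stand-in data).
rng = np.random.default_rng(0)
toy_labels = rng.integers(0, 16, size=128 * 128)
packed = pack_labels(toy_labels)

original_bytes = 128 * 128 * 3           # 24 bits per pixel
compressed_bytes = 16 * 3 + packed.size  # palette + packed labels
print(original_bytes, compressed_bytes)  # 49152 8240
```

Unpacking returns the original labels exactly, so the only loss is the one introduced by the color quantization itself.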