Colour Image Quantization using K-means

A simple tutorial on how to reduce the number of distinct colours in an image using Python and OpenCV

Colour quantization is a process that reduces the number of colours in an image while it tries to preserve the quality and the important global information. Images are composed of pixels each of which can be associated with 16,777,216 different colours in the case of RGB colour space, which is probably the most commonly used colour space. Each colour can be represented as a 3d vector, and each vector element has an 8-bit dynamic range, which means 2⁸=256 different values (i.e. 256x256x256 =16,777,216). This kind of representation is often called RGB triplet. The key factor for successful colour quantization is the appropriate selection of the colour palette that sufficiently summarizes the information of the initial image.

Motivation: When should we use Colour Image Quantization?

Colour Image Quantization could be vital in cases we want to enable rendering of images in devices that support only a limited number of colours (limited colour palette) and this usually happens due to memory restraints.

Colour Quantization as a Clustering Problem

One of the most common ways to address the chromatic compression of an image is by using clustering where the features are the colours of the image. K-means is a very popular clustering method of vector quantization and thanks to the scikit-learn library it is really easy to use (as long as you know how it works). Of course, other clustering methods such as SVD, Gaussian Mixture Model, etc. do the work as well.

Colour Spaces

Colour spaces are mathematical models describing the way colours can be represented. The easiest way of visualising them is to think of a box containing all the possible colours that can be produced by mixing the three primary colours of light: red, green and blue [Source]. In figure 1 you can see a diagram that mathematicians have come up with in order to fit three axes (for red, green and blue) to a two-dimensional format:

Figure 1: CIE Chromaticity Diagram [Source]

Adobe RGB, sRGB and CMYK are all gamuts (ranges) and are subsets of L*a*b* gamut defined by the International Commission on Illumination (abbreviated CIE) in 1976 and is intended as a perceptually uniform space, where a given numerical change corresponds to a similar perceived change in colour. I am referring to this because for K-means we need a distance metric to measure the similarity among the features. The RGB colour space which is not perceptually uniform means that two pairs of colours that have the same Euclidean distance might be perceived as not equally different (due to the different wavelengths of each colour and the sensitivity of human eyes on how to perceive each one of them).

Key take away: Before applying K-means convert the image into L*a*b* colour space.

The Code

Below is the code that applies colour quantization using the scikit-learn library (that implements the K-means method) and the OpenCV library for image processing.

You can find the code on my GitHub (Image by Author)
  • Initially, we expect two command-line arguments from the user: (1) the path to the input image, and (2) the number of clusters.
  • We load the image and we convert it into the L*a*b* gamut.
  • We reshape the original image into (width*height, 3) since k-means takes as input a two-dimensional array not a three-dimensional.
  • We initialize our classifier with K clusters that the user decided.

Note: I chose to use MiniBatchKMeans because it is substantially faster than normal KMeans, although the centroids may not be so stable due to the small batches it operates on compared to KMeans that operates on the whole population. It is a good practice to start with MiniBatchKMeans and if the results are not so satisfying you can then try KMeans.

  • clf.fit_predict(image) returns the prediction for each pixel, namely in which cluster it belongs to.
  • quantized = clf.cluster_centers_.astype(“uint8”)[labels] creates the quantized image by reassigning each pixel value from the original image to its corresponding closest center.
  • We reshape both the initial and quantized feature vectors to images and convert them into the RGB gamut.
  • Finally, we save the concatenation (of both original and quantized images) to the disk.

Results

In this paragraph, I will present the results of the above program applied on a poster of my favourite anime Boku no Hero Academia borrowed from the My Hero Academia Wiki 🎆.

K = 2

Figure 2: Our colour palette consists of two colours, black and beige (Image by Author)

Note: The algorithm initially chooses K centroids randomly from the given data and on each step assigns each data point to the closest centroid. When all data points (features) have been assigned to a centroid, each mean of the K clusters is recalculated and the two means are the new two centroids. The procedure is repeated until no update (or at least less than a threshold) of the centroids happens. Consequently, these two colours are NOT the most frequently appeared colours in the initial image (a misconception that could potentially arise) but you can imagine them as the two colours which are the means (i.e. centres) of the two clusters, and imply the minimization of the sum of the distances of the other colours from these two centres respectively.

K = 8

Figure 3: Our colour palette consists of 8 colours — different shades of grey, reddish colours and yellow can be observed (Image by Author)

K = 32

Figure 4: Our colour palette consists of 32 colours and still green misses (Image by Author)

K = 40

Figure 5: Our colour palette consists of 40 colours and we can see that is pretty close to the original image (Image by Author)

To summarize, in this article we saw in a few lines of code how to colour-quantize an image. Figure 5 shows us that for a device that supports only a limited number of colours (let’s say 40 colours) due to memory restraints, this approach could convert the initial image in a way that the quality does not deteriorate significantly and the important information is preserved.