I recently tested the following algorithm for determining the "colorfulness" of an image, by David Hasler and Sabine Süsstrunk, which is implemented in Python here. However, I noticed something interesting about how this algorithm behaves on some test images.
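For reference, here is a minimal sketch of that metric as I understand it from the paper: two opponent channels, rg = R − G and yb = ½(R + G) − B, combined as σ + 0.3·μ. The function name `compute_colorfulness` and the use of NumPy are my own choices here, not necessarily how the linked implementation does it:

```python
import numpy as np

def compute_colorfulness(image):
    """Hasler/Süsstrunk-style colorfulness for an H x W x 3 RGB array (0-255)."""
    R = image[..., 0].astype(float)
    G = image[..., 1].astype(float)
    B = image[..., 2].astype(float)

    rg = R - G                 # red-green opponent channel
    yb = 0.5 * (R + G) - B     # yellow-blue opponent channel

    # Combine the spread and the mean offset of the two opponent channels
    std_root = np.sqrt(np.std(rg) ** 2 + np.std(yb) ** 2)
    mean_root = np.sqrt(np.mean(rg) ** 2 + np.mean(yb) ** 2)
    return std_root + 0.3 * mean_root
```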
The colorfulness metric for a 3x1, 8-bit-per-channel RGB image consisting of one pure red, one pure green, and one pure blue pixel seems to be about as high as the metric gets (feel free to prove me wrong; that's simply what I've observed).
Here's what I mean. This 3x1 image has a colorfulness of 206.52:
while this 4x4 image has a colorfulness of 185.13:
and a pure black/white image has a colorfulness of 0, for comparison. (These values were calculated with each channel's levels stored as integers between 0 and 255, inclusive.)
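To make the extreme cases concrete, here is how I would build them with NumPy and feed them to the sketch above. The exact numbers depend on implementation details in the linked version, so they may not match my figures digit for digit, but the pattern is the same: the red/green/blue strip scores very high and the black/white image scores 0.

```python
# 3x1 strip: one pure red, one pure green, one pure blue pixel
rgb_strip = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# A pure black/white image: rg and yb are zero everywhere, so the score is 0
bw = np.array([[[0, 0, 0], [255, 255, 255]],
               [[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)

print(compute_colorfulness(rgb_strip))  # very large (hundreds)
print(compute_colorfulness(bw))         # 0.0
```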
Intuitively, I would consider the second image to be far more "colorful" than the first, but this is not the result given by that algorithm.
Essentially, I'm looking for some other measurement that corresponds to the variety of colors that appear in an image. However, I'm not sure what that would look like mathematically, especially since images are represented as separate red, green, and blue channels.
One idea might be to simply keep a tally of the number of distinct pixel colors in an image (adding one to the tally every time a pixel of a previously unseen color appears), but this does a poor job when there are many dark (yet technically different) pixels that are virtually indistinguishable, and likewise when there are lots of slightly different pixels of approximately the same color. Yet the algorithm in that paper also seems to break my intuition when tested on an extreme case like the one above.
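For concreteness, the naive tally I have in mind is just this (counting exact RGB triples, which is what makes it oversensitive to imperceptible differences):

```python
def count_distinct_colors(image):
    """Naive variety measure: number of exact (R, G, B) triples in the image."""
    pixels = image.reshape(-1, image.shape[-1])
    # np.unique with axis=0 treats each row (pixel) as one item, so
    # (0, 0, 0) and (1, 0, 0) count as different colors even though
    # they are visually indistinguishable.
    return len(np.unique(pixels, axis=0))
```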
Does anybody know of other metrics that might give a more accurate representation of the variety of colors that appear in an image? Or can you perhaps propose one yourself? I'm open to any and all ideas.