In many examples of facial recognition with OpenCV across the web, I see images being converted to grayscale as part of the "pre-processing" for the facial recognition functionality. What would happen if a color image were used for facial recognition? Why do all examples convert images to grayscale first?

user1431072

2 Answers

Many image processing and computer vision algorithms take grayscale rather than color images as input. One important reason is that converting to grayscale isolates the luminance plane and drops the chrominance planes. Luminance is also more important than chrominance for distinguishing visual features in an image. For instance, finding edges based on both luminance and chrominance requires additional work. Color also rarely helps us identify important features or characteristics of an image, although there may be exceptions.

Grayscale images have only one channel, as opposed to three in a color image (RGB, HSV). The inherent complexity of grayscale images is lower than that of color images, yet you can still obtain features relating to brightness, contrast, edges, shape, contours, textures, and perspective without color.

Processing in grayscale is also faster. If we assume that processing a three-channel color image takes roughly three times as long as processing a grayscale image, then we save processing time by eliminating the color channels we don't need. Essentially, color increases the complexity of the model and in general slows down processing.
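To put a rough number on that assumption, here is a minimal sketch that just counts how many values a single full pass has to touch. The helper name `sample_count` and the 640x480 frame size are made up for illustration, and real timings depend on the algorithm and the library, so treat the factor of three as a rule of thumb rather than a measurement.

```python
def sample_count(height, width, channels):
    """Number of values touched in one full pass over the image."""
    return height * width * channels

# Hypothetical 640x480 frame, chosen only for illustration.
gray_samples = sample_count(480, 640, 1)
color_samples = sample_count(480, 640, 3)

print(gray_samples)                   # 307200
print(color_samples)                  # 921600
print(color_samples // gray_samples)  # 3
```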

nathancy

Most facial recognition algorithms rely on the general intensity distribution in the image rather than the color intensity information of each channel.

Grayscale images provide exactly this information about the general distribution of intensities in an image (high-intensity areas appear white, low-intensity areas appear black). Calculating the grayscale image is simple and needs little computing time: you can calculate this intensity, for example, by averaging the values of all three channels.
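As a rough illustration of that calculation, here is a plain-Python sketch with no OpenCV involved. The function names and the tiny 2x2 image are made up for the example. Next to the simple average it also shows the weighted sum with the ITU-R BT.601 luma weights, which is what OpenCV's `cv2.cvtColor` uses for its RGB/BGR to gray conversions. Note that OpenCV loads images in BGR order, whereas the tuples here are assumed to be (R, G, B).

```python
def mean_gray(r, g, b):
    """Plain average of the three channels."""
    return round((r + g + b) / 3)

def luma_gray(r, g, b):
    """Weighted average using the ITU-R BT.601 luma weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def to_gray(image, pixel_fn):
    """Apply a per-pixel conversion to an image stored as rows of (R, G, B) tuples."""
    return [[pixel_fn(*px) for px in row] for row in image]

# Tiny 2x2 test image: white, black, pure red, pure blue.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]

print(to_gray(image, mean_gray))  # [[255, 0], [85, 85]]
print(to_gray(image, luma_gray))  # [[255, 0], [76, 29]]
```

The two methods agree on white and black but differ on saturated colors: the weighted version treats red as brighter than blue, while the plain average cannot tell them apart.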

In an RGB image, this information is split across all three channels. Take for example a bright yellow with:

RGB (255,217,0)

While this is obviously a color of high intensity, we only obtain this information by combining all channels, which is exactly what a grayscale image does. You could of course instead use each channel for your feature calculation and concatenate the results to use all intensity information for this image, but the outcome would be essentially the same as with the grayscale version while taking three times the computation time.
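To make the cost difference concrete, here is a small sketch that runs the same toy feature once on the grayscale values and once per channel with the results concatenated. The 4-bin `histogram` is only a stand-in I made up for illustration, not a feature any particular face recognition algorithm uses, and the two-pixel "image" just pairs the yellow above with a black pixel.

```python
def histogram(values, bins=4):
    """Toy feature: count of values in each of `bins` equal ranges of 0..255."""
    counts = [0] * bins
    for v in values:
        counts[min(v * bins // 256, bins - 1)] += 1
    return counts

# The bright yellow from above plus one black pixel, as (R, G, B) tuples.
pixels = [(255, 217, 0), (0, 0, 0)]

# Grayscale by simple averaging: one value per pixel, one feature pass.
gray = [round((r + g + b) / 3) for r, g, b in pixels]
gray_feature = histogram(gray)

# Per-channel alternative: three feature passes, results concatenated.
channel_feature = []
for c in range(3):
    channel_feature += histogram([px[c] for px in pixels])

print(gray)                  # [157, 0]
print(len(gray_feature))     # 4
print(len(channel_feature))  # 12
```

The per-channel version calls the feature function three times and produces a vector three times as long, which is where the extra computation comes from.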

T A