
Several questions have been asked about the SIFT algorithm, but they all seem focused on a simple comparison between two images. Instead of determining how similar two images are, would it be practical to use SIFT to find the closest matching image out of a collection of thousands of images? In other words, is SIFT scalable?

For example, would it be practical to use SIFT to generate keypoints for a batch of images, store the keypoints in a database, and then find the ones that have the shortest Euclidean distance to the keypoints generated for a "query" image?

When calculating the Euclidean distance, would you ignore the x, y, scale, and orientation parts of the keypoints, and only look at the descriptor?
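The approach described above can be sketched directly: store each image's descriptors (dropping x, y, scale, and orientation), then score a query by summing nearest-neighbor Euclidean distances over its descriptors. This is a minimal brute-force sketch using random arrays in place of real SIFT output; the image names, descriptor counts, and scoring function are illustrative assumptions, and a real system would replace the linear scan with an index.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical database: per-image sets of 128-D SIFT descriptors.
# Only the descriptor vectors are stored; location/scale/orientation are dropped.
database = {f"img_{i}": rng.random((50, 128)).astype(np.float32)
            for i in range(300)}
query = rng.random((40, 128)).astype(np.float32)

def match_score(query_desc, image_desc):
    # For each query descriptor, the Euclidean distance to its nearest
    # neighbor among the image's descriptors; lower total = better match.
    d2 = ((query_desc[:, None, :] - image_desc[None, :, :]) ** 2).sum(-1)
    return float(np.sqrt(d2.min(axis=1)).sum())

best = min(database, key=lambda name: match_score(query, database[name]))
```

Exhaustive scoring like this is O(images × descriptors²), which is why the answers below turn to quantization and hashing for thousands of images.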

Cerin
  • This would definitely work. I'm sure there are papers written about this topic, though I was not able to find any. – fairidox Mar 02 '11 at 19:48

1 Answer


There are several approaches.

One popular approach is the so-called bag-of-words representation, which matches images based solely on how many descriptors match. It ignores the location part of each keypoint (x, y, scale, and orientation) and looks only at the descriptor.
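The bag-of-words idea can be sketched as follows: quantize each descriptor to its nearest "visual word" from a fixed vocabulary, then compare images via their normalized word histograms. This is a dependency-free sketch with assumed sizes; the vocabulary here is just a random sample of descriptors, whereas a real system would build it with k-means over a training set.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pool of 128-D SIFT descriptors from a training set.
train_desc = rng.random((500, 128)).astype(np.float32)

# Visual vocabulary: normally the k-means centers of the training
# descriptors; sampled at random here to keep the sketch self-contained.
k = 16
vocab = train_desc[rng.choice(len(train_desc), size=k, replace=False)]

def bow_histogram(desc):
    # Assign each descriptor to its nearest visual word, then count words.
    d2 = ((desc[:, None, :] - vocab[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=k).astype(np.float32)
    return hist / hist.sum()  # normalize so descriptor count doesn't matter

h1 = bow_histogram(rng.random((60, 128)).astype(np.float32))
h2 = bow_histogram(rng.random((80, 128)).astype(np.float32))
similarity = float(h1 @ h2)  # simple dot-product score on histograms
```

The normalization step answers the comment below the answer: images with different numbers of keypoints still produce comparable fixed-length histograms.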

Efficient querying of a large database may use approximate methods like locality-sensitive hashing.
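One standard locality-sensitive hashing scheme for real-valued vectors is random-hyperplane hashing: each descriptor is mapped to a short bit string, and similar descriptors tend to land in the same bucket, so a query only compares against one bucket instead of the whole database. A minimal sketch, with assumed sizes and random data standing in for real descriptors:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(2)
dim, n_bits = 128, 12

# One random hyperplane per hash bit; the sign of the projection is the bit.
planes = rng.standard_normal((n_bits, dim)).astype(np.float32)

def lsh_key(desc):
    bits = (planes @ desc > 0).astype(np.uint64)
    return int(bits @ (1 << np.arange(n_bits, dtype=np.uint64)))

# Index hypothetical descriptors into buckets keyed by their hash.
db = rng.random((2000, dim)).astype(np.float32)
buckets = defaultdict(list)
for i, d in enumerate(db):
    buckets[lsh_key(d)].append(i)

# A query probes only its own (small) bucket; nearby vectors usually
# share the key, though in practice several hash tables are used to
# reduce misses.
query = db[7]
candidates = buckets[lsh_key(query)]
```

This trades exactness for speed: the bucket is checked exhaustively, but it holds only a small fraction of the database.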

Other methods may involve vocabulary trees or other data structures.
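A vocabulary tree quantizes a descriptor hierarchically: at each level it picks the nearest cluster center and descends into that subtree, so finding a visual word among b² leaves costs only 2b comparisons instead of b². A two-level sketch with random centers standing in for trained cluster centers (a real tree is built by recursive k-means):

```python
import numpy as np

rng = np.random.default_rng(3)
b, dim = 4, 128  # branch factor and descriptor dimension (assumed)

level1 = rng.random((b, dim)).astype(np.float32)     # root-level centers
level2 = rng.random((b, b, dim)).astype(np.float32)  # children of each root

def tree_word(desc):
    # Descend the tree: nearest root center, then nearest child of it.
    i = ((level1 - desc) ** 2).sum(-1).argmin()
    j = ((level2[i] - desc) ** 2).sum(-1).argmin()
    return int(i * b + j)  # leaf index = visual word

word = tree_word(rng.random(dim).astype(np.float32))
```

Deeper trees with larger branch factors allow vocabularies of millions of words while keeping quantization cost logarithmic.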

For an efficient method that also takes location information into account, check out pyramid match kernels.

peakxu
  • Each image has a different number of keypoints/descriptors, and even a different resolution; how do I map them to words? – TomSawyer Jul 22 '18 at 09:16