
I have used Andrea Vedaldi's SIFT implementation to calculate the SIFT descriptors of two similar images (the second image is actually a zoomed-in picture of the same object from a different angle).

Now I am not able to figure out how to compare the descriptors to tell how similar the images are.

I know that this question is not answerable unless you have actually played with this sort of thing before, but I thought that somebody who has done this before might know, so I posted the question.

The little I did to generate the descriptors:

>> i=imread('p1.jpg');
>> j=imread('p2.jpg');
>> i=rgb2gray(i);
>> j=rgb2gray(j);
>> [a, b]=sift(i);  % a has the frames and b has the descriptors
>> [c, d]=sift(j);
Lazer
  • Check out [SURF](http://www.mathworks.in/help/vision/ref/detectsurffeatures.html) too; MATLAB has built-in support for it. – Sridutt Feb 15 '13 at 06:38

5 Answers


First, aren't you supposed to be using vl_sift instead of sift?

Second, you can use SIFT feature matching to find correspondences in the two images. Here's some sample code:

    I = imread('p1.jpg');
    J = imread('p2.jpg');

    I = single(rgb2gray(I)); % conversion to single is recommended
    J = single(rgb2gray(J)); % in the VLFeat documentation

    [F1, D1] = vl_sift(I);
    [F2, D2] = vl_sift(J);

    % 1.5 = threshold on the ratio between the Euclidean distances
    % to the second-nearest and nearest neighbours (NN2/NN1)
    [matches, scores] = vl_ubcmatch(D1, D2, 1.5);

    subplot(1,2,1);
    imshow(uint8(I));
    hold on;
    plot(F1(1,matches(1,:)),F1(2,matches(1,:)),'b*');

    subplot(1,2,2);
    imshow(uint8(J));
    hold on;
    plot(F2(1,matches(2,:)),F2(2,matches(2,:)),'r*');

vl_ubcmatch() essentially does the following:

Suppose you have a point P in F1 and you want to find the "best" match in F2. One way to do that is to compare the descriptor of P in F1 to all the descriptors in D2. By compare, I mean compute the Euclidean distance (i.e. the L2-norm of the difference of the two descriptors).

You then find the two points in F2, say U and V, which have the lowest and second-lowest distances (say, Du and Dv) from P, respectively.

Here's what Lowe recommended: if Dv/Du >= threshold (I used 1.5 in the sample code), then the match is accepted; otherwise it's ambiguous, it's rejected as a correspondence, and no point in F2 is matched to P. Essentially, if there's a big gap between the best and second-best matches, you can expect the best one to be a quality match.

This is important since there's a lot of scope for ambiguous matches in an image: imagine matching points in a lake or a building with several windows: the descriptors can look very similar, but the correspondence is obviously wrong.

You can do the matching in any number of ways: you can do it yourself very easily in MATLAB (a hand-rolled version is sketched below), or you can speed it up by using a kd-tree or an approximate nearest-neighbour search like FLANN, which has been implemented in OpenCV.
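
Here is a minimal sketch of that ratio test done by hand, assuming D1 and D2 are the uint8 descriptor matrices returned by the vl_sift calls above and that the second image has at least two descriptors:

    D1d = double(D1);
    D2d = double(D2);
    threshold = 1.5;
    myMatches = zeros(2, 0);
    for k = 1:size(D1d, 2)
        % Euclidean distance from descriptor k to every descriptor in D2
        diffs = D2d - repmat(D1d(:,k), 1, size(D2d, 2));
        dists = sqrt(sum(diffs.^2, 1));
        [sorted, idx] = sort(dists);
        % Lowe's ratio test: keep the match only if the best distance
        % is clearly smaller than the second-best one
        if sorted(2) / sorted(1) >= threshold
            myMatches(:, end+1) = [k; idx(1)];
        end
    end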

EDIT: Also, there are several kd-tree implementations in MATLAB.

Jacob
  • this is commonly the method of object recognition via SIFT, but is it the most effective method for determining object similarity? Similarity is not exactly the same as recognition, after all. After looking at the pyramid match kernel paper recommended, I'm thinking the answer is no. – mugetsu Dec 13 '11 at 00:51
  • @mugetsu: Perhaps, but that's not the point; the question was about getting descriptors from the VLFeat toolbox. – Jacob Dec 13 '11 at 03:03
  • Well, from my interpretation of the problem, I was under the impression that Lazer asked how to determine whether two images are similar, not how to get the descriptors (which he has already done). Comparing the descriptors does not mean he has to be limited to VLFeat functions; all he needs is some algorithm to apply to those descriptors. And in his case, as in mine, VLFeat doesn't have what we need to do a similarity comparison. – mugetsu Dec 13 '11 at 18:15
  • Is there a Python (OpenCV) implementation of the VL_UBCMATCH function? – user93 May 12 '16 at 06:10

You should read David Lowe's paper ("Distinctive Image Features from Scale-Invariant Keypoints"), which describes how to do exactly that. It should be sufficient if you want to compare images of the exact same object. If you want to match images of different objects of the same category (e.g. cars or airplanes), you may want to look at the Pyramid Match Kernel by Grauman and Darrell.

Dima
  • Have you by chance used the pyramid match kernel? What is your opinion on its performance? – mugetsu Dec 13 '11 at 00:51
  • @mugetsu I played with the published code for it a little (libpmk), but I have not used it much. The results in the paper look impressive, though. – Dima Dec 13 '11 at 03:35

Try comparing each descriptor from the first image with the descriptors from the second image that lie in a close vicinity (using the Euclidean distance). Thus, you assign a score to each descriptor from the first image based on the degree of similarity between it and its most similar neighbouring descriptor in the second image. A statistical measure (sum, mean, dispersion, mean error, etc.) of all these scores then gives you an estimate of how similar the images are. Experiment with different combinations of vicinity size and statistical measure to get the best answer; a rough sketch of the idea follows.
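
A rough sketch of this idea, under assumptions not in the original answer: F1/F2 and D1/D2 are the frames and descriptors from the vl_sift calls in the accepted answer, the images are roughly aligned so pixel coordinates are comparable, and the 50-pixel radius is an arbitrary choice:

    radius = 50; % vicinity size in pixels; tune this
    vicScores = [];
    for k = 1:size(D1, 2)
        % keypoints in image 2 near the position of keypoint k in image 1
        dx = F2(1,:) - F1(1,k);
        dy = F2(2,:) - F1(2,k);
        near = find(dx.^2 + dy.^2 <= radius^2);
        if isempty(near), continue; end
        % score = distance to the most similar descriptor in the vicinity
        d = sqrt(sum((double(D2(:,near)) ...
            - repmat(double(D1(:,k)), 1, numel(near))).^2, 1));
        vicScores(end+1) = min(d);
    end
    fprintf('mean %.1f, std %.1f over %d scored keypoints\n', ...
            mean(vicScores), std(vicScores), numel(vicScores));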

luvieere

If you just want to compare a zoomed and rotated image with a known center of rotation, you can use phase correlation in log-polar coordinates. From the sharpness of the peak and the histogram of the phase correlation you can judge how close the images are. You can also use the Euclidean distance on the absolute values of the Fourier coefficients.
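
For reference, a bare-bones phase correlation between two same-size grayscale images might look like this in MATLAB (A and B are hypothetical image matrices; the log-polar resampling step is omitted):

    FA = fft2(double(A));
    FB = fft2(double(B));
    R  = FA .* conj(FB);
    R  = R ./ max(abs(R), eps);   % normalize: keep phase only
    pc = real(ifft2(R));          % correlation surface
    peak = max(pc(:));            % a sharp, high peak => closely related images
    fprintf('phase correlation peak: %.3f\n', peak);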

If you want to compare SIFT descriptors, besides the Euclidean distance you can also use a "diffuse distance": compute the descriptor at progressively coarser scales and concatenate the results with the original descriptor. That way, "large-scale" feature similarity gets more weight.

mirror2image

If you want to do matching between the images, you should use vl_ubcmatch (in case you have not already). You can interpret the output scores to see how close the features are: each score is the squared Euclidean distance between the two matched feature descriptors. You can also vary the threshold between the best match and the second-best match that is given as an input.
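
For example (reusing D1 and D2 from the accepted answer; the threshold value 2.0 is just an illustrative choice):

    % a stricter ratio threshold keeps only the most distinctive matches
    [matches, scores] = vl_ubcmatch(D1, D2, 2.0);
    dists = sqrt(scores);   % back to plain Euclidean distances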

Swagatika