I would like to compare a query image against the pictures in a database (about 2000).
Before posting here, I read a lot of papers on methods for matching a picture against a large database, and a lot of posts on Stack Overflow.
The papers contain some interesting material, but they are quite technical and I find the algorithms difficult to understand well. (I have only just begun to specialize in this field.)
Posts (the most interesting):
Simple and fast method to compare images for similarity ;
Nearest neighbors in high-dimensional data? ;
How to understand Locality Sensitive Hashing? ;
Image fingerprint to compare similarity of many images ;
Papers:
Object retrieval with large vocabularies and fast spatial matching,
Image Similarity Search with Compact Data Structures,
LSH,
Near Duplicate Image Detection: min-Hash and tf-idf Weighting,
Vocabulary tree,
Aggregating local descriptors.
But I'm still confused.
The first thing I did was implement BoW. I trained a Bag of Words model (with ORB as detector and descriptor, and VLAD features) on 5 classes in order to test its efficiency. After a long training, I ran it. It worked well, with an accuracy of 94%, which is pretty good.
But there is a problem for me:
- I don't want to do classification. My database will hold about 2000 different pictures, and I just want to find the best matches between my query and the database. Following that logic, I would have to treat these 2000 pictures as 2000 different classes, and obviously that's impossible...
On this first point, do you agree with me? It's clearly not the best method for what I want to do, is it? Or maybe there is another way to use BoW to find similarities in the database?
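To make the last question concrete, here is a minimal sketch of what using BoW for similarity rather than classification could look like: each image is reduced to one fixed-length vector (a BoW histogram or a VLAD vector), and the database is simply ranked by distance to the query vector, with no classes involved. The function names `squaredDistance` and `rankBySimilarity` are made up for illustration, and the plain Euclidean distance is an assumption; it uses only the C++ standard library, not OpenCV.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Squared Euclidean distance between two global image vectors
// (for example BoW histograms or VLAD vectors of equal length).
double squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return sum;
}

// Returns the database indices sorted from most similar to least similar.
// Hypothetical helper: one vector per database image, no classifier involved.
std::vector<std::size_t> rankBySimilarity(const std::vector<float>& query,
                                          const std::vector<std::vector<float>>& database) {
    std::vector<double> dist(database.size());
    for (std::size_t i = 0; i < database.size(); ++i)
        dist[i] = squaredDistance(query, database[i]);

    std::vector<std::size_t> order(database.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&dist](std::size_t l, std::size_t r) { return dist[l] < dist[r]; });
    return order;
}
```

With 2000 images this is 2000 vector comparisons per query, and the first entries of the returned order are the candidate matches.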
The second thing I did is simpler. I compute the descriptors of my query. Then I loop over my whole database, compute the descriptors of each picture, and add each set of descriptors to a vector.
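std::vector<cv::Mat> all_descriptors_database;
// image_paths: placeholder name for the list of the ~2000 database image files
for (size_t i = 0; i < image_paths.size(); ++i) {
    cv::Mat image = cv::imread(image_paths[i]);
    // computeKeypoints / computeDescriptors stand for my own helper functions
    std::vector<cv::KeyPoint> keypoints = computeKeypoints(image);
    cv::Mat descriptors = computeDescriptors(image, keypoints);
    all_descriptors_database.push_back(descriptors);
}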
At the end I have a big vector which contains the descriptors of the whole database (and likewise one with all the keypoints).
This is where I get confused. At first, I wanted to do the matching inside the loop, that is to say, for each image in the database, compute its descriptors and match them against the query. But it took a lot of time.
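For reference, the brute-force matching described here can be sketched as follows, assuming 256-bit binary descriptors (the size ORB produces) stored as plain byte arrays rather than `cv::Mat`. The names `hammingDistance`, `countGoodMatches` and `bestMatchingImage` are illustrative, and the ratio threshold of 0.8 is an assumed value, not one taken from my code. Each query descriptor is compared with every descriptor of every image, which is why the cost grows so quickly.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using Descriptor = std::array<std::uint8_t, 32>;  // 256-bit binary descriptor
using DescriptorSet = std::vector<Descriptor>;    // all descriptors of one image

// Number of differing bits between two binary descriptors.
int hammingDistance(const Descriptor& a, const Descriptor& b) {
    int bits = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        bits += static_cast<int>(std::bitset<8>(static_cast<unsigned>(a[i] ^ b[i])).count());
    return bits;
}

// Counts query descriptors whose nearest neighbor in `image` is clearly
// better than the second nearest (ratio test; 0.8 is an assumed threshold).
int countGoodMatches(const DescriptorSet& query, const DescriptorSet& image, double ratio = 0.8) {
    if (image.size() < 2) return 0;  // the ratio test needs two neighbors
    int good = 0;
    for (const Descriptor& q : query) {
        int best = std::numeric_limits<int>::max();
        int second = std::numeric_limits<int>::max();
        for (const Descriptor& d : image) {
            const int dist = hammingDistance(q, d);
            if (dist < best) { second = best; best = dist; }
            else if (dist < second) { second = dist; }
        }
        if (best < ratio * second) ++good;
    }
    return good;
}

// Index of the database image with the most good matches (0 if the database is empty).
std::size_t bestMatchingImage(const DescriptorSet& query, const std::vector<DescriptorSet>& database) {
    std::size_t bestIndex = 0;
    int bestCount = -1;
    for (std::size_t i = 0; i < database.size(); ++i) {
        const int count = countGoodMatches(query, database[i]);
        if (count > bestCount) { bestCount = count; bestIndex = i; }
    }
    return bestIndex;
}
```

With Q query descriptors and D descriptors per image over N images, this performs Q × D × N Hamming distance computations per query.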
So after reading a lot of papers about how to find similarities in big databases, I found the LSH algorithm, which seems appropriate for that kind of search.
Therefore I wanted to use this method. So inside my loop I did something like this:
//Create Flann LSH index
cv::flann::Index flannIndex(all_descriptors_database.at(i), cv::flann::LshIndexParams(12, 20, 2), cvflann::FLANN_DIST_HAMMING);
cv::Mat results, dists;
int k=2; // find the 2 nearest neighbors
// search (nearest neighbor)
flannIndex.knnSearch(query_descriptors, results, dists, k, cv::flann::SearchParams() );
However, I have some questions:
It took more than 5 seconds to loop over my whole database (2000 images), whereas I thought it would take less than 1 s (in the papers they have huge databases, unlike mine, and LSH is more efficient). Did I do something wrong?
I found some libraries on the internet which implement LSH, like http://lshkit.sourceforge.net/ or http://www.mit.edu/~andoni/LSH/ . So what is the difference between these libraries and the four lines of code I wrote using OpenCV? I checked the libraries, and for a beginner like me they were very difficult to use. I got a bit confused.
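To show what I understand LSH to do for binary descriptors, here is a minimal standard-library sketch of bit-sampling LSH: each table hashes a descriptor by reading a fixed random subset of its bits, so descriptors at a small Hamming distance tend to land in the same bucket. This is only an illustration of the idea, not how OpenCV, LSHKIT or the MIT package implement it; the class name `BitSamplingLsh`, its methods and its parameters are made up, and `keyBits` is assumed to be at most 64. Unlike my loop above, the index here is built once over the descriptors of all images, and each bucket entry remembers which image it came from.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <vector>

using Descriptor = std::array<std::uint8_t, 32>;  // 256-bit binary descriptor

class BitSamplingLsh {
public:
    // tableCount hash tables, each keyed on keyBits randomly chosen bit positions.
    BitSamplingLsh(std::size_t tableCount, std::size_t keyBits, unsigned seed = 42)
        : tables_(tableCount) {
        std::mt19937 rng(seed);
        std::uniform_int_distribution<std::size_t> pickBit(0, 255);
        for (std::size_t t = 0; t < tableCount; ++t) {
            std::vector<std::size_t> positions(keyBits);
            for (std::size_t& p : positions) p = pickBit(rng);
            bitPositions_.push_back(positions);
        }
    }

    // Inserts one descriptor of image `imageId` into every table.
    void add(const Descriptor& d, std::size_t imageId) {
        for (std::size_t t = 0; t < tables_.size(); ++t)
            tables_[t][key(d, t)].push_back(imageId);
    }

    // For each image, counts how often a query descriptor shares a bucket
    // with one of that image's descriptors.
    std::vector<int> vote(const std::vector<Descriptor>& query, std::size_t imageCount) const {
        std::vector<int> votes(imageCount, 0);
        for (const Descriptor& q : query) {
            for (std::size_t t = 0; t < tables_.size(); ++t) {
                const auto it = tables_[t].find(key(q, t));
                if (it == tables_[t].end()) continue;
                for (std::size_t id : it->second) ++votes[id];
            }
        }
        return votes;
    }

private:
    // Builds the bucket key of table `table` from the sampled bits of `d`.
    std::uint64_t key(const Descriptor& d, std::size_t table) const {
        std::uint64_t k = 0;
        for (std::size_t pos : bitPositions_[table]) {
            const std::uint64_t bit = (d[pos / 8] >> (pos % 8)) & 1u;
            k = (k << 1) | bit;
        }
        return k;
    }

    std::vector<std::vector<std::size_t>> bitPositions_;
    std::vector<std::unordered_map<std::uint64_t, std::vector<std::size_t>>> tables_;
};
```

The image with the most votes would be the candidate best match. A query only touches the buckets its own keys fall into, which is where the speed-up over brute force is supposed to come from.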
The third thing:
I wanted to build a kind of fingerprint from the descriptors of each picture (in order to compute the Hamming distance against the database), but it seems to be impossible to do that. See: OpenCV / SURF How to generate a image hash / fingerprint / signature out of the descriptors?
So I have been stuck on this task for 3 days. I don't know whether I'm on the wrong track or not. Maybe I missed something.
I hope this is clear enough. Thanks for reading.