I am working on a solution for detecting identical and near-duplicate (slightly modified) images.

I have come across many articles suggesting techniques for comparing images, for instance "Image comparison - fast algorithm":

  1. File-hash based (MD5, SHA-1, etc.) - Tried; works well for byte-identical content.

  2. Perceptual hashing (pHash) for rescaled images - Looking for a Java implementation that is as accurate as the one provided by phash.org. One Java solution posted at http://pastebin.com/Pj9d8jt5 has been reported to produce false positives, though I haven't tried it myself.

  3. Feature-based (SIFT) for modified images - Looking for some sample code to get started.
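
For reference, the file-hash approach from point 1 can be sketched roughly as below. This is only an illustration: the class and method names (`FileHashSketch`, `sha256Hex`, `sameContent`) are made up for this post, and SHA-256 is used here as an assumption in place of MD5 or SHA-1. It only detects files whose bytes are identical, so a re-encoded or rescaled copy will not match.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative exact-duplicate check; all names are hypothetical. */
final class FileHashSketch {

    /** Streams the file through SHA-256 and returns the digest as lowercase hex. */
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** True only when both files have byte-identical content. */
    static boolean sameContent(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        return sha256Hex(a).equals(sha256Hex(b));
    }
}
```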

If there are any other suggestions, please share them.

Sumeet

1 Answer

This answers point 2 of the question. I have not checked any of these pHash implementations yet, but there is probably an accurate one among them:

Java pHash https://github.com/krishnact/jphash

Another Java pHash https://github.com/thomasheckmann/image-indexer

Java (Android) pHash https://github.com/gavinliu/SimilarPhoto

Groovy pHash https://github.com/mdbishop/ImagePHash

Scala pHash https://gist.github.com/Howon/7db1239355841a71ffa9

Another Scala pHash https://github.com/warricksothr/ImageTools/blob/master/engine/src/main/scala/com/sothr/imagetools/engine/hash/PHash.scala
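
To give an idea of what these libraries do internally, here is a minimal sketch of a DCT-based perceptual hash using only the JDK. It is not taken from any of the projects above and will not necessarily produce the same bits as phash.org; the class name `PHashSketch`, the 32x32 working size, and the 8x8 low-frequency block are assumptions chosen for illustration.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Illustrative DCT-based perceptual hash; all names and sizes are hypothetical. */
final class PHashSketch {
    private static final int SIZE = 32; // image is reduced to SIZE x SIZE before the DCT
    private static final int LOW = 8;   // LOW x LOW low-frequency block yields 64 bits

    /** Returns a 64-bit hash built from the low-frequency DCT coefficients. */
    static long hash(BufferedImage src) {
        // 1. Shrink and convert to grayscale by drawing into a small gray image.
        BufferedImage gray = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, SIZE, SIZE, null);
        g.dispose();

        double[][] px = new double[SIZE][SIZE];
        for (int y = 0; y < SIZE; y++) {
            for (int x = 0; x < SIZE; x++) {
                px[y][x] = gray.getRaster().getSample(x, y, 0);
            }
        }

        // 2. Compute only the top-left LOW x LOW block of the 2D DCT-II.
        double[][] dct = new double[LOW][LOW];
        for (int u = 0; u < LOW; u++) {
            for (int v = 0; v < LOW; v++) {
                double sum = 0;
                for (int y = 0; y < SIZE; y++) {
                    for (int x = 0; x < SIZE; x++) {
                        sum += px[y][x]
                                * Math.cos((2 * y + 1) * u * Math.PI / (2.0 * SIZE))
                                * Math.cos((2 * x + 1) * v * Math.PI / (2.0 * SIZE));
                    }
                }
                double cu = (u == 0) ? 1 / Math.sqrt(2) : 1;
                double cv = (v == 0) ? 1 / Math.sqrt(2) : 1;
                dct[u][v] = cu * cv * sum;
            }
        }

        // 3. Average the coefficients, leaving out the DC term at [0][0].
        double total = -dct[0][0];
        for (double[] row : dct) {
            for (double c : row) {
                total += c;
            }
        }
        double mean = total / (LOW * LOW - 1);

        // 4. One bit per coefficient: set when the coefficient is above the mean.
        long bits = 0L;
        for (double[] row : dct) {
            for (double c : row) {
                bits = (bits << 1) | (c > mean ? 1L : 0L);
            }
        }
        return bits;
    }

    /** Hamming distance between two hashes; smaller means more similar. */
    static int distance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```

Two images would then be compared with `PHashSketch.distance(PHashSketch.hash(imgA), PHashSketch.hash(imgB))`. The cut-off below which images count as similar has to be tuned on your own data; a threshold that is too loose is a likely source of false positives.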

Anton Ivinskyi
  • Thanks for the reply. I managed to find some code to get started. I found Elliot Shepherd's implementation of pHash (based on the blog post http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html) at https://gist.github.com/6e20342198d4040e0bb5 particularly useful. However, the algorithm still needs to be refined to avoid false positives. – Sumeet Jul 28 '17 at 05:32