10

There are several questions about text-diff libraries for Java on SO, but none about binary diff. So here I go:

I'm looking for a binary diff library implemented in Java. I found javaxdelta and a port of GNU Diff, but I wonder whether there are other hidden gems out there. Does anyone have experience with the libraries mentioned above? A comparison would be super helpful.

Finally, maybe it is helpful to know that I want to diff objects serialized with Avro.

Josh Lee
Neeme Praks
  • If you want to get really low-level, you could google "longest common substring", usually abbreviated LCSS to disambiguate it from "longest common subsequence". Of course, with Java you'd have to use byte arrays rather than strings, unless you want to use Jython. – jcomeau_ictx Mar 02 '11 at 05:54
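
To make the "longest common substring on byte arrays" idea from the comment above concrete, here is a minimal sketch in plain Java. The class and method names (`ByteLcs`, `longestCommonSubstring`) are made up for illustration and do not come from any of the libraries mentioned in this thread. It uses the textbook dynamic-programming approach, which takes time proportional to the product of the two lengths, so it is only meant to show the idea, not to be used on large inputs.

```java
import java.util.Arrays;

// Hypothetical helper, not part of javaxdelta, GNU Diff or JBDiff.
final class ByteLcs {

    /** Returns the longest run of bytes that appears contiguously in both a and b. */
    static byte[] longestCommonSubstring(byte[] a, byte[] b) {
        // prev[j] = length of the common run ending at a[i-2] and b[j-1]
        int[] prev = new int[b.length + 1];
        int bestLen = 0;
        int bestEndInA = 0; // exclusive end index of the best run inside a

        for (int i = 1; i <= a.length; i++) {
            int[] curr = new int[b.length + 1];
            for (int j = 1; j <= b.length; j++) {
                if (a[i - 1] == b[j - 1]) {
                    // Extend the run that ended one position earlier in both arrays.
                    curr[j] = prev[j - 1] + 1;
                    if (curr[j] > bestLen) {
                        bestLen = curr[j];
                        bestEndInA = i;
                    }
                }
            }
            prev = curr;
        }
        return Arrays.copyOfRange(a, bestEndInA - bestLen, bestEndInA);
    }
}
```

Calling `ByteLcs.longestCommonSubstring(oldBytes, newBytes)` gives the largest block the two versions share; a real delta encoder would look for many such blocks, typically with hashing rather than a full table.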

3 Answers

2

I've found a new (as far as I can tell) Java implementation of a binary diff: JBDiff

Sebastien Diot
0

If you want to compare Avro files, have a look at Avro Editor; it contains a compare utility for Avro-serialised files.

Bruce Martin
  • Thank you for that hint, I was not aware of such a visual editor/comparison tool for Avro. However, this does not answer my question: I'm interested in doing a diff between two versions of Avro objects and later patching the original object in order to obtain the latter. Oh well, I guess I will try out those libraries that I already mentioned in my question. – Neeme Praks Mar 07 '11 at 07:15
-1

Your question isn't specific enough to get a good answer. What do you mean by "binary diff"?

If you want to see a list of all the differences between one byte array and another, you can implement or use the xdelta algorithm. You could also dump each file to hex using the od command and run the standard diff/patch commands on the dumps, if you would rather write a shell script than C.
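
For readers who want to see what "record the differences, then patch the original" can look like at the byte level, here is a deliberately naive sketch. It is not the xdelta algorithm: it only strips the common prefix and suffix of the two arrays and stores the differing middle part. The class name `NaiveBinaryDelta` and the delta layout (three 4-byte integers followed by the replacement bytes) are assumptions invented for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; real libraries use far better encodings.
final class NaiveBinaryDelta {

    /** Encodes target relative to source as: prefixLen, suffixLen, middleLen, middle bytes. */
    static byte[] diff(byte[] source, byte[] target) {
        int max = Math.min(source.length, target.length);

        // Count leading bytes that are identical in both arrays.
        int prefix = 0;
        while (prefix < max && source[prefix] == target[prefix]) {
            prefix++;
        }

        // Count trailing identical bytes, without overlapping the prefix.
        int suffix = 0;
        while (suffix < max - prefix
                && source[source.length - 1 - suffix] == target[target.length - 1 - suffix]) {
            suffix++;
        }

        int middleLen = target.length - prefix - suffix;
        ByteBuffer buf = ByteBuffer.allocate(3 * Integer.BYTES + middleLen);
        buf.putInt(prefix).putInt(suffix).putInt(middleLen);
        buf.put(target, prefix, middleLen);
        return buf.array();
    }

    /** Rebuilds the target from the original source and a delta produced by diff(). */
    static byte[] patch(byte[] source, byte[] delta) {
        ByteBuffer buf = ByteBuffer.wrap(delta);
        int prefix = buf.getInt();
        int suffix = buf.getInt();
        int middleLen = buf.getInt();

        byte[] target = new byte[prefix + middleLen + suffix];
        System.arraycopy(source, 0, target, 0, prefix);       // shared head
        buf.get(target, prefix, middleLen);                   // replaced middle
        System.arraycopy(source, source.length - suffix,
                target, prefix + middleLen, suffix);          // shared tail
        return target;
    }
}
```

With this sketch, `patch(oldBytes, diff(oldBytes, newBytes))` reproduces `newBytes`. The delta is only small when the change is confined to one region of the data; scattered edits are where a proper algorithm such as xdelta or bsdiff pays off.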

However, it sounds like you want to do something specific to Avro. Is that true?

If you just want to compare the field values of two different Avro files and generate a diff, and you're already familiar with Avro, all you would need to do is write a program that reads both files and prints out the differences in a way that you can read in later to transform the original.

Community
pawstrong
  • As I mentioned in a comment to previous answer: "I'm interested in doing a diff between two versions of Avro objects and later patching the original object in order to obtain the latter." So it would not be just for comparing, it would be for recording the differences in a separate binary stream/file and using this later to patch the original file to obtain the other file. I would use it for simple versioning. – Neeme Praks Sep 06 '11 at 06:38