I am comparing two jar files that have the same content but a different Host OS (UNIX / FLAT), and as a result they get different CRCs. How do I compare them without extracting them?

I do not want to extract the jar files because the application actually compares two ear files: each ear is a version of a project (old version / new version) and contains more than 300 jar files. I have been comparing two jars using FileUtils.contentEquals, which works well in normal cases (sample jar files with the same Host OS).

boolean isTwoEqual = FileUtils.contentEquals(File1, File2);

I expect isTwoEqual to be true when the content of the two files is the same, but the actual output is false because the files have a different Host OS and therefore different CRC codes.

Unkown
  • As far as I know it's not possible. CRC doesn't pick and choose. It is based on all the bits in the file. Any change will give you a different code. But maybe someone has a solution that approaches the problem in a different way that would actually work. – Oloff Biermann Aug 06 '19 at 15:08
  • @Jeff Grigg's answer is good. The difference is almost certainly all in end-of-line characters. You'll need to unpack the jars one way or another and compare them while omitting those. Jars are zip files. As Jeff says, you can unzip the contents and compare the streams character-wise without writing files. It will still be resource intensive (CPU and RAM), but there isn't really a choice. You also need a way of inferring whether a file is text or non-text data: trying to find lines in a data file and strip the line endings would also give the wrong answer. – Gene Aug 16 '19 at 01:34

1 Answer

I've implemented an InputStream using the java.util.zip.ZipInputStream class, with its getNextEntry() and int read(byte[] b) method calls, to read the contents of ZIP files without extracting them to separate temporary files.

Then use the BufferedReader's readLine method to read lines, discarding line endings, and compare the lines from the two sources.
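
A minimal sketch of that approach, under some assumptions of mine rather than the actual implementation: the class name JarContentComparator and its methods are hypothetical, every entry is held in memory, and an entry is treated as text when it contains no NUL byte (a rough heuristic for the text/non-text question raised in the comments). Entries that are byte-identical pass immediately; otherwise text entries are compared line by line with readLine, which drops \n, \r and \r\n.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
public class JarContentComparator {

    /** Reads every file entry of a ZIP/JAR stream into memory, keyed by entry name. */
    static Map<String, byte[]> readEntries(InputStream in) throws IOException {
        Map<String, byte[]> entries = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                entries.put(entry.getName(), out.toByteArray());
            }
        }
        return entries;
    }

    /** Assumption: data without a NUL byte is text. Real projects may need a better rule. */
    static boolean looksLikeText(byte[] data) {
        for (byte b : data) {
            if (b == 0) {
                return false;
            }
        }
        return true;
    }

    /** ISO-8859-1 maps each byte to one char, so no content is lost while decoding. */
    static BufferedReader reader(byte[] data) {
        return new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.ISO_8859_1));
    }

    /** Compares two text entries line by line; readLine discards the line terminators. */
    static boolean sameLines(byte[] a, byte[] b) throws IOException {
        try (BufferedReader ra = reader(a); BufferedReader rb = reader(b)) {
            String la;
            String lb;
            do {
                la = ra.readLine();
                lb = rb.readLine();
                if (!Objects.equals(la, lb)) {
                    return false;
                }
            } while (la != null);
            return true;
        }
    }

    /** True when both archives hold the same entry names with equivalent content. */
    static boolean sameContent(InputStream jar1, InputStream jar2) throws IOException {
        Map<String, byte[]> a = readEntries(jar1);
        Map<String, byte[]> b = readEntries(jar2);
        if (!a.keySet().equals(b.keySet())) {
            return false;
        }
        for (Map.Entry<String, byte[]> e : a.entrySet()) {
            byte[] x = e.getValue();
            byte[] y = b.get(e.getKey());
            if (Arrays.equals(x, y)) {
                continue; // identical bytes, text or binary
            }
            if (!(looksLikeText(x) && looksLikeText(y) && sameLines(x, y))) {
                return false;
            }
        }
        return true;
    }
}
```

It could be called as JarContentComparator.sameContent(new FileInputStream(File1), new FileInputStream(File2)). Jars nested inside an ear are compared as binary entries here; handling them the same way would mean calling sameContent recursively on entries whose names end in .jar.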

Jeff Grigg