
What InputStream type should be used to handle URLConnection streams that have HTTP Content-Encoding set to deflate?

For a Content-Encoding of gzip or zip I use a GZIPInputStream, no problem.

For a Content-Encoding of "deflate" I have tried using InflaterInputStream and DeflaterInputStream but I get

java.util.zip.ZipException: unknown compression method at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:147)

My understanding is that "deflate" encoding refers to Zlib compression, and according to the docs this should be handled by InflaterInputStream.

Joel
  • I ran into the same situation. Raw deflate has no header, so there is no way to tell what the data is. I thought my code was wrong and checked it over and over. After this, I asked the developers of the archive manager I use to support decoding raw deflate data, so that I can simply try opening the data with it and know: "Oh! My code is fine, and this is deflate!" – asuka Aug 07 '20 at 16:34

2 Answers


In HTTP/1.1, Content-encoding: deflate actually refers to the DEFLATE compression algorithm, as defined by RFC 1951, wrapped in the zlib data format, as defined by RFC 1950.

However, some vendors implement only the DEFLATE algorithm as defined by RFC 1951 and ignore RFC 1950 entirely, so the data they send has no zlib header or trailer.

Others have been hit by the same issue:

In order to work around this, try to instantiate the InflaterInputStream passing an Inflater that was created with the nowrap parameter set to true:

in = new InflaterInputStream(conn.getInputStream(), new Inflater(true));
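
For anyone who wants to try this without a misbehaving server, here is a minimal, self-contained sketch. The class name `RawDeflateDemo` and the helpers `rawDeflate` and `inflateRaw` are made up for illustration; only the `java.util.zip` classes are real. It produces raw DEFLATE data locally (a `Deflater` with nowrap set to true) and reads it back through an `InflaterInputStream` that uses `new Inflater(true)`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

class RawDeflateDemo {

    // Hypothetical helper: compress with nowrap=true to imitate a server
    // that sends RFC 1951 data without the RFC 1950 (zlib) wrapper.
    public static byte[] rawDeflate(byte[] data) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Hypothetical helper: read a raw DEFLATE stream to the end.
    // The Inflater is created with nowrap=true, so no zlib header is expected.
    public static byte[] inflateRaw(InputStream compressed) throws IOException {
        Inflater inflater = new Inflater(true);
        try (InputStream in = new InflaterInputStream(compressed, inflater)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            // InflaterInputStream does not release an Inflater it was handed.
            inflater.end();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "Hello, raw deflate!".getBytes(StandardCharsets.UTF_8);
        byte[] decoded = inflateRaw(new ByteArrayInputStream(rawDeflate(original)));
        System.out.println(new String(decoded, StandardCharsets.UTF_8));
    }
}
```

With a real connection you would pass `conn.getInputStream()` to `inflateRaw` instead of the in-memory stream. Note that an `Inflater` created with nowrap set to true will in turn reject data that does carry the zlib wrapper, so this only suits servers known to send raw DEFLATE.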
Grodriguez
  • Both RFCs seem to reference Zlib, but I guess different versions? – Joel Oct 14 '10 at 10:59
  • "6.2.2.2 Deflate Coding The "deflate" format is defined as the "deflate" compression mechanism (described in [RFC1951]) used inside the "zlib" data format ([RFC1950]). Note: Some incorrect implementations send the "deflate" compressed data without the zlib wrapper." -- http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-11.html#rfc.section.6.2.2.2 – Julian Reschke Oct 14 '10 at 11:04
  • RFCs make an excellent example of how to write confusing, misleading technical documentation. – Muxecoid Jan 01 '14 at 16:27

Unfortunately, using the InflaterInputStream with an Inflater object did not always produce the correct decompression. I had to detect the headers and tell the Inflater where the offset to the payload was.

http://thushw.blogspot.com/2014/05/decoding-html-pages-with-content.html
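
One way to sketch the header detection (not necessarily what the linked post does) is to peek at the first two bytes and check them against the zlib header rules from RFC 1950: the low nibble of the first byte must be 8 (deflate), the window-size nibble must be at most 7, and the two bytes read as a big-endian 16-bit number must be a multiple of 31. The names `DeflateSniffer`, `looksLikeZlibHeader` and `openDeflate` are invented for this example, and the check is only a heuristic, since a raw DEFLATE stream could by chance begin with two bytes that pass it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

class DeflateSniffer {

    // RFC 1950 header check: CM == 8, CINFO <= 7, (CMF * 256 + FLG) % 31 == 0.
    public static boolean looksLikeZlibHeader(int cmf, int flg) {
        return (cmf & 0x0F) == 8
                && (cmf >> 4) <= 7
                && ((cmf << 8) | flg) % 31 == 0;
    }

    // Hypothetical helper: wrap a "deflate"-encoded body in a stream that
    // decodes it whether or not the zlib wrapper is present.
    public static InputStream openDeflate(InputStream body) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(body, 2);

        // Peek at up to two bytes, then push them back so the Inflater sees them.
        byte[] head = new byte[2];
        int n = 0;
        while (n < 2) {
            int r = pb.read(head, n, 2 - n);
            if (r == -1) {
                break;
            }
            n += r;
        }
        if (n > 0) {
            pb.unread(head, 0, n);
        }

        boolean zlib = n == 2 && looksLikeZlibHeader(head[0] & 0xFF, head[1] & 0xFF);

        // nowrap=true means "no zlib header", so use it only when none was found.
        final Inflater inflater = new Inflater(!zlib);
        return new InflaterInputStream(pb, inflater) {
            @Override
            public void close() throws IOException {
                try {
                    super.close();
                } finally {
                    inflater.end(); // release the Inflater we created
                }
            }
        };
    }
}
```

Usage would be along the lines of `InputStream in = DeflateSniffer.openDeflate(conn.getInputStream());` when the response has Content-Encoding set to deflate.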

user1706991