This:

H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/Ih7/Hu8WZXqZ101RLT/7aHe881GaL6fVrFhefPbRuj3fPvjo9zh63FTZ6tHp8jIvq1We0ivL5hE+++yjeduuHt2920zn+SJrxvQVPh9X9cVd/HI315fufqSvvWsK+9bV1dX46h433tvZ2b37e3/x/DUD2i6WTZstp7l7a3bzWx8pok+q2fXR45N5trzIn+fLi3b+1bJoX+XNqlo2in4ILZ80eX1ZTPPfe/zi9M1dAhR5e122R7ufjnce3tvff3x3oEH0C+6XvvKwk98NSY/+H/eQC6mIAQAA

...is the gzipped body of an HTTP response from a web service. I want to decompress it inside a Python script, so I used code similar to that shown in previous posts here, such as "Decompressing a gzipped payload of a packet with Python".

This is my script:

#!/usr/bin/env python
import logging
import gzip
import StringIO

logging.basicConfig(filename='out.log', level=logging.DEBUG)

compressed_data = 'H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/Ih7/Hu8WZXqZ101RLT/7aHe881GaL6fVrFhefPbRuj3fPvjo9zh63FTZ6tHp8jIvq1We0ivL5hE+++yjeduuHt2920zn+SJrxvQVPh9X9cVd/HI315fufqSvvWsK+9bV1dX46h433tvZ2b37e3/x/DUD2i6WTZstp7l7a3bzWx8pok+q2fXR45N5trzIn+fLi3b+1bJoX+XNqlo2in4ILZ80eX1ZTPPfe/zi9M1dAhR5e122R7ufjnce3tvff3x3oEH0C+6XvvKwk98NSY/+H/eQC6mIAQAA'

logging.debug(compressed_data)

buf = StringIO.StringIO(compressed_data)
f = gzip.GzipFile(fileobj=buf)
decompressed_data = f.read()

logging.debug(decompressed_data)

...but when I run it, Python reports it is not a gzipped file.

I am pretty sure it is, because when I paste the string into an online gzip/gunzip utility, it is decompressed correctly. The HTTP response header also says it is gzip-encoded, and I can see the decoded contents when I call the service using a testing tool.

I would be interested to know what I have omitted here.

For reference, the decompressed string should be:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><ChangeLengthUnitResponse xmlns="http://www.webserviceX.NET/"><ChangeLengthUnitResult>16.09344</ChangeLengthUnitResult></ChangeLengthUnitResponse></soap:Body></soap:Envelope>

I am using Python 2.7.11.

Community
    That's base64-encoded data. Maybe it's a base64-encoded gzip'ed file, but you first need to base64-decode it before you can gunzip it. –  Jun 22 '16 at 00:53
    Yeah, I was going to say there's no way it's gzip data if it's all ASCII. – John Gordon Jun 22 '16 at 00:54
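The point made in these comments can be checked directly: a raw gzip stream starts with the two bytes 0x1f 0x8b, which are not printable ASCII, while the base64 encoding of a gzip stream starts with the characters `H4sI`. Below is a minimal sketch in Python 3 (the question uses Python 2.7.11, so treat the version as an assumption); the helper names `looks_like_gzip` and `looks_like_base64_gzip` are hypothetical and introduced only for illustration.

```python
import base64

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_like_gzip(raw):
    """Return True if the raw bytes start with the gzip magic number."""
    return raw[:2] == GZIP_MAGIC


def looks_like_base64_gzip(text):
    """Return True if the first base64 group of text decodes to the gzip magic."""
    try:
        head = base64.b64decode(text[:4])  # 4 base64 characters -> 3 bytes
    except (ValueError, TypeError):
        return False
    return looks_like_gzip(head)


# First characters of the string from the question.
payload_prefix = "H4sIAAAAAAAEAO29"

# The ASCII text itself is not a gzip stream...
print(looks_like_gzip(payload_prefix.encode("ascii")))
# ...but its base64-decoded form starts like one.
print(looks_like_base64_gzip(payload_prefix))
```

A check like this only inspects the header, so it suggests which decoding step is missing rather than proving the payload is valid.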

1 Answer

Using @Rhymoid's suggestion, your code should look something like this (untested):

#!/usr/bin/env python
import logging
import gzip
import StringIO
from base64 import b64decode

logging.basicConfig(filename='out.log', level=logging.DEBUG)

compressed_data = 'H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/Ih7/Hu8WZXqZ101RLT/7aHe881GaL6fVrFhefPbRuj3fPvjo9zh63FTZ6tHp8jIvq1We0ivL5hE+++yjeduuHt2920zn+SJrxvQVPh9X9cVd/HI315fufqSvvWsK+9bV1dX46h433tvZ2b37e3/x/DUD2i6WTZstp7l7a3bzWx8pok+q2fXR45N5trzIn+fLi3b+1bJoX+XNqlo2in4ILZ80eX1ZTPPfe/zi9M1dAhR5e122R7ufjnce3tvff3x3oEH0C+6XvvKwk98NSY/+H/eQC6mIAQAA'

logging.debug(compressed_data)

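# Base64-decode first: the payload is ASCII text, not raw gzip bytes.
buf = StringIO.StringIO(b64decode(compressed_data))
# GzipFile reads the gzip stream from the in-memory buffer.
f = gzip.GzipFile(fileobj=buf)
decompressed_data = f.read()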

logging.debug(decompressed_data)

The base64.b64decode function returns the decoded bytes (a plain str in Python 2), which is the raw gzip stream that GzipFile expects.
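The script above targets Python 2, where the StringIO module exists. In Python 3 that module is gone, and the same two steps can be written with `base64.b64decode` followed by `gzip.decompress`. The following is only a sketch under that assumption: `gunzip_base64` is a hypothetical helper name, and the payload is a stand-in built by a round trip rather than the actual response from the question.

```python
import base64
import gzip


def gunzip_base64(text):
    """Base64-decode text, then gunzip the result and return the raw bytes."""
    raw = base64.b64decode(text)  # ASCII text -> binary gzip stream
    return gzip.decompress(raw)   # gzip stream -> original bytes


# Stand-in payload (hypothetical, not the web service response).
original = b"<result>16.09344</result>"

# Simulate what the service sends: gzip first, then base64.
encoded = base64.b64encode(gzip.compress(original)).decode("ascii")

# Undo both layers in the reverse order.
decoded = gunzip_base64(encoded)
print(decoded == original)
```

The order matters: the layers must be removed in the reverse of the order in which they were applied, so base64 decoding comes before decompression. The result is bytes; call `.decode("utf-8")` on it if the body is known to be UTF-8 text, as the XML declaration in the question suggests.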

Alex Taylor