
I have a text file to which I constantly append data. When processing is done, I need to gzip the file. I tried several options, such as shutil.make_archive, tarfile, and gzip, but could not get any of them to work. Is there no simple way to compress a file without actually writing to it?

Let's say I have a file mydata.txt and I want it gzipped and saved as mydata.txt.gz.

minerals
  • What do you mean "without writing to it"? If you don't write to the compressed file, `mydata.txt.gz` is going to be empty. – abarnert May 05 '15 at 09:49
  • What does "could not eventually do it" even mean? What did you try *exactly*, and what error(s) did you get? Show some code! – unwind May 05 '15 at 10:04
  • @abarnert I have a processed file called `mydata.txt` that already contains data. As the next step, I need to compress it. If I do it the way the examples show, with the 'w' parameter, the file will be overwritten and left empty; that is what "without actually writing to it" means. I tried the 'a:gz' parameter of `tarfile`, but it is not accepted. – minerals May 05 '15 at 10:19
  • See [`fileinput`](https://docs.python.org/2/library/fileinput.html#module-fileinput) with [`hook_compressed`](https://docs.python.org/2/library/fileinput.html#fileinput.hook_compressed) – Peter Wood May 05 '15 at 10:22
  • http://stackoverflow.com/questions/1855095/how-to-create-a-zip-archive-of-a-directory – Ajay May 05 '15 at 10:32

2 Answers


I don't see the problem. You should be able to use e.g. the gzip module just fine, something like this:

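import gzip

# Open the source read-only ("rb"), so mydata.txt itself is never modified.
# Only the separate .gz file is opened for writing.
with open("mydata.txt", "rb") as inf, gzip.open("mydata.txt.gz", "wb") as outf:
    # Read the whole file into memory and write it out compressed.
    outf.write(inf.read())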

There's no problem with the original file being overwritten: the name given to gzip.open() is completely independent of the name given to plain open(), so only the .gz file is opened for writing.

unwind
  • Indeed. I thought it would be slow, but it turns out not to be critically so. – minerals May 05 '15 at 17:45
  • @minerals: Why would it be slow? If you're worried about reading the whole file into memory, you can just loop over 8K at a time instead of the whole file at once; it doesn't change anything essential. – abarnert May 05 '15 at 19:00
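
The chunked loop suggested in the comment above could look roughly like the sketch below. The function name `gzip_file` and the constant `CHUNK_SIZE` are made up for illustration and are not part of the original answer; only `open()` and `gzip.open()` come from the standard library.

```python
import gzip

CHUNK_SIZE = 8 * 1024  # 8K per read, as suggested in the comment (illustrative name)


def gzip_file(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Compress src_path into dst_path without loading the whole file into memory."""
    # The source is opened read-only; only the destination is written.
    with open(src_path, "rb") as inf, gzip.open(dst_path, "wb") as outf:
        while True:
            chunk = inf.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            outf.write(chunk)
```

Calling `gzip_file("mydata.txt", "mydata.txt.gz")` should produce the compressed copy and leave `mydata.txt` as it was. The standard library's `shutil.copyfileobj(inf, outf)` does the same kind of chunked copy and could replace the explicit loop.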

If you want to compress a file without writing to it from Python, you could run a shell command such as gzip, using the subprocess module, os.popen, or os.system.
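
A minimal sketch of the subprocess route, assuming a `gzip` executable is available on the PATH. The helper name `gzip_with_cli` is hypothetical. Note that plain `gzip mydata.txt` replaces the original with `mydata.txt.gz`, so this sketch uses `-c` to send the compressed data to standard output and keep the original file.

```python
import subprocess


def gzip_with_cli(src_path, dst_path):
    """Run `gzip -c src_path` and redirect its output into dst_path."""
    with open(dst_path, "wb") as outf:
        # -c writes compressed data to stdout and leaves src_path untouched.
        # check=True raises CalledProcessError if gzip exits with an error.
        subprocess.run(["gzip", "-c", src_path], stdout=outf, check=True)
```

Passing the command as a list avoids going through a shell, so file names with spaces or special characters need no quoting.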

Monkeybrain