What is the easiest way to reduce the space a string takes up and then save it to disk (using Java)?

I have heard about Huffman coding, but you need to build a tree in order to implement it. Another way should be bit shifting, because a character in ASCII always uses 1 byte of space...

That's the theory, but how do you implement something like bit shifting in Java so that it really shrinks the string's size on disk?

Thanks in advance

appcodix

2 Answers

Here are some names you can read further about using Google.

1) Canonical Huffman coding - no need to store the tree, only the code length of each symbol.

2) Arithmetic coding - a more interesting variant is adaptive arithmetic coding, which adapts the encoding to the symbol probabilities observed at encoding time.

3) Front (prefix) coding - if the words in your string share common prefixes, as in a dictionary for example, you can cut the prefix off a word and store only its length. On decoding, you get the prefix back from the previous word, or from a word at a certain index earlier in the string.
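To make item 3 more concrete, here is a minimal sketch of front coding against the previous word only. It assumes the words arrive as a list (ideally sorted, so neighbours share long prefixes). The class name `FrontCodingSketch` and the `length:suffix` text format are made up for illustration, not taken from any library.

```java
import java.util.ArrayList;
import java.util.List;

class FrontCodingSketch {
    // Encodes each word as "<shared prefix length>:<remaining suffix>",
    // where the prefix is shared with the previous word in the list.
    static List<String> encode(List<String> words) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String word : words) {
            int shared = 0;
            int max = Math.min(prev.length(), word.length());
            while (shared < max && prev.charAt(shared) == word.charAt(shared)) {
                shared++;
            }
            out.add(shared + ":" + word.substring(shared));
            prev = word;
        }
        return out;
    }

    // Rebuilds each word from the previous decoded word plus the stored suffix.
    static List<String> decode(List<String> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String entry : entries) {
            int colon = entry.indexOf(':');
            int shared = Integer.parseInt(entry.substring(0, colon));
            String word = prev.substring(0, shared) + entry.substring(colon + 1);
            out.add(word);
            prev = word;
        }
        return out;
    }
}
```

For the list `compress, compression, computer, cat` this produces `0:compress`, `8:ion`, `4:uter`, `1:at`. A real format would store the length as a binary byte rather than as text; the text form is used here only to keep the sketch readable.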

These are really easy to implement (also in Java), and pseudocode as well as real Java implementations can be found among the first results of a Google search.

Of course there are many more techniques, and the right ones depend on the actual use. If you expand your question and give a use case, others and I may be able to fine-tune the technique to the scenario.
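As a rough illustration of why canonical Huffman coding only needs the code lengths, the sketch below rebuilds the codes from lengths alone. It assumes the lengths already come from a valid Huffman construction (that step is not shown) and that every length is below 31 bits. The names `CanonicalHuffmanSketch` and `assignCodes` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CanonicalHuffmanSketch {
    // Assigns canonical codes: symbols are ordered by (code length, symbol),
    // and each code is the previous code plus one, left-shifted whenever
    // the code length grows.
    static Map<Character, String> assignCodes(Map<Character, Integer> lengths) {
        List<Map.Entry<Character, Integer>> sorted = new ArrayList<>(lengths.entrySet());
        sorted.sort((a, b) -> {
            int byLength = Integer.compare(a.getValue(), b.getValue());
            return byLength != 0 ? byLength : Character.compare(a.getKey(), b.getKey());
        });

        Map<Character, String> codes = new TreeMap<>();
        int code = 0;
        int previousLength = 0;
        for (Map.Entry<Character, Integer> entry : sorted) {
            int length = entry.getValue();
            code <<= (length - previousLength); // widen the code when the length increases
            codes.put(entry.getKey(), toBits(code, length));
            code++;
            previousLength = length;
        }
        return codes;
    }

    // Renders the lowest `length` bits of `value` as a string of 0s and 1s.
    static String toBits(int value, int length) {
        StringBuilder bits = new StringBuilder();
        for (int i = length - 1; i >= 0; i--) {
            bits.append((value >> i) & 1);
        }
        return bits.toString();
    }
}
```

With the lengths a=1, b=2, c=3, d=3 this yields a=0, b=10, c=110, d=111. Because the decoder can run the same procedure, the compressed file only has to carry the length table, not the tree.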

Solo
  • Thanks for your response! In general, I have no specific use case. I simply want to reach a compression of a string "for fun" ^^ The string contains a longer text and was originally stored in a file. Now I want to compress this string and put it again on the disc with a lower memory usage... – appcodix May 31 '14 at 20:03
  • @appcodix Then in my opinion Canonical Huffman Coding is the best choice, or you could use the gzip library, which I suspect uses Huffman coding anyway (correct me if I'm wrong) – Solo Jun 01 '14 at 08:37

With the Deflater (sic) class.
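A minimal sketch of what that could look like, using `java.util.zip.Deflater` and `Inflater` directly. The class name `DeflaterSketch`, the 4096-byte buffer size, and the choice of UTF-8 are assumptions for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class DeflaterSketch {
    // Compresses the UTF-8 bytes of the text into zlib-wrapped deflate data.
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end(); // release native resources
        return out.toByteArray();
    }

    // Reverses compress(); rejects data that is not a complete deflate stream.
    static String decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The resulting byte array can then be written to disk, for example with `Files.write(Paths.get("text.deflate"), DeflaterSketch.compress(text))`, where the file name is only a placeholder. Very short strings may come out larger than the input because of the stream header and checksum.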

Mark Adler
  • See also: DeflaterOutputStream (example: https://stackoverflow.com/a/13060441/1599699) and DeflaterInputStream. – Andrew Dec 06 '18 at 20:37
  • Also useful: https://stackoverflow.com/a/35446009/1599699 https://stackoverflow.com/a/15950481/1599699 – Andrew Dec 06 '18 at 20:50