Questions tagged [data-compression]

156 questions
102
votes
6 answers

How to read data from a zip file without having to unzip the entire file

Is there any way in .NET (C#) to extract data from a zip file without decompressing the complete file? Put simply, I may want to extract data (a file) from the start of a zip file; obviously this depends on whether the compression algorithm compresses the file…
AwkwardCoder
  • 22,434
  • 24
  • 80
  • 143
65
votes
15 answers

What is the computer science definition of entropy?

I've recently started a course on data compression at my university. However, I find the use of the term "entropy" as it applies to computer science rather ambiguous. As far as I can tell, it roughly translates to the "randomness" of a system or…
fluffels
  • 3,946
  • 7
  • 33
  • 51
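
For orientation, the information-theoretic (Shannon) sense of "entropy" that a data compression course usually means can be sketched in a few lines. This is only an illustration, not taken from the question or its answers: the function name `shannon_entropy` is hypothetical, and it measures the empirical byte-frequency entropy of one input, which is a simplification of the general definition.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of `data`, in bits per byte.

    Hypothetical helper for illustration: it treats each byte as an
    independent symbol and uses observed frequencies as probabilities.
    """
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # H = -sum(p * log2(p)) over every symbol that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under this measure a run of one repeated byte scores 0 bits per byte, while data in which all 256 byte values are equally frequent scores the maximum of 8.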
49
votes
7 answers

How do I compute the approximate entropy of a bit string?

Is there a standard way to do this? Googling -- "approximate entropy" bits -- uncovers multiple academic papers but I'd like to just find a chunk of pseudocode defining the approximate entropy for a given bit string of arbitrary length. (In case…
dreeves
  • 25,132
  • 42
  • 147
  • 226
35
votes
5 answers

Write a program that takes text as input and produces a program that reproduces that text

Recently I came across a nice problem that turned out to be as simple to understand as it is hard to solve. The problem is: write a program that reads text from input and prints another program as output. If we compile and run the…
19
votes
3 answers

Any theoretical limit to compression?

Imagine that you had all the supercomputers in the world at your disposal for the next 10 years. Your task was to compress 10 full-length movies losslessly as much as possible. Another criterion was that a normal computer should be able to decompress…
David
  • 4,052
  • 7
  • 46
  • 77
14
votes
4 answers

How to check if TOAST is working on a particular table in postgres

I have a table that contains two text fields which hold a lot of text. For some reason our table has started growing exponentially. I suspect that TOAST (compression for text fields in Postgres) is not working automatically. In our table definition…
jindal
  • 208
  • 1
  • 2
  • 7
13
votes
7 answers

Is there a practical way to compress NSData?

I haven't seen any documentation on the topic, but that doesn't mean it doesn't exist.
eric.mitchell
  • 8,501
  • 12
  • 50
  • 91
12
votes
2 answers

Big file compression with python

I want to compress big text files with Python (I am talking about >20 GB files). I am by no means an expert, so I gathered the info I found, and the following seems to work: import bz2 with open('bigInputfile.txt', 'rb') as input: with…
user1242959
  • 125
  • 1
  • 4
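
The excerpt's code is cut off, so the following is not the asker's snippet. It is a minimal sketch of one way to bz2-compress a file too large to hold in memory, assuming a chunked copy is acceptable. The function name `compress_file` and the default chunk size are hypothetical choices.

```python
import bz2
import shutil


def compress_file(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024) -> None:
    """Stream `src_path` into a bz2 file at `dst_path`.

    Hypothetical helper: only `chunk_size` bytes are read at a time,
    so memory use stays roughly constant regardless of the input size.
    """
    with open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        # copyfileobj reads and writes in chunk_size pieces until EOF.
        shutil.copyfileobj(src, dst, chunk_size)
```

Reading the result back with `bz2.open(dst_path, "rb")` streams the decompressed bytes the same way.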
11
votes
5 answers

compressed vector/array class with random data access

I would like to make a "compressed array"/"compressed vector" class (details below) that allows random data access in more or less constant time. "More or less constant time" means that although element access time isn't constant, it shouldn't keep…
SigTerm
  • 24,947
  • 5
  • 61
  • 109
8
votes
5 answers

Compression algorithms for numbers only

I need to compress location data (latitude, longitude, date, time). All the numbers are in a fixed format. Two of them (latitude, longitude) are decimals; the other two are integers. Currently these numbers are stored as fixed-format strings. What are the algorithms…
fireball003
  • 1,829
  • 4
  • 19
  • 24
7
votes
3 answers

Data compression in python/numpy

I'm looking at using the Amazon cloud for all my simulation needs. The resulting sim files are quite large, and I would like to move them over to my local drive for ease of analysis, etc. You have to pay for the data you move over, so I want to compress…
tylerthemiler
  • 4,216
  • 6
  • 29
  • 40
6
votes
8 answers

Matrix compression methods

In an application I've been working on, I have to send a 256 x 256 matrix over a socket. I'm developing a visualization client for an offshore system simulator that runs on a cluster, and this matrix is a heightmap representing the current state of…
cake
  • 1,226
  • 1
  • 13
  • 19
5
votes
8 answers

"uncompressable" data sequence

I would like to generate an "uncompressable" data sequence of X MBytes through an algorithm. I need this to create a program that measures network speed through a VPN connection (avoiding the VPN's built-in compression). Can anybody help…
Tate
  • 1,276
  • 11
  • 13
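
One common approach to this kind of request is to use random bytes, since general-purpose compressors gain essentially nothing on them. The sketch below is an assumption about what would suit the asker, not something stated in the excerpt. The names `incompressible_bytes` and `compression_ratio` are hypothetical, and zlib stands in for whatever compression the VPN actually applies.

```python
import os
import zlib


def incompressible_bytes(n_bytes: int) -> bytes:
    """Return `n_bytes` of OS-provided random data (hypothetical helper)."""
    return os.urandom(n_bytes)


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size, using zlib at level 9.

    A ratio near (or slightly above) 1.0 means the compressor could
    not shrink the data.
    """
    return len(zlib.compress(data, 9)) / len(data)
```

Checking `compression_ratio(incompressible_bytes(1 << 20))` before using the payload is a cheap sanity test that the data really does resist compression.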
5
votes
2 answers

Column level compression in SQL Server

I have a column that I would like to store a lot of text data in (XML data). Approx 8,000 chars per row, and about 100-500 rows per minute. That much data means I will have to purge the column out fairly aggressively. (Since I have to host my…
Vaccano
  • 70,257
  • 127
  • 405
  • 747
5
votes
2 answers

Most efficient lossless compression for random numerical data?

My data are not actually totally random. I am looking to compress telemetry measurements which will tend to be in the same range (e.g. temperatures won't vary much). However, I seek a solution for multiple applications, so I might be sending…
Mawg says reinstate Monica
  • 34,839
  • 92
  • 281
  • 509
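
When readings stay in a narrow range, as the excerpt describes, one widely used trick is to store differences between consecutive samples before applying a general-purpose compressor. The sketch below illustrates that idea only. It is not drawn from the question or its answers, the names `pack_deltas` and `unpack_deltas` are hypothetical, and it assumes integer samples whose values and differences fit in signed 32-bit integers.

```python
import struct
import zlib
from typing import List


def pack_deltas(samples: List[int]) -> bytes:
    """Delta-encode integer samples, then zlib-compress them.

    Assumption: every sample and every difference fits in a signed
    32-bit integer. The first delta is taken relative to 0.
    """
    deltas = [b - a for a, b in zip([0] + list(samples), samples)]
    raw = struct.pack(f"<{len(deltas)}i", *deltas)
    return zlib.compress(raw, 9)


def unpack_deltas(blob: bytes) -> List[int]:
    """Invert pack_deltas by summing the stored differences."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 4}i", raw)
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out
```

Small, repetitive deltas tend to compress better than the raw values, though the gain depends entirely on how smooth the real telemetry is.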