
I know that Knapsack is NP-complete, yet it can be solved by DP. They say that the DP solution is pseudo-polynomial, since it is exponential in the "length of the input" (i.e. the number of bits required to encode the input). Unfortunately I did not get it. Can anybody explain that pseudo-polynomial thing to me slowly?

Michael
  • Possible duplicate of [How to understand the knapsack problem is NP-complete?](https://stackoverflow.com/questions/3907545/how-to-understand-the-knapsack-problem-is-np-complete) – nbro Apr 03 '18 at 23:42

4 Answers


The running time is O(NW) for an unbounded knapsack problem with N items and knapsack of size W. W is not polynomial in the length of the input though, which is what makes it pseudo-polynomial.

Consider W = 1,000,000,000,000. It only takes 40 bits to represent this number, so input size = 40, but the computational runtime uses the factor 1,000,000,000,000, which is O(2^40).

So the runtime is more accurately said to be O(N · 2^(bits in W)), which is exponential.
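A minimal sketch of the standard unbounded-knapsack DP (in Python, with made-up item values) makes that N·W structure visible: the table has W+1 entries and is filled in W passes over the N items, so the work tracks the value of W, not its bit-length.

```python
def unbounded_knapsack(values, weights, W):
    # best[c] = maximum value achievable with capacity c
    best = [0] * (W + 1)                     # a table of W+1 entries
    for c in range(1, W + 1):                # W iterations ...
        for v, w in zip(values, weights):    # ... times N items
            if w <= c:
                best[c] = max(best[c], best[c - w] + v)
    return best[W]

# Input length: N numbers plus the ~bit_length(W) bits of W itself,
# yet the loops above run N * W times, i.e. N * 2^(bits in W) in the worst case.
print(unbounded_knapsack([10, 40, 50], [1, 3, 4], 6))   # -> 80 (two weight-3 items)
```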

Also see:

moinudin
  • Link #3 referring to "Complexity of dynamic programming for the 0-1 knapsack problem" is dead. – dev_nut Jun 16 '19 at 21:48
  • Sorry, I didn't get it. Let's say if we have an algorithm with time complexity O(N), then we have O(2^(bits in N)), which is exponential? Thanks~ – Lusha Li Aug 07 '19 at 10:54
  • @LushaLi This helped me: https://www.youtube.com/watch?v=9oI7fg-MIpE. If N is an array where each element has a fixed max size input (say each element in the array is no more than 32 bits), and you run a for loop on this array once, then it is a polynomial time algorithm in the input size N of the array. However, if N is an integer and you run a loop over N, then N is unbounded and grows exponentially in the number of bits it takes to represent it. So a simple for loop on N is, actually, exponential. Note that in the case of the array, the size of each element in the array was upper-bounded. – max_max_mir Nov 30 '19 at 17:30
  • I wasn’t convinced. There are a lot of algorithms with the same properties that are not “pseudo-polynomial”. Say, what is the complexity of Sieve of Eratosthenes (or any other prime numbers finder)? – Ofir A. Jan 28 '20 at 11:36
  • It is a really weird way to describe the running time of an algorithm indeed. If you have an outer-loop with N iterations and an inner-loop with W iterations, then the running time of your algorithm is O(NW)... no? The fact that the input to your program will consist of N integers and an integer W seems to be a separate issue - your algorithm will still make N * W iterations. – Snackoverflow Feb 28 '21 at 17:48

In most of our problems, we're dealing with large lists of numbers that fit comfortably inside standard int/float data types. Because most processors are built to handle 4-8 byte numbers at a time at no additional cost (relative to numbers that fit in, say, 1 byte), we rarely encounter a change in running time from scaling our numbers up or down within the ranges we meet in real problems - so the dominant factor remains just the sheer quantity of data points, the n or m factors that we're used to.

(You can imagine that the Big-O notation is hiding a constant factor that divides out 32 or 64 bits per datum, leaving only the number of data points whenever each of our numbers fits in that many bits or fewer.)

But try reworking other algorithms to act on data sets involving big ints - numbers that require more than 8 bytes to represent - and see what that does to the runtime. The magnitude of the numbers involved always makes a difference, even in other algorithms like binary sort, once you expand beyond the buffer of safety that conventional processors give us "for free" by handling 4-8 byte batches.

The trick with the Knapsack algorithm that we discussed is that it's unusually sensitive (relative to other algorithms) to the magnitude of a particular parameter, W. Add one bit to W and you double the running time of the algorithm. We haven't seen that kind of dramatic response to changes in value in other algorithms before this one, which is why it might seem like we're treating Knapsack differently - but that's a genuine analysis of how it responds in a non-polynomial fashion to changes in input size.
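A small illustration of the "magnitude starts to matter" point, using Python's arbitrary-precision integers; the specific bit sizes below are arbitrary, chosen only to outgrow a machine word.

```python
import time

# Multiply two k-bit integers for growing k: once the numbers no longer fit
# in a machine word, every further growth in bit-length costs visible time.
for bits in (64, 64_000, 640_000, 6_400_000):
    a = (1 << bits) - 1              # an integer that is exactly `bits` bits long
    start = time.perf_counter()
    _ = a * a
    print(f"{bits:>9}-bit operands: {time.perf_counter() - start:.6f} s")
```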

shaunak1111

The way I understand this: the complexity would be O(W) in the input size if the capacity were supplied as an array [1, 2, ..., W], which has size W. But the capacity input is not an array of numbers; it is a single integer. Time complexity is about the relationship to the size of the input. The size of an integer is NOT the value of the integer, but the number of bits representing it. We do later convert this integer W into an array [1, 2, ..., W] inside the algorithm, which leads people to mistakenly think W is the size, but that array is not the input; the integer itself is.

Think of the input as "an array of stuff" and the size as "how many things are in the array". The item input really is an array of n items, so its size is n. The capacity input is NOT an array of W numbers; it is a single integer, represented by an array of about log(W) bits. Increase its size by 1 (adding one meaningful bit) and W doubles, so the run time doubles - hence the exponential time complexity.
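As a small sketch of that distinction (the numbers below are made up): the list of items really is an array whose size is n, while the capacity contributes only its bit-length to the input size.

```python
items = [(10, 1), (40, 3), (50, 4)]   # the item list is genuinely an array: size = n = 3
W = 1_000_000_000_000                 # the capacity is a single integer, not an array

n = len(items)                # grows with how many items you list
w_bits = W.bit_length()       # grows only like log2(W): 40 bits here

print(n, w_bits)              # 3 40
# The DP later expands W into a table of W+1 entries, but that table is built
# by the algorithm; only the ~40 bits of W were actually part of the input.
```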

lineil

The Knapsack algorithm's run-time is bounded not only by the size of the input (n, the number of items) but also by the magnitude of the input (W, the knapsack capacity): O(nW), which is exponential in the number of bits used to represent W in binary. Computational complexity (i.e. how processing is done inside a computer through bits) is concerned only with the size of the inputs, not their magnitudes/values.

Disregard the value/weight list for a moment. Let's say we have an instance with knapsack capacity 2. W would take two bits in the input data. Now increase the knapsack capacity to 4, keeping the rest of the input unchanged. Our input has grown by only one bit, but the computational work has doubled. If we increase the capacity to 1024, the input for W would be just 11 bits instead of 2, but the work has increased by a factor of 512. Time complexity grows exponentially in the size of W's binary (or decimal) representation.
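A tiny sketch of that arithmetic, just counting the table updates a textbook O(n·W) knapsack DP would perform (n = 10 is an arbitrary choice):

```python
def dp_updates(n_items, W):
    # number of cell updates a standard O(n*W) knapsack DP performs
    return n_items * W

n = 10
for W in (2, 4, 1024):
    print(f"W = {W:4d}  bits = {W.bit_length():2d}  updates = {dp_updates(n, W)}")
# Going from W=2 to W=4 adds one bit but doubles the work;
# going from W=2 to W=1024 adds a handful of bits but multiplies the work by 512.
```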

Another simple example that helped me understand the pseudo-polynomial concept is the naive primality-testing algorithm. For a given number n we check whether it is divided evenly by each integer in the range 2..√n, so the algorithm takes about √n − 1 steps. But here n is the magnitude of the input, not its size.
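A sketch of that naive test (trial division) in Python:

```python
import math

def is_prime_naive(n):
    # about sqrt(n) - 1 divisions for a prime n: the loop count follows the
    # VALUE of n, while the input itself is only ~log2(n) bits long
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime_naive(1_000_003))   # True, after ~1000 divisions on a 20-bit input
```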

Now the regular O(n) case

By contrast, searching an array for a given element runs in polynomial time: O(n). It takes at most n steps and here n is the size of the input (the length of the array).
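For contrast, a linear search really is linear in the input size, because the number of comparisons is bounded by the length of the array it is handed:

```python
def linear_search(arr, target):
    # at most len(arr) comparisons, and len(arr) IS the size of the input
    for i, x in enumerate(arr):
        if x == target:
            return i
    return -1

print(linear_search([7, 3, 9, 1024, 5], 1024))   # -> 3
```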

See also: Calculating bits required to store decimal number

shaunak1111
  • So for your last searching example, why don't we consider n as binary too? If n = 1024, it also only takes 10 bits, so shouldn't it be pseudo-polynomial? – user1024 Jan 19 '16 at 12:38