
I'm currently taking a Data Structures course at my university. I did some algorithm analysis in a prior class, but it was the section I had the most difficult time with. We are now going over algorithm analysis in my Data Structures course, so I'm going back through the textbook from that previous course to see what it says on the matter.

In the textbook, it says, "For every algorithm we want to analyze, we need to define the size of the problem." After doing some Google searching, it's still not entirely clear to me what "problem size" actually means. I'm trying to get a more concrete definition of what a problem size is so I can identify it in an algorithm.

I know that, if I have an algorithm that sorts a list of numbers, the problem size is n, the size of the list. That said, this doesn't clarify what "problem size" actually is outside of that context. An algorithm is not always a process for sorting numbers, so I can't always say that the problem size is the number of elements in a list.

Hoping someone out there can clarify things for me, and that you all are doing well.

Thank you

kvnr
  • If you look up a word in a dictionary, it is the size of the dictionary (the number of words in that dictionary) – wildplasser Aug 30 '20 at 21:22
  • Did you ask your teacher? – Cid Aug 30 '20 at 21:23
  • An algorithm would generally work for input, encoded in a specific way. Typically the input would be variable in size. You can consider number of elements of the input or size of the encoding to be the "problem size". – zch Aug 30 '20 at 21:24
  • If the input is a sort of collection (including a string as a collection of characters), it's usually the size of the collection, but if it's e.g. a number, it's the number itself. If the algorithm has multiple inputs, it gets a bit more complicated. – tobias_k Aug 30 '20 at 22:00
  • @tobias_k, I thought that too, but looking at the answers here (and googling) I realise I was wrong. Problem size really does just mean input size. – Elliott Aug 31 '20 at 00:08
  • This is a duplicate of https://cs.stackexchange.com/questions/26268/size-of-the-instance-of-a-problem – Elliott Aug 31 '20 at 00:11

3 Answers


The answer is right there in the part you quoted (emphasis mine):

For every algorithm we want to analyze, we need to **define** the size of the problem.

The "problem size" is only defined numerically relative to the algorithm. For an algorithm where the input is an array or a list, the problem size is typically measured by its length; for a graph algorithm, the problem size is typically measured by the number of vertices and the number of edges (with two variables); for an algorithm where the input is a single number, the problem size may be measured by the number itself, or the amount of bits required to represent the number in binary, depending on context.

So the meaning of "problem size" is specific to the problem that the algorithm solves. If you want a more universal definition which could apply to all problems, then the problem size can be defined as the number of bits required to represent the input; but this definition is not practical, and is only used in theory to talk about classes of problems (such as those which are solvable in polynomial time).
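For example, a crude estimate of that bit count could be written in Python like this (a sketch only; the encoding and the helper name `input_size_in_bits` are made up for illustration):

```python
def input_size_in_bits(x):
    # One possible "universal" measure: bits needed under some reasonable encoding.
    if isinstance(x, int):
        return max(1, x.bit_length())                 # a number: about log2(x) bits
    if isinstance(x, str):
        return 8 * len(x.encode("utf-8"))             # a string: bytes of its UTF-8 encoding
    if isinstance(x, (list, tuple)):
        return sum(input_size_in_bits(e) for e in x)  # a collection: sum of its parts
    raise TypeError("no encoding defined for this type in this sketch")

print(input_size_in_bits(1_000_000))        # 20
print(input_size_in_bits([3, 1, 4, 1, 5]))  # 2 + 1 + 3 + 1 + 3 = 10
```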

kaya3
  • If the input is a single number, the problem size is *not* the magnitude of that number; it is still the length of the input in bits, which is the logarithm of the magnitude. An algorithm whose time complexity is polynomial in the magnitude of the input is often said to have [pseudopolynomial time complexity](https://stackoverflow.com/questions/19647658/what-is-pseudopolynomial-time-how-does-it-differ-from-polynomial-time). – rici Aug 30 '20 at 23:28
  • @rici As I said, that depends on context. If you are talking about whether a problem is solvable in polynomial time, then that is understood to mean polynomial in the number of bits; but there is nothing stopping you from reporting the time complexity of an algorithm as a function of something else, and people often do this. If you have not seen examples, search Stack Overflow for "Fibonacci time complexity" and you will not find many people measuring it as a function of the number of bits. – kaya3 Aug 31 '20 at 01:25
  • Nothing stops people from using terms incorrectly. But I think you will have a hard time finding reference material on algorithmic complexity which does not carefully make this distinction. This question is not in the domain of informal discourse; it is clearly marked as being in an academic context where formal definitions are important. In that context, I think your answer is misleading. – rici Aug 31 '20 at 02:11
  • I get that we need to define a problem size, and for some algorithms I suppose that's easy. I guess I was hoping to find a more concrete definition that could be applied across all algorithms so that way I could easily look at a problem as say "AH HA! That's the size of the problem." I hope that makes sense. – kvnr Aug 31 '20 at 06:06
  • @rici Actually, it is only one particular academic context where "number of bits" is always the definition; for example, you would not analyse the time complexity of a sorting algorithm with n as the number of bits. Even in an academic paper n would always be the length of the list (and if you want to be pedantic, the number of bits is O(nw) where w is the word size which is not always presumed constant). As for "nothing stops people from using terms incorrectly", the people who use the term "incorrectly" in your opinion include university lecturers. – kaya3 Aug 31 '20 at 11:08
  • So your objection seems to be that I have described what people use the term to mean, rather than prescribed what they should, in your opinion, use it to mean. – kaya3 Aug 31 '20 at 11:08
  • @kvnr The only universal definition is "number of bits to encode the input". In practice, you should just think of "define how to measure the problem size" as one of the (first) steps you take when analysing an algorithm. – kaya3 Aug 31 '20 at 11:13
  • @rici Here's an example that isn't Stack Overflow; the paper introducing the AKS primality test algorithm reports its running time as a function of the input number n, not the number of bits needed to represent it. This is the paper which resolved the question of whether primality testing can be done in polynomial time. https://www.cse.iitk.ac.in/users/manindra/algebra/primality_v6.pdf (p. 6). – kaya3 Aug 31 '20 at 11:24
  • @kaya: from the second paragraph of that article: "An efficient test should need only a polynomial (**in the size of the input = log n**) number of steps...". The goal of that paper is to demonstrate that their algorithm is in P, the set of polynomial-time algorithms; the definition of P (Cobham, 1965) is based precisely on the input size *measured in bits*. – rici Aug 31 '20 at 12:02
  • So the statement that the AKS primality algorithm "runs in polynomial time" is only meaningful if the polynomial is applied to the size of the input, not its magnitude. This is the definition of problem size used consistently in complexity theory. – rici Aug 31 '20 at 12:09
  • Yes, I know. That doesn't contradict anything I said. – kaya3 Aug 31 '20 at 12:17
  • I think it makes clear what "problem size" means (and needs to mean in statements about complexity theory). Computational complexity is not always expressed as a formula on problem size, but that fact does not alter the definition of "problem size", which remains "the bit length of the input". – rici Aug 31 '20 at 12:23
  • "in statements about complexity theory", yes. In other contexts, it doesn't "need" to mean exactly that. I wrote this in my answer which you are commenting underneath. – kaya3 Aug 31 '20 at 12:28
  • I guess the disagreement here is purely semantic. You seem to be saying that if I express the computation time as a formula on some metric, that metric automatically becomes the "problem size". I think that the metric remains whatever it is, and "problem size" remains whatever it is. When I report complexity on the basis of some other metric, I need to state what that metric is. Of course, I'm always free to do that. – rici Aug 31 '20 at 12:34
  • Rereading your answer, I agree that you mention all these points and I don't think our difference is all that profound. All I'm saying is that you can analyse an algorithm based on any metric you like ("the number of comparisons", for example), but doing so does not alter the meaning of "problem size" as a mathematical concept. – rici Aug 31 '20 at 12:37

The problem size is the number of bits needed to store an instance of the problem, when it is specified in a reasonable encoding.
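To illustrate why the encoding has to be "reasonable" (a small Python sketch; the helper names are made up), binary encoding of a number is considered reasonable, while unary is not, because unary blows the size up exponentially:

```python
def bits_binary(n):
    # Binary encoding: about log2(n) bits -- a "reasonable" encoding.
    return max(1, n.bit_length())

def bits_unary(n):
    # Unary encoding (n ones): exponentially larger than binary,
    # so it is not considered a reasonable encoding.
    return n

print(bits_binary(1_000_000))  # 20
print(bits_unary(1_000_000))   # 1000000
```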

gnasher729

To clarify the concept, let me define this in layman's terms:

Given:

  • You have a big phone book.

Problem:

  • You are told to find the phone number of a person named John Mcallister.

Approach:

  • You can either search for this entry page by page (in a linear manner);
  • or, if the phone book is sorted, you can use Binary Search.

Answer to your question:

  • The algorithmic problem here is finding the entry in the phone book;
  • The problem's size is the size of the data your algorithm has to work through. In your case, that's the size of your phone book: if it has 10 entries per page and 50 pages, the size is 50 × 10 = 500 entries;
  • Since your algorithm must be able to examine the entire phone book, the size of the task/problem you are implementing the algorithm for is 500.

Problem size is generally denoted by n, and it simply means the size of the input data.
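As a minimal Python sketch (with a made-up 500-entry phone book rather than real data), both approaches take the same problem size n = 500; they differ only in how many steps they need:

```python
import bisect

# A made-up phone book: n = 500 entries, sorted by name.
phone_book = sorted((f"Person {i:03d}", f"555-{i:04d}") for i in range(500))
names = [name for name, _ in phone_book]

def linear_lookup(name):
    # Linear scan: up to n comparisons.
    for entry_name, number in phone_book:
        if entry_name == name:
            return number
    return None

def binary_lookup(name):
    # Binary search on the sorted names: about log2(n) comparisons.
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return phone_book[i][1]
    return None

print(linear_lookup("Person 042"))  # 555-0042
print(binary_lookup("Person 042"))  # 555-0042
```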

Giorgi Tsiklauri