
I've learned in class that merge sort is O(n · log(n)), but I'm not clear on whether that means O(n · log2(n)) or O(n · log10(n)). I've read on Stack Overflow (at Big O notation Log Base 2 or Log Base 10) that "it does not matter" because they give approximately the same runtime, but I want the exact runtime. For example, I know that bubble sort with 512 elements takes 512²/2 − 512/2 = 130816 time units; does merge sort with 512 elements take 512 · log2(512) = 512 · 9 = 4608 time units, or 512 · log10(512) ≈ 512 · 2.709 ≈ 1387 time units?
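(To make this concrete, here is a rough sketch I can run, counting only element comparisons in a plain top-down merge sort; the exact count depends on the input and on what you decide a "time unit" is.)

```python
import random

# Plain top-down merge sort that counts element comparisons.
def merge_sort(a, counter):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid], counter)
    right = merge_sort(a[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1  # one element comparison
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # leftovers need no comparisons
    merged.extend(right[j:])
    return merged

counter = [0]
merge_sort([random.random() for _ in range(512)], counter)
# Prints a value a bit under 512 * log2(512) = 4608, and nowhere
# near the ~1387 that a log10 reading would predict.
print(counter[0])
```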


3 Answers


Since merge sort divides the array into halves, then yup, base 2. As the comments say, the base only changes a constant factor (log_b(n) = log2(n) / log2(b)), so it's not important in big-O notation.
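A quick sketch of why halving gives base 2: count how many times 512 elements can be split in half before every piece has size 1.

```python
# Each level of merge sort halves the array, so the number of
# levels for n elements is log2(n) (rounded up for non-powers of 2).
def levels(n):
    count = 0
    while n > 1:
        n = (n + 1) // 2  # one round of halving
        count += 1
    return count

print(levels(512))  # 9, i.e. log2(512)
```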

Cs_Is_Better

Your question is based on a faulty premise.

Suppose that I have two equivalent algorithms, A and B. Algorithm A performs ten additions and no multiplications; algorithm B performs one multiplication and no additions.

Algorithmic analysis says that A and B are constant-time algorithms (O(1)). You would apparently like to distinguish them, saying that O(1) is merely "approximate", but that the "exact" runtime for A is 10 and the "exact" runtime for B is 1, so that B is ten times as fast as A. That seems reasonable. But maybe on this computer multiplication is twice as expensive as addition, so the difference is actually more like 10:2 (i.e. 5:1). Or conversely, maybe the cost of progressing from one instruction to the next (incrementing the program counter) is as expensive as the cost of an arithmetic operation, so the difference is actually more like 19:1. Oh, except that we haven't even considered memory operations, which are usually much more expensive than arithmetic operations; where do the values in these algorithms come from?

So if you want exact runtimes, algorithmic analysis isn't the right approach; the reason that it discards constant factors (making log2 equivalent to log10) is that to do otherwise would require much, much more information about the hardware and the program context and so on, which would make the results much less generally applicable.

Instead, you would use benchmarking — and for a small example like yours, that means microbenchmarking.
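For instance, here is a minimal microbenchmark sketch in Python, using the standard timeit module and the built-in sorted() purely as a stand-in for whatever sort implementation you actually care about:

```python
import random
import timeit

data = [random.random() for _ in range(512)]

# Run the sort many times so per-call overhead averages out, and
# repeat the measurement to gauge the noise; report the best run.
runs = timeit.repeat(lambda: sorted(data), number=1000, repeat=5)
print(min(runs) / 1000, "seconds per sort (best of 5 runs)")
```

The number it prints is specific to this machine, this Python, and this input; change any of them and it changes, which is exactly why algorithmic analysis doesn't promise it.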

ruakh
  • I think the question suffers from another common misconception. OP doesn't seem to realize that big-O notation is about how the requirements asymptotically ***scale*** with the problem size, rather than predicting the specific run time. Knowing that an algorithm is O(n^2) lets you know that doubling the size will take about four times as long, increasing by a factor of 10 will take about 100 times as long, and so on. As your answer points out, it gives no clue what the actual run time will be. – pjs Sep 10 '17 at 17:30

Yes, it's log base 2, but you can modify the algorithm to split the array into 3 parts, and then it's log_3(n); splitting into 4 parts gives base 4, and so on. It really (really) doesn't matter.
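A quick sanity check (sketch) that the split factor only rescales the log by a constant:

```python
import math

n = 512
print(math.log(n, 2))                   # ~9.0  (2-way split)
print(math.log(n, 3))                   # ~5.68 (3-way split)
print(math.log(n, 2) / math.log(n, 3))  # log2(3) ~= 1.585, a constant
# The ratio is the same for every n, so changing the split factor
# only changes a constant factor, which big-O ignores.
```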

sicarii443