
I am trying to evaluate the performance of two Python methods that sort a list of numbers. The time complexity seems to be O(n^2) for both of them, but empirical data shows that one performs much better than the other. What could be the reason?

I wrote two methods: one uses nested for loops, and the other repeatedly finds the maximum, adds it to a new list, and removes it from the old list.

Method 1:

def mysort1(l):
    # compare each element with every later element and swap in place
    # whenever the pair is out of order (a selection/exchange-style sort)
    for i in range(0, len(l) - 1):
        for j in range(i, len(l)):
            if l[i] > l[j]:
                tmp = l[j]
                l[j] = l[i]
                l[i] = tmp
    return l

Method 2:

def mysort2(l):
    nl = []
    # repeatedly pull the current maximum out of l and prepend it to nl,
    # so nl ends up in ascending order
    for i in range(0, len(l)):
        m = max(l)
        nl.insert(0, m)
        l.remove(m)
    return nl

Both were tested with a list of 10,000 numbers in reverse order. When using profile, Method 1 takes approximately 8 seconds (10,000+ calls) and Method 2 takes only 0.6 seconds (30,000+ calls). Why does Method 2 perform so much better than Method 1 even though the time complexity of both seems to be the same?
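For reference, a minimal harness along these lines reproduces the setup (this is just a sketch; the numbers above come from the profile module):

import copy
import time

data = list(range(10000, 0, -1))   # 10,000 numbers in reverse order

for fn in (mysort1, mysort2):
    l = copy.copy(data)            # each method gets its own fresh copy
    start = time.perf_counter()
    fn(l)
    print(fn.__name__, time.perf_counter() - start, "seconds")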

– sj7070
  • side-note: swapping in Python: `l[i], l[j] = l[j], l[i]`; no need for `tmp`. – hiro protagonist Sep 02 '19 at 14:42
  • `max` iterates over your list at "C-speed", not at "Python-speed" like your first inner loop. – hiro protagonist Sep 02 '19 at 14:44
  • Time complexity will tell you how the time needed evolves with the size of the data. Having the same time complexity does not at all mean having the same speed. – Thierry Lathuille Sep 02 '19 at 14:45
  • As your experiment demonstrates, the practical use of complexity analysis is actually quite limited. Yes, both are quadratic algorithms, but obviously the multiplicative constants are vastly different. It turns out the speedup you get from iterating at C level in the second example makes all the difference. When in doubt, always rely on benchmarks. In fact, there are sometimes "worse" algorithms that will be faster when the data is not too big, and, as someone put it, actual data is frequently not too big... So always take measurements with representative examples of your data. – jdehesa Sep 02 '19 at 14:47
  • Try to implement `max` yourself as a Python function and check again. – Giacomo Alzetta Sep 02 '19 at 14:49
  • 100000000000 * n^2 + 100000000000 * n and 0.000000000001 * n^2 are both O(n^2). Time complexity and execution time are very different things. – molbdnilo Sep 02 '19 at 14:59
  • @hiroprotagonist thanks for pointing out the C-speed. I did not consider that. – sj7070 Sep 02 '19 at 15:14
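To try Giacomo Alzetta's suggestion, a pure-Python stand-in for `max` could look like this (the name `pymax` is just for illustration):

def pymax(seq):
    # pure-Python reimplementation of max(), for comparison only;
    # every iteration here runs as interpreted bytecode instead of C
    best = seq[0]
    for x in seq[1:]:
        if x > best:
            best = x
    return best

Plugging `pymax` into `mysort2` in place of the built-in (and, similarly, pure-Python replacements for `insert` and `remove`) should bring its running time much closer to `mysort1`'s.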

3 Answers


Essentially, as the comments have suggested, it boils down to the fact that the major implementation of Python (CPython) is written in C. This answer points out that the real reason is that the C counterparts, i.e. the native implementations of list operations such as max, are much faster than your Python implementation, both because many people have optimized that code and because, in general, C runs faster than Python for operations like this.

Here is another answer to the question "Why are Python Programs often slower than the Equivalent Program Written in C or C++?".

From the answer:

Internally the reason that Python code executes more slowly is because code is interpreted at runtime instead of being compiled to native code at compile time.

Other interpreted languages such as Java bytecode and .NET bytecode run faster than Python because the standard distributions include a JIT compiler that compiles bytecode to native code at runtime. The reason why CPython doesn't have a JIT compiler already is because the dynamic nature of Python makes it difficult to write one. There is work in progress to write a faster Python runtime so you should expect the performance gap to be reduced in the future, but it will probably be a while before the standard Python distribution includes a powerful JIT compiler.
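To see the interpreted-versus-native gap directly, a quick comparison along these lines (a sketch; exact numbers depend on the machine and Python version) pits the C-level built-in max against an equivalent hand-written Python loop:

import timeit

setup = "data = list(range(10000, 0, -1))"

# the built-in max() runs its loop in C ...
c_level = timeit.timeit("max(data)", setup=setup, number=1000)

# ... while this equivalent hand-written loop runs as interpreted bytecode
py_stmt = """\
m = data[0]
for x in data:
    if x > m:
        m = x
"""
py_loop = timeit.timeit(py_stmt, setup=setup, number=1000)

print(f"built-in max: {c_level:.3f}s, Python loop: {py_loop:.3f}s")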

– dodgez

Python doesn't give a #@&^ about time complexity (of course it does, but...)

Python, being an interpreted, dynamically-typed language, has a lot of runtime overhead from type checks and interpretation. For instance, in your first method it has to check the type of i at least six times on each iteration (every time the list is indexed).

So I'd assume that the processing-time difference is because max is optimized and (as you are probably using the CPython interpreter) is basically a C function (as are the .insert and .remove methods).
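As a rough illustration of that overhead, you can isolate the body of the inner loop (the helper function below exists only so it can be disassembled) and look at how many bytecode instructions the interpreter dispatches for every single comparison and swap:

import dis

def inner_step(l, i, j):
    # the body of mysort1's inner loop, isolated for disassembly
    if l[i] > l[j]:
        tmp = l[j]
        l[j] = l[i]
        l[i] = tmp

# prints the bytecode executed on every iteration of the nested loops;
# max() does comparable per-element work, but inside a single C call
dis.dis(inner_step)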

– Yevhen Kuzmovych

Both functions have quadratic run time in N = len(l), but this only means that asymptotically the time taken by the first function is bounded by some c_1 * N^2 and the second function by some c_2 * N^2.

The values of c_1 and c_2 can be vastly different. In your case, the second function performs its inner loops inside max, insert and remove, which in CPython are compiled to native machine code and optimized for their particular purpose, while the first function must perform its inner loop in interpreted Python bytecode. The latter usually takes a lot more time.

When talking about time complexity this prefactor is usually dropped from discussion (e.g. in big-O notation) because it depends on the particular time individual operations take, while the N^2 behavior is universal for any implementation of that algorithm.

Furthermore, for finite N, complexity theory makes no claims at all. It could be that an algorithm with strictly lower time complexity performs worse than the other for the first 10^100 (or however many) values of N, although that is not the issue in your particular case here.
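To see the different constants empirically, a small sketch (assuming mysort1 and mysort2 from the question are in scope) times both functions at a few sizes and divides by N^2:

import time

def estimate_constant(sort_fn, n):
    # time one run on n reverse-ordered numbers and divide by n^2
    # to get a rough empirical estimate of the quadratic prefactor
    data = list(range(n, 0, -1))
    start = time.perf_counter()
    sort_fn(data)
    return (time.perf_counter() - start) / n**2

for n in (2000, 4000, 8000):
    print(n, estimate_constant(mysort1, n), estimate_constant(mysort2, n))

The ratio between the two columns is roughly c_1 / c_2, and it should stay about the same as N grows, which is exactly what "same complexity, different constant" means.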

– walnut