6

Sometimes I have to check a condition that doesn't change inside a loop. This means the test is evaluated on every iteration, which doesn't seem like the right way to do it.

Since the condition doesn't change inside the loop, I thought I should test it only once, outside the loop. But then I would have to "repeat myself" and possibly write the same loop more than once. Here's some code showing what I mean:

#!/usr/bin/python

x = True      #this won't be modified  inside the loop
n = 10000000

def inside():
    for a in xrange(n):
        if x:    #test is evaluated n times
            pass
        else:
            pass
    
def outside():
    if x:        #test is evaluated only once
        for a in xrange(n):  
            pass
    else:
        for a in xrange(n):
            pass

if __name__ == '__main__':
    outside()
    inside()

Running cProfile on the previous code gave the following output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.542    0.542    0.542    0.542 testloop.py:5(inside)
        1    0.261    0.261    0.261    0.261 testloop.py:12(outside)
        1    0.000    0.000    0.803    0.803 testloop.py:3(<module>)

This shows that testing once outside the loop gives better performance, but I had to write the same loop twice (maybe more if there were some elifs).

I know that this performance difference won't matter in most cases, but I'd like to know the best way to write this kind of code. For example, is there a way to tell Python to evaluate the test only once?

Any help is appreciated, thanks.

EDIT:

Actually, after running some tests, I'm now convinced that the difference in performance is mainly affected by the other code executed within the loops, not by the evaluation of the tests. So for now I'm sticking with the first form, which is more readable and easier to debug later.

Amr
  • There's always `break`: if there's some condition inside the loop that you know the next iterations aren't going to change, then just `break` out of the loop. – Samy Vilar Jun 20 '12 at 08:42
  • @samy.vilar Of course, but I just gave a minimal example; in most cases there'd be more code than those `pass`es :) – Amr Jun 20 '12 at 08:44
  • OK, if you need to check a condition outside the loop, then do so and set a variable that will then be used inside the loop. Duplicate or redundant code is bad, even more so in loops. Creating another variable may or may not solve this problem, depending on what you are trying to achieve. – Samy Vilar Jun 20 '12 at 08:49

7 Answers

5

First, a major component of the performance difference between your examples is the time it takes to look up a global. If we capture it into a local variable:

def inside_local():
    local_x = x
    for a in xrange(n):
        if local_x:
            pass
        else:
            pass

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.258    0.258    0.258    0.258 testloop.py:13(outside)
    1    0.314    0.314    0.314    0.314 testloop.py:21(inside_local)
    1    0.421    0.421    0.421    0.421 testloop.py:6(inside)

most of the performance difference disappears.
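A rough way to reproduce this comparison is with `timeit`. The sketch below assumes Python 3 (so `range` replaces `xrange`), uses a smaller `n` than the question to keep it quick, and the function names are illustrative rather than taken from the profile above. Absolute timings will vary by machine and interpreter version.

```python
import timeit

x = True       # module-level flag, never modified inside the loops
n = 100000     # smaller than the question's n so the sketch runs quickly

def inside_global():
    count = 0
    for a in range(n):
        if x:              # global name lookup on every iteration
            count += 1
    return count

def inside_local():
    local_x = x            # one global lookup, then local access in the loop
    count = 0
    for a in range(n):
        if local_x:
            count += 1
    return count

if __name__ == '__main__':
    for fn in (inside_global, inside_local):
        seconds = timeit.timeit(fn, number=10)
        print('%s: %.3f s' % (fn.__name__, seconds))
```

Both functions do the same work; only the way the flag is looked up differs, so any gap in the timings should come from the name lookup.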

In general, whenever you have common code you should try to encapsulate it. If the branches of the `if` have nothing in common apart from the loop, then try to encapsulate the loop iterator, e.g. in a generator.
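As a minimal sketch of the generator idea, assuming Python 3 and hypothetical names (`items` and `run` are not from the question): the shared iteration logic lives in one generator, and the flag is tested once to choose which loop consumes it.

```python
def items(n):
    """Shared iteration logic (illustrative): skip multiples of 3."""
    for a in range(n):
        if a % 3:
            yield a

def run(x, n):
    results = []
    if x:                       # the flag is tested once, outside the loops
        for a in items(n):
            results.append(a * 2)
    else:
        for a in items(n):
            results.append(-a)
    return results
```

The two loops are still written out, but whatever they have in common is defined only once, inside `items`.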

ecatmur
5

This is what I usually do in this situation.

def inside():
    def x_true(a):
        pass

    def x_false(a):
        pass

    if x:
        fn = x_true
    else:
        fn = x_false

    for a in xrange(n):
        fn(a)
wrongite
  • Thanks, setting variables as functions always escapes my mind xD – Amr Jun 20 '12 at 10:30
  • Caution: this can be slow, because (a) `x_true` and `x_false` are closures, and (b) the function call introduces overhead. It's definitely the cleanest option, though. – ecatmur Jun 20 '12 at 10:38
  • @ecatmur: I thought about the function call overhead, but are closures slower than normal functions? – Amr Jun 20 '12 at 10:40
  • @Amr marginally, yes; profiling I get 2.803 for closures and 2.766 when the closures are moved to module level. This compares to 0.292 and 0.390 for your original `outside` and `inside` respectively, so it's dominated by the function call overhead. – ecatmur Jun 20 '12 at 10:58
  • As ecatmur said, I've tested this and it turned out to be way slower. – Amr Jun 20 '12 at 12:04
3

Python has closures, lambda functions, first-class functions and many built-in functions, all of which help remove duplicate code. For example, if you needed to apply a function to a sequence of values, you could do it this way:

def outside():
    if x:        # x is a flag, or it could be the function itself, or ...
        fun = sum  # calculate the sum using Python's built-in sum function
    else:
        fun = lambda values: sum(values)/float(len(values))  # calculate the average using our own function

    result = fun(xrange(101))

If you give us an exact scenario we can help you optimize it.

Samy Vilar
2

I know of no interpreted language that provides support in that direction. Compiled languages are likely to make the comparison only once (loop-invariant optimization), but this would not help much if the evaluation of x is simple. Obviously, the code that replaces the pass statements can't be completely identical, since the "if" would then serve no purpose. Typically, one would write a procedure that is called in both places.
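A short sketch of that last suggestion, assuming Python 3 and hypothetical names (`common_step` and `process` are illustrative only): the part that both branches share goes into one procedure, and each loop keeps only what actually differs.

```python
def common_step(a):
    """Work that both branches share (hypothetical placeholder)."""
    return a * a

def process(x, n):
    total = 0
    if x:                         # evaluated once, outside the loops
        for a in range(n):
            total += common_step(a)
    else:
        for a in range(n):
            total -= common_step(a)
    return total
```

As noted in the comments on another answer in this thread, each function call adds overhead in Python, so this trades some speed for less duplication.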

guidot
1
def outside():
    def true_fn(a):
        pass
    def false_fn(a):
        pass

    fn = true_fn if x else false_fn
    for a in xrange(n):
        fn(a)
dbykov
  • I believe you shouldn't have a colon ':' at the end of the if. Also what are true_fn's and false_fn's values? This blows up upon running. Traceback (most recent call last): File "", line 1, in File "test.py", line 7, in outside fn = true_fn if x else false_fn NameError: global name 'true_fn' is not defined – octopusgrabbus Jun 20 '12 at 18:36
0

In your case, it depends on what you want: readability or performance.

If the task you are doing is some kind of filter, you can also use a list comprehension to run the loop:

[e for e in xrange(n) if x]

If you show a little more of your code I could suggest something.

Diego Navarro
0

Based on your original question, in which you wanted to test x's value without spending a lot of system resources, you have already accepted an answer that copies the value of the global x to a local variable.

Now, if computing the value of x involved a multi-step function, but you were guaranteed the result would always be the same for x, then I would consider memoizing the function. Here is a very nice Stack Overflow link on the subject
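One way to memoize is the standard library's `functools.lru_cache`, available in Python 3. The sketch below is an illustration under that assumption, not the approach from the linked post; `compute_x`, `count_matches` and the `calls` list are hypothetical names introduced only to show that the expensive body runs once.

```python
import functools

calls = []   # records each real evaluation, for demonstration only

@functools.lru_cache(maxsize=None)
def compute_x(key):
    """Stand-in for an expensive, multi-step check (hypothetical)."""
    calls.append(key)
    return key % 2 == 0

def count_matches(key, n):
    count = 0
    for a in range(n):
        if compute_x(key):    # served from the cache after the first call
            count += 1
    return count
```

The cached call still costs a function call per iteration, so when the value truly never changes during the loop, storing the result in a local variable before the loop is likely cheaper.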

octopusgrabbus