5

I wanted to check if a given x lies in the interval [0,a-1]. As a lazy coder I wrote

x in range(a)

and (because that piece of code was in 4.5 nested loops) quickly ran into performance issues. I tested it, and indeed the runtime of n in range(n) turns out to lie in O(n), give or take. I actually thought my code would be optimized to x >= 0 and x < a, but it seems this is not the case. Even if I fix the range(a) in advance, the time doesn't become constant (though it improves a lot); see the side notes.

So, my question is:

Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?


Side notes:

  1. I tried searching SO for range, python-2.7, performance tags put together, found nothing (same with python-2.x).
  2. If I try following:

    i = range(a)
    ...
    x in i
    

    so that the range is fixed and I only measure runtime of x in i, I still get runtime in O(x) (assuming a is large enough).

  3. runtime of n in xrange(n) lies in O(n) as well.
  4. I found this post, which asks a similar question for Python 3. I decided to test the same stuff on Python 3 and it rolled through the tests like it's nothing. I got sad for Python 2.
mike239x
  • P.S. pls tell me if you think this question need more tags (like python or python-2.x), I'll add those. – mike239x Feb 27 '18 at 04:43
  • yes, i'd recommend tagging it with `python`. Does `xrange` run quicker for you than `range`? Here's a [link](https://stackoverflow.com/questions/94935/what-is-the-difference-between-range-and-xrange-functions-in-python-2-x) – Kind Stranger Feb 27 '18 at 04:46
  • it would be much more efficient to do this comparison `x >= 0 and x <= (a-1)` – Haleemur Ali Feb 27 '18 at 05:03
  • 1
    @KindStranger yeah, version with `xrange` runs faster than the one with `range` but gets outperformed by the fixed-range version. So yeah, also linear time there. Well, with xrange it is no big surprise, since it is just a generator, AFAIK. – mike239x Feb 27 '18 at 05:05
  • `xrange` will be constant time. You should pretty much always use `xrange` in Python 2, indeed, in Python 3, `range` is the same as `xrange`, if you really want a list, you can always do `list(range())` in Python 3. Also, you should really be using Python 3. – juanpa.arrivillaga Feb 27 '18 at 05:07
  • 1
    Also, not sure how you expect Python to optimize such a thing. Note, `range` is just like any identifier, you can assign anything to it. But anyway, `range` in python 2 materializes a list, and then you do a linear search to find the `x`. So, yeah. – juanpa.arrivillaga Feb 27 '18 at 05:07
  • @mike239x I guess you have your answer - if you really value the efficiency/speed of the program and it makes enough of a difference, go for that over the readability of `xrange`. `xrange` isn't a generator as such, but from what i understand, it behaves in a similar way to one. – Kind Stranger Feb 27 '18 at 05:10
  • 1
    @juanpa.arrivillaga version with `xrange` also takes linear time. – mike239x Feb 27 '18 at 05:13
  • 1
    @mike239x holy hell, it is. Yeah, *another* great reason not to use Python 2. Anyway, `xrange` is technically not a generator, heck, it's not even an iterator. It is a sequence-type. – juanpa.arrivillaga Feb 27 '18 at 05:15
  • 6
    Note that `x >= 0 and x < a` can be written as `if 0 <= x < a`... – Jon Clements Feb 27 '18 at 12:53
  • You seem to consider O(n) as good enough; I definitely won't agree when an alternative is available which is easier to read, simpler to understand and O(1). – guidot Feb 27 '18 at 13:07
  • 1
    @ZeroPiraeus at the moment when I was making those edits my post was marked as a duplicate of another post, so I was not able to add an answer. – mike239x Feb 27 '18 at 15:55
  • @guidot I consider O(n) bad and O(1) good. – mike239x Feb 27 '18 at 15:57

4 Answers

6

The problem with range in Python 2 is that it creates a list of values, so x in range(a) will create a list and then scan that list linearly. xrange avoids building the list (it is a lazy sequence type rather than a generator), but it is not much faster: it still scans the values linearly, just without creating the entire list first.

In [2]: %timeit 5*10**5 in range(10**6 + 1)  # Python 2
10 loops, best of 3: 18.1 ms per loop

In [3]: %timeit 5*10**5 in xrange(10**6 + 1) # Python 2
100 loops, best of 3: 6.21 ms per loop
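
One way to see why a lazy sequence can still be linear under `in`: when a class defines no `__contains__`, Python falls back to iterating and comparing element by element. As far as I can tell that is the situation for `xrange` in Python 2. The sketch below uses a hypothetical stand-in class, `LazySeq` (not part of any library), that counts how many values the membership test pulls out.

```python
class LazySeq(object):
    """Hypothetical stand-in for a lazy 0..n-1 sequence without __contains__."""

    def __init__(self, n):
        self.n = n
        self.steps = 0  # number of values produced so far

    def __iter__(self):
        i = 0
        while i < self.n:
            self.steps += 1
            yield i
            i += 1


seq = LazySeq(10**6)
found = 5 * 10**5 in seq  # no __contains__, so Python iterates until a match
print(found, seq.steps)
```

The step count grows with the value being searched for, which matches the millisecond-scale timings above even though no list is ever built.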

In Python 3, range is much smarter, not only not creating the entire list, but also providing a fast implementation of the contains check.

In [1]: %timeit 5*10**5 in range(10**6 + 1)  # Python 3
1000000 loops, best of 3: 324 ns per loop

Even faster and IMHO more readable: Using comparison chaining:

In [2]: %timeit 0 <= 5*10**5 < 10**6 + 1     # Python 2 or 3
10000000 loops, best of 3: 46.6 ns per loop
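
The timings above come from IPython's `%timeit`. Outside IPython, roughly the same comparison can be sketched with the standard-library `timeit` module. The helper name `best_ns` and the chosen repeat counts are my own assumptions, and the absolute numbers will differ per machine.

```python
import timeit


def best_ns(stmt, setup="pass", number=100000, repeat=3):
    """Best per-call time in nanoseconds over several repeats."""
    times = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(times) / number * 1e9


setup = "x = 5 * 10**5; a = 10**6 + 1"
chained = best_ns("0 <= x < a", setup)
# Fast only on Python 3; on Python 2 lower `number` drastically first.
member = best_ns("x in range(a)", setup)
print("chained: %.1f ns, in range: %.1f ns" % (chained, member))
```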

Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?

"No", "it depends", and "yes". You should not use x >= 0 and x < a because 0 <= x < a is shorter and easier to parse (for puny humans), and is interpreted as (0 <= x) and (x < a). And you should not use in range in Python 2, but in Python 3, you can use it if you like.

Still, I'd prefer comparison chaining, since a <= x < b is much more explicit about bounds than x in range(a, b) (what if x == b?), which could prevent many off-by-one errors or +1 padding the range.

Also, note that 0 <= x < a is not strictly the same as x in range(0, a), as a range will only ever contain integer values, i.e. 1.5 in range(0, 5) is False, whereas 0 <= 1.5 < 5 is True, which may or may not be what you want. Also, using range you can use steps other than 1, e.g. 5 in range(4, 10, 2) is False, but the same can also be implemented using pure math, e.g. as (4 <= x < 10) and ((x - 4) % 2 == 0).
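
If the integer-only behaviour of range is actually wanted, the comparison can be combined with an integrality check. This is only a sketch; `in_int_interval` is a hypothetical helper name, and it assumes x is a number (comparing a string against 0 raises TypeError on Python 3).

```python
def in_int_interval(x, a):
    """True if x is an integer value with 0 <= x < a (hypothetical helper)."""
    # Bounds first: this short-circuits for nan/inf before int() is called.
    return 0 <= x < a and x == int(x)


print(0 <= 1.5 < 5)             # plain chained comparison accepts 1.5
print(in_int_interval(1.5, 5))  # the integer-aware variant rejects it
print(in_int_interval(3, 5))
```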

tobias_k
  • 2
    Addendum: As noted in another answer, `a <= x < b` is not strictly the same as `x in range(a, b)`, as a range will only ever contain integer values. Also, using `range` you can use steps other than 1, e.g. `5 in range(2, 10, 2)` is `False`. – tobias_k Feb 27 '18 at 15:46
  • Can you add to your answer that python3 range can be imported to python2 using `from future.builtins import range` in the header, please? – mike239x Feb 27 '18 at 16:16
  • @mike239x: That's not mentioned in the python docs. https://docs.python.org/2/library/future_builtins.html – Håken Lid Feb 27 '18 at 16:32
  • @HåkenLid 1) mine are future.builtins, yours are future_builtins 2) I found this site: http://python-future.org/imports.html and range is stated there under builtins 3) it worked for me – mike239x Feb 27 '18 at 16:41
  • So it's a third party library, not the standard lib. I see. – Håken Lid Feb 27 '18 at 16:42
  • @HåkenLid shall I make it into my own answer instead? – mike239x Feb 27 '18 at 16:51
  • 1
    Anyone has an idea why the downvote, so I can fix it, whatever it is? – tobias_k Feb 27 '18 at 19:52
  • @tobias_k No, your [ Answer ] is well structured, on-spot, clear and sound. Hate-full people do not care your insights and efforts and simply express the hate. It has nothing to do with your knowledge sponsoring efforts. Ó tempóra, ó mores ... ( given there are zero transaction costs associated with negative votes, the hate-full pathologies are so easy to spring out and expand ). Good Luck & Stay positive minded, tobias_k – user3666197 Feb 27 '18 at 21:19
  • @mike239x: Sure. Submit an answer explaining the future library. I did not know about it before I read your comment, and people who find this question in the future might be looking for something like that. I also edited my answer to include a mention of `future`. – Håken Lid Feb 27 '18 at 21:45
  • @tobias_k it was my downvote, because at the time, you were recommending `a <= x < b` without the caveat that this is not quite the same thing as a containment check for a `range` object. Reversed now :-) – Zero Piraeus Mar 05 '18 at 15:28
  • @ZeroPiraeus Well, I never said that it's the same, and OP was already using comparisons (just without chaining). Also, at that time I already explained it in the "Addendum" comment. But anyway, thanks for reverting, and thanks for stating the reason. :-) – tobias_k Mar 05 '18 at 15:43
2

You can get the same performance as in python3 by using a custom range class and overriding the in operator. For trivial cases it does not perform as well as a simple comparison, but you will avoid the O(n) memory and time usage that you get with the builtin range() or xrange().

Note that testing value in range(low, high) is different from low <= value < high, since range will only contain integers. So 7.2 in range(10) is False.

But more importantly range() can take an optional third step argument, so if you need to test value in range(low, high, step), you might consider using a custom class.

EDIT: @mike239x found the future package which contains a range object similar to the one in my answer (in addition to other functions that help you write code that's python2/3 cross compatible). It should be safe to use that since it's presumably well tested and stable.

An object of this class wraps an xrange object, and only overrides the very expensive in operation. For regular iteration it works just as xrange.

class range(object):
  """Python 2 range class that emulates the constant time `in` operation from python 3"""

  def __init__(self, *args):
    self.start, self.end = (0, args[0]) if len(args) == 1 else args[:2]
    self.step = 1 if len(args) < 3 else args[2]
    self.xrange = xrange(*args)

  def __contains__(self, other):
    # implements the `in` operator as O(1) instead of xrange's O(n)
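    try:
      # explicit check: an assert would be stripped under `python -O`
      if int(other) != other:
        return False  # other is not an integer value
    except (TypeError, ValueError, OverflowError):
      return False  # other cannot be converted to an integer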
    if self.step > 0:
      if not self.start <= other < self.end:
        return False  # other is out of range
    else:
      if not self.start >= other > self.end:
        return False  # other is out of range
    # other is in range. Check if it's a valid step
    return (self.start - other) % self.step == 0

  def __iter__(self):
    # returns an iterator used in for loops
    return iter(self.xrange)

  def __getattr__(self, attr):
    # delegate failed attribute lookups to the encapsulated xrange
    return getattr(self.xrange, attr)

The builtin xrange object is implemented in C, so we can't use class inheritance. Instead we can use composition and delegate everything except __contains__ to an encapsulated xrange object.

The implementation of contains can be compared to range_contains_long in the cpython rangeobject implementation. Here's the python 3.6 source code for that function.
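
The arithmetic behind such a constant-time check can also be written as a standalone function, independent of any class. As I read it, this is the same idea as the C function: a bounds test whose direction depends on the sign of the step, followed by a modulo test. `stepped_contains` is a hypothetical name, and the sketch assumes x is already an integer.

```python
def stepped_contains(x, start, stop, step):
    """O(1) membership test for an integer x in range(start, stop, step)."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        in_bounds = start <= x < stop
    else:
        in_bounds = stop < x <= start
    # x is a member only if it sits a whole number of steps away from start
    return in_bounds and (x - start) % step == 0


print(stepped_contains(5, 4, 10, 2))
print(stepped_contains(6, 4, 10, 2))
print(stepped_contains(4, 10, 0, -3))
```

A brute-force comparison against `list(range(start, stop, step))` over small values is a cheap way to convince yourself that the negative-step branch has its bounds the right way round.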

Edit: For a more comprehensive python implementation, check out future.builtins.range from the future library.

user3666197
Håken Lid
  • You might be interested, that **once indeed benchmarked**, the class-defined substitute of the native `range()` bears **`O( 1 )` scaled constant costs of `~ 5 [us] - 29 [us]`** ( i.e. way more in time ), than the O/P asked to compare against - i.e. the `if ( 0 <= x < a )` syntax, which is processed at about a costs of ~ 0 [us]. – user3666197 Feb 28 '18 at 00:03
  • Thanks for the benchmark. Your result is in line with what I expected. Even if an improved `range` offers constant time lookup, there's still overhead due to object creation, method calls etc. – Håken Lid Feb 28 '18 at 12:22
  • Actually funny to hear "thanks" so many times while having got several angry and hateful **DownVotes** for the very same, the quantitatively fair & repeatable benchmarks. ( You need not name the overheads associated with your proposed approach, have spent decades on performance tuning and latency-shaving ( for more details ref. the re-formulated Amdahl's Law context, where naive / poorly handled overhead costs may turn N-parallel code-execution schedule easily work *way* slower, than a pure-[SERIAL], original code-execution strategy ). **Benchmarks deliver facts+show cost (in-)efficiency** G/L – user3666197 Feb 28 '18 at 13:19
1

Call x in range( a ) slow? ( Notice py2 hidden RISK if using range() ... )

       23[us] spent in [py2] to process ( x in range( 10E+0000 ) )
        4[us] spent in [py2] to process ( x in range( 10E+0001 ) )
        3[us] spent in [py2] to process ( x in range( 10E+0002 ) )
       37[us] spent in [py2] to process ( x in range( 10E+0003 ) )
      404[us] spent in [py2] to process ( x in range( 10E+0004 ) )
     4433[us] spent in [py2] to process ( x in range( 10E+0005 ) )
    45972[us] spent in [py2] to process ( x in range( 10E+0006 ) )
   490026[us] spent in [py2] to process ( x in range( 10E+0007 ) )
  2735056[us] spent in [py2] to process ( x in range( 10E+0008 ) )

MemoryError

The in range( a ) construct is not only slow in the [TIME]-domain ( a pure sequential search through the enumerated list of values is O( N ); even a smarter search over it would reach O( log N ) at best ), but in py2 the native range() also always carries add-on O( N ) costs in both the [TIME]-domain ( the time to build the list ) and the [SPACE]-domain ( allocating the space to store it, plus the extra time spent moving all that data through memory ).
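
The [SPACE]-domain part can be made visible without Python 2 at hand. The sketch below, written for Python 3, materializes the list explicitly (which is what py2 range() does up front) and compares its size with the lazy range object. Note that `sys.getsizeof` reports only the list's own pointer array, not the integer objects it refers to, so the real footprint is larger still.

```python
import sys

for n in (10**3, 10**5, 10**6):
    as_list = list(range(n))  # what Python 2's range(n) builds up front
    lazy = range(n)           # Python 3 range: stores start/stop/step only
    print(n, sys.getsizeof(as_list), sys.getsizeof(lazy))
```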


Let's benchmark a safe, O( 1 ) scaled approach ( +always do benchmark )

>>> from zmq import Stopwatch
>>> aClk = Stopwatch()
>>> a = 123456789; x = 123456; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
4L
>>> a = 123456789; x = 123456; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
3L

It takes 3 ~ 4 [us] to evaluate the condition-based formulation, having O( 1 ) scaling, invariant to x magnitude.


Next, test the very same using an x in range( a ) formulation:

>>> a = 123456789; x = 123456; aClk.start(); _ = ( x in range( a ) );aClk.stop()

and your machine will almost freeze in memory-throughput-bound CPU starvation ( not to mention the nasty swap spillovers, which push access costs from some ~ 100 [ns] up by several orders of magnitude, to some ~ 15.000.000 [ns] for swap-disk IO data-flows ).


No, no, no. This is never the way to test whether x lies inside a bounded range.

Ideas to create some other, class-based evaluator that still approaches the problem via an enumeration ( set ) will never be able to meet the benchmarked 3 ~ 4 [us] ( unless using some extraterrestrial wizardry beyond my understanding of cause-effect laws in classical and quantum physics ).


Python 3 has changed how the range()-constructor works, but this was not the core point of the original post:

    3 [us] spent in [py3] to process ( x in range( 10E+0000 ) )
    2 [us] spent in [py3] to process ( x in range( 10E+0001 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0002 ) )
    2 [us] spent in [py3] to process ( x in range( 10E+0003 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0004 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0005 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0006 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0007 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0008 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0009 ) )
    2 [us] spent in [py3] to process ( x in range( 10E+0010 ) )
    1 [us] spent in [py3] to process ( x in range( 10E+0011 ) ) 

In Python 2, neither range() nor xrange() escapes the trap of O( N ) scaling; xrange() ( a lazy sequence, not a generator ) seems to be only about 2x less slow:

>>> from zmq import Stopwatch
>>> aClk = Stopwatch()

>>> for expo in xrange( 8 ):
...     a = int( 10**expo); x = a-2; aClk.start(); _ = ( x in range( a ) );aClk.stop()
...
3L
8L
5L
40L
337L
3787L
40466L
401572L
>>> for expo in xrange( 8 ):
...     a = int( 10**expo); x = a-2; aClk.start(); _ = ( x in xrange( a ) );aClk.stop()
...
3L
10L
7L
77L
271L
2772L
28338L
280464L

The range-bounds syntax enjoys O( 1 ) constant time of ~ < 1 [us], as demonstrated already above, so the yardstick to compare against was set:

>>> for expo in xrange( 8 ):
...     a = int( 10**expo); x = a-2; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
...
2L
0L
1L
0L
0L
1L
0L
1L
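
For readers without zmq installed, roughly the same scaling experiment can be sketched with the standard library only. This is an assumption-laden substitute for the Stopwatch runs above, not a reproduction of them: it targets Python 3.3+ ( `time.perf_counter` ), emulates the py2 `x in range( a )` by materializing the list explicitly, and uses the hypothetical helper names `time_us` and `scan_vs_compare`.

```python
import time


def time_us(fn):
    """Wall-clock duration of one call to fn, in microseconds."""
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1e6


def scan_vs_compare(max_expo=6):
    """Time list-build-plus-scan against the chained comparison per size."""
    rows = []
    for expo in range(max_expo + 1):
        a = 10 ** expo
        x = a - 2
        scan = time_us(lambda: x in list(range(a)))  # emulates py2 range()
        compare = time_us(lambda: 0 <= x < a)
        rows.append((a, scan, compare))
    return rows


for a, scan, compare in scan_vs_compare():
    print("%9d  scan %10.1f us  compare %6.1f us" % (a, scan, compare))
```

Single-shot timings like these are noisy at the small end, so only the growth trend across sizes is meaningful.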
user3666197
  • Not my downvote, but you basically just repeat what OP already said: That `in range` is slow (at least in Python 2). You don't mention why that is (although it can be read between the lines that you know why; maybe it seemed just too obvious?) – tobias_k Feb 27 '18 at 19:55
  • @user3666197 and if you think it is important to learn benchmark approach and such --- comments under the question are the place. Not the answers. – mike239x Feb 27 '18 at 20:31
  • @tobias_k **Right**, normally I lead my students to better get hands dirty first on collecting quantitative evidence from testing and then trying to formulate why that happens. Telling the result never helps more than getting one's own way from hypothesis to conclusion. Finally, the **anti-pattern test in the second half is exactly such a way**, might be better if being run on a growing scale of a ( 1E3, 1E4, 1E5, 1E6, 1E7, 1E8, 1E9 ) so as to smell the smoke, after in-cache and in-RAM computing starts to swap and almost freeze the O/S performance. – user3666197 Feb 27 '18 at 20:50
  • @mike239x You concentrate on the word awful -- did you notice the subject ? **The RISK hidden in your code INDEED _IS_ AWFUL**. Serious and resposible design does not leave room to devastate the system. So why to risk a frozen Operating System ( swapping memory2disk till dead ) if the **result does not cost more than `~ 3 [us]`** in due shape and fashion, **independently of the size of `a`** till anywhere above `1E+123...` in unrestricted precision python can deliver -- did you realise what is the root cause of the problem ? Seems the word "AWFUL" just made you miss the core message. My fault. – user3666197 Feb 27 '18 at 21:00
  • @mike239x StackOverflow is not a PR-site. If reading ***(cit.):** "...might be read as "you call this slow? you are wrong" **implying** "it is fast"... after which **a person would not read any further and simply downvote**."* Well, such person is very self-evicted. Having met more than ~ 510.000 readers attention here on StackOverflow, only a tiny fraction decided to express behaviour that you sketch in your explanation. Right, people who react without any further re-thinking about the text are students lost in hate. The very 1st row in the answer is clear & sound "not only **slow**..." – user3666197 Feb 27 '18 at 21:12
  • @user3666197 ok, in this case you better get a reply on "why downvote" from someone else, not me. Btw I am happy you actually changed first line. – mike239x Feb 27 '18 at 22:02
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/165934/discussion-between-mike239x-and-user3666197). – mike239x Feb 27 '18 at 22:05
0

So, yeah, basically, using range in Python 2 (as described) is a bad idea: Python actually creates a list with all the values of the range and afterwards searches through the whole list in the most straightforward way.

One of the options is the following: use range from Python 3, which handles the situation much better for various reasons. "Well", you ask, "how do I use range from Python 3 in Python 2?" The answer is: use the future library. Install that, write down

from future.builtins import range

in your code header and voilà! Your range now behaves like the one from Python 3, and you can use x in range(a) again without any performance issues.
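
A slightly more defensive variant of that import, as a sketch: it assumes the third-party `future` package (installed with `pip install future`) and falls back silently when it is missing, which is only harmless on Python 3, where the builtin range is already fast.

```python
try:
    # third-party "future" package; on Python 2 this shadows the builtin range
    from future.builtins import range
except ImportError:
    # assumption: we are on Python 3, where the builtin range is already fast;
    # on Python 2 without the package, the slow builtin would remain in use
    pass

a = 10**8
print((a - 2) in range(a))
print(a in range(a))
```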

mike239x