289

Recently I started using Python3 and its lack of xrange hurts.

Simple example:

1) Python2:

from time import time as t
def count():
  st = t()
  [x for x in xrange(10000000) if x%4 == 0]
  et = t()
  print et-st
count()

2) Python3:

from time import time as t

def xrange(x):
    return iter(range(x))

def count():
    st = t()
    [x for x in xrange(10000000) if x%4 == 0]
    et = t()
    print (et-st)
count()

The results are, respectively:

1) 1.53888392448
2) 3.215819835662842

Why is that? I mean, why has xrange been removed? It's such a great tool to learn, for beginners just like myself, like we all were at some point. Why remove it? Can somebody point me to the proper PEP? I can't find it.

Cheers.

Air
catalesia
  • 245
    `range` in Python 3.x is `xrange` from Python 2.x. It was in fact Python 2.x's `range` that was removed. – Anorov Feb 21 '13 at 23:43
  • 31
    PS, you should never time with `time`. Besides being easier to use and harder to get wrong, and repeating tests for you, `timeit` takes care of all kinds of things you won't remember, or even know how, to take care of (like disabling the GC), and may use a clock with thousands of times better resolution. – abarnert Feb 22 '13 at 00:06
  • 9
    Also, why are you testing the time to filter the `range` on `x%4 == 0`? Why not just test `list(xrange())` vs. `list(range())`, so there's as little extraneous work as possible? (For example, how do you know 3.x isn't doing `x%4` more slowly?) For that matter, why are you building a huge `list`, which involves a whole lot of memory allocation (which, besides being slow, is also incredibly variable)? – abarnert Feb 22 '13 at 00:14
  • 6
    See http://docs.python.org/3.0/whatsnew/3.0.html , section "Views And Iterators Instead Of Lists": "range() now behaves like xrange() used to behave, except it works with values of arbitrary size. The latter no longer exists." So, range now returns an iterator. `iter(range)` is redundant. – ToolmakerSteve Dec 14 '13 at 23:02
  • 10
    Sorry, realized quoting the change doc doesn't make it blindingly obvious. For anyone else who is confused, and doesn't want to read through the long accepted answer, and all its comments: **Wherever you were using xrange in python 2, use range in python 3. It does what xrange used to do, which is return an iterator. If you need the results in a list, do `list(range(..))`. That is equivalent to python 2's range.** Or to say it another way: **xrange has been renamed range, because it is the better default; it wasn't necessary to have both, do `list(range)` if you really need a list.**. – ToolmakerSteve Dec 14 '13 at 23:12
  • Clearly, ``list(range(...))`` is a necessary change. – Wes Turner Nov 08 '14 at 00:08

6 Answers

183

Some performance measurements, using timeit instead of trying to do it manually with time.

First, Apple 2.7.2 64-bit:

In [37]: %timeit collections.deque((x for x in xrange(10000000) if x%4 == 0), maxlen=0)
1 loops, best of 3: 1.05 s per loop

Now, python.org 3.3.0 64-bit:

In [83]: %timeit collections.deque((x for x in range(10000000) if x%4 == 0), maxlen=0)
1 loops, best of 3: 1.32 s per loop

In [84]: %timeit collections.deque((x for x in xrange(10000000) if x%4 == 0), maxlen=0)
1 loops, best of 3: 1.31 s per loop

In [85]: %timeit collections.deque((x for x in iter(range(10000000)) if x%4 == 0), maxlen=0) 
1 loops, best of 3: 1.33 s per loop

Apparently, 3.x range really is a bit slower than 2.x xrange. And the OP's xrange function has nothing to do with it. (Not surprising, as a one-time call to the __iter__ slot isn't likely to be visible among 10000000 calls to whatever happens in the loop, but someone brought it up as a possibility.)

But it's only 30% slower. How did the OP get 2x as slow? Well, if I repeat the same tests with 32-bit Python, I get 1.58 vs. 3.12. So my guess is that this is yet another of those cases where 3.x has been optimized for 64-bit performance in ways that hurt 32-bit.

But does it really matter? Check this out, with 3.3.0 64-bit again:

In [86]: %timeit [x for x in range(10000000) if x%4 == 0]
1 loops, best of 3: 3.65 s per loop

So, building the list takes more than twice as long as the entire iteration.

And as for "consumes much more resources than Python 2.6+", from my tests, it looks like a 3.x range is exactly the same size as a 2.x xrange—and, even if it were 10x as big, building the unnecessary list is still about 10000000x more of a problem than anything the range iteration could possibly do.

And what about an explicit for loop instead of the C loop inside deque?

In [87]: def consume(x):
   ....:     for i in x:
   ....:         pass
In [88]: %timeit consume(x for x in range(10000000) if x%4 == 0)
1 loops, best of 3: 1.85 s per loop

So, almost as much time wasted in the for statement as in the actual work of iterating the range.

If you're worried about optimizing the iteration of a range object, you're probably looking in the wrong place.
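For anyone who wants to rerun this kind of comparison outside IPython, here is a minimal sketch using the standard `timeit` module under Python 3. The helper names `consume_with_deque`, `build_list` and `best_of`, and the smaller `N`, are my own choices for illustration and not part of the measurements above; absolute numbers will differ from machine to machine.

```python
import collections
import timeit

N = 1000000  # assumption: smaller than the 10000000 used above, to keep this quick


def consume_with_deque():
    # Exhaust the generator inside deque's C loop without storing any element.
    collections.deque((x for x in range(N) if x % 4 == 0), maxlen=0)


def build_list():
    # Same filter, but every matching value is kept in a list.
    return [x for x in range(N) if x % 4 == 0]


def best_of(func, repeat=3):
    # Like %timeit, report the best of several single runs.
    return min(timeit.repeat(func, number=1, repeat=repeat))


if __name__ == "__main__":
    print("deque consume: %.3f s" % best_of(consume_with_deque))
    print("list build:    %.3f s" % best_of(build_list))
```

Taking the minimum of several runs is the usual way to reduce noise from other processes; it says nothing about which variant is faster on your system until you run it.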


Meanwhile, you keep asking why xrange was removed, no matter how many times people tell you the same thing, but I'll repeat it again: It was not removed: it was renamed to range, and the 2.x range is what was removed.

Here's some proof that the 3.3 range object is a direct descendant of the 2.x xrange object (and not of the 2.x range function): the source to 3.3 range and 2.7 xrange. You can even see the change history (linked to, I believe, the change that replaced the last instance of the string "xrange" anywhere in the file).

So, why is it slower?

Well, for one, they've added a lot of new features. For another, they've done all kinds of changes all over the place (especially inside iteration) that have minor side effects. And there's been a lot of work to dramatically optimize various important cases, even if it sometimes slightly pessimizes less important cases. Add this all up, and I'm not surprised that iterating a range as fast as possible is now a bit slower. It's one of those less-important cases that nobody would ever care enough to focus on. No one is likely to ever have a real-life use case where this performance difference is the hotspot in their code.

abarnert
  • But it's only 30% slower. Still slower, but a great response mate, something to think about. It doesn't answer my question though: why was xrange removed?? Think about it this way - if you had a performance-dependent app based on multiprocessing, knowing how much of the queue you need to consume at a time, would 30% make a difference or not? You see, you say it doesn't matter, but every time I use range I hear that huge distressing fan sound meaning the CPU is at its worst, while xrange doesn't do it. Think about it ;) – catalesia Feb 22 '13 at 00:16
  • 11
    @catalesia: Once again, it wasn't removed, it was just renamed `range`. The `range` object in 3.3 is a direct descendant of the `xrange` object in 2.7, not of the `range` function in 2.7. It's like asking why `itertools.imap` was removed in favor of `map`. There is no answer, because no such thing happened. – abarnert Feb 22 '13 at 00:18
  • 1
    @catalesia: The minor performance changes are presumably not the result of a direct design decision to make ranges slower, but a side effect of 4 years of changes all over Python that have made many things faster, some things a little slower (and some things faster on x86_64 but slower on x86, or faster in some use cases but slower in others, etc.). Nobody was likely worried about a 30% difference either way in how long it takes to iterate a `range` while doing nothing else. – abarnert Feb 22 '13 at 00:20
  • 1
    "Nobody was likely worried about a 30% difference either way in how long it takes to iterate a range _while doing nothing else._" Exactly. – catalesia Feb 22 '13 at 00:25
  • 18
    @catalesia: Yes, exactly. But you seem to think that means the opposite of what it says. It's not a use case that anyone will ever care about, so nobody noticed that it got 30% slower. So what? If you can find a real-life program that runs more slowly in Python 3.3 than in 2.7 (or 2.6) because of this, people will care. If you can't, they won't, and you shouldn't either. – abarnert Feb 22 '13 at 00:33
148

Python3's range is Python2's xrange. There's no need to wrap an iter around it. To get an actual list in Python3, you need to use list(range(...)).

If you want something that works with Python2 and Python3, try this:

try:
    xrange
except NameError:
    xrange = range
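
A small sketch of how the shim above might be used. `evens_below` is a hypothetical function made up for this illustration; the point is only that the same file should run unchanged under Python 2 and Python 3.

```python
try:
    xrange  # already defined on Python 2
except NameError:
    xrange = range  # Python 3: range is already lazy


def evens_below(n):
    # Hypothetical helper: iterates lazily on both versions.
    return [x for x in xrange(n) if x % 2 == 0]


print(evens_below(10))  # [0, 2, 4, 6, 8]
```

Code written this way should only ever call `xrange`; any remaining bare `range` calls still return a list on Python 2 and a range object on Python 3.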
John La Rooy
  • 1
    Sometimes you need code that works in both Python 2 and 3. This is a good solution. – Greg Glockner Jun 14 '17 at 15:10
  • 3
    The trouble is that with this, code that uses both `range` and `xrange` will behave differently. It's not enough to do this, one would also have to make sure never to assume that `range` is returning a list (as it would in python 2). – LangeHaare Sep 01 '17 at 11:47
  • You can use xrange from this project. There is a `futurize` tool to automatically convert your source code: http://python-future.org/reference.html?highlight=xrange#past.builtins.xrange – guettli Mar 22 '18 at 20:42
19

Python 3's range type works just like Python 2's xrange. I'm not sure why you're seeing a slowdown, since the iterator returned by your xrange function is exactly what you'd get if you iterated over range directly.

I'm not able to reproduce the slowdown on my system. Here's how I tested:

Python 2, with xrange:

Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import timeit
>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=100)
18.631936646865853

Python 3, with range, is a tiny bit faster:

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import timeit
>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)
17.31399508687869

I recently learned that Python 3's range type has some other neat features, such as support for slicing: range(10,100,2)[5:25:5] is range(20, 60, 10)!
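
A quick sketch to check the slicing claim, plus a couple of related operations, assuming Python 3 (3.2 or later for the arithmetic membership test on ints). The variable names are arbitrary.

```python
r = range(10, 100, 2)

# Slicing a range gives another range; nothing is materialised.
s = r[5:25:5]
print(s)        # range(20, 60, 10)
print(list(s))  # [20, 30, 40, 50]

# Indexing and len() work without iterating.
print(r[0], r[-1], len(r))  # 10 98 45

# Membership for ints is computed arithmetically, so huge ranges are fine.
big = range(0, 10**12, 3)
print(999999999999 in big)  # True
print(10**12 - 2 in big)    # False
```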

Blckknght
  • Perhaps the slowdown comes from the lookup of the new `xrange` so many times, or is that done only once? – askewchan Feb 21 '13 at 23:46
  • Does an iterator actually increase speed anyway? I thought it just saved memory. – askewchan Feb 21 '13 at 23:51
  • @askewchan Memory is everything, when it comes to heavy processor operations. I can class that method, but it still gives the same result. More or less. Anyway, I would be glad if anyone pointed me to the PEP explaining the reason behind removing xrange. – catalesia Feb 21 '13 at 23:54
  • @askewchan It depends on how big the set is. Beyond a certain size, memory allocation (and garbage collection) represents a large part of the running time. – Blckknght Feb 21 '13 at 23:54
  • 3
    @catalesia I think the point here is that `xrange` was _not_ removed, just _renamed_. – askewchan Feb 21 '13 at 23:54
  • @askewchan - renamed or not, it is much slower and consumes much more resources than Python 2.6+. Of course, one can build scipy and such, but come on! I will not understand why a perfectly working solution was removed in order to... Exactly, what?? – catalesia Feb 21 '13 at 23:57
  • The removal of Python 2's `xrange` (with its semantics being given to `range` in Python 3) is mentioned briefly in [PEP 3100](http://www.python.org/dev/peps/pep-3100/#built-in-namespace). – Blckknght Feb 21 '13 at 23:57
  • @catalesia Dunno, I use 2.7 :P – askewchan Feb 21 '13 at 23:58
  • 1
    @Blckknght: Cheers, but it still sucks having an explanation the likes of: "Set literals and comprehensions [19] [20] [done] {x} means set([x]); {x, y} means set([x, y]). {F(x) for x in S if P(x)} means set(F(x) for x in S if P(x)). NB. {range(x)} means set([range(x)]), NOT set(range(x)). There's no literal for an empty set; use set() (or {1}&{2} :-). There's no frozenset literal; **they are too rarely needed.**" – catalesia Feb 22 '13 at 00:03
  • @askewchan: Hehe, 2.6 here, but we have to move on mate ;) – catalesia Feb 22 '13 at 00:03
  • 3
    The biggest win in 3.x `range`, as far as I'm concerned, is the constant-time `__contains__`. Newbies used to write `300000 in xrange(1000000)` and that caused it to iterate the whole `xrange` (or at least the first 30% of it), so we had to explain why that was a bad idea, even though it looks so pythonic. Now, it _is_ pythonic. – abarnert Feb 22 '13 at 00:05
  • Newbies are the base of Python, never forget that. Python, or any language of the sort. G'Night people! – catalesia Feb 22 '13 at 00:27
  • @catalesia: Agreed 100%. That was my point—they replaced `range.__contains__` solely to make it work the way newbies expected—and that was a great change. – abarnert Feb 22 '13 at 00:34
  • @catalesia The reference in PEP 3100 is `Make built-ins return an iterator where appropriate (e.g. range() ...) To be removed: ... xrange(): use range() instead`. The PEP has a reference to a [talk given by Guido](http://www.python.org/doc/essays/ppt/pycon2003/pycon2003.ppt) (the creator of Python) explaining the reason for the change. Part of Python's design philosophy is that "there should be one—and preferably only one—obvious way to do it". Having both `range` and `xrange` violated the "only one" part of that principle. – Blckknght Feb 22 '13 at 01:29
2

One way to fix up your python2 code is:

import sys

if sys.version_info >= (3, 0):
    def xrange(*args, **kwargs):
        return iter(range(*args, **kwargs))
andrew pate
  • 1
    The point is in python3 xrange is not defined, so legacy code that used xrange breaks. – andrew pate Jan 23 '17 at 13:36
  • no , simply define `range = xrange` as is in comment by @John La Roy – mimi.vx Jun 19 '18 at 11:22
  • 1
    @mimi.vx Not sure range=xrange would work in Python3 because xrange is not defined. My comment refers to the case where you have old legacy code that contains xrange calls AND you're trying to get it to run under python3. – andrew pate Jun 20 '18 at 13:04
  • 1
    Ah , my bad .. `xrange = range` ... i switched statements – mimi.vx Jun 21 '18 at 13:58
  • range *IS* an iterator, and anyway this would be a terrible idea even if it wasn't, because it has to unpack the entire range first and loses the advantages of using an iterator for this sort of thing. So the correct response is not "range=xrange", it's "xrange=range" – Shayne Jul 01 '18 at 10:22
  • @Shayne I think you're right. I would go with John La Roy's solution below now, it looks good. – andrew pate Jul 04 '18 at 17:08
1

xrange from Python 2 is a generator and implements the iterator protocol, while range is just a function. I don't know why xrange was dropped in Python 3.

  • No, range is not an iterator. You can't do next() with this structure. For further info, you can check here http://treyhunner.com/2018/02/python-range-is-not-an-iterator/ – Michel Fernandes Apr 09 '18 at 00:30
  • Thank you so much for the clarification. But I will restate the intent of the original comment, and that is that PY3 `range()` is the equivalent of PY2 `xrange()`. And thus in PY3 `xrange()` is redundant. – Stephen Rauch Apr 09 '18 at 00:41
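
The iterable-versus-iterator distinction raised in these comments can be checked directly. A minimal sketch, assuming Python 3:

```python
r = range(3)

# A range is iterable, but it is not itself an iterator.
try:
    next(r)
except TypeError as exc:
    print("next(range) failed:", exc)

# iter() hands back a separate iterator object.
it = iter(r)
print(next(it), next(it))  # 0 1

# The range itself is not consumed and can be looped over again.
print(list(r), list(r))    # [0, 1, 2] [0, 1, 2]
print(list(it))            # [2]
```

So `for x in range(n)` works because the `for` statement calls `iter()` for you, not because the range object is an iterator.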
-1

comp:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2

>>> import timeit
>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=100)

5.656799077987671

>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=100)

5.579368829727173

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

21.54827117919922

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

22.014557123184204

With the timeit parameter number=1:

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=1)

0.2245171070098877

>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=1)

0.10750913619995117

comp:~$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29) [GCC 4.8.4] on linux

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

9.113872020003328

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

9.07014398300089

With the timeit parameter number=1, 2, 3, 4 it runs quickly and the total time grows linearly:

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=1)

0.09329321900440846

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=2)

0.18501482300052885

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=3)

0.2703447980020428

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=4)

0.36209142999723554

So it seems that if we time a single run of the loop, e.g. timeit.timeit("[x for x in range(1000000) if x%4]",number=1) (which is how we actually use it in real code), Python 3 is quick enough, but over repeated runs Python 2's xrange() beats Python 3's range() on speed.
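
Since `timeit.timeit` returns the total time for all `number` executions, one way to compare runs with different `number` values is to divide by `number`. A rough sketch under that assumption; `per_run` is a hypothetical helper, and the statement uses a smaller range than the sessions above so it finishes quickly.

```python
import timeit

# Assumption: 100000 instead of 1000000 to keep the sketch fast.
STMT = "[x for x in range(100000) if x % 4]"


def per_run(stmt, number, repeat=3):
    # timeit.repeat returns `repeat` totals, each covering `number` runs;
    # take the best total and divide to get seconds per single run.
    return min(timeit.repeat(stmt, number=number, repeat=repeat)) / number


for number in (1, 10, 50):
    print("number=%2d: %.6f s per run" % (number, per_run(STMT, number)))
```

If the per-run figures are roughly equal across `number` values, the totals are simply scaling linearly, which is what the number=1,2,3,4 measurements suggest.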

dmitriy