-1

In Python 3, how does the line `zip(*(range(1000),)*1000000)` execute in less than a second, even though it should have to process a lot of items?

Edit: This question has been marked as a possible duplicate of "Why is “1000000000000000 in range(1000000000000001)” so fast in Python 3?". While that question addresses the `range` function, this question addresses the `zip()` function and why it can run so quickly.

Anonymous
  • 1
    Why not? It's just a million copies of a range object. – L3viathan Sep 14 '19 at 16:32
  • 3
    Well, all it does is create a `range` object, create a tuple of length 1000000, and call a function that does nothing. – Aran-Fey Sep 14 '19 at 16:33
  • @L3viathan it is the fact that it zips them all in less than a second, not the range calculations themselves. – Anonymous Sep 14 '19 at 16:33
  • 6
    It doesn't zip them. `zip` is lazy. It does nothing until you iterate over it. – Aran-Fey Sep 14 '19 at 16:34
  • @Aran-Fey Please post this as an answer – Anonymous Sep 14 '19 at 16:41
  • 2
    **Warning**: Don't run this on Python 2! It nearly crashed my computer. `range` and `zip` behave differently between 2 and 3. – wjandrea Sep 14 '19 at 16:41
  • 2
    Meh, I'd rather not. We've got plenty of questions with the answer "zip is lazy in py3", I'm not convinced we need more of them – Aran-Fey Sep 14 '19 at 16:46
  • 1
    Related: [Why is “1000000000000000 in range(1000000000000001)” so fast in Python 3?](https://stackoverflow.com/q/30081275/4518341), [Lazy evaluation in Python](https://stackoverflow.com/q/20535342/4518341) – wjandrea Sep 14 '19 at 16:49
  • 1
    AFAIK zip() returns an iterator. Iterating over all that content will probably take more than a second. – Epion Sep 14 '19 at 17:34

1 Answer

4

Because in Python 3, `zip` returns an iterator: it only generates items as you request them, so the call itself does almost no work.

The same goes for `map` and `filter`. `range` is lazy as well, although it returns a lazy sequence object rather than an iterator.
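As a rough illustration of what "lazy" means here, the sketch below uses a hypothetical `record` helper (not part of the original answer) to show that `map` does not call its function until an item is requested. It also checks which of these built-ins are true iterators.

```python
from collections.abc import Iterator

calls = []

def record(x):
    # Hypothetical helper: the side effect shows when map() actually calls it.
    calls.append(x)
    return x * 2

lazy_map = map(record, [1, 2, 3])
calls_before = list(calls)      # snapshot taken before anything is consumed
first = next(lazy_map)          # pulls exactly one item
calls_after_one = list(calls)   # snapshot after one item was requested

# zip, map and filter objects are iterators; a range object is not.
kinds = {
    "zip": isinstance(zip([1], [2]), Iterator),
    "map": isinstance(lazy_map, Iterator),
    "filter": isinstance(filter(None, [0, 1]), Iterator),
    "range": isinstance(range(3), Iterator),
}
print(calls_before, first, calls_after_one, kinds)
```

A `range` object computes its values on demand too, but it supports indexing and `len()`, which is why it is not classified as an iterator.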

In Python 2, there were `range` and `xrange`, and `zip` and `itertools.izip`. The first of each pair builds the whole list in memory before you start iterating over it, while the second generates elements on the fly. Python 3 removed the eager versions, so the default names are now the lazy ones.

  • range() now behaves like xrange() used to behave, except it works with values of arbitrary size. The latter no longer exists.

  • zip() now returns an iterator.
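To make this concrete for the expression in the question, here is a minimal sketch. The copy count is scaled down (the `COPIES` constant is an assumption for illustration, the question used 1000000), and the variable names are hypothetical.

```python
from itertools import islice

# Scaled-down copy count for illustration; the question used 1000000.
COPIES = 10_000

args = (range(1000),) * COPIES   # one range object, referenced COPIES times
zipped = zip(*args)              # sets up one iterator per argument, yields nothing yet

# Tuple repetition copies references, not the range itself.
same_object = all(a is args[0] for a in args)

first_row = next(zipped)         # only now does zip build a tuple of COPIES values
rest = sum(1 for _ in islice(zipped, 5))  # pull five more rows on demand

print(same_object, len(first_row), rest)
```

Creating `zipped` only sets up the iterators. The expensive part would be consuming it, since each of the 1000 rows is a tuple with one value per copy.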

wjandrea
Diaa Sami
  • 2
    I noticed you wrote "generator" instead of "iterator" so I fixed it, but if you want to read more about the difference, this is a good place to start: [Difference between Python's Generators and Iterators](https://stackoverflow.com/q/2776829/4518341) – wjandrea Sep 14 '19 at 18:32