0

I want to know whether the list created (instantiated) for a for loop reduces the efficiency of my program.

For example:

for i in range(1, 10000):
    print("This i = ", i)

Please tell me whether the list [1, 2, 3, ..., 9999] (the values covered by range(1, 10000)) is generated (instantiated) on every iteration. If it is, that would be a huge overhead and make the program inefficient.

Actually I want to use it like this:

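# Assumed imports behind the REQ and BS aliases used below
import requests as REQ
from bs4 import BeautifulSoup as BS
with open("bbc.txt", 'w', encoding='utf-8') as bbcFile:
    for headline in BS(REQ.get("https://www.bbc.com").text, 'html.parser').find_all('div', {'class':'media__content'}):
        bbcFile.write(" ".join(headline.text.split()) + "\n\n")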
Ali Sajjad
  • 3
    A `range` relies on a generator, so no, no list will be generated 10000 times – Thomas Schillaci Mar 12 '20 at 10:33
  • 2
    In Python 3... `range(...)` doesn't produce a list at all... – Jon Clements Mar 12 '20 at 10:33
  • 2
    There is no `range()` in your "actual" example. BeautifulSoup _will_ construct in-memory lists at some point, though. – AKX Mar 12 '20 at 10:34
  • 4
    @ThomasSchillaci Not exactly a generator. It's a custom object, check out https://stackoverflow.com/questions/30081275/why-is-1000000000000000-in-range1000000000000001-so-fast-in-python-3 – TerryA Mar 12 '20 at 10:34
  • Actually it's an iterable - but a lazy one indeed. – bruno desthuilliers Mar 12 '20 at 10:34
  • @brunodesthuilliers well... can also act as a sequence in terms of indexing due to some maths... – Jon Clements Mar 12 '20 at 10:36
  • @JonClements thanks for the update, I missed that point obviously. – bruno desthuilliers Mar 12 '20 at 10:37
  • Not that one wants to do something like: `range(12345, 1000000000, 74)[-7]` or some other convoluted example often :) – Jon Clements Mar 12 '20 at 10:37
  • @AKX I actually thought that range(a, b) is a collection and that the BeautifulSoup ResultSet holding all the headline tags is a collection as well. – Ali Sajjad Mar 12 '20 at 10:48
  • @AliSajjad it *is* a collection, it is not a generator at all, it isn't even an iterator. It does **not** produce a `list` object though, but a `range` object, which is a specialized sequence-like object that does not store all the items it "contains" in memory at once, rather, arithmetic sequences can be represented efficiently with simply a `start`, `stop`, `step` – juanpa.arrivillaga Mar 12 '20 at 10:49

2 Answers

2

The Python docs define the for statement as:

for_stmt ::=  "for" target_list "in" expression_list ":" suite
              ["else" ":" suite]

According to the aforementioned docs,

The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list. The suite is then executed once for each item provided by the iterator, in the order returned by the iterator. Each item in turn is assigned to the target list using the standard rules for assignments (see Assignment statements), and then the suite is executed.
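As a rough sketch of what that paragraph describes, a for loop behaves much like the hand-written `iter()`/`next()` loop below. The names `items`, `visited` and `iterator` are made up for this illustration, and the optional `else` clause is left out.

```python
items = ["a", "b", "c"]        # stands in for the result of expression_list
visited = []

iterator = iter(items)         # done once, before the first iteration
while True:
    try:
        item = next(iterator)  # one item per pass
    except StopIteration:
        break                  # iterator exhausted, the loop ends
    visited.append(item)       # the loop body (the "suite")

print(visited)
```

Only the `next()` call is repeated; the expression after `in` and the `iter()` call sit outside the loop.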

So no, whatever expression you use to produce the iterable is evaluated only once. You can test this yourself:

>>> class MyIterable:
...     def __init__(self):
...         print("Initialized")
...     def __iter__(self):
...         yield from (1,2,3)
...
>>> for x in MyIterable():
...     print(x)
...
Initialized
1
2
3
>>>
juanpa.arrivillaga
1

For Python 3, no. range(1, 10000) creates a range object that produces items when necessary:

>>> range(1, 10000)
range(1, 10000)
>>> type(range(1, 10000))
<class 'range'>

So there is never a list [1, ..., 9999] stored in memory.
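A small sketch of what that means in practice, assuming CPython (the numbers reported by `sys.getsizeof` are implementation details, and `small`, `huge` and `as_list` are names invented for this example):

```python
import sys

small = range(1, 10000)
huge = range(1, 10**12)
as_list = list(small)          # this one really does store 9999 items

# The range objects only keep start, stop and step, so their size does
# not grow with the number of values they cover.
print(sys.getsizeof(small), sys.getsizeof(huge), sys.getsizeof(as_list))

# They still behave like sequences: length, indexing and membership
# are computed arithmetically instead of by scanning stored items.
print(len(small))              # 9999
print(small[-1])               # 9999
print(10**11 in huge)          # True
```

Iterating over `huge` would still take a long time; the point is only that nothing is materialized up front.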

A great SO question to check out is this one, which explains the range object.

TerryA
  • I think the issue is whether the expression for the iterable in the for-statement is re-evaluated on each iteration, so take for example `for x in list(range(10000)): ...` – juanpa.arrivillaga Mar 12 '20 at 10:43
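
To check the case from that last comment, here is a minimal sketch that counts how often a real list is built for the loop. `CountingList` is a hypothetical helper written only for this test, not part of any library.

```python
class CountingList(list):
    """A list that records how many instances have been created."""
    built = 0

    def __init__(self, *args):
        super().__init__(*args)
        CountingList.built += 1


total = 0
# Same shape as `for x in list(range(10000))`, with the counting subclass.
for x in CountingList(range(10000)):
    total += x

print(CountingList.built)      # 1
print(total)                   # 49995000
```

The list is built once, before the first iteration, and the loop then walks over that single object.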