-2

I am trying to run this for loop in a Scrapy spider (Python), but the outer loop does not seem to work: I only get results for i=156, and it does not loop over the other values i=1, 2, 3, 4, ...

for i in range(1,157):
     start_urls = ["https://appworld.blackberry.com/cas/producttype/apps?"
          "countryid=100&lang=en&page={}&pagesize={}&"
          "sortby=popular&licensetype=all&callback=_producttype_apps"
          "&_=1499255634459".format(i, page) for page in xrange(1,1001)]

The first loop, outside start_urls, is not working. The result I need is this: for page=1, pagesize should run from 1 to 1000; for page=2, pagesize should again run from 1 to 1000; and so on up to page=156.

paul trmbrth
emon
  • Possible duplicate of [Python TypeError: not enough arguments for format string](https://stackoverflow.com/questions/11146190/python-typeerror-not-enough-arguments-for-format-string) – syntonym Jul 05 '17 at 12:46
  • the `for` loop is fine, your problem is that you are misusing string formatting. have a look at https://pyformat.info/ – Stael Jul 05 '17 at 12:46
  • You get a `TypeError` that states that your string formatting is wrong and you assume that the problem is the `for` loop? – UnholySheep Jul 05 '17 at 12:47
  • I don't think you're supposed to have multiple percent signs like that. As a simpler example, compare the results of `"%d%d" %23 %42` and `"%d%d" % (23,42)`. – Kevin Jul 05 '17 at 12:48
  • What exactly do you mean with "is not working"? Does it throw an error? If that is the case, can you post the complete traceback? If it doesn't throw an error is the result you get and the result you expect different? If yes, what result would you expect and what result do you get? – syntonym Jul 05 '17 at 13:19
  • @syntonym the value of i for first loop is starting from 156 not from 1 – emon Jul 05 '17 at 13:23
  • @emon - you are reassigning the whole list with each iteration, so only the final iteration exists by the time you exit the loop. I've put in an answer explaining and fixing this. – Stael Jul 05 '17 at 13:27
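Kevin's comparison in the comments above can be checked directly. This is a minimal sketch, written for Python 3, and the variable names are only illustrative: chaining `%` hands the first operator a single value even though the string has two placeholders, while a tuple supplies both values in one operation.

```python
# Chained "%": evaluated left to right, so the first "%" only receives 23,
# but the format string has two placeholders.
try:
    chained = "%d%d" % 23 % 42
except TypeError as exc:
    chained = None
    error_message = str(exc)

# A tuple supplies both values to a single formatting operation.
combined = "%d%d" % (23, 42)

print(error_message)
print(combined)
```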

4 Answers

2

You are using % format strings incorrectly. You could pass a tuple, but it may be easier to use .format(). You can also split the string across lines for better formatting:

start_urls = ["https://appworld.blackberry.com/cas/producttype/apps?"
              "countryid=100&lang=en&page={}&pagesize={}&"
              "sortby=popular&licensetype=all&callback=_producttype_apps&"
              "_=1499255634459".format(i, page) for page in xrange(1,1001)]
AChampion
  • I think you forgot to add `page` after `for`: "for page in xrange(1,1001)" – Tejas Thakar Jul 05 '17 at 12:50
  • Thank you, fixed. And given the args, I imagine the OP has the variables the wrong way around too. – AChampion Jul 05 '17 at 12:51
  • I am new to python, first loop outside start_url is not running – emon Jul 05 '17 at 12:59
  • It is running but I imagine you are not saving each time around the loop and are only seeing the last list of `start_urls`. Perhaps you meant `start_urls = []` outside of the loop and `start_urls.extend(['https://...])` inside the loop. – AChampion Jul 05 '17 at 14:01
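The suggestion in the last comment (create the list once, then extend it inside the loop) could look roughly like the sketch below. It assumes Python 3, so `range` stands in for `xrange`, and the name `BASE` is a hypothetical helper that does not appear in the original code.

```python
# BASE is a hypothetical name for the URL template from the question.
BASE = ("https://appworld.blackberry.com/cas/producttype/apps?"
        "countryid=100&lang=en&page={}&pagesize={}&"
        "sortby=popular&licensetype=all&callback=_producttype_apps"
        "&_=1499255634459")

start_urls = []                      # created once, before the loop
for i in range(1, 157):              # page runs from 1 to 156
    # extend() adds this page's URLs to the list instead of replacing the list
    start_urls.extend(BASE.format(i, pagesize) for pagesize in range(1, 1001))

print(len(start_urls))               # 156 pages * 1000 pagesizes
print(start_urls[0])
```

Because the list is never reassigned, the URLs from every iteration are still present after the loop ends, not just those for i=156.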
1

Hello Emon,

Try this code. Do not be confused by the "\" in the code below: it is Python's built-in line-continuation character, used to split one statement across multiple lines.

for i in range(1,157):
    start_urls = ['https://appworld.blackberry.com/cas/producttype/apps?' \
            'countryid=100&lang=en&' \
            'page=%d' \
            '&pagesize=%s'\
            '&sortby=popular&licensetype=all&callback=_producttype_apps&_=1499255634459' % (i,page) for page in xrange(1,1001)]

    # Display starts_urls 
    print(start_urls)

Using the format method, accessing arguments by position:

for i in range(1,157):
    start_urls = ['https://appworld.blackberry.com/cas/producttype/apps?' \
            'countryid=100&lang=en&' \
            'page={0}' \
            '&pagesize={1}' \
            '&sortby=popular&licensetype=all&callback=_producttype_apps&_=1499255634459'.format(i,page) for page in xrange(1,1001)]

When you work with long strings of any kind, I suggest using the format method, because it keeps the code easy to read.
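As a small illustration of positional arguments, here is a sketch with shortened, made-up strings (the names `template`, `url_part` and `swapped` are illustrative only). It also shows that adjacent string literals inside parentheses are joined automatically, so the "\" is optional there.

```python
# Adjacent string literals inside parentheses are concatenated by Python,
# so no "\" is needed at the end of each line.
template = (
    "page={0}"
    "&pagesize={1}"
)

# {0} takes the first argument of format(), {1} the second.
url_part = template.format(3, 50)

# The indexes refer to argument positions, not to the order in the string.
swapped = "pagesize={1}&page={0}".format(3, 50)

print(url_part)
print(swapped)
```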

For more examples, see the links below:
1. https://wiki.python.org/moin/ForLoop
2. https://www.codeproject.com/Tips/820236/Python-String-Formatting-Using-format-Method

I hope my answer is helpful. If you have any questions, please leave a comment.

Er CEO Vora Mayur
0

What you are probably trying to do is:

start_urls = ["https://appworld.blackberry.com/cas/producttype/apps?"
      "countryid=100&lang=en&page={}&pagesize={}&"
      "sortby=popular&licensetype=all&callback=_producttype_apps"
      "&_=1499255634459".format(i, page) for page in xrange(1,1001) for i in range(1,157)]

If the order matters, you can do it the other way around by reversing the for loops:

start_urls = ["https://appworld.blackberry.com/cas/producttype/apps?"
              "countryid=100&lang=en&page={}&pagesize={}&"
              "sortby=popular&licensetype=all&callback=_producttype_apps"
              "&_=1499255634459".format(i, page) for i in range(1,157) for page in xrange(1,1001)]
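To see which ordering each version produces, it may help to try the same pattern on tiny ranges. In this sketch (Python 3, with illustrative names `pairs` and `expected`), the first `for` in the comprehension behaves as the outer loop and the second as the inner loop:

```python
# First "for" is the outer loop, second "for" is the inner loop.
pairs = [(i, page) for i in range(1, 3) for page in range(1, 4)]

# The same thing written as ordinary nested loops.
expected = []
for i in range(1, 3):
    for page in range(1, 4):
        expected.append((i, page))

print(pairs)  # [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
```

So with `for i in ...` written first, the second value runs through its whole range for i=1 before i moves on to 2, which is the order asked for in the question.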

You're using two different loops in different ways, and your use of the outer for loop is wrong. Think about this:

for i in range(10):
    x = i

On each iteration you assign the variable x the value of i, but before you do anything else with it, the next iteration reassigns it to the next i, so after the loop only the last value is left. That is essentially what you're doing in your code.

alternatives include appending to a list:

x = []
for i in range(10):
    x.append(i)

or list comprehensions like you are using in your code

x = [i for i in range(10)]

obviously these are dumb examples because they are just reproducing the list from range(10)

Stael
  • In my case how shall i use it , sorry I am new to loops – emon Jul 05 '17 at 13:33
  • @emon replace all the code you posted with the first code block from my answer (so you have 2 for loops inside the list comprehension). That should work, giving you a list of URLs for each page (1 to 1000) and each i (1 to 156) – Stael Jul 05 '17 at 13:47
  • @Stael your code is not giving the desired results. The result I need is this: for page=1, pagesize should run from 1 to 1000; for page=2, pagesize should again run from 1 to 1000; and so on up to page=156 – emon Jul 05 '17 at 13:54
  • @emon if the order matters I have added how to change it. – Stael Jul 05 '17 at 14:06
-1

Try converting the list into a tuple first, then check how that affects the string formatting.