I have a Python script that fetches a webpage and mirrors it. It works fine for one specific page, but I can't get it to work for more than one. I assumed I could put multiple URLs into a list and then feed that to the function, but I get this error:

Traceback (most recent call last):
  File "autowget.py", line 46, in <module>
  File "autowget.py", line 43, in getUrl
    response = urllib.request.urlopen(url)
  File "/usr/lib/python3.2/urllib/request.py", line 139, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.2/urllib/request.py", line 361, in open
    req.timeout = timeout
AttributeError: 'tuple' object has no attribute 'timeout'

Here's the offending code:

url = ['https://www.example.org/', 'https://www.foo.com/', 'http://bar.com']
def getUrl(*url):
    response = urllib.request.urlopen(url)
    with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
        shutil.copyfileobj(response, out_file)

I've exhausted Google trying to find how to open a list with urlopen(). I found one way that sort of works. It takes a .txt document and goes through it line-by-line, feeding each line as a URL, but I'm writing this using Python 3 and for whatever reason twillcommandloop won't import. Plus, that method is unwieldy and requires (supposedly) unnecessary work.

Anyway, any help would be greatly appreciated.

Eric Lagergren

3 Answers


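There are two errors in your code:

  • You define getUrl with a variable argument list (*url), so the list you pass in arrives wrapped in a tuple (the tuple in your error);
  • You then hand that tuple to urlopen() as if it were a single URL, when each URL in the list has to be opened separately.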
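
To see where the tuple in the traceback comes from, here is a small sketch. The function names `packed` and `plain` are made up for illustration and are not part of the original script; no network access is involved.

```python
def packed(*url):
    # *url collects every positional argument into a single tuple
    return url

def plain(url):
    # a regular parameter receives the object unchanged
    return url

urls = ['https://www.example.org/', 'https://www.foo.com/']

# Passing the list as one argument: a 1-element tuple holding the list
print(type(packed(urls)).__name__)   # tuple
# A plain parameter keeps the list as a list
print(type(plain(urls)).__name__)    # list
# Unpacking at the call site: a tuple of the individual URL strings
print(packed(*urls))
```

Either way, urlopen() still needs one URL per call, so the loop is required whichever signature you choose.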

You can try this code:

import shutil
import urllib.request  # Python 3 replacement for urllib2

urls = ['https://www.example.org/', 'https://www.foo.com/', 'http://bar.com']

def getUrl(urls):
    for url in urls:
        # File name derived from the URL: drop the scheme, then replace '.' and '/'
        file_name = url.split('://', 1)[-1].replace('.', '_').replace('/', '_')
        # urlopen() takes one URL at a time, so it is called inside the loop
        with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
            shutil.copyfileobj(response, out_file)

getUrl(urls)
Salvatore Avanzo
  • Thanks - earlier in my code I set the file name to `'/path/to/directory'` plus 'domain' where the 'domain' is the string between `http://www` and `.com` in the URL. The script both FTPs and pushes (via Git to GitHub pages) so I have to have a set file path or else the Git portion won't work. Anyway, thanks again! Appreciate it! – Eric Lagergren Apr 24 '14 at 21:36

urlopen() does not accept a tuple. From the documentation:

urllib.request.urlopen(url[, data][, timeout])
Open the URL url, which can be either a string or a Request object.
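
As a rough illustration of the two accepted argument types, the sketch below builds a Request object without sending anything. The User-Agent value is a made-up placeholder, not something from the original script.

```python
import urllib.request

# Hypothetical header value, used only for illustration
req = urllib.request.Request(
    'https://www.example.org/',
    headers={'User-Agent': 'autowget-sketch/0.1'},
)

print(req.full_url)                  # https://www.example.org/
print(req.get_header('User-agent'))  # header keys are stored capitalized

# Both of these are valid calls (left commented out because they need network access):
# urllib.request.urlopen('https://www.example.org/')
# urllib.request.urlopen(req)
```

A list or tuple of URLs is neither of these, which is why the call fails once urlopen() starts treating the argument like a Request.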

The way you call the function is also incorrect. It should be:


Inside the function, use a loop such as "for u in url" to go through all the URLs.


You should just iterate over your URLs using a for loop:

import shutil
import urllib.request

urls = ['https://www.example.org/', 'https://www.foo.com/']

def fetch_urls(urls):
    for i, url in enumerate(urls):
        # Number the output files so each page gets its own file
        file_name = "page-%s.html" % i
        with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
            shutil.copyfileobj(response, out_file)

fetch_urls(urls)


I assume you want the content saved to separate files, so I used enumerate here to create a unique file name, but you can obviously use anything from hash() or the uuid module to creating slugs.
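
A minimal sketch of those naming alternatives follows. The helper names `slug_name`, `digest_name` and `random_name` are hypothetical, and hashlib is swapped in for the built-in hash(), because string hashes from hash() are randomized per process on Python 3.3 and later and so would not give the same file name on every run.

```python
import hashlib
import re
import uuid
from urllib.parse import urlparse

def slug_name(url):
    """Readable name built from the host and path of the URL."""
    parts = urlparse(url)
    raw = parts.netloc + parts.path
    # Collapse every run of non-alphanumeric characters into one '-'
    slug = re.sub(r'[^A-Za-z0-9]+', '-', raw).strip('-')
    return slug + '.html'

def digest_name(url):
    """Stable name: the same URL always maps to the same file."""
    return hashlib.sha256(url.encode('utf-8')).hexdigest()[:16] + '.html'

def random_name():
    """Random name: different on every call, unrelated to the URL."""
    return uuid.uuid4().hex + '.html'

print(slug_name('https://www.example.org/'))  # www-example-org.html
print(digest_name('https://www.example.org/'))
print(random_name())
```

Slugs are the easiest to read but can collide (for example, 'a.b/c' and 'a/b.c' produce the same slug), while the digest and uuid variants trade readability for uniqueness.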

Lukas Graf