
I have a list of links to PDF files, and I call wget.download() for each link in the list. Only some of them download successfully; for the rest I get:

File "/home/.local/lib/python3.6/site-packages/wget.py", line 526, in download
    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
  File "/usr/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

I also tried to use r = requests.get(link) but the issue is still the same and additionally, I get this error:

  File "/usr/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Example of link that does not get downloaded:

https://cnds.jacobs-university.de/courses/sads-2020/p6.pdf

If I open the link in my browser, the file downloads fine. This method was also working until a few months ago; I don't know why it stopped.
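For context, the loop is roughly this (simplified, not my exact code — `fetch` stands in for `wget.download` or any downloader, and the `try/except` is just so one failing link doesn't abort the whole run):

```python
import urllib.error

def download_all(links, fetch):
    """Call fetch(link) for each link; collect the ones that 404 instead of crashing."""
    failed = []
    for link in links:
        try:
            fetch(link)
        except urllib.error.HTTPError as err:
            # wget.download surfaces server errors as urllib.error.HTTPError
            print(f"skipped {link}: {err.code} {err.reason}")
            failed.append(link)
    return failed
```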


1 Answer


I can't comment yet, but have you tried downloading the files manually in a browser? If that works, the issue must be that wget can't reach the target for some reason.
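One thing worth ruling out when a URL works in a browser but 404s from a script: some servers reject requests that lack a browser-like User-Agent header. This is only a guess about your server, but a minimal sketch with plain urllib (the `"Mozilla/5.0"` value is an arbitrary example) would be:

```python
import urllib.request

def make_request(url, user_agent="Mozilla/5.0"):
    # Attach a browser-like User-Agent; urllib's default identifies
    # itself as Python-urllib, which some servers refuse.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def download_pdf(url, filename):
    # Stream the response body straight into a local file.
    with urllib.request.urlopen(make_request(url)) as resp, \
         open(filename, "wb") as out:
        out.write(resp.read())
```

If the download succeeds with this but fails with wget.download(), the User-Agent (or some other header) is the culprit rather than the URL itself.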

  • It's a good place to start. I have also occasionally seen weird behavior around SSL/TLS certificates, as well as bizarre behavior through proxies, that masquerade as other HTTP errors. – Frank Jun 20 '20 at 09:42
  • Yes, if I open the link in my browser, the file downloads. Also, this was working until a few months ago; I don't know why it stopped. @Frank – Jbd Jun 20 '20 at 09:48
  • Is there some reason you chose `wget`? I'd tend to use `requests` directly. – Frank Jun 20 '20 at 09:51
  • That doesn't work either. The issue is still the same, only some links get downloaded @Frank – Jbd Jun 20 '20 at 11:40