
I'm using the requests library to fetch a simple URL (I've put a dummy URL here; the real URL is used in the actual code):

import requests

response = requests.get(
    "http://example.com/foo/bar/",
    headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"},
)

Locally it works fine, but when I put the same code on my server, this request takes forever to finish. I've enabled logging output for all of these loggers:

urllib3.util.retry
urllib3.util
urllib3
urllib3.connection
urllib3.response
urllib3.connectionpool
urllib3.poolmanager
requests
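
For reference, the logging setup is nothing special, roughly a basicConfig call like this:

import logging

# DEBUG on the root logger; the urllib3/requests loggers listed above
# all propagate their records up to it.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)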

This is the only output produced by them:

2018-05-31 19:55:56,894 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): example.com
2018-05-31 19:58:06,676 - urllib3.connectionpool - DEBUG - http://example.com:80 "GET /foo/bar/ HTTP/1.1" 200 None

The funny thing is that it always takes exactly 2 minutes and 10 seconds for the request to finish (if you disregard milliseconds). Locally it's instant.

Any clues where I should look next?

  • How big is the file being pulled? What sort of downstream bandwidth does your server guarantee, and how much is already used to service existing connections? Servers are frequently underprovisioned on downstream bandwidth, especially servers marketed to people who largely serve static content. If you run a plain `wget` from the server, is it equally slow? – ShadowRanger May 31 '18 at 20:16
  • A good idea! Yes, wget is equally slow, strangely enough. It's a Linode server, totally empty and unused (except for the OS itself). The website is a fairly normal, old-fashioned one; downloaded, it's 40 KB. – morgoth84 May 31 '18 at 20:20
  • Given `wget` has the same issues, it's a safe bet this has nothing to do with Python (or programming at all), and everything to do with the server's network setup/config. Maybe ask over at Server Fault? – ShadowRanger May 31 '18 at 20:26
  • Yeah, I think I'll do that. Thanks! – morgoth84 May 31 '18 at 20:39

1 Answer


This sounds like an issue with the underlying IPv6 connection. The fact that the request always takes exactly 2 minutes and 10 seconds is a giveaway: such a fixed, repeatable delay suggests the IPv6 connection attempt times out and the client then falls back to IPv4.
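
A quick check you can run from Python itself (using the placeholder hostname from the question) is whether the name resolves to any IPv6 addresses at all; on most dual-stack systems the client tries those first and only falls back to IPv4 after the attempt fails:

import socket

# Print every address family the hostname resolves to. AF_INET6 entries
# mean an IPv6 connection will typically be attempted first.
for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 80, proto=socket.IPPROTO_TCP):
    print(family.name, sockaddr[0])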

Verify the diagnosis with, e.g., wget or curl:

wget --inet6-only https://www.example.com -O - > /dev/null
# or
curl --ipv6 -v https://www.example.com

In both cases, we force the tool to connect via IPv6 to isolate the issue. If this times out, try again forcing IPv4:

wget --inet4-only https://www.example.com -O - > /dev/null
# or
curl --ipv4 -v https://www.example.com

If this works fine, you have found your problem! But how to solve it, you ask?

  1. A brute-force solution is to disable IPv6 completely.
  2. You may also disable IPv6 for the current session only (on Linux, e.g., sysctl -w net.ipv6.conf.all.disable_ipv6=1, which resets at reboot).
  3. You may just want to force requests to use IPv4; a sketch of that approach follows this list. (In the linked answer, you have to adapt the code to always return socket.AF_INET for IPv4.)
  4. If you want to fix this problem for SSH, here is how to force IPv4 for SSH. (In short, add AddressFamily inet to your SSH config.)
  5. You may also want to check if the problem lies with your DNS or TCP.
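
For option 3, since the linked answer is not reproduced here, a minimal sketch of the idea (assuming a requests version that uses the standalone urllib3 package): urllib3 calls allowed_gai_family() to decide which address families name resolution may return, so patching it to always return socket.AF_INET forces IPv4.

import socket

import requests
from urllib3.util import connection

def allowed_gai_family():
    # Restrict name resolution to IPv4 results only.
    return socket.AF_INET

# urllib3 looks this function up at connection time, so the patch
# takes effect for every subsequent request in the process.
connection.allowed_gai_family = allowed_gai_family

response = requests.get("http://example.com/foo/bar/")
print(response.status_code)

Note that this is a process-wide monkey-patch: every requests call made afterwards will connect over IPv4.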

Didn't solve your issue?

I have collected some other possible solutions here.
