
For my project, I have to check the status of a website (on shared hosting).

I use the Python requests library:

import requests

def getStatusCode(url):
    try:
        # HEAD request with a 300 ms timeout (covers connect/read only)
        return requests.head(url, timeout=0.3).status_code
    except requests.exceptions.RequestException:
        return -1

This code works great under macOS 10.10 with Python 3.4 with a URL like http://www.google.com. If I unplug my ISP cable, I immediately get an exception.

Under Ubuntu Server 14.04 with Python 3.4, if I unplug my ISP cable, I never get a timeout error. The same problem occurs on Raspbian.

After some tests, if I replace the URL with an IP like http://216.58.212.100, Ubuntu Server raises an exception, but since I'm on shared web hosting I can't use a bare IP.

After some research, I found the difference: the timeout in the requests library does not cover the DNS lookup, which is performed by the OS, not by requests.
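One way to see where it hangs is to time the DNS lookup separately from the HTTP request. A minimal sketch (the hostname is just an example; socket.gethostbyname triggers the same OS-level lookup that requests relies on):

import socket
import time

import requests

host = 'www.google.com'

start = time.monotonic()
try:
    # The OS performs this lookup; requests' timeout does not apply to it.
    ip = socket.gethostbyname(host)
    print('DNS lookup took %.3fs -> %s' % (time.monotonic() - start, ip))
    # The 0.3 s timeout only covers connect/read, not the lookup above.
    print(requests.head('http://' + host, timeout=0.3).status_code)
except Exception as exc:
    print('Failed after %.3fs: %s' % (time.monotonic() - start, exc))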

So my question is: what is the most beautiful way to solve this? Do I need to add an extra timeout around the function call in Python, as in Timeout on a function call?

Thank you

  • Looks like you answered your own question (which is encouraged on SO). I don't know of a more "beautiful" solution. I would post your final code and links to any relevant research about timeouts differing by OS (which will be useful to other people who read this). – Charlie Apr 06 '15 at 13:39
  • Thank you Charlie. The problem with solutions like the link I posted or [here](http://stackoverflow.com/questions/2281850/timeout-function-if-it-takes-too-long-to-finish) is that they have a resolution of one second minimum, and I was hoping for millisecond resolution. Maybe it's possible to forge a request based on the IP and edit the header to avoid the problem of shared hosting. I will post my solution if there is no better one :) – Kantium Apr 06 '15 at 14:07
  • Now I'm curious - why do you need millisecond resolution for a URL request? If you need to kick off thousands of requests in parallel, then that will take some work. For example see [here](http://stackoverflow.com/questions/14245989/python-requests-non-blocking). – Charlie Apr 06 '15 at 14:29
  • In fact, I'm looking to monitor my ISP connection. I have micro-disconnections and I would like a resolution of half a second. Using a web request is not very efficient, but it's much less complicated than parsing a ping answer through popen or a subprocess. But as I'm learning Python I have surely missed something, and I will take any better suggestions. – Kantium Apr 06 '15 at 14:44

2 Answers


Based on Charlie's encouragement, I'm posting my two solutions here.

For the first one, I added the host to the request headers, so I can use the IP address as the URL and avoid the DNS lookup.

import requests

def getStatusCode(url):
    # Send the Host header ourselves, so the URL can be a bare IP
    # and no DNS lookup is needed.
    headers = {'host': 'www.example.com'}
    try:
        return requests.head(url, timeout=0.3, headers=headers).status_code
    except requests.exceptions.RequestException:
        return -1

print(getStatusCode('http://1.2.3.4'))
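Note that this trick only works over plain HTTP: with HTTPS, the certificate is validated against the host in the URL, so connecting to a bare IP with a forged Host header would fail the certificate check.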

The second solution is based on signals, but signal.alarm only accepts whole seconds, so it has a resolution of one second.

import signal

import requests

class timeout:
    # Context manager that raises TimeoutError after `seconds`,
    # covering the DNS lookup that requests' own timeout misses.
    def __init__(self, seconds=1, error_message='Timeout'):
        self.seconds = seconds
        self.error_message = error_message

    def handle_timeout(self, signum, frame):
        raise TimeoutError(self.error_message)

    def __enter__(self):
        signal.signal(signal.SIGALRM, self.handle_timeout)
        signal.alarm(self.seconds)

    def __exit__(self, type, value, traceback):
        signal.alarm(0)  # cancel the pending alarm

def getStatusCode(url):
    try:
        return requests.head(url, timeout=0.3).status_code
    except (requests.exceptions.RequestException, TimeoutError):
        # TimeoutError is raised by the alarm if the DNS lookup hangs
        return -1

with timeout(seconds=1):
    print(getStatusCode('http://www.example.com'))

(This solution is from Thomas Ahle at https://stackoverflow.com/a/22348885/3896729)
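As noted in the comments, signal.alarm only has whole-second resolution. On Unix, signal.setitimer accepts a float, so the same context manager can be given sub-second resolution; a minimal sketch of that variation:

import signal

class timeout:
    # Same idea as above, but signal.setitimer accepts a float,
    # giving sub-second resolution (Unix only).
    def __init__(self, seconds=0.5, error_message='Timeout'):
        self.seconds = seconds
        self.error_message = error_message

    def handle_timeout(self, signum, frame):
        raise TimeoutError(self.error_message)

    def __enter__(self):
        signal.signal(signal.SIGALRM, self.handle_timeout)
        signal.setitimer(signal.ITIMER_REAL, self.seconds)

    def __exit__(self, type, value, traceback):
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer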

– Kantium

Now that I have a better understanding of your problem, I think there's a better approach, which is to use your OS's ping utility; calling it from Python shouldn't be hard. You should also average thousands of requests and look at the mean, standard deviation, outliers, etc. The reason for this is that if one request takes, say, 500 ms and you want a resolution of 1 ms, you will need to spawn at least 500 requests to get anything close to the resolution you want.
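A minimal sketch of that approach (this assumes a Unix-style ping whose output contains something like 'time=12.3 ms'; the flags and output format vary by platform):

import re
import subprocess

def ping_ms(host, timeout_s=1):
    # Send one ICMP echo and parse the round-trip time in milliseconds.
    try:
        out = subprocess.check_output(
            ['ping', '-c', '1', host],
            universal_newlines=True, timeout=timeout_s)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        return None  # host unreachable or ping took too long
    match = re.search(r'time=([\d.]+) ms', out)
    return float(match.group(1)) if match else None

print(ping_ms('8.8.8.8'))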

The problem with using Python's urllib(2) is that it won't perform nearly as well as a system-level call, so you'll have difficulty spawning enough threads to get the timing resolution you want.

Finally, I would check your results against a commercial product to make sure they are similar. For example (no affiliation): http://www.thinkbroadband.com/ping.

– Charlie