I am new to scraping websites with Python 3. Currently, requests to one site (www.tink.de) are really slow: every request takes around 40 seconds. When I run the same script against other sites, the response comes back immediately.
I have already read this, this, this and many other posts about this issue, but I could not solve it. I also tried running the script on a different machine and OS, and even used a different internet connection.
My current workaround is to use Selenium (which is indeed faster), but I would like to solve the problem with the requests module.
Can anyone help?
Here is my example code:
import requests
from datetime import datetime
url = 'https://www.tink.de'
headers = {
    'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/45.0.2454.101 Safari/537.36')
}
print('Process started! ' + str(datetime.now()))
r = requests.get(url, headers=headers) # I also tried with stream=True
print(r.content)
print('Process finished! ' + str(datetime.now()))
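To narrow down where the 40 seconds go, the request could be timed in two phases: until the response headers arrive, and while the body downloads. The following is only a minimal sketch under stated assumptions: `timed_get` and its dictionary keys are hypothetical names, not part of `requests`, and the helper assumes it is given a `requests.Session` (or any object with a compatible `get` method).

```python
import time


def timed_get(session, url, **kwargs):
    """Time a GET in two phases: until headers arrive, then the body download.

    `session` is assumed to be a requests.Session (hypothetical helper,
    not part of the requests library).
    """
    start = time.perf_counter()
    # With stream=True, get() returns once the headers are parsed;
    # the body is not downloaded yet.
    response = session.get(url, stream=True, **kwargs)
    headers_at = time.perf_counter()

    body = response.content  # accessing .content forces the body download
    done_at = time.perf_counter()

    return {
        'until_headers': headers_at - start,
        # requests' own measurement for this response (a timedelta)
        'requests_elapsed': response.elapsed.total_seconds(),
        'body_download': done_at - headers_at,
        'bytes': len(body),
    }
```

With `session = requests.Session()`, calling `timed_get(session, url, headers=headers)` and printing the result would show whether the delay occurs before the headers arrive or during the body download.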
Update: here are my response headers:
{'Date': 'Sun, 10 Feb 2019 22:27:15 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Content-Length': '69400', 'Connection': 'keep-alive', 'Server': 'nginx/1.10.3 (Ubuntu)', 'X-Frame-Options': 'SAMEORIGIN', 'X-Aoestatic-Action': 'cms_index_index', 'X-Tags': 'PAGE-14-1', 'X-Aoestatic': 'cache', 'X-Aoestatic-Lifetime': '86400', 'X-Aoestatic-Debug': 'true', 'Expires': 'Mon, 30 Apr 2008 10:00:00 GMT', 'X-Url': '/', 'Cache-Control': 'public', 'X-Aoestatic-Fetch': 'Removed cookie in vcl_backend_response', 'Content-Encoding': 'gzip', 'Vary': 'Accept-Encoding', 'X-Varnish': '134119436 128286748', 'Age': '33396', 'Via': '1.1 varnish-v4', 'X-Cache': 'HIT (2292)', 'Client-ip': '10.XX.XX.XX', 'Accept-Ranges': 'bytes'}
Thanks a lot for your help!