Redirection in browser is different from python code

Question

I want to open a link from a website using Python. Here is the flow:

I open the main URL (e.g. www.url1.com).
I scrape the page and find the button, which has a redirection link (www.url2.com).
When I use this link in a browser, it redirects to www.url3.com and then immediately goes on to the required link (www.url4.com).
When I try the same flow using Python requests, it only gets as far as www.url3.com.
I tried using the allow_redirects argument without any success.

Here is my code:

import requests

headers = {
    'User-Agent': '',
    'authority': '',
    'scheme': '',
    'accept': '',
    'x-requested-with': '',
    'cookie': '',
    'referer': ''
    }


def download(req):      
    resp = requests.get(req, headers=headers, allow_redirects=True)
    print(resp.text)

I also tried to print the history using this answer, but it keeps redirecting me only to url3.

If url3 redirects the browser using a `refresh meta tag`, `requests` will not follow it even with `allow_redirects` enabled, because it does not parse the HTML. [How to follow meta refresh in python](https://stackoverflow.com/questions/2318446/how-to-follow-meta-refreshes-in-python) – wuerfelfreak, Aug 11 '19 at 08:44
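A minimal sketch of what the comment above describes, assuming url3 really does use a `<meta http-equiv="refresh">` tag (this has not been confirmed for the site in question). The names `MetaRefreshParser`, `find_meta_refresh` and `get_following_meta_refresh` are hypothetical helpers, not part of `requests`; `session` is expected to be something like `requests.Session()`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remembers the target URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = {name: (value or "") for name, value in attrs}
        if attrs.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=https://example.com/next".
        _, _, rest = attrs.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and value:
            self.target = value.strip().strip("'\"")


def find_meta_refresh(html, base_url):
    """Return the absolute meta-refresh target found in html, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    return urljoin(base_url, parser.target)


def get_following_meta_refresh(session, url, max_hops=5, **kwargs):
    """GET url, then keep following meta refresh tags for up to max_hops hops."""
    resp = session.get(url, allow_redirects=True, **kwargs)
    for _ in range(max_hops):
        target = find_meta_refresh(resp.text, resp.url)
        if target is None:
            break
        resp = session.get(target, allow_redirects=True, **kwargs)
    return resp


# Usage with requests (needs network access):
#   import requests
#   with requests.Session() as session:
#       resp = get_following_meta_refresh(session, "https://www.url2.com")
#       print(resp.url)
```

If the page redirects with JavaScript rather than a meta tag, this will not find anything and `resp` will still be the url3 response.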

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

It's quite difficult to give a full answer without the actual URLs you are using. That being said, I think the problem is that you are not keeping track of cookies between requests. I would recommend using requests.session() when sending requests, as it stores the cookies set by each response and sends them back on later requests for you.

All in all, I would recommend trying the following code:

import requests

session = requests.session()

headers = {
    'User-Agent': '',
    'authority': '',
    'scheme': '',
    'accept': '',
    'x-requested-with': '',
    'cookie': '',
    'referer': ''
    }


def download(req):
    # The module-level session keeps cookies between requests.
    resp = session.get(req, headers=headers, allow_redirects=True)
    print(resp.text)

(PS: if you are scraping a website, I would highly recommend setting a real User-Agent in the headers instead of leaving it blank.)
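To check how far the session actually gets, it can help to print every hop of the redirect chain. This is only a sketch: `redirect_chain` and `fetch_with_session` are hypothetical helper names, `session` is assumed to be a `requests` session, and the User-Agent string is whatever your own browser sends.

```python
def redirect_chain(resp):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in resp.history]
    hops.append((resp.status_code, resp.url))
    return hops


def fetch_with_session(session, url, user_agent):
    """GET url through the session with a User-Agent set, printing each hop."""
    # Headers set on the session are sent with every request it makes.
    session.headers.update({"User-Agent": user_agent})
    resp = session.get(url, allow_redirects=True)
    for status, hop_url in redirect_chain(resp):
        print(status, hop_url)
    return resp


# Usage with requests (needs network access):
#   import requests
#   with requests.Session() as session:
#       fetch_with_session(session, "https://www.url2.com", "<your browser's User-Agent>")
```

If the last line printed is url3 with status 200, the server did not send an HTTP redirect from there, so the next hop is happening inside the page itself.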

Hope this helps

answered Aug 11 '19 at 07:19

Nazim Kerimbekov


The referer might also change the behavior of url3 – wuerfelfreak Aug 11 '19 at 08:38
@wuerfelfreak indeed! It’s difficult to say without the actual urls/website. – Nazim Kerimbekov Aug 11 '19 at 09:03
@Fozoro thank you, I tried sessions but no luck yet. All the headers here are filled in. If you want the actual URLs, url3 (which opens on button click) is: https://www.egy.best/api?call=NaaapmDdapaqapVcaVUazFTqazzVVVazpVYpqpapjYapaqapVUwqjmVqVVKcVapqpapxgIapaqapmVDvEumzxghwBapqpapwDYpaqapeDYDwDukhqzapqpapmDwapaqapVccqVxpVupqpapKeLUmDxKupaqVcaVUVaqTaVzxaVVaU&auth=7445579426cd5bd4c63e7ae0b8a464f1 In the browser it redirects to another one, while with Python requests the response is the HTML of url3 and there are no further redirections. – Aya Aug 16 '19 at 09:38
