I've been looking through the Python Requests documentation but I cannot see any functionality for what I am trying to achieve.

In my script I am setting allow_redirects=True.

If the page has been redirected to something else, I would like to know what the new URL is.

For example, if the start URL was: www.google.com/redirect

And the final URL is www.google.co.uk/redirected

How do I get that URL?

Martijn Pieters
Daniel Pilch
  • Check out [this answer](https://stackoverflow.com/a/4902578/4607733) for dealing with `urllib2` – horcrux Nov 23 '19 at 14:12

5 Answers

You are looking for the request history.

The response.history attribute is a list of responses that led to the final URL, which can be found in response.url.

import requests

response = requests.get(someurl)  # someurl: the URL you start from
if response.history:
    print("Request was redirected")
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")

Demo:

>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/3')
>>> response.history
(<Response [302]>, <Response [302]>, <Response [302]>)
>>> for resp in response.history:
...     print(resp.status_code, resp.url)
... 
302 http://httpbin.org/redirect/3
302 http://httpbin.org/redirect/2
302 http://httpbin.org/redirect/1
>>> print(response.status_code, response.url)
200 http://httpbin.org/get
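
If you need the chain as data rather than printed output, the same two attributes can be wrapped in a small helper. This is only a sketch: `redirect_chain` and `was_redirected` are names I made up, and `sample` is a stand-in object (not a real `requests.Response`) so the snippet runs without network access.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, final response last."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def was_redirected(response):
    """True if at least one redirect was followed to reach this response."""
    return bool(response.history)


# Stand-in object that mimics the attributes used above (assumption: a real
# requests.Response exposes the same .history, .status_code and .url).
sample = SimpleNamespace(
    status_code=200,
    url="http://httpbin.org/get",
    history=[
        SimpleNamespace(status_code=302, url="http://httpbin.org/redirect/2"),
        SimpleNamespace(status_code=302, url="http://httpbin.org/redirect/1"),
    ],
)
hops = redirect_chain(sample)

# With the real library it would look like this:
#   import requests
#   response = requests.get(someurl, timeout=10)
#   final_status, final_url = redirect_chain(response)[-1]
```

The last tuple is always the final destination, so `redirect_chain(response)[-1][1]` is the same value as `response.url`.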
tommy.carstensen
Martijn Pieters
  • httpbin.org is giving 404s for some reason, but httpbingo.org (same URL scheme) worked just fine for me. – Preston Badeer Oct 26 '20 at 19:33
  • @PrestonBadeer: This is a known issue: https://github.com/postmanlabs/httpbin/issues/617. It's not crucial that the demo works for the answer, luckily. – Martijn Pieters Oct 26 '20 at 21:29

This is answering a slightly different question, but since I got stuck on this myself, I hope it might be useful for someone else.

If you want to use allow_redirects=False and read the redirect target directly from the first 302 response, rather than following a chain of redirects, then r.url won't work: it still holds the URL you requested. The destination is in the "Location" header instead:

import requests

r = requests.get('http://github.com/', allow_redirects=False)
r.status_code  # 302
r.url  # http://github.com, not https.
r.headers['Location']  # https://github.com/ -- the redirect destination
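
One thing to watch out for: the Location header is allowed to be a relative reference, so it is safer to resolve it against the URL you requested before using it. A minimal sketch using only the standard library; `redirect_target` is a hypothetical helper name and the example.com URLs are made up for illustration.

```python
from urllib.parse import urljoin


def redirect_target(request_url, location):
    """Resolve a Location header value against the URL that was requested."""
    return urljoin(request_url, location)


# An absolute Location is returned unchanged.
absolute = redirect_target("http://github.com/", "https://github.com/")

# A relative Location is resolved against the request URL.
relative = redirect_target("http://example.com/old/page", "/new/page")

# With requests this would be used roughly as:
#   r = requests.get(url, allow_redirects=False)
#   if r.is_redirect:
#       target = redirect_target(r.url, r.headers["Location"])
```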
unkulunkulu
hwjp
  • Thank you - this boosted my URL referral script (which had thousands of urls) by several seconds. – ahinkle Jan 12 '17 at 17:33
  • Do you know what is up with `r.next`? I thought that would contain a `PreparedRequest` pointing to the redirect URL, but that does not seem to be the case... – Elias Strehle Aug 17 '18 at 13:55

The documentation has this blurb: https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history

import requests

r = requests.get('http://www.github.com')
r.url
# returns https://www.github.com instead of the http page you asked for
FelixEnescu
Back2Basics

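I think requests.head is safer to call than requests.get when handling a URL redirect, since it does not download the response body; check the GitHub issue here:

import requests

r = requests.head(url, allow_redirects=True)  # head() does not follow redirects unless asked to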
print(r.url)
Geng Jiawen

For Python 3.5, you can use the following code:

import urllib.request
res = urllib.request.urlopen(starturl)
finalurl = res.geturl()
print(finalurl)
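
To check this without depending on an external site, here is a sketch that starts a throwaway local server which redirects /start to /final and then reads the final URL with geturl(). The handler, the paths and the function names are all invented for the demo; the empty ProxyHandler is only there so a proxy configured in the environment does not intercept the localhost request.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Demo-only handler: /start answers 302 to /final, anything else answers 200."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url_via_urllib(start_url, opener=None, timeout=10):
    """Open start_url, following redirects, and return the URL that was reached."""
    opener = opener or urllib.request.build_opener()
    with opener.open(start_url, timeout=timeout) as res:
        return res.geturl()


def run_local_demo():
    """Serve on a free localhost port, fetch /start, return (base_url, final_url)."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        no_proxy = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        return base, final_url_via_urllib(base + "/start", opener=no_proxy)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_local_demo()` should give back a final URL that ends in /final, which is the redirect target rather than the /start URL that was requested.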
Shuai.Z