I'm using the urllib2.urlopen method to open a URL and fetch the markup of a webpage. Some of these sites redirect me using the 301/302 redirects. I would like to know the final URL that I've been redirected to. How can I get this?
Call the .geturl() method of the file object returned. Per the urllib2 docs:
geturl()— return the URL of the resource retrieved, commonly used to determine if a redirect was followed
Example:
import urllib2
response = urllib2.urlopen('http://tinyurl.com/5b2su2')
response.geturl() # 'http://stackoverflow.com/'
The return value of urllib2.urlopen has a geturl() method, which should return the actual (i.e. last redirect) URL.
e.g.:
urllib2.urlopen('ORIGINAL LINK').geturl()
urllib2.urlopen(urllib2.Request('ORIGINAL LINK')).geturl()
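On Python 3, urllib2 was split into urllib.request and urllib.error, and the response object still has geturl(). Here is a minimal sketch of the same idea, assuming a throwaway local server so it runs without network access. RedirectHandler, final_url and demo are hypothetical names used only for illustration, as are the /old and /new paths.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Illustrative handler: /old answers 302 to /new, anything else answers 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url(url):
    """Open url, let urllib follow any redirects, return the URL actually retrieved."""
    with urllib.request.urlopen(url) as response:
        return response.geturl()


def demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, final_url(base + "/old")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, landed = demo()
    print(landed)
```

Against a real site, only final_url is needed; the local server is there just to produce a redirect to follow.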
You can use httplib2 with follow_all_redirects = True set on the Http object and read content-location from the response headers. See my answer to 'httplib is not getting all the redirect codes' for an example.