I'm trying to scrape a website using Python's urllib.request and bs4. However, this particular website shows a truncated/dummy URL for its results page: the URL doesn't identify my search, so I can't simply pass it to urlopen and work with the HTML.

import urllib.request
import bs4 as bs
# Plain GET of the results URL; this returns the empty search page, not my results
mylink = urllib.request.urlopen("http://www.vacationstogo.com/ticker.cfm").read()
soup = bs.BeautifulSoup(mylink, "html.parser")

NOTE:

http://www.vacationstogo.com/custom.cfm is the page where I fill in some inputs. When I click the search button, the results appear at http://www.vacationstogo.com/ticker.cfm. However, opening the ticker.cfm URL directly just redirects me to an empty search page, so it is not a usable URL for the page with my search results.

Thanks.

Justin Jung
  • I am not a lawyer and I'm not claiming this is illegal, but according to VTG's terms and conditions ("No part of the contents may be reproduced, modified, removed, sold, transferred, or otherwise distributed without the express written permission of VTG and/or the applicable third party providers."), they might frown upon you scraping their website. – Cᴏʀʏ Jul 27 '17 at 15:03
  • https://stackoverflow.com/questions/3477333/what-is-the-difference-between-post-and-get – maestromusica Jul 27 '17 at 15:15
  • Got it. I was using this website as a project to practice scraping, and I'll look into another website to use. However, the question still stands. How do I get the url of a site like this? I've never encountered something like this before and am curious as to what I should do. – Justin Jung Jul 28 '17 at 01:02
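
Following up on the POST vs. GET link in the comments: a results URL that stays the same for every search usually means the form sends its inputs in the body of a POST request rather than in the query string. Below is a minimal sketch of building such a request with urllib, assuming the form posts to ticker.cfm (not confirmed here). The helper build_search_request and the field names "region" and "length" are hypothetical placeholders; the real names would come from the form's input elements on custom.cfm, and the site may also require session cookies.

```python
import urllib.parse
import urllib.request


def build_search_request(url, form_fields):
    """Build (but do not send) a POST request carrying URL-encoded form fields."""
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical field names and values; read the real ones from the form's HTML.
request = build_search_request(
    "http://www.vacationstogo.com/ticker.cfm",
    {"region": "caribbean", "length": "7"},
)
print(request.get_method(), request.data)

# To actually fetch and parse the response (check the site's terms first):
# import bs4 as bs
# html = urllib.request.urlopen(request).read()
# soup = bs.BeautifulSoup(html, "html.parser")
```

The browser's developer tools (Network tab) show the exact fields and target URL sent when the search button is clicked, which is the most reliable way to fill in the placeholders.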
