
I am new to the rvest package in R and am using it to scrape the Marriott website.
I would like to build a list of the names and prices of the Marriott hotels in Japan from this URL: https://www.marriott.com/search/submitSearch.mi?destinationAddress.destination=Japan&destinationAddress.country=JP&searchType=InCity&filterApplied=false.

This is what I have done so far:

# Load packages
library(rvest)
library(dplyr)
library(stringr)  # provides str_subset()

# Read the hotel search page
url <- "https://www.marriott.com/hotel-search.mi"
html <- read_html(url)

# Pull out the region links and keep the one for Japan
links <- html %>%
  html_nodes(".js-region-pins") %>%
  html_attr("href") %>%
  str_subset("Japan")
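The link-filtering step can be checked offline. The following is only a minimal sketch: the HTML snippet is made up for illustration (only the class name .js-region-pins comes from my code above), and the real page may be structured differently.

```r
library(rvest)
library(stringr)

# Hypothetical markup standing in for the real page; the hrefs are
# shortened, made-up examples, not copied from the site.
page <- read_html('
  <html><body>
    <a class="js-region-pins" href="/search/submitSearch.mi?destinationAddress.destination=Japan">Japan</a>
    <a class="js-region-pins" href="/search/submitSearch.mi?destinationAddress.destination=France">France</a>
  </body></html>
')

# Collect every href carried by elements with the class
hrefs <- page %>%
  html_nodes(".js-region-pins") %>%
  html_attr("href")

# Keep only the links that mention Japan (literal match, no regex)
japan_links <- str_subset(hrefs, fixed("Japan"))
print(japan_links)
```

Matching with fixed("Japan") treats the pattern as a literal string, so no regular-expression quantifiers are involved.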

Here, links contains the URL of the page that lists the 47 Japanese hotels:

links
[1] "/search/submitSearch.mi?destinationAddress.destination=Japan&destinationAddress.country=JP&searchType=InCity&filterApplied=false"

Then,

url_japan = paste("https://www.marriott.com",links,sep="")

url_japan
[1] "https://www.marriott.com/search/submitSearch.mi?destinationAddress.destination=Japan&destinationAddress.country=JP&searchType=InCity&filterApplied=false"
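As an alternative to pasting strings together, the relative link can be resolved against the site root with url_absolute() from xml2 (the package rvest is built on). This is a small sketch; the names base_url and relative_link are my own.

```r
library(xml2)

# Site root and the relative link extracted from the page
base_url <- "https://www.marriott.com"
relative_link <- "/search/submitSearch.mi?destinationAddress.destination=Japan&destinationAddress.country=JP&searchType=InCity&filterApplied=false"

# Resolve the relative link against the base, as a browser would
url_japan <- url_absolute(relative_link, base_url)
print(url_japan)
```

For a root-relative link like this one the result should match the paste() version, but url_absolute() also copes with links that are already absolute or relative to a subdirectory.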

Here is the problem I ran into.

When I open url_japan, the page appears to be redirected to another URL (https://www.marriott.com/search/findHotels.mi).

In this case, how can I continue web scraping with the rvest package?
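To confirm where a request actually ends up, something like the following might help. It is only a sketch, assuming the httr package is installed; final_url() is a hypothetical helper name of my own, and it only reveals HTTP-level redirects, not content that a page loads afterwards with JavaScript.

```r
library(httr)

# Hypothetical helper: request a URL and report where the request ended up.
final_url <- function(url) {
  resp <- GET(url)  # httr follows HTTP redirects by default
  list(
    requested = url,
    final     = resp$url,                 # URL after any redirects
    status    = status_code(resp),        # status of the last response
    hops      = length(resp$all_headers)  # one entry per response in the chain
  )
}

# Usage (needs network access):
# str(final_url(url_japan))
```

If final differs from requested, the server redirected the request, and the scraping code would need to work with the page found at the final URL.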

imtaiky
  • Have you read the terms of service? "you agree that you will not use any robot, spider, other automatic device, or manual process to monitor, scrape, or copy our Sites or the Marriott Information contained therein, or any aspect of the Sites or the Marriott Information, without the prior express consent from an authorized Marriott representative" – Dave2e Dec 18 '19 at 18:52
  • Ha, weeeeooooh weeeeoooooh, the internet police gotcha. I suspect a lot of the data is loaded via JavaScript, so you'll need Selenium or some other way to access it. – cory Dec 18 '19 at 20:23
  • That's an important comment. I really appreciate it, @Dave2e. – imtaiky Dec 20 '19 at 11:34

0 Answers