Web scraping using R when the URL doesn't change after selecting a results page

Asked Nov 12 '15 at 04:41

Active Oct 07 '17 at 05:58

Viewed 921 times

I want to scrape all the product prices from this page:

http://www.la14.com/Tiendala14/paginacion/numericpaging.aspx?Catalog=base_catalog&Category=Mercado%2fAseo+Hogar

At the bottom right of this page I can see that the results span many pages. If I select page six, for example, the URL doesn't change.

I'm running the following code in R to get all the prices:

library(RCurl)
library(XML)
library(xml2)
doc <- read_html('http://www.la14.com/Tiendala14/paginacion/numericpaging.aspx?Catalog=base_catalog&Category=Mercado%2fAseo+Hogar')

prices <- xml_find_all(doc, xpath="//span[@id=]")

But I only get the product prices from page 1. How can I get the product prices from the rest of the pages?

asked Nov 12 '15 at 04:41

Jeisson

You can use `sprintf` to change the number for each page, and then loop to get the output. – akrun Nov 12 '15 at 04:42

Your issue is basically the same as this one: http://stackoverflow.com/q/29861117/2372064. For pages that update based on AJAX requests, you need to create an environment that can run JavaScript, such as RSelenium. – MrFlick Nov 12 '15 at 04:45
Thank you, I was able to solve it with your recommendation @MrFlick – Jeisson Nov 17 '15 at 02:02
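
For readers landing here, the RSelenium suggestion above might look roughly like the sketch below. It is not the asker's actual solution, which was never posted. It assumes the pager links are plain anchors whose text is the page number, that a fixed pause is long enough for the AJAX update to finish, and that the prices can be matched by an XPath; `//span[@class='price']` is a hypothetical placeholder, not the real selector for this site. The helper names `extract_prices`, `scrape_pages` and `run_example` are made up for illustration.

```r
library(xml2)

# Parse one rendered page and return the text of every node matching `xpath`.
extract_prices <- function(page_source, xpath) {
  doc <- read_html(page_source)
  xml_text(xml_find_all(doc, xpath), trim = TRUE)
}

# Load `url`, read page 1, then click each pager link in `page_numbers`
# and collect the prices from the re-rendered page after each click.
# Assumption: each pager link is an <a> whose text is the page number.
scrape_pages <- function(remDr, url, page_numbers, xpath, wait = 2) {
  remDr$navigate(url)
  prices <- extract_prices(remDr$getPageSource()[[1]], xpath)
  for (p in page_numbers) {
    link <- remDr$findElement(using = "link text", value = as.character(p))
    link$clickElement()
    Sys.sleep(wait)  # crude fixed pause while the page updates
    prices <- c(prices, extract_prices(remDr$getPageSource()[[1]], xpath))
  }
  prices
}

# Not called automatically: needs RSelenium plus a local Firefox and driver.
run_example <- function() {
  rD <- RSelenium::rsDriver(browser = "firefox", verbose = FALSE)
  remDr <- rD$client
  on.exit({ remDr$close(); rD$server$stop() })
  url <- "http://www.la14.com/Tiendala14/paginacion/numericpaging.aspx?Catalog=base_catalog&Category=Mercado%2fAseo+Hogar"
  # Hypothetical XPath; replace it with the selector that matches the prices.
  scrape_pages(remDr, url, 2:6, "//span[@class='price']")
}
```

Keeping the parsing step separate from the browser driving means `extract_prices` can be checked against a saved copy of the page without starting a browser. A fixed `Sys.sleep` is the simplest possible wait; polling until the expected element appears would be more robust.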

0 Answers