I'm trying to scrape the price and stock status of a product at some Walmart stores using the rvest package, with the Selector Gadget extension to find the CSS selectors. I was able to get the store address, but not the price or the stock status. Any suggestions would be appreciated!

Here is what I have done so far:

    library(dplyr)
    library(rvest)

    # Download and parse the store search page
    # (note: `url` holds the parsed document, not the URL string)
    url <- read_html("http://www.walmart.com/store/25/search?query=50636282")

    # Store address
    selector_name <- ".cs-secondary-copy"
    fnames <- html_nodes(x = url, css = selector_name) %>%
      html_text()
    fnames

    # Price
    price <- html_nodes(x = url, css = ".sup") %>%
      html_text() %>%
      as.numeric()
    price

    # Stock status
    stock <- html_nodes(x = url, css = ".stockStatus-unavailable") %>%
      html_text()
    stock

Output

    > fnames
    [1] "4820 S Clark St, Mexico, MO 65265"                   "Item availability is updated every day at midnight."
    > price
    numeric(0)
    > stock
    character(0)

Relevant data from Selector Gadget

    <span class="cs-secondary-copy">4820 S Clark St, Mexico, MO 65265</span>

    <div class="csTile">

      <div class="csTile-img">
        <a href="/ip/Virgin-Mobile-LG-Tribute-5-Prepaid-Smartphone/50636282">
          <img class="js-cs-image-link" id="43A657WDTF0J" src="https://i5.walmartimages.com/asr/51a2cea5-abe4-4a03-9711-b995cb7e215f_1.fd7b362cc57347042f4b518ff05de7ec.jpeg?odnHeight=180&amp;odnWidth=180&amp;odnBg=ffffff" alt="Virgin Mobile LG Tribute 5 Prepaid Smartphone" width="144" height="144">
        </a>
      </div>

      <div class="csTile-content">
        <div class="csTile-stockStatus">
          <strong class="stockStatus-unavailable">
            Out of Stock
          </strong>
        </div>
        <div class="price-display csTile-price">
          <p class="csTile-disclaimer">Store Price</p>
          <span class="sup">$</span>15<span class="currency-delimiter">.</span><span class="sup">00</span>
        </div>

        <p class="csTile-heading js-cstile-heading"><span>
          Virgin Mobile LG Tribute 5 Prepaid Smartphone
        </span><div class="js-truncate-disclosure-arrow truncate-disclosure-arrow"></div></p>
        <div class="csTile-rating">
          <span class="stars stars-small">
            <i class="star star-rated"></i><i class="star star-rated"></i><i class="star star-rated"></i><i class="star star-rated"></i><i class="star star-partial"></i><span class="visuallyhidden">4.5 stars</span>
            <span class="visuallyhidden">Average rating: 4.4375 stars</span>
            <span class="stars-reviews stars-reviews--grey">16
              <span class="visuallyhidden">ratings</span>
            </span>
          </span>
        </div>
        <a class="btn btn-inverse l-margin-top js-cs-product-link" id="43A657WDTF0J" href="/ip/Virgin-Mobile-LG-Tribute-5-Prepaid-Smartphone/50636282">
          Buy online
        </a>
      </div>

    </div>

Tung
  • What you are trying to extract is created dynamically with JavaScript, so it is not in the HTML that `read_html()` downloads; you won't be able to collect it until the JavaScript has run (something that can be done through headless browsing). An easy way to see this is that `html_nodes(x = url, css = ".sup")` returns an empty nodeset, because the element hasn't been created yet. Alternatively, load the URL in your browser with JavaScript disabled and check whether the element you want is actually in the HTML. More on the headless browsing solution here: http://stackoverflow.com/questions/26631511/scraping-javascript-website-in-r – Chrisss Feb 10 '17 at 03:16
  • Thanks @Chrisss! That explains it – Tung Feb 11 '17 at 01:43
  • Too bad about the terms of use. I was hoping to do something similar with price information. Sadly, no. – Mallick Hossain Dec 18 '17 at 01:31
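
The check suggested in the comments can be tried offline, without requesting anything from the site. What follows is only a minimal sketch under stated assumptions: `static_html` is a hypothetical stand-in for what a server might send before any JavaScript runs, `rendered_html` is a trimmed copy of the Selector Gadget fragment from the question, and `has_selector()` is an illustrative helper, not part of rvest. It also shows that even in the rendered markup `.sup` only matches the "$" and "00" pieces, so the price is easier to read from the whole `.csTile-price` block.

```r
library(rvest)

# ASSUMPTION: hypothetical stand-in for the HTML served before JavaScript runs.
# The address is present, but the product tile has not been created yet.
static_html <- '<html><body>
  <span class="cs-secondary-copy">4820 S Clark St, Mexico, MO 65265</span>
  <div id="search-results"></div>
</body></html>'

# Trimmed copy of the rendered fragment shown in the question.
rendered_html <- '<html><body>
  <div class="csTile-content">
    <div class="csTile-stockStatus">
      <strong class="stockStatus-unavailable">
        Out of Stock
      </strong>
    </div>
    <div class="price-display csTile-price">
      <p class="csTile-disclaimer">Store Price</p>
      <span class="sup">$</span>15<span class="currency-delimiter">.</span><span class="sup">00</span>
    </div>
  </div>
</body></html>'

# read_html() also accepts a literal HTML string, so no network access is needed.
static_page   <- read_html(static_html)
rendered_page <- read_html(rendered_html)

# Illustrative helper (not an rvest function): does the selector match anything?
has_selector <- function(page, css) {
  length(html_nodes(page, css)) > 0
}

has_selector(static_page, ".cs-secondary-copy")  # address is in the static HTML
has_selector(static_page, ".csTile-price")       # price tile is not
has_selector(rendered_page, ".csTile-price")     # present only after rendering

# In the rendered markup, take the text of the whole price block and pull out
# the first number of the form digits.two-digits.
price_text <- rendered_page %>%
  html_nodes(".csTile-price") %>%
  html_text()
price <- as.numeric(regmatches(price_text, regexpr("[0-9]+\\.[0-9]{2}", price_text)))

# Stock status, with surrounding whitespace trimmed.
stock <- rendered_page %>%
  html_nodes(".csTile-stockStatus strong") %>%
  html_text(trim = TRUE)

price
stock
```

If a selector matches in the rendered fragment but not in the downloaded page, that is the pattern you would expect from an element created by JavaScript.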

1 Answer

[Update May 2018:] Walmart has released an API that may satisfy the needs of this question: https://medium.com/@kyleake/how-to-extract-data-from-walmart-open-api-efd01a2f91e0. It appears that price may be returned from some endpoints.

That said, scraping via rvest may still be a violation of the terms of use, given that the API requires you to register.

https://help.walmart.com/app/answers/detail/a_id/8#2

You are prohibited from:

  • Violating or attempting to violate the security of the Walmart Sites;
  • Using any device, software, or routine to interfere or attempt to interfere with the proper working of the Walmart Sites; or
  • Using or attempting to use any engine, software, tool, agent or other device or mechanism (except the search mechanisms provided by Walmart or other third party web browsers) to navigate or search the Walmart Sites.

(emphasis mine).

Jonathan Carroll