I am scraping data from this website and, for some reason, I can't get the seller's name, even though I use the exact node returned by SelectorGadget. I have managed to get all the other data with rvest.

I did manage to scrape the seller's name with RSelenium, but that takes too much time. Here's the link to the page I'm scraping:

https://www.kijiji.ca/v-fitness-personal-trainer/bedford/swimming-lessons/1421292946

Here's the code I've used:

library(rvest)

SellerName <-
  read_html("https://kijiji.ca/v-fitness-personal-trainer/bedford/swimming-lessons/1421292946") %>%
  html_nodes(".link-4200870613") %>%
  html_text()
Saad Rehman
  • The seller's name is generated dynamically by a script, so it won't be in the raw HTML that rvest pulls. You could do what is suggested here: https://stackoverflow.com/questions/29861117/scraping-a-dynamic-ecommerce-page-with-infinite-scroll – Sada93 Sep 01 '19 at 19:06

1 Answer

You can extract the seller name from the response with a regex, because it is contained in a script tag (presumably the page loads it from there when a browser is able to run JavaScript, which rvest does not do).

library(rvest)
library(magrittr)
library(stringr)

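# html_text() on the whole document also returns the contents of script tags
p <- read_html('https://www.kijiji.ca/v-fitness-personal-trainer/bedford/swimming-lessons/1421292946') %>%
  html_text()
# Take the captured group (column 2) of the first "sellerName" match
seller_name <- str_match_all(p, '"sellerName":"(.*?)"')[[1]][, 2][1]
print(seller_name)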

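Regex:

"sellerName":"(.*?)" matches the literal key "sellerName":" and then lazily captures everything up to the next double quote, so the first capture group holds the seller name.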
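
To see what the pattern does without hitting the site, here is a minimal offline sketch. The sample_text string is a made-up stand-in for the page text (the real script content on Kijiji will look different), and "Jane Doe" is a hypothetical value, not the actual seller.

```r
library(stringr)

# Hypothetical stand-in for the text returned by html_text();
# the real page's script content is an assumption here.
sample_text <- 'window.__data = {"adId":"123","sellerName":"Jane Doe","other":"x"};'

# Same pattern as above: literal key, then a lazy capture up to the next quote
pattern <- '"sellerName":"(.*?)"'

# str_match() returns a matrix: column 1 is the full match, column 2 the capture group
seller_name <- str_match(sample_text, pattern)[, 2]
print(seller_name)
```

If the key is missing, str_match() returns NA for the capture group, so it is worth checking is.na(seller_name) before using the result.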

QHarr