
I am collecting wax images for my classification algorithm.

First I collected the links to all the image blocks. Each block contains one or two images, and I want to retrieve their links.

For example, on this block link: https://www.zoobashop.com/woodin-fusion-de-woodin-wo29gha-29017-6-yards.html


library(rvest)
html <- read_html("https://www.zoobashop.com/woodin-fusion-de-woodin-wo29gha-29017-6-yards.html")

get_block_img <- function(html){
  html %>%
    html_nodes('.fotorama__thumb img#fotorama__img') %>%
    html_attr("src")
}

get_block_img(html)

The result is character(0).

Can someone help me, please?

QHarr
  • I cannot find the corresponding CSS node to retrieve the image – Armel Soubeiga Dec 09 '19 at 12:23
  • Sorry, I haven't looked at the website so my advice may be off. If you can't find an element, you may need to load the page through a headless browser and trigger all JavaScript in the background. – Roman Luštrik Dec 09 '19 at 12:24

1 Answer


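The image is added dynamically from a script tag when JavaScript runs in the browser, so the img element does not exist in the HTML that read_html() receives. The URL is still present in the script's text, so you can extract it from the response text with a regex instead.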

library(rvest)
library(stringr)

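url <- 'https://www.zoobashop.com/woodin-fusion-de-woodin-wo29gha-29017-6-yards.html'
page_text <- read_html(url) %>% html_text()

# [1, 2] selects the capture group of the match: the first "img" value after "data"
link <- str_match(page_text, '"data": .*?"img":"(.*?)"')[1, 2]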
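Since a block can hold one or two images, here is a rough sketch of how the same regex idea could return every "img" value rather than only the first. The sample text and the `extract_img_links` helper are hypothetical, made up for illustration. I am assuming the script stores the gallery as JSON with escaped slashes ("\/"), which may not be how the real page looks. Note that this looser pattern matches any "img" key in the text, not only those after "data".

```r
library(stringr)

# Hypothetical excerpt of the script text; the real page content may differ.
page_text <- '{"options": {"nav": "thumbs"}, "data": [{"thumb":"https:\\/\\/example.com\\/t\\/a.jpg","img":"https:\\/\\/example.com\\/i\\/a.jpg","isMain":true},{"thumb":"https:\\/\\/example.com\\/t\\/b.jpg","img":"https:\\/\\/example.com\\/i\\/b.jpg","isMain":false}]}'

# Hypothetical helper: returns every "img" value found in the text.
extract_img_links <- function(page_text) {
  # str_match_all returns a list with one matrix per input string;
  # column 2 of the matrix holds the capture group.
  matches <- str_match_all(page_text, '"img":"(.*?)"')[[1]]
  # Undo JSON escaping of forward slashes ("\/" becomes "/").
  str_replace_all(matches[, 2], fixed("\\/"), "/")
}

img_links <- extract_img_links(page_text)
print(img_links)
```

On the real page you would pass in the text from `read_html(url) %>% html_text()` instead of the sample string. If nothing matches, the function returns character(0).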
QHarr