Web scraping images in R from www.zoobashop.com

Question

I am collecting wax images for my classification algorithm.

First, I collected the links to all the image blocks. Each block contains one or two images, and I want to extract their URLs.

For example, for this block link: https://www.zoobashop.com/woodin-fusion-de-woodin-wo29gha-29017-6-yards.html

library(rvest)
html <- read_html("https://www.zoobashop.com/woodin-fusion-de-woodin-wo29gha-29017-6-yards.html")

get_block_img <- function(html){
  html %>% 
    html_nodes('.fotorama__thumb  img#fotorama__img')%>% 
    html_attr("src")
}

get_block_img(html)

The result is character(0).

Can someone help me, please?

I cannot find the CSS node that corresponds to the image. – Armel Soubeiga, Dec 09 '19 at 12:23
Sorry, I haven't looked at the website, so my advice may be off. If you can't find an element, you may need to load the page through a headless browser so that all the JavaScript runs in the background. – Roman Luštrik, Dec 09 '19 at 12:24

score 0 · Accepted Answer · answered Dec 18 '19 at 04:42

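The image URL is read dynamically from a script tag when JavaScript runs in the browser, so the img node you are selecting does not exist in the static HTML that read_html() returns. You can extract the URL from the response text with a regex instead.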

library(rvest)
library(stringr)

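# html_text() on the whole document also returns the contents of the script tags
page_text <- read_html('https://www.zoobashop.com/woodin-fusion-de-woodin-wo29gha-29017-6-yards.html') %>%
        html_text()
# Capture the first "img" value that follows "data": (row 1, capture group in column 2)
link <- str_match(page_text, '"data": .*?"img":"(.*?)"')[1,2]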
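Note that str_match() returns only the first match. If a block holds more than one image, a minimal sketch along the following lines may help. It assumes the script text lists each image as an "img":"..." entry with JSON-escaped slashes; the sample string and the example.com URLs are hypothetical stand-ins, not the real page content.

```r
library(stringr)

# Hypothetical excerpt of the script text; the real page may be structured differently.
page_text <- '"data": [{"thumb":"https:\\/\\/example.com\\/t1.jpg","img":"https:\\/\\/example.com\\/a.jpg"},{"thumb":"https:\\/\\/example.com\\/t2.jpg","img":"https:\\/\\/example.com\\/b.jpg"}]'

# str_match_all() returns one matrix per input string;
# column 2 of that matrix holds the capture group for every match.
matches <- str_match_all(page_text, '"img":"(.*?)"')[[1]]
img_urls <- matches[, 2]

# Undo the JSON escaping of forward slashes ("\/" becomes "/").
img_urls <- str_replace_all(img_urls, fixed("\\/"), "/")

print(img_urls)
```

To try this on the real page, page_text would be replaced by the html_text() output of read_html(); the pattern may need adjusting if the page stores the URLs differently.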

answered Dec 18 '19 at 04:42

QHarr

Thank you for your answer @QHarr . But with this code, I always get a single image – Armel Soubeiga Dec 18 '19 at 10:02
Yes, I just replaced the pattern `'"data": .*?"img":"(.*?)"'` with `'"data": .*?"img":"(.*?)"img'` and it works. – Armel Soubeiga Dec 18 '19 at 10:04
Hi, so you used the first pattern in your comment above, is that right? – QHarr Dec 18 '19 at 12:09
Yes, that's it. Thank you. – Armel Soubeiga Dec 18 '19 at 14:12
