
I'm new to R and web scraping. I'm currently scraping a real estate website (https://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Rheinland-Pfalz/Koblenz?enteredFrom=one_step_search), but I can't manage to scrape the links to the individual offers.

With the code below, I get every link on the page, and I'm not sure how to filter the result so that it contains only the links to the 20 real estate offers. Maybe you can help me.

Viewing the source code and inspecting the elements hasn't helped so far.

url <- immo_webp %>%
  html_nodes("a") %>%
  html_attr("href")
QHarr
I would suggest you take a look at this (https://stackoverflow.com/questions/35247033/using-rvest-to-extract-links). – FAlonso Oct 01 '19 at 12:35

1 Answer


You can target the article elements and then construct each URL by appending the value of its data-obid attribute to a base string.

library(rvest)
library(magrittr)

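# Every offer URL is this prefix followed by the offer's id
base <- 'https://www.immobilienscout24.de/expose/'

# Read the results page, select the article elements, and extract their data-obid values
ids <- read_html("https://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Rheinland-Pfalz/Koblenz?enteredFrom=one_step_search") %>%
  html_nodes('article') %>%
  html_attr('data-obid')

# Prepend the base to each id; lapply() returns a list of URLs
urls <- lapply(ids, function(id) paste0(base, id))
print(urls)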
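
For trying the approach out without requesting the live site, here is a minimal offline sketch. The HTML fragment is a made-up stand-in for the results page (the real markup may differ), and the `article[data-obid]` selector assumes that only offer articles carry that attribute. Because `paste0()` is vectorized, it can also replace the `lapply()` call if a plain character vector is more convenient than a list.

```r
library(rvest)

# Hypothetical stand-in for the search results page; the real markup may differ.
page <- read_html('
  <html><body>
    <article data-obid="111"><a href="/expose/111">Offer A</a></article>
    <article data-obid="222"><a href="/expose/222">Offer B</a></article>
    <article><a href="/other">No offer id here</a></article>
  </body></html>')

base <- "https://www.immobilienscout24.de/expose/"

# Select only <article> elements that actually have a data-obid attribute,
# so articles without an id do not produce NA values.
ids <- page %>%
  html_nodes("article[data-obid]") %>%
  html_attr("data-obid")

# paste0() works element-wise on the whole vector of ids.
urls <- paste0(base, ids)
print(urls)
```

If the page is rendered or changed by JavaScript after loading, the attribute may be missing from what read_html() receives, so it is worth checking that `ids` is not empty before building the URLs.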
QHarr