How to fix following error in R 'Error in UseMethod("xml_find_all")' while web scraping with rvest?

Question

I am new to R and am currently working on an assignment dealing with web scraping.

I am supposed to read in all the sentences from this web page: https://www.cs.columbia.edu/~hgs/audio/harvard.html

This is my current code:

library(xml2)
library(rvest)
url <- 'https://www.cs.columbia.edu/~hgs/audio/harvard.html'
read_html(url)
sentences <- url %>%
  html_nodes("li") %>%
  html_text()

And every time I run it, I get this error:

Error in UseMethod("xml_find_all") : no applicable method for 'xml_find_all' applied to an object of class "character"

Can you please help me? I don't understand what I'm doing wrong.

score 1 · Answer 1 · answered Nov 18 '19 at 19:34

You never assigned the result of read_html(url) to a variable (I imagine you meant to reuse the parsed page), so the parsed document is discarded. As a result, url %>% html_nodes("li") passes a character string instead of an "xml_document", which is what the error is telling you (internally, rvest::html_nodes calls xml2::xml_find_all, which has no method for class "character").

You could do this:

html <- read_html(url)

sentences <- html %>%
  html_nodes("li") %>%
  html_text()

Or this, if you are reading the URL only once:

sentences <- read_html(url) %>%
  html_nodes("li") %>%
  html_text()
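
To see the difference between the two objects for yourself, here is a minimal sketch that runs without a network connection. The page_source string is a hypothetical stand-in for the real page (its contents are made up for illustration), and it relies on read_html() accepting literal HTML as well as a URL:

```r
library(xml2)
library(rvest)

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
page_source <- "<html><body><ol><li>First sentence.</li><li>Second sentence.</li></ol></body></html>"

# A plain string has class "character"; html_nodes() has no method for it,
# which is what triggers the xml_find_all error in the question.
class(page_source)

# read_html() returns a parsed document that html_nodes() can search.
html <- read_html(page_source)
class(html)

# Same pipeline as above, applied to the parsed document.
sentences <- html %>%
  html_nodes("li") %>%
  html_text()

sentences
```

Checking class() on whatever you pipe into html_nodes() is a quick way to confirm you are passing the parsed document and not the URL string.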
