Due to global IT settings, I am having a hard time using htmlParse or read_html directly on a URL.
My workaround was to fetch the page with readLines from the base package and then parse the result with htmlParse. Is there a disadvantage to this approach that I am not aware of?
At least for my MWE it seems to yield the same output, but this might be different for more elaborate HTML.
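library(XML)
mailing_url <- "http://www.r-project.org/mail.html"
# Workaround: fetch the raw lines with base R, then parse them
mailing_lines <- readLines(mailing_url)
mailing_doc.RL <- htmlParse(mailing_lines)
# Direct route: let htmlParse fetch and parse the URL itself
mailing_doc.HTML <- htmlParse(mailing_url)
all.equal(mailing_doc.RL, mailing_doc.HTML)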
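I am not sure that all.equal on the two document objects really compares their content, so a stricter check might be to compare the serialized markup instead. Below is a minimal sketch of that idea that does not need network access. The temp file and its two-line page are hypothetical stand-ins for the real URL, and collapsing the lines and passing asText = TRUE is my assumption about the most explicit way to tell htmlParse that it is receiving content rather than a path.

```r
library(XML)

# Hypothetical stand-in page, written to a temp file so no network is needed
tmp <- tempfile(fileext = ".html")
writeLines(c("<html><head><title>Demo</title></head>",
             "<body><p>Hello</p></body></html>"), tmp)

# Route 1: read the lines first, then parse them explicitly as text
page_lines <- readLines(tmp)
doc_rl <- htmlParse(paste(page_lines, collapse = "\n"), asText = TRUE)

# Route 2: let htmlParse read the file itself
doc_direct <- htmlParse(tmp)

# Compare the serialized markup rather than the document objects
same_markup <- identical(saveXML(doc_rl), saveXML(doc_direct))
print(same_markup)
```

If same_markup is FALSE for a real page, comparing the two saveXML strings side by side should show where the two routes diverge.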