The following script takes me to a web page that lists several links with similar names. I want only one of them, which can be differentiated from the others because it is printed in bold on the page. However, I could not find a way to select a bold link within a list.
Would anyone have a hint on this? Thanks in advance!
library(httr)
library(rvest)
sp <- "Alnus japonica"
res <- httr::POST(url = "http://apps.kew.org/wcsp/advsearch.do",
                  body = list(page = "advancedSearch",
                              AttachmentExist = "",
                              family = "",
                              placeOfPub = "",
                              genus = unlist(strsplit(as.character(sp), split = " "))[1],
                              yearPublished = "",
                              species = unlist(strsplit(as.character(sp), split = " "))[2],
                              author = "",
                              infraRank = "",
                              infraEpithet = "",
                              selectedLevel = "cont"),
                  encode = "form")
pg <- content(res, as="parsed")
lnks <- html_attr(html_nodes(pg,"a"),"href")
# How can I get the URL of the link with the accepted name (shown in bold)?
res2 <- try(GET(sprintf("http://apps.kew.org%s", lnks[grep("id=", lnks)][1])), silent = TRUE)
# This gets a link, but often not the bold one.
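One possible direction, sketched here under assumptions rather than tested against the live page: if the bold styling comes from a `<b>` or `<strong>` tag wrapped around (or nested inside) the link, an XPath predicate can restrict the selection to those anchors. The HTML string and its `href` values below are hypothetical stand-ins for the real results page, whose markup may differ; if the bold comes from a CSS class instead, the predicate would need to test that class.

```r
library(rvest)

# Hypothetical markup standing in for the search results page.
# The real page may nest the tags differently or use CSS for the bold styling.
html <- '<html><body>
  <p><a href="/x?id=1">Alnus japonica var. x</a></p>
  <p><b><a href="/x?id=2">Alnus japonica</a></b></p>
  <p><a href="/x?id=3"><b>Alnus japonica f. y</b></a></p>
</body></html>'

pg <- read_html(html)

# Keep only anchors that sit inside a <b>/<strong> element
# or that contain one, so both nesting orders are covered.
bold_nodes <- html_nodes(
  pg,
  xpath = "//a[ancestor::b or ancestor::strong or .//b or .//strong]"
)
bold_lnks <- html_attr(bold_nodes, "href")

# Same "id=" filter as in the question, applied to the bold links only.
bold_lnks <- bold_lnks[grepl("id=", bold_lnks, fixed = TRUE)]
print(bold_lnks)
```

With the question's own `pg` object, the same `html_nodes(pg, xpath = ...)` call could replace the plain `html_nodes(pg, "a")` selection, assuming the accepted name is marked up with one of these tags.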
` is technically invalid HTML/XML and `libxml2` parses it that way.
– hrbrmstr May 06 '16 at 01:52