I routinely want images of organisms to complement datasets, and it would be great if I could pull a species image for, say, a bottlenose dolphin, given the genus and species. I would then use this image in a K-12 educational Shiny app, similar to this one, for students to explore the dataset. I found a way to get the URL and some page info with the WikipediR package, but I can't figure out how to extract the URL for the image in the sidebar.

require(WikipediR)
page_info("en","wikipedia",page="Tursiops truncatus")

I know there's a way (e.g. here), but I don't really understand how to make it work in R.

mattador
  • I don't think you need to use WikipediaR package or even the Wikimedia API. I would just suggest using rvest, similar to this question here: https://stackoverflow.com/questions/36202414/r-download-image-using-rvest – Stedy Sep 13 '17 at 02:56
  • Thanks, so much for pointing me in this direction! This led me to what I was looking for! Cheers – mattador Sep 13 '17 at 18:14

1 Answer

Thanks to Stedy's suggestion, I found a solution. Note there are 2 similarly named Wikipedia interface packages for R. This one uses WikipediR, not WikipediaR.

library(WikipediR); library(rvest)

# titles   = vector of page name(s)
# res      = desired width in pixels (220 px thumbnail by default)
# savedest = save destination (with terminal '/'); working directory by default

getwikipic <- function(titles, res = 220, savedest = NA) {
  lapply(titles, function(ttl) {
    # look up the canonical URL of the article
    d <- page_info("en", "wikipedia", page = ttl, clean_response = TRUE)
    url <- d[[1]]$fullurl
    # read_html() is used instead of html_session(), which is deprecated in rvest >= 1.0
    wikipage <- read_html(url)
    # first image in a second table row, i.e. the infobox picture
    imginfo <- wikipage %>% html_nodes("tr:nth-child(2) img")
    img.url <- imginfo[1] %>% html_attr("src")
    img.url <- paste0("https:", img.url)  # src is protocol-relative ("//upload...")
    if (is.na(savedest)) {
      savefilename <- paste0(ttl, ".jpg")
    } else {
      savefilename <- paste0(savedest, ttl, ".jpg")
    }
    # swap only the thumbnail width token, not every "220" in the URL
    if (res != 220) {
      img.url <- sub("/220px-", paste0("/", res, "px-"), img.url, fixed = TRUE)
    }
    # mode = "wb" keeps the binary image intact on Windows
    download.file(img.url, savefilename, mode = "wb")
    paste0("orig.file: ", basename(img.url))  # tell user original filename
  })  # End lapply
}  # End function
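
The function changes the picture size by rewriting the width inside the thumbnail URL. Here is a minimal sketch of that step on its own, assuming thumbnail URLs carry the width as a ".../<width>px-<filename>" token. `set_thumb_width` is a hypothetical helper name and the URL is made up for illustration; neither comes from the repo.

```r
# Hypothetical helper: rewrite the "<width>px-" token in a thumbnail URL.
set_thumb_width <- function(img_url, width) {
  # match "/<digits>px-" so that digits elsewhere in the URL are left alone
  sub("/[0-9]+px-", paste0("/", width, "px-"), img_url)
}

# Made-up URL, shaped like a Wikimedia thumbnail path (assumption)
example_url <- "https://upload.wikimedia.org/wikipedia/commons/thumb/a/ab/Photo_220.jpg/220px-Photo_220.jpg"
new_url <- set_thumb_width(example_url, 1024)
print(new_url)
```

Matching the whole "/220px-" style token, rather than the bare number, means a file whose name happens to contain "220" is not renamed by accident.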

Alternatively, I created a GitHub repo with the code here. You can source and run this quite simply in R.

devtools::source_url("https://raw.githubusercontent.com/drwilkins/getwikipic/master/getwikipic.R")

titles<-c("numbat")
getwikipic(titles,1024)

This downloads the image to your working directory as numbat.jpg.
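
Note that the file is always saved with a .jpg extension, whatever format the original image has. Below is a rough sketch of taking the extension from the image URL instead, assuming the URL ends in the file name. `wikipic_filename` is a hypothetical helper, not part of the repo, and the URL is a placeholder.

```r
# Hypothetical helper: build a save path that keeps the image's own extension.
wikipic_filename <- function(ttl, img_url, savedest = "") {
  ext <- tools::file_ext(img_url)   # e.g. "jpg" or "png"; "" if none is found
  if (ext == "") ext <- "jpg"       # fall back to the original behaviour
  paste0(savedest, ttl, ".", ext)
}

# Placeholder URL for illustration only
fname <- wikipic_filename("numbat", "https://upload.wikimedia.org/example/220px-Numbat.png")
print(fname)
```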

mattador
  • since your code isn't that long, could you please post (i.e. cut and paste) it here in addition to providing the Github link? SO strongly encourages making answers as self-contained as possible ... – Ben Bolker Sep 13 '17 at 18:40
  • No problem. Code added! – mattador Sep 14 '17 at 22:03