I'm trying to scrape web data, but the first step requires a login. I've been able to log into other websites successfully, but I get a weird error with this one.

library("rvest")
library("magrittr")    

research <- html_session("https://www.fitchratings.com/")

signin <- research %>%
  html_nodes("form") %>%
  extract2(1) %>%
  html_form() %>%
  set_values(
    userName = "abc",
    password = "1234"
  )

research <- research %>%
  submit_form(signin)

When I run the 'submit_form' line I get the following error:

> research <- research %>%
+ submit_form(signin)
Submitting with '<unnamed>'
Error: length(url) == 1 is not TRUE

Submitting with '<unnamed>' is expected, because the sign-in button has no name assigned. Any help is appreciated!

barny
Hugo S.
  • Is this example still valid? When I run it, I get `Error: Unknown field names: userName, password`. – WhiteViking Sep 14 '15 at 23:44
  • Also, it seems this problem was due to a bug in rvest and was fixed by the rvest package author: https://github.com/hadley/rvest/issues/73. Unfortunately, no official version of rvest with the fix has been released since. It may be possible to manually install the latest version from GitHub, though. – WhiteViking Sep 14 '15 at 23:47

1 Answer

I was having the same issue. I jumped through a few hoops to get the dev version of rvest running, and it's working smoothly now. Here's how I went about it:

First things first: you need to install Rtools, which can be found here: https://cran.r-project.org/bin/windows/Rtools/. Make sure R is closed while you install it. Installation instructions for Windows users are here: github.com/stan-dev/rstan/wiki/Install-Rtools-for-Windows

Boot up R, then install libraries "httr" and "Rcpp" if you don't have them already.
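
If you're not sure whether those two are already installed, something like the following should work. This is just a sketch: `missing_packages` is a helper name I made up, not part of any package, and the install step only runs in an interactive session.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("httr", "Rcpp"))

# Only install what is actually missing, and only when run interactively.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Packages that are already present are left alone, so it is safe to rerun.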

Install "devtools" and its associated GitHub installer. More information can be found in the linked repo, but here is a quick summary.

Windows:

install.packages("devtools")
library(devtools)
build_github_devtools()

#### Restart R before continuing ####
install.packages("devtools.zip", repos = NULL, type = "source")

# Remove the package after installation
unlink("devtools.zip")

Mac/Linux:

devtools::install_github("hadley/devtools")

Now, to run the final steps.

library(httr)
library(Rcpp)
library(devtools)
install_github("hadley/rvest")

You should now be able to run submit_form(session, form) without hitting this error:

Submitting with 'xxxx'
Error: length(url) == 1 is not TRUE
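
If the error persists, it may be worth checking (after restarting R) which version of rvest is actually being loaded. A rough sketch, where `installed_version` is a hypothetical helper of mine and not an rvest function:

```r
# Hypothetical helper: version string of an installed package,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Compare this with the latest version number listed on CRAN.
print(installed_version("rvest"))
```

I'm assuming here that the GitHub build reports a higher version number than the CRAN release; if it shows the same number, the install from GitHub probably didn't take effect.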
robeot
  • Note: Rtools is only necessary on Windows. On Mac & Linux, installing a development package is trivial (it’s one command, as shown in your answer). In fact, I haven’t used CRAN in ages, I install everything from GitHub. – Konrad Rudolph Sep 18 '15 at 09:22