I am trying to scrape data from Yahoo after logging in to my Yahoo account. I've tried to follow the answers to "Using rvest or httr to log in to non-standard forms on a webpage" and "Scraping password protected forum in r", but after submitting the form I do not seem to be logged in, and I get redirected back to the login page.
Here is the code I am using:
library(rvest)
yahoo <- html_session("https://login.yahoo.com/")
form <- html_form(yahoo)[[1]]
filled_form <- set_values(form, "username"="myusername", "passwd"="mypassword")
filled_form$url <- yahoo$url # otherwise I get an error from no url
sess <- submit_form(yahoo, filled_form)
I should then be able to navigate to the personal information page and select a node that reads "Personal info":
sess %>%
jump_to("https://login.yahoo.com/account/personalinfo") %>%
read_html() %>%
html_nodes("h1") %>%
html_text()
But instead, I just get "Sign in". One potential issue is with submit_form(), which seems to be using passwd rather than the correct submit button (signin). But when I try using signin, I get an error from submit_form():
sess <- submit_form(yahoo, filled_form, submit='signin')
gives "Error: Unknown submission name 'signin'.Possible values: passwd"
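For reference, here is a minimal sketch of how the fields that rvest parsed from a form can be listed, to check which names and types it actually sees. The markup is a made-up stand-in rather than Yahoo's real login page, and the sketch assumes each entry of form$fields carries name and type elements:

```r
library(rvest)

# Hypothetical stand-in markup; Yahoo's real login form will differ.
page <- read_html('<form action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="passwd">
  <input type="submit" name="signin" value="Sign in">
</form>')

form <- html_form(page)[[1]]

# Collect one row per parsed field: its name and its type.
# as.character(...)[1] turns a missing name or type into NA instead of failing.
fields <- data.frame(
  name = vapply(form$fields, function(f) as.character(f$name)[1],
                character(1), USE.NAMES = FALSE),
  type = vapply(form$fields, function(f) as.character(f$type)[1],
                character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(fields)
```

Running the same listing on the form taken from the real session would show whether signin appears among the parsed fields at all, or whether only passwd is available as a submission name.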
Any ideas for what I should be doing differently, or what I'm doing wrong? Thanks!