I am trying to scrape my trip history from the Capital Bikeshare website. I have to log in and go to the trips menu to see the data, but I get this error:

> No encoding supplied: defaulting to UTF-8.
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘readHTMLTable’ for signature ‘"xml_document"’

Here's my code.

> library(httr)
> library(XML)
> handle <- handle("https://www.capitalbikeshare.com/")
> path <-"profile/trips"

> login <- list( profile_login="myemail", profile_pass ="mypassword", profile_redirect_url="https://secure.capitalbikeshare.com/profile/trips/QNURCMF2Q6")
> response <- POST(handle = handle, path = path, body = login)
> readHTMLTable(content(response))

I also tried using rvest, but I kept getting the error "Error: Unknown field names: _username, _password". Which field names should I use here? I tried Id, name, etc., and it still didn't work.

Dave2e
jso1226

1 Answer

For a start, the member login page is not the home page you used above; it has its own URL.

This may not be correct, but try this as a possible rvest starting point:

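library(rvest)

# The login form lives on its own page, not on the home page
login <- "https://secure.capitalbikeshare.com/profile/login"

pgsession <- html_session(login)
pgform <- html_form(pgsession)[[1]]  # first form on the login page
# Update the user id and password in the next line
filled_form <- set_values(pgform, "_username" = "myemail@gmail.com", "_password" = "password")
# Keep the returned session so later requests use the logged-in state
pgsession <- submit_form(pgsession, filled_form)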

Once you have logged in, you can use the jump_to function to move to the desired pages:

page <- jump_to(pgsession, newurl)  # newurl is the address of the page to go to next
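
Since `newurl` has to exist before `jump_to` is called, here is one possible sketch of that step. It assumes the trips URL quoted in the question (its last path segment is presumably account-specific), and `jump_and_check` is a hypothetical helper name, not part of rvest. Nothing is requested until the helper is called with a logged-in session.

```r
library(rvest)

# Assumption: trips URL copied from the question; replace with your own.
newurl <- "https://secure.capitalbikeshare.com/profile/trips/QNURCMF2Q6"

# Hypothetical helper: jump to target_url from an existing session and
# raise an error if the server answers with a 4xx/5xx status.
jump_and_check <- function(session, target_url) {
  page <- jump_to(session, target_url)
  # The session object keeps the last httr response in its `response` field
  httr::stop_for_status(page$response)
  page
}
```

It would be called as `page <- jump_and_check(pgsession, newurl)`. Note that in rvest 1.0.0 and later, `html_session`, `set_values`, `submit_form` and `jump_to` are deprecated in favour of `session`, `html_form_set`, `session_submit` and `session_jump_to`.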

Hope this helps. If this does not work, leave a comment and I'll delete the post.

Dave2e
  • I get this error `Error in xml2::url_absolute(url, x$url) : object 'newurl' not found` after `page – jso1226 Sep 24 '16 at 02:44
  • @jso1226, if you got this far, the login step worked. Sorry, I should have been clearer: "newurl" will be the URL of the page you wish to go to. When I have done similar steps, I have had to define this URL earlier in my code. – Dave2e Sep 24 '16 at 04:06
  • I don't get the error anymore and it seems like it went through, but would I need to run another command? I wanted to see if it will give me the fields that I'm supposed to see, such as trip information and duration. So I ran `summary(page)`, but it gives me a table with handle, config, url, back, forward, response and html as rows and length, class and mode as columns. – jso1226 Sep 24 '16 at 04:30
  • Yes, correct. Now that you have jumped to the new page, you will need to parse the webpage with the html_nodes function. Please see the documentation associated with the rvest package for more help. – Dave2e Sep 24 '16 at 12:40
  • Thank you @Dave2e! – jso1226 Sep 28 '16 at 04:16
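
The parsing step mentioned in the comments could look roughly like the sketch below. The HTML string is a made-up stand-in for the trips page (the real markup, column names and CSS selector are unknown here and will likely differ), so treat this as an illustration of `html_nodes` plus `html_table` rather than a working scraper for the site.

```r
library(rvest)

# Assumption: invented sample markup standing in for the real trips page.
sample_html <- '<html><body>
<table>
<tr><th>Start</th><th>End</th><th>Duration</th></tr>
<tr><td>Station A</td><td>Station B</td><td>12 min</td></tr>
<tr><td>Station B</td><td>Station C</td><td>7 min</td></tr>
</table>
</body></html>'

page <- read_html(sample_html)

# Select every <table> node, then convert the first one to a data frame
tables <- html_nodes(page, "table")
trips <- html_table(tables[[1]])
print(trips)
```

With a live session, the object returned by `jump_to` should be usable in place of `page` here, and the `"table"` selector would need to be adjusted to whatever element actually holds the trip data.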