
I am not an absolute beginner in R but I am an absolute beginner in scraping.

I am trying to download each page from the URL below

https://www.tbmm.gov.tr/develop/owa/secim_sorgu.genel_secimler

Starting under "Secim Cevresi" with 1950 and then "Adana", the first page to be downloaded is:

https://www.tbmm.gov.tr/develop/owa/secim_sorgu.secim_cevresi_partiler?p_secim_yili=1950&p_il_kodu=1

I basically need to download the table on the page above, and then loop over the other pages. However, so far I couldn't even download that first table, let alone the others. I've written the following code, but it returns an empty table. Any input is appreciated!

library(rvest)   # read_html(), html_nodes(), html_text()
library(purrr)   # map_df()

# %d is the placeholder for the province code (1-67)
url <- "https://www.tbmm.gov.tr/develop/owa/secim_sorgu.secim_cevresi_partiler?p_secim_yili=1950&p_il_kodu=%d"

Adana1950 <- map_df(1:67, function(i) {
  page <- read_html(sprintf(url, i))
  data.frame(vote    = html_text(html_nodes(page, "table")),
             Heading = html_text(html_nodes(page, "h2")))
})
  • The table is not loaded with javascript. There is however a test for whether you are human. I am seeing "This question is for testing whether you are a human visitor and to prevent automated spam submission..." - this means there is no table to select at this point. Weirdly, I don't have this problem with python _requests_. Requests will add a default user-agent so I played around with httr using different headers and html_session but seems stuck in a loop for me of that same question. – QHarr Nov 19 '19 at 10:40
  • I can also use pd.read_html from pandas in python with no problem - that uses no request headers that I can see listed in its source code. – QHarr Nov 19 '19 at 11:25
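Building on the comments above, one thing worth trying is sending the request with an explicit browser-like User-Agent header via httr before parsing with rvest. This is a sketch, not a confirmed fix: it assumes the "are you human" interstitial is triggered by R's default user agent (the specific User-Agent string and the ISO-8859-9 encoding guess for a Turkish-language page are my assumptions, not anything stated in the question):

```r
library(httr)    # GET(), user_agent(), content()
library(rvest)   # read_html(), html_nodes(), html_table()

url <- "https://www.tbmm.gov.tr/develop/owa/secim_sorgu.secim_cevresi_partiler?p_secim_yili=1950&p_il_kodu=1"

# Assumption: a browser-like User-Agent may bypass the human check
resp <- GET(url, user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))

# Assumption: ISO-8859-9 (Latin-5) is a common encoding for Turkish pages;
# drop the encoding argument if the server declares UTF-8
page <- read_html(content(resp, as = "text", encoding = "ISO-8859-9"))

# html_table() parses <table> nodes into data frames directly,
# which is usually more useful than html_text() on the table node
tables <- html_table(html_nodes(page, "table"), fill = TRUE)
```

If a single page works this way, the same `GET()` call can be dropped into the `map_df()` loop in place of the bare `read_html(sprintf(url, i))`.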

0 Answers