I want to scrape an HTML table in R using the rvest package. It works, but I have one problem: not all rows are scraped. For this example, I am using data from Yahoo! Finance. Here is my code:
library("rvest")
# I use AAPL as an example
# Time period: Jan 1, 2012 - May 14, 2018
url = 'https://finance.yahoo.com/quote/AAPL/history?period1=1325350800&period2=1526230800&interval=1d&filter=history&frequency=1d'
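# Read the page, select every <table> node, and parse each into a data frame
df = url %>%
  read_html() %>%
  html_nodes("table") %>%
  html_table()
# Keep the first table on the page
df = data.frame(df[[1]])
# Number of rows actually scraped
nrow(df)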
The problem shows up when I check the total number of rows: there are only 101 (Dec 20, 2017 - May 11, 2018), not the full requested period. What am I missing?
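To rule out a mistake in the URL itself, here is a minimal sanity check in base R. It assumes that `period1` and `period2` are Unix timestamps in seconds (that is my reading of the query string, not something I have confirmed), and the variable names are only for illustration. It decodes both values and estimates how many rows the requested period should roughly contain.

```r
# Values copied from the period1 / period2 query parameters in the URL above.
# Assumption: they are Unix timestamps in seconds.
period1 <- 1325350800
period2 <- 1526230800

start <- as.POSIXct(period1, origin = "1970-01-01", tz = "UTC")
end   <- as.POSIXct(period2, origin = "1970-01-01", tz = "UTC")

# Human-readable form of both endpoints, in UTC
start_utc <- format(start, "%Y-%m-%d %H:%M", tz = "UTC")
end_utc   <- format(end, "%Y-%m-%d %H:%M", tz = "UTC")
print(c(start_utc, end_utc))

# Calendar days covered by the request
span_days <- as.numeric(difftime(end, start, units = "days"))
print(span_days)

# Rough upper bound on trading days: weekdays between the two dates
# from the comment in my code (market holidays are not excluded here).
days <- seq(as.Date("2012-01-01"), as.Date("2018-05-14"), by = "day")
weekday_count <- sum(!(format(days, "%u") %in% c("6", "7")))
print(weekday_count)
```

The request spans several years, so the URL does seem to ask for the whole period, while the scraped table has only 101 rows.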
Thank you.