I have a script that parses Yahoo Finance's historical pricing data for a vector of ticker symbols. It uses the date codes in the URL to cover the timeframe from 1/1/2014 to yesterday. The script runs without errors, but I'm only getting the first 100 rows per ticker. The problem appears to be that Yahoo Finance, even with a large date range selected, only shows the first 100 results until you scroll down. Is there a workaround?
You can see the issue going here...
# Example to test...
library(XML)  # provides htmlTreeParse, getNodeSet, readHTMLTable
Ticker <- c("AMZN", "F")
maxDate <- 1548918000  # Unix timestamp (seconds) for the end of the range
for (s in Ticker) {
  # period1=1388559600 is the start of the range (1/1/2014)
  url <- paste0('https://finance.yahoo.com/quote/', s,
                '/history?period1=1388559600&period2=', maxDate,
                '&interval=1d&filter=history&frequency=1d')
  webpage <- readLines(url, warn = FALSE)
  html <- htmlTreeParse(webpage, useInternalNodes = TRUE, asText = TRUE)
  tableNodes <- getNodeSet(html, "//table")
  # Read the first table on the page and tag each row with its symbol
  df <- readHTMLTable(tableNodes[[1]],
                      header = c("Date", "Open", "High", "Low", "Close", "Adj. Close", "Volume"))
  df['Symbol'] <- s
  assign(s, df)
}
# Stack the per-ticker data frames into a single data frame
tickerDataList <- mget(Ticker)
tickerData <- do.call(rbind, tickerDataList)
The expected result is the same table, but covering the full date range back to 1/1/2014. That would mean a couple thousand rows instead of two hundred.
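For reference, the date codes in the URL (period1 and period2) are Unix timestamps in seconds. The values hard-coded in the script, 1388559600 and 1548918000, correspond to 07:00 UTC on 2014-01-01 and 2019-01-31. Below is a minimal sketch of how they could be computed so that the end date really is "yesterday". The helper names `to_epoch` and `build_url` are hypothetical and not part of my script, and the default time zone is an assumption. This only builds the URL; it does not address the 100-row limit.

```r
# Hypothetical helper: convert a calendar date to a Unix timestamp
# (seconds since 1970-01-01 00:00:00 UTC). The tz default is an assumption.
to_epoch <- function(date, tz = "UTC") {
  as.numeric(as.POSIXct(as.character(date), tz = tz))
}

# Hypothetical helper: build the history URL for one symbol.
# sprintf("%.0f", ...) avoids scientific notation (e.g. "1.5e+09") in the URL.
build_url <- function(symbol, from, to) {
  paste0("https://finance.yahoo.com/quote/", symbol,
         "/history?period1=", sprintf("%.0f", from),
         "&period2=", sprintf("%.0f", to),
         "&interval=1d&filter=history&frequency=1d")
}

minDate <- to_epoch("2014-01-01")    # start of the range
maxDate <- to_epoch(Sys.Date() - 1)  # yesterday

url <- build_url("AMZN", minDate, maxDate)
```

Passing `tz = "Etc/GMT+7"` (which means UTC-7) to `to_epoch("2014-01-01")` reproduces the 1388559600 used in the script.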