I wrote code that scrapes every day of the year and saves the result in a separate .xlsx file for each day.
library(rvest)  # read_html(), html_nodes(), html_text(), html_table(), %>%

start <- as.Date("25-01-19", format = "%d-%m-%y")
end <- as.Date("17-12-19", format = "%d-%m-%y")
theDate <- start
while (theDate <= end)
{
  # Build the URL for the current day
  url <- paste0("http://www.b3.com.br/pt_br/produtos-e-servicos/emprestimo-de-ativos/renda-variavel/emprestimos-registrados/renda-variavel-8AE490CA64CD50310164D1EFD6412F1C.htm?data=", format(theDate, "%d/%m/%y"), "&f=0")
  site <- read_html(url)
  Info_Ajuste_HTML <- html_nodes(site, 'table')
  Info_ajuste <- html_text(Info_Ajuste_HTML)
  head(Info_ajuste, 20)
  t <- head(Info_Ajuste_HTML)
  lista_tabela <- site %>%
    html_nodes("table") %>%
    html_table(fill = TRUE)
  str(lista_tabela)
  head(lista_tabela[[1]], 10)
  # My attempt at skipping the days without a table
  if (t == "character(0)") {
    theDate <- theDate + 1
  } else {
    ... code ...
The URL accessed is dynamic and changes for each day. The problem is on the days when the site goes offline: running head(Info_ajuste, 20) returns the empty result "character(0)", and running head(Info_Ajuste_HTML) returns "{xml_nodeset (0)}".
This happens because the code downloads a table, and on some days the site does not make that table available.
I need an "if" that skips the days that give this result.
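
One possible way to write that check is to test the length of the node set rather than compare it to the string "character(0)", since an empty result simply has length zero. The sketch below is only an illustration under assumptions: the helper names has_table, read_page_or_null and scrape_available_days are hypothetical (they are not part of rvest), the tryCatch wrapper assumes that a failed request should also count as a day to skip, and literal HTML strings stand in for the real pages so the check can be tried offline.

```r
library(rvest)

# Hypothetical helper: TRUE when the parsed page contains at least one <table>.
has_table <- function(page) {
  length(html_nodes(page, "table")) > 0
}

# Hypothetical helper: read a page, returning NULL instead of stopping
# when read_html() itself fails (for example, the request errors out).
read_page_or_null <- function(url) {
  tryCatch(read_html(url), error = function(e) NULL)
}

# Literal HTML stands in for the real pages (assumption, for offline testing).
page_with_table    <- read_html("<html><body><table><tr><td>1</td></tr></table></body></html>")
page_without_table <- read_html("<html><body><p>no data</p></body></html>")

has_table(page_with_table)     # TRUE
has_table(page_without_table)  # FALSE

# Hypothetical loop: calls handle_tables() only on days that have a table
# and returns the dates that were skipped.
scrape_available_days <- function(start, end, build_url, handle_tables) {
  skipped <- as.Date(character(0))
  theDate <- start
  while (theDate <= end) {
    site <- read_page_or_null(build_url(theDate))
    if (is.null(site) || !has_table(site)) {
      # No page or no table for this day: record it and move on
      skipped <- c(skipped, theDate)
    } else {
      tables <- html_table(html_nodes(site, "table"), fill = TRUE)
      handle_tables(tables, theDate)
    }
    theDate <- theDate + 1  # advance exactly once per iteration
  }
  skipped
}
```

With the code from the question, build_url would be a function that pastes the formatted date into the B3 URL, and handle_tables would hold the part that saves the file for that day. Incrementing theDate once at the end of the loop body, instead of inside one branch of the "if", keeps the loop from stalling on either branch.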