I want to fill in a web form, submit my query and download the resulting data. Some fields offer either a drop-down menu or a typed search query, and any field can be left blank (if all fields are left blank, the entire database is downloaded). Clicking the "Search and Download" button should start the download of a file.
Here is what I have tried (selecting all records for the species "Salmo salar"), based on this question. I used the "Developer Tools" in my browser (Opera) to inspect the page elements and identify the names of all the possible fields:
library(httr)
url <- "https://nzffdms.niwa.co.nz/search"
fd <- list(
  search_catchment_no_name = "",
  search_river_lake = "",
  search_sampling_locality = "",
  search_fishing_method = "",
  search_start_year = "",
  search_end_year = "",
  search_species = "Salmo salar", # species of interest
  search_download_format = 1, # select csv file format
  submit = "Search and Download"
)
POST(url, body = fd, encode = "form")
I had hoped this would download a csv file (all records for the species "Salmo salar"), but no file is downloaded. Instead, the call returns a response object (a list of 10; only the first part is shown):
Response [https://nzffdms.niwa.co.nz/search]
Date: 2019-10-02 23:35
Status: 200
Content-Type: text/html; charset=utf-8
Size: 19.1 kB
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; c...
<meta name="title" content="NZ Freshwater Fish Database...
<meta name="description" content="NIWA NZ Freshwater Fish...
<meta name="keywords" content="NIWA, NZ, Freshwater Fish" />
<meta name="language" content="en" />
<meta name="robots" content="index, follow />
...
Edit
I think the issue is with how I am calling the "Search and Download" button. When inspecting the web page, most fields look like this:
# end year field
<input maxlength="4" class="form-control" type="text" name="search[end_year]" id="search_end_year">
But the "Search and Download" button element doesn't have a name or id attribute:
<input type="submit" value="Search and Download" class="btn btn-primary btn-md">
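Comparing the two snippets, the name attribute on the end year field is search[end_year], whereas my list above used the id-style search_end_year. As far as I understand, it is the name that gets sent with a form, and a button without a name presumably contributes nothing to the submitted data. Here is a minimal sketch of a body keyed by name instead. It assumes every other field follows the same search[...] pattern (I have only checked end_year), and as_search_fields is a helper name I made up:

```r
# Hypothetical helper: wrap short field names in the search[...] pattern
# seen on the end year input. Assumes all fields follow that pattern.
as_search_fields <- function(fields) {
  names(fields) <- sprintf("search[%s]", names(fields))
  fields
}

fd <- as_search_fields(list(
  species         = "Salmo salar", # species of interest
  end_year        = "",            # left blank, as before
  download_format = "1"            # csv, as in my first attempt
))

names(fd) # inspect the names that would be sent
```

The submit entry is dropped here on purpose, since the button has no name to send it under.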
Also, I have just noticed there is a hidden field. Maybe I need to define this as well?
<input type="hidden" name="search[_csrf_token]" value="d1530f09c1ce8110b5163bd100cb0d67" id="search__csrf_token">
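If the token does need to be sent, I assume it changes between sessions, so it would have to be read from the page before posting rather than hard-coded. Below is a rough sketch of that idea using httr and xml2. It assumes the token from a GET of the same URL is accepted as long as the cookies are reused (httr keeps one handle per host), that the other fields use the search[...] names, and that a successful response is the csv itself. None of this is confirmed; get_csrf_token and download_search are my own placeholder names:

```r
library(httr)
library(xml2)

url <- "https://nzffdms.niwa.co.nz/search"

# Hypothetical helper: pull the hidden token value out of the form HTML.
get_csrf_token <- function(html_text) {
  doc  <- read_html(html_text)
  node <- xml_find_first(doc, "//input[@name='search[_csrf_token]']")
  xml_attr(node, "value")
}

# Hypothetical helper: fetch a fresh token, then post the form and
# write whatever comes back to `path`.
download_search <- function(species, path) {
  page <- GET(url)
  stop_for_status(page)
  token <- get_csrf_token(content(page, as = "text", encoding = "UTF-8"))

  fd <- list(
    "search[species]"         = species,
    "search[download_format]" = "1", # csv, as in my first attempt
    "search[_csrf_token]"     = token
  )

  res <- POST(url, body = fd, encode = "form",
              write_disk(path, overwrite = TRUE))
  stop_for_status(res)
  # Return the content type so I can tell whether a csv or HTML came back
  headers(res)[["content-type"]]
}
```

Calling download_search("Salmo salar", "salmo_salar.csv") would then either write the file or, if the content type is still text/html, show that the server returned the search page again.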
Any advice on how I can get the file downloading would be much appreciated.