
For statistical analysis in performance sports, I often collect data from https://www.fis-ski.com/en and import it into MS Excel by copy/paste before working on it in RStudio. Since FIS recently updated its website structure, the pasted data ends up in a single column, and it is not possible to convert it back into a structured table via "Text to Columns". I also tried a macro, but the imported data doesn't have a uniform structure because missing values (NAs) are not shown as empty cells, so converting the data is quite tricky for me as a "non-programmer". The data I would like to extract are the following:

https://www.fis-ski.com/DB/alpine-skiing/biographies.html?lastname=&firstname=&sectorcode=AL&gendercode=M&birthyear=1980-2004&skiclub=&skis=&nationcode=SUI&fiscode=&status=&search=true

However, I need the results of every single athlete in this list. Here is an example:

https://www.fis-ski.com/DB/general/athlete-biography.html?sectorcode=AL&seasoncode=&competitorid=230012&type=result&categorycode=&sort=&place=&disciplinecode=&position=&limit=1000

That is a lot of data to get in order, so I have two questions:

  1. Is there an easy method to get the copied data back in order as a table?
  2. Is there a way to extract the results data for all athletes (SUI, male, YoB 1980-2004) without switching from athlete to athlete?

Thank you very much in advance... looking forward to your answers...

Greetings!!

brubaker
  • You could save the webpage or read the table directly with `rvest`. Please refer to https://stackoverflow.com/questions/35707534/how-to-scrape-a-table-with-rvest-and-xpath – BigMOoO Apr 20 '21 at 16:37

1 Answer


You wrote that you are not a programmer, so explaining how this works would be complicated, but I have a solution for you. On either of those two pages, open your browser's developer tools with F12, go to the "Console" tab, paste the following, and press Enter:

copy([...$('.thead .container, .table-row .container')].map(e => [...$(e).children(':visible:not(:has(.g-sm-24), :has(.g-xs-24), :has(.pale))'), ...$(e).find(':has(.g-sm-24), :has(.g-xs-24), :has(.pale)').find(':visible.g-sm-24, :visible.g-xs-24, :visible.pale')].map(c => c.innerText.replace(/\n/g, ' ').trim()).join('\t')).join('\r\n'))

This will copy a nice table for you into the clipboard, which you can paste into Excel.

Small caveat: The order of the columns is a bit different, because some of those columns are "special": they are actually groups of columns that are normally shown or hidden together depending on whether you are on mobile or not.

By the way, the command is stored in the console's command history, so next time you only need to press the up arrow key in the console input field to recall it and then press Enter again.
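In case it helps to see what the last part of the one-liner does: it joins the cells of each row with tabs and the rows with Windows line breaks, which is the layout Excel splits into columns and rows on paste. Here is a minimal sketch of just that step, separated from the page-specific selectors. The function name `rowsToTsv` and the sample rows are made up for illustration and are not part of the snippet above.

```javascript
// Hypothetical helper (not part of the original one-liner): converts an array
// of rows, each an array of cell strings, into tab-separated text.
function rowsToTsv(rows) {
  return rows
    .map(cells =>
      cells
        // Replace line breaks inside a cell with spaces and trim the result,
        // mirroring what the one-liner does with each element's innerText.
        .map(text => String(text).replace(/\n/g, ' ').trim())
        // Tabs separate columns when pasted into Excel.
        .join('\t')
    )
    // CRLF separates rows.
    .join('\r\n');
}

// Made-up sample data, for illustration only.
const sample = [
  ['Date', 'Place', 'Discipline'],
  ['01-01-2021', 'Some\nPlace', 'Slalom'],
];
console.log(rowsToTsv(sample));
```

In the real command, the rows come from the page elements and the final string is passed to the console's `copy()` instead of `console.log()`.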

CherryDT