I need to process data that is being provided in a password protected Excel (xlsx) workbook. For legal reasons, I cannot create an unprotected Excel file or a csv file etc and process from there. None of the Excel import packages can deal with password protected workbooks.
From this answer Import password-protected xlsx workbook into R I have managed to extract the data. However it is imported in a list of character lists format. The dput of my list looks like this:
list(list("ID", "ID1", "ID2"),
list("V2", NULL, "text2"),
list("Name", "John Smith", "Mary Brown"),
list("Score", 1, 2),
list("email", "JS@gmail.com", "MB@gov.uk"))
What I want is a dataframe with columns ID, V2 etc that looks like this:
ID V2 Name Score email
ID1 NULL John Smith 1 JS@gmail.com
ID2 text2 Mary Brown 2 MS@gov.uk
There are empty cells in the original Excel workbook, so solutions with unlist will not work.
Using a combination of answers from R list to data frame and other similar questions, I have the following code (where listform is the name of the list):
matform <- as.matrix(sapply(listform, function(s) s)) # retains empty
df <- data.frame(matform[2:nrow(matform),])
names(df) = matform[1,]
This is close, but the dataframe has lists as columns. So str(df)
yields:
'data.frame': 2 obs. of 5 variables:
$ ID:List of 2
..$ : chr "ID1"
..$ : chr "ID2"
$ V2:List of 2
..$ : NULL
..$ : chr "text2"
and so on