I have an unusual problem with an SPSS dataset that I import into R via the haven package (I also made a post about this on GitHub). The dataset is full of variables whose missing-value definitions are not included among their value labels, which leads to errors in R. E.g., -77 is defined as a missing value, but has no value label. Indexing the variable's column returns

Error: `x` and `labels` must be same type

The only way I've found to fix the issue is to apply a label, remove the missing value, then remove the label:

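    library(haven)     # read_spss()
    library(labelled)  # val_label(), na_values()

    ds <- read_spss(sav.file, user_na = TRUE)
    val_label(ds[[1]], -77) <- "temp"  # temporary label for the missing code
    na_values(ds[[1]]) <- NULL         # drop the missing-value definition
    val_label(ds[[1]], -77) <- NULL    # remove the temporary label again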

The solution relies on double brackets (or $). I'm wondering what the fastest way to apply this to all numeric variables in a large dataset would be. I could easily do it with a for loop, but I'm looking for something faster.
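
For reference, here is a minimal sketch of how the three steps could be wrapped in a function and applied to every column with `lapply()`. The helper name `drop_unlabelled_na` and its `code` argument are hypothetical, and the toy tibble only stands in for the real file; it is not expected to reproduce the error itself. The sketch assumes, as in the example above, that the missing code has no value label of its own, since the temporary label would otherwise overwrite it. It is probably not meaningfully faster than a `for` loop, only more compact.

```r
library(haven)     # labelled_spss(), read_spss()
library(labelled)  # val_label(), na_values()

# Hypothetical helper: applies the three-step workaround to a single column.
# `code` is the missing-value code that lacks a value label (-77 here).
drop_unlabelled_na <- function(x, code = -77) {
  # Leave non-numeric columns untouched
  if (!typeof(x) %in% c("double", "integer")) return(x)
  # Leave columns that do not define `code` as a missing value untouched
  if (!code %in% na_values(x)) return(x)
  val_label(x, code) <- "temp"  # temporary label for the missing code
  na_values(x) <- NULL          # drop the missing-value definition
  val_label(x, code) <- NULL    # remove the temporary label again
  x
}

# Toy data standing in for the imported file (an assumption for illustration)
ds <- tibble::tibble(
  q1  = labelled_spss(c(1, 2, -77), labels = c(yes = 1, no = 2), na_values = -77),
  txt = c("a", "b", "c")
)

# `ds[] <-` keeps the data frame structure while replacing every column
ds[] <- lapply(ds, drop_unlabelled_na)
```

With the real data, the same last line would follow `ds <- read_spss(sav.file, user_na = TRUE)`. Because `lapply()` hands each column to the function as a whole vector, this relies on the same whole-column access as `ds[[i]]`.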

salmon

For this operation, the three lines of code replacing value labels, you are unlikely to find anything significantly faster than a `for` loop. – lmo Jun 12 '17 at 20:20