Hello, I imported a dataset from SPSS into R. The dataset has value labels, and I want to use those labels as string values. Is there a way to do it?

head(dataset$A7B1)
<Labelled double>: A7b1. Cantón de San José en que reside
[1] NA NA NA 2  8 NA 4 NA 5

Labels:
 value         label
     1      SAN JOSÉ
     2        ESCAZÚ
     3  DESAMPARADOS
     4      PURISCAL
     5       TARRAZÚ
     6        ASERRÍ
     7          MORA
     8    GOICOECHEA
     9     SANTA ANA
    10    ALAJUELITA
    11      CORONADO
    12        ACOSTA
    13         TIBAS
    14       MORAVIA
    15 MONTES DE OCA
    16    TURRUBARES
    17          DOTA
    18    CURRIDABAT
    19 PÉREZ ZELEDÓN
    20   LEÓN CORTÉS

I need every labelled double value to become a string according to its value label.

glimpse(dataset)
Rows: 283
Columns: 9
$ A7A  <dbl+lbl> 2, 8, 3, 3, 1, 2, 4, 4, 4, 2, 2, 4, 3, 4, 2, 3, 1, 2, 2, 6, 1, 1, 2, 2, 1, 2, 3, 1, 2, 1, 1, 4, 3, 1, 2, 2, 1, 1, 4, ...
$ A7B1 <dbl+lbl> NA, NA, NA, NA, 8, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3, NA, NA, NA, 1, 11, NA, NA, 8, NA, NA, 3, NA, 14, 1,...
$ A7B2 <dbl+lbl> 1, NA, NA, NA, NA, 1, NA, NA, NA, 1, 1, NA, NA, NA, 1, NA, NA, 6, 2, NA, NA, NA, 1, 10, NA, 1, NA, NA, 1, NA, NA, NA,...
$ A7B3 <dbl+lbl> NA, NA, 1, 7, NA, NA, NA, NA, NA, NA, NA, NA, 3, NA, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA...
$ A7B4 <dbl+lbl> NA, NA, NA, NA, NA, NA, 2, 1, 1, NA, NA, 9, NA, 7, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ A7B5 <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
$ A7B6 <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ A7B7 <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
$ A7B8 <dbl+lbl> NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA..


dput(head(dataset$A7A))
structure(c(2, 8, 3, 3, 1, 2), label = "A7a. Provincia de residencia",
  labels = c(`San Jose` = 1, Alajuela = 2, Cartago = 3, Heredia = 4,
    Guanacaste = 5, Puntarenas = 6, Limon = 7, Extrenjero = 8),
  class = "haven_labelled")
  • What are you using to read in the data? Can you provide a reproducible example? Do you want to retain the original values as an attribute, or do you just want the labels to be the values? – Andrew Aug 24 '20 at 17:39
  • Hello, thank you. I read it with the haven package, but I could also use the sjlabelled package. Yes, what I really want is for the labels to become string values. – jruizri Aug 24 '20 at 17:50

1 Answer

I typically use haven when reading in SPSS data and have a helper function for this. Hope this helps; if it doesn't, please provide more info in your question :)

library(haven)

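swap_labels <- function(x, keep_original = TRUE) {

  # Named vector of value labels: names are the label text, values are the codes
  labels <- attr(x, "labels")
  # Look up the label text for each observed code (NA and unmatched codes give NA)
  new_vec <- names(labels)[match(x, labels)]

  if (keep_original) {
    # Labelled character vector whose labels map each label text back to its code
    haven::labelled_spss(new_vec, setNames(names(labels), labels))
  } else {
    # Plain character vector of label text
    new_vec
  }

}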

# Reproducible example
test_vec <- labelled_spss(1:3, labels = setNames(1:3, letters[1:3]))

> test_vec
<labelled_spss<integer>[3]>
[1] 1 2 3

Labels:
 value label
     1     a
     2     b
     3     c

> swap_labels(test_vec)
<labelled_spss<character>[3]>
[1] a b c

Labels:
 value label
     a     1
     b     2
     c     3
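
If you only need the label text and don't care about keeping the codes, haven's own `as_factor()` followed by `as.character()` is another route. This is just a minimal sketch on a made-up vector (the values and labels below are illustrative, not taken from the real dataset), and it assumes the default `levels = "default"` behaviour, which uses the label where one exists.

```r
library(haven)

# Made-up labelled vector, similar in shape to the A7A column in the question
x <- labelled(
  c(2, 8, NA, 1),
  labels = c(`San Jose` = 1, Alajuela = 2, Extrenjero = 8)
)

# as_factor() turns the value labels into factor levels;
# as.character() then gives plain strings, with NA left as NA
label_text <- as.character(as_factor(x))
label_text
```

Going through a factor first means codes without a label fall back to the code itself rather than becoming NA, which may or may not be what you want.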
  • Hello Andrew, thank you for your help, but it didn't work. It gives me this error: Error: `labels` must be unique. This is what the dataset looks like. I'm showing just one column, but it has more columns with double labelled values, and I need the same for all of them. > head(dataset$A7A) : A7a. Province. [1] 2 8 3 3 1 2 7 1 1 1 Labels: value label 1 San Jose 2 Alajuela 3 Cartago 4 Heredia 5 Guanacaste 6 Puntarenas 7 Limon 8 Extrenjero – jruizri Aug 24 '20 at 18:00
  • @jruizri, can you post the results of `dput(head(dataset$A7A))` or an example that will reproduce the error? – Andrew Aug 24 '20 at 18:17
  • I edited the question with that outcome. dput(head(dataset$A7A)) structure(c(2, 8, 3, 3, 1, 2), label = "A7a. Provincia de residencia permanente.", labels = c(`San Jose` = 1, Alajuela = 2, Cartago = 3, Heredia = 4, Guanacaste = 5, Puntarenas = 6, Limon = 7, Extrenjero = 8), class = "haven_labelled") – jruizri Aug 24 '20 at 18:33
  • Thanks for posting the `dput()`! Should be good to go now. Also, sounds like you want `keep_original = FALSE` – Andrew Aug 24 '20 at 18:45
  • Wow, thank you. With `as.data.frame(lapply(dataset, swap_labels))` it worked perfectly! – jruizri Aug 24 '20 at 19:24
  • Sure thing! You can also do `dataset[] – Andrew Aug 24 '20 at 19:53
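
For anyone landing here later, applying the conversion to every labelled column, along the lines of the `lapply` call in the comments, might look roughly like the sketch below. It is base R only. `labels_to_strings` and `example_df` are hypothetical names, and the data is made up to mirror the `dput()` structure in the question. Columns without a "labels" attribute are passed through unchanged.

```r
# Hypothetical helper: replace each code with its label text (plain character vector)
labels_to_strings <- function(x) {
  labels <- attr(x, "labels")
  if (is.null(labels)) return(x)  # leave unlabelled columns untouched
  names(labels)[match(unclass(x), labels)]
}

# Made-up column with the same structure as the dput() output in the question
province <- structure(
  c(2, 8, 3, NA),
  labels = c(`San Jose` = 1, Alajuela = 2, Cartago = 3, Extrenjero = 8),
  class = "haven_labelled"
)
example_df <- data.frame(id = 1:4)
example_df$A7A <- province

# Assigning into example_df[] keeps the data frame shape and column names
example_df[] <- lapply(example_df, labels_to_strings)
str(example_df)
```

Here the `[] <-` assignment is just one way to keep the result a data frame; wrapping the `lapply` result in `as.data.frame()` should give the same columns.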