
Follow-up question to "Dynamically create value labels with haven::labelled", where akrun provided a good answer using deframe.

I am using haven::labelled to set value labels of a variable. The goal is to create a fully documented dataset I can export to SPSS.

Now, say I have a data frame value_labels holding values and their value labels. I also have a data frame df_data with the variables to which I want to assign those value labels.

library(tibble)

value_labels <- tibble(
  value = c(1:6, 1:3, NA),
  labels = c(paste0("value", 1:6), paste0("value", 1:3), NA),
  name = c(rep("var1", 6), rep("var2", 3), "var3")
)


df_data <- tibble(
  id = 1:10, 
  var1 = floor(runif(10, 1, 7)),
  var2 = floor(runif(10, 1, 4)), 
  var3 = rep("string", 10)
)

Manually, I would create value labels for df_data$var1 and df_data$var2 like so:

df_data$var1 <- haven::labelled(df_data$var1, labels = c(value1 = 1, value2 = 2, value3 = 3, value4 = 4, value5 = 5, value6 = 6))

df_data$var2 <- haven::labelled(df_data$var2, labels = c(value1 = 1, value2 = 2, value3 = 3))

I need a more dynamic way of assigning the correct value labels to the correct variable in a large dataset. The solution also needs to ignore character vectors, since I don't want these to have value labels. For that reason, var3 is listed in value_labels with NA as its value and label.

The solution does not need to work with multiple datasets in a list.

oskjerv
  • The datasets I'm working with have character vectors, in this example `df_data$var3`. These character vectors contain long strings (comments). Logically, these can't have value labels, but it is important to keep the character vectors in the final dataset. Let me know if I can clarify any further. – oskjerv Dec 17 '19 at 14:48
  • Can you check the solution posted below – akrun Dec 17 '19 at 15:07
  • Worked like a charm. Thank you! – oskjerv Dec 17 '19 at 15:33

1 Answer


Here is one option: remove the NA rows, build a named vector of values (named by 'labels') and split it by 'name', use the names of the resulting list to subset the columns of 'df_data', apply labelled to each column with its matching labels, and assign the result back to the same columns.

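# drop the rows without labels (here: var3)
lbls2 <- na.omit(value_labels)
# one named vector of values per variable, e.g. lstLbls$var1
lstLbls <- with(lbls2, split(setNames(value, labels), name))
# label each matching column and write it back
df_data[names(lstLbls)] <- Map(haven::labelled, 
          df_data[names(lstLbls)], labels = lstLbls)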
df_data
# A tibble: 10 x 4
#      id       var1       var2 var3  
#   <int>  <dbl+lbl>  <dbl+lbl> <chr> 
# 1     1 2 [value2] 2 [value2] string
# 2     2 5 [value5] 2 [value2] string
# 3     3 4 [value4] 1 [value1] string
# 4     4 1 [value1] 2 [value2] string
# 5     5 1 [value1] 1 [value1] string
# 6     6 6 [value6] 2 [value2] string
# 7     7 1 [value1] 3 [value3] string
# 8     8 1 [value1] 1 [value1] string
# 9     9 3 [value3] 3 [value3] string
#10    10 6 [value6] 1 [value1] string
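
Since the stated goal is an SPSS export, here is a minimal, self-contained sketch of the same approach with two additions that are not part of the original answer: a guard that only labels columns that exist in 'df_data' and are numeric, and a round trip through a temporary .sav file. The names `targets`, `sav_path` and `df_back`, the seed, and the temporary file are assumptions made for illustration only.

```r
library(tibble)
library(haven)

set.seed(1)  # assumption: fixed seed so the example data are reproducible

value_labels <- tibble(
  value  = c(1:6, 1:3, NA),
  labels = c(paste0("value", 1:6), paste0("value", 1:3), NA),
  name   = c(rep("var1", 6), rep("var2", 3), "var3")
)

df_data <- tibble(
  id   = 1:10,
  var1 = floor(runif(10, 1, 7)),
  var2 = floor(runif(10, 1, 4)),
  var3 = rep("string", 10)
)

# One named vector of values per variable; the NA row for var3 is dropped
lbls2   <- na.omit(value_labels)
lstLbls <- with(lbls2, split(setNames(value, labels), name))

# Hypothetical guard: keep only variables that are present in df_data
# and numeric, so character columns are never passed to labelled()
targets <- intersect(names(lstLbls), names(df_data))
targets <- targets[vapply(df_data[targets], is.numeric, logical(1))]

df_data[targets] <- Map(haven::labelled,
                        df_data[targets], labels = lstLbls[targets])

# Write to a temporary SPSS file and read it back to inspect the labels
sav_path <- tempfile(fileext = ".sav")
haven::write_sav(df_data, sav_path)
df_back <- haven::read_sav(sav_path)
attr(df_back$var1, "labels")
```

The guard is optional for the example data, because na.omit already removes 'var3' from the label table. It may help on a large dataset where the label table and the data can drift apart.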
akrun