2

I have a dataset with 90 responses to Likert Items that I would like to convert to numeric values. It is structured like the example here:

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7 <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23)

The response options differ slightly between questions, but in the main they range from Strongly Disagree to Strongly Agree, Daily to Never, or Important to Not Important. The responses to each of the 90 questions are in a separate column (labelled q1 to q90). I'd like to create a new column for each set of responses, holding a numeric value that corresponds to the text response (Strongly Agree (3) to Strongly Disagree (-3), via Neutral (0)). Like so:

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
n6 <- c(3,-3,1,2,3)
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
n17 <- c(2,2,3,0,-3)
num_example <- cbind(q6, n6, q17, n17)
num_example

I've managed to get this far with the code below, which generates a new variable called n6 from the text responses in the existing q6 column; I can then add it to the existing data frame using cbind. My question is: how would I automate this across the entire data frame of 90 questions without having to rerun the code for each question (i.e. changing q6 to q7, then to q8, and so on)?

n6 <- ifelse(example$q6=="Daily", 3,
                  ifelse(h16$q6=="",0,
                  ifelse(h16$q6=="Very Often", 2,
                  ifelse(h16$q6=="Often", 1,
                  ifelse(h16$q6=="Neither Rarely nor Often", 0,
                  ifelse(h16$q6=="Rarely", -1,
                  ifelse(h16$q6=="Very Rarely", -2,
                  ifelse(h16$q6=="Never", -3,5
                         ))))))))

For further reference, columns q6:q12 and q23:q30 have responses ranging from Daily to Never, as per the example above. Columns q17:q22 have responses ranging from Not Important to Very Important, and columns q49:q90 have responses ranging from Strongly Agree to Strongly Disagree. I'm trying to find a smarter way of running the code above over the relevant columns (e.g. q6:q12, q23:q30) that generates a new data frame with numeric values in columns named n6:n12, n23:n30, rather than having to run it 90 times!

Hope this is a clear explanation of the issue.

Thank you.

  • 1
    Take a look at `?factor`. You can use the ordered argument to get ordered factors if that is desired. – lmo Aug 02 '16 at 15:45
  • 1
    The scale is not clear. What are the sets of possible responses and their corresponding values? – Pierre L Aug 02 '16 at 15:45
  • You also used two different objects in your code, `h16` and `example`. Did you mean to use only one? – Pierre L Aug 02 '16 at 15:53
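
As a rough sketch of the `?factor` suggestion, assuming the Daily-to-Never scale has exactly the seven levels used in the question's `ifelse` chain: declare the levels in order, then shift the integer codes so the midpoint becomes 0. The names `freq_levels` and `to_score` are made up for illustration, and blank or unexpected responses come back as `NA` rather than 0 or 5.

```r
# Assumed ordering of the frequency scale, lowest to highest
freq_levels <- c("Never", "Very Rarely", "Rarely", "Neither Rarely nor Often",
                 "Often", "Very Often", "Daily")

# Hypothetical helper: map text responses to centred scores via factor codes
to_score <- function(x, levels) {
  codes <- as.integer(factor(x, levels = levels))  # 1..length(levels); NA if unmatched
  codes - (length(levels) + 1) / 2                 # centre the scale on 0
}

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
n6 <- to_score(q6, freq_levels)
n6
```

This only works as-is for scales with evenly spaced scores; a scale that skips values (such as the Important to Not Important example) would need an explicit look-up instead.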

3 Answers

5

The plyr package has a function called revalue, which replaces specified values with new values in a factor or character vector. Maybe that is helpful here:

 library(plyr)
 # Map each text label to its score; "Important" is 2, matching the output below
 example2 <- revalue(example, c("Daily" = "3", "Never" = "-3", "Often" = "1",
              "Very Often" = "2", "Important" = "2", "Very Important" = "3",
               "Neutral" = "0", "Not Important" = "-3"))
 example2

     q6   q7   q17  q23 
[1,] "3"  "-3" "2"  "3" 
[2,] "-3" "-3" "2"  "1" 
[3,] "1"  "1"  "3"  "-3"
[4,] "2"  "1"  "0"  "-3"
[5,] "3"  "3"  "-3" "0" 

data

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7 <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23) 

Alternatively, mapvalues also works:

 # from and to are matched by position, so they must be in the same order
 mapvalues(example, from = c("Daily", "Never", "Often", "Very Often",
           "Important", "Very Important", "Neutral", "Not Important"),
           to = c(3, -3, 1, 2, 2, 3, 0, -3))
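
Because `example` is a character matrix, the recoded values are still strings. A minimal base R sketch of turning them into numbers, assuming every label has already been replaced (the matrix is rebuilt by hand here so the snippet runs without plyr):

```r
# Character matrix shaped like the revalue() output above (rebuilt by hand)
example2 <- cbind(q6  = c("3", "-3", "1", "2", "3"),
                  q7  = c("-3", "-3", "1", "1", "3"),
                  q17 = c("2", "2", "3", "0", "-3"),
                  q23 = c("3", "1", "-3", "-3", "0"))

# Switch the storage mode in place; dimensions and column names are kept
storage.mode(example2) <- "double"

is.numeric(example2)
colMeans(example2)
```

Any label that was not recoded would turn into `NA` (with a coercion warning) at this step, which is a handy check that the mapping is complete.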
user5249203
  • Thank you - I can see how that would work, once I add all possible variables (Strongly Agree, etc) to that list of options. – Craig Hamilton Aug 02 '16 at 16:02
2

If you want to use base R, I would recommend using named vectors to build a look-up table rather than nesting multiple ifelse calls, e.g.:

n <- c('Daily'=3, 'Very Often'=2, 'Often'=1, 'Never'=-3)
n[q6]
#Daily      Never      Often Very Often      Daily 
#    3         -3          1          2          3 
n[q7]
#Never Never Often Often Daily 
#   -3    -3     1     1     3 
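
To extend the look-up idea to whole ranges of columns, one possible sketch is below. It assumes the data are in a data frame with character columns; `dat`, `freq_map`, `imp_map` and `recode_cols` are hypothetical names, and the scores are the ones given in the question (intermediate importance levels are not listed there, so they are left out).

```r
# Toy data frame standing in for the real 90-column data
dat <- data.frame(
  q6  = c("Daily", "Never", "Often", "Very Often", "Daily"),
  q7  = c("Never", "Never", "Often", "Often", "Daily"),
  q17 = c("Important", "Important", "Very Important", "Neutral", "Not Important"),
  stringsAsFactors = FALSE
)

# One named look-up vector per response scale (values taken from the question)
freq_map <- c("Daily" = 3, "Very Often" = 2, "Often" = 1,
              "Neither Rarely nor Often" = 0, "Rarely" = -1,
              "Very Rarely" = -2, "Never" = -3)
imp_map  <- c("Very Important" = 3, "Important" = 2, "Neutral" = 0,
              "Not Important" = -3)

# Hypothetical helper: recode the named columns and rename q* to n*
recode_cols <- function(data, cols, map) {
  out <- lapply(data[cols], function(x) unname(map[as.character(x)]))
  names(out) <- sub("^q", "n", cols)
  as.data.frame(out)
}

num <- cbind(recode_cols(dat, c("q6", "q7"), freq_map),
             recode_cols(dat, "q17", imp_map))
num
```

On the full data the column names could be generated with something like `paste0("q", c(6:12, 23:30))`. Responses that are missing from a look-up vector, including blanks, come back as `NA`.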
Neal Fultz
1

There are faster ways, but since you have already done all of that work, you can turn your current process into a function and then use sapply to go over all the columns.

Notice that I changed `$q6` to `[,x]`:

numConvert <- function(x) ifelse(example[,x]=="Daily", 3,
             ifelse(h16[,x]=="",0,
                    ifelse(h16[,x]=="Very Often", 2,
                           ifelse(h16[,x]=="Often", 1,
                                  ifelse(h16[,x]=="Neither Rarely nor Often", 0,
                                         ifelse(h16[,x]=="Rarely", -1,
                                                ifelse(h16[,x]=="Very Rarely", -2,
                                                       ifelse(h16[,x]=="Never", -3,5
                                                       ))))))))

Now the function accepts column names and converts based on your specification. Try it out:

h16 <- example
sapply(colnames(example), numConvert)
#      q6 q7 q17 q23
# [1,]  3 -3   5   3
# [2,] -3 -3   5   1
# [3,]  1  1   5  -3
# [4,]  2  1   5  -3
# [5,]  3  3   5   5

Edit

If you want to use a shiny new function, try case_when, available in dplyr >= 0.5.0:

library(dplyr)
factorise <- function(x) {
  case_when(x %in% c("Daily", "Very Important") ~ 3,
            x %in% c("Very Often", "Important") ~ 2,
            x %in% c("Often") ~ 1,
            x %in% c("Neutral") ~ 0,
            x %in% c("Never", "Not Important") ~ -3)
}

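# example is a matrix, so convert it to a data frame to iterate over columns
sapply(as.data.frame(example, stringsAsFactors = FALSE), factorise)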
#      q6 q7 q17 q23
# [1,]  3 -3   2   3
# [2,] -3 -3   2   1
# [3,]  1  1   3  -3
# [4,]  2  1   0  -3
# [5,]  3  3  -3   0
Pierre L