I have created a data frame with scores for three different tests. I would now like to add a new variable to the same data frame ('highest_score') stating which of the three scores was the individual's highest. I have written the following code to do this:

test <- data.frame(test_1, test_2, test_3) 
# this gives me a dataframe with scores from three different tests (6 observations in each) 

Then,

for (i in 1:nrow(test)) {
  if ((test$test_1[i] > test$test_2[i]) & (test$test_1[i] > test$test_3[i])) {
    test$highest_score <- "Test 1 is the individual's highest score!"
  } else if ((test$test_2[i] > test$test_1[i]) & (test$test_2[i] > test$test_3[i])) {
    test$highest_score <- "Test 2 is the individual's highest score!"
  } else if ((test$test_3[i] > test$test_1[i]) & (test$test_3[i] > test$test_2[i])) {
    test$highest_score <- "Test 3 is the individual's highest score!"
  } else {
    NULL
  }
}

When I run the code, the new variable 'highest_score' contains 'Test 3 is the individual's highest score!' for all observations, even though that is not true.

I would be very grateful if someone was able to let me know where I am going wrong.

phiver
Jane
  • The problem is in your conditions - you're using a single &. Take a look at https://stackoverflow.com/questions/6558921/boolean-operators-and. – namokarm Jul 15 '18 at 08:14

1 Answer

Since you didn't provide an example test data.frame, I created one. The function you are looking for is max.col. If scores can tie, read the help (?max.col): by default ties are broken at random. I wrapped everything in a simple function, with no error handling, that returns the text you want.

# create reproducible example 6 long as in OP's question.
set.seed(1234)
test <- data.frame(test_1 = sample(1:10, 6), test_2 = sample(1:10, 6), test_3 = sample(1:10, 6)) 

test
  test_1 test_2 test_3
1      2      1      3
2      6      3      9
3      5      6     10
4      8      4      6
5      9      5      2
6      4      9      7

#example of which column has the maximum value
max.col(test)
[1] 3 3 3 1 1 2

# wrap everything in function
my_func <- function(data){
  # which column has the maximum value
  wm <- max.col(data)
  out <- ifelse(wm == 1, "Test 1 is the individual's highest score!", 
                ifelse(wm == 2, "Test 2 is the individual's highest score!",
                       "Test 3 is the individual's highest score!"))
  return(out)
}
my_func(test)
[1] "Test 3 is the individual's highest score!" "Test 3 is the individual's highest score!" "Test 3 is the individual's highest score!"
[4] "Test 1 is the individual's highest score!" "Test 1 is the individual's highest score!" "Test 2 is the individual's highest score!"
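
A note on ties: max.col uses ties.method = "random" by default, so a row with two equal top scores can get a different label from one run to the next. Here is a minimal sketch, assuming you would rather always pick the leftmost tied column. The data frame `tied` and the names `wm_first` and `labels_first` are made up for illustration; they are not from the question.

```r
# Hypothetical data: row 1 has a tie between test_1 and test_2
tied <- data.frame(test_1 = c(7, 2, 5),
                   test_2 = c(7, 9, 1),
                   test_3 = c(3, 4, 8))

# ties.method = "first" returns the leftmost column holding the row maximum,
# so the result is deterministic
wm_first <- max.col(as.matrix(tied), ties.method = "first")
wm_first

# Build the label directly from the column number instead of nesting ifelse()
labels_first <- paste0("Test ", wm_first, " is the individual's highest score!")
labels_first
```

Building the text with paste0 also keeps working if a fourth test column is added, whereas the nested ifelse would need another branch.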

Adding it to the test data.frame:

test$highest_score <- my_func(test)
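
As for why the loop in the question gives every row the same text: an assignment of the form test$highest_score <- "..." has no index, so each iteration overwrites the whole column and only the result for the last row survives. Below is a sketch of the same loop with the assignment indexed by i, using the sample values shown above. Leaving tied rows as NA is my assumption about what you would want. I also use && inside if, which is the usual choice for single values, although & gives the same result here.

```r
# Sample scores copied from the example data above
test <- data.frame(test_1 = c(2, 6, 5, 8, 9, 4),
                   test_2 = c(1, 3, 6, 4, 5, 9),
                   test_3 = c(3, 9, 10, 6, 2, 7))

# Pre-allocate the column; rows without a strict winner stay NA (assumption)
test$highest_score <- NA_character_

for (i in seq_len(nrow(test))) {
  if (test$test_1[i] > test$test_2[i] && test$test_1[i] > test$test_3[i]) {
    # note the [i]: only row i is updated
    test$highest_score[i] <- "Test 1 is the individual's highest score!"
  } else if (test$test_2[i] > test$test_1[i] && test$test_2[i] > test$test_3[i]) {
    test$highest_score[i] <- "Test 2 is the individual's highest score!"
  } else if (test$test_3[i] > test$test_1[i] && test$test_3[i] > test$test_2[i]) {
    test$highest_score[i] <- "Test 3 is the individual's highest score!"
  }
}

test$highest_score
```

The max.col version is shorter and avoids the loop entirely, but the indexed loop is the smallest change to the original code.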
phiver