Let me illustrate the problem with the well-known iris dataset. I need to apply a function row-wise, but only to selected columns. An example follows:

library(tidyverse)

iris %>%
  mutate_at(.funs = scale, .vars = vars(-c(Species))) %>%
  rowwise() %>% 
  mutate(my_mean=mean(c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)))

So, I first scale all variables except Species and then compute the row-wise mean over all four numeric variables. However, the real dataset has 100+ numeric variables, and I wonder how to make R automatically include all variables except a selected one (e.g., Species in this example). I went through the solutions on SO (e.g., this), but all the examples refer to column names explicitly. Any pointers are greatly appreciated.

EDIT: after some munging, here is my solution:

iris %>%
  as_tibble() %>% 
  mutate_at(.funs = scale, .vars = vars(-c(Species))) %>% 
  transmute(Species, row_mean = rowMeans(select(., -Species)))
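The same two steps can also be written with `across()`, which supersedes `mutate_at()` in newer dplyr releases. This is only a sketch under the assumption that dplyr >= 1.0.0 is installed; the name `scaled_means` is made up for illustration, and `as.numeric()` is added so that each scaled column is a plain vector rather than the one-column matrix that `scale()` returns.

```r
library(dplyr)

# Assumption: dplyr >= 1.0.0 (across() does not exist in older versions).
scaled_means <- iris %>%
  # scale every column except Species; as.numeric() flattens scale()'s matrix
  mutate(across(-Species, ~ as.numeric(scale(.x)))) %>%
  # across() without a function returns the selected columns as a data frame,
  # so rowMeans() sees every column except Species
  transmute(Species, row_mean = rowMeans(across(-Species)))

head(scaled_means)
```

No column other than Species is named, so the same call should carry over to a data frame with 100+ numeric variables.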
Andrej
  • If I understand your question, in base R, you'd do `rowMeans(scale(iris[-grep("Species", names(iris))]))`. – lmo Oct 03 '17 at 17:46
  • You want to apply `rowMeans()` to all columns *except* one? eg, `dplyr::select(-Species) %>% dplyr::mutate(my_mean=rowMeans(.))`? – juan Oct 03 '17 at 17:47

1 Answer

I'm not sure I understand exactly what the problem is, but here are a few alternative dplyr solutions that give you the mean of all columns except the selected one:

iris %>%
    select(-Species) %>%
    mutate(Means = rowMeans(.))

iris %>%
    mutate(Means = rowMeans(.[,1:4]))

iris %>%
    mutate(Means = rowMeans(.[,-5]))

Only the first drops the excluded column from the result; the other two keep Species next to the new Means column. Hope one of them helps you.
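To see the difference concretely, the three variants can be run side by side and compared. This is just a quick check; the names `dropped`, `by_range` and `by_position` are made up for illustration.

```r
library(dplyr)

# Variant 1: drop Species first, then average what is left
dropped <- iris %>%
  select(-Species) %>%
  mutate(Means = rowMeans(.))

# Variant 2: average columns 1 to 4 by position, keeping Species
by_range <- iris %>%
  mutate(Means = rowMeans(.[, 1:4]))

# Variant 3: average everything except column 5, keeping Species
by_position <- iris %>%
  mutate(Means = rowMeans(.[, -5]))

# All three compute the same row means
all.equal(unname(dropped$Means), unname(by_range$Means))
all.equal(unname(by_range$Means), unname(by_position$Means))

# Only the first result has lost the Species column
"Species" %in% names(dropped)      # FALSE
"Species" %in% names(by_position)  # TRUE
```

The positional variants depend on Species being the fifth column, so the first form is probably the safer one when the column order is not fixed.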

csgroen
  • How would you achieve the same using pure dplyr? I am looking for a solution without the `[]` to get the columns and with explicit use of `rowwise`. Thanks – beginneR Dec 12 '17 at 11:49
  • Hey @beginneR. You can use the first example to drop the column before calculating the mean, so you don't need to select columns afterwards. You can use rowwise like this: `iris %>% select(-Species) %>% rowwise() %>% mutate(Means = mean(c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)))`, but then you must name the columns to average, I believe.
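
For readers on a newer dplyr, a `rowwise()` version that does not name the columns may be possible with `c_across()`. This is a sketch that assumes dplyr >= 1.0.0, which was released after this thread; the name `rowwise_means` is made up for illustration.

```r
library(dplyr)

# Assumption: dplyr >= 1.0.0 (c_across() does not exist in older versions).
rowwise_means <- iris %>%
  rowwise() %>%
  # c_across() collects the selected columns of the current row into one vector
  mutate(Means = mean(c_across(-Species))) %>%
  # drop the row-wise grouping so later verbs behave as usual
  ungroup()

head(rowwise_means)
```

This keeps Species in the result. On wide or long data, `rowMeans()` is likely to be noticeably faster than a row-by-row `mean()`.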