2

I have the following dataframe imported in R:

product per1  per2  per3
A       10    20    30
B       23    14    21
C       26    95    81

Consider A:C as products listed in rows one after another and their corresponding sales values across per_1:per_3 which are given in columns.

The number of products are quite high so couldn't list all the products My requirement here is to add the sales for a combination of products.

For example

A      10    20   30
B      23    14   21

The sum should be

myresult<- A+B
myresult
33 34 51

The sum should be 30 34 80, and it should be copied to another vector.

The number of products are so large that I could not transpose them and use it for the calculation. Assume the combination of products to be added are also high and random.

Here's my data:

structure(list(product = structure(1:3, .Label = c("A", "B", "C"), class = "factor"), 
               per1 = c(10, 20, 30), 
               per2 = c(23, 14, 21), 
               per3 = c(26, 95, 81)), 
          .Names = c("product", "per1", "per2", "per3"), 
          row.names = c(NA, -3L), class = "data.frame")
M--
  • 18,939
  • 7
  • 44
  • 76
Kathir
  • 21
  • 2

4 Answers4

2

We can make dataframes by each pair of rows and then summarize those pairs by summing per# columns and putting product names into a list. Here, I am using dplyr and purrr packages but it can be done in base similarly.

library(dplyr)
library(purrr)
id.row <- combn(nrow(df1),2) ## to get a matrix with ids of each pair of rows

## first I create a list of dataframes with pair of rows
map(1:ncol(id.row), function(i) 
                          rbind(df1[id.row[1,i], ], df1[id.row[2,i], ])) %>% 
## then I summarize them based on column class (converting factors to character first)
  map(. %>% 
        mutate_if(is.factor, as.character) %>% 
        summarise_each(funs(if(is.numeric(.)) sum(., na.rm = TRUE) else list(.))))
#> [[1]]
#>   product per1 per2 per3
#> 1    A, B   33   34   51
#> 
#> [[2]]
#>   product per1 per2 per3
#> 1    A, C   36  115  111
#> 
#> [[3]]
#>   product per1 per2 per3
#> 1    B, C   49  109  102

If you have a specific list of set of rows that you want to sum pairwise, then you can do the following:

## specific pair of rows as you need
id.row <- cbind(c("A", "B"), c("B", "C"))

## first I create a list of dataframes with pair of rows
map(1:ncol(id.row), function(i) 
  rbind(df1[df1$product==id.row[1,i],], df1[df1$product==id.row[2,i],])) %>% 
  ## then I summarize them based on column class (converting factors to character first)
  map(. %>% 
        mutate_if(is.factor, as.character) %>% 
        summarise_each(funs(if(is.numeric(.)) sum(., na.rm = TRUE) else list(.))))
#> [[1]]
#>   product per1 per2 per3
#> 1    A, B   33   34   51
#> 
#> [[2]]
#>   product per1 per2 per3
#> 1    B, C   49  109  102

Data:

df1 <- read.table(text="product per1  per2  per3
                          A       10    20    30
                          B       23    14    21
                          C       26    95    81", header=T)
M--
  • 18,939
  • 7
  • 44
  • 76
  • When I use `%>% bind_rows()` at the end of my pipe, I get an error saying ```Error in as_mapper(.f, ...) : argument ".f" is missing, with no default``` but if I save the result and then try `bind_rows(results)` I get the expected output with no error. Anyone has an idea about what could be the issue? – M-- Aug 08 '20 at 21:36
  • 1
    if i add `%>% bind_rows`, i get the `product per1 per2 per3 1 A, B 33 34 51 2 B, C 49 109 102`. May be it is a version difference? I am using `dplyr 1.0.1`. I do get some warning for the `summarise_each` deprecation. Otherwise, it is working for me – akrun Aug 08 '20 at 21:42
  • 1
    @akrun Could be my pkg version then. For debugging an old project I have to stick to these. Thanks for taking a look. – M-- Aug 08 '20 at 21:44
0

Here is a base R option via combn

combn(seq(nrow(df1)),
  2,
  FUN = function(k) {
    cbind(
      product = toString(df1$product[k]),
      data.frame(as.list(colSums(df1[k, -1])))
    )
  },
  simplify = FALSE
)

which gives

[[1]]
  product per1 per2 per3
1    A, B   33   34   51

[[2]]
  product per1 per2 per3
1    A, C   36  115  111

[[3]]
  product per1 per2 per3
1    B, C   49  109  102

Data

df1 <- structure(list(product = c("A", "B", "C"), per1 = c(10L, 23L, 
26L), per2 = c(20L, 14L, 95L), per3 = c(30L, 21L, 81L)), class = "data.frame", row.names = c(NA,
-3L))
ThomasIsCoding
  • 53,240
  • 4
  • 13
  • 45
-1

Would this work?

df["a", ] <- df["a", ] + df["c", ]
M--
  • 18,939
  • 7
  • 44
  • 76
BeccaLi
  • 152
  • 6
-1
p2sum <- c("A", "B")
out <- colSums(df[df$product %in% p2sum, c("per1", "per2", "per3")])
out
per1 per2 per3 
  30   37  121

Data:

df <- data.frame(
  product = structure(1:3, .Label = c("A", "B", "C"), class = "factor"), 
  per1 = c(10, 20, 30), 
  per2 = c(23, 14, 21), 
  per3 = c(26, 95, 81)
) 
sindri_baldur
  • 22,360
  • 2
  • 25
  • 48