Questions tagged [tapply]

tapply is a function in the R programming language for applying a function to subsets of a vector.

The vector is broken into subsets, potentially of different lengths (a ragged array), based on the values of one or more other vectors. Each grouping vector is either already a factor or is coerced to one by as.factor. A function is applied to each of these subsets. With the default simplify = TRUE, tapply then returns an ordinary array when the function yields a single value per subset, and a list-mode array otherwise.
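As a minimal sketch of the behavior described above, the following uses made-up data (the names `values`, `groups`, and `parity` are hypothetical and exist only for illustration). It shows a character vector being coerced to a factor, a scalar-valued function producing an array, a vector-valued function producing a list-mode array, and grouping by two vectors at once.

```r
# Hypothetical data: six measurements in three groups of unequal size.
values <- c(10, 20, 30, 40, 50, 60)
groups <- c("a", "b", "b", "c", "c", "c")  # character vector, coerced to a factor

# mean() returns one value per group, so the result is a
# one-dimensional array named by factor level.
group_means <- tapply(values, groups, mean)

# range() returns a length-2 vector per group, which cannot be
# simplified to scalars, so the result is a list-mode array.
group_ranges <- tapply(values, groups, range)

# Grouping by two vectors gives a two-dimensional array;
# combinations with no data are filled with NA.
parity <- c("odd", "even", "odd", "even", "odd", "even")
cell_sums <- tapply(values, list(groups, parity), sum)

print(group_means)
print(group_ranges)
print(cell_sums)
```

Passing the grouping vectors as a list is what allows more than one factor; a single vector can be passed directly.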

330 questions
1086 votes, 10 answers

Grouping functions (tapply, by, aggregate) and the *apply family

Whenever I want to do something "map"py in R, I usually try to use a function in the apply family. However, I've never quite understood the differences between them -- how {sapply, lapply, etc.} apply the function to the input/grouped input, what…
grautur
16 votes, 4 answers

Multiple functions in a single tapply or aggregate statement

Is it possible to include two functions within a single tapply or aggregate statement? Below I use two tapply statements and two aggregate statements: one for mean and one for SD. I would prefer to combine the statements. my.Data = read.table(text =…
Mark Miller
14 votes, 3 answers

sum multiple columns by group with tapply

I wanted to sum individual columns by group and my first thought was to use tapply. However, I cannot get tapply to work. Can tapply be used to sum multiple columns? If not, why not? I have searched the internet extensively and found numerous…
Mark Miller
13 votes, 3 answers

What does the t in tapply stand for?

There seems to be general agreement that the l in "lapply" stands for list, the s in "sapply" stands for simplify and the r in "rapply" stands for recursively. But I could not find anything on the t in "tapply". I am now very curious.
orizon
11 votes, 3 answers

Remove NA from list of lists

I have a matrix, data.mat, that looks like: A B C D E 45 43 45 65 23 12 45 56 NA NA 13 4 34 12 NA I am trying to turn this into a list of lists, where each row is one list within a bigger list. I do the following: list <-…
Amberopolis
11 votes, 2 answers

Mean of variable by two factors

I have the following data: a <- c(1,1,1,1,2,2,2,2) b <- c(2,4,6,8,2,3,4,1) c <- factor(c("A","B","A","B","A","B","A","B")) df <- data.frame( sp=a, length=b, method=c) I can use the following to get a count of the number of samples of…
Ben
11 votes, 2 answers

How to pass na.rm as argument to tapply?

I'd like to calculate mean and sd from a dataframe with one column for the parameter and one column for a group identifier. How can I calculate them when using tapply? I could use sd(v1, group, na.rm=TRUE), but can't fit the na.rm=TRUE into the…
Doc
8 votes, 1 answer

Breaking the tapply junkie habit

I've learned R by toying, and I'm starting to think that I'm abusing the tapply function. Are there better ways to do some of the following actions? Granted, they work, but as they get more complex I wonder if I'm losing out on better options. I'm…
Totovader
7 votes, 2 answers

Combining tapply and 'not in' logic, using R

How do I combine the tapply command with 'not in' logic? Objective: Obtain the median sepal length for each species. tapply(iris$Sepal.Length, iris$Species, median) Constraint: Remove entries for which there is a petal width of 1.3 and…
bubbalouie
7 votes, 2 answers

does the by( ) function make growing list

Does the by function make a list that grows one element at a time? I need to process a data frame with about 4M observations grouped by a factor column. The situation is similar to the example below: > # Make 4M rows of data > x =…
Anand
7 votes, 4 answers

How to assign a counter to a specific subset of a data.frame which is defined by a factor combination?

My question is: I have a data frame with some factor variables. I now want to assign a new vector to this data frame, which creates an index for each subset of those factor variables. data <-data.frame(fac1=factor(rep(1:2,5)),…
JBJ
6 votes, 1 answer

What is the difference between the functions tapply and ave?

I can't wrap my mind around the ave function. I read the help and searched the net but I still cannot understand what it does. I understand it applies some function on a subset of observations, but not in the same way as, for example, tapply. Could…
ECII
5 votes, 1 answer

Custom rcpp last function slow with dplyr group_by and summarise compared to tapply

I'm trying to get a sense of how to write Rcpp summarise functions that will be fast with dplyr. The motivation for this is a function that dplyr does not seem to have an equivalent for, however, for the sake of simplicity, I'm going to use the…
user2506086
5 votes, 1 answer

R's tapply with null function

I'm having trouble understanding what the tapply function does when the FUN argument is NULL. The documentation says: If FUN is NULL, tapply returns a vector which can be used to subscript the multi-way array tapply normally produces. For example,…
carmellose
5 votes, 2 answers

R: How do you apply grep() in lapply()

I would like to apply grep() in R, but I am not really good at lapply(). I understand that lapply is able to take a list, apply a function to each member, and output a list. For instance, let x be a list consisting of 2 members. >…
HNSKD