Questions tagged [tapply]

tapply is a function in the R programming language for applying a function to subsets of a vector. A vector is broken into subsets, potentially of different lengths (a ragged array), based on the values of one or more other vectors. Each of these grouping vectors is either already a factor or is coerced to one by as.factor. A function is applied to each of these subsets. With the default settings, tapply then returns either an array of values, when the function returns a single value for each subset, or an array of mode list otherwise.
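As a minimal sketch of the behaviour described above (the vectors `values` and `groups` are made-up illustration data, not taken from any question on this page), the following shows both return shapes:

```r
# Hypothetical data: six measurements and a grouping vector of equal length.
values <- c(10, 20, 30, 40, 50, 60)
groups <- c("a", "b", "a", "c", "b", "a")  # coerced to a factor by tapply

# mean() returns one number per subset, so the result is a named 1-d array.
group_means <- tapply(values, groups, mean)
print(group_means)

# range() returns two numbers per subset, so the result is an array of mode list.
group_ranges <- tapply(values, groups, range)
print(group_ranges[["a"]])
```

Note that the subsets have different lengths here (three, two, and one element), which is the ragged case the description refers to.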

330 questions
5
votes
1 answer

R - Loop through different matrices without using a loop! Help simplifying code

So I have two separate matrices (mat1 and mat2) and I need to go through them in order to make a check. I need to store the results in a third matrix. I feel that my code is very long for the purpose. I wanted to have some of your suggestions to…
giac
  • 3,673
  • 2
  • 19
  • 47
5
votes
2 answers

Equivalent of R's tapply() in Python Pandas

I have a dataset that contains the feeding data of 3 animals, consisting of the animals' tag ids (1,2,3), the type (A,B) and amount (kg) of feed given at each 'meal': Animal FeedType Amount(kg) Animal1 A 10 Animal2 B …
Zhubarb
  • 8,409
  • 17
  • 65
  • 100
4
votes
0 answers

R tapply: different R releases produce different outputs

The Problem: This is a simple tapply example: z=data.frame(s=as.character(NA), rows=c(1,2,1), cols=c(1,1,2), stringsAsFactors=FALSE) tapply(z$s, list(z$rows, z$cols), identity) On R (Another Canoe) v3.3.3 (2017-03-06) for Windows, it returns: # 1 2…
antonio
  • 9,285
  • 10
  • 59
  • 113
4
votes
1 answer

Scale all values depending on group

I have a dataframe similar to this one ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3) p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543,3524, 353, 3432, 4542, 6343, 4534 ) p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343,…
GNee
  • 117
  • 2
  • 9
4
votes
3 answers

R function which.max with tapply

I am trying to make a data frame with the maximum over records by a factor. I would like a data frame with 4 rows (one for each G) with the max for X in that group and the corresponding Y value. I know I could write a loop but would rather…
LoveMeow
  • 991
  • 1
  • 11
  • 24
4
votes
1 answer

What is the difference between tapply and aggregate in R?

Aaa <- data.frame(amount=c(1,2,1,2,1,1,2,2,1,1,1,2,2,2,1), card=c("a","b","c","a","c","b","a","c","b","a","b","c","a","c","a")) aggregate(x=Aaa$amount, by=list(Aaa$card), FUN=mean) ## Group.1 x ## 1 a 1.50 ## 2 …
Neo XU
  • 91
  • 2
  • 5
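A rough sketch of the comparison this question raises, reusing the `Aaa` data frame from the excerpt. The variable names `by_tapply` and `by_aggregate` are introduced here for illustration only, and this is not taken from the posted answer:

```r
# Data frame copied from the question excerpt.
Aaa <- data.frame(
  amount = c(1, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 1),
  card   = c("a", "b", "c", "a", "c", "b", "a", "c", "b", "a", "b", "c", "a", "c", "a")
)

# tapply returns a named 1-d array: one mean per level of card.
by_tapply <- tapply(Aaa$amount, Aaa$card, mean)
print(by_tapply)

# aggregate returns a data frame with a grouping column and a value column.
by_aggregate <- aggregate(x = Aaa$amount, by = list(Aaa$card), FUN = mean)
print(by_aggregate)
```

Both calls compute the same group means; the visible difference in this sketch is the shape of the result (array versus data frame).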
3
votes
4 answers

Mean with condition for multiple columns in r

Let's use mtcars to explain the situation. What I want to do is the same below for multiple columns. To have the mean of a column qsec (in the example) regarding another column with a specific value (4 and 6, in the example below). I'll compare the…
ivan lange
  • 55
  • 1
  • 3
3
votes
1 answer

Aggregate the total revenue for each date by using aggregate function

I have a daily revenue dataset df from 2016-01-01 to 2017-05-21. The dataset contains Datum, languages and Opbrengst variables. Datum lanuage Opbrengst 596 20160101 bg 254 923 20160101 bg-bg 434 1044 20160101 ca …
Sheryl
  • 521
  • 1
  • 7
  • 16
3
votes
5 answers

Get sum of every n th column for each individual and create new data frame in r

Having searched for similar posts, I am posting my question. I have monthly rainfall variables for several years for each site. I need to calculate monthly average rainfall over the years. I have given a simple data frame as follows. I need to…
sriya
  • 129
  • 1
  • 1
  • 7
3
votes
1 answer

R - tapply column mean, returning logical array

I have a data frame. I am trying to use the tapply function to find the average of one column when the values of a second column are equal to a given value. I want tapply to return the value of the mean, but it is returning a logical array (FALSE -…
Lama Kaysi
  • 69
  • 4
3
votes
1 answer

Calculating quintile based scores on R

I have a dataframe with year (2006 to 2010), 4 industry sectors, 150 firm names and the net income of these firms. In total I have 750 observations, one for each firm for each year. I want to give scores to firms for their income within each…
Piyush Shah
  • 131
  • 1
  • 11
3
votes
3 answers

Computing pairwise Hamming distance between all rows of two integer matrices/data frames

I have two data frames, df1 with reference data and df2 with new data. For each row in df2, I need to find the best (and the second best) matching row to df1 in terms of hamming distance. I used e1071 package to compute hamming distance. Hamming…
alaj
  • 177
  • 10
3
votes
1 answer

Apply custom function to each subset of a data frame and return a data frame

This may have been asked many times here, but I am not able to relate it to any existing question, since my function returns a data frame. I have a custom function which builds a model and outputs a data frame with slope (coeff2) in one column, intercept (coeff1) in another…
ds_user
  • 1,899
  • 2
  • 25
  • 61
3
votes
3 answers

Calculate accuracy by groups

I have a data frame which looks like this: df<- data.frame("iteration" = c(1,1,1,1,1,1), "model" = c("RF","RF","RF","SVM", "SVM","SVM"), "label" = c(0,0,1,0,0,1), "prediction" = c(0,1,1,0,1,1)) iteration model label prediction 1 …
Saul Garcia
  • 792
  • 1
  • 7
  • 20
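One possible way to approach this with tapply, assuming that "accuracy" means the share of rows where `prediction` equals `label` and that grouping by `model` alone is what is wanted (both are assumptions, since the excerpt is truncated):

```r
# Data frame copied from the question excerpt.
df <- data.frame(
  iteration  = c(1, 1, 1, 1, 1, 1),
  model      = c("RF", "RF", "RF", "SVM", "SVM", "SVM"),
  label      = c(0, 0, 1, 0, 0, 1),
  prediction = c(0, 1, 1, 0, 1, 1)
)

# The mean of a logical vector is the proportion of TRUE values,
# so this gives the fraction of correct predictions per model.
accuracy <- tapply(df$label == df$prediction, df$model, mean)
print(accuracy)
```

To group by iteration as well, a list such as `list(df$iteration, df$model)` could be passed as the second argument, which would produce a two-dimensional array.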
3
votes
1 answer

Using tapply on data with counts to add zeros and NAs

I have a DB composed of: Species ID (as factor), counts, site, visit, year. Find a subset in here [Google Drive] I want to create a 4D array with the dimensions: species, site, visit and year. Counts as cell values. For which I am using the…
YMC
  • 63
  • 6