Questions tagged [aggregate]

Aggregate refers to the process of summarizing grouped data, commonly used in Statistics. Typically this involves replacing groups of data with single values (e.g. sum, mean, standard deviation, etc.). In SQL databases, this is accomplished with the use of GROUP BY and aggregate functions.
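
As a rough illustration of the description above, the sketch below uses Python's built-in sqlite3 module to replace each group of rows with single values via GROUP BY and two aggregate functions. The table name `sales`, its columns, and the data are made up for the example.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("First", 10.0), ("First", 20.0), ("Second", 5.0)],
)

# GROUP BY collapses each category into one row;
# SUM and AVG are the aggregate functions applied per group.
rows = conn.execute(
    "SELECT category, SUM(amount), AVG(amount) "
    "FROM sales GROUP BY category ORDER BY category"
).fetchall()
conn.close()

print(rows)
```

Each category ends up as one row holding its sum and mean, which is the "groups replaced by single values" idea in its simplest form.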

6659 questions
1053
votes
14 answers

Group By Multiple Columns

How can I group by multiple columns in LINQ? Something similar to this in SQL: SELECT * FROM GROUP BY , How can I convert this to LINQ: QuantityBreakdown ( MaterialID int, ProductID int, Quantity…
Sreedhar
  • 26,251
  • 31
  • 104
  • 163
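
For readers outside .NET, the idea of grouping on more than one column can be sketched in Python (not LINQ) by using a tuple of the grouping columns as the key. The field names come from the question excerpt; the rows themselves are invented.

```python
from collections import defaultdict

# Made-up rows in the shape (MaterialID, ProductID, Quantity).
rows = [(1, 10, 2.0), (1, 10, 3.0), (1, 20, 4.0), (2, 10, 1.0)]

# The composite key (MaterialID, ProductID) plays the role of
# "GROUP BY column1, column2"; quantities are summed per key.
totals = defaultdict(float)
for material_id, product_id, quantity in rows:
    totals[(material_id, product_id)] += quantity

quantity_by_key = dict(totals)
print(quantity_by_key)
```

The same pattern extends to any number of grouping columns by widening the tuple.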
756
votes
12 answers

LINQ Aggregate algorithm explained

This might sound lame, but I have not been able to find a really good explanation of Aggregate. Good means short, descriptive, comprehensive with a small and clear example.
Alexander Beletsky
  • 18,109
  • 9
  • 57
  • 84
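
LINQ's Aggregate is a fold: it walks a sequence and combines each element with a running accumulator. A minimal Python sketch of the same idea, using `functools.reduce` as the closest standard-library analogue (this is not the LINQ API itself):

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# No seed: the first element is the initial accumulator, so this is ((1 + 2) + 3) + 4.
total = reduce(lambda acc, x: acc + x, numbers)

# With a seed of 10: (((10 * 1) * 2) * 3) * 4.
product = reduce(lambda acc, x: acc * x, numbers, 10)

# The accumulator can be any type, for example a string being built up.
letters = reduce(lambda acc, s: acc + "," + s, ["a", "b", "c", "d"])

print(total, product, letters)
```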
590
votes
6 answers

What are Aggregates and PODs and how/why are they special?

This FAQ is about Aggregates and PODs and covers the following material: What are Aggregates? What are PODs (Plain Old Data)? How are they related? How and why are they special? What changes for C++11?
Armen Tsirunyan
  • 120,726
  • 52
  • 304
  • 418
424
votes
15 answers

How to sum a variable by group

I have a data frame with two columns. First column contains categories such as "First", "Second", "Third", and the second column has numbers that represent the number of times I saw the specific groups from "Category". For example: Category …
user5243421
  • 9,066
  • 22
  • 65
  • 102
383
votes
6 answers

How to use GROUP BY to concatenate strings in MySQL?

Basically the question is how to get from this:

foo_id  foo_name
1       A
1       B
2       C

to this:

foo_id  foo_name
1       A B
2       C
Paweł Hajdan
  • 16,853
  • 9
  • 45
  • 63
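
A small sketch of the string-concatenating aggregate, run against SQLite rather than MySQL because SQLite ships with Python. SQLite's `group_concat(value, separator)` is used here as a stand-in; MySQL's `GROUP_CONCAT` takes its separator through a `SEPARATOR` keyword instead, so the query would need adjusting there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (foo_id INTEGER, foo_name TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)", [(1, "A"), (1, "B"), (2, "C")]
)

# group_concat joins every foo_name in a group into one string.
# SQLite does not guarantee the order of names inside each group.
result = conn.execute(
    "SELECT foo_id, group_concat(foo_name, ' ') FROM foo "
    "GROUP BY foo_id ORDER BY foo_id"
).fetchall()
conn.close()

print(result)
```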
368
votes
14 answers

How to group dataframe rows into list in pandas groupby

I have a pandas data frame df like:

a  b
A  1
A  2
B  5
B  5
B  4
C  6

I want to group by the first column and get second column as lists in rows:

A  [1,2]
B  [5,5,4]
C  [6]

Is it possible to do something like this using pandas groupby?
Abhishek Thakur
  • 13,185
  • 13
  • 57
  • 92
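
The target shape can be sketched without pandas, using a plain dictionary that collects values per key. This is only an illustration of the grouping logic on the data from the question, not the pandas answer; in pandas itself a commonly used pattern is `df.groupby('a')['b'].apply(list)`.

```python
# Rows from the question as (a, b) pairs.
rows = [("A", 1), ("A", 2), ("B", 5), ("B", 5), ("B", 4), ("C", 6)]

# Collect every b value under its a key, preserving row order.
grouped = {}
for key, value in rows:
    grouped.setdefault(key, []).append(value)

print(grouped)
```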
358
votes
2 answers

C# Linq Group By on multiple columns

public class ConsolidatedChild { public string School { get; set; } public string Friend { get; set; } public string FavoriteColor { get; set; } public List<Child> Children { get; set; } } public class Child { public string…
Kasy
  • 3,591
  • 2
  • 13
  • 6
288
votes
9 answers

Pandas group-by and sum

I am using this data frame: Fruit Date Name Number Apples 10/6/2016 Bob 7 Apples 10/6/2016 Bob 8 Apples 10/6/2016 Mike 9 Apples 10/7/2016 Steve 10 Apples 10/7/2016 Bob 1 Oranges 10/7/2016 Bob 2 Oranges 10/6/2016 Tom …
Trying_hard
  • 6,963
  • 22
  • 54
  • 76
190
votes
8 answers

Mean per group in a data.frame

I have a data.frame and I need to calculate the mean per group (i.e. per Month, below). Name Month Rate1 Rate2 Aira 1 12 23 Aira 2 18 73 Aira 3 19 45 Ben 1 53 19 Ben …
Ianthe
  • 4,879
  • 18
  • 53
  • 70
171
votes
3 answers

Multiple aggregations of the same column using pandas GroupBy.agg()

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times? Example dataframe: import pandas as pd import datetime as dt import numpy as…
ely
  • 63,678
  • 30
  • 130
  • 206
163
votes
8 answers

Aggregate / summarize multiple variables per group (e.g. sum, mean)

From a data frame, is there an easy way to aggregate (sum, mean, max, etc.) multiple variables simultaneously? Below are some sample data: library(lubridate) days = 365*2 date = seq(as.Date("2000-01-01"), length = days, by = "day") year =…
MikeTP
  • 6,836
  • 15
  • 42
  • 56
162
votes
5 answers

Summarizing multiple columns with dplyr?

I'm struggling a bit with the dplyr-syntax. I have a data frame with different variables and one grouping variable. Now I want to calculate the mean for each column within each group, using dplyr in R. df <- data.frame( a = sample(1:5, n,…
Daniel
  • 6,454
  • 5
  • 21
  • 35
131
votes
16 answers

Count number of rows within each group

I have a dataframe and I would like to count the number of rows within each group. I regularly use the aggregate function to sum data as follows: df2 <- aggregate(x ~ Year + Month, data = df1, sum) Now, I would like to count observations but can't…
MikeTP
  • 6,836
  • 15
  • 42
  • 56
95
votes
7 answers

Apply several summary functions on several variables by group in one call

I have the following data frame x <- read.table(text = " id1 id2 val1 val2 1 a x 1 9 2 a x 2 4 3 a y 3 5 4 a y 4 9 5 b x 1 7 6 b y 4 4 7 b x 3 9 8 b y 2 8", header =…
broccoli
  • 4,316
  • 8
  • 33
  • 50
92
votes
3 answers

Pandas sum by groupby, but exclude certain columns

What is the best way to do a groupby on a Pandas dataframe, but exclude some columns from that groupby? e.g. I have the following dataframe: Code Country Item_Code Item Ele_Code Unit Y1961 Y1962 Y1963 2 Afghanistan 15 …
user308827
  • 18,706
  • 61
  • 194
  • 336