Questions tagged [window-functions]

A window function is a type of SQL operation that aggregates over a partition of the result set.

As a basic example of a window function, consider a table that contains a list persons along with their age and gender.
If you wanted a query to return the complete list of people along with their age rank per gender, you could accomplish this most simply by using a ranking window function.
Such a function can be used to rank individual rows within a subset of the results.

Using a window function the above function would look like this:

SELECT name, 
       age, 
       gender, 
       RANK() OVER (PARTITION BY gender ORDER BY age DESC) AS age_rank
FROM persons

Window functions as defined by the SQL standard can also access rows before and after the current row using the lag() or lead() function. The following query calculates the delta to the previous row for the amount column for each customer:

SELECT customer_id,
       order_date,
       amount, 
       amount - lag(amount) over (partition by customer_id ORDER BY order_date) as delta
FROM order
ORDER BY customer_id, order_date;

Standard aggregates (e.g. count(), sum()) can also be used as a window function. If combined with an ORDER BY (inside the partition defintion) this will generate "running" totals based on the aggregate being used.

Window functions are available in a wide range of DBMS: Oracle (called "analytical functions" there), Microsoft SQL Server, DB2 (called "olap functions" there) PostgreSQL, Teradata, Sybase, Vertica, and partially in H2 Database.

3029 questions
21
votes
4 answers

Why do I need to apply a window function to samples when building a power spectrum of an audio signal?

I have found for several times the following guidelines for getting the power spectrum of an audio signal: collect N samples, where N is a power of 2 apply a suitable window function to the samples, e.g. Hanning pass the windowed samples to an FFT…
Nuno Santos
  • 1,382
  • 3
  • 14
  • 33
20
votes
2 answers

SQL window function with a where clause?

I'm trying to correlate two types of events for users. I want to see all event "B"s along with the most recent event "A" for that user prior to the "A" event. How would one accomplish this? In particular, I'm trying to do this in Postgres. I was…
MJ.
  • 1,089
  • 4
  • 10
  • 21
19
votes
1 answer

Row_number over (Partition by yyy) in Entity Framework

I want to load data by Row_number over Partition by using EF. SELECT * FROM ( SELECT sf.SerialFlowsId ,sf.GoodsSerialId ,d.FormTypeId , d.GoodsId …
Mahdi Farhani
  • 727
  • 1
  • 6
  • 20
19
votes
2 answers

T-SQL calculate moving average

I am working with SQL Server 2008 R2, trying to calculate a moving average. For each record in my view, I would like to collect the values of the 250 previous records, and then calculate the average for this selection. My view columns are as…
RunW
  • 259
  • 1
  • 2
  • 11
19
votes
2 answers

Applying Multiple Window Functions On Same Partition

Is it possible to apply multiple window functions to the same partition? (Correct me if I'm not using the right vocabulary) For example you can do SELECT name, first_value() over (partition by name order by date) from table1 But is there a way to…
Verhogen
  • 23,861
  • 32
  • 81
  • 109
19
votes
3 answers

PostgreSQL - how should I use first_value()?

This answer to shows how to produce High/Low/Open/Close values from a ticker: Retrieve aggregates for arbitrary time intervals I am trying to implement a solution based on this (PG 9.2), but am having difficulty in getting the correct value for…
Brent.Longborough
  • 8,787
  • 10
  • 38
  • 60
18
votes
1 answer

Avoid performance impact of a single partition mode in Spark window functions

My question is triggered by the use case of calculating the differences between consecutive rows in a spark dataframe. For example, I have: >>> df.show() +-----+----------+ |index| col1| +-----+----------+ | 0.0|0.58734024| | …
17
votes
2 answers

Does Spark know the partitioning key of a DataFrame?

I want to know if Spark knows the partitioning key of the parquet file and uses this information to avoid shuffles. Context: Running Spark 2.0.1 running local SparkSession. I have a csv dataset that I am saving as parquet file on my disk like…
astro_asz
  • 2,058
  • 2
  • 13
  • 30
17
votes
2 answers

spark sql window function lag

I am looking at the window slide function for a Spark DataFrame in Scala. I have a DataFrame with columns Col1, Col2, Col3, date, volume and new_col. Col1 Col2 Col3 date volume new_col 201601 100.5 …
Ramesh
  • 1,323
  • 7
  • 18
  • 34
17
votes
2 answers

Referencing current row in FILTER clause of window function

In PostgreSQL 9.4 the window functions have the new option of a FILTER to select a sub-set of the window frame for processing. The documentation mentions it, but provides no sample. An online search yields some samples, including from 2ndQuadrant…
Patrick
  • 23,688
  • 4
  • 49
  • 75
16
votes
2 answers

How to use a SQL window function to calculate a percentage of an aggregate

I need to calculate percentages of various dimensions in a table. I'd like to simplify things by using window functions to calculate the denominator, however I am having an issue because the numerator has to be an aggregate as well. As a simple…
15
votes
4 answers

MySql using correct syntax for the over clause

What is the correct syntax to get the over clause to work in mysql? I would like to see the total sms's sent by each user without grouping it with the group by clause. SELECT username, count(sentSmsId) OVER (userId) FROM …
Nightwolf
  • 907
  • 2
  • 9
  • 22
14
votes
1 answer

Postgres window function and group by exception

I'm trying to put together a query that will retrieve the statistics of a user (profit/loss) as a cumulative result, over a period of time. Here's the query I have so far: SELECT p.name, e.date, sum(sp.payout) OVER (ORDER BY e.date) -…
Martin
  • 1,995
  • 4
  • 19
  • 41
14
votes
2 answers

How to make a SUM without group by

Here is my problem.. Actual Auction Ammanat id 7000 500 100 228,229 7000 100 100 228,229 7000 900 100 228,229 5000 0 0 230 I want result as given below Actual Auction …
Aijaz Chauhan
  • 1,315
  • 2
  • 16
  • 45
14
votes
2 answers

TSQL OVER clause: COUNT(*) OVER (ORDER BY a)

This is my code: USE [tempdb]; GO IF OBJECT_ID(N'dbo.t') IS NOT NULL BEGIN DROP TABLE dbo.t END GO CREATE TABLE dbo.t ( a NVARCHAR(8), b NVARCHAR(8) ); GO INSERT t VALUES ('a', 'b'); INSERT t VALUES ('a', 'b'); INSERT t VALUES ('a',…
Just a learner
  • 21,448
  • 45
  • 133
  • 206