How do I get min, median and max from my query in postgresql?

Question

I have written a query in which one column is a month. From that I have to get min month, max month, and median month. Below is my query.

select ext.employee,
       pl.fromdate,
       ext.FULL_INC as full_inc,
       prevExt.FULL_INC as prevInc,
       (extract(year from age (pl.fromdate))*12 +extract(month from age (pl.fromdate))) as month,
       case
         when prevExt.FULL_INC is not null then (ext.FULL_INC -coalesce(prevExt.FULL_INC,0))
         else 0
       end as difference,
       (case when prevExt.FULL_INC is not null then (ext.FULL_INC - prevExt.FULL_INC) / prevExt.FULL_INC*100 else 0 end) as percent
from pl_payroll pl
  inner join pl_extpayfile ext
          on pl.cid = ext.payrollid
         and ext.FULL_INC is not null
  left outer join pl_extpayfile prevExt
               on prevExt.employee = ext.employee
              and prevExt.cid = (select max (cid) from pl_extpayfile
                                 where employee = prevExt.employee
                                 and   payrollid = (
                                   select max(p.cid)
                                   from pl_extpayfile,
                                        pl_payroll p
                                   where p.cid = payrollid
                                   and   pl_extpayfile.employee = prevExt.employee
                                   and   p.fromdate < pl.fromdate
                                 )) 
              and coalesce(prevExt.FULL_INC, 0) > 0 
where ext.employee = 17 
and (exists (
    select employee
    from pl_extpayfile preext
    where preext.employee = ext.employee
    and   preext.FULL_INC <> ext.FULL_INC
    and   payrollid in (
      select cid
      from pl_payroll
      where cid = (
        select max(p.cid)
        from pl_extpayfile,
             pl_payroll p
        where p.cid = payrollid
        and   pl_extpayfile.employee = preext.employee
        and   p.fromdate < pl.fromdate
      )
    )
  )
  or not exists (
    select employee
    from pl_extpayfile fext,
         pl_payroll p
    where fext.employee = ext.employee
    and   p.cid = fext.payrollid
    and   p.fromdate < pl.fromdate
    and   fext.FULL_INC > 0
  )
)
order by employee,
         ext.payrollid desc

If that is not possible, is it at least possible to get the max month and min month?

Your query is pretty much illegible. I've put it in a code block, but it's still impossible to really follow. It might be worth your time to edit your question and format it for readability; right now some people will look at it, go "Gah!" and move on without trying to answer. I don't know if it matters what the query is for the question, though; all you need is the `min` and `max` aggregate functions. For median did you try http://wiki.postgresql.org/wiki/Aggregate_Median ? 1st hit on search for "postgresql median" — Craig Ringer, Aug 22 '12 at 06:57

score 106 · Answer 1 · edited Sep 07 '18 at 22:11

To calculate the median in PostgreSQL, simply take the 50th percentile (no need to add extra functions or anything):

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY x) FROM t;
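
As a minimal sketch of how this answers the original question, the same ordered-set aggregate can sit next to `min` and `max` in one statement. The four-row input from `generate_series` and the column aliases are illustrative assumptions, not part of the question's schema, and PostgreSQL 9.4 or later is assumed.

```sql
-- Assumes PostgreSQL 9.4+ (ordered-set aggregates).
-- generate_series(1, 4) is only stand-in data for the example.
SELECT min(val)                                         AS min_val,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY val) AS median_cont,
       percentile_disc(0.5) WITHIN GROUP (ORDER BY val) AS median_disc,
       max(val)                                         AS max_val
FROM generate_series(1, 4) AS t(val);
-- min_val = 1, median_cont = 2.5, median_disc = 2, max_val = 4
```

`percentile_cont` interpolates between the two middle values of an even-length set, while `percentile_disc` returns a value that actually occurs in the input.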

Answered Oct 29 '16 at 07:45 by Tobi Oetiker · edited Sep 07 '18 at 22:11 by simhumileco

`PERCENTILE_DISC()` may be preferred under many circumstances. – Gordon Linoff Jan 19 '17 at 21:20
works like a charm but observe that this is postgres 9.4+! – Kristian Sandström Mar 06 '17 at 15:57
Nice. I was worried that it wouldn't average the values in an even-length set but it seems to work well. `SELECT PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER by val) FROM generate_series(1, 4) as t(val);` returns 2.5. `PERCENTILE_DISC`, however, returns 2. – isapir Oct 23 '17 at 19:55
Very useful, but does not work with window functions. – Jan Šimbera Aug 29 '18 at 07:45
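
On the window-function limitation mentioned in the last comment: ordered-set aggregates such as `percentile_cont` cannot be called with `OVER`. One possible workaround, sketched here under assumptions, is to compute the median per group in a subquery and join it back to the detail rows. The `emp` rows and all names below are hypothetical illustration data.

```sql
-- Assumes PostgreSQL 9.4+. emp and dep_stats are hypothetical names.
WITH emp(depname, empno, salary) AS (
  VALUES ('develop', 11, 5200),
         ('develop',  7, 4200),
         ('mgmt',     1, 9000)
),
dep_stats AS (
  -- One median per department, computed as an ordinary grouped aggregate.
  SELECT depname,
         percentile_cont(0.5) WITHIN GROUP (ORDER BY salary) AS dep_median
  FROM emp
  GROUP BY depname
)
-- Attach each department's median to every row of that department,
-- which stands in for "median(...) OVER (PARTITION BY depname)".
SELECT e.depname, e.empno, e.salary, s.dep_median
FROM emp AS e
JOIN dep_stats AS s USING (depname)
ORDER BY e.depname, e.empno;
-- both 'develop' rows carry dep_median = 4700
```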

Craig Ringer · Accepted Answer · 2012-08-22T07:31:27.017

You want the aggregate functions named min and max. See the PostgreSQL documentation and tutorial:

There's no built-in median in PostgreSQL; however, one has been implemented and contributed to the wiki:

http://wiki.postgresql.org/wiki/Aggregate_Median

It's used the same way as min and max once you've loaded it. Being written in PL/pgSQL, it'll be a fair bit slower, but there's also a C version there that you could adapt if speed is vital.

UPDATE After comment:

It sounds like you want to show the statistical aggregates alongside the individual results. You can't do this with a plain aggregate function because you can't reference columns not in the GROUP BY in the result list.

You will need to fetch the stats from subqueries, or use your aggregates as window functions.
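
A minimal sketch of the subquery route, assuming PostgreSQL 9.4+ so that `percentile_cont` can serve as the median without loading the wiki aggregate. The CTE `q` and its rows are stand-ins: in practice it would hold the original query (without its `ORDER BY`), and `months` would be the computed month column.

```sql
-- q is a hypothetical stand-in for the original query's result set.
WITH q(employee, months) AS (
  VALUES (17, 3), (17, 15), (17, 27), (17, 40)
)
SELECT q.employee,
       q.months,
       -- Each scalar subquery aggregates over the whole result set,
       -- so the same three values repeat on every detail row.
       (SELECT min(months) FROM q) AS min_months,
       (SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY months) FROM q) AS median_months,
       (SELECT max(months) FROM q) AS max_months
FROM q
ORDER BY q.months;
-- min_months = 3, median_months = 21, max_months = 40 on every row
```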

Given dummy data:

CREATE TABLE dummystats ( depname text, empno integer, salary integer );
INSERT INTO dummystats(depname,empno,salary) VALUES
('develop',11,5200),
('develop',7,4200),
('personell',2,5555),
('mgmt',1,9999999);

... and after adding the median aggregate from the PG wiki:

You can do this with an ordinary aggregate:

regress=# SELECT min(salary), max(salary), median(salary) FROM dummystats;
 min  |   max   |        median         
------+---------+-----------------------
 4200 | 9999999 | 5377.5000000000000000
(1 row)

but not this:

regress=# SELECT depname, empno, min(salary), max(salary), median(salary)
regress-# FROM dummystats;
ERROR:  column "dummystats.depname" must appear in the GROUP BY clause or be used in an aggregate function

because it doesn't make sense in the aggregation model to show the aggregates alongside individual values. You can show groups:

regress=# SELECT depname, min(salary), max(salary), median(salary) 
regress-# FROM dummystats GROUP BY depname;
  depname  |   min   |   max   |        median         
-----------+---------+---------+-----------------------
 personell |    5555 |    5555 | 5555.0000000000000000
 develop   |    4200 |    5200 | 4700.0000000000000000
 mgmt      | 9999999 | 9999999 |  9999999.000000000000
(3 rows)

... but it sounds like you want the individual values. For that, you must use a window function, a feature new in PostgreSQL 8.4.

regress=# SELECT depname, empno, 
                 min(salary) OVER (), 
                 max(salary) OVER (), 
                 median(salary) OVER () 
          FROM dummystats;

  depname  | empno | min  |   max   |        median         
-----------+-------+------+---------+-----------------------
 develop   |    11 | 4200 | 9999999 | 5377.5000000000000000
 develop   |     7 | 4200 | 9999999 | 5377.5000000000000000
 personell |     2 | 4200 | 9999999 | 5377.5000000000000000
 mgmt      |     1 | 4200 | 9999999 | 5377.5000000000000000
(4 rows)

If I add the max and min functions, it asks me to put the rest of the columns in the GROUP BY clause, and even after that it does not work — Deepak Kumar, Aug 22 '12 at 07:04
@DeepakKumar You need to read the PostgreSQL tutorial. It explains about aggregates, `GROUP BY`, etc. At a guess you either need to get the min, max and median via a subquery, or need to use a window function to calculate them. See http://www.postgresql.org/docs/current/static/tutorial-window.html . — Craig Ringer, Aug 22 '12 at 07:07
@DeepakKumar I suspect you need window functions. See the updated answer above. I can't run your query since there's no sample data, but I've provided a simple example. I've used avg() to get a mean, since there's no built-in median, though you can add one via that wiki code. If you add `OVER ()` to your aggregates without adding any `GROUP BY` it might just work. — Craig Ringer, Aug 22 '12 at 07:17
Or, if you want to aggregate per department: `min(salary) OVER (PARTITION BY depname) AS dep_min_salary` etc. — Erwin Brandstetter, Aug 22 '12 at 17:33
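
The per-department variant from the last comment might look like the sketch below. The inline CTE simply repeats the answer's dummy rows so the statement runs on its own; the `dep_max_salary` alias is an assumption added for symmetry with `dep_min_salary`.

```sql
-- Inline copy of the answer's dummy data so the example is self-contained.
WITH dummystats(depname, empno, salary) AS (
  VALUES ('develop',   11, 5200),
         ('develop',    7, 4200),
         ('personell',  2, 5555),
         ('mgmt',       1, 9999999)
)
SELECT depname,
       empno,
       salary,
       -- PARTITION BY restricts each window to rows of the same department.
       min(salary) OVER (PARTITION BY depname) AS dep_min_salary,
       max(salary) OVER (PARTITION BY depname) AS dep_max_salary
FROM dummystats
ORDER BY depname, empno;
-- both 'develop' rows show dep_min_salary = 4200 and dep_max_salary = 5200
```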

score 0 · Answer 3 · answered Dec 01 '19 at 21:07

One more option for median:

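-- t is the table and x the column to take the median of.
-- With an even number of rows this returns the upper of the two middle values.
SELECT x
FROM t
ORDER BY x
LIMIT 1 OFFSET (SELECT count(*) FROM t) / 2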
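
The single-row query above picks the upper of the two middle rows when the row count is even. A possible variant, sketched under the assumption that the conventional median (the mean of the two middle values) is wanted, fetches one or two middle rows depending on parity and averages them. The CTE `t(x)` is hypothetical sample data.

```sql
-- t(x) is stand-in data; replace it with the real table and column.
WITH t(x) AS (
  VALUES (1), (2), (3), (4)
)
SELECT avg(x) AS median
FROM (
  SELECT x
  FROM t
  ORDER BY x
  -- Odd row count: take 1 row. Even row count: take 2 rows.
  LIMIT 2 - (SELECT count(*) FROM t) % 2
  -- Skip the rows that sort before the middle one(s).
  OFFSET ((SELECT count(*) FROM t) - 1) / 2
) AS middle;
-- four rows: averages the 2nd and 3rd values, giving 2.5
```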

Aray Karjauv


Prajwal KV · Answer 4 · 2021-01-07T12:28:05.663

To find the median, suppose for instance that the table has 6000 rows. The median is the middle value, so first take the lower half of the rows in sorted order. Half of 6000 is 3000; taking 3001 rows ensures that both middle values (rows 3000 and 3001) are included. The median is the average of those two values.

SELECT *
FROM (SELECT column_name
      FROM Table_name
      ORDER BY column_name
      LIMIT 3001) AS Table1
ORDER BY column_name DESC  -- DESC (Z-A) puts the last two of those rows first,
                           -- i.e. rows 3000 and 3001 of the 6000 rows
LIMIT 2;
