0

I want to get the median of unitRate from [dbo].[ReplaceCost_DirectCost_Details] view in Microsoft Sql Server Management Studio. I already got Min,Max and avg of it.But do not know about median. I tried following code, but did not get median .Thanks in advacen for your help.

 select 
        JobName as JobName
       ,Client as Client
       ,AssetClass as AssetClass
       ,AssetType as AssetType
       ,AssetSubType as AssetSubType
       ,Component as Component
       ,ComponentType as ComponentType
       ,ComponentSubType as ComponentSubType
       ,UnitRate AS UnitRate
       ,Max(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MaxFinalUnitRate]
       ,Min(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType)  as [MinFinalUnitRate]
       ,AVG(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MeanFinalUnitRate]
       ,AVG (UnitRate)  over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType)as Median
       from
        (
         Select top (10)
        JobName as JobName
       ,Client as Client
       ,AssetClass as AssetClass
       ,AssetType as AssetType
       ,AssetSubType as AssetSubType
       ,Component as Component
       ,ComponentType as ComponentType
       ,ComponentSubType as ComponentSubType
       ,UnitRate AS UnitRate
       ,ROW_NUMBER () over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType order by UnitRate) as [RowNum]
       ,COUNT(*) OVER (PARTITION BY JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType ) AS RowCnt

     from [dbo].[ReplaceCost_DirectCost_Details] rdd
     where  client = 'APV_Ballina_Shire_Council_Old'  and  UnitRate is not Null and UnitRate <> 0 
       ) x
       WHERE   RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2) 

Maryam Yasaei
  • 23
  • 1
  • 7
  • "not working" (does that mean smoke is emitted from the desktop? it crahed the server? the answer is wrong by a factor of 1000? I have an error message on the screen?) Please define "not working" and you may want to refrain from using that expression in future questions. – Paul Maxwell Nov 16 '17 at 23:55
  • @ Used_By_Already_ sorry for misunderstanding. I meant, I did not get median from this code.I edit my question now. Thanks. – Maryam Yasaei Nov 16 '17 at 23:59
  • the row that is returned by that query has a value for [UnitRate] that is the median for that column. i.e. `The where clause determines the row that carries the median` – Paul Maxwell Nov 17 '17 at 00:02
  • @Used_By_Already thanks , but how should I get the value of those rows? – Maryam Yasaei Nov 17 '17 at 00:04
  • If you expect to return the median ON ALL ROWS, then suggest you need a subquery to give you that single value, and then join it back to the other rows. You may want to push some of these calculations into CTEs to assist with efficiency. – Paul Maxwell Nov 17 '17 at 00:12

2 Answers2

0

EDIT

SQL Fiddle

CREATE TABLE Table1
    ([somevalue] int)
;
    
INSERT INTO Table1
    ([somevalue])
VALUES
    (141),
    (325),
    (325),
    (353),
    (3166),
    (325),
    (207),
    (141),
    (3166),
    (161)
;

Query 1:

with cte as (
  select *
  , row_number() over(order by somevalue) as RowNum
  , count(*) over() as RowCnt
  from table1
  )
select
*
from CTE
WHERE   RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2) 

| somevalue | RowNum | RowCnt |
|-----------|--------|--------|
|       325 |      5 |     10 |
|       325 |      6 |     10 |

Please consider the following small example. There are 7 rows of data, the median is the "midpoint" of those, so the where clause uses a row number compared to row count, and returns just that midpoint valuse. That value (67) repesents the median of that small sample.

SQL Fiddle

MS SQL Server 2014 Schema Setup:

CREATE TABLE Table1
    ([somevalue] int)
;
    
INSERT INTO Table1
    ([somevalue])
VALUES
    (2),
    (45),
    (67),
    (89),
    (4567),
    (6),
    (1290)
;

Query 1:

with cte as (
  select *
  , row_number() over(order by somevalue) as RowNum
  , count(*) over() as RowCnt
  from table1
  )
select
*
from CTE
WHERE   RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2) 

Results:

| somevalue | RowNum | RowCnt |
|-----------|--------|--------|
|        67 |      4 |      7 |
Community
  • 1
  • 1
Paul Maxwell
  • 26,417
  • 3
  • 25
  • 46
  • 1
    The median is 67 in this specific example because the number of rows are odd. But if we add an additional row to the dataset- for example the value 1- the median should be the sum of the middle most numbers divided by 2: (45 + 67) / 2 = 56. Instead this algorithm returns 67 again. – Zorkolot Nov 17 '17 at 00:20
  • @Zorkolot Whilst I am not disagreeing with the good point you make, some do not agree with averaging the midpoint is the correct method. What I was attempting to point out is that `the row returned` should represent the median. – Paul Maxwell Nov 17 '17 at 00:26
  • There's a way to do it with rownumbers... https://stackoverflow.com/questions/1342898/function-to-calculate-median-in-sql-server is the way I've done it. – Zorkolot Nov 17 '17 at 00:31
  • @Zorkolot yes, thanks, that's appropriate if you accept that averaging to 2 rows closest to midpoint if the method you want to adopt. – Paul Maxwell Nov 17 '17 at 00:33
  • Thanks @Used_By_Already for your responds. but I think my problem is that I wanted to get Median for top 10 as you see in selection part but this does not give me median for top 10 and the median it gives me is strange. Any idea? Thanks – Maryam Yasaei Nov 17 '17 at 02:03
  • 141 325 325 353 3166 325 207 141 3166 161 for these 10 it give me 5 & 6 as row number and 5000 for median @Used_By_Already – Maryam Yasaei Nov 17 '17 at 02:04
  • how on earth do you get 5000 as the median? what calculation are you using for that number? – Paul Maxwell Nov 17 '17 at 02:09
  • also , I tried with Asc and Desc ordering rows , it gives me same result.. – Maryam Yasaei Nov 17 '17 at 02:17
  • sorry, but I do not follow what your attempting to send me. Is `5000` the **WANTED RESULT?** (from the 10 values) If you look at the edited answer you will see that rows 5 & 6 are both `325`. Look the sqlfiddle http://sqlfiddle.com/#!6/da777/1 – Paul Maxwell Nov 17 '17 at 02:26
  • yes , you are right two 325 , my bad.@Used_By_Already. Thanks .But still need to work on it. – Maryam Yasaei Nov 17 '17 at 02:44
0

(sorry for using a second answer, but it will get lost if just added to the earlier one)

I am really not certain what the expected output of your query is. But I note that you are using TOP(10) and for that to work you must have an order by otherwise the result is indeterminate for the first 10 rows.

While the following may produce many more rows than you need, perhaps it will help lead to a solution.

WITH Basis as (
      SELECT
            JobName
          , Client
          , AssetClass
          , AssetType
          , AssetSubType
          , Component
          , ComponentType
          , ComponentSubType
          , UnitRate
          , ROW_NUMBER() OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType
                               ORDER BY UnitRate)
            AS [rownum]
      FROM [dbo].[ReplaceCost_DirectCost_Details] rdd
      WHERE client = 'APV_Ballina_Shire_Council_Old'
      AND UnitRate IS NOT NULL
      AND UnitRate <> 0
     )
, Top10s as (
      SELECT
            JobName
          , Client
          , AssetClass
          , AssetType
          , AssetSubType
          , Component
          , ComponentType
          , ComponentSubType
          , UnitRate
          , rownum
          , COUNT(*) OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType)
            AS rowcnt
      FROM Basis
      WHERE rownum <= 10
      )
, Medians as (
      SELECT
            JobName
          , Client
          , AssetClass
          , AssetType
          , AssetSubType
          , Component
          , ComponentType
          , ComponentSubType
          , AVG(UnitRate) AS Median
      FROM Top10s
      WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
      GROUP BY
            JobName
          , Client
          , AssetClass
          , AssetType
          , AssetSubType
          , Component
          , ComponentType
          , ComponentSubType
          , AVG(UnitRate) 
      )
SELECT
      JobName
    , Client
    , AssetClass
    , AssetType
    , AssetSubType
    , Component
    , ComponentType
    , ComponentSubType
    , UnitRate
    , rownum
    , rowcnt
    , MAX(UnitRate) OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType) AS [maxfinalunitrate]
    , MIN(UnitRate) OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType) AS [minfinalunitrate]
    , AVG(UnitRate) OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType) AS [meanfinalunitrate]
    , Medians.Median
FROM Top10s t
JOIN Medians m ON t.JobName = m.JobName
    AND t.Client = m.Client
    AND t.AssetClass = m.AssetClass
    AND t.AssetType = m.AssetType
    AND t.AssetSubType = m.AssetSubType
    AND t.Component = m.Component
    AND t.ComponentType = m.ComponentType
    AND t.ComponentSubType = m.ComponentSubType
;
Paul Maxwell
  • 26,417
  • 3
  • 25
  • 46