3

I am new to sql and have been struggling about nan-median, for example, I have a table with three (million) rows, each with ten numbers (or null):

row1: 1,2,3,null,4,5,6,7,8,9
----------
row2: 2,4,null,6,8,2,1,0,9,10
----------
row3: 1,1,1,1,null,7,2,9,9,9
----------

How to get the nan-median of each row?

Kashif Qureshi
  • 1,414
  • 2
  • 13
  • 20
noringname
  • 33
  • 3
  • this may help. https://stackoverflow.com/questions/1342898/function-to-calculate-median-in-sql-server – Ferdinand Gaspar Aug 31 '17 at 19:49
  • if i understand correctly, that post is about calculating median of a column, I am trying to figure out how to calculate median of a row. – noringname Aug 31 '17 at 20:10
  • Do your rows have a primary key? – MBurnham Aug 31 '17 at 20:30
  • @MBurnham yes I use the auto_increment index as a primary key – noringname Aug 31 '17 at 20:36
  • can you provide schema for table? – MBurnham Aug 31 '17 at 20:40
  • @MBurnham, TB( id (pri), field1, field2......field10 (float NULL) ), basically 11 columns with id column as primary key, three rows, or three million rows... – noringname Aug 31 '17 at 20:45
  • See https://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query – Strawberry Aug 31 '17 at 20:56
  • Looks like you are having problems with the null values. For that use the coalesce function e.g. coalesce (Col1, 0). – WellingtonD Aug 31 '17 at 21:41
  • There is no easy way to get the median without ordering your columns. Unfortunately (for you), mysql doesn't provide a function to order columns, just rows (data of same meaning is usually stored in the same column). You can create rows out of your columns (`select id, col1 from table union all select id, col2 from table union all ...`) and then use any of the median queries for rows. Or, probably faster, write a function that takes 10 values, orders them and thus calculates the median row wise. – Solarflare Aug 31 '17 at 22:26
  • @Solarflare, I guess there is indeed no easy way to do it, the data is a time series (row) with ten station sensors, there is no easy way to switch rows and columns either, I managed to use R to deal with it and it works now. – noringname Aug 31 '17 at 22:36

2 Answers2

0

According to your question, If you have 10 columns that you want to calculate. The trick is to avoid the null values with Coalesce function. Use this logic:

Select (coalesce(Col1, 0) + coalesce(Col2, 0) + coalesce(Col3, 0)+ coalesce(Col4, 0)+coalesce(Col5, 0)+coalesce(Col6, 0)+coalesce(Col7, 0)+coalesce(Col8, 0)+coalesce(Col9, 0)+.coalesce(Col10, 0))/10 
from yourtable
Neels
  • 2,489
  • 6
  • 31
  • 39
WellingtonD
  • 155
  • 4
  • 16
  • 1
    That is the average. The median in the value that seperates the top half and the lower half of elements (e.g.: the 5th element for a list of 9 elements and the average of the 5th and 6th element for a list of 10 elements). – Solarflare Aug 31 '17 at 22:26
0
select(
   (select MAX(cols) from 
    (select TOP 50 PERCENT cols from table ORDER BY cols) )  
        + 
          (select MIN(cols) from 
            (select TOP 50 PERCENT cols from table ORDER BY cols DESC) ) 
     ) / 2  
Median

Select max of first 50% rows by sorting the rows in asc which should be the median in case if column count is even. Select min of next 50% rows by ordering the rows in desc, in case if column count is even, this will be median or else max of (firsthalf+min of secondhalf)/2 will be the median.

pvkcse
  • 101
  • 1
  • 10
  • Good answer but MySQL's dialect does not have `TOP` clause. – Parfait Sep 01 '17 at 00:12
  • Didn't LIMIT 50 PERCENT replaces the same when it comes to mysql? Just a thought, correct me if wrong! @Parfait – pvkcse Sep 01 '17 at 00:19
  • An often [asked question](https://www.google.com/search?client=ubuntu&channel=fs&q=mysql+limit+percent), no such clause. – Parfait Sep 01 '17 at 00:24
  • How about this? Select * from (Select * from (select count(*)/2 from INFORMATION_SCHEME.COLUMNS where table_scheme='schema' and table_name='table') as cnt limit cnt) order by desc limit 1 Doing the same and sorting another 50% in asc as suggested in my answer. I think this will work. – pvkcse Sep 01 '17 at 00:42
  • can the 50 percent order method deal with NULLs automatically? – noringname Sep 01 '17 at 01:50
  • You can use coalesce(val,0) to handle nulls since IsNull() is not an ANSI-SQL compliant. And here, it actually handles null. I'm ordering the records before picking the data. So probably null might not be an obstacle. But there are some special cases like 5,null,null and null,null which wouldn't work. But he didn:t mention anything about what to return when these cases occur. – pvkcse Sep 01 '17 at 02:11
  • He wants the median of 10 columns, not rows. (The median of the content of `(col1, col2, col3... col10) ` per row). – Solarflare Sep 01 '17 at 06:34