0

I have a table with the following structure.

create table FindMedians 
( GroupByColumn varchar(100) 
, TimeInterval_1 int 
, TimeInterval_2 int 
, TimeInterval_3 int 
);

I need to find the medians of each of the time intervals for each group. I've been calculating the medians for each column separately and UNIONing them, and then PIVOTing to get a final result as:

GroupByColumn     Median1     Median2     Median3

using a query given in solution to Function to Calculate Median in Sql Server

Note: I'm just using the query, I haven't created a function.

The original table has close 500K rows, and trying to calculate the medians separately for each column is slow. Is there a well performing way that would give me the medians of all the columns in a single query, without having to calculate separately for each column?

Thanks

Community
  • 1
  • 1
Venu
  • 3
  • 3

2 Answers2

1

As calculating a median requires sorting the data, unless your data is in order, then there isn't really any alternative to sorting and calculating the medians independently.

podiluska
  • 49,221
  • 7
  • 86
  • 96
0

DOn't worry about performance. The ranking functions should be sorting only the columns that are needed, not the entire data set. The performance is going to be affected primarily by how much of the data fits in memory. Since the original data fits in memory, and the ranking for each column should be an in-memory sort, performance should be pretty good.

For the record, there are O(n) algorithms for finding a median. I am not aware of any databases that implements them, though.

Gordon Linoff
  • 1,122,135
  • 50
  • 484
  • 624