Questions tagged [median]

The median is the 'middle' value of a sorted set of values. If the number of values is even, the median is the mean of the two 'middle' values.
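As a minimal sketch of this definition in Python (the function name `median_of` is hypothetical and chosen only for illustration):

```python
def median_of(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)              # the definition applies to sorted data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty collection is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]               # odd count: the single middle value
    # even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2
```

With this sketch, `median_of([3, 1, 2])` gives 2 and `median_of([4, 1, 3, 2])` gives 2.5. In practice, Python's standard library already offers `statistics.median`, which follows the same rule.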

In programming, the median is generally used in its statistical sense. In simple terms, it is the value that has half of the data above it and half below it. It differs from the average (mean) in that the mean can be influenced by extremes at either end of the range. If a few numbers are much greater or much smaller than the rest, the mean can differ significantly from the median. The median can therefore give a better sense of the typical middle case than the mean.
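A small illustration of that robustness, using Python's standard `statistics` module. The numbers are made up for the example and do not come from any real dataset:

```python
from statistics import mean, median

# Made-up sample values, purely illustrative.
typical = [30, 32, 35, 38, 40]
with_outlier = typical + [1000]   # add one extreme value

print("mean:", mean(typical), "median:", median(typical))
print("mean:", mean(with_outlier), "median:", median(with_outlier))
```

Here the single extreme value moves the mean from 35 to roughly 195.8, while the median only shifts from 35 to 36.5.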

The median is also very useful in signal and image processing, in the form of a moving median filter. This filter is usually used to reduce "salt and pepper" noise, as well as spikes, because each output pixel or element contains the median value of the m-by-n neighborhood around the corresponding pixel in the input data.
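A one-dimensional version of such a filter can be sketched as follows. The function name `moving_median` and the edge handling (shrinking the window near the ends) are assumptions made for this example; other edge conventions such as zero padding or reflection are equally common.

```python
from statistics import median

def moving_median(signal, window=3):
    """Apply a 1-D moving median filter with an odd window size.

    Near the edges the window is shrunk to the samples that exist,
    which is only one of several possible edge conventions.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    filtered = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        filtered.append(median(signal[lo:hi]))   # median of the neighborhood
    return filtered

# A flat signal with one upward spike (9) and one downward spike (0).
noisy = [1, 1, 9, 1, 1, 1, 0, 1, 1]
print(moving_median(noisy))
```

Both spikes are replaced by the surrounding value 1, because a single outlier never reaches the middle of a three-sample window. For real signals and images, library routines such as `scipy.signal.medfilt` or `scipy.ndimage.median_filter` are likely a better choice than this pure-Python loop.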

For a more rigorous and thorough explanation, see Wikipedia.

In scientific software for statistical computing and graphics, the median of a numeric vector can be found with the function median, or with quantile using prob = 0.5.

1286 questions
12 votes, 5 answers

find median in a fixed-size moving window along a long sequence of data

Given a sequence of data (it may have duplicates), a fixed-sized moving window, move the window at each iteration from the start of the data sequence, such that (1) the oldest data element is removed from the window and a new data element is…
user1002288 • 4,502 • 9 • 45 • 76
12 votes, 4 answers

Fast alternative for numpy.median.reduceat

Relating to this answer, is there a fast way to compute medians over an array that has groups with an unequal number of elements? E.g.: data = [1.00, 1.05, 1.30, 1.20, 1.06, 1.54, 1.33, 1.87, 1.67, ... ] index = [0, 0, 1, 1, 1, 1, …
Jean-Paul • 15,930 • 7 • 55 • 78
12 votes, 2 answers

Add hline with population median for each facet

I'd like to plot a horizontal facet-wide line with the population median of that facet. I tried the approach without creating a dummy summary table with the following code: require(ggplot2) dt = data.frame(gr = rep(1:2, each = 500), id…
mattek • 754 • 5 • 16
11 votes, 4 answers

Is there a RAM efficient way to calculate the median over a complement set?

I am looking for a RAM-efficient way to calculate the median over a complement set with the help of data.table. For a set of observations from different groups, I am interested in an implementation of a median of "other groups". I.e., if I have a…
11 votes, 3 answers

Get median of array

I have an array that looks like this: let arr = [1,2,3,4,5,6,7,8,9] I know you can get min and max by: let min = arr.min() let max = arr.max() But how do you get the median?
John S • 187 • 2 • 7
11 votes, 3 answers

How to calculate median of a numeric sequence in Google BigQuery efficiently?

I need to calculate median value of a numeric sequence in Google BigQuery efficiently. Is the same possible?
Manish Agrawal • 127 • 1 • 1 • 5
11 votes, 11 answers

median of two sorted arrays

My question is with reference to Method 2 of this link. Here two equal length sorted arrays are given and we have to find the median of the two arrays merged. Algorithm: 1) Calculate the medians m1 and m2 of the input arrays ar1[] and ar2[]…
AvinashK • 3,106 • 5 • 38 • 85
10 votes, 3 answers

Optimal median of medians selection - 3 element blocks vs 5 element blocks?

I'm working on a quicksort-variant implementation based on the Select algorithm for choosing a good pivot element. Conventional wisdom seems to be to divide the array into 5-element blocks, take the median of each, and then recursively apply the…
R.. GitHub STOP HELPING ICE • 195,354 • 31 • 331 • 669
10 votes, 4 answers

Python/Pandas Dataframe replace 0 with median value

I have a python pandas dataframe with several columns and one column has 0 values. I want to replace the 0 values with the median or mean of this column. data is my dataframe artist_hotness is the column mean_artist_hotness =…
jeangelj • 3,328 • 11 • 39 • 86
10 votes, 1 answer

Generic Method to find the median of 3 values

I needed a method to get the median of 3 values, I thought it a good opportunity to write a generic method since I don't really have that practiced. I wrote this and it seems pretty straight-forward, though I get a warning, but it seems to work…
Legato • 578 • 11 • 21
10 votes, 1 answer

ggplot2 boxplot medians aren't plotting as expected

So, I have a fairly large dataset (Dropbox: csv file) that I'm trying to plot using geom_boxplot. The following produces what appears to be a reasonable plot: require(reshape2) require(ggplot2) require(scales) require(grid) require(gridExtra) df <-…
Ryan Pugh • 243 • 1 • 9
10 votes, 3 answers

Find a median of N^2 numbers having memory for N of them

I was trying to learn about distributed computing and came across the problem of finding the median of a large set of numbers: Assume that we have a large set of numbers (let's say the number of elements is N*K) that cannot fit into memory (size N). How do we…
Akshya11235 • 771 • 3 • 9 • 22
10 votes, 4 answers

How can I calculate the median and standard deviation of a bunch stream of numbers in Perl?

In our logfiles we store response times for the requests. What's the most efficient way to calculate the median response time, the "75/90/95% of requests were served in less than N time" numbers etc? (I guess a variation of my question is: What's…
Ask Bjørn Hansen • 6,099 • 2 • 23 • 38
10 votes, 3 answers

How to do median splits within factor levels in R?

Here I make a new column to indicate whether myData is above or below its median ### MedianSplits based on Whole Data #create some test data myDataFrame=data.frame(myData=runif(15),myFactor=rep(c("A","B","C"),5)) #create column showing median…
Dan Goldstein • 21,713 • 17 • 34 • 41
9 votes, 4 answers

How to calculate median of a Map?

For a map where the key represents a number in a sequence and the value counts how often this number appeared in the sequence, what would an implementation of an algorithm in Java look like to calculate the median? For…
Chris • 14,451 • 18 • 70 • 73