What is the shortest way to calculate running median in python?

Question

I need to calculate a running median in python. Currently I do it like this:

med_y = []
med_x = []
for i in numpy.arange(240, 380, 1):

    med_y.append(numpy.median(dy[(dx > i)*(dx < i+20)]))
    med_x.append(i + 10)

Here the data is stored in dx (x-coordinate) and dy (y-coordinate) and the median is taken over the dy and plotted agaist the dx (which has to be shifted by window/2). Assuming even spacing for x and window size here is 20.

Is there a shorter way?

For example, running average can be done like this:

cs = numpy.cumsum(dy)
y_20 = (cs[20:] - cs[:-20])/20.0
x_20 = dx[10:-10]

Predefined running X functions in site-packages are also ok.

score 0 · Answer 1 · answered Dec 18 '19 at 13:45

Googling after writing the question revealed signal prosessing function called medfilt, e.g. scipy.signal.medfilt with two input parameters: list of numbers and window size.

It works when:

window size is uneven
more than (window+1)/2 distance from edges

Near the edges it gives the minimum inside window/2. I guess the reason is that it is meant originally for reducing black error pixels in images and you want the edges to be black.

For example:

from scipy.signal import medfilt 
values = [1,1,1,0,1,1,1,1,1,1,1,2,1,1,1,10,1,1,1,1,1,1,1,1,1,1,0,1]
print medfilt(values,7)

works perfectly for values[4:-4] and gives min(values[:4]) and min(values[-4:]) for edges. The output of above example is:

output = [0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]

Ok, learned the hard way what 'zero padded' means in the documentation. The behavior at the edges is due to the zeros added to the window when it goes outside the data. — Juha, Dec 19 '19 at 07:07

Juha · Answer 2 · 2019-12-19T08:35:20.410

This is the shortest:

from scipy.ndimage import median_filter
values = [1,1,1,0,1,1,1,1,1,1,1,2,1,1,1,10,1,1,1,1,1,1,1,1,1,1,0,1]
print median_filter(values, 7, mode='mirror')

and it works correctly in the edges (or you can choose how it works at the edges).

And any general running X is done like this (running standard deviation as an example):

import numpy
from scipy.ndimage.filters import generic_filter
values = numpy.array([0,1,2,3,4,5,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1]).astype('float')
print(generic_filter(values, numpy.std, size=7, mode='mirror'))

In the above, float input type is important.

Useful links:

https://nickc1.github.io/python,/matlab/2016/05/17/Standard-Deviation-(Filters)-in-Matlab-and-Python.html

improving code efficiency: standard deviation on sliding windows

What is the shortest way to calculate running median in python?

2 Answers2