First off, there's more than one way to do this.
It's not the most efficient speed-wise, but using scipy.ndimage.generic_filter
will allow you to easily apply an arbitrary python function over a moving window.
As a quick example:
result = scipy.ndimage.generic_filter(data, np.std, size=2*radius)
Note that the boundary conditions can be controlled by mode
kwarg.
Another way to do this is to use some various striding tricks to make a view of the array that's effectively a moving window, and then apply np.std
along the last axis. (Note: this is taken from one of my previous answers here: https://stackoverflow.com/a/4947453/325565)
def strided_sliding_std_dev(data, radius=5):
windowed = rolling_window(data, (2*radius, 2*radius))
shape = windowed.shape
windowed = windowed.reshape(shape[0], shape[1], -1)
return windowed.std(axis=-1)
def rolling_window(a, window):
"""Takes a numpy array *a* and a sequence of (or single) *window* lengths
and returns a view of *a* that represents a moving window."""
if not hasattr(window, '__iter__'):
return rolling_window_lastaxis(a, window)
for i, win in enumerate(window):
if win > 1:
a = a.swapaxes(i, -1)
a = rolling_window_lastaxis(a, win)
a = a.swapaxes(-2, i)
return a
def rolling_window_lastaxis(a, window):
"""Directly taken from Erik Rigtorp's post to numpy-discussion.
<http://www.mail-archive.com/numpy-discussion@scipy.org/msg29450.html>"""
if window < 1:
raise ValueError, "`window` must be at least 1."
if window > a.shape[-1]:
raise ValueError, "`window` is too long."
shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
strides = a.strides + (a.strides[-1],)
return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
It's a bit hard to understand what's going on here at first glance. Not to plug one of my own answers, but I don't want to re-type the explanation, so have a look here: https://stackoverflow.com/a/4924433/325565 if you haven't see these sorts of "striding" tricks before.
If we compare timings with a 100x100 array of random floats with a radius
of 5, it's ~10x faster than the original or the generic_filter
version. However, you have no flexibility in the boundary conditions with this version. (It's identical to what you're currently doing, while the generic_filter
version gives you lots of flexibility at the expense of speed.)
# Your original function with nested loops
In [21]: %timeit sliding_std_dev(data)
1 loops, best of 3: 237 ms per loop
# Using scipy.ndimage.generic_filter
In [22]: %timeit ndimage_std_dev(data)
1 loops, best of 3: 244 ms per loop
# The "stride-tricks" version above
In [23]: %timeit strided_sliding_std_dev(data)
100 loops, best of 3: 15.4 ms per loop
# Ophion's version that uses `np.take`
In [24]: %timeit new_std_dev(data)
100 loops, best of 3: 19.3 ms per loop
The downside to the "stride-tricks" version is that unlike "normal" strided rolling window tricks, this version does make a copy, and it's much larger than the original array. You will run into memory problems if you use this on a large array! (On a side note, it's basically equivalent to @Ophion's answer in terms of memory use and speed. It's just a different approach to doing the same thing.)