I have a Numpy array of shape (6,2):

[[ 0, 1],
 [10,11],
 [20,21],
 [30,31],
 [40,41],
 [50,51]]

I need a sliding window with step size 1 and window size 3 like this:

[[ 0, 1,10,11,20,21],
 [10,11,20,21,30,31],
 [20,21,30,31,40,41],
 [30,31,40,41,50,51]]

I'm looking for a Numpy solution. If your solution could parametrise the shape of the original array as well as the window size and step size, that'd be great.

I found this related answer, "Using strides for an efficient moving average filter", but I don't see how to specify the step size there or how to collapse the window from 3D into a contiguous 2D array. There is also "Rolling or sliding window iterator?", but that's in pure Python and I'm not sure how efficient it is. Also, it supports elements but does not join them together at the end if each element has multiple features.



You can do a vectorized sliding window in numpy using fancy indexing.

>>> import numpy as np

>>> a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])

>>> a
array([[ 0,  1],
       [10, 11],
       [20, 21],                      #define our 2d numpy array
       [30, 31],
       [40, 41],
       [50, 51]])

>>> a = a.flatten()

>>> a
array([ 0,  1, 10, 11, 20, 21, 30, 31, 40, 41, 50, 51])    #flattened numpy array

>>> indexer = np.arange(6)[None, :] + 2*np.arange(4)[:, None]

>>> indexer
array([[ 0,  1,  2,  3,  4,  5],
       [ 2,  3,  4,  5,  6,  7],            #sliding window indices
       [ 4,  5,  6,  7,  8,  9],
       [ 6,  7,  8,  9, 10, 11]])

>>> a[indexer]
array([[ 0,  1, 10, 11, 20, 21],
       [10, 11, 20, 21, 30, 31],            #values of a over sliding window
       [20, 21, 30, 31, 40, 41],
       [30, 31, 40, 41, 50, 51]])

>>> np.sum(a[indexer], axis=1)
array([ 63, 123, 183, 243])         #sum of values in 'a' under the sliding window.

Explanation for what this code is doing.

np.arange(6)[None, :] creates a row vector containing 0 through 5, and np.arange(4)[:, None] creates a column vector containing 0 through 3. Adding them broadcasts to a 4x6 matrix in which each row (six elements) holds the indices for one window and the number of rows (four) is the number of windows. Multiplying the column vector by 2 makes the window slide 2 elements at a time, which is necessary to advance by one 2-element row of the original array per window. Using NumPy fancy indexing, you can pass this index matrix to the flattened array and then compute aggregates such as sum over each window.
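The hard-coded 6, 4, and 2 above can be derived from the array shape, window size, and step size. The following is a minimal sketch of that idea; the function name `window_index` and its parameters are made up for illustration and are not part of NumPy.

```python
import numpy as np

def window_index(a, width=3, stepsize=1):
    """Return sliding windows over the rows of a 2D array, each flattened to one row.

    Hypothetical helper: generalizes the fancy-indexing example above.
    """
    n_rows, n_cols = a.shape
    flat = a.ravel()
    # Offsets inside one window: `width` rows of `n_cols` elements each.
    within = np.arange(width * n_cols)[None, :]
    # Position in the flattened array where each window starts.
    starts = n_cols * np.arange(0, n_rows - width + 1, stepsize)[:, None]
    # Broadcasting gives an index matrix of shape (n_windows, width * n_cols).
    return flat[starts + within]

a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
result = window_index(a, width=3, stepsize=1)
```

With `stepsize=2` the same call keeps only the windows that start at rows 0 and 2. Fancy indexing always copies, so the result is independent of `a`.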

In [1]: import numpy as np

In [2]: a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])

In [3]: w = np.hstack((a[:-2],a[1:-1],a[2:]))

In [4]: w
array([[ 0,  1, 10, 11, 20, 21],
       [10, 11, 20, 21, 30, 31],
       [20, 21, 30, 31, 40, 41],
       [30, 31, 40, 41, 50, 51]])

You could write this as a function like so:

def window_stack(a, stepsize=1, width=3):
    n = a.shape[0]
    # np.hstack needs a sequence (list or tuple); passing a generator is deprecated since NumPy 1.16
    return np.hstack([a[i:1+n+i-width:stepsize] for i in range(width)])

This doesn't really depend on the shape of the original array, as long as a.ndim == 2. Note that I never use either dimension's length in the interactive version. The second dimension of the shape is irrelevant; each row can be as long as you want. Thanks to @Jaime's suggestion, you can do it without checking the shape at all:

def window_stack(a, stepsize=1, width=3):
    # `or None` turns a stop of 0 into "slice to the end" for the last offset
    return np.hstack([a[i:1+i-width or None:stepsize] for i in range(width)])
  • Fixed it. I had the +1 in there but then removed it in another edit. Added commentary related to that. – askewchan Mar 30 '13 at 19:39
  • It doesn't work with stepsize > 1. Anyway, most people only need stepsize 1 so it's good enough. I just remove that as a parameter – siamii Mar 30 '13 at 19:59
  • For the `[:-i]` not working thing, I have seen `[:-i or None]` used. – Jaime Mar 30 '13 at 21:07
  • what if `a.ndim = 1`? is there a generic approach? – leoschet Sep 04 '18 at 11:26
  • @leoschet, it should work as-is, but it will interpret `a` as one row, and I suppose you want it to behave as one column. A quick solution is to make it a column, with `a[:, None]` or `a.reshape(-1, 1)`. But really the better solution is the [indexer answer](https://stackoverflow.com/a/42258242/1730674). – askewchan Sep 11 '18 at 16:14
  • exactly, my solution was to switch between `hstack` and `vstack`, I'll check your solution out! – leoschet Sep 12 '18 at 12:21
  • @askewchan any version without using `np`? – loretoparisi May 17 '19 at 12:39
  • @loretoparisi, it should work without much change: start by replacing the call to `np.hstack( ... )` with a list comprehension: `[ ... ]`. You may need a `zip` in there if you need to transpose it. – askewchan Jun 24 '19 at 18:44
  • This code now produces `FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.` One should surround the arg to `np.hstack` with brackets. – Björn Lindqvist Sep 05 '19 at 21:09

One solution is

np.lib.stride_tricks.as_strided(a, shape=(4,6), strides=(8,4))

Using strides is intuitive when you start thinking in terms of pointers/addresses.

The as_strided() function takes three main arguments:

  1. data
  2. shape
  3. strides

data is the array on which we would operate.

To use as_strided() for implementing sliding window functions, we must compute the shape of the output beforehand. In the question, (4,6) is the shape of output. If the dimensions are not correct, we end up reading garbage values. This is because we are accessing data by moving the pointer by a couple of bytes (depending on data type).

Determining the correct value of strides is essential to get the expected results. Before calculating strides, find the memory occupied by each element using arr.strides[-1]. In this example, one element occupies 4 bytes. NumPy arrays are stored in row-major order by default, so the first element of the next row is right next to the last element of the current row.


0 , 1 | 10, 11 | ...

10 is right next to 1.

Imagine the 2D array reshaped to 1D (this is acceptable because the data is stored in row-major format). The first element of each row in the output is every second element of the 1D array, that is, the elements at the even 0-based indices 0, 2, 4, and so on.

0, 10, 20, 30, ..

Therefore, the number of bytes we need to move in memory to get from 0 to 10, from 10 to 20, and so on is 2 * the size of one element. Each row has a stride of 2 * 4 bytes = 8 bytes. For a given row in the output, all the elements are adjacent to each other in our imaginary 1D array. To get the next element in a row, take one stride equal to the size of an element, so the column stride is 4 bytes.

Therefore, strides=(8,4)

An alternate explanation: the output has a shape of (4,6) and the column stride is 4 bytes. So the first row starts at index 0 and has 6 elements, each spaced 4 bytes apart. After the first row is collected, the second row starts 8 bytes after the start of the first row. The third row starts 8 bytes after the start of the second row, and so on.

The shape determines the number of rows and columns we need. The strides determine the memory steps needed to start a new row and to collect the next element within a row.
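The stride arithmetic above can be computed from the array itself rather than hard-coded. The following is a minimal sketch, assuming a C-contiguous 2D input; the helper name `window_strided` and its parameters are hypothetical. It uses `a.itemsize` for the element size, which for a contiguous array is the same value the text reads from `arr.strides[-1]`.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def window_strided(a, width=3, stepsize=1):
    """Return a read-only strided view of flattened row windows (hypothetical helper)."""
    a = np.ascontiguousarray(a)          # the stride math below assumes row-major layout
    n_rows, n_cols = a.shape
    if width > n_rows:
        raise ValueError("width must not exceed the number of rows")
    itemsize = a.itemsize                # bytes per element (4 for int32, 8 for int64)
    n_windows = (n_rows - width) // stepsize + 1
    shape = (n_windows, width * n_cols)
    # Moving to the next window skips `stepsize` input rows;
    # moving to the next column skips one element.
    strides = (stepsize * n_cols * itemsize, itemsize)
    return as_strided(a, shape=shape, strides=strides, writeable=False)

a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
result = window_strided(a, width=3, stepsize=1)
```

Passing `writeable=False` is a defensive choice: the windows overlap in memory, so writing through the view would change several windows at once.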

    Note that if you omit the 3rd argument, then the `strides` value is taken from the array you pass in as the first argument. That saves you having to figure this out yourself. – Martijn Pieters Dec 11 '18 at 15:36

A short list comprehension is possible with more_itertools.windowed¹:


import numpy as np
import more_itertools as mit

a = [["00","01"],
     ["10","11"],
     ["20","21"],
     ["30","31"],
     ["40","41"],
     ["50","51"]]

b = np.array(a)


np.array([list(mit.flatten(w)) for w in mit.windowed(a, n=3)])


np.array([[i for item in w for i in item] for w in mit.windowed(a, n=3)])


np.array(list(mit.windowed(b.ravel(), n=6)))


array([['00', '01', '10', '11', '20', '21'],
       ['10', '11', '20', '21', '30', '31'],
       ['20', '21', '30', '31', '40', '41'],
       ['30', '31', '40', '41', '50', '51']], 

Sliding windows of size n=3 are created and flattened. Note the default step size is more_itertools.windowed(..., step=1).


As an array, the accepted answer is fastest.

%timeit np.hstack((a[:-2], a[1:-1], a[2:]))
# 37.5 µs ± 1.88 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.hstack((b[:-2], b[1:-1], b[2:]))
# 12.9 µs ± 166 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.array([list(mit.flatten(w)) for w in mit.windowed(a, n=3)])
# 23.2 µs ± 1.73 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.array([[i for item in w for i in item] for w in mit.windowed(a, n=3)])
# 21.2 µs ± 999 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.array(list(mit.windowed(b.ravel(), n=6)))
# 43.4 µs ± 374 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

¹ more_itertools is a third-party library that implements itertools recipes and many helpful tools.

Starting in NumPy 1.20, we can use the new sliding_window_view to slide/roll over windows of elements. Based on the same idea as user42541's answer, we can do:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# values = np.array([[0,1], [10,11], [20,21], [30,31], [40,41], [50,51]])
sliding_window_view(values.flatten(), window_shape = 2*3)[::2]
# array([[ 0,  1, 10, 11, 20, 21],
#        [10, 11, 20, 21, 30, 31],
#        [20, 21, 30, 31, 40, 41],
#        [30, 31, 40, 41, 50, 51]])

where 2 is the size of sub-arrays and 3 the window.

Details of the intermediate steps:

# values = np.array([[0,1], [10,11], [20,21], [30,31], [40,41], [50,51]])

# Flatten the array (concatenate sub-arrays):
# array([ 0,  1, 10, 11, 20, 21, 30, 31, 40, 41, 50, 51])

# Slide through windows of size 2*3=6:
sliding_window_view(values.flatten(), 2*3)
# array([[ 0,  1, 10, 11, 20, 21],
#        [ 1, 10, 11, 20, 21, 30],
#        [10, 11, 20, 21, 30, 31],
#        [11, 20, 21, 30, 31, 40],
#        [20, 21, 30, 31, 40, 41],
#        [21, 30, 31, 40, 41, 50],
#        [30, 31, 40, 41, 50, 51]])

# Only keep even rows (1 row in 2 - if sub-arrays have a size of x, then replace 2 with x):
sliding_window_view(values.flatten(), 2*3)[::2]
# array([[ 0,  1, 10, 11, 20, 21],
#        [10, 11, 20, 21, 30, 31],
#        [20, 21, 30, 31, 40, 41],
#        [30, 31, 40, 41, 50, 51]])
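
The two steps above (slide over the flattened array, then keep one row in every sub-array length) can be wrapped in a function that also takes a step size. This is a minimal sketch assuming NumPy >= 1.20 and a 2D input; the name `window_view` and its parameters are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_view(values, width=3, stepsize=1):
    """Flattened windows of `width` rows, advancing `stepsize` rows (hypothetical helper)."""
    n_cols = values.shape[1]
    # All windows of width * n_cols consecutive elements of the flattened array.
    windows = sliding_window_view(values.ravel(), window_shape=width * n_cols)
    # Keep only windows that start at the beginning of a row, every `stepsize` rows.
    return windows[::stepsize * n_cols]

values = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
result = window_view(values, width=3, stepsize=1)
```

`ravel()` is used instead of `flatten()` so that a contiguous input is not copied; the returned windows are a read-only view.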

Here is a one-liner using NumPy >= v1.17:

rowsJoined = 3

splits = np.vstack(np.split(x,np.array([[i, i + rowsJoined] for i in range(x.shape[0] - (rowsJoined - 1))]).reshape(-1))).reshape(-1, rowsJoined * x.shape[1]) 


x = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])


[[ 0  1 10 11 20 21]
 [10 11 20 21 30 31]
 [20 21 30 31 40 41]
 [30 31 40 41 50 51]]

Test Performance On Large Array

import numpy as np
import time

x = np.array(range(1000)).reshape(-1, 2)
rowsJoined = 3

all_t = 0.
for i in range(1000):
    start_ = time.time()
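    # Same expression as the one-liner above; the inner index is named j to avoid shadowing the loop variable
    splits = np.vstack(
        np.split(x, np.array([[j, j + rowsJoined] for j in range(x.shape[0] - (rowsJoined - 1))]).reshape(-1))
    ).reshape(-1, rowsJoined * x.shape[1])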
    all_t += time.time() - start_

print('Average Time of 1000 Iterations on Array of Shape '
      '1000 x 2 is: {} Seconds.'.format(all_t/1000.))

Performance Result

Average Time of 1000 Iterations on Array of Shape 1000 x 2 is: 0.0016909 Seconds.

As of NumPy version 1.20.0, this can be done using

np.lib.stride_tricks.sliding_window_view(arr, winsize)


>>> arr = np.arange(0, 9).reshape((3, 3))
>>> np.lib.stride_tricks.sliding_window_view(arr, (2, 2))

array([[[[0, 1],
         [3, 4]],

        [[1, 2],
         [4, 5]]],

       [[[3, 4],
         [6, 7]],

        [[4, 5],
         [7, 8]]]])

You can read more about it in the NumPy documentation for sliding_window_view.
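
To get the flattened (4, 6) layout asked for in the question, sliding_window_view can also slide along the row axis only. The sketch below assumes NumPy >= 1.20; the variable names are illustrative. The window axis is appended as the last dimension, so it has to be moved in front of the feature axis before the two are merged.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
width = 3

# Sliding along axis 0 gives shape (4, 2, 3): windows[i, j, k] == a[i + k, j].
windows = sliding_window_view(a, width, axis=0)

# Reorder to (window, row-in-window, feature), then merge the last two axes.
result = windows.transpose(0, 2, 1).reshape(len(windows), -1)
```

The final `reshape` has to copy the data, because the transposed view is not contiguous. A step size can be applied by slicing `windows[::stepsize]` before the transpose.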

This is a pure Python implementation:

def sliding_window(arr, window=3):
    i = iter(arr)
    a = []
    for e in range(0, window): a.append(next(i))
    yield a
    for e in i:
        a = a[1:] + [e]
        yield a

An example:

# flatten array
flatten = lambda l: [item for sublist in l for item in sublist]

a = [[0,1], [10,11], [20,21], [30,31], [40,41], [50,51]]
w = sliding_window(a, window=3)
print( list(map(flatten,w)) )

[[0, 1, 10, 11, 20, 21], [10, 11, 20, 21, 30, 31], [20, 21, 30, 31, 40, 41], [30, 31, 40, 41, 50, 51]]
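
The generator above always advances by one row. A variant with a step size, which also flattens each window as it goes, could look like the following sketch. It uses only the standard library; the name `sliding_window_flat` and its parameters are hypothetical.

```python
from collections import deque
from itertools import islice

def sliding_window_flat(rows, window=3, step=1):
    """Yield flattened windows of `window` consecutive rows, advancing `step` rows each time."""
    it = iter(rows)
    # A bounded deque drops the oldest row automatically when a new one is appended.
    buf = deque(islice(it, window), maxlen=window)
    if len(buf) < window:
        return  # not enough rows for even one window
    yield [x for row in buf for x in row]
    for count, row in enumerate(it, start=1):
        buf.append(row)
        # Emit a window only after `step` new rows have arrived.
        if count % step == 0:
            yield [x for r in buf for x in r]

a = [[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]]
windows = list(sliding_window_flat(a, window=3, step=1))
```

Using a `deque` with `maxlen` avoids rebuilding the list with `a[1:] + [e]` on every iteration.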


import timeit
def benchmark():
  a = [[0,1], [10,11], [20,21], [30,31], [40,41], [50,51]]
  # Consume the generator; otherwise no windows are actually produced
  list(sliding_window(a, window=3))

times = timeit.Timer(benchmark).repeat(3, number=1000)
time_taken = min(times) / 1000

