Iterator-only solution
foslok's solution is definitely fine, but I wanted to play around and make a version of this with generators. It only stores a deque of length window_size as it iterates through the original list, then finds the n largest values in each window and calculates their mean.
import itertools as it
from collections import deque
from heapq import nlargest
from statistics import mean
def windowed(iterable, n):
    _iter = iter(iterable)
    d = deque(it.islice(_iter, n), maxlen=n)
    yield tuple(d)
    for i in _iter:
        d.append(i)
        yield tuple(d)
a = [3, 5, 2, 7, 5, 3, 6, 8, 4]
means = [mean(nlargest(2, w)) for w in windowed(a, 3)]
print(means)
result:
[4, 6, 6, 6, 5.5, 7, 7]
So to change either the window size or the number of largest elements, just change the arguments to the respective functions (windowed and nlargest). This approach also avoids slicing, so it can be more easily applied to iterables that you can't or don't want to slice.
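To make the "can't slice" point concrete, here is a small sketch that feeds a generator straight into the same window function, with a different window size and a different number of largest values. The readings() generator is a made-up stand-in for any one-shot data source; it is not part of the original question.

```python
from collections import deque
from heapq import nlargest
from itertools import islice
from statistics import mean


def windowed(iterable, n):
    """Yield successive windows of up to n items, same logic as above."""
    iterator = iter(iterable)
    window = deque(islice(iterator, n), maxlen=n)
    yield tuple(window)
    for item in iterator:
        window.append(item)  # maxlen=n drops the oldest item automatically
        yield tuple(window)


def readings():
    """Hypothetical one-shot source: no len(), no slicing."""
    yield from [3, 5, 2, 7, 5, 3, 6, 8, 4]


# Window size 4, mean of the 3 largest values in each window.
means = [mean(nlargest(3, w)) for w in windowed(readings(), 4)]
print(means)
```

Only the two arguments changed (4 for the window, 3 for nlargest), and the input is never indexed or copied into a list.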
Timings
def deque_version(iterable, n, k):
    means = (mean(nlargest(n, w)) for w in windowed(iterable, k))
    for m in means:
        pass
def tee_version(iterable, n, k):
    means = (mean(nlargest(n, w)) for w in windowed_iterator(iterable, k))
    for m in means:
        pass
a = list(range(10**5))
n = 3
k = 2
print("n={} k={}".format(n, k))
print("Deque")
%timeit deque_version(a, n, k)
print("Tee")
%timeit tee_version(a, n, k)
n = 1000
k = 2
print("n={} k={}".format(n, k))
print("Deque")
%timeit deque_version(a, n, k)
print("Tee")
%timeit tee_version(a, n, k)
n = 50
k = 25
print("n={} k={}".format(n, k))
print("Deque")
%timeit deque_version(a, n, k)
print("Tee")
%timeit tee_version(a, n, k)
result:
n=3 k=2
Deque
1.28 s ± 3.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Tee
1.28 s ± 16.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
n=1000 k=2
Deque
1.28 s ± 8.72 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Tee
1.27 s ± 2.92 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
n=50 k=25
Deque
2.46 s ± 10.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Tee
2.47 s ± 2.45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
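So apparently itertools tee vs. deque doesn't matter much: the two versions land within roughly 10 ms of each other in all three configurations, on runs of 1.3 to 2.5 s.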
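The windowed_iterator function used in tee_version is not defined in this answer. For anyone who wants to rerun the comparison, a tee-based window could look roughly like the sketch below. This is my assumption about its general shape, not necessarily the exact function that was timed, and the name windowed_tee is made up for illustration.

```python
from itertools import islice, tee


def windowed_tee(iterable, n):
    """Yield length-n windows using n staggered copies of the iterator.

    Illustrative sketch only; not necessarily the function timed above.
    """
    copies = tee(iterable, n)
    # Advance the i-th copy by i items so consecutive copies are offset by one.
    for offset, copy in enumerate(copies):
        next(islice(copy, offset, offset), None)
    # zip stops as soon as the most advanced copy is exhausted.
    return zip(*copies)


print(list(windowed_tee([3, 5, 2, 7], 3)))
```

One behavioural difference to keep in mind: when the input is shorter than the window, this sketch yields nothing, while the deque version above yields a single short tuple.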