I've written this implementation of the median of medians algorithm in Python, but it doesn't seem to output the right result, and it also doesn't seem to run in linear time. Any idea where I went off track?
def select(L):
    if len(L) < 10:
        L.sort()
        return L[int(len(L)/2)]
    S = []
    lIndex = 0
    while lIndex+5 < len(L)-1:
        S.append(L[lIndex:lIndex+5])
        lIndex += 5
    S.append(L[lIndex:])
    Meds = []
    for subList in S:
        print(subList)
        Meds.append(select(subList))
    L2 = select(Meds)
    L1 = L3 = []
    for i in L:
        if i < L2:
            L1.append(i)
        if i > L2:
            L3.append(i)
    if len(L) < len(L1):
        return select(L1)
    elif len(L) > len(L1) + 1:
        return select(L3)
    else:
        return L2
The function is called like so:
from random import shuffle

L = list(range(100))
shuffle(L)
print(select(L))
LE: Sorry. GetMed was a function that simply sorted the list and returned the element at len(list); it should have been select there. I've fixed it now, but I still get the wrong output. As for the indentation, the code runs without errors, and I see nothing wrong with it.
LE2: I'm expecting 50 (for the current L), but it gives me outputs anywhere from 30 to 70, no more, no less (so far).
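For reference, the expected value of 50 can be confirmed with the naive method (sort, then index). This is only a small sketch; the helper name naive_median is made up for illustration and is not part of the code above.

```python
from random import shuffle


def naive_median(values):
    """Sort a copy of values and return the element at index len(values) // 2."""
    return sorted(values)[len(values) // 2]


L = list(range(100))
shuffle(L)
print(naive_median(L))  # 50 for any permutation of range(100)
```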
LE3: Thank you very much, that did the trick; it works now. I'm confused, though. I'm trying to compare this method with the naive one, where I simply sort the array and return the result. From what I've read so far, the time complexity of the select method (deterministic selection) should be O(n). Although I probably can't compete with the optimisations the Python developers made, I did expect closer results than I got. For example, if I change the range of the list to 10000000, select outputs the result in 84.10837116255952 seconds, while the sort-and-return method does it in 18.92556029528825 seconds. What are some good ways to make this algorithm faster?
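The comparison described above can be sketched roughly as follows. This is not the exact code that was timed: select_kth, its rank parameter k, the helper time_call, and the list size of 100000 are all assumptions made for illustration. It shows one possible deterministic selection (groups of 5, median of medians as pivot, separate lists for the smaller and larger elements) next to the sort-and-index approach.

```python
import random
import time


def select_kth(values, k):
    """Return the k-th smallest element (0-based) of values.

    Hypothetical helper: one possible median-of-medians selection,
    not necessarily the version that produced the timings above.
    """
    if len(values) < 10:
        return sorted(values)[k]
    # Median of each group of 5 (the last group may be shorter).
    medians = []
    for i in range(0, len(values), 5):
        group = sorted(values[i:i + 5])
        medians.append(group[len(group) // 2])
    pivot = select_kth(medians, len(medians) // 2)
    # Two separate lists; elements equal to the pivot are only counted.
    lower = [x for x in values if x < pivot]
    upper = [x for x in values if x > pivot]
    equal_count = len(values) - len(lower) - len(upper)
    if k < len(lower):
        return select_kth(lower, k)
    if k < len(lower) + equal_count:
        return pivot
    return select_kth(upper, k - len(lower) - equal_count)


def time_call(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = list(range(100_000))  # assumed size, smaller than in the post
    random.shuffle(data)
    mid = len(data) // 2
    by_select, t_select = time_call(select_kth, data, mid)
    by_sort, t_sort = time_call(lambda: sorted(data)[mid])
    print(by_select == by_sort, t_select, t_sort)
```

Both approaches should return the same element; only the timings differ, and those depend on the machine and the list size.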