
I had an old script that adds new columns to a pandas DataFrame. Each new value is computed from the other columns, and also from the previous value of the column being calculated.

This script used for loops, and it was quite slow. For this reason, I replaced the for loops with recursive functions.

The new script is around 100 times faster than the old one, which is good news. But I am now encountering a limit that I did not have before. As soon as I have more than 29952 rows in my dataset, I get the following error:

"Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)"

I wrote this small script, using plain lists, to reproduce my problem. If I increase the size of the lists (list_lenght) beyond 29952, the script crashes (on my computer):

import random
import sys


def list_generator(min_value, max_value, list_lenght):
    return [random.randrange(min_value, max_value) for _ in range(list_lenght)]


def recursive_function(list_1, list_2, n, result):
    if n == len(list_1):
        return result
    elif list_1[n] <= list_2[n]:
        result.append(1 + result[n - 1])
    else:
        result.append(0)
    return recursive_function(list_1, list_2, (n + 1), result)


list_lenght = 29952  # How to increase this limit without generating an error?

min_value = 10
max_value = 20

list_one = list_generator(min_value, max_value, list_lenght)
list_two = list_generator(min_value, max_value, list_lenght)

# Set recursion limit
sys.setrecursionlimit(list_lenght * 2)

# Compute a new list from list_one and list_two
list_result = recursive_function(list_one, list_two, 1, [0])

I suspect a memory problem. How can I keep using recursive functions in Python while pushing this limit as far as possible?
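
For context, `sys.setrecursionlimit` only raises Python's own depth counter; it does not enlarge the interpreter's C stack, and the Python documentation warns that a limit set too high can lead to a crash. That is a plausible cause of the SIGSEGV here, though not a confirmed one. A minimal sketch of one workaround, assuming CPython on a platform where `threading.stack_size` is supported: run the recursion in a worker thread that is given a larger stack. The helper name `run_with_big_stack` and the values 128 MB and 200,000 are illustrative assumptions, not tuned settings.

```python
import sys
import threading


def recursive_function(list_1, list_2, n, result):
    """Recursive version from the question: one Python frame per row."""
    if n == len(list_1):
        return result
    if list_1[n] <= list_2[n]:
        result.append(1 + result[n - 1])
    else:
        result.append(0)
    return recursive_function(list_1, list_2, n + 1, result)


def run_with_big_stack(func, *args, stack_mb=128, recursion_limit=200_000):
    """Hypothetical helper: call func(*args) in a thread with a larger stack.

    stack_mb and recursion_limit are assumed values chosen for illustration.
    """
    outcome = {}

    def target():
        try:
            outcome["value"] = func(*args)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    old_limit = sys.getrecursionlimit()
    # stack_size() applies to threads created afterwards and returns the old size
    old_stack = threading.stack_size(stack_mb * 1024 * 1024)
    sys.setrecursionlimit(recursion_limit)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        threading.stack_size(old_stack)
        sys.setrecursionlimit(old_limit)

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


if __name__ == "__main__":
    rows = 50_000  # more than the 29952 rows mentioned above
    list_one = [(i * 7) % 10 for i in range(rows)]
    list_two = [(i * 3) % 10 for i in range(rows)]
    result = run_with_big_stack(recursive_function, list_one, list_two, 1, [0])
    print(len(result))
```

This only moves the ceiling rather than removing it: the depth that fits still depends on the stack size and on the Python version, so a plain loop remains the more robust choice for long inputs.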

Thanks in advance


Following the comment from @trincot, here is a version of the code without recursion. It turns out to be faster than the recursive version above, and it no longer hits the row limit.

def no_recursive_function(list_1, list_2, n, result):
    # Value for row n: extend the running count, or reset it to 0
    if list_1[n] <= list_2[n]:
        return 1 + result[n - 1]
    return 0


list_lenght = 29952
min_value = 10
max_value = 20

list_one = list_generator(min_value, max_value, list_lenght)
list_two = list_generator(min_value, max_value, list_lenght)

# No recursion is involved here, so sys.setrecursionlimit() is not needed

list_result_2 = [0]
for n in range(list_lenght - 1):
    result = no_recursive_function(list_one, list_two, n + 1, list_result_2)
    list_result_2.append(result)
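
The same loop can also be written without indexing or a helper call per row. A minimal sketch, assuming Python 3.8+ (for the `initial` argument of `itertools.accumulate`); the names `running_count` and `step` are hypothetical and not part of the original script:

```python
from itertools import accumulate


def running_count(list_1, list_2):
    """Count consecutive rows where list_1[n] <= list_2[n], resetting to 0 otherwise.

    As in the code above, row 0 is always 0 and is never compared.
    """
    def step(previous, pair):
        a, b = pair
        return previous + 1 if a <= b else 0

    # Skip row 0; initial=0 supplies its value and seeds the running count.
    return list(accumulate(zip(list_1[1:], list_2[1:]), step, initial=0))


print(running_count([5, 1, 2, 9, 3], [0, 4, 4, 4, 4]))  # [0, 1, 2, 0, 1]
```

Because each value depends only on the previous one, a single pass is enough and no call stack grows with the number of rows.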
David
  • I am not really sure but did you try memoization before the recursive function? – Abinav R Oct 27 '20 at 19:56
  • @AbinavR, what did you mean by memoization? Because I don't think I tried it – David Oct 27 '20 at 20:03
  • https://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python. I think this should answer your question. – Abinav R Oct 27 '20 at 20:22
  • @AbinavR that is an alternative, not a solution to OP's question – Abhinav Mathur Oct 27 '20 at 20:28
  • *"it was quite slow. For this reason, I replaced the for loops with recursive functions."*: what made you think that recursion is faster than an iterative solution? – trincot Oct 27 '20 at 20:29
  • Can you edit your question and add the loop-version you used before? – trincot Oct 27 '20 at 20:35
  • @trincot, I just noticed that the old script I picked up took about 30 seconds to produce the same result as the new one ... So I deduced that recursion functions were faster than for loops – David Oct 27 '20 at 20:58
  • @trincot But following your comment, I just did the test with a version without recursive function. The result surprised me. For 29952 rows (or list size) I get the result in 0.11 seconds with the recursive function, compared to only 0.09 seconds without a recursive version. And with this latest version I have no more limits ... I guess my old script was really not optimized – David Oct 27 '20 at 20:59
  • That's what I guessed. So now you have no more question, right? – trincot Oct 27 '20 at 20:59
  • Exactly, your reflection allowed me to answer my questions – David Oct 27 '20 at 21:01

0 Answers