
I have two questions, which I think are related enough to be part of a single question; if they are not, I can ask them as separate questions, so please let me know. I also apologize in advance because I have the impression that I am doing something very wrong, but I don't know what it is.

So far, I have been running the following code in Python (using Jupyter notebooks, if it makes a difference):

First, I initialize a deeply nested (multilevel?) list:

object = [[[[[[[[None for i in range(2)]
                for j in range(2)]
               for k in range(2)]
              for l in range(2)]
             for m in range(2)]
            for n in range(2)]
           for o in range(2)]
          for p in range(2)]

Next, I run a set of nested loops and call a single function (which depends on all the loop indices), assigning the result to one of the positions that I created above:

for i in range(2):
    for j in range(2):
        for k in range(2):
            for l in range(2):
                for m in range(2):
                    for n in range(2):
                        for o in range(2):
                            for p in range(2):
                                object[i][j][k][l][m][n][o][p] = function(i,j,k,l,m,n,o,p) 

Here are the two related questions:

  1. The objects that the function returns in each iteration are completely independent of each other (I could run each iteration of the loop on a different computer and collect the results later, for example). So I imagine this loop is an ideal candidate to be run in parallel/with multiprocessing. If so, how do I do that? I found a few mentions of running nested loops in parallel, but I could not understand how they apply to my case. Full disclosure: I have never run anything in parallel in Python.

  2. Is this list (referring to the rather unpleasant object[i][j][k][l][m][n][o][p]) the proper way to keep the results (in a way that you can find them later)? Or could you suggest a better way? If it is relevant, the objects returned by the function have attributes such as pandas dataframes, numbers, and strings.

Thiago
  • If your case is just 2**8 and you don't care much about speed, your current code looks good to me. 1) With multiprocessing, it is like one CPU does the i=0 part while another does i=1. See the [multiprocessing doc](https://docs.python.org/3/library/multiprocessing.html). 2) The results seem properly kept with object[][][]... – shimo Jul 09 '20 at 21:38

1 Answer


For your first question, I suggest you look at the top answers here to see how to parallelize the loop I outline below (which answers question 2):

How do I parallelize a simple Python loop?
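
As a rough sketch (this is not from the linked answers; it assumes function is defined at the top level of a module so it can be pickled, and the helper name compute is just for illustration), a multiprocessing.Pool could map over the same flattened index I use in the loops below:

# a minimal sketch: parallelize over the flattened index 0..2**8-1
from multiprocessing import Pool

def compute(y):
    # unpack the flat index into its 8 binary digits, same encoding as below
    s = format(y, '08b')
    return function(*(int(c) for c in s))

if __name__ == '__main__':
    with Pool() as pool:
        # results[y] holds the output for the indices encoded by the binary digits of y
        results = pool.map(compute, range(2**8))

Note that pool.map returns the results in the same order as the inputs, so results[y] lines up with the indexing scheme used below.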

Second question:

# dummy function for illustrative purposes
def function(a, b, c, d, e, f, g, h):
    return a + b + c + d + e + f + g + h

One option is to keep the results in a dictionary keyed by a flat integer index (only the keys need to be hashable, and integers always are, so this works regardless of what the function returns):

# This is your 'object'
O = {}

for y in range(2**8):
    # this generates all the index combinations you were after, I believe
    s = format(y, '08b')
    #print(s) #uncomment to see what it does
    # This is slightly messy, in that you have to split the integer into its binary digits, but I've seen worse.
    O[y] = function(int(s[0]), int(s[1]), int(s[2]), int(s[3]), int(s[4]), int(s[5]), int(s[6]), int(s[7]))

# Now, if you wanted to print the output of function(1,1,1,1,1,1,1,1):
g = '11111111'
print(O[int(g, 2)])

#print(O) #uncomment to see what it does
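
If you want to look up a result from the original indices rather than from a binary string, you could rebuild the key from the index tuple (the helper below is hypothetical, not part of the code above):

# build the flat key from the eight 0/1 indices
def key_from_indices(i, j, k, l, m, n, o, p):
    return int(''.join(str(x) for x in (i, j, k, l, m, n, o, p)), 2)

print(O[key_from_indices(1, 1, 1, 1, 1, 1, 1, 1)])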

Alternatively, since the keys above are just the consecutive integers 0 to 2**8 - 1, you could keep a plain list instead:

O = []

for y in range(2**8):
    # this generates all the index combinations you were after, I believe
    s = format(y, '08b')
    #print(s) #uncomment to see what it does
    # Same idea as above: appending in order, so position y holds the result for the indices encoded by y.
    O.append(function(int(s[0]), int(s[1]), int(s[2]), int(s[3]), int(s[4]), int(s[5]), int(s[6]), int(s[7])))

# Now, if you wanted to print the output of function(1,1,1,1,1,1,1,1):
g = '11111111'
#print(O[int(g, 2)]) #uncomment to see what it does

#print(O) #uncomment to see what it does
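
Another option, not shown above: if you would rather look results up by the original indices directly, a dictionary keyed by the index tuple (built with itertools.product) avoids the binary-string round trip entirely. A minimal sketch:

import itertools

# results keyed directly by the index tuple (i, j, k, l, m, n, o, p)
results = {}
for idx in itertools.product(range(2), repeat=8):
    results[idx] = function(*idx)

# look up the result for i = j = ... = p = 1
print(results[(1, 1, 1, 1, 1, 1, 1, 1)])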