Parallel processing with Python class object

Question

I have a for loop in Python that contains an optimization function scipy.optimize.root. The function outputs a class object (called sol) that describes the optimized results:

import numpy as np           
import scipy.optimize as so

def root2d(x,a,b):
   F1 = np.exp(-np.exp(-(x[0]+x[1]))) - x[1]*(b+x[0]**2)
   F2 = x[0]*np.cos(x[1]) + x[1]*np.sin(x[0]) - a
   return (F1,F2)

x0 = np.array([0.1,0.1]) # initial guess
alist = np.linspace(-0.5,-0.3,10)
blist = np.linspace(0.2,0.3,10)

xlist = np.zeros(10)
ylist = np.zeros(10)
zlist = np.zeros(10)

for jj in range(0,10):

    a = alist[jj]
    b = blist[jj]

    sol = so.root(root2d,x0,args=(a,b),method='lm',tol=1e-9)

    xlist[jj] = sol.x[0] # optimised value
    ylist[jj] = sol.x[1] # optimised value
    zlist[jj] = sol.success # was solver successful?

# do something with xlist ylist zlist

Now I'm trying to parallelize the for loop using the suggestions in this post. However I'm not sure how to deal with the sol outputs and how to write the above for loop so that it can be used in this kind of structure:

from multiprocessing import Pool

p = Pool(4)
xlist,ylist,zlist = zip(*p.map(so.root,range(0,10)))

which was given as an answer by Nolen Royalty.

Edit: I want to run my program (no this MWE) on a HPC cluster where the available Python modules are numpy, scipy, matplotlib, cython and mpi4py. Although there are numerous methods to do parallel processing, I want to make minimal changes to my existing (serial for loops) code.

akoeltringer · Accepted Answer · 2017-04-10T09:53:34.067

2

To use Pool, you typcially provide a function and call Pool.map on it.

In your case:

from multiprocessing import Pool

def run(jj):
    import scipy.optimize as so

    a = alist[jj]
    b = blist[jj]

    sol = so.root(root2d,x0,args=(a,b),method='lm',tol=1e-9)

    return sol.x[0], sol.x[1], sol.success


if __name__ == '__main__':

    # your declarations go here ...

    p = Pool(4)
    result = p.map(run, range(0,10))

... which gives you a list of tuples, containing the solutions...

edited Apr 10 '17 at 09:53

answered Apr 10 '17 at 09:42

akoeltringer

1,493
3
15
31

Should `x0`, `alist`, `blist` be inside the `run(jj)` function? And the `xlist = ylist = zlist = np.zeros(10)` should be just before `p=Pool(4)` I assume – Medulla Oblongata Apr 10 '17 at 11:15
Nope. All those definitions, including the function definition of ``root2d`` go where I put ``# your declarations go here`` – akoeltringer Apr 10 '17 at 11:18
Strange... I just get an error when I do that: `AttributeError: module '__main__' has no attribute '__spec__'` – Medulla Oblongata Apr 10 '17 at 11:32
which version of python are you using? For me (Python3) this code runs without errors... Did you create the pool (``p = Pool(4)``) inside the ``if __name__ == '__main__'`` as stated above? Where did you put the ``import numpy as np``? – akoeltringer Apr 10 '17 at 11:44
Python 3.5.2 64bit. Yes I create the parallel pool inside `if __name__ == '__main__'`. The `import numpy as np` comes before I import `multiprocessing`. – Medulla Oblongata Apr 10 '17 at 11:53
Not possible to **exactly** run your code, but I have copied and pasted everything you mentioned at `# your declarations go here` and put the `import numpy` at the very top before I call the `multiprocessing` module. – Medulla Oblongata Apr 10 '17 at 12:26
yeah what I meant was whether you had a different 'local' code which you could not share publicly. – akoeltringer Apr 10 '17 at 12:29
No local code. I'm just trying to figure out this MWE, so what you see here is what I'm currently working on. – Medulla Oblongata Apr 10 '17 at 12:33

Medulla Oblongata · Answer 2 · 2017-04-10T12:18:07.777

The code given in Tw UxTLi51Nus's post doesn't appear to work on Python 3.5.2 for some reason. With some changes, the following code appears ok...

import numpy as np 
import scipy.optimize as so
from multiprocessing import Pool

def root2d(x,a,b):
    F1 = np.exp(-np.exp(-(x[0]+x[1]))) - x[1]*(b+x[0]**2)
    F2 = x[0]*np.cos(x[1]) + x[1]*np.sin(x[0]) - a
    return (F1,F2)

def run(jj):

    x0 = np.array([0.1,0.1]) # initial guess
    alist = np.linspace(-0.5,-0.3,10)
    blist = np.linspace(0.2,0.3,10)

    a = alist[jj]
    b = blist[jj]

    sol = so.root(root2d,x0,args=(a,b),method='lm',tol=1e-9)

    return sol.x[0], sol.x[1], sol.success


if __name__ == '__main__':

    xlist = np.zeros(10)
    ylist = np.zeros(10)
    zlist = np.zeros(10)

    p = Pool(4)
    result = p.map(run, range(0,10))
    print(result)

I've had to put the iterated values alist, blist, and initial guess x0 inside the run function. These are fed into the scipy.optimize.root solver. There are no errors when I do this, and the solutions are:

[(-0.53888782445459149, 0.043495493090149454, True),
 (-0.52328348126598456, 0.032937536902490253, True), 
 (-0.50743370799474086, 0.022462155879391384, True), 
 (-0.49135105437855231, 0.01203948230426068, True), 
 (-0.47502920008575156, 0.001732198125265777, True), 
 (-0.45846822054716679, -0.0084225504551842089, True), 
 (-0.4416527225847745, -0.018336039419045602, True), 
 (-0.42455931996843449, -0.027893297385455082, True), 
 (-0.40720848051853215, -0.037005663686040566, True), 
 (-0.38955545334271019, -0.045486751099290013, True)]

well I do not get it why there is that difference, on my system it runs perfectly fine when ``x0``, ``alist`` and ``blist`` are declared between the declaration of ``xlist`` and ``if __name__ == '__main__'`` — akoeltringer, Apr 10 '17 at 12:31
you can even delete the declarations of ``xlist``, ``ylist`` and ``zlist`` as the results are now in ``result`` BTW — akoeltringer, Apr 10 '17 at 12:32

Parallel processing with Python class object

2 Answers2