
I need to run a function in parallel for different input values, and I need to get the output of each call (a NetworKit Graph) back in the parent process. I tried joblib, multiprocessing.Process with a Queue, and pathos, but I always get the same error:

can't pickle _NetworKit.Graph objects

Here is a snippet of the code I'm trying to parallelize, using joblib as an example:

import numpy as np
from networkit import Graph
from joblib    import Parallel, delayed

def f( i ):
    graph = Graph()
    graph.addNode()

    # ... Some other graph computations ...

    return graph

# The error is raised here: each returned Graph has to be pickled to reach the parent process
res = Parallel( n_jobs = 2 ) ( delayed( f )( i ) for i in np.arange( 5 ) )

I understand that all these libraries use pickle to serialize objects and that NetworKit objects are not picklable. I read that dill can pickle objects that pickle cannot. Does anyone have experience using dill with multiprocessing?

Otherwise, is there any other way to accomplish what I need?

Thank you!

user3666197
Mohamed AL ANI
  • If it can't be pickled then you probably can't do much about it. But you can transform it into something that can be pickled, e.g. read the data you need out of the graph. – de1 Oct 27 '17 at 10:15
  • @de1 Thanks ! My problem is way more complex (streaming data) and I really need to get the graph as an output of the function. – Mohamed AL ANI Oct 30 '17 at 08:35
  • For alternatives, you could start with this intro to dill and pathos http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization or this one https://kampta.github.io/Parallel-Processing-in-Python/ . I think McKerns is the author of both pathos and dill. Until you get an answer, this may help. – Rafael Valero Mar 27 '18 at 09:57
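
The suggestion in the first comment (turn the graph into something that can be pickled) could look roughly like the sketch below. It is only an illustration under assumptions: `StandInGraph` is a hypothetical stand-in so the example runs without NetworKit, and its method names (`numberOfNodes`, `iterEdges`, `addEdge`) are chosen to mirror NetworKit's Graph API, which should be checked against the installed version. `graph_to_payload` and `payload_to_graph` are made-up helper names, and the sketch ignores edge weights, directedness and node attributes.

```python
import pickle


class StandInGraph:
    """Hypothetical stand-in for an unpicklable graph object.

    Only the three methods used below are modelled; their names are assumed
    to match NetworKit's Graph API and should be verified locally.
    """

    def __init__(self, n=0):
        self._n = n
        self._edges = []

    def numberOfNodes(self):
        return self._n

    def addEdge(self, u, v):
        self._edges.append((u, v))

    def iterEdges(self):
        return iter(self._edges)

    def __reduce__(self):
        # Emulate the failure reported in the question.
        raise TypeError("can't pickle StandInGraph objects")


def graph_to_payload(graph):
    """Extract plain, picklable Python data from the graph."""
    return {"n": graph.numberOfNodes(), "edges": list(graph.iterEdges())}


def payload_to_graph(payload, graph_cls=StandInGraph):
    """Rebuild a graph in the parent process from the plain payload."""
    graph = graph_cls(payload["n"])
    for u, v in payload["edges"]:
        graph.addEdge(u, v)
    return graph


def f(i):
    """Worker: build the graph, then return only its picklable payload."""
    graph = StandInGraph(i + 2)
    graph.addEdge(0, i + 1)
    # ... Some other graph computations ...
    return graph_to_payload(graph)


# Simulate what joblib/multiprocessing does with a worker's return value:
# the payload is pickled in the worker and unpickled in the parent.
wire = pickle.dumps(f(3))
rebuilt = payload_to_graph(pickle.loads(wire))
```

With joblib, the same idea would be to collect the payloads with `Parallel( n_jobs = 2 )( delayed( f )( i ) for i in range( 5 ) )` and then call `payload_to_graph` on each result in the parent. The cost is one extra copy of the node and edge data per graph, so this is only reasonable if the graphs are small enough to rebuild.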

0 Answers