I'm trying to use multiprocessing.Pool to speed up parsing of a file with pyparsing; however, I get a multiprocessing.pool.MaybeEncodingError exception whenever I try this.
I've narrowed it down to returning a dictionary (ParseResults.asDict()): with asList() the error doesn't occur. However, the input I'm actually parsing is pretty complex, so ideally I'd like to use asDict().
The actual data being parsed is an Erlang list of tagged tuples, which I want to map to a Python list. The grammar for this is pretty complex, so I've put together a simplified test case instead (updated to include a nested dict):
#!/usr/bin/env python2.7

from pyparsing import *
import multiprocessing

dictionary = Forward()
key = Word(alphas)
sep = Suppress(":")
value = ( key | dictionary )
key_val = Group( key + sep + value )
dictionary <<= Dict( Suppress('[') + delimitedList( key_val ) + Suppress(']') )

def parse_dict(s):
    p = dictionary.parseString(s).asDict()
    return p

def parse_list(s):
    return dictionary.parseString(s).asList()

# This works (list)
data = ['[ foo : [ bar : baz ] ]']
pool = multiprocessing.Pool()
pool.map(parse_list, data)

# This fails (dict)
pool.map(parse_dict, data)
Fails with:
Traceback (most recent call last):
  File "lib/python/nutshell/multi_parse.py", line 19, in <module>
    pool.map(parse, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[{'foo': ([(['bar', 'baz'], {})], {'bar': [('baz', 0)]})}]'. Reason: 'TypeError("'str' object is not callable",)'
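
Since Pool.map pickles each return value to send it back to the parent process, I suspect the failure can be checked without a pool at all. Below is a minimal sketch of that check. The helper name pickle_error is hypothetical (it is not part of multiprocessing or pyparsing), and the two sample values are stand-ins I made up, not real pyparsing output:

```python
import pickle


def pickle_error(obj):
    """Return None if obj pickles cleanly, otherwise the exception raised."""
    try:
        pickle.dumps(obj)
    except Exception as exc:  # pickling can fail with several exception types
        return exc
    return None


# Stand-in values (assumptions for illustration, not real parse results):
plain = {'foo': {'bar': 'baz'}}    # only builtin types, should pickle fine
wrapped = {'foo': lambda: 'baz'}   # holds an object that pickle rejects

print(pickle_error(plain))         # None
print(repr(pickle_error(wrapped)))
```

Calling pickle_error(dictionary.parseString(s).asDict()) in the parent process should then show whether the asDict() result itself is what fails to pickle, independent of multiprocessing.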