
I'm using the (awesome) mrjob library from Yelp to run my Python programs in Amazon's Elastic MapReduce. It depends on subprocess in the Python standard library. On my Mac running Python 2.7.2, everything works as expected.

However, when I switched to running the exact same code on Ubuntu LTS 11.04, also with Python 2.7.2, I encountered something strange:

mrjob loads the job, then attempts to communicate with its child processes using subprocess, and fails with this error:

      File "/usr/local/lib/python2.7/dist-packages/mrjob-0.3.1-py2.7.egg/mrjob/emr.py", line 1212, in _build_steps
        steps = self._get_steps()
      File "/usr/local/lib/python2.7/dist-packages/mrjob-0.3.1-py2.7.egg/mrjob/runner.py", line 1003, in _get_steps
        stdout, stderr = steps_proc.communicate()
      File "/usr/lib/python2.7/subprocess.py", line 754, in communicate
        return self._communicate(input)
      File "/usr/lib/python2.7/subprocess.py", line 1302, in _communicate
        stdout, stderr = self._communicate_with_poll(input)
      File "/usr/lib/python2.7/subprocess.py", line 1332, in _communicate_with_poll
        poller = select.poll()
    AttributeError: 'module' object has no attribute 'poll'

This appears to be a problem with subprocess and not mrjob.

I dug into /usr/lib/python2.7/subprocess.py and found that during import it runs:

    if mswindows:
        ... snip ...
    else:
        import select
        _has_poll = hasattr(select, 'poll')

By editing that, I verified that it really does set _has_poll==True. And this is correct; easily verified on the command line.

However, by the time execution reaches Popen._communicate_with_poll, the select module has somehow changed! The following was generated by printing dir(select) right before it attempts to use select.poll():

    ['EPOLLERR', 'EPOLLET', 'EPOLLHUP', 'EPOLLIN', 'EPOLLMSG', 
    'EPOLLONESHOT', 'EPOLLOUT', 'EPOLLPRI', 'EPOLLRDBAND', 
    'EPOLLRDNORM', 'EPOLLWRBAND', 'EPOLLWRNORM', 'PIPE_BUF', 
    'POLLERR', 'POLLHUP', 'POLLIN', 'POLLMSG', 'POLLNVAL', 
    'POLLOUT', 'POLLPRI', 'POLLRDBAND', 'POLLRDNORM',
    'POLLWRBAND', 'POLLWRNORM', '__doc__', '__name__', 
    '__package__', 'error', 'select']

There is no attribute called 'poll'! How did it go away?

So I hardcoded _has_poll=False, and mrjob then happily continues with its work and runs my job in AWS EMR, with subprocess using _communicate_with_select... and I'm stuck with a hand-modified standard library.

Any advice? :-)

John Vandenberg
user1181407
  • I would try to put some more trace and try to find where exactly the select module is losing the attribute poll - which by the way seems extremely shady. You sure there are no other versions of Python installed? Can you check the exact directory this select module is coming from? – Sid Jan 31 '12 at 22:03
  • That really *is* strange - it's not just `poll()` which has gone but `epoll()` too, which should also be there on an Ubuntu system (from Python 2.6 onwards). Also, since the [select module is in C](http://hg.python.org/cpython/file/e23d51f17cce/Modules/selectmodule.c) and the existence of `poll()` is determined at *compile time* then I think something must be executing `del select.poll` or similar (though I can't possibly imagine why). Seems outlandish, but you could possibly grep for that just in case? – Cartroo Jan 10 '13 at 21:25
  • make sure it is not [The name shadowing trap](http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#the-name-shadowing-trap) – jfs Sep 12 '14 at 07:42
  • I recently ran into a similar problem, and googling turned up this question and also this bug report: https://github.com/gevent/gevent/issues/446 I'm not an experienced Python programmer, so I don't fully understand what is being said there, but it looks like the gevent module patches poll out of select. – dohashi Jul 02 '15 at 19:09
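
The checks suggested in the comments above (where does select come from, is poll still there, is gevent loaded) can be sketched roughly as follows. The helper name `module_report` is made up for illustration; it only uses standard attributes, and it should be run from inside the failing process, as late as possible before the error.

```python
import select
import sys


def module_report(module):
    """Collect a few facts that help spot a shadowed or patched module.

    Hypothetical helper, not part of mrjob, gevent, or the standard library.
    """
    return {
        "name": module.__name__,
        # Built-in modules may have no __file__ at all, hence the default.
        "file": getattr(module, "__file__", None),
        "has_poll": hasattr(module, "poll"),
        "has_epoll": hasattr(module, "epoll"),
        # False would mean sys.modules now holds a different object.
        "same_as_sys_modules": sys.modules.get(module.__name__) is module,
        # Only tells us gevent was imported, not that it patched anything.
        "gevent_imported": "gevent" in sys.modules,
    }


report = module_report(select)
for key in sorted(report):
    print("%s: %r" % (key, report[key]))
```

If "file" points somewhere unexpected, name shadowing is the likely cause; if "gevent_imported" is true and "has_poll" is false, monkey patching is the more plausible suspect.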

2 Answers

4

I had a similar problem, and it turned out that gevent's monkey patching replaces select.select with gevent.select.select and removes poll from the select module (as it is a blocking call). However, for some reason gevent doesn't patch subprocess by default, and subprocess still uses select.poll.

An easy fix is to replace subprocess with gevent.subprocess:

    import gevent.monkey
    gevent.monkey.patch_all(subprocess=True)

    import sys
    import gevent.subprocess
    sys.modules['subprocess'] = gevent.subprocess

If you do this before importing the mrjob library, it should work fine.
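
If swapping in gevent.subprocess is not an option, the workaround from the question (forcing the select-based code path) can probably be applied at runtime instead of by editing the standard library. This is only a sketch: `_has_poll` is a private name in Python 2.7's subprocess.py, as the snippet in the question shows, so it may not exist in other Python versions, and the helper name `force_select_fallback` is made up here.

```python
import subprocess


def force_select_fallback():
    """Turn off subprocess's cached poll flag, if this Python has one.

    Returns True if the private flag existed and was cleared, else False.
    """
    if hasattr(subprocess, "_has_poll"):
        # Private attribute: present in Python 2.7, an assumption elsewhere.
        subprocess._has_poll = False
        return True
    return False


changed = force_select_fallback()
print(changed)
```

Call it once, after anything that monkey-patches select and before the first communicate() call.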

mbarkhau
Tiago Queiroz
2

Sorry for writing a full answer instead of a comment, otherwise I'd lose code indentation.

I cannot help you directly, since the problem seems closely tied to your code, but I can help you track it down. Since Python modules can be arbitrary objects, try something like this:

    class FakeModule(dict):
        """Wraps a module and refuses to let anyone delete its poll attribute."""

        def __init__(self, origmodule):
            self._origmodule = origmodule
            self.__all__ = dir(origmodule)

        def __getattr__(self, attr):
            # Only called when normal lookup fails; forward to the real module.
            return getattr(self._origmodule, attr)

        def __delattr__(self, attr):
            if attr == "poll":
                raise RuntimeError, "Trying to delete poll!"
            delattr(self._origmodule, attr)


    def replaceSelect():
        import sys
        import select
        fakeselect = FakeModule(select)
        sys.modules["select"] = fakeselect  # later imports get the wrapper

    replaceSelect()

    import select
    del select.poll

and you'll get an output like:

    Traceback (most recent call last):
      File "domy.py", line 27, in <module>
        del select.poll
      File "domy.py", line 14, in __delattr__
        raise RuntimeError, "Trying to delete poll!"
    RuntimeError: Trying to delete poll!

By calling replaceSelect() in your code you should be able to get a traceback of where somebody is deleting poll(), so you can understand why.

I hope my FakeModule implementation is good enough, otherwise you might need to modify it.

Alan Franzoni