30

I am trying to run a Ruby script from Python and print its output as it is produced. (NOTE: I don't want to print everything at once.)

main.py

from subprocess import Popen, PIPE, STDOUT

import pty
import os

file_path = '/Users/luciano/Desktop/ruby_sleep.rb'

command = ' '.join(["ruby", file_path])

master, slave = pty.openpty()
proc = Popen(command, bufsize=0, shell=True, stdout=slave, stderr=slave, close_fds=True)     
stdout = os.fdopen(master, 'r', 0)

while proc.poll() is None:
    data = stdout.readline()
    if data != "":
        print(data)
    else:
        break

print("This is never reached!")

ruby_sleep.rb

puts "hello"

sleep 2

puts "goodbye!"

Problem

Streaming works fine: hello and goodbye! are printed with the 2-second delay between them, exactly as the script intends. The problem is that readline() hangs at the end and never returns, so I never reach the last print.

I know there are a lot of questions like this here on Stack Overflow, but none of them helped me solve the problem. I'm not that familiar with the subprocess module, so please give me a hands-on, concrete answer.

Regards

edit

Fixed the unindented code (it had nothing to do with the actual error).

vermin
  • Not sure if you have a typo when pasting code into your question, or if the problem is genuine. Looks to me like the `if` should be indented so that it is inside the loop. – cdarke Sep 14 '12 at 06:38
  • Thanks for noticing. It's a typo from when I pasted the code. It's fixed now. Sorry about that. – vermin Sep 14 '12 at 08:26
  • I'm not sure, but isn't your problem very similar to the one in http://stackoverflow.com/questions/8495794/python-popen-stdout-readline-hangs ? – HerrKaputt Sep 17 '12 at 14:22

4 Answers

34

I assume you use pty due to reasons outlined in Q: Why not just use a pipe (popen())? (all other answers so far ignore your "NOTE: I don't want to print out everything at once").

pty is Linux only as said in the docs:

Because pseudo-terminal handling is highly platform dependent, there is code to do it only for Linux. (The Linux code is supposed to work on other platforms, but hasn’t been tested yet.)

It is unclear how well it works on other OSes.

You could try pexpect:

import sys
import pexpect

pexpect.run("ruby ruby_sleep.rb", logfile=sys.stdout)

Or stdbuf to enable line-buffering in non-interactive mode:

from subprocess import Popen, PIPE, STDOUT

proc = Popen(['stdbuf', '-oL', 'ruby', 'ruby_sleep.rb'],
             bufsize=1, stdout=PIPE, stderr=STDOUT, close_fds=True)
for line in iter(proc.stdout.readline, b''):
    print line,
proc.stdout.close()
proc.wait()
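
The same read loop can be sketched for Python 3, where the pipe yields bytes. This is only an illustration: `stream_lines` is a made-up helper name, and the demo child is a Python process started with `-u` (unbuffered output), standing in for `stdbuf -oL ruby ruby_sleep.rb` so that the sketch does not depend on ruby or stdbuf being installed.

```python
import sys
from subprocess import Popen, PIPE, STDOUT


def stream_lines(cmd):
    """Yield each output line of cmd as soon as the child flushes it."""
    with Popen(cmd, stdout=PIPE, stderr=STDOUT) as proc:
        # readline() returns b'' only at EOF, i.e. once the child has
        # closed its end of the pipe, so this loop always terminates.
        for line in iter(proc.stdout.readline, b''):
            yield line.decode(errors='replace').rstrip('\r\n')


if __name__ == '__main__':
    # Hypothetical stand-in for the ruby script: prints, sleeps, prints.
    child = [sys.executable, '-u', '-c',
             "import time; print('hello'); time.sleep(1); print('goodbye!')"]
    for line in stream_lines(child):
        print(line)
```

Whether the lines arrive promptly still depends on the child flushing its output, which is what `stdbuf -oL` (or `-u` in this sketch) is for.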

Or using pty from stdlib based on @Antti Haapala's answer:

#!/usr/bin/env python
import errno
import os
import pty
from subprocess import Popen, STDOUT

master_fd, slave_fd = pty.openpty()  # provide tty to enable
                                     # line-buffering on ruby's side
proc = Popen(['ruby', 'ruby_sleep.rb'],
             stdin=slave_fd, stdout=slave_fd, stderr=STDOUT, close_fds=True)
os.close(slave_fd)
try:
    while 1:
        try:
            data = os.read(master_fd, 512)
        except OSError as e:
            if e.errno != errno.EIO:
                raise
            break # EIO means EOF on some systems
        else:
            if not data: # EOF
                break
            print('got ' + repr(data))
finally:
    os.close(master_fd)
    if proc.poll() is None:
        proc.kill()
    proc.wait()
print("This is reached!")

All three code examples print 'hello' immediately (as soon as the first EOL is seen).
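
The pty example above prints raw chunks. If complete lines are wanted, one possible Python 3 variant is sketched below. The helper name `iter_pty_lines` is an assumption made for illustration, the demo child is a Python process rather than ruby, and the code assumes a Unix-like system where `pty.openpty()` works.

```python
import errno
import os
import pty
import sys
from subprocess import Popen, STDOUT


def iter_pty_lines(cmd):
    """Run cmd on a pseudo-terminal and yield its output line by line."""
    master_fd, slave_fd = pty.openpty()
    proc = Popen(cmd, stdin=slave_fd, stdout=slave_fd, stderr=STDOUT,
                 close_fds=True)
    os.close(slave_fd)  # the child has its own copy; keeping ours open blocks EOF
    buf = b''
    try:
        while True:
            try:
                data = os.read(master_fd, 512)
            except OSError as e:
                if e.errno != errno.EIO:
                    raise
                data = b''  # EIO means EOF on some systems
            if not data:  # EOF
                break
            buf += data
            # Everything before the last newline is complete; keep the rest.
            *lines, buf = buf.split(b'\n')
            for line in lines:
                # A terminal typically turns '\n' into '\r\n' on output.
                yield line.rstrip(b'\r').decode(errors='replace')
        if buf:  # trailing output without a final newline
            yield buf.decode(errors='replace')
    finally:
        os.close(master_fd)
        if proc.poll() is None:
            proc.kill()
        proc.wait()


if __name__ == '__main__':
    child = [sys.executable, '-c',
             "import time; print('hello'); time.sleep(1); print('goodbye!')"]
    for line in iter_pty_lines(child):
        print('got ' + repr(line))
```

Because the child sees a terminal, it line-buffers by itself, so no `stdbuf` or `-u` flag is needed in this sketch.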


I am leaving the old, more complicated code example here because it may be referenced and discussed in other posts on SO.

Or using pty based on @Antti Haapala's answer:

import os
import pty
import select
from subprocess import Popen, STDOUT

master_fd, slave_fd = pty.openpty()  # provide tty to enable
                                     # line-buffering on ruby's side
proc = Popen(['ruby', 'ruby_sleep.rb'],
             stdout=slave_fd, stderr=STDOUT, close_fds=True)
timeout = .04 # seconds
while 1:
    ready, _, _ = select.select([master_fd], [], [], timeout)
    if ready:
        data = os.read(master_fd, 512)
        if not data:
            break
        print("got " + repr(data))
    elif proc.poll() is not None: # select timeout
        assert not select.select([master_fd], [], [], 0)[0] # detect race condition
        break # proc exited
os.close(slave_fd) # can't do it sooner: it leads to errno.EIO error
os.close(master_fd)
proc.wait()

print("This is reached!")
jfs
  • What's the reason for the `if not data: break` here? wouldn't that case be caught by the `proc.poll() is not None` in the next while iteration? – Andy Hayden Oct 23 '18 at 19:57
  • The reason to ask is for this (related) answer: https://stackoverflow.com/a/52954716/1240268 – Andy Hayden Oct 23 '18 at 20:29
  • @AndyHayden ignore the last code example. It is here for historical reasons (read the comment before the code). `p.poll()` is not used to break the loop in the new code (with `while 1` loop). Related [Python subprocess .check_call vs .check_output](https://stackoverflow.com/q/36169571/4279) – jfs Oct 24 '18 at 17:19
  • @AndyHayden about the answer you link: avoid multiple ptys, see [caveats and the link at the end of the answer which uses 2 ptys](https://stackoverflow.com/a/31953436/4279) – jfs Oct 24 '18 at 17:40
  • @jfs: Is there anything wrong with the last code example other than it being more complicated? It seems like a useful option for addressing [Andy Hayden's problem](https://stackoverflow.com/a/52954716/1240268) since we need something like `select.select` there to distinguish `stdout` from `stderr`. The other code examples above do not provide that ability (which we need since in general we wouldn't know which of the two to read from first). – unutbu Oct 24 '18 at 19:30
  • @unutbu there are multiple issues: the race condition mentioned in the comment, EIO handling, no cleanup — they could be fixed as the new example demonstrates. – jfs Oct 24 '18 at 19:45
  • @jfs: Ah, now I see that is pretty much what you did here: https://stackoverflow.com/a/31953436/190597. – unutbu Oct 24 '18 at 20:08
5

Not sure what is wrong with your code, but the following seems to work for me:

#!/usr/bin/python

from subprocess import Popen, PIPE
import threading

p = Popen('ls', stdout=PIPE)

class ReaderThread(threading.Thread):

    def __init__(self, stream):
        threading.Thread.__init__(self)
        self.stream = stream

    def run(self):
        while True:
            line = self.stream.readline()
            if len(line) == 0:
                break
            print line,


reader = ReaderThread(p.stdout)
reader.start()

# Wait until subprocess is done
p.wait()

# Wait until we've processed all output
reader.join()

print "Done!"

Note that I don't have Ruby installed and hence cannot check with your actual problem. Works fine with ls, though.
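
A rough Python 3 version of the same reader-thread idea is sketched below. The names `pump` and `run_with_reader_thread` are made up for this sketch, and a queue is used to hand the lines back to the main thread instead of printing them inside the thread. In Python 3 the pipe yields bytes, so the end-of-stream check compares against `b''`.

```python
import queue
import sys
import threading
from subprocess import Popen, PIPE


def pump(stream, out):
    """Copy lines from stream into the queue until EOF."""
    for line in iter(stream.readline, b''):  # b'' signals EOF
        out.put(line)
    stream.close()
    out.put(None)  # sentinel: no more output


def run_with_reader_thread(cmd):
    """Run cmd and return its output lines, read by a background thread."""
    proc = Popen(cmd, stdout=PIPE)
    lines = queue.Queue()
    reader = threading.Thread(target=pump, args=(proc.stdout, lines),
                              daemon=True)
    reader.start()
    collected = []
    while True:
        item = lines.get()  # blocks until the reader delivers something
        if item is None:
            break
        collected.append(item.decode(errors='replace').rstrip('\r\n'))
    reader.join()  # wait until all output has been processed
    proc.wait()    # wait until the subprocess is done
    return collected


if __name__ == '__main__':
    print(run_with_reader_thread([sys.executable, '-c', "print('Done!')"]))
```

As with the original, this does not change the child's buffering: a child that block-buffers its output when writing to a pipe will still deliver it late.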

Florian Brucker
  • Using `if len(line):` was the piece that helped me. Using only `if line:` doesn't work on Python 3. – pevik May 10 '16 at 12:07
2

Basically, what you are looking at here is a race condition between your proc.poll() and your readline(). Since the input side of the master file handle is never closed, a readline() issued after the Ruby process has finished writing finds nothing to read, yet it never sees EOF either, so it blocks forever. The code only works if the shell process exits before your code attempts another readline().

Here is the timeline:

readline()
print-output
poll()
readline()
print-output (last line of real output)
poll() (returns None since the process is not done yet)
readline() (waits for more output)
(process is done, but output pipe still open and no poll ever happens for it).
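
The last step of the timeline comes down to how EOF is delivered: a reader only sees EOF once every descriptor for the writing side has been closed. The sketch below illustrates this with a plain `os.pipe()` rather than a pty, and the function name is made up for the illustration. It assumes the payload is small enough to fit in the pipe buffer.

```python
import os


def read_all_after_closing_writer(payload):
    """Write payload into a pipe, close the write end, read until EOF."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, payload)
    # Without this close, the loop below would block forever after the
    # payload has been consumed: data is gone, but EOF never arrives.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 512)
        if not chunk:  # empty read means EOF
            break
        chunks.append(chunk)
    os.close(read_fd)
    return b''.join(chunks)


if __name__ == '__main__':
    print(read_all_after_closing_writer(b'hello\ngoodbye!\n'))
```

In the question's code the parent keeps the slave descriptor open, which plays the role of the unclosed write end here.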

The easy fix is to use the subprocess module on its own, as the docs suggest, rather than in conjunction with openpty:

http://docs.python.org/library/subprocess.html

Here is a very similar problem for further study:

Using subprocess with select and pty hangs when capturing output

jmh
  • However, the code also hangs when not using `pty`, but using `readline`. – Hans Then Sep 17 '12 at 17:56
  • But it is using readline on a fd that is part of a pty. That is why readline is hanging. If the output pipe was closed it would return EOF and readline would return "". As it is the output pipe is still open, but no process is providing any input, so it isn't providing any output. – jmh Sep 17 '12 at 19:47
  • No I rewrote the code, so as not to use a `pty`. It still hangs in readline. It is only when I removed readline that the code worked. – Hans Then Sep 17 '12 at 20:43
  • +1 for the last link. It is not clear what you mean by *"just use the subprocess module as it suggests in the docs, not in conjunction with openpty"* but it probably won't work due to block-buffering on ruby's side. See [my answer](http://stackoverflow.com/a/12471855/4279) – jfs Sep 18 '12 at 07:04
1

Try this:

proc = Popen(command, bufsize=0, shell=True, stdout=PIPE, close_fds=True)
for line in proc.stdout:
    print line

print("This is most certainly reached!")

As others have noted, readline() will block when reading data. It will even do so when your child process has died. I am not sure why this does not happen when executing ls as in the other answer, but maybe the ruby interpreter detects that it is writing to a PIPE and therefore it will not close automatically.

Hans Then
  • This won't show `hello` before the child dies due to block-buffering. `.readline()` in your case returns `''` when the child dies (try it with `iter(proc.stdout.readline, b'')`). – jfs Sep 18 '12 at 07:18