
Maybe there's someone out in the ether that can help me with this one. (I have seen a number of similar questions to this on SO, but none deal with both standard out and standard error or deal with a situation quite like mine, hence this new question.)

I have a python function that opens a subprocess, waits for it to complete, then outputs the return code, as well as the contents of the standard out and standard error pipes. While the process is running, I'd like to also display the output of both pipes as they are populated. My first attempt has resulted in something like this:

import subprocess

def runCommand(args):
    process = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    stdout = str()
    stderr = str()
    returnCode = None
    while True:
        # collect return code and pipe info
        stdoutPiece = process.stdout.read()
        stdout = stdout + stdoutPiece
        stderrPiece = process.stderr.read()
        stderr = stderr + stderrPiece
        returnCode = process.poll()

        # check for the end of pipes and return code
        if stdoutPiece == '' and stderrPiece == '' and returnCode != None:
            return returnCode, stdout, stderr

        if stdoutPiece != '': print(stdoutPiece)
        if stderrPiece != '': print(stderrPiece)

There are a couple of problems with this, though. Because read() reads until EOF, the first read in the while loop will not return until the subprocess closes the pipe.

I could replace read() with read(int), but then the printed output is distorted, cut off mid-line wherever the fixed-size read ends. I could use readline() instead, but then the printed output is distorted by alternating lines of output and errors when there are many of both arriving at the same time.

Perhaps there's a read-until-end-of-buffer() variant that I'm not aware of? Or maybe it can be implemented?

Maybe it's best to implement a sys.stdout wrapper as suggested in this answer to another post? I would only want to use the wrapper in this function, however.
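For what it's worth, here's a minimal sketch of what such a wrapper might look like; the class name TeeWriter and the swap-in pattern are mine, not from the linked answer:

import sys

# Hypothetical sketch of the wrapper idea: echo every write to the real
# stream while also capturing it in a buffer.
class TeeWriter(object):
    def __init__(self, stream):
        self.stream = stream
        self.captured = []

    def write(self, data):
        self.captured.append(data)
        self.stream.write(data)

    def flush(self):
        self.stream.flush()

# Install it only for the duration of this function, then restore
old_stdout = sys.stdout
sys.stdout = TeeWriter(old_stdout)
try:
    print('hello')  # echoed to the terminal and captured
finally:
    captured = ''.join(sys.stdout.captured)
    sys.stdout = old_stdout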

Any other ideas from the community?

I appreciate the help! :)

EDIT: The solution really should be cross-platform, but if you have ideas that aren't, please share them anyway to keep the brainstorming going.


For another one of my Python subprocess head-scratchers, take a look at another of my questions on accounting for subprocess overhead in timing.

perden
  • You might want to look at something like pexpect. – Thomas K Oct 11 '11 at 16:38
  • why not just create a StringIO and pass the same instance to the subprocess's stdout and stderr? – Nathan Ernst Oct 11 '11 at 16:52
  • @NathanErnst: Because that wouldn't work. `stdout` and `stderr` must be real, OS level file descriptors. – Sven Marnach Oct 11 '11 at 17:14
  • @Sven Marnach: I just checked the docs; `stderr` can be set to `STDOUT`, which would redirect, so you could then set `stdout` to `PIPE` and just read from `stdout`. – Nathan Ernst Oct 11 '11 at 19:15
  • related: [Subprocess.Popen: cloning stdout and stderr both to terminal and variables](http://stackoverflow.com/a/25960956/4279) – jfs Sep 22 '14 at 03:27

3 Answers

Make the pipes non-blocking by using `fcntl.fcntl`, and use `select.select` to wait for data to become available in either pipe. For example:

import errno
import fcntl
import os
import select
import subprocess

# Helper function to add the O_NONBLOCK flag to a file descriptor
def make_async(fd):
    fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

# Helper function to read some data from a file descriptor, ignoring EAGAIN errors
def read_async(fd):
    try:
        return fd.read()
    except IOError, e:
        if e.errno != errno.EAGAIN:
            raise e
        else:
            return ''

def runCommand(args):
    process = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    make_async(process.stdout)
    make_async(process.stderr)

    stdout = str()
    stderr = str()
    returnCode = None

    while True:
        # Wait for data to become available
        select.select([process.stdout, process.stderr], [], [])

        # Try reading some data from each
        stdoutPiece = read_async(process.stdout)
        stderrPiece = read_async(process.stderr)

        if stdoutPiece:
            print stdoutPiece,
        if stderrPiece:
            print stderrPiece,

        stdout += stdoutPiece
        stderr += stderrPiece
        returnCode = process.poll()

        if returnCode != None:
            return (returnCode, stdout, stderr)

Note that fcntl is only available on Unix-like platforms, including Cygwin.

If you need it to work on Windows without Cygwin, it's doable, but it's much, much tougher: there is no `fcntl` there, so you'd have to drop down to the Win32 API (for example via the `pywin32` package) and use overlapped I/O on the pipes, or fall back on reader threads, to avoid blocking.

Adam Rosenfield
  • Can this also be done in Windows? It looks like `fcntl` is Unix-only, and `select` has some stipulations as well. I should have mentioned in my question that the solution needs to be cross-platform; I'll add that. This is still pretty cool though! Thanks! – perden Oct 11 '11 at 17:49
  • I note that fcntl is only for Unix like platforms. If he's on a windows box, or wants this to be agnostic, then this solution won't work. – Spencer Rathbun Oct 11 '11 at 17:53
  • I see. Well, your unix-esque solution works beautifully and I guess I'll save the windows side of things as a rainy afternoon project. Thanks for the help! – perden Oct 11 '11 at 18:59
  • It's a shame the Windows side is not as well supported as Linux, but I guess that's the nature of the beast... a big, hairy, closed-source, non-standard-pipes beast. – perden Oct 11 '11 at 19:01
  • Does it drop some output due to the earlier return on `poll() is not None`? A better returned value on EAGAIN could be None, to allow eof detection on empty string. btw, if there are some platforms that support `select` with timeout for pipes but do not support `fcntl(NONBLOCK)` then `os.read(size)` could be used to read available output (it may be less than `size`). Though I don't know any such platform. – jfs Jun 19 '13 at 14:57
  • Hey guys! Please do not use this code as-is: `select.select([process.stdout, process.stderr], [], [])` may hang forever if your command doesn't produce much output. Better to pass a timeout, e.g. `select.select([process.stdout, process.stderr], [], [], 0.1)` (a timeout of 0 or 1 works too). Adam Rosenfield, please correct your code. – kinORnirvana Apr 29 '16 at 11:45
  • An alternative function that is used in production (Chromium's CI builds) is here: https://github.com/catapult-project/catapult/blob/master/devil/devil/utils/cmd_helper.py#L214 See _IterProcessStdout() method – kinORnirvana Apr 29 '16 at 11:49
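
Picking up the timeout suggestion from the comments above, here is a sketch of how the answer's loop could be adapted; the 0.1-second timeout is illustrative, and make_async/read_async are the helpers defined in the answer:

# Same setup as above, but select() gets a timeout so it cannot block
# forever, and the loop only exits once both pipes have come back empty
# after the process has exited (the dropped-output concern above).
while True:
    # Check for exit *before* reading, so a final drain always happens
    returnCode = process.poll()

    select.select([process.stdout, process.stderr], [], [], 0.1)
    stdoutPiece = read_async(process.stdout)
    stderrPiece = read_async(process.stderr)

    if stdoutPiece:
        print stdoutPiece,
    if stderrPiece:
        print stderrPiece,

    stdout += stdoutPiece
    stderr += stderrPiece

    # Stop only when the process has exited and both reads hit EOF
    if returnCode is not None and stdoutPiece == '' and stderrPiece == '':
        return (returnCode, stdout, stderr)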

Combining this answer with this one, the following code works for me:

import subprocess, sys

# Send stderr to the real stdout, pipe stdout, and stream it line by line
p = subprocess.Popen(args, stderr=sys.stdout.fileno(), stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ""):
    print line,
Sundae
  • If you're going to combine stderr and stdout, the more common approach is to use `stderr=subprocess.STDOUT`. – dbn Jun 17 '16 at 20:03
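
For reference, a sketch of that more common form (assuming the same args as above):

import subprocess

# Merge stderr into stdout at the OS level, then stream the combined
# output line by line as it arrives
p = subprocess.Popen(args, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ""):
    print line,
p.stdout.close()
p.wait()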

When I tested it, readline() seemed to block. However, I was able to read stdout and stderr separately using threads. Code sample as follows:

import os
import sys
import subprocess
import threading

class printstd(threading.Thread):
    def __init__(self, std, printstring):
        threading.Thread.__init__(self)
        self.std = std
        self.printstring = printstring

    def run(self):
        # A blocking readline() is fine here: each pipe gets its own thread
        while True:
            line = self.std.readline()
            if line != '':
                print self.printstring, line.rstrip()
            else:
                break

pythonfile = os.path.join(os.getcwd(), 'mypythonfile.py')

# Note: no shell=True -- passing an argument list together with shell=True
# does not do what you'd expect on POSIX systems
process = subprocess.Popen([sys.executable, '-u', pythonfile],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print 'Process ID:', process.pid

thread1 = printstd(process.stdout, 'stdout:')
thread2 = printstd(process.stderr, 'stderr:')

thread1.start()
thread2.start()

threads = [thread1, thread2]

for t in threads:
    t.join()

However, I am not certain that this is thread-safe.
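
If the aggregation needs to be thread-safe, one option (my sketch, not part of the answer above) is to have each reader thread push lines onto a Queue.Queue, which is thread-safe, and do all of the printing in the main thread:

import subprocess
import sys
import threading
from Queue import Queue  # Queue.Queue is thread-safe

def enqueue_lines(pipe, label, queue):
    # Reader thread: blocking readline() is fine, each pipe has its own thread
    for line in iter(pipe.readline, ''):
        queue.put((label, line))
    queue.put((label, None))  # sentinel: this pipe hit EOF

process = subprocess.Popen([sys.executable, '-u', 'mypythonfile.py'],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

q = Queue()
threads = [
    threading.Thread(target=enqueue_lines, args=(process.stdout, 'stdout:', q)),
    threading.Thread(target=enqueue_lines, args=(process.stderr, 'stderr:', q)),
]
for t in threads:
    t.start()

# All printing happens in the main thread, in arrival order
open_pipes = 2
while open_pipes:
    label, line = q.get()
    if line is None:
        open_pipes -= 1
    else:
        print label, line.rstrip()

for t in threads:
    t.join()
process.wait()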

user1379351