86

I'm using Python's subprocess.communicate() to read stdout from a process that runs for about a minute.

How can I print out each line of that process's stdout in a streaming fashion, so that I can see the output as it's generated, but still block on the process terminating before continuing?

subprocess.communicate() appears to give all the output at once.

martineau
Heinrich Schmetterling
  • related: [Getting realtime output using subprocess](http://stackoverflow.com/q/803265/4279) – jfs Oct 16 '14 at 20:11

7 Answers

165

To get subprocess' output line by line as soon as the subprocess flushes its stdout buffer:

#!/usr/bin/env python2
from subprocess import Popen, PIPE

p = Popen(["cmd", "arg1"], stdout=PIPE, bufsize=1)
with p.stdout:
    for line in iter(p.stdout.readline, b''):
        print line,
p.wait() # wait for the subprocess to exit

iter() is used to read lines as soon as they are written, to work around the read-ahead bug in Python 2.

If the subprocess's stdout uses block buffering instead of line buffering in non-interactive mode (which delays the output until the child's buffer is full or the child flushes it explicitly), then you could try to force unbuffered output using the pexpect or pty modules, or the unbuffer, stdbuf, or script utilities; see Q: Why not just use a pipe (popen())?
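For instance, when the child is itself a Python script, its `-u` flag disables the child's buffering entirely. Here is a minimal Python 3 sketch, with a throwaway Python one-liner standing in for the real command:

```python
#!/usr/bin/env python3
import sys
from subprocess import Popen, PIPE

# Illustrative child: a Python one-liner run with -u so it writes unbuffered.
child = [sys.executable, "-u", "-c",
         "import time\n"
         "for i in range(3):\n"
         "    print('tick', i)\n"
         "    time.sleep(0.1)"]

with Popen(child, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
    for line in p.stdout:      # each line arrives as the child prints it
        print(line, end='')
```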


Here's Python 3 code:

#!/usr/bin/env python3
from subprocess import Popen, PIPE

with Popen(["cmd", "arg1"], stdout=PIPE, bufsize=1,
           universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')

Note: unlike the Python 2 version, which outputs the subprocess's bytestrings as-is, Python 3 uses text mode (cmd's output is decoded using the locale.getpreferredencoding(False) encoding).
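If you instead want the child's bytes forwarded undecoded on Python 3 (closer to what the Python 2 version prints), drop universal_newlines and write to sys.stdout.buffer. A sketch, with an illustrative Python one-liner in place of cmd:

```python
#!/usr/bin/env python3
import sys
from subprocess import Popen, PIPE

# Illustrative child command; substitute your real program here.
cmd = [sys.executable, "-c", "print('hello'); print('world')"]

with Popen(cmd, stdout=PIPE) as p:      # no universal_newlines: bytes mode
    for line in p.stdout:               # b'\n'-separated byte strings
        sys.stdout.buffer.write(line)   # forward each line undecoded
```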

martineau
jfs
  • what does the b'' mean? – Aaron Apr 07 '14 at 18:58
  • 4
    `b''` is a `bytes` literal in Python 2.7 and Python 3. – jfs Apr 07 '14 at 18:59
  • `bufsize=1` is the key! At least it works when cmd is `fastboot`. – Jinghao Shi Aug 03 '14 at 20:50
  • @JinghaoShi: bufsize=1 should have no effect other than performance on Python 2 where bufsize=0 by default. – jfs Aug 03 '14 at 21:06
  • @J.F.Sebastian: Yeah, I was also confused. Yet it did make a difference in my case. – Jinghao Shi Aug 04 '14 at 20:15
  • 2
    @JinghaoShi: `bufsize=1` may make a difference if you also *write* (using `p.stdin`) to the subprocess e.g., it can help to avoid a deadlock while doing an interactive (`pexpect`-like) exchange -- assuming there are no buffering issues in child process itself. If you are only reading then as I said the difference is only in performance: if it is not so then could you provide a minimal complete code example that shows it? – jfs Aug 05 '14 at 01:22
  • @J.F.Sebastian: My code is the same with what you have in the answer. The sub process I was running is Android `fastboot` tool to flash a phone. And I didn't write to `p.stdin` during the process. Before setting `bufisze` to 1, I can only see the output after the flashing finishes (~2 min). – Jinghao Shi Aug 05 '14 at 16:56
  • This prints all of the read lines with a `b'...'` around them. – Nate Glenn Apr 16 '15 at 16:03
  • @NateGlenn: no, it won't. Notice: it uses Python 2 syntax: `print line,`. You are likely to see `b''` in Python 3. I've added Python 3 code. – jfs Apr 16 '15 at 21:14
  • Thanks. I'm writing an ST3 plugin and don't use Python normally. – Nate Glenn Apr 17 '15 at 02:29
  • I am also having the same issue where prints from my subprocess are buffered until the end of the subprocess. I tried to use your way of getting prints on the terminal as they are encountered but it is still buffering. My post is at http://stackoverflow.com/questions/31321414/how-to-get-the-normal-print-statement-execution-when-using-stdout-subprocess-pip – user2966197 Jul 09 '15 at 16:14
  • @user2966197: `-u` works for a python subprocess. In general, I recommend that you read the links in the paragraph above that starts with *"If subprocess' stdout uses a block buffering.."* – jfs Jul 09 '15 at 16:24
  • @J.F.Sebastian I am using python 3.4, `bufsize=1` really helps when using `subprocess.Popen().communicate()` as it essentially tells the process to flush stdout / stderr as soon as there is 1 byte of data. default is -1 – Devy Apr 01 '16 at 22:33
  • @Devy it is a wrong assumption. If `bufsize` had any effect on the buffering **inside** the child process then I wouldn't need to mention `stdbuf`, `pty`, etc. If you are using `.communicate()` then `bufsize` should not have any effect on the parent too (unless its implementation is defective). – jfs Apr 01 '16 at 22:36
  • @J.F.Sebastian so is the idea that we put all the code that we want to execute after the subprocess is finished *inside* the `with` context manager (after the `for` loop) and if so, how do we get something like the return code? (as neither `wait` nor `communicate` are being called) – Startec Jun 01 '16 at 21:53
  • @Startec why do you think it is necessary? Put the code after the `with`-statement, to make sure the process is reaped. – jfs Jun 01 '16 at 22:02
  • could you listen in for stderr at the same time? – ealeon Jun 30 '16 at 16:46
  • 1
    @ealeon: yes. It requires techniques that can [read stdout/stderr concurrently](http://stackoverflow.com/a/31953436/4279) unless you merge stderr into stdout (by passing `stderr=subprocess.STDOUT` to `Popen()`). See also, [threading](http://stackoverflow.com/a/4985080/4279) or [asyncio solutions](http://stackoverflow.com/a/25960956/4279) linked there. – jfs Jun 30 '16 at 16:59
  • @J.F.Sebastian In python 3.5.2, I see exactly the same behavior when I replace the loop starting `for line in p.stdout:` by `pass`. Is there some way to actually process the lines? Suppose I only want to print some of the lines, for example? – saulspatz Feb 27 '17 at 19:31
  • 2
    @saulspatz if `stdout=PIPE` doesn't capture the output (you still see it on the screen) then your program might print to stderr or directly to the terminal instead. To merge stdout&stderr, pass `stderr=subprocess.STDOUT` (see my previous comment). To capture output printed directly to your tty, you could [use pexpect, pty solutions.](http://stackoverflow.com/a/25945031/4279). Here's a [more complex code example](http://stackoverflow.com/a/29085418/4279). – jfs Feb 27 '17 at 21:09
  • With `Popen("./Portability.py", stdout=PIPE, stderr=STDOUT, bufsize=1, universal_newlines=True) as p` and `open('Portability.log', 'ab') as file`, looping `for line in p.stdout` and calling `sys.stdout.buffer.write(line)` and `file.write(line)` gives: `Traceback (most recent call last): File "./Portability_Tests.py", line 67, in sys.stdout.buffer.write(line) # pass bytes as is TypeError: a bytes-like object is required, not 'str'` – Ujjawal Khare Dec 26 '17 at 07:11
47

Please note, I think J.F. Sebastian's method (above) is better.


Here is a simple example (with no checking for errors):

import subprocess
proc = subprocess.Popen('ls',
                        shell=True,
                        stdout=subprocess.PIPE,
                        )
while proc.poll() is None:
    output = proc.stdout.readline()
    print output,

If ls ends too fast, then the while loop may end before you've read all the data.

You can catch the remainder in stdout this way:

output = proc.communicate()[0]
print output,
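Both pieces together, rewritten in Python 3 syntax and with an illustrative Python one-liner in place of ls (the child command here is a placeholder, not part of the original answer):

```python
import subprocess
import sys

# Illustrative child; any command producing a few lines works.
proc = subprocess.Popen([sys.executable, "-c", "print('a'); print('b')"],
                        stdout=subprocess.PIPE,
                        universal_newlines=True)
while proc.poll() is None:
    line = proc.stdout.readline()
    sys.stdout.write(line)

# The child may exit between poll() and readline(); communicate() drains
# whatever is still sitting in the pipe.
remainder = proc.communicate()[0]
sys.stdout.write(remainder)
```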
unutbu
  • 1
    does this scheme fall victim to the buffer blocking problem that the python doc refers to? – Heinrich Schmetterling Apr 26 '10 at 19:22
  • @Heinrich, the buffer blocking problem is not something I understand well. I believe (just from googling around) that this problem only occurs if you don't read from stdout (and stderr?) inside the while loop. So I think the above code is okay, but I can't say for sure. – unutbu Apr 26 '10 at 19:44
  • 1
    This actually does suffer from a blocking problem, a few years ago I had no end to the trouble where readline would block 'til it got a newline even if the proc had ended. I don't remember the solution, but I think it had something to do with doing the reads on a worker thread and just looping `while proc.poll() is None: time.sleep(0)` or something to that effect. Basically- you need to either ensure that the output newline is the last thing that the process does (because you can't give the interpreter time to loop again) or you need to do something "fancy." – dash-tom-bang Apr 26 '10 at 20:05
  • @Heinrich: Alex Martelli writes about how to avoid the deadlock here: http://stackoverflow.com/questions/1445627/how-can-i-find-out-why-subprocess-popen-wait-waits-forever-if-stdoutpipe/1445647#1445647 – unutbu Apr 26 '10 at 20:47
  • 6
    The buffer blocking is simpler than it sometimes sounds: parent blocks waiting for child to exit + child blocks waiting for parent to read and free some space in the communication pipe which is full = deadlock. It is that simple. The smaller the pipe the more likely to happen. – MarcH Mar 28 '13 at 14:13
6

I believe the simplest way to collect output from a process in a streaming fashion is like this:

import sys
from subprocess import *
proc = Popen('ls', shell=True, stdout=PIPE)
while True:
    data = proc.stdout.readline()   # Alternatively proc.stdout.read(1024)
    if len(data) == 0:
        break
    sys.stdout.write(data)   # sys.stdout.buffer.write(data) on Python 3.x

The readline() or read() function should only return an empty string on EOF, after the process has terminated - otherwise it will block if there is nothing to read (readline() includes the newline, so on empty lines, it returns "\n"). This avoids the need for an awkward final communicate() call after the loop.

On files with very long lines read() may be preferable to reduce maximum memory usage - the number passed to it is arbitrary, but excluding it results in reading the entire pipe output at once which is probably not desirable.
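A chunked-read sketch of that variant (the 1024-byte chunk size and the Python one-liner child are illustrative):

```python
import sys
from subprocess import Popen, PIPE

# Illustrative child that emits one very long line.
proc = Popen([sys.executable, "-c", "print('x' * 5000)"], stdout=PIPE)
chunks = []
while True:
    data = proc.stdout.read(1024)   # never holds more than 1024 bytes at once
    if len(data) == 0:              # empty bytes object means EOF
        break
    chunks.append(data)
proc.wait()
output = b''.join(chunks)
```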

D Coetzee
  • 4
    `data = proc.stdout.read()` blocks until *all* data is read. You might be confusing it with `os.read(fd, maxsize)` that can return earlier (as soon as any data is available). – jfs Aug 22 '13 at 09:15
  • You're correct, I was mistaken. However if a reasonable number of bytes is passed as an argument to `read()` then it works fine, and likewise `readline()` works fine as long as the maximum line length is reasonable. Updated my answer accordingly. – D Coetzee Aug 22 '13 at 23:46
3

If you want a non-blocking approach, don't use process.communicate(). If you set the subprocess.Popen() argument stdout to PIPE, you can read from process.stdout and check if the process still runs using process.poll().
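A truly non-blocking read is not straightforward; one common workaround (a sketch, with a hypothetical drain() helper and a Python one-liner standing in for the real child) is to do the blocking readline calls on a background thread while the main thread stays free to poll:

```python
import sys
import threading
from subprocess import Popen, PIPE

def drain(pipe, sink):
    # Blocks on readline in this thread, so the main thread never has to.
    for line in iter(pipe.readline, b''):
        sink.append(line)
    pipe.close()

# Illustrative child command.
proc = Popen([sys.executable, "-c", "print('ready')"], stdout=PIPE)
lines = []
reader = threading.Thread(target=drain, args=(proc.stdout, lines))
reader.start()

proc.wait()     # main thread remains free to call poll() or do other work
reader.join()
```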

Lukáš Lalinský
  • 1
    [non-blocking approach is not straightforward](http://stackoverflow.com/q/375427/4279) – jfs Sep 23 '15 at 14:45
2

If you're simply trying to pass the output through in realtime, it's hard to get simpler than this:

import subprocess

# This will raise a CalledProcessError if the program returns a nonzero code.
# You can use call() instead if you don't care about that case.
subprocess.check_call(['ls', '-l'])

See the docs for subprocess.check_call().

If you need to process the output, sure, loop on it. But if you don't, just keep it simple.

Edit: J.F. Sebastian points out both that the defaults for the stdout and stderr parameters pass through to sys.stdout and sys.stderr, and that this will fail if sys.stdout and sys.stderr have been replaced (say, for capturing output in tests).

Nate
  • It won't work if `sys.stdout` or `sys.stderr` are replaced with file-like objects that have no real fileno(). If `sys.stdout`, `sys.stderr` are not replaced then it is even simpler: `subprocess.check_call(args)`. – jfs Sep 22 '15 at 18:47
  • Thanks! I'd realized the vagaries of replacing sys.stdout/stderr, but somehow never realized that if you omit the arguments, it passes stdout and stderr to the right places. I like `call()` over `check_call()` unless I want the `CalledProcessError`. – Nate Sep 23 '15 at 14:41
  • `python -mthis`: *"Errors should never pass silently. Unless explicitly silenced."* that is why the _example code_ should prefer `check_call()` over `call()`. – jfs Sep 23 '15 at 14:42
  • Heh. A lot of the programs I wind up `call()`ing return nonzero error codes in non-error conditions, because they are terrible. So on our case, a nonzero error code is not actually an error. – Nate Sep 23 '15 at 14:45
  • yes. There are programs such as `grep` that may return non-zero exit status even if there is no error -- they are exceptions. By default zero exit status indicates success. – jfs Sep 23 '15 at 14:48
  • Sure. Anyhow, now the example code uses check_call() and explains when you might want call() instead. – Nate Sep 23 '15 at 14:52
1
import subprocess

myCommand = "ls -l"
cmd = myCommand.split()
# "universal newline support": \n, \r\n and \r are each interpreted as a newline.
p = subprocess.Popen(cmd, stderr=subprocess.PIPE, universal_newlines=True)
for line in p.stderr:    # ends at EOF, i.e. when the process closes stderr
    print(line.rstrip('\r\n'))
Petr J
  • 1
    it is always good to explain what your solution does just to make people understand better – DaFois Nov 12 '17 at 23:44
  • 2
    You should consider using `shlex.split(myCommand)` instead of `myCommand.split()`. It honors spaces in quoted arguments, as well. – UtahJarhead Sep 17 '18 at 01:32
0

Adding another python3 solution with a few small changes:

  1. Allows you to catch the exit code of the shell process (I have been unable to get the exit code while using the with construct)
  2. Also pipes stderr out in real time

import subprocess
import sys
def subcall_stream(cmd, fail_on_error=True):
    # Run a shell command, streaming output to STDOUT in real time
    # Expects a list style command, e.g. `["docker", "pull", "ubuntu"]`
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=1, universal_newlines=True)
    for line in p.stdout:
        sys.stdout.write(line)
    p.wait()
    exit_code = p.returncode
    if exit_code != 0 and fail_on_error:
        raise RuntimeError(f"Shell command failed with exit code {exit_code}. Command: `{cmd}`")
    return exit_code
bigfoot56