What difference between subprocess.call() and subprocess.Popen() makes PIPE less secure for the former?

Question

I've had a look at the documentation for both of them.

This question is prompted by J.F.'s comment here: Retrieving the output of subprocess.call()

The current Python documentation for subprocess.call() says the following about using PIPE:

Note Do not use stdout=PIPE or stderr=PIPE with this function. The child process will block if it generates enough output to a pipe to fill up the OS pipe buffer as the pipes are not being read from.

Python 2.7 subprocess.call():

Note Do not use stdout=PIPE or stderr=PIPE with this function as that can deadlock based on the child process output volume. Use Popen with the communicate() method when you need pipes.

Python 2.6 includes no such warnings.

Also, subprocess.call() and subprocess.check_call() don't seem to have a way to access their output, except for using stdout=PIPE with communicate():

https://docs.python.org/2.6/library/subprocess.html#convenience-functions

Note that if you want to send data to the process’s stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than None in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.

https://docs.python.org/2.6/library/subprocess.html#subprocess.Popen.communicate

What difference between subprocess.call() and subprocess.Popen() makes PIPE less secure for subprocess.call()?

More specifically: Why does subprocess.call() "deadlock based on the child process output volume", and not Popen()?

// , Followup question: How does this difference change between Python 2 and Python 3? — Nathan Basanese, Sep 02 '15 at 23:47
// , Follow-up Question for Meta: Why do so few people have a sense of humor on SO? — Nathan Basanese, Sep 03 '15 at 17:12
# @NathanBasanese Perhaps I'm [completely missing the point](https://allthetropes.orain.org/wiki/Comically_Missing_the_Point), but the Python comment symbol is `#`, unlike in C++, PHP, and JavaScript. — Damian Yerrick, Sep 03 '15 at 17:14

score 21 · Accepted Answer · edited May 23 '17 at 12:34

call() is just Popen().wait() (± error handling).
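
To make that equivalence concrete, here is a minimal sketch of what call() boils down to. The name call_sketch is hypothetical, and the sketch deliberately leaves out the timeout and error handling that the real function adds:

```python
import subprocess
import sys


def call_sketch(*popenargs, **kwargs):
    """Hypothetical stand-in for subprocess.call(): start the child, then wait.

    Nothing here ever reads from the child's stdout or stderr, which is why
    passing stdout=PIPE to it would leave the pipe undrained.
    """
    p = subprocess.Popen(*popenargs, **kwargs)
    return p.wait()  # blocks until the child exits, returns its exit status


# A child that exits with status 5; sys.executable is the running interpreter.
EXIT_FIVE = [sys.executable, "-c", "import sys; sys.exit(5)"]
```

Calling call_sketch(EXIT_FIVE) returns the child's exit status, just as subprocess.call(EXIT_FIVE) would.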

You should not use stdout=PIPE with call() because it does not read from the pipe, and therefore the child process will hang as soon as it fills the corresponding OS pipe buffer. Here's a picture that shows how data flows in a command1 | command2 shell pipeline:

It does not matter what your Python version is -- the pipe buffer (look at the picture) is outside of your Python process. Python 3 does not use C stdio, but that affects only the internal buffering. When the internal buffer is flushed, the data goes into the pipe. If command2 (your parent Python program) does not read from the pipe, then command1 (the child process, e.g., started by call()) will hang as soon as the pipe buffer is full. On my Linux box, pipe_size = fcntl(p.stdout, F_GETPIPE_SZ) is ~65K; the maximum value is in /proc/sys/fs/pipe-max-size, ~1M.
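
The hang can be observed without freezing your own program by waiting with a timeout. This is only a sketch, assuming Python 3.3+ (for the timeout argument) and a child that writes about 1 MiB, which is more than the pipe buffer sizes mentioned above; the names CHILD and wait_without_reading are made up for illustration:

```python
import subprocess
import sys

# Child that writes about 1 MiB to stdout and then exits.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * (1 << 20))"]


def wait_without_reading(timeout=2.0):
    """Wait on the child the way call() does, without reading its pipe.

    Returns (stuck, bytes_read, returncode), where stuck is True if the
    child was still running when the timeout expired.
    """
    p = subprocess.Popen(CHILD, stdout=subprocess.PIPE)
    try:
        p.wait(timeout=timeout)   # wait only, never read: same as call()
        stuck = False
    except subprocess.TimeoutExpired:
        stuck = True              # child is blocked writing to the full pipe
    out, _ = p.communicate()      # draining the pipe lets the child finish
    return stuck, len(out), p.returncode
```

With a plain call(CHILD, stdout=PIPE) there is no timeout and nobody drains the pipe, so the wait would never end.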

You may use stdout=PIPE if you read from the pipe later, e.g., using the Popen.communicate() method. You could also read from process.stdout (the file object that represents the pipe) directly.
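
Both ways of reading might look like the following sketch. It assumes Python 3; the child command and the function names are illustrative only:

```python
import subprocess
import sys

# Child that prints 100000 short lines, well over a typical pipe buffer.
CHILD = [sys.executable, "-c", "for i in range(100000): print(i)"]


def read_with_communicate():
    """Let communicate() read the pipe until EOF and then reap the child."""
    p = subprocess.Popen(CHILD, stdout=subprocess.PIPE)
    out, _ = p.communicate()
    return p.returncode, len(out.splitlines())


def read_stdout_directly():
    """Iterate over p.stdout so the pipe is drained while the child runs."""
    count = 0
    with subprocess.Popen(CHILD, stdout=subprocess.PIPE) as p:
        for _line in p.stdout:
            count += 1
    # Leaving the with block closes the pipe and waits for the child.
    return p.returncode, count
```

In both cases the parent keeps reading, so the child never blocks for long on a full pipe.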

answered Sep 04 '15 at 11:26

jfs

// , It looks like the "pipe" metaphor works quite well. I did not know I could take it so literally that there could be a _stuffed_ pipe. Makes me wonder why the kernel's pipe buffer for that command does not just flush itself automatically, or why `call()` wouldn't just flush that buffer. To extend the "pipe" metaphor, if the pipe is installed somewhere that might not get out of the pipe, why not leave a pressure-activated connection (solenoid valve?) to the drainage sump? ("Flush that Buffer! Make code Tougher!"-Possible Slogan?) – Nathan Basanese Sep 04 '15 at 22:54
// , I can guess why some call systems programmers "Plumbers." It's all in the pipes. – Nathan Basanese Sep 04 '15 at 22:56
There is a way to say "flush the pipe buffer automatically"; it is called: "drop data" implemented as `stdout=DEVNULL` (you can use it safely with `call()`). – jfs Sep 04 '15 at 23:02
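
A small sketch of that suggestion, assuming Python 3.3+ (where subprocess.DEVNULL exists); NOISY_CHILD and run_and_discard are hypothetical names:

```python
import subprocess
import sys

# Child that writes about 1 MiB to stdout.
NOISY_CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * (1 << 20))"]


def run_and_discard():
    """Run the child with its stdout sent to the null device.

    There is no pipe to fill up, so call() returns as soon as the child exits.
    """
    return subprocess.call(NOISY_CHILD, stdout=subprocess.DEVNULL)
```
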
// , Ah, but then one no longer has the option of, later, possibly using `communicate()` with the output of `call()` or the output of `check_call()`, right? – Nathan Basanese Sep 04 '15 at 23:15
@NathanBasanese: `call()` returns child's exit status -- it is an ordinary integer (8-bit range on POSIX, 32-bit -- on Windows); you can't call `communicate()` with it. Anyway, the child process is already dead, even its PID may be reused already by other processes by the time `call()` returns. `check_call()` is just `call()` that raises an exception if the exit status is non-zero. It would be a useful exercise to implement a shell pipeline using only POSIX calls: pipe, fork, exec, dup2, waitpid e.g., [`recursive-pipe.c`](https://gist.github.com/zed/7835043) – jfs Sep 05 '15 at 00:01
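
The point about return values can be checked with a short sketch (Python 3 assumed; the function name and the failing command are illustrative):

```python
import subprocess
import sys


def exit_statuses():
    """Show that call() returns a plain int and check_call() raises on non-zero."""
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    status = subprocess.call(failing)        # an ordinary integer, not a process object
    try:
        subprocess.check_call(failing)
        raised = None
    except subprocess.CalledProcessError as exc:
        raised = exc.returncode              # the same exit status, carried by the exception
    return status, raised
```
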

Daniel · Answer 2 · 2015-09-04T23:26:33.410

Both call and Popen provide means of accessing the output of your command:

With Popen you can use communicate or provide a file descriptor or file object to the stdout=... parameter.
With call your only option is to pass a file descriptor or a file object to the stdout=... parameter (you can not use communicate with this one).
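
For the call case, passing a file object could look like this sketch. It assumes Python 3 and uses a temporary file purely for illustration; call_with_file is a hypothetical name:

```python
import subprocess
import sys
import tempfile


def call_with_file():
    """Send the child's stdout to a real file, then read it back after call() returns."""
    cmd = [sys.executable, "-c", "print('hello from the child')"]
    with tempfile.TemporaryFile() as f:
        status = subprocess.call(cmd, stdout=f)  # the child writes straight to the file
        f.seek(0)                                # rewind before reading the output
        return status, f.read().decode().strip()
```

A file has no small fixed-size buffer that the parent must drain, so the child can write as much as it likes while call() waits.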

Now, the reason why stdout=PIPE is insecure when used with call is that call doesn't return until the subprocess has finished. This means all the output would have to stay in memory (in the OS pipe buffer) until that moment, and if there is too much output, that buffer fills up and the subprocess blocks.

The references where you can validate the above information are the following:

According to this the parameters for both call and Popen are the same:

The arguments shown above are merely the most common ones, described below in Frequently Used Arguments (hence the slightly odd notation in the abbreviated signature). The full function signature is the same as that of the Popen constructor - this functions passes all supplied arguments directly through to that interface.

According to this the possible values for the stdout parameter are:

Valid values are PIPE, an existing file descriptor (a positive integer), an existing file object, and None. PIPE indicates that a new pipe to the child should be created

// , Er, does `stdout=POPEN`, in the fourth paragraph, actually mean `stdout=PIPE`, or is `POPEN` a worthy file descriptor? — Nathan Basanese, Sep 04 '15 at 23:21
// , Like I said, I did read through the documentation. I read it way too many times. It made me sad. Also, as to the bullet • points, I know that you have to pass a file descriptor or file object, and that PIPE is a bad idea, but I am just wondering _why_ that is less secure in the case of `subprocess` methods. Also, in the fourth paragraph, why does the first part occur for the convenience functions, and not for Popen()? Why does this difference cause the "this means..." part? — Nathan Basanese, Sep 04 '15 at 23:24
@NathanBasanese Hi, about the `stdout=POPEN` thing, yes, it was a typo. Now it is corrected. About your other doubts: the reason is that with `Popen() + PIPE` you can call `communicate()` as many times as you want to keep the internal OS pipe's buffer from filling, while in the case of `call() + PIPE` you cannot invoke `communicate()` because your process blocks as soon as you invoke `call()`. — Daniel, Sep 04 '15 at 23:29
