63

I'm confused about how subprocess searches for the executable when using Popen(). It works if given absolute paths to the child process, but I'm trying to use relative paths. I've found that if I set the environment variable PYTHONPATH then I can get imported modules from that path ok, and PYTHONPATH is there in sys.path, but it doesn't seem to help with the behaviour of subprocess.Popen. I've also tried editing the sitecustomize.py file adding PYTHONPATH to os.environ, like so

# copy PYTHONPATH environment variable into PATH to allow our stuff to use
# relative paths for subprocess spawning
import os
if os.getenv('PYTHONPATH') is not None and os.getenv('PATH') is not None:
    os.environ['PATH'] = ':'.join([os.getenv('PATH'), os.getenv('PYTHONPATH')])

and verified that when starting up Python, either interactively, with ipython, or by running a script from the command line, PYTHONPATH successfully appears in os.environ. However, subprocess.Popen still doesn't search there for the executable. I thought it was supposed to inherit the parent's environment if no env kwarg is specified? Next I tried giving env explicitly, first by making a copy of os.environ and secondly just by giving env={'PATH': '/explicit/path/to/search/from'}, and it still does not find the executable. Now I'm stumped.

Hopefully an example will help explain my problem more clearly:

/dir/subdir1/some_executable
/dir/subdir2/some_script.py

# some_script.py
from subprocess import Popen, PIPE
spam, eggs = Popen(['../subdir1/some_executable'], stdout=PIPE, stderr=PIPE).communicate()

If I'm in /dir/subdir2 and I run python some_script.py it works, but if I'm in /dir and I run python subdir2/some_script.py even though /dir/subdir2 is in the os.environ['PATH'], then subprocess will throw OSError: [Errno 2] No such file or directory.

wim
  • 1
    On rereading the question, I think I see the issue. In a command shell, switch to `/dir` and see what happens if you type `../subdir1/some_executable`. – ncoghlan Apr 14 '11 at 06:24
  • ok i see what you are saying, my misunderstanding was the assumption that relative paths would get searched for just the same as a bare program call. thanks – wim Apr 14 '11 at 07:08

4 Answers

68

(filling in details from a comment to make a separate answer)

First off, relative paths (paths containing slashes) never get checked in any PATH, no matter what you do. They are relative to the current working directory only. If you need to resolve relative paths, you will have to search the PATH manually, or munge the PATH to include the subdirectories and then just use the command name as in my suggestion below.
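
A minimal sketch of the manual search mentioned above, assuming the goal is to try a slash-containing path against each directory in PATH. The helper name find_relative_on_path is hypothetical (it is not part of subprocess), and skipping empty PATH entries is a simplification.

```python
import os


def find_relative_on_path(relative, path=None):
    """Join `relative` to each PATH directory and return the first hit.

    Hypothetical helper: returns the normalized absolute path of the first
    candidate that is an executable file, or None if nothing matches.
    """
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # simplification: ignore empty PATH entries
        candidate = os.path.abspath(os.path.join(directory, relative))
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

The returned absolute path can then be passed to Popen as the first element of args, so the result no longer depends on the current working directory.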

If you want to run a program relative to the location of the Python script, use __file__ and go from there to find the absolute path of the program, and then use the absolute path in Popen.

Searching in the current process' environment variable PATH

Secondly, there is an issue in the Python bug tracker about how Python deals with bare commands (no slashes). Basically, on Unix/Mac Popen behaves like os.execvp when the argument env=None (some unexpected behavior has been observed and noted at the end):

On POSIX, the class uses os.execvp()-like behavior to execute the child program.

This is actually true for both shell=False and shell=True, provided env=None. What this behavior means is explained in the documentation of the function os.execvp:

The variants which include a “p” near the end (execlp(), execlpe(), execvp(), and execvpe()) will use the PATH environment variable to locate the program file. When the environment is being replaced (using one of the exec*e variants, discussed in the next paragraph), the new environment is used as the source of the PATH variable.

For execle(), execlpe(), execve(), and execvpe() (note that these all end in “e”), the env parameter must be a mapping which is used to define the environment variables for the new process (these are used instead of the current process’ environment); the functions execl(), execlp(), execv(), and execvp() all cause the new process to inherit the environment of the current process.

The second quoted paragraph implies that execvp will use the current process' environment variables. Combined with the first quoted paragraph, we deduce that execvp will use the value of the environment variable PATH from the environment of the current process. This means that Popen looks at the value of PATH as it was when Python launched (the Python that runs the Popen instantiation) and no amount of changing os.environ will help you fix that.

Also, on Windows with shell=False, Popen pays no attention to PATH at all, and will only look relative to the current working directory.
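
One way to sidestep these platform differences is to resolve the command yourself before calling subprocess. The sketch below uses the standard library function shutil.which; the wrapper name run_resolved and its search_path parameter are hypothetical and only illustrate the idea.

```python
import shutil
import subprocess


def run_resolved(command, *args, search_path=None):
    """Look up `command` with shutil.which, then run the resolved path.

    Hypothetical helper. `search_path` is a PATH-style string; None lets
    shutil.which fall back to the PATH environment variable.
    """
    executable = shutil.which(command, path=search_path)
    if executable is None:
        raise FileNotFoundError(f"{command!r} not found on the search path")
    # The child is started from an explicit path, so Popen's own lookup
    # rules no longer matter.
    return subprocess.run(
        [executable, *args], stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
```

Because the lookup happens in Python, a missing program is reported before any child process is started.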

What shell=True does

What happens if we pass shell=True to Popen? In that case, Popen simply calls the shell:

The shell argument (which defaults to False) specifies whether to use the shell as the program to execute.

That is to say, Popen does the equivalent of:

Popen(['/bin/sh', '-c', args[0], args[1], ...])

In other words, with shell=True Python executes /bin/sh directly, without any searching. Passing the argument executable to Popen can change this: it seems that if executable is a string without slashes, Python treats it as the name of the shell program and searches for it in the value of PATH from the environment of the current process, i.e., just as it searches for programs in the case shell=False described above.

In turn, /bin/sh (or our shell executable) will look for the program we want to run in its own environment's PATH, which is the same as the PATH of the Python (current process), as deduced from the code after the phrase "That is to say..." above (because that call has shell=False, so it is the case already discussed earlier). Therefore, the execvp-like behavior is what we get with both shell=True and shell=False, as long as env=None.

Passing env to Popen

So what happens if we pass env=dict(PATH=...) to Popen (thus defining an environment variable PATH in the environment of the program that will be run by Popen)?

In this case, the new environment is used to search for the program to execute. Quoting the documentation of Popen:

If env is not None, it must be a mapping that defines the environment variables for the new process; these are used instead of the default behavior of inheriting the current process’ environment.

Combined with the above observations, and from experiments using Popen, this means that Popen in this case behaves like the function os.execvpe. If shell=False, Python searches for the given program in the newly defined PATH. As already discussed above for shell=True, in that case the program is either /bin/sh, or, if a program name is given with the argument executable, then this alternative (shell) program is searched for in the newly defined PATH.

In addition, if shell=True, then inside the shell the search path that the shell will use to find the program given in args is the value of PATH passed to Popen via env.

So with env != None, Popen searches in the value of the key PATH of env (if a key PATH is present in env).
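
A minimal sketch of building such an env, assuming you want to keep the rest of the current environment and only extend PATH. The helper name env_with_extra_path is hypothetical; os.pathsep is used instead of a hard-coded ':' so the separator matches the platform.

```python
import os


def env_with_extra_path(directory, base_env=None):
    """Return a copy of the environment with `directory` prepended to PATH.

    Hypothetical helper: `base_env` defaults to os.environ and is never
    modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    env["PATH"] = directory + os.pathsep + current if current else directory
    return env


# Possible use, with the directory from the question:
# subprocess.Popen(['some_executable'], env=env_with_extra_path('/dir/subdir1'))
```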

Propagating environment variables other than PATH as arguments

There is a caveat about environment variables other than PATH: if the values of those variables are needed in the command (e.g., as command-line arguments to the program being run), then even if they are present in the env given to Popen, they will not be expanded without shell=True. This is easily avoided without resorting to shell=True: insert those values directly into the list argument args that is given to Popen. (Also, if these values come from Python's own environment, the method os.environ.get can be used to get their values.)
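
As a small illustration of that caveat, the sketch below passes a value taken from the environment directly in args. The variable name GREETING and the helper echo_first_argument are hypothetical; the child is just another Python interpreter that prints its first argument.

```python
import os
import subprocess
import sys


def echo_first_argument(value):
    """Run a child Python that prints its first argument; return that text."""
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        stdout=subprocess.PIPE,
        check=True,
    )
    return completed.stdout.decode().rstrip("\r\n")


# GREETING is a hypothetical variable name; "hello" is used if it is unset.
greeting = os.environ.get("GREETING", "hello")

# With shell=False a literal "$GREETING" would reach the child unexpanded,
# so the value itself is placed in the argument list.
printed = echo_first_argument(greeting)
```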

Using /usr/bin/env

If you JUST need path evaluation and don't really want to run your command line through a shell, and are on UNIX, I advise using env instead of shell=True, as in

path = '/dir1:/dir2'
subprocess.Popen(['/usr/bin/env', '-P', path, 'progtorun', other, args], ...)

This lets you pass a different PATH to the env process (using the option -P, which the BSD and macOS versions of env provide; GNU coreutils env does not appear to support it), which will use it to find the program. It also avoids issues with shell metacharacters and potential security issues with passing arguments through the shell. Obviously, on Windows (pretty much the only platform without a /usr/bin/env) you will need to do something different.

About shell=True

Quoting the Popen documentation:

If shell is True, it is recommended to pass args as a string rather than as a sequence.

Note: Read the Security Considerations section before using shell=True.

Unexpected observations

The following behavior was observed:

  • This call raises FileNotFoundError, as expected:

    subprocess.call(['sh'], shell=False, env=dict(PATH=''))
    
  • This call finds sh, which is unexpected:

    subprocess.call(['sh'], shell=False, env=dict(FOO=''))
    

    Typing echo $PATH inside the shell that this opens reveals that the PATH value is not empty, and also different from the value of PATH in the environment of Python. So it seems that PATH was indeed not inherited from Python (as expected in the presence of env != None), but still, the PATH is nonempty. It is unknown why this is the case.

  • This call raises FileNotFoundError, as expected:

    subprocess.call(['tree'], shell=False, env=dict(FOO=''))
    
  • This finds tree, as expected:

    subprocess.call(['tree'], shell=False, env=None)
    
Ioannis Filippidis
Walter Mundt
  • 13
    +1 "Also, on Windows with shell=False, it pays no attention to PATH at all, and will only look in relative to the current working directory." Just helped me with a big issue - thanks! – sparc_spread Oct 09 '13 at 20:03
  • 3
    A simple way that should work on Windows too is to explicitly give `os.environ['PATH']` as the argument `env` to `subprocess.Popen`, as done here: http://stackoverflow.com/a/4453495/1959808 and there: http://stackoverflow.com/a/20669704/1959808. – Ioannis Filippidis Jun 30 '16 at 12:02
  • the `/usr/bin/env` trick does not work, at least for system commands like `useradd` and at least at CentOS (with empty PATH from cron): `/usr/bin/env: groupadd: No such file or directory` – grandrew Jul 05 '16 at 13:39
  • If PATH is empty, that's not surprising. AFAIK, unlike the shell, /usr/bin/env doesn't have a default PATH it falls back on. Honestly, I wouldn't recommend relying on the shell's default PATH anyway; if you're writing cron jobs, just write out the full paths to your binaries or set a PATH yourself. – Walter Mundt Jul 14 '16 at 17:56
  • I have a `subprocess.Popen` that does appear to search the path with `shell=False`. What is ineffective, however, is augmenting the path to include the location of the executable using `sys.path.append` - I found that it only works if `%PATH%` contains the path to the executable before the Python program is launched. – starfry Mar 20 '17 at 11:34
14

You appear to be a little confused about the nature of PATH and PYTHONPATH.

PATH is an environment variable that tells the OS shell where to search for executables.

PYTHONPATH is an environment variable that tells the Python interpreter where to search for modules to import. It has nothing to do with subprocess finding executable files.

Due to the differences in the underlying implementation, subprocess.Popen will only search the path by default on non-Windows systems (Windows has some system directories it always searches, but that's distinct from PATH processing). The only reliable cross-platform way to scan the path is by passing shell=True to the subprocess call, but that has its own issues (as detailed in the Popen documentation).

However, it appears your main problem is that you are passing a path fragment to Popen rather than a simple file name. As soon as you have a directory separator in there, you're going to disable the PATH search, even on a non-Windows platform (e.g. see the Linux documentation for the exec family of functions).

ncoghlan
  • 3
    This doesn't match up with the Python docs. [The Popen docs](http://docs.python.org/library/subprocess.html) state that the program is executed via `os.execvp` -- and that call DOES take into account the PATH environment variable. Also, if you JUST need path evaluation, I advise using `env` instead of `shell=True`, as in `Popen(['/usr/bin/env', 'progtorun', other, args], ...)`. This avoids issues with shell metacharacters and potential security issues with passing arguments through the shell. – Walter Mundt Apr 14 '11 at 06:03
  • 1
    Those are both *NIX specific though - they don't work on Windows, so I don't like recommending them as workarounds for a nominally cross-platform module. You're correct that my answer is incorrect as written though - will edit accordingly. – ncoghlan Apr 14 '11 at 06:17
  • 2
    Updated to make it clear that not searching PATH by default is a Windows-only thing, but also to point out the real problem (a directory separator in the command to be executed). – ncoghlan Apr 14 '11 at 06:36
  • A small alteration. `subprocess.Popen` will pick up executables in `C:\Windows\System32` which (and I had fun figuring this out) if you're running 32-bit python on a 64-bit Windows is actually `C:\Windows\SysWOW64` – John Oxley Oct 06 '16 at 09:19
  • 1
    @JohnOxley I've tweaked the answer to mention that, but are you aware of any good reference links for that? Maybe on MSDN somewhere? – ncoghlan Oct 10 '16 at 06:39
  • Personally, I've had no problem with windows and paths... no idea why. Totally finds "nosetests" in the path, for example on our Jenkins virtualenv windows instances. – Erik Aronesty May 02 '18 at 15:41
2

A relative path in subprocess.Popen acts relative to the current working directory, not the elements of the system's PATH. If you run python subdir2/some_script.py from /dir then the expected executable location (passed to Popen) will be /dir/../subdir1/some_executable, a.k.a. /subdir1/some_executable, not /dir/subdir1/some_executable.

If you would definitely like to use relative paths from a script's own directory to a particular executable, the best option would be to first construct an absolute path from the directory portion of the __file__ global variable.
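
The difference between the two starting points can be checked with a purely lexical path calculation. This is only a sketch: posixpath.normpath collapses ".." textually and does not consult the filesystem, so it ignores symlinks.

```python
import posixpath

relative = "../subdir1/some_executable"

# Resolved against the working directory /dir, which is what Popen uses.
from_cwd = posixpath.normpath(posixpath.join("/dir", relative))

# Resolved against the script's directory /dir/subdir2, which was intended.
from_script_dir = posixpath.normpath(posixpath.join("/dir/subdir2", relative))

print(from_cwd)         # /subdir1/some_executable
print(from_script_dir)  # /dir/subdir1/some_executable
```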

#!/usr/bin/env python
from subprocess import Popen, PIPE
from os.path import abspath, dirname, join
path = abspath(join(dirname(__file__), '../subdir1/some_executable'))
spam, eggs = Popen(path, stdout=PIPE, stderr=PIPE).communicate()
Jeremy Fishman
  • Uh, what? `subdir2/some_script.py` relative to `/dir` is simply `/dir/subdir2/some_script.py` – tripleee Dec 10 '20 at 16:06
  • The python script at `subdir2/some_script.py` executes `Popen` with an executable path of `../subdir1/some_executable`. It is that executable path that is resolved relative to the current working directory `/dir`, resulting in `/dir/../subdir1/some_executable`. See Walter's answer, which says the same thing in different ways. I could have worded my answer better. Cheers! edit: looks like I had a typo in my answer too, using `subdir2` in the executable path where I meant `subdir1`. – Jeremy Fishman Dec 11 '20 at 17:36
0

The pythonpath is set to the path from where the python interpreter is executed. So, in the second case of your example, the path is set to /dir and not /dir/subdir2. That's why you get an error.

c0da
  • 1
    i don't believe this is correct, because if i write a simple script to print os.environ , then PYTHONPATH is the same no matter from where i run the interpreter. PYTHONPATH was set in /etc/environment and is used to augment the search path for modules – wim Apr 14 '11 at 04:57
  • I meant to say that the directory from which python is executed, that directory is added to pythonpath. Here in the 2nd case, /dir is added, and not /dir/subdir2. So, you can either change your code to reflect the changes (one way can be to add /dir/subdir2 to os.path in your code) or launch python from the appropriate directory. – c0da Apr 14 '11 at 05:48