Background

I write small Python packages for a system that uses modules (https://luarocks.org/) to manage packages. For those of you who don't know it: you run `module load x`, and a small script modifies various environment variables to make software 'x' work; you can then undo this with `module unload x`.

This method of software management is nearly ubiquitous in scientific computing and has a lot of value in that arena: you can run ancient, unmaintained software alongside packages that it would otherwise interfere with; you can run multiple versions of the same software, which lets you reproduce your data exactly (you can go back to old versions); and you can run frankly poorly written, non-updated software with outdated dependencies.

These features are great, but they create an issue with the python 2/3 split:

What if you want to write a package that works with both python 2 and 3 and use it alongside software that requires either python 2 or 3?

The way you make old python2 dependent software work on these large systems is that you make a python/2.7.x module and a python/3.5 module. When you want to run a script that uses python 2, you load that module, etc.

However, I want to write a single python package that can work in either environment, because I want that software to be active regardless of which python interpreter is being used.

This is fundamentally extremely easy: just use a #!/usr/bin/env python shebang line, done. That works. I write all my software to work with either, so no problem.
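
A minimal sketch of such a script, for illustration only (the `describe_interpreter` helper is hypothetical and not taken from the package in question). It assumes nothing beyond the standard library and should run under whichever `python` is first on the PATH:

```python
#!/usr/bin/env python
"""Hypothetical example script that runs under Python 2.7 or Python 3."""
from __future__ import print_function

import sys


def describe_interpreter():
    """Return the version and path of the interpreter running this script."""
    version = "{0}.{1}.{2}".format(*sys.version_info[:3])
    return "Python {0} at {1}".format(version, sys.executable)


if __name__ == "__main__":
    # Report which interpreter the env lookup actually selected.
    print(describe_interpreter())
```

Running it after loading a different python module reports a different interpreter, which is the behavior the env shebang is meant to give.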

Question

The issue is: I want to use setuptools to distribute my package to other scientists in the same situation, and setuptools mangles the shebang line.

I don't want to get into a debate about whether mangling the shebang line is a good idea; I am sure it is, since it has existed in the same state for years. I honestly don't care: it doesn't work for me. The default setuptools install causes the software not to run, because when a Python interpreter's module is not loaded, that interpreter does not function: the PYTHONPATH is totally wrong for it.

If all of my users had root access, I could use the data_files option to just copy the scripts to /usr/bin, but this is a bad idea for compatibility, and my users don't have root access anyway so it is a moot point.

Things I tried so far:

I tried setting the sys.executable to /usr/bin/env python in the setup.py file, but that doesn't work, because then the shebang is: #!"/usr/bin/env python", which obviously doesn't work.

I tried the Don't touch my shebang class idea in this question: Don't touch my shebang! (it is the bottom answer with 0 votes). That didn't work either, probably because it is written for distutils and not setuptools. Plus that question is 6 years old.

I also looked at these questions:

Setuptools entry_points/console_scripts have specific Python version in shebang

Changing console_script entry point interpreter for packaging

The methods described there do not work, the shebang line is still altered.

Creating a setup.cfg file with the contents::

[build]
executable = /usr/bin/env python

also does not change the shebang line mangling behavior.

There is an open issue on the setuptools github page that discusses something similar:

https://github.com/pypa/setuptools/issues/494

So I assume this isn't possible to do natively, but I wonder if there is a workaround?

Finally, I don't like any solution that involves asking the user to modify their install flags, e.g. with -e.

Is there any way to modify this behavior, or is there another distribution system I can use instead? Or is this too much of an edge case, and I just need to write some kind of custom installation script?

Thanks all.


Update

I think I was not clear enough in my original question. What I want the user to be able to do is:

  • Install the package in both python2 and python3 (the modules will go into lib/pythonx/site-packages).
  • Be able to run the scripts irrespective of which python environment is active.

If there is a way to accomplish this without preventing shebang munging, that would be great.

All my code is already compatible with python 2.7 and python 3.3+ out of the box, the main thing is just making the scripts run irrespective of active python environment.

Mike Dacre
  • Couldn't you just install the software in two virtualenvs? One with Python 2 and one with Python 3. – Martijn Pieters Jun 17 '16 at 06:21
  • Yes I could, and I could also easily edit the installed scripts and change the shebang line. However, I ideally want to avoid anything 'complex' like that, because this project is intended to be used by people who don't have the patience for complex installs. I want to make it as easy for them as possible, i.e. `python ./setup.py install --user` and you're done. – Mike Dacre Jun 17 '16 at 10:20
  • You can write your code in Python polyglot. See http://stackoverflow.com/questions/11372190/python-2-and-python-3-dual-development – boardrider Jun 17 '16 at 10:52
  • @boardrider my code is actually already natively polyglot, the issue is that the scripts have the python path hardcoded in, so if the user's python environment changes, the scripts stop working, even though the code would work fine with any version of python... if it weren't for the shebang line. – Mike Dacre Jun 17 '16 at 18:27
  • You seem to have accidentally mangled your question. A big chunk of the text appears twice, with the formatting messed up around the start of the second copy. – user2357112 supports Monica Jun 17 '16 at 18:32
  • @user2357112 Thank you, I fixed it now – Mike Dacre Jun 17 '16 at 18:40
  • I don't understand what's your problem with using `#!/usr/bin/env python` – boardrider Jun 18 '16 at 19:18
  • @boardrider setuptools changes the shebang line to match the interpreter used to install the package. – Mike Dacre Jun 20 '16 at 04:21

1 Answer

I accidentally stumbled onto a workaround while trying to write a custom install script.

import os
from setuptools import setup
from setuptools.command.install import install

here = os.path.abspath(os.path.dirname(__file__))

# Generate a list of python scripts
scpts = []
scpt_dir = os.listdir(os.path.join(here, 'bin'))
for scpt in scpt_dir:
    scpts.append(os.path.join(here, 'bin', scpt))

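class ScriptInstaller(install):

    """Install scripts directly."""

    def run(self):
        """Wrapper for parent run."""
        # Call the parent directly: on Python 2 the distutils command
        # classes are old-style, so super() raises a TypeError there.
        install.run(self)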

setup(
    cmdclass={'install': ScriptInstaller},
    scripts=scpts,
    ...
)

This code doesn't do exactly what I wanted (alter just the shebang line); it copies the whole script to ~/.local/bin instead of wrapping it in::

__import__('pkg_resources').run_script()
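
As an aside, altering only the shebang line could be done as a separate step after installation. The following is a minimal sketch under stated assumptions: `force_env_shebang` and `PORTABLE_SHEBANG` are hypothetical names, not part of setuptools, and the function only handles plain-text scripts whose first line is already a python shebang:

```python
PORTABLE_SHEBANG = "#!/usr/bin/env python"


def force_env_shebang(path, shebang=PORTABLE_SHEBANG):
    """Rewrite the first line of ``path`` to ``shebang``.

    Hypothetical helper: returns True if the file was changed and
    False if it was left alone.
    """
    with open(path) as handle:
        lines = handle.readlines()

    # Only touch files that start with a shebang mentioning python.
    if not lines or not lines[0].startswith("#!") or "python" not in lines[0]:
        return False

    # Nothing to do if the shebang is already the portable one.
    if lines[0].rstrip("\r\n") == shebang:
        return False

    lines[0] = shebang + "\n"
    # Rewriting an existing file in place keeps its permission bits.
    with open(path, "w") as handle:
        handle.writelines(lines)
    return True
```

Calling it on each script installed into ~/.local/bin would restore the portable shebang, although it would have to be rerun after every reinstall, which is exactly the kind of extra step I would prefer users not to need.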

Additionally, and more worryingly, the cmdclass workaround makes setuptools create a root module directory plus an egg-info directory like this::

.local/lib/python3.5/site-packages/cluster
.local/lib/python3.5/site-packages/python_cluster-0.6.1-py3.5.egg-info

Instead of a single egg, which is the usual behavior::

.local/lib/python3.5/site-packages/python_cluster-0.6.1-py3.5.egg

As far as I am aware this is the behavior of the old distutils, which makes me worry that this install would fail on some systems or have other unexpected behavior (although please correct me if I am wrong, I really am no expert on this).

However, given that my code is going to be used almost entirely on linux and OS X, this isn't the end of the world. I am more concerned that this behavior will just disappear sometime very soon.

I posted a comment on an open feature request on the setuptools github page:

https://github.com/pypa/setuptools/issues/494

The ideal solution would be if I could add an executable=/usr/bin/env python statement to setup.cfg; hopefully that is reimplemented soon.

This workaround will work for me for now though. Thanks all.

Mike Dacre