27

I'm just starting a new Python project, and ideally I'd like to offer Python 2 and 3 support from the start, with minimal developmental overhead. My question is, what is the best way of doing this for brand new projects?

I have come across projects that run 2to3, or even 3to2, as part of their installation script. This seems to be a very common approach. However, there seem to be several different ways of doing this. I also came across Distribute.

There is also the option of trying to write polyglot Python 2/Python 3 code. Even though this seems like a horrible idea, I have noticed that lately I tend to write code that is more idiomatic as Python 3 code, even though I still run it as Python 2. I have a feeling this only eases my own transition when the day finally arrives, and does little to provide, or even help with, dual support.

Most of the projects offering dual support that I have seen added Python 3 support late, so I'm especially curious if there is a better way that is more suited for new projects, where you have the benefit of a clean slate.

Thanks!

jonrsharpe
Gustav Larsson
  • 3
    FWIW, the CherryPy project has managed to write polyglot code that works without conversion under 2.3 through 3.2. It is doable but really depends on exactly what your project is doing. – gps Jul 07 '12 at 17:34
  • Can we do this with the libraries also? I mean can we merge them together for compatibility? BTW, I was about to ask this same question on 2.7 and 3+, you saved me :-D – ABcDexter Aug 11 '15 at 14:58

5 Answers

7

In my experience, it depends on the kind of project.

If it is a library or a very self-contained application, a common choice is to develop in Python 2.7, avoiding constructs deprecated in Python 3.x as much as possible, and to rely on automated tests to identify the holes left by 2to3 that you will have to fix manually.
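As a minimal sketch of what avoiding deprecated constructs can look like, the snippet below sticks to forms that are valid on both 2.7 and 3.x. The function names and sample data are made up for illustration and do not come from any particular project.

```python
def summarize(counts):
    """Return 'key=value' strings sorted by key."""
    lines = []
    # counts.items() exists on both versions; iteritems() and has_key()
    # exist only on Python 2 and are best avoided.
    for key, value in sorted(counts.items()):
        lines.append("{0}={1}".format(key, value))
    return lines


def parse_int(text, default=0):
    """Parse an integer, falling back to a default on bad input."""
    try:
        return int(text)
    except ValueError as exc:  # 'except ValueError, exc' is a syntax error on Python 3
        print("could not parse {0!r}: {1}".format(text, exc))
        return default


if __name__ == "__main__":
    print(summarize({"b": 2, "a": 1}))
    print(parse_int("42"))
```

Code written this way tends to leave fewer holes for the automated tests to catch after conversion.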

On the other hand, for real-life applications, be prepared to constantly stumble upon libraries that are not ported to py3k yet (sometimes important ones). Most of the time you will have no choice but to port the library to Python 3 yourself, so if you can afford that, go for it. Usually I can't, which is why I'm not supporting Python 3 for this kind of project (though I try hard to write code that will be easier to port when the opportunity comes).

For unicode handling, I found this PyCon 2012 video very informative. The advice is good for both Python 2.x and 3.x: treat every string coming from outside as bytes, convert it to unicode as soon as possible, and convert output strings back to bytes as late as possible. There is another very informative video about date/time handling.
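The decode-early, encode-late advice can be sketched roughly as follows. The function name `shout` and the UTF-8 default are assumptions chosen for the example, not something taken from the video.

```python
def shout(raw_bytes, encoding="utf-8"):
    """Decode at the input boundary, work on text, encode at the output boundary."""
    text = raw_bytes.decode(encoding)   # bytes -> unicode as early as possible
    result = text.upper() + u"!"        # all processing happens on unicode text
    return result.encode(encoding)      # unicode -> bytes as late as possible


if __name__ == "__main__":
    # Stand-in for bytes read from a socket or a file opened in binary mode.
    incoming = u"caf\u00e9".encode("utf-8")
    print(repr(shout(incoming)))
```

Keeping the bytes/text boundary at the edges of the program means the code in the middle behaves the same on Python 2.x and 3.x.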

[update]

This is an old answer. As of today (2019) there is no good rationale to start a project using Python 2.x and there are several compelling reasons to port older projects to Python 3.7+ and abandon support for Python 2.x.

Paulo Scardine
5

In my experience it is better not to use a library like six; instead I keep a single compat.py in each package with just the code that is needed, not unlike Scott Griffiths's approach. six also carries the burden of trying to support long-gone Python versions: the truth is that life is much easier when you accept that Pythons <=2.6 and <=3.2 are gone. Python 2.7 has backported compatibility features, such as the .view* methods on dicts, which work exactly like their non-prefixed versions on Python 3; and Python 3.3, on the other hand, supports the u prefix on unicode strings again.

Even for very substantial packages, a compat.py module, which allows other code to work unchanged, can be quite short: here's an example from the pika package that my colleagues and I helped to make 2/3 polyglot. Pika is one of those projects whose internals were a real mess, mixing unicode and 8-bit strings with each other, but we have now used it in production on Python 3 for well over 6 months without problems.
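For readers who have not seen such a module, here is a hypothetical sketch of what a small compat.py might contain. It is not the actual pika module; the names `text_type`, `str_or_bytes`, `dict_keys` and `is_scalar_string` are assumptions chosen for illustration.

```python
# compat.py (hypothetical sketch, not copied from any real package)
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode        # noqa: F821, this name exists only on Python 2
    str_or_bytes = basestring  # noqa: F821, covers both str and unicode

    def dict_keys(mapping):
        """Return the keys as a list (dict.keys() is already a list here)."""
        return mapping.keys()
else:
    text_type = str
    str_or_bytes = (str, bytes)

    def dict_keys(mapping):
        """Return the keys as a list, like Python 2's dict.keys()."""
        return list(mapping.keys())


def is_scalar_string(value):
    """True for text or byte strings, which should not be iterated like lists."""
    return isinstance(value, str_or_bytes)
```

The rest of the package would then import these names from compat instead of branching on the Python version itself.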


Another important thing is to always use the following __future__ imports when developing:

from __future__ import absolute_import, division, print_function
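A short sketch of what these imports change on Python 2.7 (on Python 3 they are no-ops). The helper `halves` is a made-up name for the example.

```python
from __future__ import absolute_import, division, print_function

import sys


def halves(n):
    """Return (true division, floor division) of n by 2."""
    # With 'division', / is true division on Python 2 as well;
    # use // when integer (floor) division is what you want.
    return n / 2, n // 2


if __name__ == "__main__":
    # With 'print_function', print is a function and accepts keyword arguments.
    print(halves(7))
    print("diagnostics go here", file=sys.stderr)
    # With 'absolute_import', a plain 'import json' always means the top-level
    # module; a sibling module in the same package needs 'from . import json'.
```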

I recommend against using unicode_literals, because there are some strings that need to be of the type called str on either platform. If you don't use unicode_literals, you can do the following:

  • b'123' is the 8-bit string literal
  • '123' is of the type str on both platforms
  • u'123' is proper unicode text on both platforms
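The three literal forms can be checked with a few lines like these; this is only an illustrative sketch, and the variable names are made up.

```python
byte_literal = b'123'    # 8-bit string: str on Python 2, bytes on Python 3
native_literal = '123'   # always the type called str on the running interpreter
text_literal = u'123'    # unicode text: unicode on Python 2, str on Python 3.3+

text_type = type(u'')    # the unicode text type on whichever version is running

if __name__ == "__main__":
    for value in (byte_literal, native_literal, text_literal):
        print(type(value).__name__)
```

On Python 2 the native literal has the same type as the byte literal, while on Python 3 it has the same type as the text literal, which is why APIs that insist on str need the unprefixed form.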

In any case, please do not run 2to3 at installation or package build time. Some packages used to do that in the past; pip installing those packages took a few seconds on Python 2, but closer to a minute on Python 3.

Antti Haapala
  • It is good advice (write 2.7/3.3+ polyglot code), similar to [Porting Python 2 Code to Python 3 by Brett Cannon](https://docs.python.org/3/howto/pyporting.html), but why do you advise against using `six` for implementing your `compat.py` module? For example, what would be the point of reimplementing `six.indexbytes(b'123', 1) == 50` and other similar things in your own project with your own unique API? – jfs Mar 12 '16 at 20:00
  • @J.F.Sebastian `six` didn't contain all the compat features needed in pika. For example, it lacked a name for the types that shouldn't be treated as lists, `(str, bytes)` (`basestring` works for that in Python 2), a method for doing Python 2 style `dict.keys()`, and some other stuff. What could have been taken from six amounts to a dozen or two of those lines. – Antti Haapala Mar 12 '16 at 20:12
  • I would use a tuple `(six.text_type, bytes)`. If the API is useful in general Python 2/3 code then you could suggest its inclusion into `six`. Again, what is the point of creating slightly different compatibility APIs in different projects for the same thing? I understand that `six` may be too fat for some projects, but that shouldn't justify duplicating the code in *every* Python 2/3 project. – jfs Mar 12 '16 at 20:29
  • Well, not only is `six` fat, you also need to depend on a certain version of it. – Antti Haapala Mar 12 '16 at 20:42
2

Pick 2 or 3, whichever is your favorite flavor, and make it work really well in that, with unit tests. Then make sure those tests still pass after running the code through 2to3 or 3to2. It is better to maintain one version of the code.

Yusuf X
  • `2to3` was conceived for this purpose, but isn't usually very workable for it. Most of its early advocates have seen how it worked in the real world and no longer really support using it this way. `3to2` is even less likely actually to work. – Mike Graham Jan 31 '13 at 16:11
1

My personal experience has been that it's easier to write code that works unchanged in both Python 2 and 3, rather than rely on 2to3/3to2 scripts which often can't quite get the translation right.

Maybe my situation is unusual as I'm doing lots with byte types and 2to3 has a hard task converting these, but the convenience of having one code base outweighs the nastiness of having a few hacks in the code.

As a concrete example, my bitstring module was an early convert to Python 3 and the same code is used for Python 2.6/2.7/3.x. The source is over 4000 lines of code and this is the bit I needed to get it to work for the different major versions:

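# For Python 2.x/ 3.x coexistence
# Yes this is very very hacky.
BYTE_REVERSAL_DICT = dict()  # byte value -> one-byte string with its bits reversed
try:
    xrange  # raises NameError on Python 3, where xrange no longer exists
    for i in range(256):
        BYTE_REVERSAL_DICT[i] = chr(int("{0:08b}".format(i)[::-1], 2))
except NameError:
    for i in range(256):
        BYTE_REVERSAL_DICT[i] = bytes([int("{0:08b}".format(i)[::-1], 2)])
    # Python 3: recreate the Python 2 names that the rest of the module uses
    from io import IOBase as file
    xrange = range
    basestring = str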

OK, that's not pretty, but it means that I can write 99% of the code in good Python 2 style and all the unit tests still pass for the same code in Python 3. This route isn't for everyone, but it is an option to consider.
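As an aside, the byte-reversal table itself can be built without a version check by going through bytearray. This is an alternative sketch rather than the code bitstring uses, and the function name is made up; it does not replace the xrange, basestring and file aliases.

```python
def build_byte_reversal_table():
    """Map each byte value 0..255 to a one-byte string with its bits reversed."""
    table = {}
    for i in range(256):
        reversed_value = int("{0:08b}".format(i)[::-1], 2)
        # bytes(bytearray([n])) is a one-byte str on Python 2 and a
        # one-byte bytes object on Python 3.
        table[i] = bytes(bytearray([reversed_value]))
    return table


if __name__ == "__main__":
    table = build_byte_reversal_table()
    print(repr(table[1]))
```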

Scott Griffiths
1

If you need support for Python 2.5 or earlier, using Distribute and its 2to3 integration is typically the best way to go. But if you only need to support Python 2.6 or later, I would make the code run under Python 2 and Python 3 without conversion. I would also use the six library to make this easier.

Lennart Regebro