PEP 440 defines the accepted format for version strings of Python packages.

These can be simple, like: 0.0.1

Or complicated, like: 2016!1.0-alpha1.dev2

What is a suitable regex for finding and validating such strings?

Leo

2 Answers

I had the same question. This is the most thorough regex pattern I could find. PEP 440 links to the codebase of the packaging library in its references section.

pip install packaging

To access just the pattern string, you can use the module-level constant:

from packaging import version
version.VERSION_PATTERN

See: https://github.com/pypa/packaging/blob/16.7/packaging/version.py#L159

# Deliberately not anchored to the start and end of the string, to make it
# easier for 3rd party code to reuse
VERSION_PATTERN = r"""
v?
(?:
    (?:(?P<epoch>[0-9]+)!)?                           # epoch
    (?P<release>[0-9]+(?:\.[0-9]+)*)                  # release segment
    (?P<pre>                                          # pre-release
        [-_\.]?
        (?P<pre_l>(a|b|c|rc|alpha|beta|pre|preview))
        [-_\.]?
        (?P<pre_n>[0-9]+)?
    )?
    (?P<post>                                         # post release
        (?:-(?P<post_n1>[0-9]+))
        |
        (?:
            [-_\.]?
            (?P<post_l>post|rev|r)
            [-_\.]?
            (?P<post_n2>[0-9]+)?
        )
    )?
    (?P<dev>                                          # dev release
        [-_\.]?
        (?P<dev_l>dev)
        [-_\.]?
        (?P<dev_n>[0-9]+)?
    )?
)
(?:\+(?P<local>[a-z0-9]+(?:[-_\.][a-z0-9]+)*))?       # local version
"""

Of course this example is specific to Python's flavor of regex.
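
Since the pattern is deliberately unanchored and written in verbose layout, it has to be anchored and compiled with re.VERBOSE (and re.IGNORECASE, so that spellings like "RC1" are accepted) before it can validate a whole string. Below is a minimal sketch of that. It uses a condensed copy of the pattern quoted above so that it runs without the packaging dependency. The names PEP440_RE and parse_version are made up for this example and are not part of the library.

```python
import re

# Condensed copy of the pattern quoted above (same groups, comments dropped),
# so this sketch runs without `packaging` installed.
VERSION_PATTERN = r"""
    v?
    (?:
        (?:(?P<epoch>[0-9]+)!)?
        (?P<release>[0-9]+(?:\.[0-9]+)*)
        (?P<pre>[-_\.]?(?P<pre_l>(a|b|c|rc|alpha|beta|pre|preview))[-_\.]?(?P<pre_n>[0-9]+)?)?
        (?P<post>(?:-(?P<post_n1>[0-9]+))|(?:[-_\.]?(?P<post_l>post|rev|r)[-_\.]?(?P<post_n2>[0-9]+)?))?
        (?P<dev>[-_\.]?(?P<dev_l>dev)[-_\.]?(?P<dev_n>[0-9]+)?)?
    )
    (?:\+(?P<local>[a-z0-9]+(?:[-_\.][a-z0-9]+)*))?
"""

# Anchor the pattern ourselves. VERBOSE makes the layout whitespace
# insignificant; IGNORECASE accepts upper-case labels.
PEP440_RE = re.compile(
    r"^\s*" + VERSION_PATTERN + r"\s*$",
    re.VERBOSE | re.IGNORECASE,
)


def parse_version(text):
    """Return the named groups of a valid version string, or None (hypothetical helper)."""
    match = PEP440_RE.match(text)
    return match.groupdict() if match else None


parts = parse_version("2016!1.0-alpha1.dev2")
print(parts["epoch"], parts["release"], parts["pre_l"], parts["pre_n"], parts["dev_n"])
print(parse_version("1.0-"))  # trailing separator with nothing after it
```

With packaging installed, version.VERSION_PATTERN could be passed to re.compile in the same way instead of the inline copy.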

I think this should comply with PEP440:

^(\d+!)?(\d+)(\.\d+)+([\.\-\_])?((a(lpha)?|b(eta)?|c|r(c|ev)?|pre(view)?)\d*)?(\.?(post|dev)\d*)?$

Explained

Epoch, e.g. 2016!:

(\d+!)?

Version parts (major, minor, patch, etc.):

(\d+)(\.\d+)+

Acceptable separators (., - or _):

([\.\-\_])?

Possible pre-release flags and their alternative spellings (as well as the post-release flags r and rev), optionally followed by digits:

((a(lpha)?|b(eta)?|c|r(c|ev)?|pre(view)?)\d*)?

Post-release or dev-release flags, optionally followed by digits:

(\.?(post|dev)\d*)?
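
To see how the pieces fit together, here is a rough sketch that reassembles the fragments above and tries the result on a few strings. The variable names and the is_match helper are labels invented for this example.

```python
import re

# The fragments from the breakdown above, in order.
EPOCH = r"(\d+!)?"
RELEASE = r"(\d+)(\.\d+)+"
SEPARATOR = r"([\.\-\_])?"
PRE = r"((a(lpha)?|b(eta)?|c|r(c|ev)?|pre(view)?)\d*)?"
POST_OR_DEV = r"(\.?(post|dev)\d*)?"

# Concatenating them between ^ and $ gives back the full expression.
SIMPLE_RE = re.compile("^" + EPOCH + RELEASE + SEPARATOR + PRE + POST_OR_DEV + "$")


def is_match(text):
    """True if the simple regex accepts the string (hypothetical helper)."""
    return SIMPLE_RE.match(text) is not None


candidates = [
    "0.0.1",
    "2016!1.0-alpha1.dev2",
    "1",
    "1.0+local",
    "1.0.post1.dev2",
    "1.0-",
]
for candidate in candidates:
    print(candidate, is_match(candidate))
```

Both examples from the question are accepted. The check also suggests some limits: "1", "1.0+local" and "1.0.post1.dev2" are valid under PEP 440 but are rejected here (a single-number release, a local version, and a post release combined with a dev release), while "1.0-" is accepted because every part after the separator is optional.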
Leo
  • It was a difficult problem to solve, and could very well be useful to someone working with Python packaging. Does that not make it a decent question? – Leo Jun 22 '16 at 15:20
  • More to the point, if anyone has a better solution I'd really like to know. – Leo Jun 22 '16 at 15:21
  • First things first: ask the entire question. Your regex covers much more than you pointed out in the question. – rock321987 Jun 22 '16 at 15:26