223

In Python, the (?P<group_name>…) syntax allows one to refer to the matched string through its name:

>>> import re
>>> match = re.search('(?P<name>.*) (?P<phone>.*)', 'John 123456')
>>> match.group('name')
'John'

What does "P" stand for? I could not find any hint in the official documentation.

I would love to get ideas about how to help my students remember this syntax. Knowing what "P" does stand for (or might stand for) would be useful.

ekad
Eric O Lebigot
  • 21
    `P` stands for `Placeholder`. – kev Apr 08 '12 at 01:20
  • 1
    @kev: seems like that should be an answer? – ninjagecko Apr 08 '12 at 01:25
  • 4
    Since guesses are appropriate, I conjecture that Ken Thompson is a hippie sympathizer and the "P" stands for "Patchouli". – aaronasterling Apr 08 '12 at 01:30
  • 2
    This question has been added to the [Stack Overflow Regular Expression FAQ](http://stackoverflow.com/a/22944075/2736496), under "Groups". – aliteralmind Apr 10 '14 at 00:27
  • 1
    Just a reminder: The [regex](https://pypi.python.org/pypi/regex/) module supports naming groups with both the `(?<name>...)` syntax and the current `(?P<name>...)`. – AXO Oct 26 '16 at 23:14
  • 8
    By the way, if you use `match.groups` (with an `s`) you will silently get a tuple of *all* groups -_- `groups('name')` => `('John', '123456')` when what you actually wanted was `group('name')` => `'John'` I hope this saves someone somewhere some time(s). – szmoore Feb 10 '17 at 03:38
  • one word to summarize "stupid P" – AbstProcDo May 31 '19 at 09:13
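
The `group` versus `groups` pitfall mentioned in the comments can be reproduced with a short sketch. It reuses the pattern from the question; the variable names `one`, `everything`, and `by_name` are only illustrative.

```python
import re

match = re.search(r'(?P<name>.*) (?P<phone>.*)', 'John 123456')

# group() with a name returns just that one group.
one = match.group('name')

# groups() returns every group as a tuple. Its only argument is the
# default used for groups that did not take part in the match, so
# passing 'name' is silently accepted and has no effect here.
everything = match.groups('name')

# groupdict() maps each group name to its matched text.
by_name = match.groupdict()

print(one)         # John
print(everything)  # ('John', '123456')
print(by_name)     # {'name': 'John', 'phone': '123456'}
```

When several named groups are needed at once, `groupdict()` is usually the clearer call.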

3 Answers

314

Since we're all guessing, I might as well give mine: I've always thought it stood for Python. That may sound pretty stupid -- what, P for Python?! -- but in my defense, I vaguely remembered this thread [emphasis mine]:

Subject: Claiming (?P...) regex syntax extensions

From: Guido van Rossum (gui...@CNRI.Reston.Va.US)

Date: Dec 10, 1997 3:36:19 pm

I have an unusual request for the Perl developers (those that develop the Perl language). I hope this (perl5-porters) is the right list. I am cc'ing the Python string-sig because it is the origin of most of the work I'm discussing here.

You are probably aware of Python. I am Python's creator; I am planning to release a next "major" version, Python 1.5, by the end of this year. I hope that Python and Perl can co-exist in years to come; cross-pollination can be good for both languages. (I believe Larry had a good look at Python when he added objects to Perl 5; O'Reilly publishes books about both languages.)

As you may know, Python 1.5 adds a new regular expression module that more closely matches Perl's syntax. We've tried to be as close to the Perl syntax as possible within Python's syntax. However, the regex syntax has some Python-specific extensions, which all begin with (?P . Currently there are two of them:

(?P<foo>...) Similar to regular grouping parentheses, but the text matched by the group is accessible after the match has been performed, via the symbolic group name "foo".

(?P=foo) Matches the same string as that matched by the group named "foo". Equivalent to \1, \2, etc. except that the group is referred to by name, not number.

I hope that this Python-specific extension won't conflict with any future Perl extensions to the Perl regex syntax. If you have plans to use (?P, please let us know as soon as possible so we can resolve the conflict. Otherwise, it would be nice if the (?P syntax could be permanently reserved for Python-specific syntax extensions. (Is there some kind of registry of extensions?)

to which Larry Wall replied:

[...] There's no registry as of now--yours is the first request from outside perl5-porters, so it's a pretty low-bandwidth activity. (Sorry it was even lower last week--I was off in New York at Internet World.)

Anyway, as far as I'm concerned, you may certainly have 'P' with my blessing. (Obviously Perl doesn't need the 'P' at this point. :-) [...]

So I don't know what the original choice of P was motivated by -- pattern? placeholder? penguins? -- but you can understand why I've always associated it with Python. Which, considering that (1) I don't like regular expressions and avoid them wherever possible, and (2) this thread happened fifteen years ago, is kind of odd.
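
For readers who have not used the two extensions described in Guido's message, here is a minimal sketch of `(?P<name>...)` together with `(?P=name)`. The group name `word` and the sample sentence are made up for illustration.

```python
import re

# (?P<word>...) captures a word; (?P=word) then requires the same text again.
doubled_by_name = re.compile(r'\b(?P<word>\w+) (?P=word)\b')

# The same pattern written with a numbered backreference.
doubled_by_number = re.compile(r'\b(\w+) \1\b')

text = 'Paris in the the spring'

by_name = doubled_by_name.search(text).group('word')
by_number = doubled_by_number.search(text).group(1)

print(by_name)    # the
print(by_number)  # the
```

A named group is still numbered, so `group(1)` would work on the first pattern as well.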

DSM
  • 5
    "Python-specific extension" perhaps? – jmort253 Apr 08 '12 at 03:33
  • 64
    Wow, you did find some good and relevant piece of historical data, here! My interpretation of Guido's post is that "P" stands for "Python-specific extensions". – Eric O Lebigot Apr 08 '12 at 03:41
  • 1
    Yep, that looks definitive to me. So it's ironic that Perl and PCRE initially copied the syntax, just because Python was the first flavor to support named captures. But they also support the `(?<name>…)` syntax, which seems to be the most popular--even Java supports it now. – Alan Moore Apr 08 '12 at 05:43
  • Python / Proprietary, python does seem likely – Zachary Vance Apr 10 '17 at 02:45
  • 3
    +1 This is one of the best awkward answers which is well defended :). At first, I thought this is too stupid. But in the end, I totally agreed. – Sumudu Mar 01 '18 at 09:43
  • 5
    I love that even the creator of Python uses bizarre arcane syntax when Perl is involved, and the Perl community is totally fine with that. If you tried to add Perl-specific extensions/syntax to Python, there would be blood on the streets. – Keith Ripley Apr 05 '19 at 22:03
20

Pattern! The group names a (sub)pattern for later use in the regex. See the documentation here for details about how such groups are used.

Mike
  • 3
    +1: This is a good mnemonic device: `(?P<name>…)` is "pattern `name`". Everything is a pattern, though, in a regexp, so it is kind of strange to only label `(?P<…>…)` groups as patterns. This will do, though, for my students. :) – Eric O Lebigot Apr 08 '12 at 03:43
  • 2
    @EOL don't teach students false things. They are harder to shed than you think when you reach for exactness. E.g. some, for me, take multiples of `5` years. Paradoxically, it's encouraged to speak casually, just always be very clear and explicit about it - e.g. tell your previous comment in full length to your students (revising perhaps the very last sentence;).) – n611x007 Sep 10 '13 at 07:04
14

Python Extension. From the Python Docs:

The solution chosen by the Perl developers was to use (?...) as the extension syntax. ? immediately after a parenthesis was a syntax error because the ? would have nothing to repeat, so this didn’t introduce any compatibility problems. The characters immediately after the ? indicate what extension is being used, so (?=foo) is one thing (a positive lookahead assertion) and (?:foo) is something else (a non-capturing group containing the subexpression foo).

Python supports several of Perl’s extensions and adds an extension syntax to Perl’s extension syntax. If the first character after the question mark is a P, you know that it’s an extension that’s specific to Python.

https://docs.python.org/3/howto/regex.html
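
The extension forms named in that passage can be compared side by side. This is only a sketch: the sample strings and the group name `first` are invented for illustration.

```python
import re

# (?=foo) is a positive lookahead: it checks that "foo" follows
# without consuming it, so only "bar" is part of the match.
lookahead = re.search(r'bar(?=foo)', 'barfoo')
print(lookahead.group())        # bar

# (?:foo) groups without capturing, so only (bar) gets a group number.
non_capturing = re.search(r'(?:foo)+(bar)', 'foofoobar')
print(non_capturing.groups())   # ('bar',)

# (?P<first>...) is the Python-specific named group.
named = re.search(r'(?P<first>foo)', 'foobar')
print(named.groupdict())        # {'first': 'foo'}
```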

zvi
SomeGuy