I have the following regex pattern:

line_re = re.compile(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+(\S+):\s+(?P<name>.*)')

I am trying to understand what the ?P<name> means. The expression works the same even when I remove it, i.e.:

line_re = re.compile(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+(\S+):\s+(.*)')

I know that I can reference the matched patterns with match.group(3). What is the ?P<name> for?

Martin Vegter
  • In regular expressions it's a `named group`. You seriously couldn't google this? – l'L'l Aug 22 '14 at 11:08
  • @jonrsharpe: wrong dupe; that question asked about what the `P` in the various syntax constructs stands for. – Martijn Pieters Aug 22 '14 at 11:35
  • @l'L'l - have you ever tried googling for `python re (?P – Martin Vegter Aug 22 '14 at 12:01
  • @MartinVegter, You should search for `Python (?P)` instead of something that doesn't make sense truncated - [about 62,000 results](https://www.google.com/search?q=%22Python+%28%3FP%3Cname%3E%29%22#q=Python+%22%28%3FP%3Cname%3E%29%22). Even truncated, [the top two results](https://www.google.com/search?q=%22python+re+%28%3FP%3C+%22) still lead to the information you should've been able to find. – l'L'l Aug 22 '14 at 12:21

1 Answer

From the re module documentation:

(?P<name>...)
Similar to regular parentheses, but the substring matched by the group is accessible via the symbolic group name `name`. Group names must be valid Python identifiers, and each group name must be defined only once within a regular expression. A symbolic group is also a numbered group, just as if the group were not named.

So it is essentially the same as what you changed your pattern to, except that with your changed pattern you can no longer access that group by name, only by its number.
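As a rough illustration of that difference, here is a minimal sketch that runs both patterns from the question against the same input. The sample log line is invented for the example and is not from the question.

```python
import re

# The two patterns from the question: with and without the named group.
named = re.compile(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+(\S+):\s+(?P<name>.*)')
unnamed = re.compile(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+(\S+):\s+(.*)')

# Hypothetical log line, made up for illustration.
sample = '2014-08-22 11:08:00 myhost: something happened'

m_named = named.match(sample)
m_unnamed = unnamed.match(sample)

# Both patterns capture the same text in group 3.
same_by_number = m_named.group(3) == m_unnamed.group(3)

# Only the named pattern supports lookup by name.
by_name = m_named.group('name')

try:
    m_unnamed.group('name')
    unnamed_has_name = True
except IndexError:
    # group() raises IndexError for a group name the pattern does not define.
    unnamed_has_name = False
```

Both patterns match identically; the name only adds a second way to refer to the third group.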

To understand the difference I recommend you read up on Non-capturing And Named Groups in the Regular Expression HOWTO.

You can access named groups by passing the name to the MatchObject.group() method, or get a dictionary containing all named groups with MatchObject.groupdict(); this dictionary would not include positional groups.
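For instance, a short sketch of both access styles, again using an invented log line as input (the sample text and variable names are assumptions for illustration):

```python
import re

line_re = re.compile(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+(\S+):\s+(?P<name>.*)')

# Hypothetical log line, made up for illustration.
match = line_re.match('2014-08-22 11:08:00 myhost: something happened')

# Access the named group by name or by its position.
message_by_name = match.group('name')
message_by_number = match.group(3)

# groupdict() holds only the named groups.
named_only = match.groupdict()

# groups() holds every capturing group, named or not, in order.
all_groups = match.groups()
```

Here `named_only` contains just the `name` key, while `all_groups` is a tuple of all three captured substrings.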

Martijn Pieters