Python's documentation on regular expression syntax says this about the | (or) operator: "...once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy."

I have tried this in my console (running Python 3.7.6):

import re
txt = 'tim is walking and tom is running'
pattern = 'tim|tom'
re.findall(pattern, txt)

and I get:

['tim', 'tom']

Why is the right side of | still evaluated in this case?

myroslav
  • It's finding all instances that match the pattern `tim|tom`. Try `re.match(pattern, txt)`, which captures just `tim` – C.Nivs Apr 17 '20 at 14:41
  • Because `re.findall` finds all non-overlapping occurrences of a pattern in a string. `re.search` finds the first one. – Wiktor Stribiżew Apr 17 '20 at 14:42
  • It comes into play when looking at any individual match. Try `walk|walking`. You'll always get `walk` back, never `walking`, even though `walking` is longer. – John Kugelman Apr 17 '20 at 14:43
  • @JohnKugelman Please re-close the question. At least with [Reference - What does this regex mean?](https://stackoverflow.com/questions/22937618/). This question is a result of misunderstanding regex and `re.findall`. [How can I find all matches to a regular expression in Python?](https://stackoverflow.com/questions/4697882/how-can-i-find-all-matches-to-a-regular-expression-in-python) is a **valid** close reason for this question – Wiktor Stribiżew Apr 17 '20 at 14:44
  • So, use [Reference - What does this regex mean?](https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean). This question is a result of misunderstanding regex and `re.findall`. [How can I find all matches to a regular expression in Python?](https://stackoverflow.com/questions/4697882/how-can-i-find-all-matches-to-a-regular-expression-in-python) is a **valid** close reason for this question – Wiktor Stribiżew Apr 17 '20 at 14:45
  • Do any of those discuss the non-greediness of `|`? – John Kugelman Apr 17 '20 at 14:47
  • `|` has nothing to do with greediness, it is not a quantifier. The `|` operator is also thoroughly discussed in [Order of regular expression operator (..|.. … ..|..)](https://stackoverflow.com/questions/35606426/order-of-regular-expression-operator). This question is not asking about this `|` operator peculiarity. – Wiktor Stribiżew Apr 17 '20 at 14:47
  • It *could* be greedy but it's not. Greediness *is* a possible adjective. Read the first paragraph of this question--the (non-) greediness of `|` is exactly what the question is about. – John Kugelman Apr 17 '20 at 14:49
  • Greedy or non-greedy terms only relate to quantifiers, not alternation operator. **The question is about why two matches are returned.** - The answer: ***Because `re.findall` returns all of them.*** – Wiktor Stribiżew Apr 17 '20 at 14:50
  • Wiktor, the question is about both of those things. It's about the `|` operator *and* it's about `findall()`. The OP thinks they're related. It turns out they're orthogonal features. A good answer should explain both the quoted excerpt in the first paragraph and what `findall()` does, and why they're different. I've been trying to think about how to word such an answer; it's tricky and I haven't figured it out yet. – John Kugelman Apr 17 '20 at 14:56
  • John, so add [BOTH](https://stackoverflow.com/questions/4697882/how-can-i-find-all-matches-to-a-regular-expression-in-python) the [links](https://stackoverflow.com/questions/35606426/order-of-regular-expression-operator) as close reasons, just what I tried to do but failed since you rushed to reopen the post. – Wiktor Stribiżew Apr 17 '20 at 15:02
  • @JohnKugelman What about [python re.findall() with substring in alternations](https://stackoverflow.com/questions/20647431/python-re-findall-with-substring-in-alternations)? – Wiktor Stribiżew Apr 17 '20 at 15:19
  • Thank you for these comments, gentlemen. I would encourage keeping this question open. I know that reading more material online could have answered my question, but at times it helps to have a human take a look at your code and explain why it is not working. Thanks again! – myroslav Apr 17 '20 at 15:33
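
The two points raised in the comments can be checked side by side. The following is a minimal sketch, assuming only the standard `re` module and the `txt` string from the question; the variable names (`all_matches`, `first_match`, `short_first`, `long_first`) are illustrative and not from the original post. It suggests that `re.findall` restarts the search after each match, so either alternative can win at a different position, while the documentation quote describes what happens within a single match attempt at one position.

```python
import re

txt = 'tim is walking and tom is running'

# findall keeps scanning after each match, so 'tim' matches near the start
# and 'tom' matches later in the string. Each is a separate match.
all_matches = re.findall(r'tim|tom', txt)

# search stops at the first match it finds.
first_match = re.search(r'tim|tom', txt).group()

# Within one match attempt, alternatives are tried left to right and the
# first one that succeeds wins, even if a later one would be longer.
short_first = re.findall(r'walk|walking', txt)

# Reordering the alternatives changes which one wins at that position.
long_first = re.findall(r'walking|walk', txt)

print(all_matches)   # ['tim', 'tom']
print(first_match)   # tim
print(short_first)   # ['walk']
print(long_first)    # ['walking']
```

If the longer alternative should be preferred, listing it first (as in `walking|walk`) is one common way to get that behavior.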

0 Answers