-1

I apologize if this is a repeated question, but I tried for a while and couldn't figure out what search to use.

Previously, I had a regular expression:

example_re = re.compile(r'[\(\[].*?[\)\]]')

which was supposed to capture (text that looked like this) and [this], but would also incorrectly capture text that [looked like this).

I fixed it:

example_re = re.compile(r'(\(.*?\))|(\[.*?\])')

but now, when I call example_re.findall(text), tests are breaking because where the first expression returns a list of strings, the second returns a list of tuples, whenever there is a nested expression [like (this)].

How do I fix this so that findall returns only the outermost match, as a plain string?

Edit: Whoever marked this question as a duplicate really isn't helping anybody. The title of the question that this is supposedly a duplicate of is 'Python re.findall behaves weird'. How was I (or anyone else) supposed to find that? The mere fact that I phrased the question differently makes it a non-duplicate.

whenitrains
  • 1
    Use `re.finditer`. `[x.group() for x in re.finditer(r'\(.*?\)|\[.*?]', s)]` – Wiktor Stribiżew Jun 01 '18 at 21:24
  • So, it is a dupe of https://stackoverflow.com/questions/4664850/find-all-occurrences-of-a-substring-in-python, @Blckknght please re-close. Also a dupe of https://stackoverflow.com/questions/31915018. – Wiktor Stribiżew Jun 01 '18 at 21:35
  • 1
    I don't think this is a dupe. The linked question is about how to use `findall` in the first place, not about this question's issue about too many capturing groups. – Blckknght Jun 01 '18 at 21:36
  • There are tons of such questions, and there is always one answer: either `re.finditer`, or `re.findall` with a non-capturing group. – Wiktor Stribiżew Jun 01 '18 at 21:37
  • The title of the question that this is supposedly a duplicate of is 'Python re.findall behaves weird'. How was I (or anyone else) supposed to find that? Just by virtue of the fact that I phrased the question differently makes it a non-duplicate. – whenitrains Jun 07 '18 at 19:29
  • But thanks, your answer did help – whenitrains Jun 07 '18 at 19:34

1 Answer

-1

You need to change your pattern so that it doesn't have two capturing groups. When a pattern has two groups, findall returns a 2-tuple of strings for every match, with an empty string in the slot of whichever group didn't take part in that match.
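To see the behaviour concretely, here is a minimal sketch. The sample strings are made up for illustration; the pattern is the one from the question. Note that the tuples show up for every match, not only when brackets are nested:

```python
import re

# The question's second pattern: two alternatives, each in its own capturing group.
two_groups = re.compile(r'(\(.*?\))|(\[.*?\])')

# Hypothetical sample text, not from the original tests.
nested = "a (b) c [d (e)] f"
simple = "(a)"

# With more than one group, findall returns one tuple per match,
# one slot per group; a group that did not participate yields ''.
nested_result = two_groups.findall(nested)
simple_result = two_groups.findall(simple)

print(nested_result)  # [('(b)', ''), ('', '[d (e)]')]
print(simple_result)  # [('(a)', '')]
```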

An easy fix is to make your groups non-capturing, with (?: ) instead of plain ( ):

example_re = re.compile(r'(?:\(.*?\))|(?:\[.*?\])')
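As a rough check, the sketch below runs the non-capturing pattern on some made-up sample text, and also shows the `re.finditer` approach suggested in the comments, which works even if the capturing groups are kept. The sample strings and variable names are assumptions for illustration only:

```python
import re

# Non-capturing groups: findall goes back to returning whole-match strings.
example_re = re.compile(r'(?:\(.*?\))|(?:\[.*?\])')

# Hypothetical sample text.
text = "a (b) c [d (e)] f"
mismatched = "text that [looked like this)"

strings = example_re.findall(text)
no_match = example_re.findall(mismatched)

# Alternative: keep the capturing groups and take the whole match
# from each match object via finditer and group().
capturing_re = re.compile(r'(\(.*?\))|(\[.*?\])')
via_finditer = [m.group() for m in capturing_re.finditer(text)]

print(strings)       # ['(b)', '[d (e)]']
print(no_match)      # []
print(via_finditer)  # ['(b)', '[d (e)]']
```

Since neither alternative needs a group at all here, the parentheses can also be dropped entirely, as in `r'\(.*?\)|\[.*?\]'`.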
Blckknght
  • Why do you post a wrong answer to a dupe? `:?` matches 1 or 0 `:` chars. OP does not need it. Please re-close. – Wiktor Stribiżew Jun 01 '18 at 21:35
  • The parentheses are important here. `(:? )` is a non-capturing group. It's not the normal `?` in regex. – Blckknght Jun 01 '18 at 21:38
  • You are wrong, non-capturing group syntax is `(?:...)`, [here is the answer](https://stackoverflow.com/a/31915134/3832970). – Wiktor Stribiżew Jun 01 '18 at 21:39
  • Ah, hell, you're right about the typo. And if the dupe had been to that answer, not the more general one, I probably would have left it closed. – Blckknght Jun 01 '18 at 21:41
  • Please re-close this 1 billionth dupe. https://stackoverflow.com/questions/31915018 explains it all. – Wiktor Stribiżew Jun 01 '18 at 21:42
  • Again, I reopened because this was marked as a dupe of a question that it was not a duplicate of (https://stackoverflow.com/questions/4664850/find-all-occurrences-of-a-substring-in-python). No need to jump down my throat over it, after you pulled the dupe trigger too fast. I'll close it as a dupe of https://stackoverflow.com/questions/31915018 now, since that one is actually applicable. – Blckknght Jun 01 '18 at 21:47
  • Heads up, for fairness. VLQs and duplicates that you've answered in the past are appearing, one after the other, in the deletion queue. Search "user:me deleted:1" to see what answers of yours are being deleted along with them. If you notice you're leaking rep, this is the reason. I also presume that this answer is the cause of the targeted deletion. – cs95 Jun 02 '18 at 07:38
  • @coldspeed: Um, OK. I don't object to bad questions being deleted, even if I've answered them. But are you telling me about it so I should do something about it (e.g. flag that I'm being singled out or whatever)? I don't care at all about the rep. – Blckknght Jun 02 '18 at 09:37
  • Letting you know since you might have a sentiment similar to [this](https://meta.stackoverflow.com/questions/362016/targeted-closing-of-questions-that-i-answered). – cs95 Jun 10 '18 at 10:55