How do I write a regular expression that finds all text between a and (b or c) in Python?

Question

I want to write a regular expression that returns every occurrence of the text between a and the first occurrence of b or c.

So I tried this code:

import re

text = 'dfgahfjbjicij'
re.findall(r'a(.*?)(b|c)', text)

Output:

[('hfj', 'b')]

Expected:

['hfj']

How do I make it return only the text in between, as a string rather than a tuple?

score 2 · Answer 1 · answered Aug 24 '19 at 03:33


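Use a non-capturing group, (?:...). re.findall returns tuples when the pattern contains more than one capturing group; with a single capturing group it returns a list of strings: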

>>> import re
>>> text = 'dfgahfjbjicij'
>>> re.findall('a(.*?)(b|c)',text)
[('hfj', 'b')]
>>> re.findall('a(.*?)(?:b|c)',text)
['hfj']
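
As a rough sketch of why this works, the snippet below runs the same input through three patterns and compares what re.findall returns for each. The variable names are only illustrative, and the third pattern (a character class instead of a group) is an extra variation that assumes both delimiters are single characters.

```python
import re

text = 'dfgahfjbjicij'

# Two capturing groups: findall returns a list of tuples, one entry per group.
with_capture = re.findall(r'a(.*?)(b|c)', text)

# One capturing group plus a non-capturing (?:...) group: a list of strings.
non_capturing = re.findall(r'a(.*?)(?:b|c)', text)

# Illustrative variation: a character class needs no group at all when the
# delimiters are single characters.
char_class = re.findall(r'a(.*?)[bc]', text)

print(with_capture)   # [('hfj', 'b')]
print(non_capturing)  # ['hfj']
print(char_class)     # ['hfj']
```

If the delimiters were longer than one character, the character class would no longer apply and the (?:...) form would be the one to keep.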


Tordek


Emma · Answer 2 · 2019-08-24T03:38:59.380

This expression might work, with lookarounds:

(?<=a)(.*?)(?=b|c)

or:

(?<=a)(.*?)(?=[bc])

or:

a([^bc]*)[bc]


Test

import re


expression = r"(?<=a)(.*?)(?=b|c)"

string = """

dfgahfjbjicij
dfgahfjjicij

"""

print(re.findall(expression, string))

Output

['hfj', 'hfjji']
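
For comparison, here is a small sketch that runs all three expressions above on the same two lines. They agree on this input, but they are not fully interchangeable: `.` does not match a newline by default, while the negated class `[^bc]` does. The `multiline` string is a made-up input chosen only to show that difference.

```python
import re

sample = "dfgahfjbjicij\ndfgahfjjicij"

patterns = [
    r"(?<=a)(.*?)(?=b|c)",   # lookarounds, alternation
    r"(?<=a)(.*?)(?=[bc])",  # lookarounds, character class
    r"a([^bc]*)[bc]",        # negated character class, no lookarounds
]

# All three patterns give the same captures on this sample.
results = [re.findall(p, sample) for p in patterns]
for pattern, found in zip(patterns, results):
    print(pattern, found)

# Hypothetical input where the closing delimiter is on the next line.
multiline = "ax\nyb"
dot_version = re.findall(r"a(.*?)[bc]", multiline)       # '.' stops at '\n'
class_version = re.findall(r"a([^bc]*)[bc]", multiline)  # '[^bc]' matches '\n'
print(dot_version)
print(class_version)
```

So if matches must stay within one line, the `.`-based patterns are the safer choice; if they may span lines, the negated class (or the re.DOTALL flag) is an option.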

If you wish to explore, simplify, or modify the expression, regex101.com explains it in its top-right panel. If you'd like, you can also watch there how it matches against some sample inputs.

RegEx Circuit

jex.im visualizes regular expressions.
