Why is my regular expression returning multiple matches?

Question

I was trying to match text with a regular expression.

The following code works a bit strangely. It returns the result twice.

import re

regex = r"((\w+\s*){1,3} from)"
test_str = "text text this is Alex Smith from text text"
re.findall(regex, test_str)

Can someone point out what's wrong here?

If you are curious, my final goal is to match any two- or three-word names, like 'Alex Smith' or 'Mr. Alex Smith', between (or on one side of) specific text. For instance:

1. this is Alex Smith from Japan (2/3 words after 'this is' or before 'from')
2. this is Mr. Alex Smith from japan (optional 'Mr.')
3. Mr. Alex Smith from Tokyo (2/3 words before 'from')
4. this is Alex Smith text text

So basically it should trigger on 'this is' or 'from'. Any suggestions?

text text Alex Smith from Japan
I am Alex Smith text text
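One possible way to approach the goal above, offered only as a rough sketch: it assumes every name word starts with a capital letter followed by lowercase letters, that 'Mr.' is the only optional title, and that the triggers are the literal strings 'this is' and 'from'. `NAME`, `PATTERN`, and `find_names` are made-up names for illustration.

```python
import re

# Assumption: a name is an optional "Mr." followed by two or three
# capitalized words. Real names (O'Neil, van Gogh, ...) would need more.
NAME = r"(?:Mr\.\s+)?[A-Z][a-z]+(?:\s+[A-Z][a-z]+){1,2}"

# Alternative 1: a name right after "this is".
# Alternative 2: a name right before "from" (checked with a lookahead).
PATTERN = re.compile(rf"\bthis is\s+({NAME})|({NAME})(?=\s+from\b)")

def find_names(text):
    """Return the names found after 'this is' or before 'from'."""
    # Only one of the two groups is set per match, so take whichever is.
    return [m.group(1) or m.group(2) for m in PATTERN.finditer(text)]

print(find_names("this is Alex Smith from Japan"))      # ['Alex Smith']
print(find_names("this is Mr. Alex Smith from japan"))  # ['Mr. Alex Smith']
print(find_names("Mr. Alex Smith from Tokyo"))          # ['Mr. Alex Smith']
print(find_names("this is Alex Smith text text"))       # ['Alex Smith']
```

`finditer` is used instead of `findall` here so that each match can be reduced to the one group that actually matched, rather than getting a tuple with an empty string in it.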

Capturing groups make `re.findall` return a list of tuples in this case. Convert to non-capturing those parts you do not need to return. — Wiktor Stribiżew, Apr 17 '18 at 10:37
Thank you for your answer. What is confusing me is that it's returning <>. Why is it matching `Smith` separately again? — Droid-Bird, Apr 17 '18 at 10:51
`Smith` is the value of Group 2, that is why it is output. See [Repeating a Capturing Group vs. Capturing a Repeated Group](https://www.regular-expressions.info/captureall.html), and [Capturing repeating subpatterns in Python regex](https://stackoverflow.com/questions/9764930/capturing-repeating-subpatterns-in-python-regex). — Wiktor Stribiżew, Apr 17 '18 at 10:52
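To make the comments above concrete, here is a small sketch using the regex and test string from the question. The variable names (`with_groups`, `whole`, `words_only`) are only illustrative. It shows that `re.findall` returns one tuple per match when the pattern has more than one capturing group, that a repeated capturing group keeps only its last iteration, and how non-capturing groups `(?:...)` change the result.

```python
import re

test_str = "text text this is Alex Smith from text text"

# Two capturing groups: findall returns (group 1, group 2) for each match.
# Group 2 is repeated up to three times, so it holds only its last iteration.
with_groups = re.findall(r"((\w+\s*){1,3} from)", test_str)
print(with_groups)  # [('is Alex Smith from', 'Smith')]

# No capturing groups at all: findall returns the whole match as a string.
whole = re.findall(r"(?:\w+\s*){1,3} from", test_str)
print(whole)  # ['is Alex Smith from']

# One capturing group around the words only: findall returns just that group.
words_only = re.findall(r"((?:\w+\s*){1,3}) from", test_str)
print(words_only)  # ['is Alex Smith']
```

So there is still only one match in the original code; the second item in the tuple is the content of the inner group, not a second match.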

