Given a string, I want to find all substrings that consist of two or three consecutive repetitions of '4,'.

For example, given '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,', I want to get ['4,4,', '4,4,', '4,4,4,'].

Here is what I tried:

import re

str_ = '1,4,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,3,2,'
m = re.findall(r"(4,){2,3}", str_)

What I get is: ['4,', '4,', '4,', '4,']

What's wrong?

It seems to me that the parentheses around '4,' are interpreted as a capturing group, rather than simply telling Python that '4' and ',' should be repeated together as a unit. However, I don't know how to express that.

1 Answer

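Use a non-capturing group, (?:...). When the pattern contains a capturing group, re.findall returns what that group captured (here only its last repetition, '4,') rather than the whole match. With a non-capturing group it returns the whole match: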

import re

s = '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'

print(re.findall(r'(?:4,?){2,3}', s))

Prints:

['4,4,', '4,4,', '4,4,4,']

EDIT:

Edited the regex to match two or three "4," elements.
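
To make the difference concrete, here is a minimal sketch comparing the capturing and non-capturing forms on the same input. It assumes the comma after each 4 is required, so it uses the stricter pattern (?:4,){2,3} instead of (?:4,?){2,3}, which would also accept something like '44'. The variable names are only illustrative. If the capturing group has to stay, re.finditer with group(0) is another way to get the whole match.

```python
import re

s = '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'

# Capturing group: findall returns the group's text, i.e. only the
# last repetition of '4,' in each match.
captured = re.findall(r'(4,){2,3}', s)

# Non-capturing group: findall returns the whole match.
# Assumption: the comma is mandatory, so no '?' after it.
whole = re.findall(r'(?:4,){2,3}', s)

# Alternative: keep the capturing group and read group(0),
# which is always the entire match.
via_finditer = [m.group(0) for m in re.finditer(r'(4,){2,3}', s)]

print(captured)      # ['4,', '4,', '4,']
print(whole)         # ['4,4,', '4,4,', '4,4,4,']
print(via_finditer)  # ['4,4,', '4,4,', '4,4,4,']
```

Note that none of these patterns check what precedes the first 4, so an input containing '14,4,' would also yield '4,4,'.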

Andrej Kesely