-1

I found this:

>>> import re
>>> s = '111221'
>>> re.findall(r'((.)\2*)', s)
[('111', '1'), ('22', '2'), ('1', '1')]

I can't follow \2*. How does this regex work? The first group gives me each run of the character captured by the second group in s. It's amazing!

\2 refers to the second group, but what is the second group here?

Note: this is to get the number of times a character repeats consecutively in a string.

code muncher

2 Answers

2

\2 is a backreference to what was captured in capture group 2.
For example, if group 2 captured b, \2* can only match zero or more further b's.
So (.)\2* behaves like bb* (that is, b+), where b stands for whatever single character, other than a newline, group 2 matched.

 (                 # (1 start)
      ( . )        # (2), Any character
      \2*          # Backreference to capture group 2, 0 to many times
 )                 # (1 end)
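
To see the breakdown above in action, here is a minimal sketch that compiles the same pattern with re.VERBOSE, so the comments can live inside the regex. The name RUN is just an illustrative choice, not something from the question.

```python
import re

# Same pattern as in the question, spelled out with re.VERBOSE
# (whitespace and # comments inside the pattern are ignored).
RUN = re.compile(r"""
    (           # group 1: the whole run of identical characters
      (.)       # group 2: any single character except newline
      \2*       # zero or more repeats of whatever group 2 captured
    )
""", re.VERBOSE)

s = "111221"
pairs = RUN.findall(s)   # one (group 1, group 2) tuple per run
print(pairs)             # [('111', '1'), ('22', '2'), ('1', '1')]
```

Because the pattern has two capturing groups, findall returns a tuple per match, which is why each run shows up next to its character.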
2

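In your example, capture group 1 (\1) is ((.)\2*) and capture group 2 (\2) is (.). Groups are numbered by the position of their opening parenthesis, counting from the left.

If you don't need the first capture group, you could use a non-capturing group instead: (?:(.)\1*). The inner group then becomes group 1, so the backreference changes from \2 to \1. Note that re.findall would then return only the captured character of each run, not the run itself.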
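
As a rough sketch of how this plays out for the counting use case in the question: with a single capturing group, findall only hands back that group, so one option (an assumption about what you want, not the only way) is to use re.finditer and read the whole run from group(0). The variable names chars and counts are illustrative.

```python
import re

s = "111221"

# Only one capturing group, so findall returns just the character of each run.
chars = re.findall(r"(?:(.)\1*)", s)
print(chars)    # ['1', '2', '1']

# finditer exposes the whole match as group(0), so the run length is available
# without an outer capturing group.
counts = [(m.group(1), len(m.group(0))) for m in re.finditer(r"(.)\1*", s)]
print(counts)   # [('1', 3), ('2', 2), ('1', 1)]
```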

jil