
Can someone explain why the code below returns an empty list:

>>> import re
>>> m = re.findall("(SS){e<=1}", "PSSZ")
>>> m
[]

I am trying to count the occurrences of SS within PSSZ, allowing up to one mismatch.

I saw a similar example of code here: Search for string allowing for one mismatch in any location of the string

warship

1 Answer


You need to remove the `e<=` characters from inside the braces; the `re` module does not support that syntax in a range quantifier. A range quantifier must take one of these forms:

  • {n} Repeats the previous token exactly n times.
  • {min,max} Repeats the previous token from min to max times.

It would be,

m = re.findall("(SS){1}", "PSSZ")

or

m = re.findall(r'SS','PSSZ')
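
As a quick check of why the original pattern finds nothing, here is a minimal sketch assuming the standard-library `re` module (the variable names are only illustrative). Because `{e<=1}` is not a valid quantifier there, `re` falls back to treating the braces and their contents as literal characters:

```python
import re

# "{e<=1}" is not a valid range quantifier in the stdlib re module,
# so the whole brace expression is matched as literal text.
pattern = "(SS){e<=1}"

# "PSSZ" does not contain the literal text "SS{e<=1}", so nothing is found.
no_match = re.findall(pattern, "PSSZ")

# A string that does contain that literal text matches; findall returns
# the contents of the single capture group.
literal_match = re.findall(pattern, "SS{e<=1}")

print(no_match)
print(literal_match)
```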

Update:

>>> re.findall(r'(?=(S.|.S))', 'PSSZ')
['PS', 'SS', 'SZ']
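
For longer search strings such as `SSZ`, the same idea can be generated rather than typed by hand. The following is only a sketch under that assumption; the helper names `one_mismatch_pattern` and `find_with_one_mismatch` are hypothetical, not part of `re`. It builds one alternative per position, with that position replaced by `.`, and wraps them in a lookahead so overlapping matches are reported:

```python
import re

def one_mismatch_pattern(word):
    # One alternative per position: that character becomes ".",
    # every other character must match exactly.
    alternatives = [
        re.escape(word[:i]) + "." + re.escape(word[i + 1:])
        for i in range(len(word))
    ]
    # The lookahead consumes no characters, so overlapping hits are all found.
    return "(?=(" + "|".join(alternatives) + "))"

def find_with_one_mismatch(word, text):
    # Returns every substring of len(word) that differs from word
    # in at most one position.
    return re.findall(one_mismatch_pattern(word), text)

print(find_with_one_mismatch("SS", "PSSZ"))
print(find_with_one_mismatch("SSZ", "PSSZSAZ"))
```

Note that `.` does not match a newline by default, so a mismatch at a line break would not be counted.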
Avinash Raj
  • This only gives `SS` (one instance), not `PS` or `SZ`, where there is one mismatch. – warship Jul 12 '15 at 04:32
  • @warship .. didn't you ask for the number of occurrence of `SS` ? – Iron Fist Jul 12 '15 at 04:34
  • It should also return `SS`. Hopefully, with something simple like the `{1}` notation, this would be much more elegant. – warship Jul 12 '15 at 04:38
  • Thank you, very interesting. I assume this is the most concise way to do it. How would you go about working with triple character strings such as `SSZ`, would the regex notation change? Perhaps you can post an update. – warship Jul 12 '15 at 04:41