I am looking to build a regular expression that matches only standalone occurrences of the text passed in. I tried using \b, which works as a word boundary, but it doesn't help when the word is followed by a symbol such as !.

>>> import re
>>> list(re.finditer(r'\bhe\b','he is hey!'))
[<re.Match object; span=(0, 2), match='he'>]
>>> list(re.finditer(r'\bhe\b','he is he!'))
[<re.Match object; span=(0, 2), match='he'>, <re.Match object; span=(6, 8), match='he'>]

I don't want my regular expression to match the he in 'he!'.

MarianD
  • Possible duplicate of [Reference - What does this regex mean?](https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) – Paolo Sep 08 '18 at 11:57
  • Actually, it doesn’t match “he!” but “he”, which is what you want. – Laurent LAPORTE Sep 08 '18 at 11:58
  • If you're looking for a way to check if the regex matches the whole string, give the `\A`, `\Z`, `\z`, `^` and `$` anchors a look. https://www.regular-expressions.info/refanchors.html – 3limin4t0r Sep 08 '18 at 12:35

1 Answer

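You can match a word boundary \b followed by he, then use a negative lookahead (?!\S) to assert that the next character is not a non-whitespace character \S. The match therefore succeeds only when he is followed by whitespace or by the end of the string: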

\bhe(?!\S)
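
To check the pattern against the strings from the question, here is a minimal sketch in Python. The helper name `spans` is hypothetical and only there to keep the output short. The third input is an extra case, not from the question, showing that a match at the end of the string is still accepted.

```python
import re

# Pattern from the answer: a word boundary, the literal "he", then a negative
# lookahead asserting that the next character is not a non-whitespace one.
pattern = re.compile(r'\bhe(?!\S)')


def spans(text):
    """Return the (start, end) span of every match in text (hypothetical helper)."""
    return [m.span() for m in pattern.finditer(text)]


print(spans('he is hey!'))  # [(0, 2)]
print(spans('he is he!'))   # [(0, 2)]          "he!" is no longer matched
print(spans('he is he'))    # [(0, 2), (6, 8)]  end of string is allowed
```

The lookahead consumes no characters, so the reported span covers only he itself. If the word must also be preceded by whitespace or the start of the string (rather than just a word boundary), one possible variation is to replace \b with the negative lookbehind (?<!\S).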


The fourth bird