-4

Example: search for the pattern `man`, but only at the beginning of a word (i.e. not directly preceded by a letter).

This pattern should be found in the strings `man`, `spider-man`, `manpower` and `iron_man`, but not in `woman` or `human`.

ruohola
LoLa
  • `\b` is used to match the beginning or end of a word. However, it considers underscore to be part of a word. – Barmar Apr 09 '20 at 22:32
  • If lookbehinds are supported: `(?<=[\W_]|^)man` otherwise: `(?:[\W_]|^)man` – revo Apr 09 '20 at 22:51
  • Do you wish to match `"man"` in the strings `"_man"` and `"-man"`, or must the underscore or hyphen be preceded by a word? – Cary Swoveland Apr 10 '20 at 00:13
  • anubhava and I have both asked questions that you have seen. Why haven't you answered them? – Cary Swoveland Apr 10 '20 at 02:31
  • Apologies for the delay. To be honest I didn't think of that case but it should not come up anyway. The solution below that does not match `-man` will work for my use case. – LoLa Apr 10 '20 at 21:10

3 Answers

1

You can use a positive lookbehind to achieve this:

(?<=^|[a-z][_-]|\s)man
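
For anyone trying this in Python: the standard `re` module only accepts fixed-width lookbehinds, so the pattern above is rejected there as written. A minimal sketch of an equivalent, assuming Python and splitting the alternatives into separate assertions (the function name `has_word_initial_man` is made up for illustration):

```python
import re

# Each lookbehind has a fixed width, which Python's re requires.
# ^ is already zero-width, so it can sit outside a lookbehind.
PATTERN = re.compile(r"(?:^|(?<=[a-z][_-])|(?<=\s))man")

def has_word_initial_man(text: str) -> bool:
    """Return True if 'man' occurs at the start of a word in text."""
    return PATTERN.search(text) is not None

for s in ["man", "spider-man", "manpower", "iron_man", "woman", "human"]:
    print(s, has_word_initial_man(s))
# The first four print True, the last two print False.
```

Because the lookbehinds consume nothing, the match itself is still just `man`.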


ruohola
1

I have assumed that if the word "man" is preceded by a hyphen or underscore, to achieve a match the hyphen or underscore must be preceded by a letter (e.g., "-man" would not be matched).

The \K escape sequence resets the beginning of the reported match to the current position in the string, so everything matched before it is left out of the result. If the regex engine supports it, the following regular expression (with the case-insensitive flag set) could be used.

(?:^| |[a-z][-_])\Kman


The selected answer to this SO question provides a list of regex engines that support \K. That list was last updated in August 2019.

The regex engine performs the following operations.

(?:       # begin non-capture group
  ^       # match beginning of line
  |       # or
          # match a space
  |       # or
  [a-z]   # match a letter
  [-_]    # match '-' or '_'
)         # end non-capture group
\K        # discard everything matched so far
man       # match 'man'
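
The annotated layout above can be kept in real code with a free-spacing flag. A minimal sketch, assuming Python, whose standard `re` module has no `\K`: a capture group around `man` stands in for it here (that substitution is mine, not part of the pattern above), and the names `MAN_AT_WORD_START` and `man_start` are made up for illustration.

```python
import re

MAN_AT_WORD_START = re.compile(
    r"""
    (?:         # begin non-capture group
      ^         # match beginning of the string
      |         # or
      [ ]       # match a space (bracketed, since verbose mode drops bare spaces)
      |         # or
      [a-z]     # match a letter
      [-_]      # match '-' or '_'
    )           # end non-capture group
    (man)       # capture 'man'; its start offset plays the role of the reset
    """,
    re.VERBOSE | re.IGNORECASE,
)

def man_start(text):
    """Return the offset where 'man' starts, or None if there is no match."""
    m = MAN_AT_WORD_START.search(text)
    return m.start(1) if m else None

print(man_start("spider-man"))  # 7
print(man_start("woman"))       # None
```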

Alternatively, a capture group could be used.

(?:^| |[a-z][-_])(man)
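
A minimal sketch of reading the capture group, assuming Python (`man_spans` is a made-up helper name). The overall match still includes the preceding space or letter plus separator, so the offsets are taken from group 1 rather than from the whole match:

```python
import re

PATTERN = re.compile(r"(?:^| |[a-z][-_])(man)", re.IGNORECASE)

def man_spans(text):
    """Return the (start, end) offsets of each captured 'man'."""
    return [m.span(1) for m in PATTERN.finditer(text)]

print(man_spans("man woman spider-man human iron_man manpower"))
# [(0, 3), (17, 20), (32, 35), (36, 39)]
```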


Cary Swoveland
-1

Prefix the pattern with either a word boundary \b or a lookbehind for an underscore:

((?<=_)|\b)man

See live demo.
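
A quick check of this pattern, as a sketch assuming Python (the helper name `matches` is made up). It also shows why `\b` alone is not enough, and that this version, unlike the other two answers, accepts a bare leading hyphen:

```python
import re

PATTERN = re.compile(r"((?<=_)|\b)man")

def matches(text: str) -> bool:
    """Return True if the pattern is found anywhere in text."""
    return PATTERN.search(text) is not None

for s in ["man", "spider-man", "manpower", "iron_man", "woman", "human"]:
    print(s, matches(s))
# The first four print True, the last two print False.

# \b alone misses iron_man, because '_' counts as a word character.
print(re.search(r"\bman", "iron_man"))  # None

# A leading hyphen is a word boundary, so "-man" matches here.
print(matches("-man"))  # True
```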

Bohemian