I'm looking to try and ignore a word in regex, but the solutions I've seen here did not work correctly for me.

Regular expression to match a line that doesn't contain a word

The issue I'm facing is I have an existing regex:

(?P<MovieCode>[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]{1}\b)?

That is matching on Deku-041114-575-boku.mp4.

However, I want this regex to fail to match for cases where the MovieCode group has Deku in it.

I tried (?P<MovieCode>(?!Deku)[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]{1}\b)? but unfortunately it just matches eku-124 and I need it to fail.

I have a regex101 with my attempts. https://regex101.com/r/xqALM2/2

1 Answer

The MovieCode group can match 3-6 characters from A-Z or a-z, and Deku has 4 characters. If that part should not contain Deku, you could use a negative lookahead preceded by the character class [A-Za-z] repeated 0+ times, (?![A-Za-z]*Deku), as the character class cannot cross the -.

To prevent matching eku-124, you could prepend a word boundary \b before the MovieCode group, or add (?<!\S) instead if there should be a whitespace boundary on the left.

Note that you can omit {1} from the pattern.

\b(?P<MovieCode>(?![A-Za-z]*Deku)[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]\b)?
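
As a rough sketch of the difference, assuming Python's re module (the (?P<name>...) syntax suggests it, but the question does not say), the attempt from the question and the pattern above can be compared on the same filename. The sample name ABC-123A is made up for illustration.

```python
import re

# The attempt from the question: the lookahead is only checked at the position
# where the match starts, so the engine can simply start one character later.
attempt = re.compile(
    r"(?P<MovieCode>(?!Deku)[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]{1}\b)?"
)

# The pattern from this answer: \b pins the start to a word boundary and the
# lookahead scans the whole run of letters for "Deku".
fixed = re.compile(
    r"\b(?P<MovieCode>(?![A-Za-z]*Deku)[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]\b)?"
)

filename = "Deku-041114-575-boku.mp4"

partial = attempt.search(filename)
print(partial.group("MovieCode"))  # eku-04111

print(fixed.search(filename))  # None

# Hypothetical sample name that should still match.
ok = fixed.search("ABC-123A")
print(ok.group("MovieCode"), ok.group("MoviePart"))  # ABC-123 A
```

Note that the lookahead is case-sensitive, so a lowercase deku would still be accepted unless the re.IGNORECASE flag is used.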


The fourth bird
  • Thank you! I've been battling this for a day now. I would have never thought to do it like this. The word boundary at the beginning was great as well. If I need to expand the "ignored" word to be more than one, I could just do: `\b(?P<MovieCode>(?![A-Za-z]*(Deku|Boku))[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]\b)?` right? – bittermelonman Aug 04 '20 at 12:38
  • @bittermelonman That is correct. If you don't need the capture group, you could make it non-capturing: `(?:Deku|Boku)` – The fourth bird Aug 04 '20 at 12:41
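
If the list of ignored words grows, the alternation from the comments could be assembled programmatically. This is only a sketch assuming Python's re module; build_pattern is a hypothetical helper, not something from the original answer.

```python
import re

def build_pattern(ignored_words):
    """Hypothetical helper: compile the answer's pattern for several ignored words."""
    if not ignored_words:
        # An empty alternation would make the lookahead reject every input.
        raise ValueError("ignored_words must not be empty")
    # re.escape guards against words that contain regex metacharacters.
    alternation = "|".join(re.escape(word) for word in ignored_words)
    return re.compile(
        r"\b(?P<MovieCode>(?![A-Za-z]*(?:" + alternation + r"))"
        r"[A-Za-z]{3,6}-\d{3,5})(?P<MoviePart>[A-C]\b)?"
    )

pattern = build_pattern(["Deku", "Boku"])

print(pattern.search("Deku-123"))  # None
print(pattern.search("Boku-123"))  # None
print(pattern.search("ABC-123").group("MovieCode"))  # ABC-123
```

Because the alternation is non-capturing, the only groups in a match are MovieCode and MoviePart.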