149

I know that the following regex will match "red", "green", or "blue".

red|green|blue

Is there a straightforward way of making it match everything except several specified strings?

Alfred
  • 1
    Not all flavours of regular expressions can do this. What environment are you working in? Java? Perl? .NET? Some C/C++ regex library? An RDBMS? – FrustratedWithFormsDesigner Mar 08 '10 at 19:22
  • 8
    You don't say what you want it for, but you could simply invert the sense of the "match" operation. This won't help you if you are trying to do extraction on the non-matching parts, but to test whether an excluded string is not present it would work: `if (!s.match(/red|green|blue/)) ...` Note: I know that the OP doesn't specify what language/framework, so the preceding should be considered a generic example, not a prescriptive one. – tvanfosson Mar 08 '10 at 19:23

7 Answers

176

If you want to make sure that the string is neither red, green nor blue, caskey's answer is it. What is often wanted, however, is to make sure that the line does not contain red, green or blue anywhere in it. For that, anchor the regular expression with ^ and include .* in the negative lookahead:

^(?!.*(red|green|blue))

Also, suppose that you want lines containing the word "engine" but without any of those colors:

^(?!.*(red|green|blue)).*engine

You might think you can factor the .* to the head of the regular expression:

^.*(?!red|green|blue)engine     # Does not work

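but you cannot. There, the lookahead is only checked at the position just before "engine", where the upcoming text is "engine" itself and never a color, so colors earlier in the line go unnoticed. You have to have both instances of .* for it to work.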
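A minimal sketch of the three patterns above in Python, assuming its `re` module; the sample lines are invented for illustration:

```python
import re

# Sample lines are invented for illustration.
lines = [
    "a red engine",
    "a quiet engine",
    "a green bicycle",
    "a plain bicycle",
]

no_color = re.compile(r"^(?!.*(red|green|blue))")
engine_no_color = re.compile(r"^(?!.*(red|green|blue)).*engine")
# Here the lookahead is only checked right before "engine",
# so it never sees a color that appears earlier in the line.
factored = re.compile(r"^.*(?!red|green|blue)engine")

without_color = [s for s in lines if no_color.search(s)]
engines_without_color = [s for s in lines if engine_no_color.search(s)]
wrongly_accepted = [s for s in lines if factored.search(s)]

print(without_color)          # ['a quiet engine', 'a plain bicycle']
print(engines_without_color)  # ['a quiet engine']
print(wrongly_accepted)       # ['a red engine', 'a quiet engine']
```

These patterns look for the substrings, so a line containing "hundred" is rejected as well; wrapping the alternatives in \b word boundaries is one option if whole words are intended.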

Wayne Conrad
58

Depends on the language, but most flavors support negative lookahead assertions, which you can use like so:

(?!red|green|blue)

(Thanks for the syntax fix, the above is valid Java and Perl, YMMV)
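As a rough illustration in Python (the variable names are only for this sketch), the lookahead on its own produces an empty match at any position that is not the start of one of the words, which is why it usually needs an anchor or more pattern around it:

```python
import re

lookahead = re.compile(r"(?!red|green|blue)")

# search() scans forward: position 0 of "red" is rejected, but position 1
# ("ed") is not the start of a color, so an empty match is found there.
m = lookahead.search("red")
print(m.start())  # 1

# match() only tries position 0, which behaves like a leading ^.
print(lookahead.match("red"))      # None
print(lookahead.match("redwood"))  # None (the string starts with "red")
print(lookahead.match("white") is not None)  # True
```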

caskey
  • 2
    @caskey, The full answer is a combination of mine and yours. If you'd like to merge them together, I'll delete mine. – Wayne Conrad Mar 08 '10 at 20:21
  • 21
    This answer would be a lot more useful if you explained it a little. For example: What do "?" and "!" mean? Why do you need capture groups? – Lii Dec 23 '14 at 08:51
  • It's valid Python, too. – Joe Mornin Mar 17 '15 at 22:13
  • 1
    just used this with Delphi regEx library and it only works like this : ^(?!red|green|blue). Also true for testing it on https://regex101.com/ . So is the above a typo missing a ^ or does it actually work like that in Java/Perl/Python .. ? – Peter Jul 31 '19 at 09:47
39

Matching Anything but Given Strings

If you want to match the entire string unless it is exactly one of certain strings, you can do it like this:

^(?!(red|green|blue)$).*$

This says: starting at the beginning of the string, fail if the whole string is exactly red, green, or blue; otherwise match everything through to the end of the string.

You can try it here: https://regex101.com/r/rMbYHz/2

Note that this only works with regex engines that support a negative lookahead.
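A small Python sketch of this whole-string check; the sample strings are made up:

```python
import re

not_a_color = re.compile(r"^(?!(red|green|blue)$).*$")

# Only strings that are exactly one of the three words are rejected.
samples = ["red", "green", "redwood", "dark blue", ""]
accepted = [s for s in samples if not_a_color.match(s)]
print(accepted)  # ['redwood', 'dark blue', '']
```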

Community
Sam
28

You don't need negative lookahead. Here is a working example:

/([\s\S]*?)(red|green|blue|)/g

Description:

  • [\s\S] - matches any character, including newlines
  • * - repeats the previous token zero or more times
  • ? - makes the repetition lazy (match as few characters as possible)
  • (red|green|blue|) - matches one of these words or nothing
  • g - global flag: find every match, not just the first

Example:

whiteredwhiteredgreenbluewhiteredgreenbluewhiteredgreenbluewhiteredgreenbluewhiteredgreenbluewhiteredgreenbluewhiteredgreenbluewhiteredwhiteredwhiteredwhiteredwhiteredwhiteredgreenbluewhiteredwhiteredwhiteredwhiteredwhiteredredgreenredgreenredgreenredgreenredgreenbluewhiteredbluewhiteredbluewhiteredbluewhiteredbluewhiteredwhite

After substitution, the result is:

whitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhitewhite

Test it: regex101.com
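The substitution idea can be sketched in Python as follows, assuming each match is replaced by its first capture group; the input is a shortened version of the example above:

```python
import re

pattern = re.compile(r"([\s\S]*?)(red|green|blue|)")

text = "whiteredwhitegreenbluewhite"
# Keep group 1 (the text before each color) and drop group 2 (the color).
result = pattern.sub(r"\1", text)
print(result)  # whitewhitewhite

# For plain removal, the simpler alternation gives the same result.
simple = re.sub(r"red|green|blue", "", text)
print(simple == result)  # True
```

As the comments point out, this removes the words during a replacement; it does not validate that a string is free of them.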

hlcs
  • 5
    You can drastically reduce the step count by swapping [\s\S] for a dot. I was very confused why seemingly every other example captures each word individually. This way is slightly more regex steps but requires far less post-processing. – Zatronium Aug 20 '16 at 20:39
  • 3
    but this doesn't do matching (text validation), it just removes specified text during substitution. – Marek R Jan 10 '19 at 15:34
  • This solution will not output the final chunk of text after the known words. So, there is no need comparing the speed, it is just wrong. – Wiktor Stribiżew Feb 19 '20 at 21:50
  • @WiktorStribiżew fixed. – hlcs Feb 20 '20 at 03:01
11

I had the same question. The proposed solutions almost worked, but each had some issue. In the end, the regex I used is:

^(?!red|green|blue).*

I tested it in JavaScript and .NET.

.* shouldn't be placed inside the negative lookahead like this: ^(?!.*red|green|blue), because it would make the first alternative behave differently from the rest (i.e. "anotherred" wouldn't be matched while "anothergreen" would).
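A quick Python comparison of the two patterns, with sample strings chosen for illustration:

```python
import re

prefix_check = re.compile(r"^(?!red|green|blue).*")
# The alternatives here are ".*red", "green" and "blue": only "red" gets the
# leading ".*", so it is looked for anywhere while the others are not.
uneven = re.compile(r"^(?!.*red|green|blue)")

samples = ["anotherred", "anothergreen", "redwood", "white"]
results = {
    s: (bool(prefix_check.match(s)), bool(uneven.match(s))) for s in samples
}
for s, (a, b) in results.items():
    print(s, a, b)
# anotherred True False
# anothergreen True True
# redwood False False
# white True True
```

Note that ^(?!red|green|blue).* rejects any string that merely starts with one of the words, such as "redwood". Grouping the alternatives, as in ^(?!.*(red|green|blue)) from the answer above, applies the .* to all three.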

Durden81
7

Matching any text except the parts that match a pattern is usually achieved by splitting the string with the regex pattern.

Examples:

  • C# - Regex.Split(text, @"red|green|blue") or, to get rid of empty values, Regex.Split(text, @"red|green|blue").Where(x => !string.IsNullOrEmpty(x)) (see demo)
  • VB.NET - Regex.Split(text, "red|green|blue") or, to remove empty items, Regex.Split(text, "red|green|blue").Where(Function(s) Not String.IsNullOrWhitespace(s)) (see demo, or this demo where LINQ is supported)
  • JavaScript - text.split(/red|green|blue/) (no need to use the g modifier here!); to get rid of empty values, use text.split(/red|green|blue/).filter(Boolean) (see demo)
  • Java - text.split("red|green|blue"), or, to keep all trailing empty items, use text.split("red|green|blue", -1), or to remove all empty items use more code to remove them (see demo)
  • Groovy - Similar to Java, text.split(/red|green|blue/); to get all trailing items use text.split(/red|green|blue/, -1) and to remove all empty items use text.split(/red|green|blue/).findAll {it != ""} (see demo)
  • Kotlin - text.split(Regex("red|green|blue")) or, to remove blank items, use text.split(Regex("red|green|blue")).filter{ !it.isBlank() } (see demo)
  • Scala - text.split("red|green|blue"), or to keep all trailing empty items, use text.split("red|green|blue", -1) and to remove all empty items, use text.split("red|green|blue").filter(_.nonEmpty) (see demo)
  • Ruby - text.split(/red|green|blue/); to get rid of empty values use .split(/red|green|blue/).reject(&:empty?) (and to get both leading and trailing empty items, use -1 as the second argument, .split(/red|green|blue/, -1)) (see demo)
  • Perl - my @result1 = split /red|green|blue/, $text;, or with all trailing empty items, my @result2 = split /red|green|blue/, $text, -1;, or without any empty items, my @result3 = grep { /\S/ } split /red|green|blue/, $text; (see demo)
  • PHP - preg_split('~red|green|blue~', $text) or preg_split('~red|green|blue~', $text, -1, PREG_SPLIT_NO_EMPTY) to output no empty items (see demo)
  • Python - re.split(r'red|green|blue', text) or, to remove empty items, list(filter(None, re.split(r'red|green|blue', text))) (see demo)
  • Go - Use regexp.MustCompile("red|green|blue").Split(text, -1), and if you need to remove empty items, use this code. See Go demo.

NOTE: If your patterns contain capturing groups, regex split functions/methods may behave differently, also depending on additional options. Please refer to the documentation of the split method you are using.
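As a concrete case, here is a short Python sketch of the split approach, including the capturing-group behavior mentioned in the note; the sample text is made up:

```python
import re

text = "whiteredblackgreen"

# Everything between the delimiters; a trailing delimiter leaves an empty item.
parts = re.split(r"red|green|blue", text)
print(parts)  # ['white', 'black', '']

# Drop the empty items.
non_empty = [p for p in parts if p]
print(non_empty)  # ['white', 'black']

# With a capturing group, re.split also returns the delimiters themselves.
with_delims = re.split(r"(red|green|blue)", text)
print(with_delims)  # ['white', 'red', 'black', 'green', '']
```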

Wiktor Stribiżew
0

All except word "red"

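var href = '(text-1) (red) (text-3) (text-4) (text-5)';

// Match each parenthesized group unless its content starts with the whole word "red".
var test = href.replace(/\((\b(?!red\b)[\s\S]*?)\)/g, testF);

// With one capture group, the callback receives (match, p1, offset, fullString).
function testF(match, p1, offset, str_full) {
  return "-" + p1 + "-";
}

console.log(test); // "-text-1- (red) -text-3- -text-4- -text-5-"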

All except word "red"

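var href = '(text-1) (frede) (text-3) (text-4) (text-5)';

// Match every parenthesized group, then strip "red" from its content in the callback.
var test = href.replace(/\(([\s\S]*?)\)/g, testF);

// With one capture group, the callback receives (match, p1, offset, fullString).
function testF(match, p1, offset, str_full) {
  p1 = p1.replace(/red/g, '');
  return "-" + p1 + "-";
}

console.log(test); // "-text-1- -fe- -text-3- -text-4- -text-5-"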