
I am struggling to write a regular expression that removes the items whose status is "notify" or "not in stock".

item1 120usd not in stock item2 150usd in stock item3 100usd notify item4 12usd in stock item5 25usd not in stock item6 250usd notify item7 50usd in stock item8 30sud in stock item9 5usd notify item10 5usd notify

The following regular expression matches all items: .*?(notify|not in stock|in stock)

I have tried removing "in stock" from the regular expression, but then the grouping gets messed up: a match no longer stops at an "in stock" item and runs on into the next one.

https://regex101.com/r/KyEg6k/1

All help is appreciated :)

mocet

1 Answer


One option could be to match what you don't want and capture in a group what you want to keep.

To keep a match from running across "in stock", "not in stock", or "notify", you could use a tempered greedy token with a negative lookahead.

\bin stock\b|(?:\s+|^)((?:(?!\b(?:notify|not in stock|in stock)\b).)+\b(?:notify|not in stock)\b)
  • \bin stock\b Match "in stock" between word boundaries to prevent a partial match
  • | Or
  • (?:\s+|^) Match either 1+ whitespace chars or assert the start of the string, so that the first item is also matched
  • ( Capture group 1 (Referred to by m[1] in the example code)
    • (?: Non capture group for the tempered dot
      • (?!\b(?:notify|not in stock|in stock)\b). Negative lookahead, assert that none of the alternatives starts directly to the right. If that is true, match any char using the .
    • )+ Close the group and repeat 1+ times
    • \b(?:notify|not in stock)\b Match one of the alternatives between word boundaries
  • ) Close group 1
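
To illustrate the difference from a plain lazy dot, here is a minimal sketch on a shortened, made-up sample string (the names sample, lazy, and tempered are only for illustration). The lazy variant is assumed to be roughly what you get after dropping "in stock" from the question's pattern; it runs straight across the in-stock item, while the pattern above yields only the unwanted item in group 1.

```javascript
// Hypothetical shortened sample, not the asker's full input.
const sample = "item1 1usd in stock item2 2usd notify";

// Lazy dot: nothing stops it from running across "in stock".
const lazy = /.*?\b(?:notify|not in stock)\b/g;
const lazyMatches = sample.match(lazy);

// Pattern from this answer: the first alternative consumes "in stock",
// and the tempered dot cannot step onto the start of a status keyword.
const tempered = /\bin stock\b|(?:\s+|^)((?:(?!\b(?:notify|not in stock|in stock)\b).)+\b(?:notify|not in stock)\b)/g;
const temperedGroups = Array.from(sample.matchAll(tempered), m => m[1])
  .filter(Boolean); // drop the matches where group 1 did not participate

console.log(lazyMatches);    // one match spanning both items
console.log(temperedGroups); // only the "notify" item
```

Note that the first alternative matters here: without it, a later match attempt could start inside "in stock" and slip past the lookahead.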

Regex demo

const str = "item1 120usd not in stock item2 150usd in stock item3 100usd notify item4 12usd in stock item5 25usd not in stock item6 250usd notify item7 50usd in stock item8 30sud in stock item9 5usd notify item10 5usd notify";
// Alternative 1 consumes "in stock"; alternative 2 captures an unwanted item in group 1.
const regex = /\bin stock\b|(?:\s+|^)((?:(?!\b(?:notify|not in stock|in stock)\b).)+\b(?:notify|not in stock)\b)/g;
for (const m of str.matchAll(regex)) {
  // Group 1 is only set for items ending in "notify" or "not in stock".
  if (m[1]) {
    console.log(m[1]);
  }
}
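
If the goal is to actually remove the unwanted items rather than list them, one possible sketch is to pass a callback to String.prototype.replace: drop a match when group 1 is set, and put it back unchanged otherwise. The helper name removeUnavailable is hypothetical and not part of the original answer; the final trim() is an assumption to tidy up the leading space left behind.

```javascript
const regex = /\bin stock\b|(?:\s+|^)((?:(?!\b(?:notify|not in stock|in stock)\b).)+\b(?:notify|not in stock)\b)/g;

// Hypothetical helper: remove every match that set group 1, keep everything else.
function removeUnavailable(input) {
  return input
    // `unwanted` is group 1; it is undefined when only "in stock" matched.
    .replace(regex, (match, unwanted) => (unwanted ? "" : match))
    .trim();
}

const str = "item1 120usd not in stock item2 150usd in stock item3 100usd notify item4 12usd in stock item5 25usd not in stock item6 250usd notify item7 50usd in stock item8 30sud in stock item9 5usd notify item10 5usd notify";
console.log(removeUnavailable(str));
```

This only works where the replacement can be a function; a tool that accepts just a pattern and a replacement string would need a different pattern.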
The fourth bird
  • That is awesome :) Thank you for your help and time. A minor thing: in the regex demo, the text "in stock" shows up as match 2, match 4, match 7, and match 8. Is it possible to exclude these? – mocet May 01 '21 at 17:22
  • @mocet That is the match that you don't want. What you do want is in group 1, as shown in the demo logging the value of group 1. – The fourth bird May 01 '21 at 17:24
  • Do you use this in the browser? JavaScript has limited support for lookbehinds. – The fourth bird May 01 '21 at 17:25
  • Yes, I use this in the browser (Chrome) via a Chrome extension (pageprobe). It has just two columns: one for the regular expression and one for the value to replace each match with. – mocet May 01 '21 at 20:10
  • @mochet like this? https://regex101.com/r/rfU7hY/1 `(? – The fourth bird May 01 '21 at 21:47
  • This is exactly what I am looking for. You guys rock!! Thank you soo much! – mocet May 02 '21 at 11:04