I need help writing a regex that matches a string in parentheses that starts with "pattern:" and may itself contain nested parentheses. (Note: the text in the 'parent' parentheses may have no nested parentheses at all.)

Examples:

Some text (pattern: SOME TEXT THAT (I WANT TO EXTRACT)) a bit more text (another pattern: ignore that text) and may be a little more text

Result should be: SOME TEXT THAT (I WANT TO EXTRACT)

Some text (pattern: SOME TEXT THAT I WANT TO EXTRACT) a bit more text (another pattern: ignore that text) and may be a little more text

Result should be SOME TEXT THAT I WANT TO EXTRACT

The regex I've tried, /\((pattern:?\s?([^\)]+))\)/gi, stops at the first ), so it cuts off the closing parenthesis of the nested group.

random_user_name
To_wave
  • In the end you forget to escape one of the `)`, also `[^)]` should suffice, no need to escape characters within square brackets – Alexander Derck Nov 24 '17 at 23:04
  • I guess I need to clarify my question. Another problem is that the text in the 'parent' parentheses could be without nested parentheses. Case 1: `Some text (pattern: SOME TEXT THAT (I WANT TO EXTRACT)) a bit more text (another pattern: ignore that text) and may be a little more text` result is ` SOME TEXT THAT (I WANT TO EXTRACT)`. Case 2: `Some text (pattern: SOME TEXT THAT I WANT TO EXTRACT) a bit more text (another pattern: ignore that text) and may be a little more text` result is `SOME TEXT THAT I WANT TO EXTRACT` – To_wave Nov 25 '17 at 00:57
  • In every case the string is `SOME TEXT THAT I WANT TO EXTRACT` - is that the exact string you want? Or could that string contain other set of words / characters? – random_user_name Nov 25 '17 at 15:56

2 Answers

To extract the text from your example data, I think you can use this regex:

\(pattern:?\s?(.+?\)?)\)

  • match \(pattern
  • an optional colon: :?
  • an optional whitespace \s?
  • start capturing group (
  • capture one or more characters non greedy .+?
  • an optional \)
  • close capturing group
  • match \)

    // Two sample sentences joined: one with nested parentheses, one without.
    var string = "Some text (pattern: SOME TEXT THAT (I WANT TO EXTRACT)) a bit more text (another pattern: ignore that text) and may be a little more text Some text (pattern: SOME TEXT THAT I WANT TO EXTRACT) a bit more text (another pattern: ignore that text) and may be a little more text";
    var myRegexp = /\(pattern:?\s?(.+?\)?)\)/g;
    var matches;
    // With the g flag, each exec call resumes after the previous match.
    while ((matches = myRegexp.exec(string)) !== null) {
        console.log(matches[1]); // capture group 1: the text after "pattern:"
    }
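
One caveat that may be worth checking: the optional \)? only absorbs a closing parenthesis that sits immediately before the final one, so input with more text after the nested group appears to be cut short. The sketch below uses the same pattern as above; both sample strings are made up for illustration and are not from the question.

```javascript
// Same pattern as above, without the g flag so each exec call starts fresh.
const endRe = /\(pattern:?\s?(.+?\)?)\)/;

// Nested group at the very end of the outer parentheses: captured whole.
const atEnd = endRe.exec("x (pattern: A (B)) y")[1];

// Hypothetical input with text after the nested group: the capture stops early.
const inMiddle = endRe.exec("x (pattern: A (B) C) y")[1];

console.log(atEnd);    // A (B)
console.log(inMiddle); // A (B
```

If the nested parentheses can appear anywhere inside the outer pair, a different pattern or a small parser is probably a better fit.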
The fourth bird

So you are looking for all text in parentheses that starts with "pattern: " and is followed by a string that may optionally include a matching pair of parentheses.

This is far from readable, but this will do it:

\(pattern:([^()]+|[^(]+\([^)]*\)[^()]*)\)

In words: look for (pattern: ...) where ... is EITHER a string of characters that are not parentheses (let's call them NPCs; that's the [^()]+ part) OR a series of NPCs, followed by an opening parenthesis, a series of NPCs, a closing parenthesis, and optionally another string of NPCs (that's the [^(]+\([^)]*\)[^()]* part).
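
As a rough illustration, the pattern could be applied in JavaScript like this. The variable names and the sample sentence (adapted from the question) are assumptions for the sketch. Because the pattern has no \s? after the colon, the captured text starts with a space, so the sketch trims it.

```javascript
// The pattern from this answer, with the g flag to find every occurrence.
const nestedRe = /\(pattern:([^()]+|[^(]+\([^)]*\)[^()]*)\)/g;

const text =
  "Some text (pattern: SOME TEXT THAT (I WANT TO EXTRACT)) a bit more text " +
  "(another pattern: ignore that text) and (pattern: SOME TEXT THAT I WANT TO EXTRACT) more";

// Collect capture group 1 of each match, trimming the leading space.
const results = [];
for (const m of text.matchAll(nestedRe)) {
  results.push(m[1].trim());
}

console.log(results);
```

The "(another pattern: ...)" group is skipped because the regex requires "pattern" to follow the opening parenthesis directly.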

The pattern does not handle deeper levels of nesting, of course, but if I understand the question right, you don't need that. (JavaScript regular expressions have no recursion, so no single pattern can handle arbitrarily deep nesting.)
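
If deeper nesting ever does matter, one alternative to a regex is to scan the string and count parenthesis depth. The following is only a minimal sketch of that idea: extractPatternGroups and its marker parameter are hypothetical names, the match is case-sensitive (unlike the i flag in the question), and scanning simply stops on unbalanced input.

```javascript
// Hypothetical helper: returns the contents of every "(pattern: ...)" group,
// tracking parenthesis depth so any level of nesting is kept intact.
function extractPatternGroups(text, marker = "(pattern:") {
  const found = [];
  let start = text.indexOf(marker);
  while (start !== -1) {
    let depth = 1; // the marker's own "(" is already open
    let i = start + marker.length;
    for (; i < text.length && depth > 0; i++) {
      if (text[i] === "(") depth++;
      else if (text[i] === ")") depth--;
    }
    if (depth !== 0) break; // unbalanced input: stop scanning
    // i is one past the matching ")", so slice up to i - 1 to exclude it.
    found.push(text.slice(start + marker.length, i - 1).trim());
    start = text.indexOf(marker, i);
  }
  return found;
}

const groups = extractPatternGroups(
  "a (pattern: X (Y (Z)) W) b (another pattern: no) c (pattern: plain)"
);
console.log(groups);
```

A plain loop like this trades the compactness of a regex for the ability to handle any nesting depth.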

ehrencrona