-1

So I have a String of text lets say "hello world () (foo bar) (foo bar 2 (this looks cozy)) (foo bar 3..." Is there a regex pattern I could use that will get the parentheses and include any parentheses inside them to nth depth. So the matches would be "()", "(foo bar)", "(foo bar 2 (this looks cozy))", ...?

NotAFlyingGoose
  • 103
  • 1
  • 9

1 Answers1

1

Regex flavor in Java doesn't support recursion like some other flavors do. Instead you can write your own method which will build strings from characters only if they are:

  • (
  • )
  • inside parenthesis.

To know if currently handled character is inside parenthesis we can create counter which will check parenthesis balance (you can also think of it as counter for nesting level). In short: if we saw more ( than ) then we are inside unclosed (open) parenthesis section, so we should add current character to resulting string.

Using that idea our code can look like:

String str = "hello world () (foo bar) (foo bar 2 (this looks cozy)) (foo bar 3...)";
List<String> result = new ArrayList<>();

int parenthesisNestingLevel = 0;
StringBuilder sb = new StringBuilder();
for (char ch : str.toCharArray()) {
    if (ch == '(') {
        parenthesisNestingLevel++;
        sb.append(ch);
    } else if (ch == ')') {
        parenthesisNestingLevel--;
        sb.append(ch);
        if (parenthesisNestingLevel == 0) {
            result.add(sb.toString());
            sb.delete(0, sb.length());//reset sb
        }
    } else if (parenthesisNestingLevel > 0) {//we are inside unclosed parenthesis
        sb.append(ch);
    }
}

result.forEach(System.out::println);

Output:

()
(foo bar)
(foo bar 2 (this looks cozy))
(foo bar 3...)
Pshemo
  • 113,402
  • 22
  • 170
  • 242
  • No 'yet' about it. Regular expressions can't support recursion by definition. Chomsky hierarchy 1956. – user207421 Aug 09 '20 at 23:41
  • @MarquisofLorne Thank you for information. Will need to study it since I am not rally familiar with regex history an theories behind it. But now what confuses me (and is main reason why I added "yet") is existence of `(?R)` in some regex flavors. Based on example https://regex101.com/r/NLElp8/1 from https://stackoverflow.com/a/35271017 it looks like it represents recursion. So are those flavors no longer proper regex? – Pshemo Aug 10 '20 at 00:30
  • 1
    That is correct. They are no longer proper regular expressions, by definition. – user207421 Aug 10 '20 at 09:55
  • @MarquisofLorne Thank you for clarification. – Pshemo Aug 10 '20 at 15:09