How can I validate whether code blocks, as in a construct like:

{
    // Any amount of characters that aren't '{' or '}'
}

are properly nested, preferably with a regex?

{} {
    {} {}
} // Properly nested
{{
    {{}}
} {} // Not properly nested

As noted in this thread, approaches such as recursion and balancing groups do not apply here, because those regular expression constructs are not available in Java's Pattern.

Unihedron

2 Answers

Why do this with regex? I suggest building your own parser. Something like this:

public static boolean isProperlyNested(String toTest) {
    int countOpen = 0;
    for (char c : toTest.toCharArray()) {
        if (c == '{') {
            countOpen++;
        } else if (c == '}') {
            countOpen--;
            if (countOpen < 0) return false;
        }
    }
    return countOpen == 0;
}
Justin
  • Note that this is a simplification. For example, what should happen with a piece of code where an opening or closing brace is hidden in a comment? That, BTW, is where a stack or a state machine comes in. – Mark Rotteveel Sep 01 '14 at 15:17
  • @MarkRotteveel The stack doesn't really work any better. Just include some booleans that toggle for comments/strings/chars/etc, and don't change the count if one of them is true. – Justin Sep 01 '14 at 15:23
  • A stack is probably overkill for this specific example (but what if I want to track balanced braces **and** parentheses?), although I'd sooner opt for an enum than a set of booleans to track the current state of the parser (and I actually did just that for Jaybird; it allows for leaving the state switch to the state itself: http://sourceforge.net/p/firebird/code/HEAD/tree/client-java/trunk/src/main/org/firebirdsql/jdbc/escape/FBEscapedParser.java#l364 ). – Mark Rotteveel Sep 01 '14 at 15:32
  • @MarkRotteveel Yes, that's true; a stack to store the character would be a good idea for testing with multiple types of matching characters. – Justin Sep 01 '14 at 15:36
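
The ideas from these comments (a stack for several bracket types, plus an enum for the parser state) can be combined roughly as follows. This is only a sketch under assumptions: the class name NestingChecker and the State enum are made up for illustration, it handles only `//` comments, `/* */` comments, and double-quoted strings, and it does not handle char literals such as '{'.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class NestingChecker {
    // Hypothetical parser states; brackets are only counted in NORMAL.
    private enum State { NORMAL, LINE_COMMENT, BLOCK_COMMENT, STRING }

    public static boolean isProperlyNested(String src) {
        Deque<Character> open = new ArrayDeque<>(); // unmatched opening brackets
        State state = State.NORMAL;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            char next = i + 1 < src.length() ? src.charAt(i + 1) : '\0';
            switch (state) {
                case LINE_COMMENT:
                    if (c == '\n') state = State.NORMAL;
                    break;
                case BLOCK_COMMENT:
                    if (c == '*' && next == '/') { state = State.NORMAL; i++; }
                    break;
                case STRING:
                    if (c == '\\') i++;                     // skip the escaped character
                    else if (c == '"') state = State.NORMAL;
                    break;
                default: // NORMAL
                    if (c == '/' && next == '/') { state = State.LINE_COMMENT; i++; }
                    else if (c == '/' && next == '*') { state = State.BLOCK_COMMENT; i++; }
                    else if (c == '"') state = State.STRING;
                    else if (c == '{' || c == '(') open.push(c);
                    else if (c == '}' || c == ')') {
                        char expected = (c == '}') ? '{' : '(';
                        // A closer must match the most recent unmatched opener.
                        if (open.isEmpty() || open.pop() != expected) return false;
                    }
            }
        }
        return open.isEmpty();
    }
}
```

With a single bracket type the stack degenerates into the counter from the answer; it only pays off once closers have to be matched against a specific opener, as in "{ ( } )".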

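This can be solved in two steps with a loop: first strip every character that is not a brace, then repeatedly remove "{}" pairs until none are left. The input is properly nested if nothing remains: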

{
    String str1 = "{} {\n" +
                  "    {} {}\n" +
                  "} // Properly nested",
           str2 = "{{\n" +
                  "    {{}}\n" +
                  "} {} // Not properly nested";
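    final Pattern pattern = Pattern.compile("\\Q{}\\E");

    // Strip everything except braces first, so that the final isEmpty()
    // check is also correct for input that contains no "{}" pair at all.
    str1 = str1.replaceAll("[^{}]", "");
    Matcher matcher = pattern.matcher(str1);
    while (matcher.find()) // keep deleting innermost "{}" pairs
        matcher = pattern.matcher(str1 = matcher.replaceAll(""));
    System.out.println(str1.isEmpty()); // true

    str2 = str2.replaceAll("[^{}]", "");
    matcher = pattern.matcher(str2);
    while (matcher.find())
        matcher = pattern.matcher(str2 = matcher.replaceAll(""));
    System.out.println(str2.isEmpty()); // false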
}

Here is an online code demo. The demo differs slightly from the code above so that it also shows the original string layout.
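
For reuse, the strip-and-reduce loop could be folded into a helper method. This is a minimal sketch; the class name BraceReducer and the method name are hypothetical and do not come from the demo:

```java
import java.util.regex.Pattern;

public class BraceReducer {
    private static final Pattern NON_BRACE = Pattern.compile("[^{}]");
    private static final Pattern EMPTY_PAIR = Pattern.compile("\\{\\}");

    public static boolean isProperlyNested(String input) {
        // Step 1: keep only the braces.
        String braces = NON_BRACE.matcher(input).replaceAll("");
        // Step 2: delete "{}" pairs until a pass no longer changes anything.
        String reduced = EMPTY_PAIR.matcher(braces).replaceAll("");
        while (reduced.length() != braces.length()) {
            braces = reduced;
            reduced = EMPTY_PAIR.matcher(braces).replaceAll("");
        }
        // Anything left over is an unmatched brace.
        return reduced.isEmpty();
    }
}
```

Each pass only removes the innermost pairs, so the number of passes grows with the nesting depth. For deeply nested input this does more work than a single counting pass.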

Unihedron