0

I need a regular expression that matches braces correctly, i.e. one that pairs every opening brace with its closing brace. For example, given abc{abc{bc}xyz}, it should match all of {abc{bc}xyz}, not just {abc{bc}.

I tried using (\{.*?})

Brad Mace
xyz

6 Answers

4

This is not possible with regular expressions in the formal sense. Matching arbitrarily nested braces requires a context-free grammar, and regular expressions can only describe regular languages.

According to this link there is an extension available for the regular expressions in .NET that can do this, but this just means that .NET regular expressions are more than just regular expressions.

Trey Hunner
2

This is not a task for a regular expression. What you're looking for at that point is a parser. Which means a language grammar, LL(1), LALR, recursive-descent, the dragon book, and generally a splitting migraine.

rossipedia
  • Yes, I know... not a horribly helpful answer, but I still have nightmares about the Dragon Book. ***shudder*** – rossipedia May 06 '10 at 04:55
2

Balanced parentheses of arbitrary nesting depth do not form a regular language. They form a context-free language.

That said, many "regular expression" implementations actually recognize more than regular languages, so this is possible with some implementations but not others.

Wikipedia
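To illustrate the distinction: a fixed maximum nesting depth is still a regular language, so a plain pattern can handle it; only unbounded depth is out of reach. The following is a minimal Python sketch of that idea, not a general solution. The helper name `bounded_brace_pattern` is made up for this example, and it assumes you know an upper bound on the nesting depth in advance.

```python
import re

def bounded_brace_pattern(max_depth: int) -> str:
    """Build a pattern for one brace group nested at most max_depth levels deep.

    Hypothetical helper for illustration only.
    """
    # Innermost level: only non-brace characters between the braces.
    inner = r"[^{}]*"
    for _ in range(max_depth - 1):
        # Each extra level allows a non-brace character or one nested group.
        inner = r"(?:[^{}]|\{" + inner + r"\})*"
    return r"\{" + inner + r"\}"

text = "abc{abc{bc}xyz}"

# Depth 2 is enough for the example in the question.
print(re.search(bounded_brace_pattern(2), text).group())  # {abc{bc}xyz}

# Depth 1 cannot see the outer group, so it only finds the inner one.
print(re.search(bounded_brace_pattern(1), text).group())  # {bc}
```

The pattern grows with every level you allow, which is why engines that want to support arbitrary depth add recursion or balancing extensions instead.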

polygenelubricants
1

As Bryan said, regular expressions might not be the right tool here, but if you're using PHP, the manual gives an example of how you might be able to use regular expressions in a recursive/nested fashion:

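$input = "plain [indent] deep [indent] deeper [/indent] deep [/indent] plain";

function parseTagsRecursive($input)
{
    // (?R) recurses into the whole pattern, so nested [indent] blocks are matched as a unit.
    $regex = '#\[indent]((?:[^[]|\[(?!/?indent])|(?R))+)\[/indent]#';

    // When invoked as the callback, $input is the array of matches; wrap group 1 in a div.
    if (is_array($input)) {
        $input = '<div style="margin-left: 10px">'.$input[1].'</div>';
    }

    // The callback is this same function, so inner blocks are replaced on the recursive calls.
    return preg_replace_callback($regex, 'parseTagsRecursive', $input);
}

$output = parseTagsRecursive($input);

echo $output;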

I'm not sure if that'll be helpful to you or not.

nickf
0

This is not possible in the "standard" regular expression language. However, a few different implementations have extensions that allow you to implement it. For example, here's a blog post that explains how to do it with .NET's regex library.

Generally speaking though, this is a task that regular expressions are not really suited to.

Dean Harding
0

Assuming what you want to do is select a maximal substring between { and }:

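.*? is a lazy quantifier. That is, it will match the fewest characters possible, so your expression stops at the first } it finds. If you change your expression to the greedy {.*}, it will run from the first { to the last } in the input, and you should find it works for your example.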
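Here is a quick way to see the difference, sketched with Python's `re` module (the question doesn't name an engine, so treat the choice of Python as an assumption). The last line shows the catch with the greedy version: it does no balancing, it simply runs to the last `}`.

```python
import re

text = "abc{abc{bc}xyz}"

# Lazy: stops at the first closing brace it can reach.
lazy = re.search(r"\{.*?\}", text).group()
# Greedy: extends to the last closing brace in the input.
greedy = re.search(r"\{.*\}", text).group()

print(lazy)    # {abc{bc}
print(greedy)  # {abc{bc}xyz}

# Caveat: two separate groups are swallowed as a single greedy match.
print(re.search(r"\{.*\}", "{a} and {b}").group())  # {a} and {b}
```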

If what you want to do is to verify that the braces are matched correctly, then as the other answers have stated, this is not possible with a (single) regular expression. You can do it by scanning the string with a stack though. Or with some voodoo of iterating your regular expression over the previous maximal match. Yikes.
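For the stack approach, something along these lines should do. This is a rough Python sketch under my own assumptions: the function name `outer_brace_groups` is invented for the example, braces are never escaped or quoted, and unbalanced input is reported by raising `ValueError`.

```python
def outer_brace_groups(text: str) -> list[str]:
    """Return every top-level {...} group; raise ValueError if braces are unbalanced."""
    groups = []
    stack = []  # indices of '{' characters that are still open
    for i, ch in enumerate(text):
        if ch == "{":
            stack.append(i)
        elif ch == "}":
            if not stack:
                raise ValueError(f"unmatched '}}' at index {i}")
            start = stack.pop()
            if not stack:  # the stack is empty again, so an outermost group just closed
                groups.append(text[start:i + 1])
    if stack:
        raise ValueError(f"unmatched '{{' at index {stack[-1]}")
    return groups

print(outer_brace_groups("abc{abc{bc}xyz}"))  # ['{abc{bc}xyz}']
print(outer_brace_groups("{a} and {b{c}}"))   # ['{a}', '{b{c}}']
```

Unlike the greedy pattern, this keeps separate top-level groups apart and tells you where the first mismatch is.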

Nick Lewis
  • Not all regex engines support lazy quantifiers so if you do use them make sure yours supports it. Here's a related SO question: http://stackoverflow.com/questions/546433/regular-expression-to-match-outer-brackets – Trey Hunner May 06 '10 at 07:06
  • @Nick Lewis: I think you mean that lazy quantifiers will match the *most* number of characters possible, right? – Trey Hunner May 06 '10 at 07:06
  • @Trey Greedy quantifiers match the most characters possible, lazy quantifiers match the fewest. It's intuitive in that the lazy quantifier will stop as soon as possible (lazy) and the greedy quantifier will consume as much as possible (greedy). – Nick Lewis May 06 '10 at 07:12
  • You're correct. I think another way to match lazily in this special case would be: `{[^}]*}` – Trey Hunner May 06 '10 at 07:26
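For anyone comparing the two patterns from these comments, here is a small check in Python's `re` module (an assumed engine; others may behave differently). On the question's input they give the same match. They differ once the group spans lines, because `.` does not match a newline by default while a negated character class does.

```python
import re

text = "abc{abc{bc}xyz}"
print(re.search(r"\{.*?\}", text).group())    # {abc{bc}
print(re.search(r"\{[^}]*\}", text).group())  # {abc{bc}

multiline = "{first\nsecond}"
# The lazy dot cannot cross the newline, so there is no match at all.
print(re.search(r"\{.*?\}", multiline))  # None
# The negated class happily matches the newline.
print(re.search(r"\{[^}]*\}", multiline).group() == multiline)  # True
```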