I would like to extract the following three values:
aaa
bb b={{b}}bb bbb
{ccc} ccc
from the following string using a regular expression:
zyx={aaa}, yzx={bb b={{b}}bb bbb}, xyz={{ccc} ccc}
Note: aaa stands for an arbitrary sequence of characters, with no fixed length or pattern. For instance, {ccc} ccc could be {cccccccccc}cc {cc} cccc cccc, or any other combination.
I have written the following regular expression:
(?<a>[^{}]*)\s*=\s*{((?<v>[^{}]+)*)},*
This expression extracts aaa, but fails on the rest of the input with catastrophic backtracking, because of the nested curly brackets.
Any thoughts on how I can update the regex to process the nested brackets correctly?
(In case you need engine-specific options: I am using C# on .NET Core 3.0. Also, I would rather not do any magic in the code, and would prefer to work with the regex pattern only.)
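For reference, here is a minimal sketch of how the first pattern can be run against the sample input. The class name Repro, the Extract helper, and the 2-second match timeout are only illustrative choices for this reproduction, not part of my actual code; the timeout is there so that runaway backtracking surfaces as an exception instead of hanging.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Repro
{
    // Input string from the question.
    public const string Input = "zyx={aaa}, yzx={bb b={{b}}bb bbb}, xyz={{ccc} ccc}";

    // The three values I want to end up with, in order.
    public static readonly string[] Expected = { "aaa", "bb b={{b}}bb bbb", "{ccc} ccc" };

    // My first attempt, copied unchanged.
    public const string Pattern = @"(?<a>[^{}]*)\s*=\s*{((?<v>[^{}]+)*)},*";

    // Returns the (key, value) pairs that the pattern finds in the input.
    public static List<KeyValuePair<string, string>> Extract(string input)
    {
        // Arbitrary 2-second timeout (an assumption for this sketch): excessive
        // backtracking then throws RegexMatchTimeoutException instead of hanging.
        var regex = new Regex(Pattern, RegexOptions.None, TimeSpan.FromSeconds(2));
        var pairs = new List<KeyValuePair<string, string>>();

        foreach (Match m in regex.Matches(input))
        {
            string key = m.Groups["a"].Value.Trim();
            string value = m.Groups["v"].Value;
            pairs.Add(new KeyValuePair<string, string>(key, value));
        }

        return pairs;
    }

    public static void Main()
    {
        Console.WriteLine("Wanted: " + string.Join(" | ", Expected));
        try
        {
            foreach (var pair in Extract(Input))
            {
                Console.WriteLine($"{pair.Key} -> {pair.Value}");
            }
        }
        catch (RegexMatchTimeoutException)
        {
            Console.WriteLine("The match timed out (excessive backtracking).");
        }
    }
}
```

Comparing the printed pairs with the Wanted line shows which of the three values the pattern still misses.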
Similar question
The question "Regular expression to match balanced parentheses" is similar to this one, with one difference: here the parentheses are not necessarily balanced; rather, they follow the x={y} pattern.
Update 1
Inputs such as the following are also possible:
yzx={bb b={{b}},bb bbb,},
Note the comma after {{b}} and the comma after bbb.
Update 2
I wrote the following pattern, which matches everything except aaa from the first example:
(?<A>[^{}]*)\s*=\s*{(?<V>(?<S>([^{}]?)\{(?:[^}{]+|(?&S))+\}))}(,|$)