I have a text like this:

UseProp1?(Prop1?Prop1:Test):(UseProp2?Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))

I'm trying to use a regex in C# to extract the nested if/else segments.

To find '?' I've used:

Pattern 1: \?\s*(?![^()]*\))

and to find ':' I've used:

Pattern 2: \:\s*(?![^()]*\))

This works fine when there is one level of parentheses but not when nesting them.

I've used this online tester to simplify testing: http://regexstorm.net/tester (insert Pattern 1 and the input from above).

As you can see, it highlights two matches, but I only want the first. You'll also notice that the '?' inside the first pair of parentheses is skipped, but the one inside the second pair, which contains a nested level, is not.

I expect the match list to be:

1) UseProp1

2) (Prop1?Prop1:Test):(UseProp2?Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))

What I'm getting now is:

1) UseProp1

2) (Prop1?Prop1:Test):(UseProp2

3) Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))
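
The unwanted split can be reproduced outside the online tester. The following is a minimal sketch, assuming the pieces listed above come from splitting the input on Pattern 1 with `Regex.Split`; the class and method names (`SplitRepro`, `SplitOnPattern1`) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitRepro
{
    // Pattern 1 from the question: a '?' that is not followed by a run of
    // non-parenthesis characters ending in ')'.
    public const string Pattern1 = @"\?\s*(?![^()]*\))";

    public const string Input =
        "UseProp1?(Prop1?Prop1:Test):(UseProp2?Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))";

    // Splits the input wherever Pattern 1 matches.
    public static string[] SplitOnPattern1(string input)
    {
        return Regex.Split(input, Pattern1);
    }

    public static void Main()
    {
        // Prints one piece per line.
        foreach (string part in SplitOnPattern1(Input))
        {
            Console.WriteLine(part);
        }
    }
}
```

The lookahead only inspects the text up to the next parenthesis of either kind. The '?' after UseProp2 is followed by an opening parenthesis before any closing one, so it is treated as if it were outside all parentheses and produces the extra split.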

  • Possible duplicate of [Regular expression to match balanced parentheses](https://stackoverflow.com/questions/546433/regular-expression-to-match-balanced-parentheses) – Rhaokiel Jun 06 '19 at 19:08
  • What are the expected results? – Wiktor Stribiżew Jun 06 '19 at 19:14
  • @Rhaokiel, thanks, the linked answer puts me closer to the finish line. The regex pattern \(([^()]|(?R))*\) identifies my parentheses correctly (with the exception of highlighting a 't'). Can I use this as an exclude pattern in one regular expression? – eivindeizer Jun 06 '19 at 19:21
  • @WiktorStribiżew, just added expected match list and my current match list with the current pattern – eivindeizer Jun 06 '19 at 19:27
  • A classical nested parentheses regex is `\((?>[^()]+|(?<o>\()|(?<-o>\)))*(?(o)(?!))\)`. Does it work? Does it return the expected matches? Your example does not look clear to me. Maybe `(?:(\w+)\?)?\(((?>[^()]+|(?<o>\()|(?<-o>\)))*(?(o)(?!)))\)` will do the job? – Wiktor Stribiżew Jun 06 '19 at 19:40
  • It works just fine if you're trying to grab just the outer parentheses and its contents. To me it sounds like eivindeizer wants more than just that. He also wants to separate the ternary parts ? and :. – Rhaokiel Jun 06 '19 at 19:43
  • @eivindeizer, can I ask what this is being used for? In my own experience with parsing code, I've found there are more effective ways than using regex. – Rhaokiel Jun 06 '19 at 19:44
  • @Rhaokiel, my input file is based on a configuration mapping, and each level has an if/else check with only one ':' and one '?'. Based on what the properties read, it works its way down the levels of if/else encapsulated in parentheses. Perhaps regex isn't the best way to solve my problem, but I need to split the values on each side of the '?' and ':'. – eivindeizer Jun 06 '19 at 20:02
  • Based on that, don't you need 3 values, not 2, in the output? – Wiktor Stribiżew Jun 06 '19 at 20:04
  • A weird one, `(?=\b(?\w+)\?(?:(?\w+)|\((?(?>[^()]+|(?\()|(?\)))*(?(o)(?!)))\)):(?:(?\w+)|\((?(?>[^()]+|(?\()|(?\)))*(?(o)(?!))))\))`. See named groups values. – Wiktor Stribiżew Jun 06 '19 at 20:11
  • How about [this pattern](http://regexstorm.net/tester?p=%28%3f%3a%5c%28%28%3f%3e%5c%28%28%3f%3cc%3e%29%7c%5b%5e%28%29%5d%2b%7c%5c%29%28%3f%3c-c%3e%29%29*%28%3f%28c%29%28%3f!%29%29%5c%29%3a%3f%29%2b%7c%5cb%5b%5e%29%28%3f%5d%2b&i=UseProp1%3f%28Prop1%3fProp1%3aTest%29%3a%28UseProp2%3fProp2%3a%28Test+Text%3a+%27%7bTextProperty%7d%27+Test+Reference%3a%7bReference%7d%29%29%0d%0a%0d%0aUseProp3%3f%28Prop2%3fProp3%3aTest%29%3a%28UseProp2%3fProp2%3a%28Test+Text%3a+%27%7bTextProperty%7d%27+Test+Reference%3a%7bReference%7d%29%29%0d%0a%0d%0aab%3f%28cd%28de%29%28fg%29%29) – bobble bubble Jun 07 '19 at 00:39
  • In my case, the patterns from both @WiktorStribiżew and @bobblebubble are correct. I ended up using the nested parentheses regex to identify the parentheses and then exclude them from the if/else. – eivindeizer Jun 07 '19 at 06:55
  • Also, the linked answer by @Rhaokiel should be considered the solution, as it contains a more detailed answer, therefore making this a duplicate. – eivindeizer Jun 07 '19 at 07:01
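
The approach described in the last comments (identify the balanced parentheses first, then ignore them when looking for '?' and ':') could look roughly like the sketch below. It assumes three values are wanted per level (condition, true branch, false branch), uses a .NET balancing-group pattern for the parentheses, and masks them with a '#' filler that is assumed not to occur in the input. The names `TopLevelSplitter` and `Split` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TopLevelSplitter
{
    // Balanced parentheses via .NET balancing groups; 'o' counts open parentheses.
    private static readonly Regex Balanced =
        new Regex(@"\((?>[^()]+|(?<o>\()|(?<-o>\)))*(?(o)(?!))\)");

    // Returns { condition, trueBranch, falseBranch } for the outermost level,
    // or the input unchanged (as a single element) if no top-level '?' and ':' exist.
    public static string[] Split(string input)
    {
        // Mask each parenthesized group with filler of the same length,
        // so indexes in the masked string line up with the original.
        string masked = Balanced.Replace(input, m => new string('#', m.Length));

        int question = masked.IndexOf('?');
        int colon = question < 0 ? -1 : masked.IndexOf(':', question + 1);
        if (question < 0 || colon < 0)
        {
            return new[] { input };
        }

        return new[]
        {
            input.Substring(0, question),
            input.Substring(question + 1, colon - question - 1),
            input.Substring(colon + 1)
        };
    }

    public static void Main()
    {
        string input =
            "UseProp1?(Prop1?Prop1:Test):(UseProp2?Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))";
        foreach (string part in Split(input))
        {
            Console.WriteLine(part);
        }
    }
}
```

To go one level deeper, the outer parentheses of a branch would need to be stripped before calling `Split` on it again.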

2 Answers

0

Expanding on @bobble bubble's comment, here's my regex:

It captures the first level of the ternary expression. Capture groups: $1 is the condition, $2 is the true clause, and $3 is the false clause. You will then have to match the regex against each of those to step further down the tree:

((?:\((?>\((?<c>)|[^()]+|\)(?<-c>))*(?(c)(?!))\))+|\b[^)(?:]+)+\?((?:\((?>\((?<c>)|[^()]+|\)(?<-c>))*(?(c)(?!))\))+|\b[^)(?:]+)+\:((?:\((?>\((?<c>)|[^()]+|\)(?<-c>))*(?(c)(?!))\))+|\b[^)(?:]+)+
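
Stepping down the tree with this regex could be done along the lines of the sketch below. It builds the pattern from the repeated sub-expression and re-applies it to groups 2 and 3 until nothing matches. The names `TernaryTree`, `Walk` and `Describe`, as well as the indented "if/else" output format, are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TernaryTree
{
    // One operand: a balanced parenthesized group or a run of plain characters.
    private const string Part =
        @"((?:\((?>\((?<c>)|[^()]+|\)(?<-c>))*(?(c)(?!))\))+|\b[^)(?:]+)+";

    // condition ? trueClause : falseClause, as in the regex above.
    private static readonly Regex Ternary =
        new Regex(Part + @"\?" + Part + @"\:" + Part);

    // Appends an indented description of expr to output, recursing into both clauses.
    public static void Walk(string expr, int depth, List<string> output)
    {
        string indent = new string(' ', depth * 2);
        Match m = Ternary.Match(expr);
        if (!m.Success)
        {
            output.Add(indent + expr); // leaf: no ternary found at this level
            return;
        }
        output.Add(indent + "if " + m.Groups[1].Value);
        Walk(m.Groups[2].Value, depth + 1, output);
        output.Add(indent + "else");
        Walk(m.Groups[3].Value, depth + 1, output);
    }

    public static List<string> Describe(string expr)
    {
        var lines = new List<string>();
        Walk(expr, 0, lines);
        return lines;
    }

    public static void Main()
    {
        string input =
            "UseProp1?(Prop1?Prop1:Test):(UseProp2?Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))";
        foreach (string line in Describe(input))
        {
            Console.WriteLine(line);
        }
    }
}
```

Because the regex is not anchored, matching it against a clause such as (Prop1?Prop1:Test) finds the ternary inside the parentheses, so the clause does not have to be unwrapped first. A clause without a '?' is reported as a leaf and keeps its surrounding parentheses.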

That being said, if you are evaluating math in these expressions as well, it may be more valuable to use a runtime compiler to do all the heavy lifting for you. This answer will help you design in that direction if you so choose.

Rhaokiel
-2

If I understand correctly and we wish to capture only the two listed formats, we can start with a simple expression using alternation and then modify its components as needed:

UseProp1|(\(?Prop1\?Prop1(:Test)\)):(\(UseProp2\?Prop2):\((Test\sText):\s+'\{(.+?)}'\s+Test\sReference:\{(.+?)}\)\)

Test

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"UseProp1|(\(?Prop1\?Prop1(:Test)\)):(\(UseProp2\?Prop2):\((Test\sText):\s+'\{(.+?)}'\s+Test\sReference:\{(.+?)}\)\)";
        string input = @"UseProp1
(Prop1?Prop1:Test):(UseProp2?Prop2:(Test Text: '{TextProperty}' Test Reference:{Reference}))
";
        RegexOptions options = RegexOptions.Multiline;

        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}

RegEx

If this expression isn't what you need, you can modify and test it at regex101.com.

Emma