2

I am struggling to find a .NET regex that deletes a given bracketed text only if its brackets sit at nesting level 2.

Here is a sample string, spread over multiple lines to explain the problem:

(
    (text 
        (bingo)    <-- keep this (level=3)
    text)
    (text)
    (bingo)        <-- kick this (level=2)
    (text)
)

Now I need to delete the text "(bingo)" at level 2, but not at any other nesting level.

Unfortunately, I need to use a .NET regex for this. Any help is more than welcome.

Carsten
  • What about `...(text(bingo)text)...`? Although the leaf is at level 3, the enclosing parent/child pair is itself at level 2. If you would like to exclude this, you should refine your definition with something like: ***... except when it contains any child item***. – iRon Sep 02 '19 at 17:58
  • Do you really need to use a regex for this? See https://stackoverflow.com/questions/524548/regular-expression-to-detect-semi-colon-terminated-c-for-while-loops/524624#524624. – AnsFourtyTwo Sep 02 '19 at 18:24
  • I agree with @SimonFink you can't use a regex, you need a parser. The simple parser linked to will work. – joanis Sep 02 '19 at 19:47
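
The parser idea from the comments above can be sketched as a single pass over the characters with a depth counter. This is only a minimal illustration, not the code from the linked answer: the function name `Remove-BracketedAtLevel` and its parameters are hypothetical, and it assumes the brackets in the input are balanced.

```powershell
# Hypothetical helper: removes $Target wherever it starts at nesting level $Level.
function Remove-BracketedAtLevel {
    param(
        [string] $Text,
        [string] $Target = '(bingo)',
        [int]    $Level  = 2
    )
    $sb    = [System.Text.StringBuilder]::new()
    $depth = 0   # number of brackets currently open
    $i     = 0
    while ($i -lt $Text.Length) {
        # A bracketed text starting here would sit at level ($depth + 1).
        if ($depth + 1 -eq $Level -and
            [string]::CompareOrdinal($Text, $i, $Target, 0, $Target.Length) -eq 0) {
            $i += $Target.Length   # skip the target instead of copying it
            continue
        }
        $c = $Text[$i]
        if     ($c -eq '(') { $depth++ }
        elseif ($c -eq ')') { $depth-- }
        [void] $sb.Append($c)
        $i++
    }
    $sb.ToString()
}

Remove-BracketedAtLevel -Text '((text(bingo)text)(text)(bingo)(text))'
# -> ((text(bingo)text)(text)(text))
```

Because it counts brackets rather than indentation, the sketch does not care whether the input contains line breaks or spaces.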

3 Answers

2

You can solve this problem with stateful callbacks that keep track of the nesting level of the parentheses (brackets):

$txt = @'
(
    (text 
        (bingo)
    text)
    (text)
    (bingo)
    (text)
)
'@

$level = 1
[regex]::Replace($txt, '\((bingo\))?|\)', { 
    param($m) # the match at hand
    if ($m.Value -eq ')') { # ')' -> decrease level
      ([ref] $level).Value--
      $m.Value
    }
    elseif ($m.Groups[1].Value) { # '(bingo)'
      if (([ref] $level).Value -eq 2) { # remove
        ''
      } else { # keep
        $m.Value
      }
    }
    else { # '(' -> increase level
      ([ref] $level).Value++
      $m.Value
    }
})

The above yields:

(
    (text 
        (bingo)
    text)
    (text)

    (text)
)

Note:

  • Only the exact string (bingo) is matched.
  • Only the matched string itself - if at the requested level - is removed (not the entire line).
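
If the text and the level need to vary, the same callback technique could be wrapped in a function along the following lines. This is a sketch under assumptions: the function name `Remove-TextAtLevel` and its parameters are made up for illustration, and a hashtable holds the counter in place of the `[ref]` cast used above.

```powershell
# Hypothetical wrapper around the stateful-callback approach.
function Remove-TextAtLevel {
    param(
        [string] $Text,
        [string] $Target = 'bingo',   # text inside the brackets
        [int]    $Level  = 2          # nesting level at which to remove it
    )
    # A hashtable is a reference type, so the callback can update it in place.
    $state   = @{ Level = 1 }         # level the next opening bracket would have
    $pattern = '\((' + [regex]::Escape($Target) + '\))?|\)'
    [regex]::Replace($Text, $pattern, {
        param($m)
        if ($m.Value -eq ')') {           # ')' -> decrease level
            $state.Level--
            $m.Value
        }
        elseif ($m.Groups[1].Success) {   # '(<target>)'
            if ($state.Level -eq $Level) { '' } else { $m.Value }
        }
        else {                            # '(' -> increase level
            $state.Level++
            $m.Value
        }
    })
}

Remove-TextAtLevel -Text '((text(bingo)text)(text)(bingo)(text))'
# -> ((text(bingo)text)(text)(text))
```

The input here is a single line without whitespace, which shows that the approach depends only on the brackets, not on indentation.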
mklement0
0

Your question implies this is a multi-line string, in which case I would recommend splitting it into an array of strings like so:

$StringArray = $StringVar -split '\r?\n'

After which point you can iterate through this array and handle each line one at a time, like so:

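# Assumes each nesting level is indented by 4 spaces, as in the sample.
$KeepStrings = @()   # initialize, so that += appends array elements
ForEach($String in $StringArray) {
    # A "(bingo)" indented by exactly 4 spaces sits at level 2.
    $Match = [RegEx]::Match($String, "^ {4}\(bingo\)")
    if(-not $Match.Success) {
        $KeepStrings += $String   # keep every other line
    }
}
$KeepStrings -join "`n"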

For composing and understanding regex strings, I would advise using https://regex101.com

0

Maybe an expression similar to:

^\s{4}\(bingo\).*\s

with an empty replacement string (and the multiline option enabled, so that ^ matches at the start of each line) might come close to working.
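
As a rough illustration, the expression could be applied from PowerShell as follows. The `(?m)` inline option is an addition so that `^` anchors at every line start, and the sample assumes the 4-spaces-per-level indentation shown in the question.

```powershell
# Sample input, joined with LF so the line endings are predictable.
$txt = @(
    '('
    '    (text'
    '        (bingo)'
    '    text)'
    '    (text)'
    '    (bingo)'
    '    (text)'
    ')'
) -join "`n"

# (?m) makes ^ match at the start of each line; the trailing \s consumes the line break.
$result = [regex]::Replace($txt, '(?m)^\s{4}\(bingo\).*\s', '')
$result
```

This removes the whole level-2 line and leaves the level-3 `(bingo)` alone, but only because the indentation encodes the nesting level; it cannot work on input without line breaks and indentation.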


If you wish to explore, simplify, or modify the expression, regex101.com explains each part of it in its top-right panel and shows how it matches against sample inputs.

Emma
  • I added the line breaks and spaces in the sample above only to make the problem clearer. In the original scenario it is all on one line, without any tabs or spaces. – Carsten Sep 04 '19 at 09:57