Similar to this question, I am trying to match a string with two optional values, both of which are surrounded by parentheses and are of unknown length. Examples of the string in question are:

4
4(2(3))
4(2(3)(1))
4(2(3)(1))(6(5))
1(2(3(4(5(6(7(8)))))))
-4(2(3)(1))(6(5))

There is always a leading number that may be negative, followed by at most two items in parentheses. I would like to capture them in groups, so for the last example, the groups would be:

group 1: -4 
group 2: 2(3)(1) 
group 3: 6(5) 

Adding a second optional group to the regex makes it break down:

(-?\d+)(?:\((.*)?\))?(?:\((.*)?\))?
    ^        ^             ^
   group 1  group 2     group 3
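
To make the failure concrete, here is a minimal check of the pattern above. Python is an assumption here, since no language is named in the question. It suggests that the first greedy `.*` runs to the last closing parenthesis, so nothing is left for the third group:

```python
import re

# Pattern from the question, unchanged.
pattern = re.compile(r"(-?\d+)(?:\((.*)?\))?(?:\((.*)?\))?")

m = pattern.match("-4(2(3)(1))(6(5))")

# The first optional group swallows both parenthesized items.
print(m.groups())  # ('-4', '2(3)(1))(6(5)', None)
```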
DL C

2 Answers

Use

^(-?\d+)(?:\((\d+(?:\(\d+\))*)?\))?(?:\((\d+(?:\(\d+\))*)?\))?$

See proof.

Explanation

--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    -?                       '-' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \(                       '('
--------------------------------------------------------------------------------
    (                        group and capture to \2 (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
--------------------------------------------------------------------------------
        \(                       '('
--------------------------------------------------------------------------------
        \d+                      digits (0-9) (1 or more times
                                 (matching the most amount possible))
--------------------------------------------------------------------------------
        \)                       ')'
--------------------------------------------------------------------------------
      )*                       end of grouping
--------------------------------------------------------------------------------
    )?                       end of \2 (optional: \2 holds this
                             substring when it is present and is
                             unset otherwise)
--------------------------------------------------------------------------------
    \)                       ')'
--------------------------------------------------------------------------------
  )?                       end of grouping
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \(                       '('
--------------------------------------------------------------------------------
    (                        group and capture to \3 (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
--------------------------------------------------------------------------------
        \(                       '('
--------------------------------------------------------------------------------
        \d+                      digits (0-9) (1 or more times
                                 (matching the most amount possible))
--------------------------------------------------------------------------------
        \)                       ')'
--------------------------------------------------------------------------------
      )*                       end of grouping
--------------------------------------------------------------------------------
    )?                       end of \3 (optional: \3 holds this
                             substring when it is present and is
                             unset otherwise)
--------------------------------------------------------------------------------
    \)                       ')'
--------------------------------------------------------------------------------
  )?                       end of grouping
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
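
As a rough check of this pattern against the examples from the question, the sketch below runs it in Python (an assumed language; `groups_or_none` is a hypothetical helper name). It handles one level of nesting inside each item, and returns no match for the deeply nested example:

```python
import re

# Pattern from this answer, unchanged.
PATTERN = re.compile(
    r"^(-?\d+)(?:\((\d+(?:\(\d+\))*)?\))?(?:\((\d+(?:\(\d+\))*)?\))?$"
)

def groups_or_none(text):
    """Return the three captured groups, or None when the pattern does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

print(groups_or_none("4"))                       # ('4', None, None)
print(groups_or_none("4(2(3))"))                 # ('4', '2(3)', None)
print(groups_or_none("-4(2(3)(1))(6(5))"))       # ('-4', '2(3)(1)', '6(5)')
print(groups_or_none("1(2(3(4(5(6(7(8)))))))"))  # None: nesting is too deep
```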
Ryszard Czech
  • The optional substrings may be nested to an unknown extent. For example: `1(2(3(4(5(6(7(8)))))))` should return [1, 2(3(4(5(6(7(8)))))), ] for the three groups – DL C Nov 25 '20 at 23:47
  • @DLC You must share more details. What is your programming language? – Ryszard Czech Nov 26 '20 at 21:21
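
For nesting of unknown depth, as raised in the comment above, one possible approach is to skip the regex for the parenthesized parts and count parenthesis depth instead. This is only a sketch, assuming Python; `split_groups` is a hypothetical name, and it is not the method of either answer here:

```python
import re

def split_groups(text):
    """Split a string like '-4(2(3)(1))(6(5))' into [number, item, item].

    Missing items are returned as None. Raises ValueError on malformed input.
    """
    head = re.match(r"-?\d+", text)
    if head is None:
        raise ValueError("expected a leading integer")
    groups = [head.group()]

    depth = 0      # current parenthesis nesting level
    start = None   # index just after the opening '(' of a top-level item
    for i in range(head.end(), len(text)):
        ch = text[i]
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
            if depth == 0:
                groups.append(text[start:i])
        elif depth == 0:
            raise ValueError("unexpected character outside parentheses")

    if depth != 0:
        raise ValueError("unbalanced '('")
    if len(groups) > 3:
        raise ValueError("more than two parenthesized items")

    # Pad so the result always has three entries.
    return groups + [None] * (3 - len(groups))

print(split_groups("1(2(3(4(5(6(7(8)))))))"))  # ['1', '2(3(4(5(6(7(8))))))', None]
print(split_groups("-4(2(3)(1))(6(5))"))       # ['-4', '2(3)(1)', '6(5)']
```

The function only splits at the top level; each returned item could be passed to it again to descend one level further.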

You can try the regex demo to capture the three groups:

Expression: (-?\d+)(\(\d*\(\d*\)\(\d*\)\)|(?:\(\d*\(\d*\)\)))?(\(\d*\(\d*\)\))?

Detail: the last two groups are optional, and the second group is an alternation.
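
A quick way to try this expression, sketched in Python (an assumed language, using `re.fullmatch` to require that the whole string matches). Note that groups 2 and 3 appear to include their surrounding parentheses, which differs from the output requested in the question:

```python
import re

# Expression from this answer, unchanged.
pattern = re.compile(
    r"(-?\d+)(\(\d*\(\d*\)\(\d*\)\)|(?:\(\d*\(\d*\)\)))?(\(\d*\(\d*\)\))?"
)

full = pattern.fullmatch("-4(2(3)(1))(6(5))")
print(full.groups())   # ('-4', '(2(3)(1))', '(6(5))')

short = pattern.fullmatch("4(2(3))")
print(short.groups())  # ('4', '(2(3))', None)
```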

Heo