I would like to split a string into 2 groups, based on a regular expression. The string has basically the following structure:
some text (data1 | data2 | data3 | data4)
I've used a simple regular expression as follows:
re.match(r"^(?P<title>.*)\((?P<data>.*)\)$", s)
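It works fine as long as neither part of the string contains parentheses of its own, since those conflict with the regular expression.
But if one of the groups contains parentheses, the result is not what I expect. Below, process_string1 uses a non-greedy title group and process_string2 a greedy one: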
>>> import re
>>> def process_string1(s):
...     r = re.match(r"^(?P<title>.*?)\((?P<data>.*)\)$", s)
...     return r.groups()
...
>>> def process_string2(s):
...     r = re.match(r"^(?P<title>.*)\((?P<data>.*)\)$", s)
...     return r.groups()
...
>>> s = "this is an example (detail) (data1 | data2 | data3 | data4)"
>>> print process_string1(s)
('this is an example ', 'detail) (data1 | data2 | data3 | data4') # Wrong
>>> print process_string2(s)
('this is an example (detail) ', 'data1 | data2 | data3 | data4') # Good
>>> s = "this is another example (data1 (detail) | data2 | data3 | data4)"
>>> print process_string1(s)
('this is another example ', 'data1 (detail) | data2 | data3 | data4') # Good
>>> print process_string2(s)
('this is another example (data1 ', 'detail) | data2 | data3 | data4') # Wrong
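Neither variant handles both cases. How can I reliably capture the last parenthesized group as the data, even when the title or the data itself contains parentheses? Can you please help me?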
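For reference, here is a minimal sketch of one possible approach. It does not use a regular expression: it walks backwards from the final closing parenthesis and counts nesting depth until it finds the matching opening one. It assumes that the data group is always the last balanced parenthesized group and that the string ends with ")". The function name split_title_and_data is hypothetical, chosen only for illustration.

```python
def split_title_and_data(s):
    """Split 'title (data)' where data is the last balanced (...) group.

    Assumption: the string ends with ')' and its parentheses are balanced.
    """
    s = s.rstrip()
    if not s.endswith(")"):
        raise ValueError("string does not end with ')'")
    depth = 0
    # Scan right to left, tracking how many ')' are still unmatched.
    for i in range(len(s) - 1, -1, -1):
        if s[i] == ")":
            depth += 1
        elif s[i] == "(":
            depth -= 1
            if depth == 0:
                # s[i] is the '(' that matches the final ')'.
                return s[:i], s[i + 1:-1]
    raise ValueError("unbalanced parentheses")


s1 = "this is an example (detail) (data1 | data2 | data3 | data4)"
s2 = "this is another example (data1 (detail) | data2 | data3 | data4)"
title1, data1 = split_title_and_data(s1)
title2, data2 = split_title_and_data(s2)
```

With the two example strings, this gives the results marked "Good" above: the title keeps its trailing space, and the data can then be split further on " | " if needed.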