
Suppose we have a string s:

s = "((abc)((123))())blabla"

We know that s begins with "(" and we want to find its matching ")", the one just before "blabla". How can this be done in Python?

Is there a simple, intuitive way to do this without writing a state machine? Or is there a library that can do it?

zchenah
  • http://stackoverflow.com/questions/524548/regular-expression-to-detect-semi-colon-terminated-c-for-while-loops/524624#524624 This might help – Amit Nov 06 '12 at 18:10
  • Nope, I fear that you cannot escape regular expressions with this kind of problem – EnricoGiampieri Nov 06 '12 at 18:10
  • @AmitMizrahi Thank you very much, that is a good way. So the question should be changed to: is there any library that does this kind of thing? – zchenah Nov 06 '12 at 18:15
  • I've never used it, but you could probably do something like this with `pyparsing`. – mgilson Nov 06 '12 at 18:47

2 Answers


You can do this in code as follows:

from collections import defaultdict

opens = defaultdict(int)

open_close_pair = []

s = '((abc)((123))())blabla'
openc, closec = '(', ')'

for i, ch in enumerate(s):
    if ch == openc:
        # +1 on every bracket that is still open
        for key in opens:
            opens[key] += 1
        # new entry for this bracket; defaultdict starts it at 0, so it becomes 1
        opens[i] += 1

    elif ch == closec:
        # -1 on every bracket that is still open
        for key in opens:
            opens[key] -= 1

    # iterate over a copy, because entries are deleted inside the loop
    for key, val in list(opens.items()):
        if val == 0:
            # counter back at 0: record the (open, close) position pair
            open_close_pair.append((key, i))
            # the bracket is closed, so it can be removed from the counter
            del opens[key]

for x in open_close_pair:
    print(" %s %s " % (s[x[0]], s[x[1]]))
print(open_close_pair)
# empty once every bracket has been matched
print(dict(opens))

The output is:

 ( ) 
 ( ) 
 ( ) 
 ( ) 
 ( ) 
[(1, 5), (7, 11), (6, 12), (13, 14), (0, 15)]
{}

The algorithm is:

  • Keep a dict, opens, that maps the position of each still-open bracket to a counter.
  • When you find an open bracket, add 1 to every existing entry, then add a new entry for the current position.
  • When you find a closing bracket, subtract 1 from every existing entry.
  • After each character, scan opens: any entry whose counter is 0 has just been closed, so it forms a pair with the current position.
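The same pairs can also be found with a stack, which is a different technique from the dict of counters described above: it avoids touching every open entry on each bracket. This is only a minimal sketch; the function name match_brackets and the choice to raise ValueError on unbalanced input are assumptions, not part of the code above.

```python
def match_brackets(s, openc='(', closec=')'):
    """Return (open_index, close_index) pairs in the order the brackets close."""
    stack = []  # positions of open brackets that are not closed yet
    pairs = []
    for i, ch in enumerate(s):
        if ch == openc:
            stack.append(i)
        elif ch == closec:
            if not stack:
                # assumption: unbalanced input is treated as an error
                raise ValueError("unmatched %r at index %d" % (closec, i))
            # the most recently opened bracket is the one being closed
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unmatched %r at index %d" % (openc, stack[-1]))
    return pairs


pairs = match_brackets('((abc)((123))())blabla')
print(pairs)  # [(1, 5), (7, 11), (6, 12), (13, 14), (0, 15)]
```

The pair whose first element is 0 gives the closing position for the bracket at the start of the string.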
andrefsp

You could try regex or pyparsing, but a simple option with linear time complexity is the following:

>>> s = "((abc)((123))())blabla"
>>> count = 0
>>> for i,e in enumerate(s):
    if e == '(':
        count += 1
    elif e == ')':
        count -= 1
    if not count:
        break


>>> s[:i + 1]
'((abc)((123))())'
>>> 
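The counting loop above could be wrapped in a reusable function, roughly as follows. This is a sketch under stated assumptions: the name find_matching_close, the start parameter, and raising ValueError when no match exists are all hypothetical additions, not part of the session above.

```python
def find_matching_close(s, start=0, openc='(', closec=')'):
    """Return the index of the bracket that closes the one at s[start]."""
    if s[start:start + 1] != openc:
        # assumption: the caller must point at an open bracket
        raise ValueError("no %r at index %d" % (openc, start))
    depth = 0
    for i in range(start, len(s)):
        if s[i] == openc:
            depth += 1
        elif s[i] == closec:
            depth -= 1
            if depth == 0:
                # back at depth 0: this closes the bracket at start
                return i
    raise ValueError("no matching %r for index %d" % (closec, start))


s = "((abc)((123))())blabla"
end = find_matching_close(s)
print(end)          # 15
print(s[:end + 1])  # ((abc)((123))())
```

Passing a different start, such as find_matching_close(s, 6), finds the match for an inner bracket in the same way.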
Abhijit