Possible Duplicate:
Regular Expression to match outer brackets

I have a string of the following format:

(((aaa (bbb) ccc)(ddd (eee) fff) ggg)(hhh (iii) )(jjj (kkk) lll) mmm)(nnn (ooo) ppp)(qqq (rrr) sss)

It basically has 3 main parts:

(((aaa (bbb) ccc)(ddd (eee) fff) ggg)(hhh (iii) )(jjj (kkk) lll) mmm)

(nnn (ooo) ppp)

(qqq (rrr) sss)

I need a search expression that returns the 3 parts as an array (ignoring any nested parentheses). Once that is done, I need another search expression to split the individual parts (only the 2nd and 3rd) into their tokens:

(nnn (ooo) ppp) => nnn,ooo,ppp

Thanks

victor_golf
  • [Grammars with matching parentheses aren't regular, and hence can't be parsed with regular expressions. Write (or use) a parser.](http://stackoverflow.com/questions/546433/regular-expression-to-match-outer-brackets) –  Apr 28 '12 at 07:36
  • Why create such a string in the first place? If you need an array, make an array; if it's your own code that produces the string, change it to output an array. Is it pseudocode or homework? – Lawrence Cherone Apr 28 '12 at 07:41
  • Thanks for the response. Writing a parser that checks every character for open/closed parentheses would be inefficient in terms of time and memory usage. The given format only illustrates the syntax; the actual text is much longer. Also, I have to parse at least 5k such strings. Is there any other way to do it? – victor_golf Apr 28 '12 at 07:43
  • I am not creating the string. This is a response from a server. – victor_golf Apr 28 '12 at 07:44
  • Looks like S-expressions. – jcubic Apr 28 '12 at 07:54
  • Then get the server to output the text in JSON or XML. Either reinvent the wheel correctly or not at all. –  Apr 28 '12 at 08:46

1 Answer

This is how I think I would do it:

<?php

$string = '(((aaa (bbb) ccc)(ddd (eee) fff) ggg)(hhh (iii) )(jjj (kkk) lll) mmm)(nnn (ooo) ppp)(qqq (rrr) sss)';

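// Split $input into chunks, cutting each time the nesting depth returns to zero.
function parse_string($input) {
    $len = strlen($input);
    $substrings = array();
    $paren_count = 0; // current nesting depth
    $cur_string = ''; // characters collected since the last cut
    for ($i = 0; $i < $len; $i++) {
        $char = $input[$i];
        if ($char == '(') {
            $paren_count += 1;
        } elseif ($char == ')') {
            $paren_count -= 1;
        }
        $cur_string .= $char;
        // Depth is zero: everything collected so far forms one top-level chunk.
        if ($paren_count == 0 && strlen($cur_string)) {
            $substrings[] = $cur_string;
            $cur_string = '';
        }
    }
    return $substrings;
}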

function convert_str($input) {
    $search = array('(', ')', ' ');
    $replace = array('', '', ',');
    return str_replace($search, $replace, $input);
}


$parsed_string = parse_string($string);
echo convert_str($parsed_string[1]);

OUTPUT:

nnn,ooo,ppp

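This is a simple kind of state machine: it tracks the current nesting depth and cuts off a substring each time the depth returns to zero.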
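Since the question asks for a search expression, here is a minimal sketch of a regex-based alternative. PHP's preg functions are built on PCRE, which supports recursive patterns via `(?R)`, so balanced parentheses can be matched even though they are not a regular language in the strict sense. The function names `split_top_level` and `group_to_csv` are made up for this illustration and are not part of the code above. Whether this is actually faster than the loop would need to be measured on the real data, and very deeply nested input may run into PCRE's backtracking or recursion limits.

```php
<?php
// Hypothetical helpers for illustration; names are assumptions.

// Return the top-level parenthesized groups of $input.
// (?R) re-enters the whole pattern, so nested groups are consumed
// as part of the enclosing match and never reported on their own.
function split_top_level($input) {
    preg_match_all('/\((?:[^()]++|(?R))*+\)/', $input, $matches);
    return $matches[0];
}

// Collect the runs of characters that are neither parentheses nor
// whitespace, and join them with commas.
function group_to_csv($group) {
    preg_match_all('/[^()\s]+/', $group, $tokens);
    return implode(',', $tokens[0]);
}

$string = '(((aaa (bbb) ccc)(ddd (eee) fff) ggg)(hhh (iii) )(jjj (kkk) lll) mmm)(nnn (ooo) ppp)(qqq (rrr) sss)';

$parts = split_top_level($string);
echo count($parts), "\n";            // 3
echo group_to_csv($parts[1]), "\n";  // nnn,ooo,ppp
echo group_to_csv($parts[2]), "\n";  // qqq,rrr,sss
```

Unlike `convert_str`, which turns every space into a comma, `group_to_csv` extracts tokens directly, so a group with a trailing space such as `(hhh (iii) )` yields `hhh,iii` with no trailing comma.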

sberry
  • Thanks, this works. But I am still wondering whether there is a faster way to do it, since the strings I need to parse are both long and numerous (>5k). – victor_golf Apr 28 '12 at 08:06
  • @goelvaibhav - Please keep in mind [the rules of Optimization Club](http://stackoverflow.com/a/177132/554546). If you don't want to use a homemade parser, then get the output in a more ubiquitous format for which optimized parsers exist (e.g. JSON or XML). –  Apr 28 '14:02