I am writing a Python script that reads equations as text strings and evaluates them. The equations are very long (several hundred characters) and have several levels of parentheses. In addition, there are nested conditionals. As an example:

str = "A*((B+7>=0)?((D*(E+(F)/sqrt((W)*(R)/C*((S)-2*3+((max(S,6)>Z?U+I:V+T)))))/R)):((sqrt(D)*(E+(F)/sqrt((9)*(R)/C*((S)-2*3+((S>Z?max(U,X)+I:V+T)))))/R)/sqrt(W)))"

I remove all whitespace first to make it easier to parse.
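A minimal sketch of that whitespace-stripping step, assuming the equation arrives as an ordinary Python string. The names `raw` and `compact` and the sample input are made up for illustration and are not part of the script below.

```python
# Hypothetical input containing spaces, a newline and a tab.
raw = "A * ( (B + 7 >= 0) ?\n\tD : E )"

# str.split() with no arguments splits on any run of whitespace,
# so joining the pieces back together removes all of it.
compact = "".join(raw.split())
print(compact)
```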

I brute-forced the following code:

str = "A*((B+7>=0)?((D*(E+(F)/sqrt((W)*(R)/C*((S)-2*3+((max(S,6)>Z?U+I:V+T)))))/R)):((sqrt(D)*(E+(F)/sqrt((9)*(R)/C*((S)-2*3+((S>Z?max(U,X)+I:V+T)))))/R)/sqrt(W)))"

operators = ['+','-','/','*','?',':']
functions = ['sqrt','max']

def ternAround(eqn):
    # this function replaces ?: ternary operators with python if else operators
    # eqn is an equation text string
    done = 0
    
    while not done:
        
        str = eqn.split("?",1)

        if len(str) > 1:
            # get term to left of ?
            A     = str[0]
            astop = len(A)
            i     = astop-1
            pct   = 0
            while i >= 0 and pct!=1:
                # find first unpaired "("
                pct = pct+1 if A[i]=="(" else pct-1 if A[i]==")" else pct
                i   = i-1
                
            A      = A[i+2:] if pct==1 else A
            astart = i+2 if pct==1 else 0
            #print(A,astart,astop)
    
            # get term between ? and :
            B      = str[1]
            bstart = 0
            pct    = 0
            i      = 0
            while i < len(B):
                # find first : outside of paired "()"
                if B[i] == ":" and pct==0:
                    bstop = i
                    B = B[bstart:bstop]
                    break
                pct = pct+1 if B[i]=="(" else pct-1 if B[i]==")" else pct
                i   = i+1        
            #print(B,bstart,bstop)
    
            # get term to right of :
            cstart = bstop + 1
            C      = str[1][cstart:]
            pct    = 0
            i      = 0
            while i < len(C) and pct!=-1:
                # find first unpaired ")"
                pct = pct+1 if C[i]=="(" else pct-1 if C[i]==")" else pct
                i   = i+1
            C      = C[:i-1] if pct==-1 else C
            cstop  = i-1 if pct==-1 else len(C)
            #print(C,cstart,cstop)
    
            bstart = bstart + astop + 1
            bstop  = bstart + bstop
            cstart = cstart + astop + 1
            cstop  = cstart + cstop
            
            eqn = ''.join([eqn[:astart],B," if ",A," else ",C,eqn[cstop:]])
            done = 0
        else:
            done = 1

    return(eqn)    

print(ternAround(str))

The result:

A*(((D*(E+(F)/sqrt((W)*(R)/C*((S)-2*3+((U+I if max(S,6)>Z else V+T)))))/R)) if (B+7>=0) else ((sqrt(D)*(E+(F)/sqrt((9)*(R)/C*((S)-2*3+((max(U,X)+I if S>Z else V+T)))))/R)/sqrt(W)))
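Once the string is in Python syntax, it can be evaluated. Below is a rough sketch of one way to do that with `eval`, assuming the variable values are held in a dictionary. The shortened expression, the values, and the names `expr`, `values` and `namespace` are all hypothetical. Note that emptying `__builtins__` only limits which names are visible; it is not a sandbox, so this is only reasonable for equations from a trusted source.

```python
import math

# Shortened, hypothetical expression already converted to Python syntax.
expr = "A*((U+I if max(S,6)>Z else V+T)/sqrt(W))"

# Assumed variable values, purely for illustration.
values = {"A": 2.0, "U": 1.0, "I": 3.0, "S": 4.0,
          "Z": 5.0, "V": 10.0, "T": 20.0, "W": 16.0}

# Expose only the functions the equations use; builtins are hidden.
namespace = {"__builtins__": {}, "sqrt": math.sqrt, "max": max}

# Globals supply the functions, locals supply the variables.
result = eval(expr, namespace, values)
print(result)
```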

I have a feeling there may be a much more elegant way to do this with regex. Someone here could probably do it in two or three lines of code.

R. Parker

1 Answer

Regular expressions aren't suited to nested expressions: a classical regex cannot count and match opening and closing parentheses, and Python's built-in `re` module has no recursive patterns.

See this question for more details: "Regular expression to match balanced parentheses"
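To illustrate the alternative, here is a rough sketch of a converter that tracks parenthesis depth with a counter and recurses into each group, instead of using a regex. The helper names `split_top_level` and `convert` are invented for this example. It assumes balanced parentheses and that `?`, `:` and `,` only appear as the ternary operator and as argument separators.

```python
def split_top_level(text, sep):
    """Split text at each sep that is not inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def convert(expr):
    """Rewrite C-style `c?a:b` as Python `(a if c else b)`."""
    # Function arguments: convert each comma-separated piece on its own.
    pieces = split_top_level(expr, ",")
    if len(pieces) > 1:
        return ",".join(convert(p) for p in pieces)

    depth = 0
    question = None  # index of the first "?" outside parentheses
    pending = 0      # later top-level "?" still waiting for their ":"
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and ch == "?":
            if question is None:
                question = i
            else:
                pending += 1
        elif depth == 0 and ch == ":" and question is not None:
            if pending:
                pending -= 1
            else:
                cond = expr[:question]
                then = expr[question + 1:i]
                other = expr[i + 1:]
                return f"({convert(then)} if {convert(cond)} else {convert(other)})"

    # No top-level ternary: recurse into each parenthesised group.
    out, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        if ch == "(":
            if depth == 0:
                out.append(expr[start:i + 1])
                start = i + 1
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                out.append(convert(expr[start:i]))
                start = i
    out.append(expr[start:])
    return "".join(out)


print(convert("x>0?a:b"))  # (a if x>0 else b)
```

Each ternary is wrapped in its own parentheses, so the output can contain redundant pairs, but the nesting and precedence are preserved.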

bensha