How can I parse these strings into tuples of prefix notation in Python?

Question

I need to convert these inputs into a format that is easier to work with, but I'm not sure of the best way to do it.

Here are some possible inputs (strings):

'[neg (p or q)]'
'[p imp q, (neg r) imp (neg q)]'
'[(p and q) and r]'

Here are the desired formats (lists of strings), respectively:

['neg(or(p,q))']
['imp(p,q)', 'imp(neg(r),neg(q))']
['and(and(p,q),r)']

Basically, these are propositional formulae that may be nested and I'm looking for a better way to format the input so I can more easily work with them later on in my code.

I've tried using regex, but I'm not very familiar with it.

Check [this question](https://stackoverflow.com/questions/11714582/good-infix-to-prefix-implementation-in-python-that-covers-more-operators-e-g) out. — Selcuk, Oct 15 '19 at 01:39

Ajax1234 · Accepted Answer · 2019-10-15T14:51:09.360

You can create a simple parser by first tokenizing your input with `re`, then recursively parsing the token stream:

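import re

class Token:
    def __init__(self, _t, val):
        self._type, self.val = _t, val
    def __repr__(self):
        return f'{self.__class__.__name__}({self._type}, {self.val})'

class Tokenize:
    # Token patterns in priority order: operators are tried before plain names.
    _t = [(r'neg|or|imp|and|iff', 'func'), (r'\(', 'oparen'), (r'\)', 'cparen'), (r'\w+', 'value')]
    # \b keeps an operator from matching inside a longer name such as "order".
    gram = r'\b(?:neg|or|imp|and|iff)\b|\(|\)|\w+'
    @classmethod
    def tokenize(cls, _input):
        # fullmatch classifies on the whole token, not on a substring of it.
        return [Token(next(b for a, b in cls._t if re.fullmatch(a, i)), i) for i in re.findall(cls.gram, _input)]

def parse(d, stop=None):
    # d is an iterator of tokens; stop is the token type that ends the current group.
    s = next(d, None)
    if s is None or s._type == stop:
        return ''
    if s._type == 'func':
        # Operator in leading position (e.g. neg): wrap whatever follows.
        return f'{s.val}({parse(d, stop=stop)})'
    if s._type == 'oparen':
        # Parenthesised group: parse up to the matching ')'.
        s = parse(d, stop='cparen')
    _n = next(d, None)
    if _n and _n._type == stop:
        return getattr(s, 'val', s)
    # Either the input is exhausted, or _n is an infix operator joining s to the rest.
    return getattr(s, 'val', s) if _n is None else f'{_n.val}({getattr(s, "val", s)}, {parse(d, stop=stop)})'

n = ['[neg (p or q)]', '[p imp q, (neg r) imp (neg q)]', '[(p and q) and r]', '[neg (p iff (neg q))]']
# Strip the surrounding brackets, split on commas, and parse each formula separately.
result = [[parse(iter(Tokenize.tokenize(i))) for i in a[1:-1].split(',')] for a in n]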

Output:

[['neg(or(p, q))'], ['imp(p, q)', 'imp(neg(r), neg(q))'], ['and(and(p, q), r)'], ['neg(iff(p, neg(q)))']]
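
If you would rather work with nested tuples (as the title suggests) and only render strings at the end, the same idea can be sketched as a small recursive-descent parser. This is a minimal sketch rather than a drop-in replacement: the names `BINARY`, `TOKEN_RE`, `to_tuple`, `render` and `convert` are made up for illustration, and it assumes that `neg` applies only to the next atom and that unparenthesised chains of binary operators group to the right. It prints without a space after each comma, matching the format in the question.

```python
import re

# Assumed operator set, taken from the examples in the question.
BINARY = {'and', 'or', 'imp', 'iff'}
TOKEN_RE = re.compile(r'\(|\)|\w+')  # parentheses and words; brackets/commas are skipped

def to_tuple(tokens):
    """Parse a token list into a nested tuple such as ('neg', ('or', 'p', 'q'))."""
    pos = 0

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == 'neg':
            # Assumption: neg binds to the next atom only.
            return ('neg', atom())
        if tok == '(':
            node = formula()
            pos += 1  # skip the matching ')'
            return node
        return tok  # a plain variable name

    def formula():
        nonlocal pos
        left = atom()
        if pos < len(tokens) and tokens[pos] in BINARY:
            op = tokens[pos]
            pos += 1
            # Assumption: chains without parentheses group to the right.
            return (op, left, formula())
        return left

    return formula()

def render(node):
    """Turn a nested tuple back into a prefix-notation string."""
    if isinstance(node, str):
        return node
    op, *args = node
    return f"{op}({','.join(render(a) for a in args)})"

def convert(text):
    """Convert '[f1, f2, ...]' into a list of prefix-notation strings."""
    parts = text.strip()[1:-1].split(',')
    return [render(to_tuple(TOKEN_RE.findall(part))) for part in parts]

print(convert('[p imp q, (neg r) imp (neg q)]'))
```

Keeping the tuple form around (for example `('imp', ('neg', 'r'), ('neg', 'q'))`) may be more convenient than strings if you need to inspect or transform the formulae later.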

This appears to fail for the input '[neg (p iff (neg q))]', which should become ‘[neg(iff(p,(neg(q))))]’ — Steve, Oct 15 '19 at 13:45
