I need a regular expression for a Java parser that matches the syntax of my own programming language, which looks like this:

Variable1={1,2,3}
Variable2=Variable1+{4,5,6}+{}*{2}
Variable3=(Variable2+{1})*Variable1
?Variable3 
?{1,2,3}
?Variable3+{1,2,3}

Assignments of expressions to variables contain "=", and evaluations start with a "?" sign. Inside parentheses you can define a new expression, and that expression can itself contain parentheses, so the definition is recursive. That is not possible when the pattern is built up like this:

String IdPattern = "[a-zA-Z][a-zA-Z0-9]*";            
String SePattern ="\\{"+"([0-9]*)(\\,[0-9]+)*"+"\\}";  


// Problem on the next line: CoPattern depends on ExPattern, which depends on
// TePattern, which depends on FaPattern, which depends on CoPattern again.
String CoPattern  = "\\(" + ExPattern + "\\)";

String FaPattern= "("+IdPattern+"|"+SePattern+"|"+CoPattern+")";              
String TePattern = FaPattern + "("+ "\\*"+ FaPattern+ ")*" ;   
String ExPattern= "" + TePattern + "(" + "\\+"+ TePattern+")*";  


String AsPattern =  "("+IdPattern+"="+ExPattern+")";  
String PriPattern = "(\\?"+ExPattern +")";                     
String StaPattern = "("+AsPattern+"|"+PriPattern+")";    
String Pro = StaPattern+"$";       
System.out.println("Input=((({20}+{1,2,3})))".matches(Pro));

The problem here is that CoPattern depends on ExPattern, which depends on TePattern, which depends on FaPattern, which in turn depends on CoPattern again. So how do I make this work?

user1095332
  • You're using the wrong tool for the job. You need a scanner and an expression parser. You can't do this with regular expressions. – user207421 Sep 12 '19 at 06:03

1 Answer

"Inside parentheses, you can define a new expression, but the new expression can contain parentheses again, so it's like a recursive regular assignment, which is not possible"

You figured it out yourself: it doesn't work.

Thus the simple answer is: regular expressions are an insufficient tool here. You should very much look into building a real parser instead.

Not only because of the hard conceptual limitations (classical regular expressions cannot describe arbitrarily deep nesting), but also because building a parser is about more than matching input. One key task of a compiler or parser is to give feedback on invalid input. A regular expression gives you a binary "matches" vs. "does not match" answer. But as a programmer, you want to be told "your input is invalid, and most likely, one problem is a missing bracket over here and an invalid identifier over there".
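
As a rough illustration of that point, here is a minimal recursive-descent sketch in which each grammar rule becomes one method, so the circular dependency between the rules is simply mutual recursion. The class name SetLangParser, the check method and the error wording are hypothetical. The grammar is inferred from the examples in the question (sets are either empty or a comma-separated list of numbers), which is an assumption and slightly stricter than the original SePattern.

```java
/** Hypothetical syntax checker for the question's language; one method per grammar rule. */
final class SetLangParser {
    /** Carries the input offset at which parsing failed. */
    static final class SyntaxError extends RuntimeException {
        final int position;
        SyntaxError(String message, int position) {
            super(message + " at position " + position);
            this.position = position;
        }
    }

    private final String src;
    private int pos = 0;

    private SetLangParser(String src) { this.src = src; }

    /** Returns "OK", or a message describing the first syntax error found. */
    static String check(String line) {
        try {
            new SetLangParser(line.trim()).statement();
            return "OK";
        } catch (SyntaxError e) {
            return e.getMessage();
        }
    }

    // statement := '?' expression | identifier '=' expression
    private void statement() {
        if (peek() == '?') {
            pos++;
        } else {
            identifier();
            expect('=');
        }
        expression();
        if (pos < src.length()) throw new SyntaxError("unexpected '" + peek() + "'", pos);
    }

    // expression := term ('+' term)*
    private void expression() {
        term();
        while (peek() == '+') { pos++; term(); }
    }

    // term := factor ('*' factor)*
    private void term() {
        factor();
        while (peek() == '*') { pos++; factor(); }
    }

    // factor := identifier | set | '(' expression ')'
    private void factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            expression();          // the recursive step that the string concatenation could not express
            expect(')');
        } else if (c == '{') {
            set();
        } else if (isLetter(c)) {
            identifier();
        } else {
            throw new SyntaxError("expected an identifier, a set or '('", pos);
        }
    }

    // set := '{' [ number (',' number)* ] '}'   (assumed form, see lead-in)
    private void set() {
        expect('{');
        if (peek() != '}') {
            number();
            while (peek() == ',') { pos++; number(); }
        }
        expect('}');
    }

    private void number() {
        int start = pos;
        while (isDigit(peek())) pos++;
        if (pos == start) throw new SyntaxError("expected a number", pos);
    }

    private void identifier() {
        if (!isLetter(peek())) throw new SyntaxError("expected an identifier", pos);
        pos++;
        while (isLetter(peek()) || isDigit(peek())) pos++;
    }

    private void expect(char c) {
        if (peek() != c) throw new SyntaxError("expected '" + c + "'", pos);
        pos++;
    }

    // Returns '\0' at end of input so callers never index out of bounds.
    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private static boolean isLetter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    public static void main(String[] args) {
        System.out.println(check("Input=((({20}+{1,2,3})))"));
        System.out.println(check("Variable3=(Variable2+{1}*Variable1"));
    }
}
```

Unlike String.matches, the second call in main does not just fail: it reports what was expected and where. The sketch only validates the input; building a syntax tree or evaluating the sets would mean having each method return a value instead of void.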

So even if you somehow get that approach to work for you, it will give you only a binary answer. And: a "proof of concept" isn't the same as having a reasonable, robust foundation to build on.

It is your project, your "new language". You should understand any part of the tooling around it. Coming from there, "I have seen that super complicated regex that supposedly solves my problem, can someone adapt that to my needs" ... is clearly not a good starting point.

Regular expressions are a very helpful and important tool, but they need to be used with care. My personal rule of thumb: when your regex is so complicated that you need other people to explain it to you, or even to write it down for you, then consider not using a regex. Because you are probably out of your league. And you will be the one who has to maintain that code.
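
Regular expressions still have a place at the token level, where nothing nests. The following is a small sketch of a scanner, assuming the same identifier and set syntax as in the question; the class name SetLangScanner, the tokenize method and the "KIND:text" token format are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical scanner: regexes describe the flat tokens, a parser handles the nesting. */
final class SetLangScanner {
    // One named group per token kind. A set literal contains no nested braces,
    // so a plain regex is enough to recognize it.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<id>[a-zA-Z][a-zA-Z0-9]*)"
        + "|(?<set>\\{(?:[0-9]+(?:,[0-9]+)*)?\\})"
        + "|(?<op>[=?+*()])");

    /** Splits the input into tokens of the form "KIND:text". */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {   // no token starts at this offset
                throw new IllegalArgumentException("unexpected character at position " + pos);
            }
            String kind = m.group("id") != null ? "ID"
                        : m.group("set") != null ? "SET"
                        : "OP";
            tokens.add(kind + ":" + m.group());
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("?(A+{1,2})*B"));
    }
}
```

Each regex here stays short enough to read at a glance, and the question of whether the parentheses are balanced is left to the parser that consumes the token list.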

GhostCat
  • The given answer does not show how to solve this problem in Java. This post suggests it could be possible using forward references: https://stackoverflow.com/questions/47162098/is-it-possible-to-match-nested-brackets-with-regex-without-using-recursion-or-ba/47162099#47162099 – user1095332 Sep 12 '19 at 05:40
  • @user1095332 I updated my answer accordingly. And this is one of the few occasions where I will not remove my answer, no matter how many more downvotes might be coming in. Because you are investing your time and energy in the wrong spot, and letting future readers know about that is worth it ... – GhostCat Sep 12 '19 at 06:28