I have a list of expressions (containing symbols) separated by hyphens:

"exp_1-exp_2-exp_3-exp_4-..........."

I can use the regex /([^-]*)-/ with the standard matcher.find() loop in Java to extract the expressions into

  • exp_1
  • exp_2
  • exp_3
  • exp_4

and so on.
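For reference, a minimal sketch of that plain extraction loop. The class and method names (`HyphenSplitSketch`, `extract`) are made up for illustration, and the pattern is swapped to `[^-]+`, an assumption on my part, because `([^-]*)-` only matches a token when a hyphen follows it and so would skip the last expression if the string has no trailing hyphen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class HyphenSplitSketch {
    // Collects every maximal run of non-hyphen characters.
    static List<String> extract(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = Pattern.compile("[^-]+").matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [exp_1, exp_2, exp_3, exp_4]
        System.out.println(extract("exp_1-exp_2-exp_3-exp_4"));
    }
}
```

This treats every hyphen as a delimiter, which is exactly the behaviour the exceptions are meant to override.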

However, I also want a list of exceptions that should each be matched as a whole. For example, "a_1-b_2" and "c_3-d_4" should not be split.

So, if the expression is

"exp_0-a_1-b_2-c_3-d_4-exp_5..."

the matcher should give me the list

  • exp_0
  • a_1-b_2
  • c_3-d_4
  • exp_5

How should I modify my regex? Or are there better alternatives?

Edit:

A typical example: exp can be \pi_1*b_3 or \sqrt{b_2/b_4}. I assume no minus signs (hyphens) are involved inside an expression. But I want to group certain terms, for example:

String exception = "\\sqrt{3}-\\sqrt{2}";

So for example, the list may be

"5a^3-\sqrt{3}-\sqrt{2}-\pi_1*b_3"

and I should get

  • 5a^3
  • \sqrt{3}-\sqrt{2}
  • \pi_1*b_3

(These are just expressions, NO mathematics involved, I know what I am trying to get.)

Ivan
  • So you want to have hyphens within expressions, where hyphens are also the expression delimiter? Sounds impossible to me. – Michael Yaworski May 21 '14 at 00:57
  • Can you give more details on "For example, I want to have `a_1-b_2` and `c_3-d_4` not to split."? Depending on criteria here one will use different regexps. – Ivan Nevostruev May 21 '14 at 01:00
  • Without any other indication as to how things should be grouped, this would be an impossible task even for a human. If you have parentheses or such, it might be possible. – awksp May 21 '14 at 01:01
  • The expressions themselves do not contain hyphens; the hyphens are what I am trying to split on. The expressions are mathematical expressions, and a_1 etc. are variable names. – Ivan May 21 '14 at 01:38
  • I have added an example in the question. – Ivan May 21 '14 at 01:43
  • Have a look at [Match (or replace) a pattern except in situations s1, s2, s3 etc](http://stackoverflow.com/questions/23589174/match-or-replace-a-pattern-except-in-situations-s1-s2-s3-etc/23589204#23589204) and split on `-` except in situation s1 (define s1: something like your `a_1-b_2`) There is a link to an article with sample java code that does exactly this. – zx81 May 21 '14 at 01:47

1 Answer

Alright, this particular solution is straight out of "Match (or replace) a pattern except in situations s1, s2, s3 etc".

Here's a simple regex that we will use to split on the correct dashes:

a_\\d-b_\\d|c_\\d-d_\\d|(-)

Each of the two left-hand OR branches (the alternatives separated by |) matches one of your exceptions. We will ignore these matches. The right-hand branch matches dashes and captures them to Group 1, and we know they are the right dashes because they were not already consumed by one of the exception branches on the left.
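To make the Group 1 test concrete, here is a small sketch that only labels each match instead of splitting. The class and method names (`MatchKindsSketch`, `classify`) and the `keep`/`split` labels are invented for this illustration; the regex is the one shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, used only to show which branch produced each match.
class MatchKindsSketch {
    static List<String> classify(String subject) {
        Pattern regex = Pattern.compile("a_\\d-b_\\d|c_\\d-d_\\d|(-)");
        Matcher m = regex.matcher(subject);
        List<String> report = new ArrayList<>();
        while (m.find()) {
            // Group 1 is set only when the right-hand (-) branch matched.
            String kind = (m.group(1) != null) ? "split" : "keep";
            report.add(kind + ":" + m.group());
        }
        return report;
    }

    public static void main(String[] args) {
        // prints [split:-, keep:a_1-b_2, split:-, keep:c_3-d_4, split:-]
        System.out.println(classify("exp_0-a_1-b_2-c_3-d_4-exp_5"));
    }
}
```

The dashes inside a_1-b_2 and c_3-d_4 never show up as `split` because the exception branches consume them first.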

We replace the good dashes with the marker SplitHere, then split the string on SplitHere.

This program shows how to use the regex (see the results at the bottom of the online demo). Just refine the exception regexes to suit your exact needs.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Program {
    public static void main(String[] args) {
        String subject = "exp_0-a_1-b_2-c_3-d_4-exp_5";
        // Exceptions go on the left; only real delimiter dashes reach Group 1.
        Pattern regex = Pattern.compile("a_\\d-b_\\d|c_\\d-d_\\d|(-)");
        Matcher m = regex.matcher(subject);
        StringBuffer b = new StringBuffer();
        while (m.find()) {
            if (m.group(1) != null) {
                // A delimiter dash: replace it with the split marker.
                m.appendReplacement(b, "SplitHere");
            } else {
                // An exception: copy it back unchanged. quoteReplacement keeps
                // any \ or $ in the match from being read as replacement syntax.
                m.appendReplacement(b, Matcher.quoteReplacement(m.group(0)));
            }
        }
        m.appendTail(b);
        String replaced = b.toString();
        String[] splits = replaced.split("SplitHere");
        for (String split : splits) System.out.println(split);
    } // end main
} // end Program

Output:

exp_0
a_1-b_2
c_3-d_4
exp_5
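For the literal exceptions from the question's edit (such as \sqrt{3}-\sqrt{2}), the same idea can be sketched without the SplitHere marker. This is a variation on the technique above, not the code from the referenced article: it assumes each exception is a fixed string, builds the left-hand branches with `Pattern.quote`, and cuts the input at each Group 1 dash. The names `ExceptionSplitSketch` and `split` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: assumes exceptions are non-empty literal strings.
class ExceptionSplitSketch {
    static List<String> split(String subject, List<String> exceptions) {
        // Build "\Qexc1\E|\Qexc2\E|...|(-)" so exceptions are tried first.
        StringBuilder alternation = new StringBuilder();
        for (String ex : exceptions) {
            alternation.append(Pattern.quote(ex)).append('|');
        }
        alternation.append("(-)");

        Matcher m = Pattern.compile(alternation.toString()).matcher(subject);
        List<String> parts = new ArrayList<>();
        int start = 0;
        while (m.find()) {
            if (m.group(1) != null) {
                // A delimiter dash: close the current token here.
                parts.add(subject.substring(start, m.start()));
                start = m.end();
            }
            // Otherwise an exception matched; its dash stays inside the token.
        }
        parts.add(subject.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        List<String> exceptions = Arrays.asList("\\sqrt{3}-\\sqrt{2}");
        // prints 5a^3, \sqrt{3}-\sqrt{2} and \pi_1*b_3 on separate lines
        for (String part : split("5a^3-\\sqrt{3}-\\sqrt{2}-\\pi_1*b_3", exceptions)) {
            System.out.println(part);
        }
    }
}
```

One caveat with this sketch: an exception is protected wherever its text occurs, even in the middle of a longer token, so the exception list may need anchoring or ordering tweaks for real data.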

Reference

  1. How to match pattern except in situations s1, s2, s3
zx81
  • I just finished reading your answer in the other thread, and when I come back I saw your answer here :P Thanks a bunch! – Ivan May 21 '14 at 02:03
  • You're very welcome, Ivan, glad it helps. With that tool in your hand, the fun part is to write the small "exception expressions" to go on the left. :) These small regexes should be easy to test in regex101 or regexbuddy. – zx81 May 21 '14 at 02:07