I want a regular expression that matches when exp-b occurs the same number of times as exp-a, just like "(" and ")". For example, with "x" and "y": "abxefyg" matches, "abxefyyg" does not, and "abxxefyyg" does.

How should I code it in Java?

Any advice will be appreciated.
Thanks.

youngzy
  • Can you better explain what you are trying to do here? Why does `abxxefyyg` match, while `abxefyyg` does not? – Tim Biegeleisen Oct 17 '19 at 01:40
  • Regex is not the best tool for the kind of problem you have described. Although the title of your question is possibly a duplicate of https://stackoverflow.com/questions/41878948/is-it-possible-to-define-a-pattern-and-reuse-it-to-capture-multiple-groups. – jrook Oct 17 '19 at 01:40
  • @Tim I believe he wants a regex that checks if both x and y occur the same number of times in a single string. So `abxxefyyg` matches because x and y both occur twice. `abxefyyg` does not match because x occurs once and y occurs twice. – Anil M Oct 17 '19 at 01:44
  • Regexes are not a good solution here. See https://stackoverflow.com/questions/546433/regular-expression-to-match-balanced-parentheses – Damiox Oct 17 '19 at 01:46
  • Using a simple for loop to count the number of occurrences of each character and then checking the condition is the best way to solve such problems, not regex, IMO – Code Maniac Oct 17 '19 at 01:51
  • Does `abexxeyfyg` match? – smac89 Oct 17 '19 at 02:11
  • *"Just like `(` and `)`"* With parentheses, the order matters too, e.g. `()`, `(())`, and `()()` are all good, while `)(`, `))((`, and `)()(` are all bad. Is that the case too for `xy` pairs? If so, then regex is *really* not the right tool for the job. – Andreas Oct 17 '19 at 02:19
  • @smac89 `abexxeyfyg` does not match. The `y`s should be adjacent, like `xx`. – youngzy Oct 22 '19 at 14:31
  • @TimBiegeleisen The number of occurrences must be the same. In `abxefyyg`, `x` occurs once and `y` twice. – youngzy Oct 22 '19 at 14:37
  • @jrook Almost, but that does not suit Java. Thanks. – youngzy Oct 22 '19 at 15:58
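
The loop-and-count approach suggested in the comments above could be sketched roughly as follows. The class name CountCheck and the method sameCount are hypothetical names chosen for illustration, and the sketch only compares totals: it ignores order and whether the characters are adjacent.

```java
class CountCheck {
    // Hypothetical helper: true when a and b occur equally often in text.
    // Only the totals are compared; position and adjacency are ignored.
    static boolean sameCount(String text, char a, char b) {
        int diff = 0; // (occurrences of a) minus (occurrences of b)
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == a) {
                diff++;
            } else if (c == b) {
                diff--;
            }
        }
        return diff == 0;
    }

    public static void main(String[] args) {
        System.out.println(sameCount("abxefyg", 'x', 'y'));
        System.out.println(sameCount("abxefyyg", 'x', 'y'));
        System.out.println(sameCount("abxxefyyg", 'x', 'y'));
    }
}
```

Because only totals are compared, `abexxeyfyg` also returns true here, which the asker said should not match, so this covers the looser reading of the question only.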

3 Answers

I believe this is what you want.

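  1. The capture group (.) captures a single character.
  2. The \\1 is a backreference to whatever the capture group matched (\1 in the regex itself, written \\1 inside a Java string literal).

So this removes every pair of adjacent identical characters by replacing it with an empty string.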

      String str = "AAabbBCCeeFF--#";
      str = str.replaceAll("(.)\\1", "");
      System.out.println(str);

This prints

aB#

WJS
  • It's great code, but not what I want. Thanks. Could I get the number of times the group occurs? For example, in `aaBBB` with the capture group `(.)`, the count for `a` is 2 and for `B` is 3. – youngzy Oct 22 '19 at 14:48
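
One way to get the length of each run, as asked in the comment above, might be to match a whole run with (.)\1* and read the length of the match. This is a minimal sketch: RunLengths and runs are hypothetical names, and it assumes the input contains no line terminators, since . does not match them by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunLengths {
    // One character followed by zero or more repeats of that same character.
    private static final Pattern RUN = Pattern.compile("(.)\\1*");

    // Hypothetical helper: returns "char=length" for each run, in order.
    static List<String> runs(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = RUN.matcher(text);
        while (m.find()) {
            // group(1) is the repeated character, group() is the whole run.
            result.add(m.group(1) + "=" + m.group().length());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runs("aaBBB")); // prints [a=2, B=3]
    }
}
```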

Just like "(" and ")"

Parentheses must be balanced, i.e. for every ( there must be a following ), and you cannot have a ) without a matching preceding (.

Regex cannot do this. Just write a simple loop, and keep track of the nesting depth, e.g.

public static boolean isBalanced(String text, char startChar, char endChar) {
    int depth = 0;
    for (int i = 0; i < text.length(); i++) {
        char c = text.charAt(i);
        if (c == startChar)
            depth++;
        else if (c == endChar && --depth < 0)
            return false; // endChar without matching startChar
    }
    return (depth == 0); // check startChar without matching endChar
}

Test 1

System.out.println(isBalanced("abxefyg", 'x', 'y'));
System.out.println(isBalanced("abxefyyg", 'x', 'y'));
System.out.println(isBalanced("abxxefyyg", 'x', 'y'));

Test 2

System.out.println(isBalanced("ab(ef)g", '(', ')'));
System.out.println(isBalanced("ab(ef))g", '(', ')'));
System.out.println(isBalanced("ab((ef))g", '(', ')'));

Output from both

true
false
true
Andreas

This answer assumes that you want to check whether a string contains a run of each of two different characters, and that the two runs have the same length. Using your sample input, we can first do a regex replacement to remove every character other than x and y. Then we split on (?<=(.))(?!\1) to generate a string array with two entries, one for the x sequence and one for the y sequence. Finally, we check that these two strings have the same length.

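// Requires: import java.util.Arrays;
String input = "abxxefyyg";
// Keep only the x and y characters.
input = input.replaceAll("[^xy]+", "");
// Split between any two adjacent characters that differ.
String[] parts = input.split("(?<=(.))(?!\\1)");
System.out.println(Arrays.toString(parts));
// Require exactly two runs so that reading parts[1] cannot go out of bounds.
if (parts.length == 2 && parts[0].length() == parts[1].length()) {
    System.out.println("MATCH");
}
else {
    System.out.println("NO MATCH");
}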

This prints:

[xx, yy]
MATCH

Here is an explanation of how the regex works:

(?<=(.))   look behind and capture a single character
(?!\1)     look ahead and assert that what follows is NOT the same character

So, if we split on (?<=(.))(?!\1), we are splitting between any two adjacent characters that are not the same. After doing the regex replacement on the input abxxefyyg, we are left with xxyy. Splitting with the above pattern generates an array with two terms, xx and yy.
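
As an alternative sketch of the same idea, the two runs could be captured directly with Pattern and Matcher instead of replacing and splitting. The names SameRunLength and matches are hypothetical, and the pattern assumes exactly one run of x followed by exactly one run of y, which is an interpretation of the asker's comments rather than something stated in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SameRunLength {
    // One run of x, then one run of y, with any non-x/y characters around them.
    private static final Pattern SHAPE =
            Pattern.compile("[^xy]*(x+)[^xy]*(y+)[^xy]*");

    // Hypothetical helper: true when the x run and the y run have equal length.
    static boolean matches(String text) {
        Matcher m = SHAPE.matcher(text);
        // matches() requires the whole string to fit the shape, so a second
        // run of x or y anywhere in the string makes it fail.
        return m.matches() && m.group(1).length() == m.group(2).length();
    }

    public static void main(String[] args) {
        System.out.println(matches("abxefyg"));
        System.out.println(matches("abxefyyg"));
        System.out.println(matches("abxxefyyg"));
        System.out.println(matches("abexxeyfyg"));
    }
}
```

Unlike the split version, this rejects inputs where the y characters are not adjacent (such as abexxeyfyg) and inputs where y comes before x.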

Tim Biegeleisen