2
((\d{1,2})/(\d{1,2})/(\d{2,4}))

Is there a way to retrieve a list of all the capture groups from a Pattern object? I inspected the object in the debugger, and all it shows is how many groups there are (5).

I need to retrieve a list of the following capture groups.

Example of output:

0 ((\d{1,2})/(\d{1,2})/(\d{2,4}))
1 (\d{2})/(\d{2})/(\d{4})
2 \d{2}
3 \d{2}
4 \d{4}

Update:

I am not necessarily asking whether a regular expression for this exists, though that would be preferable. So far I have written a rudimentary parser (it does not check most out-of-bounds conditions) that only matches the innermost groups. I would like to know if there is a way to keep a reference to already-visited parentheses. Would I have to implement a tree structure?

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class App {

    public final char S = '(';
    public final char E = ')';
    public final char X = '\\';

    String errorMessage = "Malformed expression: ";

    /**
     * Actual Output:
     *    Groups: [(//), (\d{1,2}), (\d{1,2}), (\d{2,4})]
     * Expected Output:
     *    Groups: [\\b((\\d{1,2})/(\\d{1,2})/(\\d{2,4}))\\b, ((\\d{1,2})/(\\d{1,2})/(\\d{2,4})), (\d{1,2}), (\d{1,2}), (\d{2,4})]
     */

    public App() {
        String expression = "\\b((\\d{1,2})/(\\d{1,2})/(\\d{2,4}))\\b";
        String output = "";

        if (isValidExpression(expression)) {
            List<String> groups = findGroups(expression);
            output = "Groups: " + groups;
        } else {
            output = errorMessage;
        }

        System.out.println(output);
    }

    public List<String> findGroups(String expression) {
        List<String> groups = new ArrayList<>();
        int[] pos;
        int start;
        int end;
        String sub;
        boolean done = false;

        while (expression.length() > 0 && !done) {
            pos = scanString(expression);
            start = pos[0];
            end = pos[1];

            if (start == -1 || end == -1) {
                done = true;
                continue;
            }

            sub = expression.substring(start, end);
            expression = splice(expression, start, end);
            groups.add(0, sub);
        }

        return groups;
    }

    public int[] scanString(String str) {
        int[] range = new int[] { -1, -1 };
        int min = 0;
        int max = str.length() - 1;
        int start = min;
        int end = max;
        char curr;

        while (start <= max) {
            curr = str.charAt(start);
            if (curr == S) {
                range[0] = start;
            }
            start++;
        }

        end = range[0];

        while (end > -1 && end <= max) {
            curr = str.charAt(end);
            if (curr == E) {
                range[1] = end + 1;
                break;
            }

            end++;
        }

        return range;
    }

    public String splice(String str, int start, int end) {
        if (str == null || str.length() < 1)
            return "";

        if (start < 0 || end > str.length()) {
            System.err.println("Positions out of bounds.");
            return str;
        }

        if (start >= end) {
            System.err.println("Start must not exceed end.");
            return str;
        }

        String first = str.substring(0, start);
        String last = str.substring(end, str.length());

        return first + last;
    }

    public boolean isValidExpression(String expression) {
        try {
            Pattern.compile(expression);
        } catch (PatternSyntaxException e) {
            errorMessage += e.getMessage();
            return false;
        }

        return true;
    }

    public static void main(String[] args) {
        new App();
    }
}
nhahtdh
Mr. Polywhirl
  • You need a regex for your regex. – Sotirios Delimanolis Nov 05 '13 at 13:39
  • 1
    Explain better what you are trying to accomplish with this - ie what is that useful for. BTW - there are 3 groups there not 5 – Artur Nov 05 '13 at 13:48
  • 3
    Humm... I only see **4** groups in your pattern. What's **wrong** with me? – Paul Vargas Nov 05 '13 at 13:52
  • @Paul: 3 on the first line of the question and 4 on 5th ;-) – Artur Nov 05 '13 at 13:52
  • OK. I thought the group 0 is the entire string found. – Paul Vargas Nov 05 '13 at 13:55
  • 2
    @Paul: OK now I know what you mean. I count groups in his regex (equals number of opening brackets) and you count groups as number of results provided by Matcher ;-) – Artur Nov 05 '13 at 13:58
  • 2
    Who up-voted the question if no one can figure out what the OP actually wants !! – Ibrahim Najjar Nov 05 '13 at 14:00
  • I have found [this stackoverflow topic](http://stackoverflow.com/questions/4589643/identifying-capture-groups-in-a-regex-pattern) about this ... Duplicated question?!? – Paolo Nov 05 '13 at 14:20
  • 1
    @Paolo it is similar but there is no good answer. The accepted answer claims that subexpressions are not available. This is true since there are no built-in methods for it, but this question is about how to create such a method. – Pshemo Nov 05 '13 at 14:31
  • @Pshemo: “but this question is about how to create such method” In this case the question had to be treated as off-topic as the question shows no attempt of implementing such a method. – Holger Nov 05 '13 at 14:41
  • I want to write a regular expression inspector. I need to return a list of all the **capture groups**. In my example, there are 5 matches in total. I know this not only from experience, but because the Pattern reports 5 matching groups. I want the actual matches. Is there a library out there that can return the *actual* groups? – Mr. Polywhirl Nov 05 '13 at 14:42
  • "I want the actual matches. Is there a library out there that can return the actual groups?" => You are being very inconsistent in what you are asking for. First you say you want the matches, then you ask about getting the groups. Do you want: (1) the parts of the pattern that constitute each group; or (2) the parts of a matched string corresponding to each group in the pattern? – Mike Strobel Nov 05 '13 at 15:00
  • I need to deconstruct the regular expression into its matching groups as seen in the list above. I do not want to edit my question anymore. I think it was a big mistake. I just read that I did not have enough detail in my question. Sorry. – Mr. Polywhirl Nov 05 '13 at 15:04
  • No problem. Unfortunately, I do not know of any core Java APIs that allow you to inspect the matching groups in a pattern. – Mike Strobel Nov 05 '13 at 15:17
  • So, I updated my question and included a rudimentary parser. – Mr. Polywhirl Nov 06 '13 at 01:05

1 Answer

2

Here is my solution. I simply wrote a regex that matches the regex itself, as @SotiriosDelimanolis suggested in the comments.

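// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void printGroups() {
    // A regex that matches the *source text* of the original regex: every
    // metacharacter of the original is escaped so that it is matched literally,
    // and each part to extract is wrapped in its own capture group.
    String sp = "((\\(\\\\d\\{1,2\\}\\))\\/(\\(\\\\d\\{1,2\\}\\))\\/(\\(\\\\d\\{2,4\\}\\)))";
    Pattern p = Pattern.compile(sp);
    Matcher m = p.matcher("(\\d{1,2})/(\\d{1,2})/(\\d{2,4})");
    // group(i) is only valid after a successful match, so keep this check.
    if (m.matches()) {
        for (int i = 0; i <= m.groupCount(); i++) {
            System.out.println(m.group(i));
        }
    }
}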

Note that you cannot remove the if-statement: the group method only works after a successful match attempt, so matches (or find, or lookingAt) has to be called and return true first; otherwise group throws an IllegalStateException (I didn't know that!). See this link as a reference about it.
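To make the point about the if-statement concrete, here is a minimal sketch. The class name `GroupBeforeMatch`, the helper `groupThrowsBeforeMatch`, and the sample pattern are made up for illustration; only `Pattern`, `Matcher`, and `IllegalStateException` come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupBeforeMatch {

    // Hypothetical helper: reports whether group() fails on a fresh Matcher,
    // i.e. one on which no match operation has been attempted yet.
    static boolean groupThrowsBeforeMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        try {
            m.group();          // no matches()/find()/lookingAt() call yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\d+)/(\\d+)");   // illustrative pattern only
        System.out.println(groupThrowsBeforeMatch(p, "11/05"));

        Matcher m = p.matcher("11/05");
        if (m.matches()) {                              // group(i) is safe only inside this check
            System.out.println(m.group(1) + " " + m.group(2));
        }
    }
}
```

The first line printed should be `true`, since the fresh Matcher has not attempted a match; the second call goes through `matches()` first and can read the groups safely.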

Hope this is what you were asking for ...

Paolo
  • I have a feeling that the pattern used in the question was just an example. You should try to figure out a way to handle the more general case, for example `(group)\\(no group\\)and another [(]no group[)]`. – Pshemo Nov 05 '13 at 17:46
  • Yes, I was wondering if there is a more general case. Thanks for your input though. I am thinking that I will have to parse the string by group. So I would grab the substrings of `pattern.substring(pattern.indexOf('('), pattern.indexOf(')'))` In this question, I am not necessarily seeking a regular expression to solve my problem, but a method of obtaining all the matching groups. I will try to write my method and either update this question or create a new one. – Mr. Polywhirl Nov 05 '13 at 19:46
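
For the more general case raised in these comments, one possible approach (not something proposed in the thread) is a single left-to-right scan that keeps the open parentheses on a stack, so that nested groups are closed in the right order without a full tree structure. The sketch below is only that: the names `GroupExtractor`, `findGroups`, and `isCapturing` are hypothetical, it assumes the expression has already passed `Pattern.compile`, and it deliberately ignores `\Q...\E` quoting, nested character classes, and comments mode.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class GroupExtractor {

    // Returns the whole expression at index 0, followed by the source text of
    // each capturing group, numbered by the position of its opening parenthesis.
    static List<String> findGroups(String regex) {
        List<String> groups = new ArrayList<>();
        groups.add(regex);                        // group 0: the entire expression
        Deque<int[]> open = new ArrayDeque<>();   // {start offset, slot in groups or -1}
        boolean inClass = false;                  // inside [...] parentheses are literal

        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++;                              // skip the escaped character
            } else if (inClass) {
                if (c == ']') inClass = false;
            } else if (c == '[') {
                inClass = true;
            } else if (c == '(') {
                int slot = -1;
                if (isCapturing(regex, i)) {
                    slot = groups.size();         // reserve the group number now
                    groups.add(null);
                }
                open.push(new int[] { i, slot });
            } else if (c == ')' && !open.isEmpty()) {
                int[] g = open.pop();             // innermost unclosed parenthesis
                if (g[1] >= 0) {
                    groups.set(g[1], regex.substring(g[0], i + 1));
                }
            }
        }
        return groups;
    }

    // "(?...)" constructs do not capture, except named groups "(?<name>...)";
    // "(?<=" and "(?<!" are lookbehinds and do not capture either.
    private static boolean isCapturing(String regex, int i) {
        if (!regex.startsWith("?", i + 1)) return true;
        return regex.startsWith("?<", i + 1)
                && !regex.startsWith("?<=", i + 1)
                && !regex.startsWith("?<!", i + 1);
    }

    public static void main(String[] args) {
        List<String> groups = findGroups("\\b((\\d{1,2})/(\\d{1,2})/(\\d{2,4}))\\b");
        for (int i = 0; i < groups.size(); i++) {
            System.out.println(i + " " + groups.get(i));
        }
    }
}
```

Reserving the slot when the parenthesis opens, rather than when it closes, is what keeps the numbering the same as `Matcher.group(int)`: an outer group gets a lower index than the groups nested inside it, even though it is closed last.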