4

Is there a way in Java (perhaps with an additional open-source library) to identify the capture groups in a java.util.regex.Pattern (i.e. before creating a Matcher)?

Example from the Java docs:

Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:

1         ((A)(B(C)))
2         (A)
3         (B(C))
4         (C)

In principle it should be possible to identify these from the (compiled) Pattern.

UPDATE: From @Leniel and elsewhere it seems that this facility ("named groups") will be present in Java 7 in mid 2011. If I can't wait for that I can use jregex, although I'm not quite sure what the API is.

peter.murray.rust

2 Answers

7

You can find out the number of groups by creating a dummy Matcher, like so:

Pattern p = Pattern.compile("((A)(B(C)))");
System.out.println(p.matcher("").groupCount());

If you want the actual subexpressions (((A)(B(C))), (A), etc.), then no, that information is not available.

Alan Moore
  • In your example there will be no match. Does it still report the match count as if it had been matched? And are the match groups null? – peter.murray.rust Jan 04 '11 at 08:01
  • 2
    `groupCount()` just tells how many capturing groups there are in the regex, so you'll know (for example) the highest group number to use if you want to iterate through the captures once a match has been found. It has nothing to do with the number of **matches**. If you want to know how many times the regex will match a given string, you just have to call `find()` repeatedly until it returns `false`. – Alan Moore Jan 04 '11 at 10:48
  • Thanks - this is useful for me. – peter.murray.rust Jan 04 '11 at 11:58
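
To make the distinction in the comments above concrete, here is a minimal sketch. The class and method names (`GroupCountDemo`, `groupCount`, `matchCount`) are made up for illustration; only `Pattern` and `Matcher` come from the JDK. It shows that `groupCount()` depends only on the regex, while the number of matches depends on the input and is found by calling `find()` until it returns `false`. Note that calling `group(i)` before a successful match does not return null; it throws an `IllegalStateException`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // Number of capturing groups in the regex; the input string is irrelevant.
    static int groupCount(Pattern p) {
        return p.matcher("").groupCount();
    }

    // Number of matches in the input: call find() until it returns false.
    static int matchCount(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("((A)(B(C)))");
        System.out.println(groupCount(p));            // 4
        System.out.println(matchCount(p, ""));        // 0
        System.out.println(matchCount(p, "ABC ABC")); // 2

        // Once a match has been found, iterate over the captures.
        Matcher m = p.matcher("ABC");
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.println(i + ": " + m.group(i));
            }
        }
    }
}
```

For the input "ABC" the loop prints the four captures ABC, A, BC and C, in the same order as the numbering quoted in the question.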
2

Yes. Check this:

Regex Named Groups in Java
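
For readers on Java 7 or later, a minimal sketch of what named groups look like with the built-in `java.util.regex` API. The pattern, the group names `year` and `month`, and the helper `extractYear` are invented for this example and are not taken from the linked post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Hypothetical pattern: two named capturing groups, (?<name>...) syntax.
    static final Pattern DATE = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");

    // Returns the text captured by the "year" group, or null if nothing matches.
    static String extractYear(String input) {
        Matcher m = DATE.matcher(input);
        return m.find() ? m.group("year") : null;
    }

    public static void main(String[] args) {
        System.out.println(extractYear("released 2011-07"));  // 2011
        // Named groups are still numbered, so they are included in groupCount().
        System.out.println(DATE.matcher("").groupCount());    // 2
    }
}
```

Named groups let you read a capture by name after a match; they do not by themselves expose the subexpression text of each group.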

Leniel Maccaferri