
In a string of odd length, how could you match (or capture) the middle character?

Is this possible with PCRE, plain Perl or Java regex flavors?

With .NET regex you could use balancing groups to solve it easily (that could be a good example). By plain Perl regex I mean one that avoids code constructs such as `(??{ ... })`, which can run arbitrary code and can therefore do anything.

The string could be of any odd number length.

For example, in the string 12345 you would want to get the 3, the character at the center of the string.

This is a question about the possibilities of modern regex flavors and not about the best algorithm to do that in some other way.

Qtax
  • Perl: Yes. Using recursion or `/^(.*)(.)(??{ '.' x length($1) })\z/s` – ikegami Jan 20 '15 at 17:36
  • Is this not context-sensitive? – Oli Jan 20 '15 at 17:40
  • @ikegami, I guess I shouldn't have said Perl, or at least not using Perl code. In which case you could just as well use a basic string function. – Qtax Jan 20 '15 at 17:40
  • @Oli, That's not true. It's only impossible with true regular expressions, but the OP specifically said he knew that, and that he's talking about the regex engines implemented by some languages. – ikegami Jan 20 '15 at 17:40
  • @Qtax, `/.../` is a regex match operator. Everything between the `//` is a Perl regex. As for recursion, I am referring to regex recursion, not Perl recursion. – ikegami Jan 20 '15 at 17:41
  • @ikegami, you can embed any code you like in `(??{ ... })` and friends, to do anything and everything. Should I remove the Perl tag or make it more clear? – Qtax Jan 20 '15 at 17:43
  • If you want to ask about PCRE and Java but don't care about Perl, go ahead. It's kinda weird to ask what modern regex engines can do and not want to know what the leader can do, though. – ikegami Jan 20 '15 at 17:44
  • I imagine this _has_ to be regex or you wouldn't have asked, but why not just get the middle character with: `char middle = str.charAt(str.length()/2)`? If the string length is always odd, that should work... – Ryan J Jan 20 '15 at 17:59
  • @RyanJ, that this question is about the capabilities/possibilities of modern regex flavors is explained in the last paragraph of the question. – Qtax Jan 20 '15 at 19:23
  • @Qtax fair enough, I probably glossed over it and missed the point of the question. That's what comments are for :) – Ryan J Jan 20 '15 at 19:51
  • @Qtax You say that using balancing groups for this is simple. Could you post the solution? – mbomb007 Sep 23 '16 at 14:10

2 Answers

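With PCRE and Perl you could use the following (Java would need a different pattern, because `java.util.regex` does not support the `(?(1)...)` conditional construct):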

^(?:.(?=.*?(?(1)(?=.\1$))(.\1?$)))*(.)

which captures the middle character of an odd-length string in the second capturing group.

Explained (extended form, which needs the `x` flag so that whitespace and comments are ignored):

^ # beginning of the string
(?: # loop
  . # match a single character
  (?=
    # lazily scan ahead toward the end of the string
    .*?
    # if we already have captured the end of the string (skip the first iteration)
    (?(1)
      # make sure we do not go past the correct position
      (?= .\1$ )
    )
    # capture the end of the string +1 character, adding to \1 every iteration
    ( .\1?$ )
  )
)* # repeat
# the middle character follows, capture it
(.)
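
The counting that the loop performs can also be followed procedurally. The Java sketch below is not the regex itself and is not from the answer; it is a hypothetical helper (`MiddleByGrowingSuffix.middle`) that assumes the reading of the pattern given in the comments above: every iteration consumes one character from the front while the suffix held in `\1` grows by one character, and the loop stops as soon as the two would overlap.

```java
final class MiddleByGrowingSuffix {
    // Hypothetical helper (not from the answer): mirrors the regex loop procedurally.
    static char middle(String s) {
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty string has no middle character");
        }
        int consumed = 0;   // characters eaten by the leading "." of the loop
        int suffixLen = 0;  // length of the suffix that \1 would hold
        // One more iteration is possible only while the next consumed character
        // and the suffix grown by one character still fit without overlapping.
        while ((consumed + 1) + (suffixLen + 1) <= s.length()) {
            consumed++;
            suffixLen++;
        }
        // The loop has stopped: the next character is the one (.) would capture.
        return s.charAt(consumed);
    }

    public static void main(String[] args) {
        System.out.println(middle("12345")); // prints 3
    }
}
```

For 12345 the loop runs twice (the suffix grows to 5, then 45), a third pass would need the suffix 345 to start after the 3, and so the character that follows the two consumed ones, the 3, is returned.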
Qtax

Hmm, maybe someone can come up with a pure regex solution, but if not you could always dynamically build the regex like this:

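import java.util.regex.Pattern;

public class MiddleCharMatch {
    public static void main(String[] args) {
        String s = "12345";
        int half = s.length() / 2;
        // Builds ".{2}3.{2}": matches only if the middle character is the literal '3'.
        String regex = String.format(".{%d}3.{%d}", half, half);
        Pattern p = Pattern.compile(regex);
        System.out.println(p.matcher(s).matches()); // prints true
    }
}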
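
The same dynamically built pattern can capture the middle character instead of testing for a known one. This is a minimal sketch and not part of the answer: the class `MiddleCharCapture` and its `middle` method are hypothetical names, and the pattern `^.{n}(.).{n}$` assumes the input length is known before the regex is built, which is exactly what the question tries to avoid.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MiddleCharCapture {
    // Hypothetical helper: builds ^.{n}(.).{n}$ with n = length / 2,
    // so the pattern can only match strings of odd length.
    static Optional<String> middle(String s) {
        int half = s.length() / 2;
        String regex = String.format("^.{%d}(.).{%d}$", half, half);
        // DOTALL lets "." match line terminators as well.
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(s);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(middle("12345").orElse("no match")); // prints 3
    }
}
```

For an even-length or empty input the pattern requires one character more than the string has, so the match fails and the result is empty.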
BarrySW19