0

I'm trying to understand positive and negative lookahead, but I think I'm missing something.

q(?=u) 

As I understand it, this regex means "match a q that is followed by a u". I do get a match on the string "quit", but the match contains only 'q'.

But with the regex q(?=u)i, I don't get any match on the string "quit". Why does this happen? This regex probably doesn't make sense, but I would like to know what it means in order to understand lookahead.

GreenMatt
p0kero

2 Answers

2

The lookahead doesn't consume its text. qu is different from q(?=u) -- the latter matches just a q, but requires it to be followed by u (which is however not captured or consumed).

And that's why q(?=u)i cannot match -- the character right after the q would have to be both u (to satisfy the lookahead) and i (to match the next part of the pattern) at the same time, which is impossible. In other words, it would match qi, but only if the q were immediately followed by u, which is obviously not true if it is followed by i.

If you want to match qui, the regex for that is qui.
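To see this in practice, here is a small sketch using Python's re module (the choice of Python is an assumption; the question doesn't say which regex engine is in use, and the variable names are only illustrative):

```python
import re

text = "quit"

# The lookahead checks for 'u' but does not consume it,
# so the match is just 'q' and ends right after it.
m = re.search(r"q(?=u)", text)
matched = m.group(0)
match_end = m.end()

# After the lookahead, the engine is still positioned at 'u',
# so the literal 'i' cannot match there.
no_match = re.search(r"q(?=u)i", text)

# Consuming the 'u' explicitly lets the 'i' match.
plain = re.search(r"qui", text).group(0)
with_lookahead = re.search(r"q(?=u)ui", text).group(0)

print(matched, match_end, no_match, plain, with_lookahead)
```

Here the first search matches only q, the second returns None, and the last two both match qui.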

tripleee
1

Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u). The negative lookahead construct is the pair of parentheses, with the opening parenthesis followed by a question mark and an exclamation point. Inside the lookahead, we have the trivial regex u.

Positive lookahead works just the same. q(?=u) matches a q that is followed by a u, without making the u part of the match. The positive lookahead construct is a pair of parentheses, with the opening parenthesis followed by a question mark and an equals sign.
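As a rough illustration, assuming Python's re module and a made-up list of sample words, the following compares the two lookaheads with the negated character class q[^u]:

```python
import re

# Hypothetical sample words, chosen only for illustration.
words = ["quit", "Iraq", "qatar"]

# q not followed by u (also matches a q at the very end of the string).
not_followed_by_u = [w for w in words if re.search(r"q(?!u)", w)]

# q followed by u; the u is checked but is not part of the match.
followed_by_u = [w for w in words if re.search(r"q(?=u)", w)]

# A negated class must consume one character after the q,
# so it cannot match a q that ends the string.
negated_class = [w for w in words if re.search(r"q[^u]", w)]

print(not_followed_by_u, followed_by_u, negated_class)
```

The negated class misses "Iraq" because there is no character after the q for [^u] to consume, while q(?!u) succeeds there since nothing, and in particular no u, follows.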

Regex lookahead, lookbehind and atomic groups

Aman Jaiswal