
from regular-expressions.info (emphasis added)

Let's take one more look inside, to make sure you understand the implications of the lookahead. Let's apply q(?=u)i to quit. The lookahead is now positive and is followed by another token. Again, q matches q and u matches u. Again, the match from the lookahead must be discarded, so the engine steps back from i in the string to u. The lookahead was successful, so the engine continues with i. But i cannot match u. So this match attempt fails. All remaining attempts fail as well, because there are no more q's in the string.

Does this necessarily mean that the match has to stop after this q not followed by a u is matched? What can come after the q once it is matched? What if we want to match more after this q not followed by a u? E.g., what if I want to continue to match the rest of the letters in the word quote, as in q(?=u)ote?

1252748
  • It would still continue to fail. – hwnd Aug 30 '14 at 20:32
  • I don't see why you're overcomplicating things. `(?=u)` asserts that there is a `u` ahead. If it exists, the engine continues, but then the regex states that it needs to match `i`, which is never true if the lookahead succeeded. So basically this is something like `10 > 20`. It's always false. [This post might be funny](http://stackoverflow.com/questions/1723182/a-regex-that-will-never-be-matched-by-anything) – HamZa Aug 30 '14 at 20:56

1 Answer


Yes, once a lookahead assertion fails, the match attempt at that position fails. In that sense lookaheads are no different from any other part of a regex: if they don't "match", the overall match fails. The difference is that the characters a lookahead matches are not consumed, so the following part of the regex (if there is one) still needs to match those same characters.

In this case the u matches the lookahead (?=u), but the u isn't consumed by the match. Therefore in the next step the u is tested against the i and the overall match fails. Using q(?=u) means the q must be followed by a u. To match quit using a similar regex you could use q(?=u)uit.
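
To make the "not consumed" point concrete, here is a minimal sketch using Python's `re` module (my choice for illustration; the behaviour should be the same in other engines that support lookahead). The variable names are just for this example.

```python
import re

# After the lookahead succeeds, the next token is tested against the same "u",
# so "i" can never match there.
no_match = re.search(r"q(?=u)i", "quit")

# Spelling out the "u" after the lookahead lets the rest of the word match.
full_match = re.search(r"q(?=u)uit", "quit")

# On its own, the lookahead adds nothing to the matched text: only "q" is consumed.
only_q = re.search(r"q(?=u)", "quit")

print(no_match)                       # None
print(full_match.group())             # quit
print(only_q.group(), only_q.span())  # q (0, 1)
```

The last line shows that the match ends right after the q, even though the u had to be there for the match to succeed.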

If you want to match after a q not followed by a u, then you could use a negative lookahead instead of a positive lookahead, e.g. q(?!u)ote would match qote. These examples are contrived, though. Lookaheads (and lookbehinds) are very useful, but they take some getting used to, and they're not needed in the vast majority of cases.
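
The negative case can be checked the same way. Again this is only a sketch in Python's `re`, and the test strings (including "Iraq") are ones I picked for illustration.

```python
import re

negative = re.compile(r"q(?!u)ote")

# "q" is followed by "o", so the negative lookahead succeeds and "ote" matches.
qote = negative.search("qote")

# "q" is followed by "u", so the negative lookahead fails and there is no match.
quote = negative.search("quote")

# A negative lookahead also succeeds at the end of the string,
# because there is no "u" (or anything else) after the "q".
end_q = re.search(r"q(?!u)", "Iraq")

print(qote.group())   # qote
print(quote)          # None
print(end_q.span())   # (3, 4)
```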

Fish
  • Hm. So in what sense is `/q(?=u)uit/` different than simply `/q(?=u)uit/`? :( – 1252748 Aug 30 '14 at 22:27
  • I wrote that last comment wrong. Should have been: "So in what sense is `/q(?=u)uit/` different than simply `/quit/` ?" – 1252748 Aug 30 '14 at 22:54
  • @thomas There isn't a difference, which is why it's a contrived example. I very rarely use positive lookaheads, because the desired outcome can often be achieved in a simpler way. – Fish Aug 30 '14 at 23:09
  • Perhaps because you can plug an additional regex into the lookahead parenthesis you might achieve something pretty clever. – 1252748 Aug 30 '14 at 23:18
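
As one possible illustration of that last comment: because a lookahead consumes nothing, several of them can be stacked at the same position, each checking a different condition with its own sub-pattern. The rule below (at least 8 characters, containing a digit and a lowercase letter) is a made-up example, not something from the question, and the sketch again assumes Python's `re`.

```python
import re

# Hypothetical rule for illustration only:
#   (?=.*\d)     somewhere ahead there is a digit
#   (?=.*[a-z])  somewhere ahead there is a lowercase letter
#   .{8,}        the string itself is at least 8 characters long
rule = re.compile(r"^(?=.*\d)(?=.*[a-z]).{8,}$")

ok = rule.match("passw0rd")
no_digit = rule.match("password")
no_lower = rule.match("PASSW0RD")
too_short = rule.match("pw0")

print(bool(ok), bool(no_digit), bool(no_lower), bool(too_short))
# True False False False
```

Both lookaheads start from the same position (the beginning of the string), which is something you can't easily express with ordinary consuming tokens.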