You pointed to the docs that say:
(*ACCEPT) is the only backtracking verb that is allowed to be quantified because an ungreedy quantification with a minimum of zero acts only when a backtrack happens. Consider, for example,
(A(*ACCEPT)??B)C
where A, B, and C may be complex expressions. After matching "A", the matcher processes "BC"; if that fails, causing a backtrack, (*ACCEPT) is triggered and the match succeeds. In both cases, all but C is captured. Whereas (*COMMIT) (see below) means "fail on backtrack", a repeated (*ACCEPT) of this type means "succeed on backtrack".
However, (*ACCEPT) doesn't seem to be tied to backtracking in practice, and that is what you are seeing in your example.
So, AC can't be matched with A(*ACCEPT)??B because:
A in the pattern matches A in the string,
(*ACCEPT)?? is skipped at first because it is lazily quantified,
B can't match C in the string, and the match fails.
You expected backtracking to occur at that point, but (*ACCEPT)?? does not trigger backtracking.
A more helpful (*ACCEPT) usage example:
The only use case for (*ACCEPT) that I'm aware of is when the branches of an alternation are distributed into a later expression that is not required for all of the branches. For instance, suppose you want to match any of these patterns: BAZ, BIZ, BO.
You could simply write BAZ|BIZ|BO, but if B and Z stand for complicated sub-patterns, you'll probably look for ways to factor the B and Z patterns. A first pass might give you B(?:AZ|IZ|O), but that solution doesn't factor the Z. Another option would be B(?:A|I)Z|BO, but it forces you to repeat the B. This pattern allows you to factor both the B and the Z:
B(?:A|I|O(*ACCEPT))Z
If the engine follows the O branch, it never matches BOZ because it returns BO as soon as (*ACCEPT) is encountered, which is what we wanted.
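As a rough sanity check on the factoring argument, here is a minimal Python sketch comparing the three verb-free spellings. Python's built-in re module does not support (*ACCEPT), so the factored B(?:A|I|O(*ACCEPT))Z form itself is not run here. The literal letters B, A, I, O, Z stand in for the "complicated" sub-patterns, and the names PATTERNS, SUBJECTS and matched_text are made up for this illustration.

```python
import re

# Literal letters stand in for the complicated sub-patterns B and Z.
# Python's re has no (*ACCEPT), so only the verb-free forms are compared.
PATTERNS = {
    "plain": r"BAZ|BIZ|BO",
    "factor_B": r"B(?:AZ|IZ|O)",    # factors B, repeats Z
    "factor_Z": r"B(?:A|I)Z|BO",    # factors Z, repeats B
}

# BOZ is the interesting case: the wanted match is BO, never BOZ.
SUBJECTS = ["BAZ", "BIZ", "BO", "BOZ", "BA", "BZ"]


def matched_text(pattern, subject):
    """Return the text matched at the start of subject, or None."""
    m = re.match(pattern, subject)
    return m.group(0) if m else None


results = {
    name: [matched_text(pattern, s) for s in SUBJECTS]
    for name, pattern in PATTERNS.items()
}

for name, row in results.items():
    print(name, row)
```

All three spellings should agree on every subject, including returning BO for BOZ. That is the same result the (*ACCEPT) version is meant to produce in an engine that supports the verb (PCRE or Perl, for instance), without repeating either B or Z.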