I want to match "AB"; if "A" is not followed by "B", I want to match only "A".

I used this Perl regex: A(*ACCEPT)??B

The string "AB" matches as expected, but for "AC" the regex does not return "A". Why?

I know there are alternatives, but I want to understand how (*ACCEPT) behaves with a quantifier.

Am I misunderstanding something? Thanks for your help!

Steven
  • Why are you using the nongreedy quantifier on `(*ACCEPT)`? It does not make sense to me. `a??` means: match "a" 0 or 1 times, trying 0 first, then 1. `(*ACCEPT)`, on the other hand, is not a character but a command to the parser. – Håkon Hægland Feb 02 '20 at 09:27
  • Did you mean to use something like `AB|A(*ACCEPT)B`? Well, this pattern makes little sense since it is the same as `AB?` / `A(?:B)?`. You only need `(*ACCEPT)` when you want to factor alternatives in a group that is followed by some pattern that should only be matched with some of the alternatives. For your case, just make the `B` optional: `(?:B)?` or, if it is a single atom, `B?`. – Wiktor Stribiżew Feb 02 '20 at 09:58
  • Please clarify what you want to achieve. – Wiktor Stribiżew Feb 02 '20 at 12:35
  • I saw it at https://www.pcre.org/current/doc/html/pcre2pattern.html#SEC29: "(*ACCEPT) is the only backtracking verb that is allowed to be quantified because an ungreedy quantification with a minimum of zero acts only when a backtrack happens. Consider, for example, (A(*ACCEPT)??B)C" – Steven Feb 03 '20 at 01:52
  • You are significantly overcomplicating this. Here is one of my more recent explanations of how the regex engine works: it will always try the same things first and succeed with the first matching substring it finds. https://stackoverflow.com/questions/59849525/perl-regular-expression-regex-fails-when-i-make-it-optional/59849622#59849622 – Grinnz Feb 03 '20 at 15:37

2 Answers

You pointed to the docs that say:

(*ACCEPT) is the only backtracking verb that is allowed to be quantified because an ungreedy quantification with a minimum of zero acts only when a backtrack happens. Consider, for example,

(A(*ACCEPT)??B)C

where A, B, and C may be complex expressions. After matching "A", the matcher processes "BC"; if that fails, causing a backtrack, (*ACCEPT) is triggered and the match succeeds. In both cases, all but C is captured. Whereas (*COMMIT) (see below) means "fail on backtrack", a repeated (*ACCEPT) of this type means "succeed on backtrack".

That passage comes from the PCRE2 documentation. In Perl, however, a quantified (*ACCEPT) does not seem to be tied to backtracking in this way, as your example shows. AC can't be matched with A(*ACCEPT)??B because:

  • A in the pattern matches A in the string.
  • (*ACCEPT)?? is skipped at first because it is lazily quantified.
  • B can't match C in the string, so the attempt fails.
  • You expected the engine to backtrack into (*ACCEPT)?? at this point and succeed, but that does not happen here.
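For comparison, here is how a lazy `??` behaves on an ordinary atom, as a minimal Python sketch. Python's standard `re` module does not implement backtracking verbs such as (*ACCEPT), so the group `(B??)` is only a stand-in chosen for illustration; it shows the "try zero first, take one on backtrack" order that the quoted PCRE2 passage relies on, and says nothing about how Perl treats a quantified verb.

```python
import re

# Lazy optional group: the engine tries zero repetitions first.
lazy = re.compile(r"A(B??)C")

# On "ABC": B?? is skipped, C fails against "B", the engine backtracks,
# B?? then consumes "B", and C matches.
m1 = lazy.match("ABC")
print(m1.group(0), repr(m1.group(1)))  # ABC 'B'

# On "AC": B?? is skipped and C matches immediately, so no backtrack is needed.
m2 = lazy.match("AC")
print(m2.group(0), repr(m2.group(1)))  # AC ''
```

The lazily quantified item only comes into play after something later in the pattern has failed, which is the behaviour the PCRE2 docs describe as "acts only when a backtrack happens".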

A more helpful (*ACCEPT) usage example:

The only use case for (*ACCEPT) that I'm aware of is when the branches of an alternation are distributed into a later expression that is not required for all of the branches. For instance, suppose you want to match any of these patterns: BAZ, BIZ, BO.

You could simply write BAZ|BIZ|BO, but if B and Z stand for complicated sub-patterns, you'll probably look for ways to factor the B and Z patterns. A first pass might give you B(?:AZ|IZ|O), but that solution doesn't factor the Z. Another option would be B(?:A|I)Z|BO, but it forces you to repeat the B. This pattern allows you to factor both the B and the Z:

B(?:A|I|O(*ACCEPT))Z

If the engine follows the O branch, it never matches BOZ because it returns BO as soon as (*ACCEPT) is encountered, which is what we wanted.
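To make the target behaviour concrete, here is a small Python sketch of what the patterns above should return. Python's standard `re` module has no (*ACCEPT), so the sketch only runs the unfactored pattern and the first-pass factoring; in an engine that supports the verb, B(?:A|I|O(*ACCEPT))Z is meant to give the same results. The helper `matched` and the sample strings are made up for this illustration.

```python
import re

# Unfactored reference pattern from the text.
flat = re.compile(r"BAZ|BIZ|BO")
# First-pass factoring: B is shared, Z is still repeated.
factored_b = re.compile(r"B(?:AZ|IZ|O)")

def matched(pattern, text):
    """Return the matched prefix of text, or None if there is no match."""
    m = pattern.match(text)
    return m.group(0) if m else None

samples = ["BAZ", "BIZ", "BO", "BOZ", "BAX"]
results = {s: (matched(flat, s), matched(factored_b, s)) for s in samples}

for s, (a, b) in results.items():
    print(s, a, b)
```

Both patterns return BO for the input BOZ and nothing for BAX, which is the result the (*ACCEPT) version is designed to reproduce while writing B and Z only once.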

Wiktor Stribiżew

Here is a regex that will match only the "A" out of "AC", but will match the whole of "AB":

AB?

... but judging from your regex, I suspect that's too simple. What are some longer strings you might want to match with the same regex?
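A quick check of `AB?` against the two strings from the question, as a minimal Python sketch. The question is about Perl, but a greedy optional `B?` needs no engine-specific features, so the same pattern is assumed to behave identically there.

```python
import re

pattern = re.compile(r"AB?")

# Collect the matched text for each input string from the question.
found = {text: pattern.match(text).group(0) for text in ("AB", "AC")}

for text, hit in found.items():
    print(text, "->", hit)
# AB -> AB
# AC -> A
```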

J S
  • pcre.org/current/doc/html/pcre2pattern.html#SEC29: "(*ACCEPT) is the only backtracking verb that is allowed to be quantified because an ungreedy quantification with a minimum of zero acts only when a backtrack happens. Consider, for example, (A(*ACCEPT)??B)C" – Steven Feb 03 '20 at 01:59
  • Again, what are you trying to match? In your example, I don't see why you need any backtracking. – J S Feb 05 '20 at 01:37