9

I'm trying to understand the following regular expression quantifier (a is just an exemplary token here):

a{n}?

How does the question mark affect the match of the above expression? And how does it differ from the following?

a{n}

I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match. The pattern a(a{1})?a does match both, so the parentheses do make a difference here.


Note: The msdn article Quantifiers in Regular Expressions states for both:

The {n} quantifier matches the preceding element exactly n times, where n is any integer.

For {n}?, it adds the following, not overly helpful part:

It is the lazy counterpart of the greedy quantifier {n}+.

Marius Schulz
  • 14,538
  • 12
  • 58
  • 94
  • 1
    What does this have to do with C#? – Austin Henley Aug 01 '13 at 23:03
  • @AustinHenley may be he means it in context of C# Regex only. – TheVillageIdiot Aug 01 '13 at 23:03
  • @AustinHenley Different languages have different Regex flavors. As far as I know, the `{n}?` quantifier is not part of the POSIX standard, so I added the language I'm working with to avoid ambiguities. – Marius Schulz Aug 01 '13 at 23:09
  • 7
    It almost seems like whoever wrote that article wasn't thinking about what they were writing. How could there be a greedy vs lazy version of *matching exactly n times*? – McGarnagle Aug 01 '13 at 23:14
  • 2
    @McGarnagle I would understand the thinking behind an *optional* version (match it *either **exactly n times** or **not at all***), which can be expressed by the pattern `a(a{1})?a`. But you're right, what's a greedy, exactly counted match? – Marius Schulz Aug 01 '13 at 23:17
  • 6
    @Mobstaa: Not in regular expressions. You should read the question before commenting or answering. :-) – Ken White Aug 01 '13 at 23:23
  • This answer has been added to the [Stack Overflow Regular Expression FAQ](http://stackoverflow.com/a/22944075/2736496), under "Quantifiers > More on the differences..." – aliteralmind Apr 10 '14 at 00:10

2 Answers2

8

Nothing. The article states:

The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.

The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}+.

Notice the text is exactly the same. Basically, adding ? does not change the behavior of the quantifier. It appears that .NET's regular expression engine supports {n}? as a alternative to {n}.


Interestingly, this article does appear to contain an error:

The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n}?.

This is wrong. The lazy equivalent of {n,} is {n,}? which is not the same as {n}?.

Community
  • 1
  • 1
p.s.w.g
  • 136,020
  • 27
  • 262
  • 299
  • So it seems the version with the question mark (`{n}?`) is only implemented (and documented) for completeness because its siblings `{n,}?` and `{n,m}?` also exist? – Marius Schulz Aug 01 '13 at 23:34
  • 1
    @MariusSchulz Yes, I'd say so. The article even provides an example using `{n}?` but that code behaves identically if you replace it with `{n}`. – p.s.w.g Aug 01 '13 at 23:36
  • 10
    Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have `{n}?` problems. – Marius Schulz Aug 01 '13 at 23:42
  • "only implemented (and documented) for completeness" -- no, it's the other way around ... the spec and the implementation would have to go out of the way to exclude {...}? in the case where there's an exact count, and there's no reason to do that and plenty of reasons not to. – Jim Balter Jul 21 '19 at 03:46
  • "It is the lazy counterpart of the greedy quantifier {n}+" -- that was an error (in implementations that support it, {n}+ is possessive, not merely greedy) but the documentation has been changed and no longer says that. – Jim Balter Jul 21 '19 at 03:49
0

More a notice than an answer, but good to know, in particular if you project to use a same pattern in different languages or if you decide to use an other regex library with .net.

About:

I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match.

a{n} and a{n}? produce the same result (there are seen as the greedy and non-greedy version, but of a fixed quantifier) with most of the regex engines.

But this is not the case with Oniguruma and Onigmo regex engines. With them a{n}? behaves like (?:a{n})?. Since wrappers for .net exist for these libraries, it is useful to clarify.

The same with ERE (Extended Regular Expressions) used in sed, grep and with dbms.

Casimir et Hippolyte
  • 83,228
  • 5
  • 85
  • 113