I'm trying to learn regular expressions but got confused by the syntax. How do the following expressions differ?
([A-Z]){3}
([A-Z]{3})
[A-Z]{3}
[A-Z]\3 (edit: I meant ([A-Z])\3)
([A-Z]){3}
- matches 3 uppercase letters; there is one capture group, applied three times, and it keeps only the last letter it matched
([A-Z]{3})
- matches 3 uppercase letters into one group
[A-Z]{3}
- matches 3 uppercase letters, no grouping
[A-Z]\3
- is invalid in many regex flavors: it means one uppercase letter followed by a backreference to group 3, and the pattern has no group 3. For comparison, ([A-Z])([A-Z])([A-Z])\3 matches 2 uppercase letters followed by another uppercase letter that occurs twice in a row
([A-Z]){3}
- This matches three letters from A-Z using a single capture group that is applied three times; after the match, the group holds only the last letter
([A-Z]{3})
- This matches the same text as above, but it encloses all three letters in a single capture group
[A-Z]{3}
- This matches letters from A-Z three times, with no capture group
[A-Z]\3
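- This matches a single character from A-Z followed by \3, a backreference to capture group 3 (at least in Java). Since this pattern defines no group 3, the backreference has nothing to refer to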
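For anyone who wants to check these cases, here is a minimal Java sketch. It assumes the standard java.util.regex package; the class name GroupCountDemo and its two helpers are made up for illustration. It reports how many capture groups a pattern defines and whether a pattern matches a whole sample string.

```java
import java.util.regex.Pattern;

class GroupCountDemo {
    // Hypothetical helper: number of capture groups the pattern defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Hypothetical helper: true if the entire input matches the pattern.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(groupCount("([A-Z]){3}")); // 1: one group, repeated three times
        System.out.println(groupCount("([A-Z]{3})")); // 1: one group around all three letters
        System.out.println(groupCount("[A-Z]{3}"));   // 0: no groups at all

        // \3 must repeat whatever group 3 captured.
        String withBackref = "([A-Z])([A-Z])([A-Z])\\3";
        System.out.println(fullMatch(withBackref, "ABCC")); // true
        System.out.println(fullMatch(withBackref, "ABCD")); // false
    }
}
```

Note that the backslash is doubled inside the Java string literal so that the regex engine, not the Java compiler, sees \3.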
You might be wondering what a "capture group" is. It is a way of keeping track of things which matched during the course of evaluating your regular expression. For example, consider your first regex:
([A-Z]){3}
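which matches the same text as
([A-Z])([A-Z])([A-Z])
but the two differ in what they capture. If you evaluated the second regex, then, in Java for example, you would be able to access each of the three matched letters by calling group(1), group(2), and group(3) on the Matcher, or as $1, $2, and $3 in a replacement string. The first regex has only one group, so group 1 ends up holding just the last letter matched.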
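The capture behavior can be checked with a rough Java sketch like the one below. The class CaptureDemo and its method names are invented for this example; it relies only on Matcher.group(n), Matcher.groupCount(), and $n references in a String.replaceAll replacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureDemo {
    // Hypothetical helper: returns the text held by each capture group, in order.
    static String[] captures(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i); // group 0 is the whole match, so start at 1
        }
        return groups;
    }

    // Hypothetical helper: reverses three letters via $3$2$1 in the replacement.
    static String reverseThree(String input) {
        return input.replaceAll("([A-Z])([A-Z])([A-Z])", "$3$2$1");
    }

    public static void main(String[] args) {
        // Three separate groups: one capture per letter.
        System.out.println(String.join(",", captures("([A-Z])([A-Z])([A-Z])", "ABC"))); // A,B,C
        // One repeated group: only the last iteration is kept.
        System.out.println(String.join(",", captures("([A-Z]){3}", "ABC"))); // C
        System.out.println(reverseThree("ABC")); // CBA
    }
}
```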