I googled for a regex to check whether a string occurs twice or more in a line, and found this example:
egrep "(\w{2}).*\1" file
But I couldn't understand the "(\w{2}).*\1" part.
Can someone explain it in detail or point me to a reference web page?
(\w{2})
matches exactly 2 word characters (A-Z, a-z, 0-9 and underscore; {2} is the quantifier). The parentheses also make it a capture group, i.e. the matched characters are remembered and can be referenced later with a numbered backreference, in this case \1.
.*
matches 0 or more of any character.
\1
matches the same text that the 1st group captured.
Therefore the regex matches any 2 word characters that are repeated after 0 or more characters on the same line.
$ egrep "(\w{2}).*\1"
ab;;ab
ab;;ab
abcdab
abcdab
12ab12
12ab12
12abcd123
12abcd123
abab
abab
$
Inputs and matched output:
ab;;ab
captured group \1: ab
and matched string is ab;;ab
abcdab
captured group \1: ab
and matched string is abcdab
12ab12
captured group \1: 12
and matched string is 12ab12
12abcd123
captured group \1: 12
and matched string is 12abcd12
abab
captured group \1: ab
and matched string is abab
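The captured group and the matched string for each input can also be checked programmatically. The following is only a rough sketch using Python's re module rather than egrep, since egrep does not print the captured group separately. The helper name explain is made up for illustration. It also assumes that \w behaves the same in both tools, which holds for these ASCII inputs.

```python
import re

# Same pattern as in the egrep command above.
PATTERN = re.compile(r"(\w{2}).*\1")


def explain(line):
    """Return (captured group 1, whole match), or None if the line does not match.

    Hypothetical helper for illustration only.
    """
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group(1), m.group(0)


if __name__ == "__main__":
    # "abcdef" is an extra input with no repeated pair, so it does not match.
    for line in ["ab;;ab", "abcdab", "12ab12", "12abcd123", "abab", "abcdef"]:
        print(line, "->", explain(line))
```

Note that for 12abcd123 the match stops at the second 12, so the trailing 3 is not part of the matched string.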
As pointed out, more information on the meta/special characters can be found here