Given a text file, how can I use grep with a regular expression to find the lines that contain the same vowel at least 5 times?
I tried:
egrep '[aeiou]{5,}\1' file
but it does not work.
For a test file that looks like this:
aeiou
a1a2a3a4a5
e1e2e3e4e5
aaa
where we want to match the second and third lines, you can use:
$ grep -E '([aeiou]).*(\1.*){4}' infile
a1a2a3a4a5
e1e2e3e4e5
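This captures any one vowel, then requires four more occurrences of that same vowel, each optionally followed by other characters. Your attempt fails because [aeiou]{5,} demands five consecutive vowels (of any kind), and \1 refers to a capture group that the pattern never defines.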
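If the threshold should be adjustable, the same pattern can be wrapped in a small shell function. This is only a sketch: same_vowel_lines is a name made up here, and it assumes a grep that accepts back-references with -E, which GNU grep does but POSIX does not require.

```shell
# same_vowel_lines N FILE: print the lines of FILE in which one vowel
# occurs at least N times (N >= 2). Hypothetical helper for illustration.
same_vowel_lines() {
    n=$1
    file=$2
    # One captured vowel, then N-1 further copies of it with anything between.
    # Back-references under -E are assumed to be supported (GNU grep).
    grep -E "([aeiou]).*(\\1.*){$((n - 1))}" "$file"
}

# Rebuild the test file from above and run the helper on it.
sample=$(mktemp)
printf '%s\n' aeiou a1a2a3a4a5 e1e2e3e4e5 aaa > "$sample"
same_vowel_lines 5 "$sample"
rm -f "$sample"
```

With the test file above, this should print the same two lines as the direct grep call.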
With GNU grep, to find words that contain the same vowel at least 5 times:
grep -Eo '\b\w*((a\w*){5,}|(e\w*){5,}|(i\w*){5,}|(o\w*){5,}|(u\w*){5,})\b' file
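To see what the word-level pattern does, here is a quick check on one made-up line. The sample words and the function name are my own, and -o, \b and \w are assumed to behave as they do in GNU grep.

```shell
# Hypothetical wrapper around the pattern above; reads from standard input.
words_with_repeated_vowel() {
    grep -Eo '\b\w*((a\w*){5,}|(e\w*){5,}|(i\w*){5,}|(o\w*){5,}|(u\w*){5,})\b'
}

# Invented sample: only "abracadabra" contains one vowel five times.
printf '%s\n' 'banana bandana abracadabra' | words_with_repeated_vowel
```

Only abracadabra should be printed, since banana and bandana contain just three a's each.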
One idea is to match a vowel followed by anything and require this pattern at least 5 times. For a alone, that would be:
egrep '(a.*){5,}' "$FILE"
Repeat that for each vowel, joining the alternatives with the | character:
egrep '(a.*){5,}|(e.*){5,}|(i.*){5,}|(o.*){5,}|(u.*){5,}' "$FILE"
There might be a better solution, for example compacting all those alternatives into one, but I cannot think of any right now.
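One way to avoid typing every alternative by hand is to let the shell build the pattern in a loop. This is a sketch, and build_vowel_pattern is a hypothetical name. The resulting expression needs no back-references, so it should work with any grep that supports -E.

```shell
# build_vowel_pattern N: print an ERE that matches a line in which
# some vowel occurs at least N times. Hypothetical helper.
build_vowel_pattern() {
    n=$1
    pattern=''
    for v in a e i o u; do
        # Append "|" only when the pattern already has an alternative.
        pattern="${pattern:+$pattern|}($v.*){$n}"
    done
    printf '%s\n' "$pattern"
}

build_vowel_pattern 5
```

The output can then be passed on directly, for example grep -E "$(build_vowel_pattern 5)" "$FILE".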