grep find the words with same vowels in a line

Question

Given a text file, how can I use grep with a regular expression to find the lines that contain the same vowel at least 5 times?

I tried:

egrep '[aeiou]{5,}\1' file

but it does not work.

Do you want to find the **line** that contains at least 5 times a same vowel or find **words** with at least 5 times a same vowel? — Cyrus, Feb 12 '16 at 18:01

score 2 · Answer 1 · answered Feb 12 '16 at 18:51

For a test file that looks like this

aeiou
a1a2a3a4a5
e1e2e3e4e5
aaa

where we want to match the second and third lines, you can use

$ grep -E '([aeiou]).*(\1.*){4}' infile
a1a2a3a4a5
e1e2e3e4e5

This matches and captures any one vowel, then requires four more occurrences of that same vowel (through the backreference \1), each optionally followed by other characters.
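To try the backreference pattern without creating a file by hand, here is a minimal sketch. It assumes a grep that accepts backreferences together with -E (GNU grep does), and the variable names infile and matches and the use of mktemp are choices made for this illustration only.

```shell
# Recreate the four-line test file from the answer in a temporary location.
infile=$(mktemp)
printf '%s\n' 'aeiou' 'a1a2a3a4a5' 'e1e2e3e4e5' 'aaa' > "$infile"

# -n prefixes every matching line with its line number, which makes it easy
# to see that only the second and third lines contain one vowel five times.
matches=$(grep -nE '([aeiou]).*(\1.*){4}' "$infile")
printf '%s\n' "$matches"

# Clean up the temporary file.
rm -f "$infile"
```

The first line (aeiou) is rejected because \1 must repeat the vowel that was captured, not just any vowel, and the last line (aaa) is rejected because it has only three occurrences.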

score 1 · Answer 2 · edited May 23 '17 at 11:44

With GNU grep:

To find words that contain the same vowel at least 5 times:

grep -Eo '\b\w*((a\w*){5,}|(e\w*){5,}|(i\w*){5,}|(o\w*){5,}|(u\w*){5,})\b' file
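As a quick check of the word-level pattern, the sketch below pipes a made-up sentence through the same command. The sample words and the variable names sample and words are hypothetical, and \b and \w are GNU extensions, so this assumes GNU grep as the answer does.

```shell
# Hypothetical sample text. "banana" has three a's and "independence" has
# four e's, so neither should be reported; the other two words have at
# least five a's each.
sample='banana abracadabra taramasalata independence'

# -o prints each matched word on its own line instead of the whole input line.
words=$(printf '%s\n' "$sample" |
  grep -Eo '\b\w*((a\w*){5,}|(e\w*){5,}|(i\w*){5,}|(o\w*){5,}|(u\w*){5,})\b')
printf '%s\n' "$words"
```

Because \w does not match spaces, the five occurrences have to fall inside a single word, which is what separates this from the line-based patterns.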

answered Feb 12 '16 at 17:59 · Cyrus · edited May 23 '17 at 11:44 · Community

score 1 · Accepted Answer · answered Feb 12 '16 at 18:05

One idea is to match a vowel followed by anything and to require that pattern at least 5 times. For a alone, that would be:

egrep '(a.*){5,}' "$FILE"

Repeat that for each vowel, joining the alternatives with the | character:

egrep '(a.*){5,}|(e.*){5,}|(i.*){5,}|(o.*){5,}|(u.*){5,}' "$FILE"

There might be a better solution, for example compacting all those pipes (|), but I cannot think of any right now.
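One way to avoid typing the alternation by hand is to let the shell assemble it from a list of vowels. This is only a sketch: it does not shorten the regular expression itself, and the variable names pattern, v and found as well as the sample file contents are assumptions made for the illustration.

```shell
# Build '(a.*){5,}|(e.*){5,}|...' from a vowel list.
# ${pattern:+$pattern|} expands to nothing on the first pass and to the
# pattern so far plus a '|' on every later pass.
pattern=''
for v in a e i o u; do
  pattern="${pattern:+$pattern|}($v.*){5,}"
done
printf '%s\n' "$pattern"

# Hypothetical input file, used only to try the pattern out.
FILE=$(mktemp)
printf '%s\n' 'aeiou' 'a1a2a3a4a5' 'o-o-o-o-o' 'uuuu' > "$FILE"

# grep -E is the equivalent of egrep.
found=$(grep -E "$pattern" "$FILE")
printf '%s\n' "$found"

rm -f "$FILE"
```

The generated pattern is the same five-way alternation as the one written out by hand, so it matches the same lines.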
