
Given a text file, how can I use grep with a regular expression to find the lines that contain the same vowel at least 5 times?

I tried:

egrep '[aeiou]{5,}\1' file

but it does not work.

Alan Moore
Amber
    Do you want to find the **line** that contains the same vowel at least 5 times, or the **words** that contain the same vowel at least 5 times? – Cyrus Feb 12 '16 at 18:01

3 Answers


For a test file that looks like this

aeiou
a1a2a3a4a5
e1e2e3e4e5
aaa

where we want to match the second and third lines, you can use

$ grep -E '([aeiou]).*(\1.*){4}' infile 
a1a2a3a4a5
e1e2e3e4e5

This captures any one vowel, then looks for four more occurrences of that same vowel, each followed by any number of other characters.
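One caveat: POSIX does not define backreferences for extended regular expressions, so \1 under -E relies on an extension (GNU grep supports it). A minimal sketch of the same idea in basic regular expression syntax, where \1 is standardized, could look like this. The temporary file and the variable names are only for illustration.

```shell
# Recreate the sample file from above in a temporary location (illustrative).
tmpfile=$(mktemp)
printf '%s\n' 'aeiou' 'a1a2a3a4a5' 'e1e2e3e4e5' 'aaa' > "$tmpfile"

# BRE syntax: \( \) captures the vowel, \1 refers back to it,
# and \{4\} asks for four more occurrences of that same vowel.
matches=$(grep '\([aeiou]\).*\(\1.*\)\{4\}' "$tmpfile")
printf '%s\n' "$matches"

rm -f "$tmpfile"
```

This should select the same two lines as the -E version.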

Benjamin W.

With GNU grep:

To find words with at least 5 times the same vowel:

grep -Eo '\b\w*((a\w*){5,}|(e\w*){5,}|(i\w*){5,}|(o\w*){5,}|(u\w*){5,})\b' file

See: The Stack Overflow Regular Expressions FAQ
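If your grep lacks \b and \w, a rough alternative (not the command above, just a sketch under that assumption) is to put every word on its own line with tr and then filter with a plain alternation. The sample text and variable names here are made up for the example.

```shell
# Hypothetical sample text, not taken from the question.
text='abracadabra is a banana-free beekeeper'

# tr turns every run of characters that are not letters, digits, or
# underscores into a single newline, so each word lands on its own line.
# grep then keeps the words in which one vowel occurs at least five times.
words=$(printf '%s\n' "$text" |
  tr -cs '[:alnum:]_' '\n' |
  grep -E '(a.*){5}|(e.*){5}|(i.*){5}|(o.*){5}|(u.*){5}')
printf '%s\n' "$words"
```

Because each line holds exactly one word, .* cannot run past the end of a word, which is the job \w* does in the GNU version.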

Community
Cyrus

One idea is to match a vowel followed by anything, and to require that pattern at least 5 times. For a alone, that would be:

egrep '(a.*){5,}' "$FILE"

Repeat that for each vowel using the | character, like:

egrep '(a.*){5,}|(e.*){5,}|(i.*){5,}|(o.*){5,}|(u.*){5,}' "$FILE"

There might be a better solution, for example compacting all those pipes (|), but I cannot think of any right now.
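One way to avoid typing out all the pipes, assuming a POSIX shell, is to let a loop build the alternation. This is only a sketch: the sample file and the variable names (pattern, sample, result) are hypothetical.

```shell
# Build '(a.*){5,}|(e.*){5,}|...' one vowel at a time.
# ${pattern:+$pattern|} adds the '|' separator only once pattern is non-empty.
pattern=''
for v in a e i o u; do
  pattern="${pattern:+$pattern|}($v.*){5,}"
done

# Illustrative sample file.
sample=$(mktemp)
printf '%s\n' 'aeiou' 'a1a2a3a4a5' 'e1e2e3e4e5' 'aaa' > "$sample"

result=$(grep -E "$pattern" "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The generated pattern is the same as the hand-written one, so the result should not change; the loop just makes it easier to change the vowel list or the count.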

fanton