Returning repeated words from a string

Question

I have a string: "Are you ok? [Hello Hello Hello]. Yes I am! [Bye Bye Bye]"

I need to return ['Hello Hello Hello', 'Bye Bye Bye'] in a list.

A regular expression seems like the easiest approach. I have tried findall(), but it returns only the first word of each group (Hello and Bye) rather than the entire run, Hello Hello Hello or Bye Bye Bye. I have also tried finditer(), but that too returns only the first word.

import re

text = "Are you ok? [Hello Hello Hello]. Yes I am! [Bye Bye Bye]"
def find_words(text):
    p = re.compile(r'(\w{3,})\s\1')
    for match in p.finditer(text):
        # prints the captured group(s) for each match
        print(match.groups(0))

find_words(text)

Expected result: ['Hello Hello Hello', 'Bye Bye Bye']. When I run the code I get ['Hello', 'Bye'].

Are the 3 words always surrounded by `[]`? – Dani Mesejo, Oct 23 '19 at 23:57

score 0 · Answer 1 · answered Oct 24 '19 at 00:00

You could try this regex:

\b(\w+)\s+\1\b

It is taken from the question "Regular Expression For Consecutive Duplicate Words".
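Note that this pattern matches only one repetition, and findall() returns the contents of the capturing group rather than the whole match, which is likely why only the first word comes back. A minimal sketch of one way around both issues: repeat the back-reference inside a non-capturing group and read group(0) from finditer(). The helper name find_repeated_runs is made up for illustration, and the sketch assumes the repeated words are separated only by whitespace.

```python
import re

text = "Are you ok? [Hello Hello Hello]. Yes I am! [Bye Bye Bye]"

# With a capturing group, findall() returns the group, not the full match.
print(re.findall(r'\b(\w+)\s+\1\b', text))  # ['Hello', 'Bye']


def find_repeated_runs(text):
    """Hypothetical helper: return each run of a word repeated back to back."""
    # (\w+) captures the first word; (?:\s+\1\b)+ requires one or more
    # further copies of that same word, separated by whitespace.
    pattern = re.compile(r'\b(\w+)(?:\s+\1\b)+')
    # group(0) is the entire matched run, not just the captured word.
    return [match.group(0) for match in pattern.finditer(text)]


print(find_repeated_runs(text))  # ['Hello Hello Hello', 'Bye Bye Bye']
```

If the words must be at least three characters long, as in the question's pattern, \w+ could be swapped for \w{3,}.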

SamHDev
