
I am working on an anti-spam chat bot and have been using the following regex to catch spammers.

spam = re.compile(ur'(?:\b(\w+)\b) (?:\1(?: |$))+')

The issue is that it triggers when the same word is repeated just twice. How do I make it trigger only when the word is repeated 4 times in total, not just 2?

Thanks in advance!

Mr.Unknown21
  • Read about "regex quantifiers". Check the [regex reference](http://stackoverflow.com/a/22944075). Also your example doesn't "just match 2 times". – HamZa Oct 13 '14 at 20:49
  • HamZa nailed it, here's a resource for python: http://www.informit.com/articles/article.aspx?p=1278986 – Devarsh Desai Oct 13 '14 at 20:56

1 Answer


This matches a word repeated 4 times in a row.
More generally, you can replace the literal space [ ] with \s+ to allow any run of whitespace between the repetitions.

 # \b(\w+)(?:[ ]\1(?=[ ]|$)){3}

 \b 
 ( \w+ )                # (1), One
 (?:                    # Plus
      [ ] 
      \1                # Three
      (?= [ ] | $ )
 ){3}
                        # Four Total
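
For reference, here is a minimal sketch of how the pattern above might be plugged into a Python 3 bot. The names `SPAM`, `SPAM_WS`, and `is_spam` are made up for illustration and are not from the question. A plain raw string is used because the `ur'...'` prefix from the question is not valid in Python 3.

```python
import re

# The pattern from above: one word, then exactly 3 more copies of it,
# each preceded by a single space and followed by a space or end of string.
SPAM = re.compile(r'\b(\w+)(?:[ ]\1(?=[ ]|$)){3}')

# Variant with \s+ in place of the literal space, so tabs or
# multiple spaces between the repeated words are also accepted.
SPAM_WS = re.compile(r'\b(\w+)(?:\s+\1(?=\s|$)){3}')


def is_spam(message, pattern=SPAM):
    """Return True if some word occurs at least 4 times in a row."""
    return pattern.search(message) is not None


if __name__ == "__main__":
    for text in ("buy buy", "buy buy buy", "buy buy buy buy"):
        print(repr(text), is_spam(text))
```

Because `search` is used, a message with more than 4 repetitions is still flagged: the first 4 are enough for a match. Matching is case-sensitive as written; passing `re.IGNORECASE` to `re.compile` would be one way to relax that.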