I want to count how many times words from a dictionary appear in a string, where an occurrence only counts if it is surrounded by whitespace or sits at the start or end of the string.
I'm using this answer like this:
library(stringi)
testStr <- c("dutch dutch brown", "brown ", "AAdutch", "dutchAA", "AAbrown",
"brownAA", "hello")
stri_count_regex(testStr, "(^|\\s+)dutch|brown(\\s+|$)")
This returns 3 1 0 1 1 0 0, but I'm expecting 3 1 0 0 0 0 0. So the problem is that it also counts "dutchAA" and "AAbrown", which I don't want.
I'm a bit puzzled about this, as this regular expression works fine when I run it on RegExr.
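A possible explanation, offered as a sketch rather than a confirmed diagnosis: alternation (`|`) has the lowest precedence in a regular expression, so the pattern above reads as `(^|\s+)dutch` OR `brown(\s+|$)`. That would explain why "dutchAA" and "AAbrown" are counted. The version below groups the two words and checks the boundaries with lookarounds (`(?<!\S)` and `(?!\S)`, which are my substitution and not part of the original pattern), so that a space shared by two adjacent words is not consumed by the first match.

```r
library(stringi)

testStr <- c("dutch dutch brown", "brown ", "AAdutch", "dutchAA", "AAbrown",
             "brownAA", "hello")

# Group the alternatives so both boundary checks apply to every word.
# (?<!\S) : not preceded by a non-whitespace character (start or whitespace)
# (?!\S)  : not followed by a non-whitespace character (end or whitespace)
pattern <- "(?<!\\S)(dutch|brown)(?!\\S)"

counts <- stri_count_regex(testStr, pattern)
print(counts)
# [1] 3 1 0 0 0 0 0
```

Simply adding parentheses, as in `(^|\\s+)(dutch|brown)(\\s+|$)`, would likely undercount "dutch dutch brown", because the whitespace consumed after the first "dutch" is no longer available as the leading boundary of the second one. The zero-width lookarounds avoid that.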