I would like to match some strings with a regex in R while excluding others. Specifically, I would like to exclude subsections of strings that I would otherwise like to match. The example below uses the answer from "Regular expression to match a line that doesn't contain a word?".

My confusion is that when I run the code below, grepl throws an error:

Error in grepl(mypattern, mystring) : invalid regular expression 'boardgames|(^((?!games).)*$)', reason 'Invalid regexp'

mypattern <- "boardgames|(^((?!games).)*$)"
mystring <- c("boardgames", "boardgames", "games")

grepl(mypattern, mystring)

Note that running the same pattern through str_detect returns the desired result (i.e. TRUE, TRUE, FALSE), but I would like to use grepl.

tall_table

1 Answer

We need perl = TRUE, as the default is perl = FALSE:

grepl(mypattern, mystring, perl = TRUE)
#[1]  TRUE  TRUE FALSE

This is needed whenever the pattern uses Perl-compatible regex syntax, such as the negative lookahead (?!games) here.
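To see the difference between the two engines side by side, here is a minimal sketch that reuses the question's pattern and data. The names default_result and perl_result are only illustrative, and the tryCatch wrapper is there so the failing call does not stop the script.

```r
mypattern <- "boardgames|(^((?!games).)*$)"
mystring <- c("boardgames", "boardgames", "games")

# Default engine (perl = FALSE): the lookahead is rejected, so we capture
# the error message instead of letting the call abort.
default_result <- tryCatch(
  suppressWarnings(grepl(mypattern, mystring)),
  error = function(e) conditionMessage(e)
)

# PCRE engine (perl = TRUE): the same pattern compiles and matches.
perl_result <- grepl(mypattern, mystring, perl = TRUE)

default_result  # a character string holding the "invalid regular expression" message
perl_result
#[1]  TRUE  TRUE FALSE
```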

According to ?regexp:

The perl = TRUE argument to grep, regexpr, gregexpr, sub, gsub and strsplit switches to the PCRE library that implements regular expression pattern matching using the same syntax and semantics as Perl 5.x, with just a few differences.
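If switching engines is not an option, the same filter can probably be written without a lookahead by combining two grepl calls. This is only a sketch under some assumptions: the strings are single-line and contain no NA values, and the variable name keep is made up for illustration.

```r
mystring <- c("boardgames", "boardgames", "games")

# Keep a string if it contains "boardgames", or if it does not contain "games".
# fixed = TRUE treats each pattern as a literal substring, so no lookahead
# (and no perl = TRUE) is needed.
keep <- grepl("boardgames", mystring, fixed = TRUE) |
  !grepl("games", mystring, fixed = TRUE)
keep
#[1]  TRUE  TRUE FALSE
```

The trade-off is two passes over the vector instead of one, in exchange for a condition that reads directly as "has boardgames, or has no games".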

akrun