EDIT:

As divibisan notes, this question provides a range of general regex answers, which were tested in Python. I'm not sure all of them apply in R. The highlighted answer is noted to be time-consuming; the most up-voted answer does not work directly on this example.


I'm trying to plug in an option for the user to select variables based on a regex in a custom function. The pattern should be optional, but I can't think of a foolproof default value.

library(dplyr)

my_select <- function(..., pattern = "") {

  x <- enquos(...)

  mtcars %>% 
    select(!!!x, matches(pattern))


} 

my_select(cyl)
#> Error in matches(pattern): nchar(match) > 0 is not TRUE

NULL or any logical value also gives an error, because the regex argument passed to matches must be a string.

#> Error in matches(pattern) : is_string(match) is not TRUE 

For the moment, I'm going with " ", as I presume it would be extremely rare... but it is certainly possible.

Is there a way around this or should I simply avoid matches and write my own thing in base?

Cheers!

Fons MA
  • To be clear: if the user doesn’t provide a `pattern` argument, then no columns should be selected, right? – divibisan Mar 23 '19 at 01:31
  • That's right! Just whatever columns go to the ellipsis. the other way around (only pattern supplied) works fine out of the box – Fons MA Mar 23 '19 at 01:44
  • Possible duplicate of [A Regex that will never be matched by anything](https://stackoverflow.com/questions/1723182/a-regex-that-will-never-be-matched-by-anything) – divibisan Mar 23 '19 at 01:49
  • Just after I answered, I noticed this likely duplicate – divibisan Mar 23 '19 at 01:50
  • The discussion over there has been tested mostly for Python.. not sure about the rules for duplication but I do like your answer better, tbh. – Fons MA Mar 23 '19 at 02:21

2 Answers

How about "^$"?

^ matches the start of a string, while $ matches the end. So "^$" only matches an empty string, which definitely isn't a legal R variable name.

grepl('^$', '')
[1] TRUE

grepl('^$', c('d', 'cat', '1', 's.s'))
[1] FALSE FALSE FALSE FALSE
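
Plugged into the function from the question, this could look roughly like the sketch below. It assumes that matches() simply contributes no columns when nothing matches (which is the behaviour I'd expect from dplyr/tidyselect), and it keeps the question's mtcars example as-is.

```r
library(dplyr)

# Sketch only: the question's my_select() with "^$" as the default pattern.
# "^$" can only match an empty column name, so by default matches()
# should add no columns to the selection.
my_select <- function(..., pattern = "^$") {
  x <- enquos(...)
  mtcars %>%
    select(!!!x, matches(pattern))
}

# Only the column passed through the ellipsis
names(my_select(cyl))

# Ellipsis column plus every column whose name starts with "d"
names(my_select(cyl, pattern = "^d"))
```

The first call should return only cyl; the second should add disp and drat after it.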
divibisan

Perhaps sidestepping the intent of the question but you can also use some default and use an if to catch it:

library(dplyr)

my_select <- function(..., pattern = NULL) {

  x <- enquos(...)
  if (is.null(pattern)) {
    mtcars %>% 
      select(!!!x)
  }
  else {
    mtcars %>% 
      select(!!!x, matches(pattern))
  }

} 

my_select(cyl)
#>                     cyl
#> Mazda RX4             6
#> Mazda RX4 Wag         6
#> Datsun 710            4
#> Hornet 4 Drive        6
#> Hornet Sportabout     8
#> Valiant               6
#> Duster 360            8
#> Merc 240D             4
#> Merc 230              4
#> Merc 280              6
#> Merc 280C             6
#> Merc 450SE            8
#> Merc 450SL            8
#> Merc 450SLC           8
#> Cadillac Fleetwood    8
#> Lincoln Continental   8
#> Chrysler Imperial     8
#> Fiat 128              4
#> Honda Civic           4
#> Toyota Corolla        4
#> Toyota Corona         4
#> Dodge Challenger      8
#> AMC Javelin           8
#> Camaro Z28            8
#> Pontiac Firebird      8
#> Fiat X1-9             4
#> Porsche 914-2         4
#> Lotus Europa          4
#> Ford Pantera L        8
#> Ferrari Dino          6
#> Maserati Bora         8
#> Volvo 142E            4

Created on 2019-03-22 by the reprex package (v0.2.1)

Calum You