
Regexp.union returns the following inspection:

Regexp.union(/dogs/, /cats/i) #=> /(?-mix:dogs)|(?i-mx:cats)/

What does ?-mix: mean?

sawa
Donato

2 Answers

1

They are flag switches inside the regexp. /cats/i means "use flag i across the whole regexp". You can also turn flags on and off inside a regexp, so /cats/i is equivalent to /(?i:cats)/. Two other flags can be manipulated this way: m and x. Since all flags are disabled unless specified otherwise, /(?i:cats)/ is in turn equivalent to /(?i-mx:cats)/ ("enable i and disable m and x inside the regular expression cats").
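A small sketch of the equivalence described above, using only Regexp#to_s and Regexp#match? (the variable names are just for illustration):

```ruby
# Regexp#to_s spells out every flag: enabled ones before the hyphen,
# disabled ones after it.
cats = /cats/i
cats_as_string = cats.to_s      # "(?i-mx:cats)"
dogs_as_string = /dogs/.to_s    # "(?-mix:dogs)"

# An inline group written by hand behaves like the trailing /i flag.
inline_cats = /(?i:cats)/
trailing_matches = cats.match?("CATS")         # true
inline_matches   = inline_cats.match?("CATS")  # true

puts cats_as_string, dogs_as_string, trailing_matches, inline_matches
```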

When you combine regular expressions, each embedded regexp explicitly turns off the flags it did not set, so that it does not inherit them from the outer context. For example:

tiny = /tiny/             # case sensitive
tina = /#{tiny} tina/i    # case insensitive
# => /(?-mix:tiny) tina/i

This will match "tiny tina" or "tiny Tina", but not "Tiny Tina". If the embedded regexp for tiny did not explicitly turn off case insensitivity, and the result were just /tiny tina/i, everything would be case insensitive.
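To check this behavior yourself, something like the following should work. The second half uses Regexp#source to interpolate only the pattern text, which is one way to get the "everything case insensitive" result described above (the names results and flat are illustrative):

```ruby
tiny = /tiny/             # case sensitive
tina = /#{tiny} tina/i    # case insensitive, except for the embedded part

# Try each candidate string against the combined regexp.
results = ["tiny tina", "tiny Tina", "Tiny Tina"].map do |text|
  [text, tina.match?(text)]
end.to_h
p results

# Interpolating the bare source drops the (?-mix:...) wrapper,
# so the outer /i now applies to "tiny" as well.
flat = /#{tiny.source} tina/i
flat_matches = flat.match?("Tiny Tina")
p flat_matches
```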

sawa
Amadan
1

What you're seeing is a representation of options on sub-regexes. The options to the left of the hyphen are on, and the options to the right of the hyphen are off. It's smart to explicitly set each option as on or off to ensure the right behavior if this regex ever became part of a larger one.

In your example, (?-mix:dogs) means that the m, i, and x options are all off whereas in (?i-mx:cats), the i option is on and thus that subexpression is case-insensitive.
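As a quick illustration of the question's example, the union keeps each alternative's own case sensitivity, while the combined regexp itself carries no i flag (pets is just an illustrative name):

```ruby
pets = Regexp.union(/dogs/, /cats/i)

dogs_lower = pets.match?("dogs")  # true: the exact case matches
dogs_upper = pets.match?("DOGS")  # false: (?-mix:dogs) stays case sensitive
cats_upper = pets.match?("CATS")  # true: (?i-mx:cats) ignores case

# The i flag lives only inside the second group, not on the union as a whole.
union_ignores_case = (pets.options & Regexp::IGNORECASE) != 0

p [dogs_lower, dogs_upper, cats_upper, union_ignores_case]
```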

See the Ruby docs on Regexp Options:

The end delimiter for a regexp can be followed by one or more single-letter options which control how the pattern can match.

  • /pat/i - Ignore case
  • /pat/m - Treat a newline as a character matched by .
  • /pat/x - Ignore whitespace and comments in the pattern
  • /pat/o - Perform #{} interpolation only once

i, m, and x can also be applied on the subexpression level with the (?on-off) construct, which enables options on, and disables options off for the expression enclosed by the parentheses.
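A minimal sketch of subexpression-level options written by hand, with patterns made up for illustration:

```ruby
# Enable i for just one part of the pattern.
mixed = /a(?i:b)c/
mixed_hits = ["abc", "aBc", "ABc"].map { |s| mixed.match?(s) }

# Disable i for one part of an otherwise case-insensitive pattern.
strict_middle = /a(?-i:b)c/i
strict_hits = ["AbC", "ABC"].map { |s| strict_middle.match?(s) }

# Enable m for a single dot so that it also matches a newline.
dot_plain = /a.b/.match?("a\nb")
dot_multi = /a(?m:.)b/.match?("a\nb")

p mixed_hits, strict_hits, dot_plain, dot_multi
```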

sawa
JKillian