
Using Perl, is it possible to write a single one-line regex that matches some phrases while NOT matching others? The rules are stored in a hash, and each qualifier (column) needs to match the expression assigned to it.

I have an object that contains qualifiers that I would prefer to feed in as a one-line regex. For example:

my %filters = (
  rule_a  => {
    regex  => {
      qualifier_a => qr/fruits/i,
      qualifier_b => qr/apples|oranges|!bananas/i, # intent: NOT bananas (this syntax does not negate)
    }
  }
);

So, in this case, I want to match qualifier_a containing 'fruits' and qualifier_b containing 'apples', 'oranges', but NOT 'bananas'.

The script loops over each qualifier as follows:

my $match = 1; # Assume a match
foreach my $qualifier (keys %{ $filters{rule_a}{regex} }) {
  # If it fails to match, set match to false and break out of the loop
  $match = ($qualifier =~ $filters{rule_a}{regex}{$qualifier}) ? 1 : 0;
  if (!$match) {
    last;
  }
}
if($match){
  # Do all the things
}

Thoughts?

Ryan Dunphy
  • I thought the saying was "this, that **and** the other." ;) – ThisSuitIsBlackNot Dec 08 '14 at 16:42
  • I should also mention that this regex can, and most likely WILL, end up with more values as the filters get more complex. So assume that the regex can contain multiple HAVES and HAVE NOTS (i.e., contains apples and oranges, but not bananas or grapes). – Ryan Dunphy Dec 08 '14 at 17:07

2 Answers


Somewhat convoluted, but does its job:

^.*?bananas.*(*SKIP)(*FAIL)|apples|oranges

Demo

The first part will match the whole string if it contains bananas, and then force a failure without retrying at the next position.

The demo uses . to make lines independent, but you could use the s option if your string can contain newlines.
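
As a rough illustration of how this pattern could slot into the filter hash from the question, here is a minimal sketch. The `row_matches` helper and the idea that each row is a hash of qualifier name to column value are assumptions made for the example, not part of the original script. The `s` flag is added so that `.` also crosses newlines.

```perl
use strict;
use warnings;

# Filter table shaped like the one in the question.
my %filters = (
    rule_a => {
        regex => {
            qualifier_a => qr/fruits/i,
            # If "bananas" appears anywhere, consume the whole string and
            # fail without retrying; otherwise accept "apples" or "oranges".
            qualifier_b => qr/^.*?bananas.*(*SKIP)(*FAIL)|apples|oranges/is,
        },
    },
);

# Hypothetical helper: $row maps each qualifier name to the value to test.
sub row_matches {
    my ($rule, $row) = @_;
    my $regexes = $filters{$rule}{regex};
    for my $qualifier (keys %$regexes) {
        my $value = $row->{$qualifier} // '';
        # Stop at the first qualifier whose value does not match.
        return 0 unless $value =~ $regexes->{$qualifier};
    }
    return 1;
}

my %good = (qualifier_a => 'Fruits', qualifier_b => 'apples and oranges');
my %bad  = (qualifier_a => 'Fruits', qualifier_b => 'apples and bananas');

print row_matches('rule_a', \%good) ? "match\n" : "no match\n";  # match
print row_matches('rule_a', \%bad)  ? "match\n" : "no match\n";  # no match
```

More excluded words could presumably be added by widening the first branch, for example `(?:bananas|grapes)` in place of `bananas`.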

Lucas Trzesniewski
  • Still trying to wrap my head around negative lookarounds, but I like that it forces the failure. – Ryan Dunphy Dec 08 '14 at 16:56
  • This uses backtracking verbs instead of lookarounds. If you use a lookaround, you'd need to duplicate it: `(?!^.*?bananas)(?:apples|oranges)(?!.*?bananas)`. This seemed ugly to me. – Lucas Trzesniewski Dec 08 '14 at 17:00

You can use a capture group and check whether it was set:

$match = ($qualifier =~ /banana|(apples|oranges)/);
if (!($match && $1)) {
    last;
}

As banana isn't in the capture group, $1 won't be set when the banana alternative is the one that matches.

See Regex Pattern to Match, Excluding when... / Except between for a much more detailed explanation of this technique.
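
A small self-contained sketch of the capture-group check is shown below. The `wanted_hit` function name is hypothetical and only wraps the test from the snippet above so it can be tried on a few strings.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the capture-group check.
# Returns 1 when the leftmost hit is a wanted word (captured in group 1),
# 0 when nothing matched or when the unwanted word matched first.
sub wanted_hit {
    my ($text) = @_;
    if ($text =~ /banana|(apples|oranges)/i) {
        # Group 1 is undefined when the "banana" branch matched.
        return defined $1 ? 1 : 0;
    }
    return 0;
}

for my $text ('apples', 'banana split', 'grapes', 'bananas and apples') {
    printf "%-20s => %d\n", $text, wanted_hit($text);
}
```

Because a single match only inspects the leftmost hit, a string such as 'apples and bananas' still passes in this sketch: the regex stops at 'apples' before it ever sees 'bananas'.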

RobEarl