4

Trying to untaint some variables in Perl, and the following code works great:

if ($year =~ /^(\d{4})$/) {
        $year = $1;
} else {
        &invalid("year");
}

In the above instance, $1 contains $year if valid. However, when using the ?: operator, $1 contains "1" when valid:

($year =~ /^(\d{4})$/) ? $year = $1 : &invalid("year");

Anyone see where I might be at fault? I'm confused why this is happening. It's only happening on this machine. Or rather, I have successfully used the ? operator for returning proper match variables for years. I haven't tried this piece of code on any other machine yet.

This is Perl, v5.8.8 built for x86_64-linux-thread-multi
Jonathan Leffler
  • 666,971
  • 126
  • 813
  • 1,185
Jacob
  • 49
  • 1
  • 3
    Are you sure that's the code you're running? Can you include a small self-contained script that demonstrates your problem? – mu is too short Sep 06 '11 at 03:52
  • Assignments inside of the conditional operator are often incorrect due to precedence issues. (e.g. `c ? x=1 : x=2` means `(c ? x=1 : x) = 2`) The code you presented does not have the problem you describe or any precedence issues, though. – ikegami Sep 06 '11 at 04:19
  • OT: don't call sub routines with `&` as in `&invalid(...)`; it doesn't do what you expect... use `invalid(...)` – pavel Sep 06 '11 at 13:05
  • @pavel, Or more precisely, it does more than most expect. (It does call the sub, which is the expected behaviour.) – ikegami Sep 06 '11 at 15:02
  • @pavel: what does prefixing the sub with a sigil do that isn't what I expect (which is, Perl 4 compatibility notation for a call to a sub that may or may not have been seen by the Perl compiler as yet)? – Jonathan Leffler Sep 06 '11 at 15:18
  • 2
    @jonathan: it disables prototype checking and, more importantly, when used without arguments, the sub routine will use the `@_` array as it is when calling the routine... – pavel Sep 06 '11 at 15:24

3 Answers3

6

Be careful about precedence here.

Contrast http://codepad.org/vEPnhWfH which works as expected, with http://codepad.org/nXVU5CA7 which does not. Of course, I tweaked the code a little bit to avoid calling invalid, but I suspect this may be at the root of the problem.

In both cases, though $1 contains "2011", so perhaps you should show additional code, as requested in the first comment by mu.

UPDATE

I changed the codepad examples to use the call to &invalid and the error does not show up.

sub invalid {
    print("INVALID\n");
}

sub check_with_if {
    print("Trying with if-statement\n");
    my $year = shift;
    if ($year =~ /^(\d{4})$/) {
        $year = $1;
    } else {
        &invalid($year);
    }
    print("\$1 is $1 and \$year is $year"."\n");
}

sub check_with_conditional {
    print("Trying with conditional expression\n");
    my $year = shift;
    ($year =~ /^(\d{4})$/) ? $year = $1 : &invalid($year);
    print("\$1 is $1 and \$year is $year"."\n");
}

check_with_if("2011");
check_with_conditional("2011");

Output is

Trying with if-statement 
$1 is 2011 and $year is 2011
Trying with conditional expression
$1 is 2011 and $year is 2011

http://codepad.org/z22GMEcn

EDIT 2 Also works on 5.12.3 on the Mac.

I agree with Jonathan that you may have found a bug in 5.8.8 that has been fixed since. If you are incredibly curious you can work your way through the Perl changes, such as:

Ray Toal
  • 79,229
  • 13
  • 156
  • 215
  • Interesting: in the second example, you have `($year =~ /^(\d{4})$/) ? $year = $1 : $year = "invalid";` and that produces `invalid`. Change that to `($year =~ /^(\d{4})$/) ? $year = $1 : &invalid("year");` and sub `invalid()` is not called. (Test: Perl 5.14.1 on MacOS X 10.7.1). I'm not sure I have an explanation. – Jonathan Leffler Sep 06 '11 at 04:45
  • @Jonathan I think I did do that in my first update. The reason is precedence: assignment has lower precedence than conditional. The invalid subroutine is called with bad data: http://codepad.org/RUzgLQDa (Unless I am not understanding your remark.) – Ray Toal Sep 06 '11 at 04:56
  • Happens all the time :) Wonder if the OP will add to the question? – Ray Toal Sep 06 '11 at 05:41
5

Damien Conway's Perl Best Practices explains some of the pitfalls of using the match variables $1 and so on, one of them is that:

  • if the regex doesn't match anything, $1 does not become undef, but it preserves its previous value (which of course, in most cases is undef)

My intuition tells me that you have a problem with contexts - your expression might be evaluated in scalar context, thus returning the number of matches when it is valid.

So, I would try rewriting the code like:

($year) = ( $year =~ /^(\d{4})$/) or &invalid("year");
Tudor Constantin
  • 24,065
  • 7
  • 44
  • 66
  • 1
    That has the side-effect of changing `$year` if the year is invalid. That would prevent the value from being used in error messages, for one. – ikegami Sep 06 '11 at 05:57
  • indeed it has that side effect which can be avoided by my `($validated_year) = ( $year =~ /^(\d{4})$/) or &invalid("year");` – Tudor Constantin Sep 06 '11 at 09:31
3

With Perl 5.14.1 (on MacOS X 10.7.1), this script seems to work OK:

use strict;
use warnings;

my $year = "1984";

sub invalid
{
    my($year) = @_;
    print "Invalid year $year\n";
}

if ($year =~ /^(\d{4})$/)
{
    $year = $1;
}
else
{
    &invalid("year");
}


print "Year1: $year\n";

$year = "1984";
($year =~ /^(\d{4})$/) ? $year = $1 : &invalid("year");

print "Year2: $year\n";

It produces:

Year1: 1984
Year2: 1984

If the same script produces a different answer in Perl 5.8.8, then presumably you've hit a bug that has been fixed and an upgrade is in order. If Perl 5.8.8 produces the same answer (as Perl 5.14.1), then you have not yet characterized your problem so that I can understand it and reproduce the issue.

Jonathan Leffler
  • 666,971
  • 126
  • 813
  • 1,185
  • I tried my script above on a RHEL 5 x86_64 machine with the system-provided Perl 5.8.8, and got the same output as from 5.14.1 on MacOS X. I also tried my home-built Perl 5.12.1 on RHEL 5 and got the same result. – Jonathan Leffler Sep 06 '11 at 05:13