0

In "What regular expressions can never match?", /$./ was given as an answer. I played around with it a bit and discovered that the following two commands produce different output: the second matches, but the first does not. Can anyone explain why?

$ printf 'a\nb\n' | perl -0777 -ne 'print if m/$./m'
$ perl -0777 -e '$_="a\nb\n"; print if m/$./m'

Also, note that adding <> in the following causes the match to fail:

$ printf 'a\nb\n' | perl -0777 -e '$b = "a\nb\n"; say $b =~ m/$./m'
$ printf 'a\nb\n' | perl -0777 -e '$b = "a\nb\n"; <>; say $b =~ m/$./m'

(That is, the first prints '1' and the second prints a blank line.)

William Pursell

4 Answers

9

Turning on warnings gives a clue about the reason:

$ printf 'a\nb\n' | perl -0777 -w -e 'use feature qw/say/; $b = "a\nb\n"; say $b =~ m/$./m'
Use of uninitialized value $. in regexp compilation at -e line 1.
1

You're using an undefined value in the regex. The sequence $. is not the pattern "end of line followed by any character"; it is the special variable that holds the current line number of the last-accessed filehandle, and it is interpolated into the pattern. Since you haven't read from any filehandle, $. is still undef, so the pattern is empty and matches trivially. The -n option effectively wraps the rest of the program in while (<>) { ... }, so the program reads from <> and $. is 1 by the time the match runs (with -0777, the whole input is a single record). The pattern is then m/1/m, which doesn't match the input.

When you add <> in the second attempt, you read from the stdin filehandle, so $. becomes 1. The regular expression is then m/1/m, which doesn't match the string in $b.

Brad Gilbert
Rob Kennedy
  • You can use `-E` instead of `-e`, which does the same thing as `-e` with `use feature ':5.10'`: `perl -0777 -wE'$b = "a\nb\n"; say $b =~ m/$./m'` – Brad Gilbert Dec 04 '09 at 18:11
6

The $. in your regex is being parsed as the value of the special variable $. ($INPUT_LINE_NUMBER) rather than "end of line followed by any character."

Also note that the /m modifier changes the meaning of $: without it, $ matches only at the end of the string (or just before a final newline); with it, $ also matches just before any newline within the string. See Modifiers in perlre. This means that it is possible for something to follow $ in a match (with the appropriate modifiers):

say "a\nb\n" =~ m/$ ./msx;

prints "1". The /x modifier permits embedded whitespace, so we can separate the $ from the . and keep the pair from being interpreted as a variable. The /s modifier lets . match the newline that follows the $.

Michael Carman
5

This code prints "broken pipe" for me, because perl doesn't read any input here. It also uses the undefined variable $. (you'll see a warning if you add the -w switch to perl). $. holds the current line number when reading lines with <...>, which is why it's undefined in this example:

## matching pattern will look like m//m
printf 'a\nb\n' | perl -0777 -e '$b = "a\nb\n"; say $b =~ m/$./m'

The following code reads the piped data but does not match, because $. becomes 1 after <>, so the pattern becomes m/1/m:

## matching pattern will be m/1/m, which is not found in the value of $b
printf 'a\nb\n' | perl -0777 -e '$b = "a\nb\n"; <>; say $b =~ m/$./m'

Update:

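Use m'$.'m (single-quote delimiters disable variable interpolation) or m/$ ./mx (thanks to Michael Carman; the space keeps $ and . from being read as the variable $.) to get the pattern "end of line followed by any character".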

Ivan Nevostruev
  • Of course! Special variable `$.`! How would one write the regexp, then? Using `m'$.'` ? – JB. Dec 04 '09 at 17:03
  • `/\$./` parses as "literal '$' followed by any character". You can use the `/x` modifier to separate them, though: `m/$ ./x` – Michael Carman Dec 04 '09 at 17:21
0

In the first instance I believe it's because of the -n switch. In other words

printf 'a\nb\n' | perl -0777 -ne 'print if m/$./m'

causes $_ to get the value a\n the first time through the loop and b\n the second time, so clearly that won't match. With /m, on the other hand, $ matches the \n in the second example, so that's why that one matches.

With the latter two examples, I'm still working on that :)

Edit: Wow I had that completely wrong, and I think you may have also. The issue is that m/$./m is not an end-of-line followed by a wildcard, but rather the variable $. interpolated as a regular expression. Yikes!

Dan