It was noted in another question that wrapping the result of a PHP function call in parentheses can somehow convert the result into a fully-fledged expression, such that the following works:

<?php
error_reporting(E_ALL | E_STRICT);

function get_array() {
   return array();
}

function foo() {
   // return reset(get_array());
   //              ^ error: "Only variables should be passed by reference"

   return reset((get_array()));
   //           ^ OK
}

foo();

I'm trying to find anything in the documentation to explicitly and unambiguously explain what is happening here. Unlike in C++, I don't know enough about the PHP grammar and its treatment of statements/expressions to derive it myself.

Is there anything hidden in the documentation regarding this behaviour? If not, can somebody else explain it without resorting to supposition?


Update

I first found this EBNF purporting to represent the PHP grammar, and tried to decode my scripts myself, but eventually gave up.

Then, using phc to generate a .dot file of the two foo() variants, I produced AST images for both scripts using the following commands:

$ yum install phc graphviz
$ phc --dump-ast-dot test1.php > test1.dot
$ dot -Tpng test1.dot > test1.png
$ phc --dump-ast-dot test2.php > test2.dot
$ dot -Tpng test2.dot > test2.png

In both cases the result was exactly the same:

[Image: parse tree of snippets 1 and 2]

Lightness Races in Orbit
  • It looks like this applies exclusively to expressions in the form of a single function call. – hakre Jul 17 '11 at 20:41
  • `Array()` with uppercase A? afaik, the language construct is written `array()` – knittl Jul 17 '11 at 20:47
  • PHP, hence not case-sensitive. – Wrikken Jul 17 '11 at 20:55
  • @knittl: It's not case-sensitive, and I prefer `Array`. – Lightness Races in Orbit Jul 17 '11 at 20:59
  • @wrikken @tomalak: only variables (user code) are case sensitive? didn't know that! learnt something new today – knittl Jul 17 '11 at 21:04
  • @knittl: Yea, [pretty much just variable names](http://codepad.org/NVPG1zjv). – Lightness Races in Orbit Jul 17 '11 at 21:09
  • The reason only a single function call can behave this way is that only a variable or a single function returning by reference _can_ be correct input for `reset`. A variable obviously will always work by reference, which leaves us with the function call, which is only checked at execution because of the possibility of having something like `$variablewithafunctionname()`. Why the `()` would make `reset` not complain... That would mean at the time `reset` gets its input it _is_ a reference (refcount > 1), which would mean the expression `(get_array())` leaves some zval in memory... – Wrikken Jul 17 '11 at 21:31
  • Digging a bit further, the strict warning comes out of the VM part/runtime. The fatal error (not in the Q's example; one would be `return reset((get_array()?:0));`) is raised at compile time already, and its wording is much harsher: *"Fatal error: Only variables can be passed by reference"* (and wrong: if a function returns a reference it's all fine). Many flags are checked before the strict notice is given; I suspect the answer lies somewhere in there, but I do not know much about PHP internals: php-trunk/Zend/zend_vm_execute.h line 10853~ – hakre Jul 17 '11 at 21:38
  • @Wrikken: For the zval idea: It needs no refcount at all, a normal variable wouldn't have any as well. So just a zval (refcount >= 0) should do it. – hakre Jul 17 '11 at 21:45
  • Hmm, point there. And a return from a function has a minimum refcount of 1.. Dang. – Wrikken Jul 17 '11 at 21:46
  • Of course the `reset((get_array()?:0));` is an error at compile time because of the `0`. try something like `reset((get_array()?:$var));`: that can have proper outcomes, but still yields a fatal. – Wrikken Jul 17 '11 at 21:52
  • Some of PHP's reference-handling internals are explained in great detail here, in case someone can find something there: http://derickrethans.nl/talks/phparch-php-variables-article.pdf – regilero Jul 17 '11 at 22:07
  • @regilero: Good heavens; I'll definitely have to give that a read when I get a chance. Thanks! – Lightness Races in Orbit Jul 17 '11 at 22:15
  • @Wrikken: My fault, there is no refcount = 0 for a var, it's always 1 minimum. `debug_zval_dump(get_array());` gives a refcount of one, btw.; using parentheses makes no difference, but this can be misleading. – hakre Jul 17 '11 at 23:49
  • @Wrikken: debug_zval_dump always gives at least one refcount, the one from the function parameter. And effectively `debug_zval_dump(get_array());` and `debug_zval_dump((get_array()));` give the same result, except the first one generates a STRICT notice. – regilero Jul 18 '11 at 08:45
  • Yup, and changing the return of a `get_array` to a reference does not yield any usable results either. I don't think we can get any usable info in PHP, so some brave soul will have to delve through the spaghetti that is the PHP C-source to get a definitive answer. – Wrikken Jul 18 '11 at 10:03

2 Answers

This behavior could be classified as a bug, so you should definitely not rely on it.

The (simplified) conditions for the message not to be thrown on a function call are as follows (see the definition of the opcode ZEND_SEND_VAR_NO_REF):

  • the argument is not a function call (or if it is, it returns by reference), and
  • the argument is either a reference or it has reference count 1 (if it has reference count 1, it's turned into a reference).
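
The two simplified conditions can be written down as a small truth function. This is only an illustrative model of the rules as stated above, not Zend Engine code; the function name and its parameters are hypothetical and exist purely for this sketch.

```php
<?php
// Illustrative model only; not Zend Engine code. All names are hypothetical.
function noticeIsSuppressed(
    bool $seenAsFunctionCall,
    bool $returnsByReference,
    bool $isReference,
    int $refcount
): bool {
    // Condition 1: not flagged as a function call, or the call returns by reference.
    $callCheck = !$seenAsFunctionCall || $returnsByReference;

    // Condition 2: already a reference, or a value with reference count 1.
    $valueCheck = $isReference || $refcount === 1;

    return $callCheck && $valueCheck;
}

// reset(get_array()): flagged as a by-value function call.
$plainCall = noticeIsSuppressed(true, false, false, 1);

// reset((get_array())): no longer flagged, fresh array with reference count 1.
$parenthesized = noticeIsSuppressed(false, false, false, 1);

// reset((get_array())) where the array is also held elsewhere (reference count 2).
$sharedArray = noticeIsSuppressed(false, false, false, 2);
```

Under this model only the parenthesized call on an unshared array passes both checks, which matches the cases analyzed in this answer.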

Let's analyze these in more detail.

First point is true (not a function call)

Due to the additional parentheses, PHP no longer detects that the argument is a function call.

When parsing a non-empty function argument list, PHP has three possibilities:

  • An expr_without_variable
  • A variable
  • (A & followed by a variable, for the removed call-time pass by reference feature)

When writing just get_array() PHP sees this as a variable.

(get_array()) on the other hand does not qualify as a variable. It is an expr_without_variable.

This ultimately affects the way the code compiles, namely the extended value of the opcode SEND_VAR_NO_REF will no longer include the flag ZEND_ARG_SEND_FUNCTION, which is the way the function call is detected in the opcode implementation.
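
The other branch of the first condition, a call to a function that returns by reference, can be tried without any extra parentheses. A minimal sketch, assuming a hypothetical helper `get_array_ref()` (not part of the question's code) and an error handler that does nothing but collect messages:

```php
<?php
// Hypothetical helper: hands out its static array by reference.
function &get_array_ref() {
    static $data = array('first', 'second');
    return $data;
}

// Collect any diagnostics raised while reset() is called.
$messages = array();
set_error_handler(function ($severity, $message) use (&$messages) {
    $messages[] = $message;
    return true; // mark the error as handled
});

// No extra parentheses: the call itself yields a reference.
$first = reset(get_array_ref());

restore_error_handler();
```

Because the call yields a reference, `reset()` should have nothing to complain about here, so `$messages` is expected to stay empty; inspecting it is a simple way to check this on a given PHP version.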

Second point is true (the reference count is 1)

At several points, the Zend Engine allows non-references with reference count 1 where references are expected. These details should not be exposed to the user, but unfortunately they are here.

In your example you're returning an array that's not referenced from anywhere else. If it were, you would still get the message, i.e. this second point would not be true.

So the following very similar example does not work:

<?php

$a = array();
function get_array() {
   return $GLOBALS['a'];
}

return reset((get_array()));
NikiC

A) To understand what's happening here, one needs to understand PHP's handling of values/variables and references (PDF, 1.2MB). As stated throughout the documentation: "references are not pointers"; and you can only return variables by reference from a function - nothing else.

In my opinion, that means any function in PHP returns a reference. But some (built-in) PHP functions require values/variables as arguments. Now, if you nest function calls, the inner one returns a reference, while the outer one expects a value. This leads to the 'famous' E_STRICT error "Only variables should be passed by reference".

$fileName = 'example.txt';
$fileExtension = array_pop(explode('.', $fileName));
// will result in Error 2048: Only variables should be passed by reference in…

B) I found a line in the PHP-syntax description linked in the question.

expr_without_variable = "(" expr ")"

In combination with this sentence from the documentation: "In PHP, almost anything you write is an expression. The simplest yet most accurate way to define an expression is 'anything that has a value'.", this leads me to the conclusion that even (5) is an expression in PHP, which evaluates to an integer with the value 5.

(As $a = 5 is not only an assignment but also an expression, which evaluates to 5.)

Conclusion

If you pass a reference to the expression (...), this expression will return a value, which then may be passed as argument to the outer function. If that (my line of thought) is true, the following two lines should work equivalently:

// what I've used over years: (spaces only added for readability)
$fileExtension = array_pop( ( explode('.', $fileName) ) );
// vs
$fileExtension = array_pop( $tmp = explode('.', $fileName) );
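
For comparison, a variant that does not depend on how the argument expression is parsed: give the intermediate array a name first, so `array_pop()` receives an ordinary variable. This is a sketch rather than a statement about either line above; `$parts` and `$viaPathinfo` are names introduced here for illustration, and `pathinfo()` is shown only as an alternative way to get the same piece of information.

```php
<?php
$fileName = 'example.txt';

// Name the intermediate array so array_pop() gets a real variable.
$parts = explode('.', $fileName);
$fileExtension = array_pop($parts);

// Alternative that avoids by-reference arguments altogether.
$viaPathinfo = pathinfo($fileName, PATHINFO_EXTENSION);
```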

See also PHP 5.0.5: Fatal error: Only variables can be passed by reference; 13.09.2005

feeela
  • but from this doc page: http://www.php.net/manual/en/language.references.pass.php it seems expressions cannot be used "as the result is undefined". I wonder if the whole parenthesis trick is not just bypassing internal checks, and may become in long term an undefined application result thing. – regilero Jul 17 '11 at 22:58
  • Well this post is highly speculative. In the absence of a documentation (and I've searched for more than one hour, knowing how to use search engines), this is the best I can provide. My idea was, to commonly create a documentation for that behavior as a SO-wiki-entry. – feeela Jul 17 '11 at 23:11
  • FWIW, `(5)` would be an expression in pretty much all C-like languages. – Lightness Races in Orbit Jul 17 '11 at 23:23
  • 2
    -1 "IMO, that means, any function in PHP will return a reference." is not true. The answer has some passages that are true but the conclusions never follow from those passages. – Artefacto Jul 18 '11 at 14:03
  • I think "Bug" is not a classification at all. It's a non-precise term that often is congruent being a "Feature". Would you classify the described behaviour as a ["Fault"](http://stackoverflow.com/questions/494498/what-is-a-software-fault-in-testing) in the PHP programming language? – hakre Aug 30 '11 at 00:28
  • @hakre *grin*, no it's a nice feature—also questioned on the PHP bug-report: "Should we make use of this feature when programming PHP or not?" – feeela Aug 30 '11 at 09:12
  • @feeela: That was hakre's comment, was it not? – Lightness Races in Orbit Sep 12 '11 at 07:22
  • @Tomalak Geret'kal Not really sure what you mean, but yes I've cited hakres comment on the PHP-bug-page… (as linked above by himself). – feeela Sep 13 '11 at 10:28
  • @feeela: Just seemed like you were quoting something to hakre without realising that it was hakre who'd written it in the first place; seemed odd! :) – Lightness Races in Orbit Sep 13 '11 at 10:29