
I am looking to write a Rust backend for my library, and I need to implement the equivalent of the following function in pyo3:

def f(x):
    return x

This should return the same object as the input, and the function getting the return value should hold a new reference to the input. If I were writing this in the C API I would write it as:

PyObject * f(PyObject * x) {
    Py_XINCREF(x);
    return x;
}
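For reference, the behaviour I am after can be observed from pure Python with `sys.getrefcount`. This is only a sketch of the expected semantics on CPython (the exact counts are an implementation detail, so only the differences are compared), not pyo3 code:

```python
import sys

def f(x):
    # Pure-Python reference version of the function from the question.
    return x

a = object()
before = sys.getrefcount(a)   # references held before calling f
b = f(a)                      # the caller now holds one more reference via `b`
same = b is a                 # the result is the very same object
during = sys.getrefcount(a)
del b                         # dropping the result releases that reference
after = sys.getrefcount(a)

print(same, during - before, after - before)
```

Whatever the Rust version does internally, it should be indistinguishable from this when called from Python.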

In PyO3, I find it quite confusing to navigate the differences between PyObject, PyObjectRef, &PyObject, Py<PyObject>, Py<&PyObject>.

The most naive version of this function is:

extern crate pyo3;

use pyo3::prelude::*;

#[pyfunction]
pub fn f(_py: Python, x: &PyObject) -> PyResult<&PyObject> {
    Ok(x)
}

Among other things, the lifetimes of x and the return value are not the same, and I see no opportunity for pyo3 to increase the reference count for x. In fact, the compiler seems to agree with me:

error[E0106]: missing lifetime specifier
 --> src/lib.rs:4:49
  |
4 | pub fn f(_py: Python, x: &PyObject) -> PyResult<&PyObject> {
  |                                                 ^ expected lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `_py` or `x`

There may be a way for me to manually increase the reference count using the _py parameter and use lifetime annotations to make the compiler happy, but my impression is that pyo3 intends to manage reference counts itself using object lifetimes.

What is the proper way to write this function? Should I be attempting to wrap it in a Py container?

Paul
  • @Shepmaster I think your edit makes the title less accurate. The fact that the reference count is increased is *not* something I want to manage or think about in the Rust version. If pyo3 does something else that is equivalent to increasing the reference count that's fine. – Paul Sep 27 '18 at 13:53
  • @Shepmaster It's not really a hard requirement. If `pyo3` keeps a single reference to any Python object created and then manages whether or not to release that reference using Rust object lifetimes or something, that would be fine. I don't really know or care, so long as I'm using the correct lifetime management from a pyo3 perspective. – Paul Sep 27 '18 at 13:58
  • *the lifetimes of `x` and the return value are not the same*: why do you say this? [Lifetime elision](https://doc.rust-lang.org/book/second-edition/ch10-03-lifetime-syntax.html#lifetime-elision) makes them the same. You could explicitly annotate them to "prove" it: `fn f<'a>(_py: Python, x: &'a PyObject) -> PyResult<&'a PyObject>`. – Shepmaster Sep 27 '18 at 15:01
  • @Shepmaster Doing that seems to result in compiler errors anyway, because `pyfunc` is not retaining lifetime parameters. – SE_net4 the downvoter Sep 27 '18 at 15:03
  • *no opportunity for `pyo3` to increase the reference count for `x`* — what prevents it from increasing the reference *before* it hands it to you? – Shepmaster Sep 27 '18 at 15:03
  • @Shepmaster I imagine it would have to increase it in the `#[pyfunc]` macro by detecting that the function returns a reference, rather than before it is handed to the function itself, since not all functions increase the reference count of their input parameters. – Paul Sep 27 '18 at 15:05
  • As to the lifetime question, that was the gist of the compiler error - this may be the issue @E_net4 was talking about, but I think the point of Python's reference counting is that the lifetime of the input and output should be disconnected (hence the reference counting), so even if the Rust semantics say it has the correct lifetime *as declared*, that's not how the function is intended to be used. – Paul Sep 27 '18 at 15:07

2 Answers


A PyObject is a simple wrapper around a raw pointer:

pub struct PyObject(*mut ffi::PyObject);

It has multiple creation functions, each corresponding to different kinds of pointers that we might get from Python. Some of these, such as from_borrowed_ptr, call Py_INCREF on the passed-in pointer.

Thus, it seems like we can accept a PyObject, so long as it was created in the "right" manner.

If we expand this code:

#[pyfunction]
pub fn example(_py: Python, x: PyObject) -> PyObject {
    x
}

We can see this section of code that calls our function:

let mut _iter = _output.iter();
::pyo3::ObjectProtocol::extract(_iter.next().unwrap().unwrap()).and_then(
    |arg1| {
        ::pyo3::ReturnTypeIntoPyResult::return_type_into_py_result(example(
            _py, arg1,
        ))
    },
)

Our argument is created by a call to ObjectProtocol::extract, which in turn calls FromPyObject::extract. This is implemented for PyObject by calling from_borrowed_ptr.

Thus, using a bare PyObject as the argument type will correctly increment the reference count.

Likewise, when a PyObject is dropped in Rust, it will automatically decrease the reference count. When it is returned back to Python, ownership is transferred and it is up to the Python code to update the reference count appropriately.


All investigation was done against commit ed273982 from the master branch, corresponding to v0.5.0-alpha.1.

Shepmaster

According to the other answer, pyo3 takes care of building additional boilerplate around our functions in order to keep track of Python reference counting. In particular, the counter is already incremented when passing the object as an argument to the function. Nevertheless, the clone_ref method can be used to explicitly create a new reference to the same object, which will also increment its reference counter.

The output of the function must still be an actual Python object rather than a reference to it (which seems reasonable, as Python does not understand Rust references; pyo3 seems to ignore lifetime parameters in these functions).

#[pyfunction]
fn f(py: Python, x: PyObject) -> PyResult<PyObject> {
    Ok(x.clone_ref(py))
}

From playing around with the function in Python land (AKA not a serious testbed), it at least seems to work as intended.

from dummypy import f

def get_object():
    return f("OK")

a = [1, 2, 3]

if True:
    b = f(a)
    assert b is a
    b[0] = 9001

print(a)

x = get_object()
print(x)
SE_net4 the downvoter
  • The simplest test would be to call `a = object(); assert f(a) is a`, since the important part is that the input should be the same object as the output. – Paul Sep 27 '18 at 16:28
  • @Paul I guess that could be added as well. But I was mostly attempting to check whether the reference count is consistent with the number of variables pointing at the object. – SE_net4 the downvoter Sep 27 '18 at 16:30