28

I would like to return some data from c++ code as a numpy.array object. I had a look at boost::python::numeric, but its documentation is very terse. Can I get an example of e.g. returning a (not very large) vector<double> to python? I don't mind doing copies of data.

Saullo G. P. Castro
eudoxos
  • I agree its documentation is dreadful. They just copy the commentless header into their documentation page and don't show you the basics, i.e. getting data from STL collection into this object. – CashCow Jan 08 '13 at 11:57
  • The boost people are very clever, too clever for their own good. I go to their Wrapper concepts page and see nothing that makes sense. – CashCow Jan 08 '13 at 11:57
  • I found what I think is the best solution I've come across yet and posted it below. – CashCow Jan 09 '13 at 10:18

5 Answers

26

UPDATE: the library described in my original answer (https://github.com/ndarray/Boost.NumPy) has been integrated directly into Boost.Python as of Boost 1.63, and hence the standalone version is now deprecated. The text below now corresponds to the new, integrated version (only the namespace has changed).

Boost.Python now includes a moderately complete wrapper of the NumPy C-API. It's pretty low-level and mostly aimed at the harder problem of passing C++ data to and from NumPy without copying, but here's how you'd return a copied std::vector with it:

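#include <algorithm>  // std::copy
#include <vector>

#include "boost/python.hpp"
#include "boost/python/numpy.hpp"

namespace bp = boost::python;
namespace bn = boost::python::numpy;

std::vector<double> myfunc(...);

bn::ndarray mywrapper(...) {
    std::vector<double> v = myfunc(...);
    // Cast explicitly: brace-initializing from size_t is a narrowing conversion.
    Py_intptr_t shape[1] = { static_cast<Py_intptr_t>(v.size()) };
    // Allocate a NumPy-owned array, then copy the vector's contents into it.
    bn::ndarray result = bn::zeros(1, shape, bn::dtype::get_builtin<double>());
    std::copy(v.begin(), v.end(), reinterpret_cast<double*>(result.get_data()));
    return result;
}

BOOST_PYTHON_MODULE(example) {
    bn::initialize();  // must run before any other boost::python::numpy call
    bp::def("myfunc", mywrapper);
}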
jbosch
  • May be very nice if I could actually get to the code but github seems to be blocked here, or something else is wrong because I'm getting a broken link. Surely there must be a way to populate a boost::python::numeric::array with data from a simple std::vector without having to get some 3rd party library. It would help if boost's documentation actually gave you documentation on the member functions rather than reproducing the uncommented header. – CashCow Jan 08 '13 at 13:43
  • I can't make an edit because it's too minor, but it should be `bn::zeros`, not `bp::zeros`. – Gabriel Jul 23 '14 at 14:29
  • I could not make this work (Ubuntu 14.04). What would be an example for `(...)`? What is `bn::initialize()` supposed to do? Also the example seems outdated: when I try to include `boost/numpy.hpp` I get `fatal error: boost/numpy.hpp: No such file or directory` – mcExchange Jun 28 '17 at 13:23
20

A solution that doesn't require you to download any special 3rd party C++ library (but you need numpy).

#include <boost/python.hpp>
#include <numpy/ndarrayobject.h> // ensure you include this header
#include <vector>

// Note: import_array() must have been called in the module's init function,
// and (per the comments below) you may also need
// numeric::array::set_module_and_type("numpy", "ndarray").
boost::python::object stdVecToNumpyArray( std::vector<double> const& vec )
{
    npy_intp size = vec.size();

    /* const_cast is rather horrible but we need a writable pointer.
       In C++11, vec.data() will do the trick,
       but you will still need to const_cast.
     */
    double * data = size ? const_cast<double *>(&vec[0])
                         : static_cast<double *>(NULL);

    // create a PyObject * from pointer and data
    PyObject * pyObj = PyArray_SimpleNewFromData( 1, &size, NPY_DOUBLE, data );
    boost::python::handle<> handle( pyObj );
    boost::python::numeric::array arr( handle );

    /* The problem of returning arr is twofold: firstly, the user can modify
       the data, which will betray the const-correctness.
       Secondly, the lifetime of the data is managed by the C++ side and not by
       the lifetime of the numpy array. But we have a simple solution...
     */
    return arr.copy(); // copy the object. numpy owns the copy now.
}

Of course, you might write a generic function that takes a double * and a size, then call it from the vector version by extracting those two values. You could also write a template, but you'd need some kind of mapping from the data type to the NPY_TYPES enum.
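The type-to-enum mapping mentioned above could be sketched with a small traits template. This is only an illustration under stated assumptions: `npy_type_of`, `type_num_for` and the stand-in enum `TypeNum` are hypothetical names, and the enum exists solely so the sketch builds without the NumPy headers. In real code you would map to NumPy's own `NPY_DOUBLE`, `NPY_FLOAT`, `NPY_INT` and pass the result to `PyArray_SimpleNewFromData`.

```cpp
#include <cassert>

// Stand-in for NumPy's NPY_TYPES enum (assumption, for illustration only).
// In real code, drop this and use NPY_DOUBLE, NPY_FLOAT, NPY_INT directly.
enum TypeNum { TN_DOUBLE, TN_FLOAT, TN_INT };

// Primary template is deliberately left undefined: using an unmapped
// element type fails at compile time instead of yielding a wrong dtype.
template <typename T> struct npy_type_of;

template <> struct npy_type_of<double> { static constexpr TypeNum value = TN_DOUBLE; };
template <> struct npy_type_of<float>  { static constexpr TypeNum value = TN_FLOAT; };
template <> struct npy_type_of<int>    { static constexpr TypeNum value = TN_INT; };

// Hypothetical generic entry point taking a pointer and a size.
// A real version would forward the type number, pointer and size to NumPy.
template <typename T>
TypeNum type_num_for(const T* data, long size) {
    assert(size >= 0);
    (void)data;  // unused in this sketch
    return npy_type_of<T>::value;
}
```

With this in place, a templated `stdVecToNumpyArray<T>` would only need to replace the hard-coded `NPY_DOUBLE` with the looked-up value.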

CashCow
  • Thanks for this example. Just a heads up, I had to use numeric::array::set_module_and_type("numpy", "ndarray"); or I would get the python runtime error "ImportError: No module named 'Numeric' or its type 'ArrayType' did not follow the NumPy protocol" – PiQuer Aug 29 '13 at 14:51
  • Thanks @PiQuer, it helped – VforVitamin Dec 14 '14 at 00:23
  • Why are you `const_cast`ing if you can just make the argument a non-const reference? – rubenvb Apr 17 '15 at 11:52
  • @rubenvb Because we want the argument to be a const reference. We are not actually going to modify the data, but we need to workaround the fact that PyArray_SimpleNewFromData requires a double* – CashCow Apr 17 '15 at 11:56
  • Note that unlike many of my answers on StackOverflow this was a situation where I actually needed it, came here, found the question but no adequate answer. Then worked it out and came back to post it. – CashCow Apr 17 '15 at 11:59
  • Ah, I see. Bad API needs `const_cast`... When will we ever see the end of that. – rubenvb Apr 17 '15 at 12:00
  • I don't really know if it's a bad API because Python has no concept of const so a numpy array is always modifiable. However before we actually let Python users use our object we duplicate it, creating a new copy that they can modify happily without worrying our own data. – CashCow Apr 17 '15 at 12:01
  • Can you not avoid the `const_cast` by just creating a numpy array that owns its own memory using `PyArray_SimpleNew` then copying the vector's data into it? – Eddy Ferreira Apr 22 '15 at 18:27
  • I use your method, but declaring 'double data[4] ={1,2,3,4}' in the function body. I got a segmentation fault. – kevin May 14 '15 at 03:55
  • Do that, pass 4 as the size in the call to PyArray_SimpleNewFromData, and keep the arr.copy() at the end; that should work. Failure to duplicate your object will indeed lead to undefined behaviour if they try using it, as it's local to the function and will not be valid anymore. – CashCow May 15 '15 at 09:36
  • Apparently `.copy()` during return is not necessary. Simply `return arr` works too -> Avoiding this copy operation might help to gain performance – mcExchange Jul 05 '17 at 09:37
  • arr.copy() is necessary the way I did it for the 2 reasons I specified. That makes the data belong to python object in a way that it can be modified and its lifetime is determined by Python and not the vector from which it got its data. – CashCow Jul 05 '17 at 09:39
10

It's a bit late, but after many unsuccessful tries I found a way to expose C++ arrays as numpy arrays directly. Here is a short C++11 example using boost::python and Eigen:

#include <numpy/ndarrayobject.h>
#include <boost/python.hpp>

#include <Eigen/Core>

// c++ type
struct my_type {
  Eigen::Vector3d position;
};


// wrap c++ array as numpy array
// (the array does not own `data`: the C++ object must outlive it)
static boost::python::object wrap(double* data, npy_intp size) {
  using namespace boost::python;

  npy_intp shape[1] = { size }; // array size
  PyObject* obj = PyArray_New(&PyArray_Type, 1, shape, NPY_DOUBLE, // data type
                              NULL, data, // data pointer
                              0, NPY_ARRAY_CARRAY, // NPY_ARRAY_CARRAY_RO for readonly
                              NULL);
  handle<> array( obj );
  return object(array);
}



// module definition
BOOST_PYTHON_MODULE(test)
{
  using namespace boost::python;

  // numpy requires this. The import_array() macro contains a `return`
  // statement that does not compile in this void init function on Python 3
  // (see the comments below), so call the underlying _import_array() directly.
  if (_import_array() < 0) {
    throw_error_already_set();
  }

  // wrapper for my_type
  class_< my_type >("my_type")
    .add_property("position", +[](my_type& self) -> object {
        return wrap(self.position.data(), self.position.size());
      });

}

The example describes a "getter" for the property. For the "setter", the easiest way is to assign the array elements manually from a boost::python::object using a boost::python::stl_input_iterator<double>.
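As for the `+[]` in the getter above: a non-capturing lambda has its own closure type, and applying unary plus forces its implicit conversion to a plain function pointer, from which a signature can be deduced. The standalone sketch below shows just that language rule, with no Boost involved; `getter_fn` and `call_getter` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Plain function-pointer type (hypothetical, standing in for a getter).
typedef int (*getter_fn)(int);

// Helper that only accepts a function pointer.
inline int call_getter(getter_fn f, int x) {
    assert(f != nullptr);
    return f(x);
}

// A non-capturing lambda has its own unnamed closure (class) type...
static auto closure = [](int x) { return x + 1; };
// ...while unary plus triggers the conversion to a function pointer.
static auto as_pointer = +[](int x) { return x + 1; };

static_assert(!std::is_pointer<decltype(closure)>::value,
              "the closure itself is not a pointer");
static_assert(std::is_same<decltype(as_pointer), int (*)(int)>::value,
              "unary plus yields int(*)(int)");
```

Both `closure` and `as_pointer` can be passed to `call_getter`, because the parameter type is spelled out there. The plus sign matters when the callee is a template that has to deduce the signature from the argument type, which a closure type does not expose.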

max
  • Could you tell me how to setup my project to be able to use the numpy header? Do I need to compile some libraries? Or is it enough to include the numpy header? – NOhs Jan 26 '16 at 11:43
  • I got the numpy header directory using: `python -c "import numpy; print(numpy.get_include())"` – max Jan 26 '16 at 16:32
  • Ok. That worked, thanks. but the compiler complains that import_array() is returning a value, while init_module_... is a 'void' function. – NOhs Jan 26 '16 at 16:39
  • Ok, so it seems to be related to how the `import_array()` macro was changed from Python 2 to Python 3 to now return something. Here is an (ugly) solution that keeps it version independent: https://mail.scipy.org/pipermail/numpy-discussion/2010-December/054345.html – NOhs Jan 26 '16 at 17:06
  • finally someone got it right! With a comprehensive example! Thank You! – Reza Toghraee Feb 29 '16 at 21:21
  • what is this notation I never saw before `+[](my_type& self)` in your code ? I'm talking about `+[]` or more specifically the `+` sign before the lambda capture `[]` ? – David Bellot Jun 05 '16 at 21:42
  • IIRC it's a hack to force a conversion to a function pointer from a (non-capturing) lambda expression. I am not sure if it is required by the standard (I'd say it's not), but it helped triggering the conversion on some compilers. Edit: found it: http://stackoverflow.com/questions/18889028/a-positive-lambda-what-sorcery-is-this – max Jun 06 '16 at 13:52
  • wow, what a trick. At least I can say that without it, my boost.python code will not compile. So in my case, the trick is necessary. Thanks – David Bellot Jun 08 '16 at 09:45
  • I get a segfault with this, which is unfortunate as this is precisely what I am looking for... ;/ – XapaJIaMnu Jan 06 '17 at 22:26
  • Apparently, `boost::python` now provides direct access to numpy arrays: http://www.boost.org/doc/libs/1_63_0/libs/python/doc/html/numpy/tutorial/index.html can't get it to link though :-/ – max Mar 11 '17 at 23:26
2

Doing it using the numpy api directly is not necessarily difficult, but I use boost::multi_array regularly for my projects and find it convenient to transfer the shapes of the array across the C++/Python boundary automatically. So, here is my recipe. Use http://code.google.com/p/numpy-boost/, or better yet, this version of the numpy_boost.hpp header, which is a better fit for multi-file boost::python projects, although it uses some C++11. Then, from your boost::python code, use something like this:

PyObject* myfunc(/*....*/)
{
   // If your data is already in a boost::multiarray object:
   // numpy_boost< double, 1 > to_python( numpy_from_boost_array(result_cm) );
   // otherwise:
   numpy_boost< double, 1> to_python( boost::extents[n] );
   std::copy( my_vector.begin(), my_vector.end(), to_python.begin() );

   PyObject* result = to_python.py_ptr();
   Py_INCREF( result );

   return result;
}
dsign
  • What would be the correct way to return a `py::object` (`py`=`boost::python`)? I have `PyObject* result=numpy_boost(numpy_from_boost_array(...)).py_ptr();` and `return py::object(py::handle<>(py::borrowed(o)));` but that crashes. Hint? – eudoxos May 29 '12 at 10:53
  • PS. the crash is at line 229 of the dropbox version, line `a = (PyArrayObject*)PyArray_SimpleNew(NDims, shape, detail::numpy_type_map::typenum);`. Strange. – eudoxos May 29 '12 at 11:05
  • @eudoxos You might have a problem with the PY_ARRAY_UNIQUE_SYMBOL and NO_IMPORT_ARRAY macros, as well as import_array, as your crash is exactly when the array is created, which needs a call (I think) through a certain pointer table that numpy needs (initialized with import_array()). – dsign May 29 '12 at 13:00
  • The link to the C++11 version is broken. Would you mind fixing that? – Zendel Aug 24 '17 at 00:27
1

I looked at the available answers and thought, "this will be easy". I then spent hours attempting what seemed like trivial examples/adaptations of the answers.

Then I implemented @max's answer exactly (had to install Eigen) and it worked fine, but I still had trouble adapting it. Most of my problems were silly syntax mistakes, but I was also using a pointer to a copied std::vector's data after the vector had apparently been destroyed.

In this example, a pointer to the std::vector is returned, but also you could return the size and data() pointer or use any other implementation that gives your numpy array access to the underlying data in a stable manner (i.e. guaranteed to exist):

class_<test_wrap>("test_wrap")
    .add_property("values", +[](test_wrap& self) -> object {
            return wrap(self.pvalues()->data(),self.pvalues()->size());
        })
    ;

For test_wrap with a std::vector<double> (normally pvalues() might just return the pointer without populating the vector):

class test_wrap {
public:
    std::vector<double> mValues;
    std::vector<double>* pvalues() {
        mValues.clear();
        for(double d_ = 0.0; d_ < 4; d_+=0.3)
        {
            mValues.push_back(d_);
        }
        return &mValues;
    }
};
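The lifetime pitfall described above (a pointer into a copied std::vector that no longer exists) can be sketched without any Python involved. This is a minimal illustration, not the code from the repository: `view` and `holder` are hypothetical names, with `holder` playing the role of test_wrap and `view` standing in for the pointer and size handed to wrap().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning view: a pointer plus a length, like the arguments to wrap().
struct view {
    double* data;
    std::size_t size;
};

class holder {  // hypothetical stand-in for test_wrap
public:
    holder() : values_(3, 1.5) {}

    // Returns a copy. Taking data() of the returned temporary is the unsafe
    // pattern, because the temporary is destroyed at the end of the statement:
    //   double* p = h.copy_values().data();   // p dangles from here on
    std::vector<double> copy_values() const { return values_; }

    // Safe pattern: point into the member, which lives as long as the holder
    // and stays at the same address as long as the vector is not resized.
    view stable_view() { return view{values_.data(), values_.size()}; }

    double at(std::size_t i) const {
        assert(i < values_.size());
        return values_[i];
    }

private:
    std::vector<double> values_;
};
```

A write through `stable_view().data` is visible via `at()`, which is the behaviour a numpy array wrapping that pointer relies on. The holder still has to outlive every array created from it.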

The full example is on Github so you can skip the tedious transcription steps and worry less about build, libs, etc. You should be able to just do the following and get a functioning example (if you have the necessary features installed and your path setup already):

git clone https://github.com/ransage/boost_numpy_example.git
cd boost_numpy_example
# Install virtualenv, numpy if necessary; update path (see below*)
cd build && cmake .. && make && ./test_np.py

This should give the output:

# cmake/make output
values has type <type 'numpy.ndarray'>
values has len 14
values is [ 0.   0.3  0.6  0.9  1.2  1.5  1.8  2.1  2.4  2.7  3.   3.3  3.6  3.9]

*In my case, I put numpy into a virtualenv as follows. This should be unnecessary if you can execute python -c "import numpy; print(numpy.get_include())" as suggested by @max:

# virtualenv, pip, path unnecessary if your Python has numpy
virtualenv venv
./venv/bin/pip install -r requirements.txt 
export PATH="$(pwd)/venv/bin:$PATH"

Have fun! :-)

sage