43

I am looking for a way to pass NumPy arrays to Matlab.

I've managed to do this by storing the array in an image using scipy.misc.imsave and then loading it with imread, but this of course causes the matrix to contain values between 0 and 255 instead of the 'real' values.

Dividing this matrix by 255 and multiplying it by the maximum value of the original NumPy array gives me the correct matrix, but I feel that this is a bit tedious.

Is there a simpler way?

Levon
  • 3
    I forget, does Matlab allow parsing text files? Because you could just format the numpy arrays as Matlab-style ones in strings, write them to a file, and then read the arrays into Matlab. – JAB Jun 12 '12 at 13:07
  • 1
    Did you consider mlabwrap http://mlabwrap.sourceforge.net/#description – dilip kumbham Jun 12 '12 at 13:12
  • 2
    are you sure you cannot do the calculation entirely in numpy/scipy? just wondering – Bort Jun 12 '12 at 13:14
  • I'm pretty sure that I would be able to convert the Matlab implementation of a PLSM algorithm to numpy, but to solve all the problems caused by off-by-ones and difference in functions is very time-consuming. Thanks for the tip @JAB, it's less tedious than converting it to an image first. However, I might come across 3D matrices later, so Joe's solution works out for me. –  Jun 12 '12 at 13:19
  • @dilipkumbham that python-to-matlab bridge looks promising! Will check it out. –  Jun 12 '12 at 13:22
  • 2
    MATLAB can read and write HDF5 format, and there are python libraries. .. – Memming Jun 12 '12 at 15:08
  • related question: https://www.mathworks.com/matlabcentral/answers/359680-can-someone-provide-me-an-example-of-loading-data-from-python-to-matlab – Charlie Parker Oct 04 '17 at 17:46

7 Answers

56

Sure, just use scipy.io.savemat

As an example:

import numpy as np
import scipy.io

x = np.linspace(0, 2 * np.pi, 100)
y = np.cos(x)

scipy.io.savemat('test.mat', dict(x=x, y=y))

Similarly, there's scipy.io.loadmat.

You then load this in matlab with load test.
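
To check what ends up in the file without leaving Python, the same data can be read back with scipy.io.loadmat. This is only a minimal sketch: the temporary directory is an arbitrary choice for illustration. Note that 1-D arrays come back as 2-D row vectors.

```python
import os
import tempfile

import numpy as np
import scipy.io

x = np.linspace(0, 2 * np.pi, 100)
y = np.cos(x)

# Write to a throwaway location; the path is only for illustration.
path = os.path.join(tempfile.mkdtemp(), "test.mat")
scipy.io.savemat(path, {"x": x, "y": y})

# loadmat returns a dict; besides the saved variables it also holds
# metadata entries such as '__header__'.
loaded = scipy.io.loadmat(path)

# 1-D inputs are stored as 2-D row vectors by default (oned_as='row').
print(loaded["x"].shape)
print(np.allclose(loaded["y"].ravel(), y))
```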

Alternatively, as @JAB suggested, you could just save things to an ASCII tab-delimited file (e.g. numpy.savetxt). However, you'll be limited to 2 dimensions if you go this route. On the other hand, ASCII is the universal exchange format. Pretty much anything will handle a delimited text file.
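
A rough sketch of the text-file route, assuming a tab delimiter and throwaway file names chosen for illustration. The second half shows one possible workaround for the 2-D limit: flatten the extra axes before saving and restore the shape after loading (the shape itself has to be passed along separately).

```python
import os
import tempfile

import numpy as np

a = np.arange(12, dtype=float).reshape(3, 4)
path = os.path.join(tempfile.mkdtemp(), "a.txt")

# Tab-delimited ASCII; the default format '%.18e' keeps full double precision.
np.savetxt(path, a, delimiter="\t")
b = np.loadtxt(path, delimiter="\t")

# A 3-D array has to be flattened to 2-D first; the original shape
# must be carried along separately and restored after loading.
c = np.arange(24, dtype=float).reshape(2, 3, 4)
path3d = os.path.join(tempfile.mkdtemp(), "c.txt")
np.savetxt(path3d, c.reshape(c.shape[0], -1), delimiter="\t")
c_back = np.loadtxt(path3d, delimiter="\t").reshape(c.shape)
```

On the MATLAB side, a file like this should be readable with the usual text readers (for instance readmatrix or dlmread).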

Joe Kington
11

A simple solution, without passing data by file or external libs.

Numpy has a method to convert ndarrays to lists, and MATLAB data types can be constructed from lists. So we can convert like this:

import numpy as np
import matlab  # provided by the MATLAB Engine API for Python
np_a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mat_a = matlab.double(np_a.tolist())

Going from MATLAB to Python requires more care. There is no built-in function to convert the type directly to a list, but we can access the raw data, which is flat rather than shaped. So we reshape it in column-major (Fortran) order, because MATLAB stores data column by column while numpy defaults to row by row. It is important to stress: test this in your own project, especially if you use arrays with more than 2 dimensions. I tested it with MATLAB 2015a and 2-dimensional arrays.

np_a = np.array(mat_a._data.tolist())
np_a = np_a.reshape(mat_a.size, order='F')
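
The effect of the column-major layout can be illustrated without MATLAB. In this sketch, a plain Python list stands in for the flat raw data (an assumption for illustration; no MATLAB object is involved). Reshaping with order='F' restores a non-square matrix, whereas a row-major reshape followed by a transpose only happens to work when the matrix is square.

```python
import numpy as np

original = np.array([[1, 2, 3],
                     [4, 5, 6]])
size = original.shape  # (2, 3)

# Stand-in for the flat, column-major buffer: columns laid end to end.
flat = original.ravel(order="F").tolist()  # [1, 4, 2, 5, 3, 6]

# Filling in column-major order restores the original layout.
restored = np.array(flat).reshape(size, order="F")

# A row-major reshape followed by a transpose gives the wrong shape here;
# it only coincides with the original for square matrices.
naive = np.array(flat).reshape(size).transpose()
```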
Juliano ENS
  • 1
    Note that `mat_a = matlab.double(np_a.tolist())` can be horribly inefficient/slow. Go with Joe Kington's answer for anything other than np arrays. See https://stackoverflow.com/a/45284125/2524427 – 5Ke Jul 24 '17 at 15:15
5

Here's a solution that avoids iterating in python, or using file IO - at the expense of relying on (ugly) matlab internals:

import numpy as np
import matlab
# This is actually `matlab._internal`, but matlab/__init__.py
# mangles the path making it appear as `_internal`.
# Importing it under a different name would be a bad idea.
from _internal.mlarray_utils import _get_strides, _get_mlsize

def _wrapper__init__(self, arr):
    assert arr.dtype == type(self)._numpy_type
    self._python_type = type(arr.dtype.type().item())
    self._is_complex = np.issubdtype(arr.dtype, np.complexfloating)
    self._size = _get_mlsize(arr.shape)
    self._strides = _get_strides(self._size)[:-1]
    self._start = 0

    if self._is_complex:
        self._real = arr.real.ravel(order='F')
        self._imag = arr.imag.ravel(order='F')
    else:
        self._data = arr.ravel(order='F')

_wrappers = {}
def _define_wrapper(matlab_type, numpy_type):
    t = type(matlab_type.__name__, (matlab_type,), dict(
        __init__=_wrapper__init__,
        _numpy_type=numpy_type
    ))
    # this tricks matlab into accepting our new type
    t.__module__ = matlab_type.__module__
    _wrappers[numpy_type] = t

_define_wrapper(matlab.double, np.double)
_define_wrapper(matlab.single, np.single)
_define_wrapper(matlab.uint8, np.uint8)
_define_wrapper(matlab.int8, np.int8)
_define_wrapper(matlab.uint16, np.uint16)
_define_wrapper(matlab.int16, np.int16)
_define_wrapper(matlab.uint32, np.uint32)
_define_wrapper(matlab.int32, np.int32)
_define_wrapper(matlab.uint64, np.uint64)
_define_wrapper(matlab.int64, np.int64)
_define_wrapper(matlab.logical, np.bool_)

def as_matlab(arr):
    try:
        cls = _wrappers[arr.dtype.type]
    except KeyError:
        raise TypeError("Unsupported data type")
    return cls(arr)

The observations necessary to get here were:

  • Matlab seems to only look at type(x).__name__ and type(x).__module__ to determine if it understands the type
  • It seems that any indexable object can be placed in the ._data attribute

Unfortunately, matlab is not using the _data attribute efficiently internally, and is iterating over it one item at a time rather than using the python memoryview protocol :(. So the speed gain is marginal with this approach.
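
The subclassing trick itself can be shown in isolation with plain Python. In this sketch, Payload is a hypothetical stand-in for a library type such as matlab.double, and define_wrapper and _fast_init are illustrative names; nothing here touches MATLAB.

```python
class Payload:
    """Hypothetical stand-in for a library type such as matlab.double."""

    def __init__(self, values):
        # Pretend this is the slow, element-by-element constructor.
        self._data = [float(v) for v in values]


def _fast_init(self, buffer):
    # Store the buffer as-is instead of copying item by item.
    self._data = buffer


def define_wrapper(base):
    # Build a subclass with the same __name__ as the base ...
    wrapper = type(base.__name__, (base,), {"__init__": _fast_init})
    # ... and report the same __module__, so a check that compares only
    # these two attributes cannot tell the two classes apart.
    wrapper.__module__ = base.__module__
    return wrapper


FastPayload = define_wrapper(Payload)
obj = FastPayload(range(3))
```

Here type(obj) is a distinct class from Payload, yet its __name__ and __module__ match, which is the property the wrapper above relies on.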

Eric
  • 2
    The speed up should be quite significant. I got about a factor of 15 with this method. https://stackoverflow.com/a/45290997/4045774 – max9111 Nov 06 '19 at 08:53
  • @max9111: Did you find that wrapping with `array.array` (as you do there) vs not doing so (as I do here) made any difference? – Eric Nov 06 '19 at 10:49
  • Yes, your version is faster ;). Maybe it would be good to clarify that your solution is actually drastically faster than the matlab.double(np_a.tolist()). – max9111 Nov 06 '19 at 13:43
  • For situations where you'd like to avoid the data duplication on the filesystem and any I/O related time, this seems like the best option. – NLi10Me Dec 17 '19 at 20:29
  • For a 100 x 100 x 100 array, this approach takes 2.73 ms ± 191 µs per loop (mean ± std. dev. of 7 runs, 100 loops each), but matlab.double takes 1.12 s ± 58.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each). Around 400x speed up... – ZK xxxxx Feb 18 '21 at 22:12
4

scipy.io.savemat and scipy.io.loadmat do NOT work for MATLAB v7.3 .mat files (those saved with the -v7.3 flag). The good part is that v7.3 .mat files are HDF5 files, so they can be read with a number of tools and loaded into numpy arrays.

For python, you will need the h5py extension, which requires HDF5 on your system.

import numpy as np, h5py 
f = h5py.File('somefile.mat','r') 
data = f.get('data/variable1') 
data = np.array(data) # For converting to numpy array
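
A self-contained sketch of the same read pattern follows. Since no MATLAB file is available here, the example first writes a small HDF5 file with h5py (an assumption for illustration; the file and dataset names are placeholders) and then reads it back. For files actually written by MATLAB, the axes typically arrive in reversed order, so a transpose is usually needed.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "somefile.h5")

# Build a small HDF5 file so the example is self-contained; a real
# -v7.3 .mat file written by MATLAB would take its place.
with h5py.File(path, "w") as f:
    f.create_dataset("data/variable1", data=np.arange(6.0).reshape(2, 3))

with h5py.File(path, "r") as f:
    # [()] reads the whole dataset into a numpy array before the file closes.
    data = f["data/variable1"][()]

# For a file written by MATLAB, the axes arrive reversed, so a
# transpose would restore MATLAB's row/column order.
transposed = data.T
```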
vikrantt
2

Some time ago I faced the same problem and wrote the following scripts to allow easy copy and pasting of arrays back and forth from interactive sessions. Obviously only practical for small arrays, but I found it more convenient than saving/loading through a file every time:

Matlab -> Python

Python -> Matlab
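
The scripts themselves are not reproduced here. As a rough idea of what the Python-to-MATLAB direction could look like, here is a minimal sketch (not the author's script) that formats a small 1-D or 2-D array as a MATLAB matrix literal for pasting into a session. The function name to_matlab_literal is hypothetical.

```python
import numpy as np


def to_matlab_literal(arr):
    """Format a 1-D or 2-D array as a MATLAB matrix literal string."""
    arr = np.atleast_2d(np.asarray(arr, dtype=float))
    if arr.ndim != 2:
        raise ValueError("only 1-D and 2-D arrays are supported")
    # Elements within a row are separated by spaces, rows by semicolons.
    rows = (" ".join(repr(float(v)) for v in row) for row in arr)
    return "[" + "; ".join(rows) + "]"


literal = to_matlab_literal(np.array([[1, 2], [3, 4.5]]))
print(literal)
```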

robince
2

Not sure if it counts as "simpler", but I found a fairly fast way to move data from a numpy array, created in a Python script that is called by MATLAB, into MATLAB:

dump_reader.py (python source):

import numpy

def matlab_test2():
    np_a    = numpy.random.uniform(low = 0.0, high = 30000.0, size = (1000,1000))
    return np_a

dump_read.m (matlab script):

clear classes
% Make sure the current folder is on Python's search path before importing.
if count(py.sys.path,'') == 0
    insert(py.sys.path,int32(0),'');
end
mod = py.importlib.import_module('dump_reader');
py.importlib.reload(mod);

tic
A = py.dump_reader.matlab_test2();
toc
shape = cellfun(@int64,cell(A.shape));
ls = py.array.array('d',A.flatten('F').tolist());
p = double(ls);
toc
C = reshape(p,shape);
toc

It relies on the fact that MATLAB's double seems to work efficiently on arrays compared to cells/matrices. The second trick is to pass the data to MATLAB's double in an efficient way (via Python's native array.array).
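
The packing step can also be done on the Python side before returning to MATLAB. The following sketch only shows what the flatten-and-pack step produces; the variable names are illustrative.

```python
import array

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)

# Flatten in column-major ('F') order, the order MATLAB's reshape fills in,
# and pack the values into a typed buffer of C doubles.
packed = array.array("d", a.flatten("F").tolist())

# The shape travels separately, to be handed to reshape on the MATLAB side.
shape = a.shape
```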

P.S. Sorry for necroposting, but I struggled a lot with this, and this topic was one of the closest hits. Maybe it helps someone shorten the time spent struggling.

P.P.S. tested with Matlab R2016b + python 3.5.4 (64bit)

Christian B.
0

Let us say you have 2D daily data with shape (365, 10) for five years, saved in a numpy array np3Darray with shape (5, 365, 10). In Python, save the array:

import scipy.io as sio     #SciPy module to load and save mat-files
m = {}                     #Dict mapping MATLAB variable names to arrays
m['np3Darray']=np3Darray   #shape(5,365,10)
sio.savemat('file.mat',m)  #Save np 3D array 

Then in MATLAB, convert the np 3D array to a MATLAB 3D matrix:

load('file.mat','np3Darray')
M3D=permute(np3Darray, [2 3 1]);   %Convert numpy array with shape (5,365,10) to MATLAB matrix with shape (365,10,5)
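
If reordering on the Python side is preferred, the same axis permutation can be sketched with numpy before saving. The dummy data below is only there to make the example runnable.

```python
import numpy as np

# Dummy stand-in for five years of (365, 10) daily data.
np3Darray = np.arange(5 * 365 * 10).reshape(5, 365, 10)

# Same reordering as MATLAB's permute(A, [2 3 1]), written with 0-based axes:
# the result's axes are (day, column, year).
M3D = np.transpose(np3Darray, (1, 2, 0))
```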
ASE