matlab data file to pandas DataFrame

Question

Is there a standard way to convert MATLAB .mat files (MATLAB-formatted data) to a pandas DataFrame?

I am aware that a workaround is possible by using scipy.io, but I am wondering whether there is a more straightforward way to do it.

@MarkMikofski I do not think this is a duplicate of "Read .mat files in Python", which does not touch how to process the extracted data so that it can be put in a Pandas dataframe. — Post169, Aug 02 '18 at 18:50

Destrif · Accepted Answer · 2016-07-05T07:34:14.620

I found two ways: scipy or mat4py.

mat4py

Load data from MAT-file

The function loadmat loads all variables stored in the MAT-file into a simple Python data structure, using only Python’s dict and list objects. Numeric and cell arrays are converted to row-ordered nested lists. Arrays are squeezed to eliminate arrays with only one element. The resulting data structure is composed of simple types that are compatible with the JSON format.

Example: Load a MAT-file into a Python data structure:

from mat4py import loadmat
data = loadmat('datafile.mat')

From:

https://pypi.python.org/pypi/mat4py/0.1.0
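Since mat4py returns plain dicts and lists, getting to a DataFrame is mostly a matter of passing the dict to pandas. Here is a minimal sketch, assuming every variable in the file is a vector of the same length. The `data` dict below is a hypothetical stand-in for what `loadmat('datafile.mat')` might return, and `flatten_column` is a helper I made up to unwrap one-element rows in case a column vector comes back as a nested list:

```python
import pandas as pd

def flatten_column(values):
    """Turn [[1], [2], [3]] into [1, 2, 3]; leave already-flat lists unchanged."""
    return [v[0] if isinstance(v, list) and len(v) == 1 else v for v in values]

# Hypothetical stand-in for the result of mat4py.loadmat('datafile.mat')
data = {"time": [[0.0], [0.5], [1.0]], "voltage": [1.2, 1.3, 1.1]}

# One DataFrame column per MAT-file variable
df = pd.DataFrame({name: flatten_column(col) for name, col in data.items()})
print(df)
```

If the variables have different lengths, or the file holds nested structs, you would need to pick out the relevant fields first.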

Scipy:

Example:

import numpy as np
from scipy.io import loadmat  # this is the SciPy module that loads mat-files
from datetime import datetime
import pandas as pd

mat = loadmat('measured_data.mat')  # load mat-file
mdata = mat['measuredData']  # variable in mat file
mdtype = mdata.dtype  # dtypes of structures are "unsized objects"
# * SciPy reads in structures as structured NumPy arrays of dtype object
# * The size of the array is the size of the structure array, not the number
#   of elements in any particular field. The shape defaults to 2-dimensional.
# * For convenience make a dictionary of the data using the names from dtypes
# * Since the structure has only one element, but is 2-D, index it at [0, 0]
ndata = {n: mdata[n][0, 0] for n in mdtype.names}
# Reconstruct the columns of the data table from just the time series
# Use the number of intervals to test if a field is a column or metadata
columns = [n for n, v in ndata.items() if v.size == ndata['numIntervals']]  # .items() on Python 3
# now make a data frame, setting the time stamps as the index
df = pd.DataFrame(np.concatenate([ndata[c] for c in columns], axis=1),
                  index=[datetime(*ts) for ts in ndata['timestamps']],
                  columns=columns)

From:

http://poquitopicante.blogspot.fr/2014/05/loading-matlab-mat-file-into-pandas.html
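For the simpler case where the MAT-file holds plain numeric vectors rather than a struct, something like the following may be enough. This is only a sketch: the file is written with `savemat` first so the example is self-contained, and the variable names `t` and `y` are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy.io import loadmat, savemat

# Write a small demo file so the example runs on its own
path = os.path.join(tempfile.mkdtemp(), "demo.mat")
savemat(path, {"t": np.array([0.0, 1.0, 2.0]), "y": np.array([10.0, 20.0, 30.0])})

# squeeze_me=True drops the singleton dimension, so 1x3 arrays become 1-D
mat = loadmat(path, squeeze_me=True)

# loadmat also returns metadata keys (__header__, __version__, __globals__); skip them
variables = {k: v for k, v in mat.items() if not k.startswith("__")}

df = pd.DataFrame(variables)
print(df)
```

This assumes all variables have the same length; otherwise pandas will raise an error when building the frame.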

Finally, PyHogs has a worked example, which also relies on scipy:

Reading complex .mat files.

This notebook shows an example of reading a Matlab .mat file, converting the data into a usable dictionary with loops, and making a simple plot of the data.

http://pyhogs.github.io/reading-mat-files.html

The `scipy.io` and `mat4py` modules cannot read Matlab v7.3+ HDF5 datafiles. — SebMa, Jul 02 '18 at 17:25
Known limitations for mat4py: * Arrays with more than 2 dimensions [important] * Arrays with complex numbers [important] * Sparse arrays [important] * Function arrays * Object classes * Anonymous function classes https://pypi.org/project/mat4py/ — Suhas C, Jan 09 '20 at 06:49
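Regarding the v7.3 limitation mentioned in the comments: those files are HDF5 containers, so a generic HDF5 reader such as h5py can open them. Below is a rough sketch, not a general solution. It assumes the file contains only top-level numeric vectors of equal length; `read_v73_numeric` is a hypothetical helper name, and the demo file is written with h5py as a stand-in for a file saved from MATLAB with `-v7.3`.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd

def read_v73_numeric(path):
    """Read top-level numeric vectors from an HDF5-based MAT-file into a DataFrame."""
    columns = {}
    with h5py.File(path, "r") as f:
        for name, item in f.items():
            # Skip groups (structs and MATLAB bookkeeping) and non-numeric datasets
            if isinstance(item, h5py.Dataset) and item.dtype.kind in "fiu":
                # MATLAB is column-major, so arrays show up transposed in h5py
                columns[name] = item[()].T.ravel()
    return pd.DataFrame(columns)

# Stand-in file: a 3x1 MATLAB column vector would appear with shape (1, 3)
path = os.path.join(tempfile.mkdtemp(), "demo_v73.mat")
with h5py.File(path, "w") as f:
    f["t"] = np.array([[0.0, 1.0, 2.0]])
    f["y"] = np.array([[10.0, 20.0, 30.0]])

df = read_v73_numeric(path)
print(df)
```

Structs, cell arrays and strings are stored as groups or object references in v7.3 files and would need extra handling that this sketch does not attempt.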

score 10 · Answer 2 · answered Jul 05 '16 at 07:24

Ways to do this:

As you mentioned, scipy:

import scipy.io as sio
test = sio.loadmat('test.mat')

Using the matlab engine:

import matlab.engine
eng = matlab.engine.start_matlab()
content = eng.load("example.mat", nargout=1)

SerialDev


Matlab says "you cannot run the MATLAB engine on a machine that only has the MATLAB Runtime" https://uk.mathworks.com/help/matlab/matlab-engine-for-python.html?w.mathworks.com – Suhas C Jan 09 '20 at 06:20
