Is there a way to convert npz files to a pandas DataFrame?

Question

I have a large npz file that I've loaded with NumPy's np.load. I want to convert it to a pandas DataFrame so I can apply machine learning algorithms (KNN, K-Means, DT) using scikit-learn. I am new to Python, so my experience with this library is very limited. Thank you for the help.

This is what I have so far:

dataset = np.load('./example.npz')

test_data = dataset['data']

test_labels = dataset['labels']

print(test_data.shape) gives (17000, 78400)

print(test_labels.shape) gives (17000, 1)

Try to refer to this https://stackoverflow.com/a/51308247/8185479 — Abhiram Satputé, Nov 30 '19 at 05:15
I'm pretty sure that scikit-learn will work with `numpy.ndarray` objects — juanpa.arrivillaga, Nov 30 '19 at 05:31
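
To illustrate the comment above, here is a minimal sketch of fitting a scikit-learn estimator directly on NumPy arrays, with no DataFrame involved. The array sizes and random values are made up for the example (stand-ins for dataset['data'] and dataset['labels']), and n_neighbors=3 is an arbitrary choice.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-ins for dataset['data'] and dataset['labels']:
# 20 samples x 5 features, and an (n, 1) column of class labels.
rng = np.random.default_rng(0)
X = rng.random((20, 5))
y = rng.integers(0, 2, size=(20, 1))

knn = KNeighborsClassifier(n_neighbors=3)
# scikit-learn expects 1-D targets, so flatten the (n, 1) label array.
knn.fit(X, y.ravel())

pred = knn.predict(X[:4])
print(pred.shape)
```

The only reshaping needed here is ravel() on the labels, since an (n, 1) target array triggers a shape warning in scikit-learn.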

score 2 · Accepted Answer · answered Mar 01 '20 at 15:35

I'm not sure how you want to structure your dataframe, but this will load the npz file with the array names (the keys in npz.files) as the index:

import pandas as pd
import numpy as np

npz = np.load('/path/to/npz.npz')
df = pd.DataFrame.from_dict({item: npz[item] for item in npz.files}, orient='index')

If you want to load the arrays into a single column, use:

pd.DataFrame.from_dict({item: [npz[item]] for item in npz.files}, orient='index')

Just drop orient='index' if you want the array names to become the columns instead.
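
For the layout described in the question (a 2-D data array plus an (n, 1) labels array), another possible structure is one row per sample with the label as an extra column. The following is only a sketch under assumptions: it writes a small synthetic example.npz to a temporary directory, the sizes are shrunk to 6 x 4, and the column names f0, f1, ... and label are invented for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Build a small stand-in for the question's file (hypothetical sizes:
# 6 samples x 4 features instead of 17000 x 78400).
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "example.npz")
np.savez(path, data=rng.random((6, 4)), labels=rng.integers(0, 3, size=(6, 1)))

# Indexing the NpzFile reads each array fully into memory.
with np.load(path) as npz:
    data = npz["data"]      # shape (6, 4)
    labels = npz["labels"]  # shape (6, 1)

# One row per sample, one column per feature.
df = pd.DataFrame(data, columns=[f"f{i}" for i in range(data.shape[1])])
# Flatten the (n, 1) label array to 1-D before adding it as a column.
df["label"] = labels.ravel()

print(df.shape)
```

With this layout, df.drop(columns="label") gives the feature matrix and df["label"] gives the targets.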

score -2 · Answer 2 · answered Nov 30 '19 at 05:11

Please try out this:

import pandas as pd
df = pd.DataFrame(dataset)

Lakshmi - Intel
