4

A Python package that I'm using stores its data in a single file with a .pkz extension. How would I unzip (?) this file to view the format of the data within?

Knowledge Cube
aardwolf
  • 1
    Which Python package are you talking about? – Knowledge Cube Jun 01 '17 at 19:00
  • 1
    Given that one of the first Google hits I found was "File extension pkz is related to the Kart Racing Pro, a realistic karting simulator from Piboso developed for Microsoft Windows operating system" and another was about "Winoncd Images Mask file", it seems that this isn't a very standard file extension and has different meanings in different contexts. What does "pkz" mean for you? You might need to look at the documentation of whatever is making these files. – John Coleman Jun 01 '17 at 19:00
  • 1
    @JohnColeman To be fair, that bit was not added by the OP, but someone else. I would have rolled that back if I could. – Knowledge Cube Jun 01 '17 at 19:11
  • 2
    @ChristopherKyleHorton It is in sklearn.datasets. The package I'm using is called fetch_olivetti_faces. It is found [here](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html#sklearn.datasets.fetch_olivetti_faces). – aardwolf Jun 02 '17 at 02:29
  • 1
    @Aurelius Fortunately, the code for that package is on GitHub. I didn't have a lot of time to look at this and I haven't used Python in a while, but it appears that the .pkz files are handled by the [`joblib`](https://github.com/scikit-learn/scikit-learn/tree/master/sklearn/externals/joblib) module within that package (specifically, in the `dump` and `load` functions), and themselves represent pickled Python objects. Maybe this is a starting point? – Knowledge Cube Jun 02 '17 at 02:41
  • 1
    @Aurelius And [here](https://github.com/scikit-learn/scikit-learn/blob/e3c9ae204ffb152c151e9b61306ff8f16a2c1e0a/sklearn/datasets/olivetti_faces.py#L54) is where your `fetch_olivetti_faces` function is actually defined. It seems the unpickled Python object contains numpy array data. (And has a MATLAB-related format.) – Knowledge Cube Jun 02 '17 at 02:46

3 Answers

3

It looks like what you are referencing is a one-off file format that scikit-learn uses for its sample data. A .pkz file is just a compressed version of a Python pickle file, which usually has the extension .pkl.

Specifically, you can see this in one of their sample files here, which also shows that they use the zlib_codec. To open the file, you can reverse those steps (decompress, then unpickle) or try uncompressing it from the command line.
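As a rough illustration of "going in reverse", here is a minimal sketch that assumes the file is a plain pickle compressed with zlib, as described above. The helper name `load_pkz` and the demo file are made up for this example; the demo writes its own small .pkz so the snippet runs on its own. Only unpickle files you trust, since unpickling can execute arbitrary code.

```python
import codecs
import pickle
import tempfile
import zlib
from pathlib import Path


def load_pkz(path):
    """Hypothetical helper: read a zlib-compressed pickle and return the object."""
    compressed = Path(path).read_bytes()
    raw = zlib.decompress(compressed)  # undo the zlib_codec step
    return pickle.loads(raw)           # undo the pickling step


# Round-trip demo on a throwaway file (stand-in data, not the real dataset).
sample = {"data": [1, 2, 3], "target_names": ["a", "b"]}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.pkz"
    demo_path.write_bytes(codecs.encode(pickle.dumps(sample), "zlib_codec"))
    restored = load_pkz(demo_path)

# Inspect what came back: its type and its top-level keys.
print(type(restored), list(restored))
```

If `zlib.decompress` raises an error on your file, the file was probably written another way. The comments on the question suggest that the Olivetti faces file goes through joblib's `dump`, in which case `joblib.load(path)` would be the thing to try instead.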

Mike Biglan MS
1

Before attempting to open a PKZ file, you'll need to determine what kind of file you are dealing with and whether it is even possible to open or view that file format.

Files with the .PKZ extension are known as Winoncd Images Mask files; however, other file types may also use this extension. If you are aware of any additional file formats that use the PKZ extension, please let us know.

How to open a PKZ file:

The simplest way to open a PKZ file is to double-click it and let the default associated application open it. If that does not work, you probably do not have an application associated with the extension that can view or edit PKZ files. If it does work, you already have a suitable program installed. Let's say that program is called pkzexecutor.exe; from Python, you can then launch it like this:

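import subprocess

# pkzexecutor.exe is a placeholder; use the program that opens .pkz files on your system.
path_to_program = 'C:\\Windows\\System32\\pkzexecutor.exe'
path_to_file = 'C:\\Users\\Desktop\\yourfile.pkz'

# Launch the program with the file as its only argument and wait for it to exit.
subprocess.call([path_to_program, path_to_file])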
Damián Rafael Lattenero
1

From the source code for fetch_olivetti_faces, the data appears to be downloaded from http://cs.nyu.edu/~roweis/data/ and originally has a .mat file extension, meaning it started out as a MATLAB file. If you have access to MATLAB or another program that can read those files, try opening it there under its original file extension and see what that gives you.

(If you want to try opening this file in Python itself, then perhaps give this question a look: Read .mat files in Python)
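For the Python route, here is a minimal sketch using `scipy.io.loadmat`. It assumes SciPy and NumPy are installed and that the file is in a MAT format `loadmat` supports (it does not read v7.3 files). The file name and the variable name "faces" are stand-ins: the demo writes its own tiny .mat file so the snippet is self-contained, and the real download may be laid out differently.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Stand-in array; the real file's contents are not assumed here.
original = np.arange(12, dtype=np.float64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    mat_path = os.path.join(tmp, "demo.mat")
    savemat(mat_path, {"faces": original})  # write a small demo .mat file
    contents = loadmat(mat_path)            # dict mapping variable names to arrays

# loadmat adds metadata keys such as "__header__"; skip them when listing variables.
variables = {k: v for k, v in contents.items() if not k.startswith("__")}
for name, value in variables.items():
    print(name, value.shape, value.dtype)
```

Listing the names, shapes, and dtypes this way is usually enough to see how the data is organized before digging into the values.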

Knowledge Cube