I have a file whose MIME type is text/x-python, but when I open it in Vim it shows binary content rather than readable text.

The file begins with these bytes, according to a hexdump with xxd:

00000000: 03f3 0d0a f0db af5a 6300 0000 0000 0000  .......Zc.......
00000010: 000e 0000 0040 6001 0073 2f0b 0000 6400  .....@`..s/...d.

How can I convert this file into text format?

RJHunter
mining
  • It should be just text. Can you post a sample of what you're looking at? (for example as output from `xxd` so we can see what the binary is) – viraptor Apr 16 '18 at 01:33
  • Hi, @viraptor, thanks very much for your reply! I've uploaded an image. Please have a check. – mining Apr 16 '18 at 01:36
  • We can't learn much from it. Can you post the beginning of `xxd < model_train.py` instead? This will show us the contents in a format we can actually analyse. – viraptor Apr 16 '18 at 02:05
  • Could this be a pyc file with the wrong extension? – ekhumoro Apr 16 '18 at 02:08
  • @viraptor, I've uploaded the picture. Thanks a lot. I think this file is a binary file, but according to the `Readme` file in the project that contains it, the file can be executed with `python model_train.py`. If this is a binary file, I'm not sure whether Python can execute it. – mining Apr 16 '18 at 02:31
  • @ekhumoro, thanks for your reply. I also think this is a binary file. The author of this file may want to hide the source code. – mining Apr 16 '18 at 02:32
  • perhaps there is some preamble in the file indicative of its real type? – Paul Rooney Apr 16 '18 at 03:12
  • Hi @PaulRooney, thanks for your reply. But I'm not sure how to get the extra information of this file. In fact, this file is included in a docker image. I just copy this file from the docker container to my host computer. – mining Apr 16 '18 at 03:28
  • Indeed, now that the OP included a hexdump, we can see that this file begins with the bytes 0x03f30d0a, which the `file` tool identifies as byte-compiled code for Python 2.7. – RJHunter Apr 16 '18 at 03:49
  • @RJHunter, got it. Thanks a lot! – mining Apr 16 '18 at 04:37

1 Answer

The MIME type text/x-python, and the file extension .py, are usually attached to plain-text files.

The file in your question is binary, not plain text, so neither the file extension .py nor the MIME type text/x-python is appropriate. In other words: the file has a misleading name.

The bytes shown in the hexdump correspond to compiled Python bytecode. Files like these are usually named with a .pyc extension instead of .py. When the Python interpreter loads a module from a .py text file, it compiles the text into bytecode and saves a copy of the compiled result in a .pyc file, so that loading is faster next time.
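As a rough illustration of how the hexdump can be read, here is a minimal sketch that checks the first eight bytes against the Python 2.7 layout: a 4-byte magic number followed by a 4-byte little-endian modification time. The function name `parse_py27_pyc_header` is made up for this example, and the layout assumed here applies to Python 2.7 only; Python 3 uses a longer header.

```python
import struct

# Magic number that Python 2.7 writes at the start of every .pyc file.
PY27_MAGIC = b"\x03\xf3\r\n"


def parse_py27_pyc_header(data):
    """Return (is_py27, mtime) for the first 8 bytes of a .pyc file.

    Hypothetical helper for illustration; assumes the Python 2.7 layout
    of 4 magic bytes followed by a 4-byte little-endian timestamp.
    """
    if len(data) < 8:
        raise ValueError("need at least 8 bytes of header")
    magic = data[:4]
    (mtime,) = struct.unpack("<I", data[4:8])
    return magic == PY27_MAGIC, mtime


# First 8 bytes copied from the hexdump in the question.
header = bytes.fromhex("03f30d0af0dbaf5a")
is_py27, mtime = parse_py27_pyc_header(header)
print(is_py27, mtime)
```

For a file on disk, the same check could be applied to the result of `open(path, "rb").read(8)`.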

If you have the .pyc bytecode but not the original .py, there are tools that can disassemble or decompile the bytecode and show the result as Python text, although needing to do so is quite an unusual situation.
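To show what such tools work with, here is a small sketch using only the standard library. It compiles a throwaway source file, reads the resulting .pyc back, and disassembles it. It assumes the interpreter running it is Python 3.7 or later, where the .pyc header is 16 bytes; the file in the question was written by Python 2.7, whose header is 8 bytes, and `marshal` data is only reliably readable by the same Python version that wrote it. The file name `hello.py` and its contents are invented for the example.

```python
import dis
import importlib.util
import marshal
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny source file and byte-compile it to a .pyc.
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as f:
        f.write("answer = 6 * 7\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"))
    with open(pyc, "rb") as f:
        raw = f.read()

# The first 4 bytes are the magic number of the running interpreter.
magic_matches = raw[:4] == importlib.util.MAGIC_NUMBER

# Assumption: 16-byte header (Python 3.7+); the rest is a marshalled code object.
code = marshal.loads(raw[16:])

# Low-level disassembly: bytecode instructions, not Python source.
dis.dis(code)

# The code object is still executable, which is why `python file.pyc` works.
namespace = {}
exec(code, namespace)
```

Note that `dis` only prints bytecode instructions. Recovering readable source needs a separate decompiler, as discussed in the linked question.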

See also: Is it possible to decompile a compiled .pyc file into a .py file?

RJHunter
  • Hi, @RJHunter, thanks a lot for your kind answer! I also think this is a binary file. But the `Readme` file of the project that contains it says this file can be executed with `python model_train.py`. The author may have wanted to keep the content private while making the results public. – mining Apr 16 '18 at 02:34
  • The author may try to make the content of their file private, but like everything else in Python, you really can't do that unless you write an extension. I had a customer that tried obfuscating their code as a pyx file because they were supposed to share everything pertaining to the task as part of our agreement. I didn't even bother complaining, just spent 15 minutes researching decompilers. I got their code in plain text, and they got to keep their sense of smug security. – Mad Physicist Apr 16 '18 at 02:56
  • Hi, @MadPhysicist, thanks for your kind comment! Yeah, the author may convert the source code to binary code due to their agreement. – mining Apr 16 '18 at 03:29