8

I am interested in reading a PGM file in Python as a numerical matrix.

Right now I open the file with

f = open('/home/matthew/NCM/mdb001.pgm', 'rb')

When I read the first line, it looks as expected

f.readline()

produces

'P5\n'

and the next line is fine

'1024 1024\n'

and the next

'255\n'

but after that I get a stream of gibberish. It looks like hex values mixed in with other characters.

I don't want to view the file as an image; I just want to see it as numbers.

Matt Cremeens
  • 1
    You are reading the `P5` style PGM file as the documentation you link to describes. The "gibberish" you are seeing is the pixel data, encoded as bytes from `'\x00'` to maxval, which you show as 255 (or `'\xff'`). There should be 1024×1024 bytes of "gibberish" representing the image data. – msw Mar 01 '16 at 13:59
  • OK, so how can I display it as numerical data (perhaps separated by spaces) and not hex values, etc.? – Matt Cremeens Mar 01 '16 at 14:42
  • And it appears to be not just hex values but some other stuff too, like this: `:;;=>??A?@A@??@?A?BEBACADAHHFEEHHFIFFEGKJLLJLMJKKJIJJFJFHHIGIIIHIILIKLNRNNSTUY]lw` – Matt Cremeens Mar 01 '16 at 14:44

2 Answers

8

After reading the header as you've shown, you have the width (1024), the height (the next 1024), and the depth (255). To get the pixel data, it is easiest to read it byte by byte:

def read_pgm(pgmf):
    """Return a raster of integers from a PGM as a list of lists."""
    # The file is opened in binary mode, so compare against bytes
    assert pgmf.readline() == b'P5\n'
    (width, height) = [int(i) for i in pgmf.readline().split()]
    depth = int(pgmf.readline())
    assert depth <= 255

    raster = []
    for y in range(height):
        row = []
        for x in range(width):
            # Each pixel is one byte; ord() turns it into an integer
            row.append(ord(pgmf.read(1)))
        raster.append(row)
    return raster

This code only works for images with 8-bit depth (one byte per pixel), which is why the assert statement is present.
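
On the "other stuff" seen in the raw bytes: a byte whose value happens to be a printable ASCII code is displayed as that character, so a pixel value of 58 shows up as `:`. The following is a minimal sketch, assuming Python 3 (where a bytes object converts directly to a list of integers) and the three-line header shown in the question. The name `read_pgm_rows` and the tiny in-memory image are illustrative only and stand in for a real file.

```python
import io

def read_pgm_rows(pgmf):
    """Return the raster of an 8-bit P5 PGM as a list of lists of ints."""
    assert pgmf.readline() == b'P5\n'
    width, height = [int(i) for i in pgmf.readline().split()]
    maxval = int(pgmf.readline())
    assert maxval <= 255
    # In Python 3, list(bytes) yields one integer per byte
    return [list(pgmf.read(width)) for _ in range(height)]

# Tiny 3x2 image built in memory (hypothetical data, not a real file)
sample = io.BytesIO(b'P5\n3 2\n255\n' + bytes([0, 58, 255, 10, 20, 30]))
raster = read_pgm_rows(sample)

# Print each row as numbers separated by spaces
for row in raster:
    print(' '.join(str(v) for v in row))
```

Reading a whole row per call avoids one `read(1)` per pixel, which may matter for a 1024×1024 image.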

It is legal for a PGM file to have the header information on one line as in:

P5 1024 1024 15

If you do encounter such a file, read_pgm will fail noisily; the code to handle such cases is left as an exercise for the reader.
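
For readers who would rather not do the exercise, here is one possible sketch of a header reader that accepts the fields on one line or on several. It assumes the usual Netpbm layout: four whitespace-separated tokens, optional `#` comment lines, and exactly one whitespace byte between the maxval and the pixel data. The name `read_pgm_header` is hypothetical, and the sketch has only been checked against small hand-made headers.

```python
import io

def read_pgm_header(f):
    """Return (magic, width, height, maxval); leave f at the first pixel byte."""
    tokens = []
    token = b''
    while len(tokens) < 4:
        ch = f.read(1)
        if not ch:
            raise ValueError('unexpected end of file in PGM header')
        if ch == b'#':
            # Skip the rest of the comment line and treat it as whitespace
            f.readline()
            ch = b'\n'
        if ch.isspace():
            if token:
                tokens.append(token)
                token = b''
        else:
            token += ch
    magic, width, height, maxval = tokens
    return magic, int(width), int(height), int(maxval)

# Header on a single line
one_line = io.BytesIO(b'P5 2 2 15\n\x00\x01\x02\x03')
header_a = read_pgm_header(one_line)
pixels_a = one_line.read()

# Header spread over several lines, with a comment
multi_line = io.BytesIO(b'P5\n# a comment\n3 1\n255\nabc')
header_b = read_pgm_header(multi_line)
pixels_b = multi_line.read()
```

Because the loop stops as soon as the whitespace byte after the fourth token has been consumed, the file position ends up at the first pixel byte, and the pixel loop from `read_pgm` can follow directly.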

msw
  • 1
    I'm also learning that PIL seems to handle this type of image file reading nicely. Thanks so much for your time and effort. – Matt Cremeens Mar 01 '16 at 15:54
2

msw's answer guided me to write the following function to read 16-bit .pgm images with the kind of header he described:

def read_pgm(pgmf):
    """Return a raster of integers from a PGM as a list of lists."""
    header = pgmf.readline()
    assert header[:2] == b'P5'
    (width, height) = [int(i) for i in header.split()[1:3]]
    depth = int(header.split()[3])
    # Two bytes per sample are only used when maxval is above 255
    assert 255 < depth <= 65535

    raster = []
    for y in range(height):
        row = []
        for x in range(width):
            # PGM stores the most significant byte first
            high_bits = ord(pgmf.read(1))
            low_bits = ord(pgmf.read(1))
            row.append(256 * high_bits + low_bits)
        raster.append(row)
    return raster

import numpy as np

with open(pgm_path, 'rb') as f:
    im = np.array(read_pgm(f))

I hope this helps clarify how to use the previously given answer.
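
A note on byte order for 16-bit files: the Netpbm PGM format stores each two-byte sample with the most significant byte first. As a hedged sketch, a whole row can be decoded in one call with the standard `struct` module instead of combining bytes by hand. The helper name `read_pgm16_row` and the sample bytes are illustrative, and the header is assumed to have been read already.

```python
import io
import struct

def read_pgm16_row(pgmf, width):
    """Read one row of 16-bit samples (most significant byte first)."""
    data = pgmf.read(2 * width)
    # '>' selects big-endian, 'H' is an unsigned 16-bit integer
    return list(struct.unpack('>%dH' % width, data))

# Three hypothetical samples: 1, 256 and 65535
sample = io.BytesIO(b'\x00\x01\x01\x00\xff\xff')
row = read_pgm16_row(sample, 3)
```

`struct.unpack` raises `struct.error` if the file ends before a full row has been read, which gives a clearer failure than a silent short row.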