I am reading a large binary file in partitions. Each partition is mapped using numpy.memmap.

The file consists of 1M rows, where each row is 198 2-byte integers. A partition is 1000 rows long.

Below is the code snippet:

mdata = np.memmap(fn, dtype='int16', mode='r', offset=offset * 2)
data = np.array(mdata[0:count])

Here, offset is 1000 * 198 * 2 * partition_idx, where partition_idx ranges over [0:1000], and count is 1000.

I get the error: memory mapped size must be positive

kmario23
    You're mapping the entire file into memory, not just the partition you're trying to work with. Try adding ``shape=(198000,)`` or ``shape=(1000,198)`` to the memmap call to specify the size. – jasonharper Feb 22 '17 at 04:46
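
Following the comment above, here is a minimal sketch of mapping a single partition with an explicit shape. It rests on two assumptions that the thread does not confirm. First, the `offset` argument of `np.memmap` is in bytes, and the offset described in the question (1000 * 198 * 2 * partition_idx) already looks like a byte offset, so the extra `* 2` would double it; with a 396,000,000-byte file, that would put the offset at or beyond the end of the file for partition_idx of 500 and above, which may be what triggers the error. Second, the partition is meant to be 1000 full rows; `mdata[0:count]` on a 1-D map returns 1000 individual int16 values, not 1000 rows. The function name `read_partition` and the constants are hypothetical, introduced only for illustration.

```python
import numpy as np

ROWS_PER_PARTITION = 1000      # rows per partition, as stated in the question
COLS = 198                     # int16 values per row, as stated in the question
DTYPE = np.dtype('int16')      # 2-byte integers


def read_partition(fn, partition_idx):
    """Return one partition as a (1000, 198) int16 array (hypothetical helper)."""
    # np.memmap expects the offset in bytes, so multiply by the item size once.
    offset = partition_idx * ROWS_PER_PARTITION * COLS * DTYPE.itemsize
    # Passing shape limits the mapping to this partition instead of
    # everything from the offset to the end of the file.
    mdata = np.memmap(fn, dtype=DTYPE, mode='r', offset=offset,
                      shape=(ROWS_PER_PARTITION, COLS))
    # Copy out of the mapping into an ordinary in-memory array.
    return np.array(mdata)
```

Usage would then look like `data = read_partition(fn, partition_idx)` for each partition_idx in the range given in the question.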
