Questions tagged [numpy-memmap]

An advanced numpy.memmap() utility for avoiding RAM-size limits and reducing the final RAM footprint, at the reasonable cost of O/S-cached file I/O mediated via a small in-RAM proxy-view window into the whole array data.

Creates and handles a memory-map to an array stored in a binary file on disk.

Memory-mapped files are used to access large arrays that do not fit in RAM through small proxy segments of an O/S-cached region of otherwise unmanageably large data files.

Leaving most of the data on disk, rather than reading the entire file into RAM, and working with it through a smart, moving, O/S-cached window-view into the big on-disk file lets a program escape both the O/S RAM limits and the adverse side-effects of Python's memory management, which is painfully reluctant to release once-allocated memory blocks before the Python program terminates.

numpy's memmaps are array-like objects.

This differs from Python's mmap module, which uses file-like objects.
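A minimal sketch of the typical round-trip; the file name, dtype and shape are only illustrative:

    import numpy as np

    # Create a disk-backed array; the file name, dtype and shape are illustrative.
    fp = np.memmap("data.dat", dtype="float32", mode="w+", shape=(10_000, 4))
    fp[:100] = np.random.rand(100, 4)   # writes go through the O/S page cache
    fp.flush()                          # push dirty pages out to disk

    # Re-open read-only later; only the pages actually touched are brought into RAM.
    ro = np.memmap("data.dat", dtype="float32", mode="r", shape=(10_000, 4))
    print(ro[:5].mean())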

83 questions
15 votes • 2 answers

Can memmap pandas series. What about a dataframe?

It seems that I can memmap the underlying data for a python series by creating a mmap'd ndarray and using it to initialize the Series. def assert_readonly(iloc): try: iloc[0] = 999 # Should be non-editable …
user48956 • 11,390 • 14 • 67 • 125
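A hedged sketch of the idea in the excerpt, not the asker's code: back a Series with a memmapped ndarray and ask pandas not to copy the buffer. The file name and shape are made up, and whether pandas actually avoids the copy depends on the pandas version.

    import numpy as np
    import pandas as pd

    # Hypothetical file written earlier with a matching dtype and length.
    arr = np.memmap("series.dat", dtype="float64", mode="r", shape=(1_000_000,))

    # copy=False asks pandas to wrap the existing buffer rather than duplicate it;
    # whether the memmap backing survives depends on the pandas version in use.
    s = pd.Series(arr, copy=False)
    print(s.iloc[:5])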
6 votes • 1 answer

numpy mean is larger than max for memmap

I have an array of timestamps, increasing for each row in the 2nd column of matrix X. I calculate the mean value of the timestamps and it's larger than the max value. I'm using a numpy memmap for storage. Why is this happening? >>>…
siamii • 20,540 • 26 • 86 • 136
5 votes • 2 answers

How to read a large text file avoiding reading line-by-line :: Python

I have a large data file (N, 4) which I am mapping line-by-line. My files are 10 GB; a simplistic implementation is given below. Though the following works, it takes a huge amount of time. I would like to implement this logic such that the text file…
nuki • 101 • 5
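One common workaround, sketched here under the assumption that a one-off conversion pass is acceptable: turn the text file into a binary .npy once, then memory-map it in every later run. File names are illustrative.

    import numpy as np

    # One-off conversion: still parses the text file, but only once.
    # (For files too large even for this pass, convert in chunks instead.)
    data = np.loadtxt("big_file.txt")             # shape (N, 4)
    np.save("big_file.npy", data)

    # Every later run: the open is near-instant and pages load lazily on access.
    data = np.load("big_file.npy", mmap_mode="r")
    chunk = data[1_000_000:1_001_000, :]          # only this slice is read from disk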
5 votes • 2 answers

numpy memmap memory usage - want to iterate once

Let's say I have some big matrix saved on disk. Storing it all in memory is not really feasible, so I use memmap to access it: A = np.memmap(filename, dtype='float32', mode='r', shape=(3000000,162)). Now let's say I want to iterate over this matrix (not…
user2717954 • 1,518 • 2 • 12 • 26
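A sketch of block-wise iteration over such a memmap, so only one chunk of rows has to be resident at a time; the block size is an assumption to be tuned to the available RAM.

    import numpy as np

    A = np.memmap("matrix.dat", dtype="float32", mode="r", shape=(3_000_000, 162))

    block = 100_000                                 # rows per chunk, tune to RAM
    total = np.zeros(A.shape[1], dtype="float64")
    for start in range(0, A.shape[0], block):
        chunk = np.array(A[start:start + block])    # copy the slice into real RAM
        total += chunk.sum(axis=0, dtype="float64") # pages of the slice can then be evicted
    print(total / A.shape[0])                       # column means in a single pass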
5 votes • 1 answer

Do xarray or dask really support memory-mapping?

In my experimentation so far, I've tried: xr.open_dataset with chunks arg, and it loads the data into memory. Set up a NetCDF4DataStore, and call ds['field'].values and it loads the data into memory. Set up a ScipyDataStore with mmap='r', and…
4 votes • 0 answers

Caching a data frame in joblib

Joblib has functionality for sharing Numpy arrays across processes by automatically memmapping the array. However this makes use of Numpy specific facilities. Pandas does use Numpy under the hood, but unless your columns all have the same data type,…
shadowtalker • 8,614 • 2 • 34 • 70
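For plain NumPy arrays (not DataFrames), joblib's own persistence can round-trip through a memmap; a minimal sketch with an illustrative file name:

    import numpy as np
    import joblib

    arr = np.random.rand(1_000_000, 10)
    joblib.dump(arr, "cache.joblib")                   # persist the array to disk

    # Later, or inside a worker process: load it back memory-mapped.
    arr_mm = joblib.load("cache.joblib", mmap_mode="r")
    print(type(arr_mm))                                # numpy.memmap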
4 votes • 1 answer

Numpy Memmap Ctypes Access

I'm trying to use a very large numpy array using numpy memmap, accessing each element as a ctypes Structure. class My_Structure(Structure): _fields_ = [('field1', c_uint32, 3), ('field2', c_uint32, 2), ('field3',…
sheridp • 1,187 • 1 • 9 • 19
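A hedged sketch of viewing memmapped bytes through a ctypes Structure; the record layout and file name are illustrative and deliberately avoid the bit-fields shown in the excerpt.

    import numpy as np
    from ctypes import Structure, c_uint32, sizeof

    class Record(Structure):
        _fields_ = [("field1", c_uint32), ("field2", c_uint32)]

    n = 1000
    mm = np.memmap("records.dat", dtype=np.uint8, mode="w+",
                   shape=(n * sizeof(Record),))

    # from_buffer needs a writable buffer; a mode="r+"/"w+" memmap provides one.
    rec = Record.from_buffer(mm, 5 * sizeof(Record))   # view onto the 6th record
    rec.field1 = 42                                    # writes land straight in the map
    mm.flush()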
4 votes • 2 answers

packing a boolean array needs to go through int (numpy 1.8.2)

I'm looking for a more compact way to store booleans. numpy internally needs 8 bits to store one boolean, but np.packbits allows packing them, which is pretty cool. The problem is that to pack a 32e6-byte array of booleans into a 4e6-byte array we need…
user3313834 • 5,701 • 4 • 40 • 76
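For reference, a minimal packbits round-trip with illustrative sizes:

    import numpy as np

    bools = np.random.rand(32_000_000) > 0.5       # 32e6 booleans, one byte each in numpy
    packed = np.packbits(bools)                     # 4e6 bytes: 8 booleans per byte
    restored = np.unpackbits(packed).astype(bool)   # back to one byte per boolean
    assert (restored[:bools.size] == bools).all()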
3 votes • 0 answers

Numpy Memmap WinError8

My first StackOverflow message after 6 years of using great experience from this site. Thank you all for all the great help you have offered to me and to others. This problem, however, baffles me completely and I would like to ask for assistance…
3 votes • 0 answers

numpy memmap read error memory mapped size must be positive

I am reading a large binary file in partitions. Each partition is mapped using numpy.memmap. The file consist of 1M rows, where a row is 198 2-byte integers. A partition is 1000 rows long. Below is the code snippet: mdata = np.memmap(fn,…
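A sketch of partitioned mapping with an explicit byte offset, following the row layout in the excerpt (198 two-byte integers per row); the "size must be positive" error usually means the offset or shape ran past the end of the file, so the last partition is clamped here. The file name is illustrative.

    import numpy as np

    rows_total = 1_000_000
    cols = 198
    itemsize = np.dtype(np.int16).itemsize           # 2 bytes per value
    part_rows = 1000

    for start in range(0, rows_total, part_rows):
        n = min(part_rows, rows_total - start)        # clamp the final partition
        mdata = np.memmap("big.bin", dtype=np.int16, mode="r",
                          offset=start * cols * itemsize, shape=(n, cols))
        # ... process mdata ...
        del mdata                                     # drop the mapping before the next one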
3 votes • 1 answer

Python: passing memmap array through function?

Suppose that I am working with a very large array (e.g., ~45 GB) and am trying to pass it through a function which accepts numpy arrays. What is the best way to: Store this for limited memory? Pass this stored array into a function that takes…
Andy • 155 • 1 • 7
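A sketch of the usual pattern, with made-up file names and shapes: persist the data once as .npy, reopen it memory-mapped, and pass it straight in, since numpy.memmap is an ndarray subclass that most array-consuming functions accept.

    import numpy as np

    def column_means(a):                        # stands in for any ndarray-consuming function
        return a.mean(axis=0)

    # One-off: write the data as .npy (random data stands in for the real ~45 GB).
    np.save("huge.npy", np.random.rand(10_000, 8))

    big = np.load("huge.npy", mmap_mode="r")    # opens a memmap, not the whole file
    print(column_means(big))                    # works, though it may still page a lot in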
3 votes • 0 answers

When updating a numpy.memmap'd file in parallel, is there a way to only "flush" a slice and not the whole file?

I have to do a lot of nasty i/o and I have elected to use memory mapped files with numpy...after a lot of headache I realized that when a process "flushes" to disk it often overwrites what other processes are attempting to write with old data...I…
3 votes • 1 answer

Why am I getting an OverflowError and WindowsError with numpy memmap and how to solve it?

In relation to my other question here, this code works if I use a small chunk of my dataset with dtype='int32', using a float64 produces a TypeError on my main process after this portion because of safe rules so I'll stick to working with int32 but…
ZeferiniX • 421 • 4 • 16
3 votes • 1 answer

Memory Error when using float32 in dask array

I am trying to import a 1.25 GB dataset into python using dask.array The file is a 1312*2500*196 Array of uint16's. I need to convert this to a float32 array for later processing. I have managed to stitch together this Dask array in uint16, however…
Amdixer • 61 • 2
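A hedged sketch of the usual dask pattern, deferring the dtype cast so it is applied chunk by chunk; the chunk shape and the memmapped source file are assumptions.

    import numpy as np
    import dask.array as da

    raw = np.memmap("stack.dat", dtype=np.uint16, mode="r", shape=(1312, 2500, 196))

    x = da.from_array(raw, chunks=(128, 2500, 196))   # lazy, chunked view of the memmap
    y = x.astype(np.float32)                          # the cast runs per chunk, not up front
    result = y.mean().compute()                       # only a few float32 chunks in RAM at once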
2 votes • 0 answers

Numpy memmap throttles with Pytorch Dataloader when available RAM less than file size

I'm working on a dataset that is too big to fit into RAM. The solution I'm trying currently is to use numpy memmap to load one sample/row at a time using Dataloader. The solution looks something like this: class MMDataset(torch.utils.data.Dataset): …
Kevin • 71 • 1 • 4
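A sketch of the pattern described in the excerpt, with illustrative shapes and file name; the memmap is opened lazily so each DataLoader worker gets its own mapping, and when the available RAM is smaller than the file every row access can hit disk, which is where the throttling comes from.

    import numpy as np
    import torch
    from torch.utils.data import Dataset, DataLoader

    class MMDataset(Dataset):
        def __init__(self, path, rows, cols):
            self.path, self.rows, self.cols = path, rows, cols
            self.data = None                     # opened lazily, once per worker process

        def __len__(self):
            return self.rows

        def __getitem__(self, idx):
            if self.data is None:
                self.data = np.memmap(self.path, dtype="float32", mode="r",
                                      shape=(self.rows, self.cols))
            # copy the row out of the map before handing it to torch
            return torch.from_numpy(np.array(self.data[idx]))

    loader = DataLoader(MMDataset("train.dat", 1_000_000, 162),
                        batch_size=256, num_workers=2)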