
I'm getting the following error:

MemoryError: Unable to allocate array with shape (118, 840983) and data type float64

in my Python code whenever I call the pandas.read_csv() function to read a text file. Why is this?

This is my code:

import pandas as pd
df = pd.read_csv("LANGEVIN_DATA.txt", delim_whitespace=True)
Mahesh Waghmare
M Khalil
  • What about the error message confuses you? It seems to imply to me that there's not enough space in memory to hold the entire csv at one go. – Paritosh Singh Nov 18 '19 at 09:00
  • 118 by 840983 for 8 bytes of float64 data should be less than 1GB; how big is your system memory? – 9769953 Nov 18 '19 at 09:01
  • https://stackoverflow.com/questions/57507832/unable-to-allocate-array-with-shape-and-data-type – PV8 Nov 18 '19 at 09:03
  • Possible duplicate of [Unable to allocate array with shape and data type](https://stackoverflow.com/questions/57507832/unable-to-allocate-array-with-shape-and-data-type) – PV8 Nov 18 '19 at 09:03
  • Try using the `chunksize` parameter for `read_csv`. Set it to e.g. `chunksize=1000`. – 9769953 Nov 18 '19 at 09:03
  • @PV8 the accepted answer quotes "Obvious overcommits of address space are refused". Unless my calculation is incorrect or the OP's system has very low memory by today's standards, this does not appear to be a case of a memory overcommit. – 9769953 Nov 18 '19 at 09:05
  • @ParitoshSingh This is the weird thing: I have 16 GB of RAM, of which only 0.8 GB is used, as it is a new PC in the office. I ran the same code on my own PC and it worked normally. Thank you. – M Khalil Nov 18 '19 at 09:09
  • @MKhalil interesting. Is your Python a 32-bit process in the office? Try running the following: `import platform; platform.architecture()` – Paritosh Singh Nov 18 '19 at 09:19
  • Yup, it is a 32-bit process. The code worked using the answer below. Thanks again. – M Khalil Nov 18 '19 at 09:33
  • @MKhalil in that case, my recommendation is to get your office to install and use 64-bit Python. 32-bit comes with severe memory restrictions. Your company *will* suffer down the line. – Paritosh Singh Nov 18 '19 at 10:06
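
The two checks made in the comments above (how much memory the array actually needs, and whether the interpreter is 32-bit) can be reproduced with a short script. This is only a sketch: the shape and dtype are taken from the error message, and `struct.calcsize("P")` and `sys.maxsize` are used here as additional ways to confirm what `platform.architecture()` reports.

```python
import platform
import struct
import sys

# Shape from the error message: (118, 840983) with dtype float64.
rows, cols = 118, 840983
bytes_needed = rows * cols * 8       # float64 takes 8 bytes per value
gib_needed = bytes_needed / 2**30    # convert bytes to GiB

# Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
pointer_bits = struct.calcsize("P") * 8

print(f"array needs {bytes_needed} bytes (~{gib_needed:.2f} GiB)")
print(f"platform.architecture(): {platform.architecture()[0]}")
print(f"pointer size: {pointer_bits} bits")
print(f"64-bit interpreter: {sys.maxsize > 2**32}")
```

The array needs 793,887,952 bytes, roughly 0.74 GiB, which fits easily in 16 GB of RAM. A 32-bit process, however, has only a few GB of address space in total, so it can fail to find a contiguous block of that size no matter how much physical memory is free.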

1 Answer


The MemoryError means your file is too large for read_csv to load into memory in one go. Use the chunksize parameter to read it in pieces and avoid the error.

For example:

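import pandas as pd
# With chunksize set, read_csv returns an iterator over DataFrame chunks, not a single DataFrame
reader = pd.read_csv("LANGEVIN_DATA.txt", delim_whitespace=True, chunksize=1000)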

See the official documentation for more details:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
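
As a rough sketch of how the chunks might then be consumed: when `chunksize` is set, `read_csv` returns an iterator, and each item it yields is an ordinary DataFrame with at most `chunksize` rows. The file name `sample_data.txt`, the column names `t` and `x`, and the running-mean computation are made up for illustration; they are not from the question. `sep=r"\s+"` is used in place of `delim_whitespace=True`, which newer pandas releases deprecate.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for LANGEVIN_DATA.txt: a small whitespace-delimited file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample_data.txt")
with open(path, "w") as f:
    f.write("t x\n")
    for i in range(10):
        f.write(f"{i} {i * 0.5}\n")

# Accumulate results chunk by chunk so that only one chunk
# is held in memory at any time.
total_rows = 0
running_sum = 0.0
for chunk in pd.read_csv(path, sep=r"\s+", chunksize=4):
    total_rows += len(chunk)           # each chunk is a regular DataFrame
    running_sum += chunk["x"].sum()

mean_x = running_sum / total_rows
print(total_rows, mean_x)
```

Note that gluing all chunks back together with `pd.concat` needs about as much memory as reading the file in one go, so on a 32-bit interpreter it is safer to aggregate or filter each chunk as it arrives. Installing 64-bit Python, as suggested in the comments, removes the restriction altogether.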

youDaily