6

I want to read a large CSV file containing around 500,000 rows. I am using the OpenCSV library for it. My code looks like this:

    CsvToBean<User> csvConvertor = new CsvToBean<User>();
    List<User> list = null;
    try {
        list =csvConvertor.parse(strategy, new BufferedReader(new FileReader(filepath)));
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }

Up to 200,000 records, the data is read into a list of User bean objects. But for more data than that I get:

    java.lang.OutOfMemoryError: Java heap space

I have this memory setting in the "eclipse.ini" file:

    -Xms256m
    -Xmx1024m

One solution I am considering is splitting the huge file into separate files and reading those one by one, but that seems like a lengthy workaround.

Is there any other way to avoid the OutOfMemoryError?

Ninad Pingale
  • See this - http://stackoverflow.com/questions/5868369/how-to-read-a-large-text-file-line-by-line-using-java – web-nomad Nov 18 '13 at 08:29
  • Why do you need to hold the 200k objects in memory? Can't you read smaller subsets? What are you doing with the list of objects? – shyam Nov 18 '13 at 08:35
  • Just look at what csvConvertor.parse does and re-implement it. It shouldn't be much. Let the method return an iterator, then you can parse while iterating. – Sir RotN Nov 18 '13 at 08:59
  • You are trying to hold the whole content of the file in memory (in the list). If you really need this: buy more RAM. Otherwise read/process the entries line by line or in smaller sets. – HectorLector Nov 18 '13 at 11:14
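
The iterator idea from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the `User` class here is a hypothetical stand-in (the real bean is not shown in the question), the class name `UserRowIterator` is made up, and the comma split is deliberately naive (no quoted fields), so it does not replace OpenCSV's parsing. It uses only the JDK so that it stays self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class UserRowIterator implements Iterator<UserRowIterator.User>, AutoCloseable {

    /** Hypothetical stand-in for the question's User bean, which is not shown. */
    public static final class User {
        public final String name;
        public final String email;
        User(String name, String email) { this.name = name; this.email = email; }
    }

    private final BufferedReader reader;
    private String nextLine; // look-ahead line; null once the input is exhausted

    public UserRowIterator(Reader source) {
        this.reader = new BufferedReader(source);
        advance();
    }

    // Reads one more line into the look-ahead buffer.
    private void advance() {
        try {
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() { return nextLine != null; }

    @Override
    public User next() {
        if (nextLine == null) throw new NoSuchElementException();
        // Naive split: assumes no quoted fields or embedded commas.
        String[] cols = nextLine.split(",", -1);
        advance();
        return new User(cols[0], cols.length > 1 ? cols[1] : "");
    }

    @Override
    public void close() {
        try { reader.close(); } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public static void main(String[] args) {
        String csv = "alice,a@example.com\nbob,b@example.com\n";
        try (UserRowIterator it = new UserRowIterator(new StringReader(csv))) {
            while (it.hasNext()) System.out.println(it.next().name);
        }
    }
}
```

Only one line and one `User` exist at a time, so memory use stays flat regardless of file size, provided the caller does not collect the objects into a list again.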

3 Answers

13

Read the file line by line instead of loading all rows into a list, something like this:

    CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
    String [] nextLine;
    while ((nextLine = reader.readNext()) != null) {
        // nextLine[] is an array of values from the line
        System.out.println(nextLine[0] + nextLine[1] + "etc...");
    }
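
If the rows still need to become User objects, one option is to build them in small batches and hand each batch off (for example to a database insert) before reading further. The following is a minimal sketch of that idea, not part of the original answer: `User`, `BatchCsvReader` and `processInBatches` are hypothetical names, the JDK-only `BufferedReader` stands in for OpenCSV's `CSVReader` so the example is self-contained, and the comma split assumes no quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchCsvReader {

    /** Hypothetical stand-in for the question's User bean, which is not shown. */
    public static final class User {
        public final String name;
        public final String email;
        User(String name, String email) { this.name = name; this.email = email; }
    }

    /**
     * Streams rows from the source and passes them to the handler in batches,
     * so at most batchSize User objects are built up here at a time.
     * Returns the total number of rows processed.
     */
    public static int processInBatches(Reader source, int batchSize,
                                       Consumer<List<User>> handler) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        int total = 0;
        List<User> batch = new ArrayList<>(batchSize);
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) continue; // skip blank lines
                // Naive split: assumes no quoted fields or embedded commas.
                String[] cols = line.split(",", -1);
                batch.add(new User(cols[0], cols.length > 1 ? cols[1] : ""));
                total++;
                if (batch.size() == batchSize) {
                    handler.accept(batch);
                    batch = new ArrayList<>(batchSize); // start a fresh batch
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!batch.isEmpty()) handler.accept(batch); // flush the final partial batch
        return total;
    }

    public static void main(String[] args) {
        String csv = "alice,a@example.com\nbob,b@example.com\ncarol,c@example.com\n";
        int rows = processInBatches(new StringReader(csv), 2,
                batch -> System.out.println("batch of " + batch.size()));
        System.out.println("rows: " + rows);
    }
}
```

With a real file, the `StringReader` would be replaced by a `FileReader`. Memory stays bounded only if the handler does not keep references to earlier batches.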
urbiwanus
  • That println makes no sense at all. Also, it does not answer his question. He needs the data translated into User objects, and if he needs all 200,000 objects at once, reading line by line does not help. – Saša Šijak Nov 18 '13 at 16:06
  • That's right. It's just an example. Instead of printing the result to the console, you can persist the data for further processing (i.e. batch processing) or do whatever you want. – urbiwanus Nov 19 '13 at 07:27
  • @urbiwanus How can we reuse the CSVReader object instead of creating it multiple times? – Krishna Kumar Singh Sep 14 '20 at 06:10
0

You must set the -Xmx value for your app, not for Eclipse, in this case. The settings in eclipse.ini only size the heap of the IDE itself; a program launched from Eclipse runs in its own JVM. In "Run Configurations", select your app, go to the "Arguments" tab and set the value under "VM arguments", for example -Xmx1024m. You can open Run Configurations by right-clicking the file you wish to run, selecting Run As and then "Run Configurations...".
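
To check whether the VM argument actually reached the program, a small sketch like the one below can be run with the same run configuration. The class name `HeapCheck` and the helper `maxHeapMiB` are made up for illustration; `Runtime.maxMemory()` is the standard JDK call and reports the maximum heap the JVM will attempt to use, which is typically close to, but not always exactly, the -Xmx value.

```java
public class HeapCheck {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the printed value does not change after editing the "VM arguments" field, the setting is probably being applied to a different run configuration.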

Saša Šijak
  • After adding -Xmx1024m in "Run Configurations" I am getting this message: "Error occurred during initialization of VM. Could not reserve enough space for object heap." – Ninad Pingale Nov 18 '13 at 10:02
  • You do not have enough RAM + swap for the memory you wish to reserve plus the memory taken by all running programs. Try setting it to a lower value, for example 512m, and shutting down some unnecessary programs. – Saša Šijak Nov 18 '13 at 13:18
-1

The example below reads any number of records from a CSV file, one line at a time.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class ReadCSV
    {
        public static void main(String[] args)
        {
            String csvFile = "C:/Users/LENOVO/Downloads/Compressed/GeoIPCountryWhois.csv";
            String csvSplitBy = ",";

            // try-with-resources closes the reader even if an exception is thrown
            try (BufferedReader br = new BufferedReader(new FileReader(csvFile)))
            {
                String line;
                // only the current line is held in memory at any time
                while ((line = br.readLine()) != null)
                {
                    // use comma as separator (does not handle quoted fields)
                    String[] country = line.split(csvSplitBy);
                    if (country.length < 6)
                    {
                        continue; // skip rows that lack the code and name columns
                    }

                    System.out.println("Country [code= " + country[4] + " , name=" + country[5] + "]");
                }
            }
            catch (IOException e)
            {
                // also covers FileNotFoundException, which is a subclass of IOException
                e.printStackTrace();
            }
            System.out.println("Done");
        }
    }
Gautam Viradiya