
I'm doing data analysis on some very large CSV files (one has more than 2 million lines of data). What is the best way to read such a file quickly and parse each line into a JavaBean class?

Example Object class:

public class Crime {

    private String district;
    private String psa;
    private String dispatchDateTime;
    private String dispatchDate;
    private String dispatchTime;
    private String hour;
    private String dcKey;
    private String locationBlock;
    private String ucrGeneral;
    private String generalCode;
    private String month;
    private String lon;
    private String lat;


    public Crime() {
    }
    public void setDistrict(String district) {
        this.district = district;
    }
    // ........ (remaining setters and getters omitted)
    public String getDispatchDateTime() {
        return dispatchDateTime;
    }
    public String getDispatchDate() {
        return dispatchDate;
    }
}

Example line of data:

35,D,2009-07-19 01:09:00,2009-07-19,01:09:00,1,200935061008,5500 BLOCK N 5TH ST,1500,Weapon Violations,20,2009-07,-75.130477,40.036389
P0bbn
  • Use a CSV parser that supports **streamed** parsing. – Andreas Apr 02 '20 at 18:41
  • Thank you! Do you have any suggestions/a personal preference? – P0bbn Apr 02 '20 at 18:47
  • Duplicate of [this](https://stackoverflow.com/questions/20043181/read-large-csv-in-java), [this](https://stackoverflow.com/q/31531258/642706), and others. – Basil Bourque Apr 02 '20 at 18:49
  • @P0bbn Sorry, [questions asking us to recommend or find a software library are off-topic for Stack Overflow](https://stackoverflow.com/help/on-topic). Besides, Jens Scharmann [already gave you an example](https://stackoverflow.com/a/60998826/5221149). You should however do your own research, looking into Java CSV parsers that support streaming, and decide for yourself which one is best for *you*. – Andreas Apr 02 '20 at 18:51
  • A similar solution was already provided in this thread; please check [this answer](https://stackoverflow.com/a/62171055/2648257). – Bharathiraja Jun 05 '20 at 12:12

1 Answer


Have a look at Apache Commons CSV: http://commons.apache.org/proper/commons-csv/

Use the library to parse the file and create an object for every line.
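
As a rough sketch of that per-line approach, the loop below reads one line at a time and builds one bean per line. It uses only the JDK so that it runs without extra dependencies; it is not the Commons CSV API, and a real CSV parser would replace the naive `split(",")`, which breaks on quoted fields. The class name `CrimeCsvSketch`, the trimmed `Crime` stand-in, every setter other than `setDistrict`, and the column positions 0, 2 and 7 (read off the sample line in the question) are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

public class CrimeCsvSketch {

    // Trimmed stand-in for the question's Crime bean: only three fields,
    // and the setters for the last two are assumed to exist.
    public static class Crime {
        private String district;
        private String dispatchDateTime;
        private String locationBlock;

        public void setDistrict(String district) { this.district = district; }
        public void setDispatchDateTime(String value) { this.dispatchDateTime = value; }
        public void setLocationBlock(String value) { this.locationBlock = value; }

        public String getDistrict() { return district; }
        public String getDispatchDateTime() { return dispatchDateTime; }
        public String getLocationBlock() { return locationBlock; }
    }

    // Streams the input line by line and hands each bean to the sink,
    // so the whole file is never held in memory. Returns the row count.
    public static long parse(Reader source, Consumer<Crime> sink) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Naive split: assumes no quoted fields or embedded commas.
                String[] cols = line.split(",", -1);
                if (cols.length < 8) {
                    throw new IllegalArgumentException("Too few columns: " + line);
                }
                Crime crime = new Crime();
                crime.setDistrict(cols[0]);         // assumed column position
                crime.setDispatchDateTime(cols[2]); // assumed column position
                crime.setLocationBlock(cols[7]);    // assumed column position
                sink.accept(crime);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String sample = "35,D,2009-07-19 01:09:00,2009-07-19,01:09:00,1,200935061008,"
                + "5500 BLOCK N 5TH ST,1500,Weapon Violations,20,2009-07,-75.130477,40.036389\n";
        long rows = parse(new StringReader(sample),
                c -> System.out.println(c.getDistrict() + " | " + c.getLocationBlock()));
        System.out.println(rows + " row(s) parsed");
    }
}
```

Because each bean is passed to the callback and then dropped, memory use should stay roughly flat however many lines the file has. For a real file, the `StringReader` could be swapped for something like `Files.newBufferedReader(Paths.get("crimes.csv"))` (the file name is hypothetical), and a header row, if the file has one, would need to be skipped first.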

jsc57x
  • Will do! Will that be able to parse through the file relatively quickly? – P0bbn Apr 02 '20 at 18:46
  • 2
    @P0bbn it's too early for you to worry about performance. But it's a library created for parsing CSV, why would you expect it not to be efficient? – Kayaman Apr 02 '20 at 18:49