
I'm trying to create a simple class to read a csv file and store the contents in an ArrayList<ArrayList<T>>.

I'm creating a generic class CsvReader so that I can handle data of different types: int, double, String. If I had, say, a csv file of doubles, I was imagining I would use my class like this:

//possible method 1
CsvReader<Double> reader = new CsvReader<Double>();
ArrayList<ArrayList<Double>> contents = reader.getContents();

//possible method 2
CsvReader reader = new CsvReader(Double.class);
ArrayList<ArrayList<Double>> contents = reader.getContents();

But method 1 won't work since type erasure prevents you from writing code like

rowArrayList.add(new T(columnStringValue)); 

But I can't even make the passing in a Double.class solution work. The problem is that what's really going on is that I need my class "parameterized" (in the general sense of that word, not the technical java generics sense) on a type with the following property: it has a ctor accepting a single String argument. That is, to create the row ArrayLists on, say, a Double csv file, I'd need to write:

StringTokenizer st = new StringTokenizer(line,",");
ArrayList<Double> curRow = new ArrayList<Double>();
while (st.hasMoreTokens()) {
    curRow.add(new Double(st.nextToken()));
}

Having passed in Double.class, I could get its String ctor using

  Constructor ctor = c.getConstructor(new Class[] {String.class});

but this has two problems. Most importantly, this is a general constructor that will return a type Object, which I cannot then downcast into a Double. Second, I would be missing "type" checking on the fact that I am requiring my passed in class to have a String arg constructor.

My question is: How can I properly implement this general purpose CsvReader?

Thanks, Jonah

Jonah

4 Answers


I'm not sure a generic CSV reader would be this simple to use (and to create, by the way).

The first question that comes to my mind is: What if the CSV contains three columns: first an integer, then a string and finally a date? How would you use your generic CSV reader?

Anyway, let's suppose you want to create a CSV reader where all columns are of the same type. As you said, you can't parameterize a class on a type "that accepts a String as constructor". Java just doesn't allow that. The solution using reflection is a good start. But what if your class doesn't have a constructor that takes a single String?

Here you can come up with an alternative: a parser that takes your String and returns an object of the correct type. Create a generic interface, and write an implementation for each type you want to read:

public interface Parser<T> {

    T parse(String value);

}

And then, implement:

public class StringParser implements Parser<String> {

    public String parse(String value) {
        return value;
    }

}

Then, your CSV reader can take a Parser as one of its parameters and use it to convert each String into a Java object.

With this solution, you get rid of the not-so-pretty reflection you were using. And you can convert to any type; you just have to implement a Parser for it.

Your reader will look like this:

public class CSVReader<T> {

    Parser<T> parser;

    List<T> getValues() {
        // ...
    }

}
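
To make the skeleton above concrete, here is a minimal sketch, not a definitive implementation. The names SimpleCsvReader, DoubleParser and read are hypothetical, the input is assumed to be already split into lines to keep I/O out of the way, and the naive split on "," assumes there are no quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

// The parser abstraction from the answer: one String in, one typed value out.
interface Parser<T> {
    T parse(String value);
}

// Hypothetical implementation for a file of doubles.
class DoubleParser implements Parser<Double> {
    public Double parse(String value) {
        return Double.valueOf(value.trim());
    }
}

// Hypothetical reader: every cell is converted with the same parser.
class SimpleCsvReader<T> {
    private final Parser<T> parser;

    SimpleCsvReader(Parser<T> parser) {
        this.parser = parser;
    }

    // Assumption: lines were read elsewhere and contain no quoted commas.
    List<List<T>> read(Iterable<String> lines) {
        List<List<T>> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            List<T> row = new ArrayList<>();
            // limit -1 keeps trailing empty cells instead of dropping them
            for (String cell : line.split(",", -1)) {
                row.add(parser.parse(cell));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Because Parser has a single method, a lambda also works on Java 8 or later, for example new SimpleCsvReader<String>(s -> s) for a file of plain strings.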

Now, back to the problem where a CSV file can have columns of different types: just improve your reader a little. All you need is a list of parsers (one per column) instead of a single parser for all columns.
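
A rough sketch of the one-parser-per-column idea follows. To keep the block self-contained it uses java.util.function.Function<String, ?> as a stand-in for the Parser interface, and the names ColumnCsvReader and parseRow are made up for illustration. Rows come back as List<Object>, since the columns no longer share a type.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical reader that applies a different parser to each column.
class ColumnCsvReader {
    private final List<Function<String, ?>> columnParsers;

    ColumnCsvReader(List<Function<String, ?>> columnParsers) {
        this.columnParsers = columnParsers;
    }

    // Assumption: the line contains no quoted commas.
    List<Object> parseRow(String line) {
        String[] cells = line.split(",", -1);
        if (cells.length != columnParsers.size()) {
            throw new IllegalArgumentException(
                "Expected " + columnParsers.size() + " columns but found " + cells.length);
        }
        List<Object> row = new ArrayList<>();
        for (int i = 0; i < cells.length; i++) {
            // column i is converted by parser i
            row.add(columnParsers.get(i).apply(cells[i]));
        }
        return row;
    }

    public static void main(String[] args) {
        // The integer / string / date layout from the example above.
        List<Function<String, ?>> parsers = new ArrayList<>();
        parsers.add(Integer::valueOf);
        parsers.add(s -> s);
        parsers.add(LocalDate::parse); // expects ISO dates such as 2010-09-19
        System.out.println(new ColumnCsvReader(parsers).parseRow("42,abc,2010-09-19"));
    }
}
```

The price of mixed columns is that the caller has to cast each cell back to its real type, so this trades some compile-time checking for flexibility.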

Hope that helps :-)

Vivien Barousse

If you are trying to do real work, I suggest you forget that and use Scanner.

If you are experimenting: I would make CsvReader an abstract class:

public abstract class CsvReader<T> {
...
    // This is what you use in the rest of CsvReader
    // to create your objects from the strings in the CSV
    protected abstract T parse(String s);
...
}

And it would be used as:

CsvReader<Double> reader = new CsvReader<Double>() {
    @Override protected Double parse(String s) {
        return Double.valueOf(s);
    }
};
...

Not perfect, but reasonable.


EDIT: It turns out that you can have it your way, though it looks a bit hackish. See Super Type Tokens. It would basically involve including the logic shown in the Super Type Tokens link in CsvReader, so that the class object corresponding to your element class is available.
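
As a hedged illustration of the super type token idea (the class name TokenCsvReader and its parse method are hypothetical, and this is only one way to wire it up), the reader can recover its element class from the generic superclass of an anonymous subclass, then look up the String constructor reflectively. Note that Class<T>.getConstructor returns a Constructor<T>, so newInstance already yields a T.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical reader base class that discovers T at runtime.
abstract class TokenCsvReader<T> {
    private final Constructor<T> stringCtor;

    @SuppressWarnings("unchecked")
    protected TokenCsvReader() {
        // For an anonymous subclass such as new TokenCsvReader<Double>() {},
        // the generic superclass still records the type argument Double.
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalStateException("Instantiate as new TokenCsvReader<X>() {}");
        }
        Type arg = ((ParameterizedType) superType).getActualTypeArguments()[0];
        if (!(arg instanceof Class)) {
            throw new IllegalStateException("Element type must be a plain class: " + arg);
        }
        Class<T> elementClass = (Class<T>) arg;
        Constructor<T> ctor;
        try {
            ctor = elementClass.getConstructor(String.class);
        } catch (NoSuchMethodException e) {
            // The "has a String constructor" requirement is only checked here, at runtime.
            throw new IllegalStateException(elementClass + " has no public String constructor", e);
        }
        this.stringCtor = ctor;
    }

    // Converts one cell by calling the element type's String constructor.
    T parse(String cell) {
        try {
            return stringCtor.newInstance(cell);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot parse: " + cell, e);
        }
    }
}
```

Usage would be TokenCsvReader<Double> reader = new TokenCsvReader<Double>() {}; with the trailing braces being essential, since they create the anonymous subclass that carries the type argument.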

BalusC
gpeche
  • +1 for the java.util.Scanner class I didn't know, need it nearly 9000 times. – Ither Sep 19 '10 at 19:15
  • Yes. At the very least I would get rid of `StringTokenizer` and use `String.split()` instead. – gpeche Sep 19 '10 at 19:54
  • Do you guys mean I should use Scanner to implement the code that breaks up the text data and puts it into ArrayLists? Also, what is the advantage of String.split() over StringTokenizer, out of curiosity? – Jonah Sep 19 '10 at 20:25
  • gpeche, thanks for your contribution btw -- very interesting link – Jonah Sep 19 '10 at 20:36
  • @Jonah See http://stackoverflow.com/questions/691184/scanner-vs-stringtokenizer-vs-string-split – gpeche Sep 19 '10 at 20:44

Creating a correct CSV reader might be more difficult than you thought. For example, your code example will not work correctly on the following input:

"Microsoft, Inc",1,2,3

Instead of 4 fields, what you will be getting is 5 fields based on

StringTokenizer st = new StringTokenizer(line,",");
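
The following small check reproduces that behaviour; TokenizerDemo and tokenize are names invented for this illustration. It also shows a second pitfall of StringTokenizer: consecutive delimiters are collapsed, so empty cells silently disappear.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Splits a line the same way the question's loop does.
    static List<String> tokenize(String line) {
        List<String> fields = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            fields.add(st.nextToken());
        }
        return fields;
    }

    public static void main(String[] args) {
        // The quoted company name is cut at its embedded comma.
        System.out.println(tokenize("\"Microsoft, Inc\",1,2,3"));
        // The empty middle cell is skipped entirely.
        System.out.println(tokenize("a,,b"));
    }
}
```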

My suggestion is to use a third-party library implementation. For example:

http://opencsv.sourceforge.net/

I use it in one of my applications, which has been running for 3 years. So far so good.

Cheok Yan Cheng

I needed to read a simple list of strings stored in the cells of a CSV file, and started searching for a Java solution. I found most open source CSV readers to be unnecessarily complicated for my purpose. (See https://agiletribe.purplehillsbooks.com/2012/11/23/the-only-class-you-need-for-csv-files/ for a comprehensive review.) Finally I found MKYong's code very effective. I had to adapt it to read the whole CSV or TSV file and return it as a list of lists, where each element of an inner list represents one cell of the CSV. The code, along with credits to MKYong, can be found at: https://github.com/ramanraja/CsvReader

AgilePro
Raja