
I'm consuming a CSV that is generated by an external process. This CSV goes to different places and requires different columns to be included or excluded.

An example of the difference in files...

File 1:

Col1,Col2,Col3,Col4,Col5
ABC,DEF,GHI,JKL,MNO

File 2:

Col4,Col5
JKL,MNO

Pseudocode:

1. Open the initial CSV file and create a new CSV file.
2. Loop through the initial CSV file and, for each line, copy only the columns needed.
3. Drop the new file in its new location.

I'm stuck on copying only the right columns, or on removing the unwanted ones. Is there an easy way to loop through each row and just remove data up to a certain comma?

seesharp
  • Welcome to Stack Overflow! Have you tried to code anything? Are you including or excluding columns by name or number? Are you handling commas in a column’s text field, or simply praying they will never exist? – AJNeufeld Oct 18 '17 at 13:54
  • In case you are using Java 8 or newer, take a look at https://stackoverflow.com/questions/769621/dealing-with-commas-in-a-csv-file?rq=1 – Reporter Oct 18 '17 at 13:57
  • @AJNeufeld Thanks! Yes, the file copy and rewriting wasn't an issue. I was able to remove the first header row (by name, ex. Col1, Col2, Col3), but since each row after that doesn't follow that same format (more dynamic data, can be ABC, XYX, BBQ) I couldn't just remove those using the same logic. The external process that generates the CSV does not allow commas in the text fields, so I don't have to worry about that. – seesharp Oct 18 '17 at 14:12

2 Answers


Split each CSV line on commas and take the columns you need. This demo handles only a single CSV line, but you can extend the program to handle multiple lines.

    public class Main
    {
        public static void main(String[] args)
        {
            // In a real program, read each line of the file into inputCsv
            String inputCsv = "c0,c1,c2,c3";
            String outputCsv = "";
            // Zero-based indexes of the columns to keep
            int[] colsNeeded = {1, 3};

            String[] cols = inputCsv.split(",");
            for (int i = 0; i < colsNeeded.length; i++) {
                outputCsv += cols[colsNeeded[i]];
                // Add a separator after every column except the last
                if (i < colsNeeded.length - 1)
                    outputCsv += ",";
            }
            System.out.println(outputCsv); // prints c1,c3
            // Write outputCsv to the output file
        }
    }

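One way the single-line demo could be extended to a whole file is sketched below. It reads the header row once, maps the wanted column names to positions, and then applies the same index-based selection to every row, so the header and the data rows stay aligned. The class name `ColumnFilter` and the method `filter` are hypothetical names chosen for illustration, and the sketch assumes (as the asker states) that no field contains a comma; quoted fields and embedded line breaks are not handled.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class ColumnFilter {

    // Hypothetical helper: copies only the named columns from in to out, line by line.
    // Assumes no field contains a comma, a quote, or a line break.
    public static void filter(Reader in, Writer out, List<String> wanted) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);

        String header = reader.readLine();
        if (header == null) {
            return; // empty input, nothing to copy
        }

        // Resolve each wanted column name to its position in the header row.
        List<String> names = Arrays.asList(header.split(",", -1));
        int[] indexes = new int[wanted.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = names.indexOf(wanted.get(i));
            if (indexes[i] < 0) {
                throw new IllegalArgumentException("No such column: " + wanted.get(i));
            }
        }

        // The header and every data row go through the same selection.
        String line = header;
        while (line != null) {
            String[] cols = line.split(",", -1); // -1 keeps trailing empty fields
            StringBuilder selected = new StringBuilder();
            for (int i = 0; i < indexes.length; i++) {
                if (i > 0) {
                    selected.append(',');
                }
                // A row shorter than the header yields an empty field.
                selected.append(indexes[i] < cols.length ? cols[indexes[i]] : "");
            }
            writer.write(selected.toString());
            writer.write('\n'); // fixed separator; newLine() would be platform-specific
            line = reader.readLine();
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        String input = "Col1,Col2,Col3,Col4,Col5\nABC,DEF,GHI,JKL,MNO\n";
        StringWriter output = new StringWriter();
        filter(new StringReader(input), output, Arrays.asList("Col4", "Col5"));
        System.out.print(output);
    }
}
```

Run against the File 1 sample from the question, `main` prints the File 2 content (`Col4,Col5` followed by `JKL,MNO`). For real files, the `StringReader` and `StringWriter` can be swapped for `Files.newBufferedReader(path)` and `Files.newBufferedWriter(path)`; because only one line is held in memory at a time, the file size should not matter much.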
Souradeep Nanda
  • Note that repeatedly concatenating onto a string is slow; you may want to write directly to the file instead (a BufferedWriter helps). Reading in the full file in one go also seems excessive, depending on the file size; it may be easier to process it line by line. – phflack Oct 18 '17 at 14:10
  • These are good suggestions which I am feeling too lazy to implement :P An even better solution would be to eliminate file writing altogether. The OP says that the data comes from another process. It would be more reasonable to pipe the data instead of going through disk I/O. – Souradeep Nanda Oct 18 '17 at 14:12

Just use univocity-parsers for that:

    String input = "Col1,Col2,Col3,Col4,Col5\n" +
            "ABC,DEF,GHI,JKL,MNO\n";

    Reader inputReader = new StringReader(input); //reading from your input string. Use FileReader for files
    Writer outputWriter = new StringWriter(); //writing into another string. Use FileWriter for files.

    CsvParserSettings parserSettings = new CsvParserSettings(); //configure the parser
    parserSettings.selectFields("Col4", "Col5"); //select fields you need here

    //For convenience, just use ready to use routines.
    CsvRoutines routines = new CsvRoutines(parserSettings);

    //call parse and write to read the selected columns and write them to the output
    routines.parseAndWrite(inputReader, outputWriter);

    //print the result
    System.out.println(outputWriter);

Output:

    Col4,Col5
    JKL,MNO

Hope this helps.

Disclaimer: I'm the author of this library. It's open-source and free (Apache 2.0 license).

Jeronimo Backes