I have a CSV file like the one below:

COL1,COL2,COL3,COL4
3920,10163,"ST. PAUL, MN",TWIN CITIES

I want to read the file and split each line on the commas that are outside double quotes, WITHOUT using any external libraries. For example, the data line in the CSV above should be split into 4 parts:
1. 3920
2. 10163
3. ST. PAUL, MN
4. TWIN CITIES

I tried using a regex with the following code, but it never worked. I want to make this work in Groovy. I tried different solutions given for Java, but couldn't get any of them to work.

NOTE: I don't want to use any external Grails/JAR dependencies to make this work.

def staticCSV = new File("staticMapping.csv")
staticCSV.eachLine {line->
def parts = line.split(",(?=(?:[^\"]\"[^\"]\")[^\"]\${1})")
parts.each {
    println "${it}"
}
}
user1523153
  • 1
    Any solution you get will be brittle. Just use an external library that does this right and is thoroughly tested – tim_yates Mar 17 '19 at 17:34
  • 1
    Possible duplicate of [Java: splitting a comma-separated string but ignoring commas in quotes](https://stackoverflow.com/questions/1757065/java-splitting-a-comma-separated-string-but-ignoring-commas-in-quotes) – Szymon Stepniak Mar 17 '19 at 18:55

2 Answers

1

Got the solution:

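def getcsvListofListFromFile(String fileName) {
    def lol = []   // list of rows, each row being a list of field strings
    // Match a comma only when an even number of double quotes follows it,
    // i.e. when the comma lies outside a quoted field.
    def r1 = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*\$)"

    try {
        def csvf = new File(fileName)
        csvf.eachLine { line ->
            def c1 = line.split(r1)
            def c2 = []
            c1.each { e1 ->
                // Strip one enclosing double quote from each end, if present
                def s = e1.replaceAll('^"', "").replaceAll('"\$', "")
                c2.add(s)
            }
            lol.add(c2)
        }
        return lol
    } catch (Exception e) {
        def eMsg = "Error reading file [" + fileName + "] --- " + e.getMessage()
        throw new RuntimeException(eMsg, e)   // keep the original exception as the cause
    }
}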
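
As a quick check of the regex above, here is a minimal sketch that applies it to the sample line from the question. The variable names `pattern`, `line` and `fields` are illustrative only and are not part of the original code.

```groovy
// Same regex as above, written as a single-quoted string so that
// neither the double quotes nor the dollar sign need escaping.
def pattern = ',(?=(?:[^"]*"[^"]*")*[^"]*$)'

def line = '3920,10163,"ST. PAUL, MN",TWIN CITIES'

// Split on commas outside quotes, then strip one enclosing quote from each end.
def fields = line.split(pattern).collect { field ->
    field.replaceAll('^"', '').replaceAll('"$', '')
}

fields.each { println it }
assert fields == ['3920', '10163', 'ST. PAUL, MN', 'TWIN CITIES']
```

Note that `split` without a limit drops trailing empty strings, so a line ending in `,,` yields fewer columns than expected; `line.split(pattern, -1)` keeps them.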
user1523153
0

Using a ready-made library is the better idea, but you certainly have your reasons. Here is an alternative to your solution. It splits each line on every comma and then reassembles the parts that originally belonged together (see multiPart).

def content =
"""COL1,COL2,COL3,COL4
   3920,10163, "ST. PAUL, MN" ,TWIN CITIES
   3920,10163, "   ST. PAUL, MN " ,TWIN CITIES, ,"Bla,Bla, Bla" """  

content.eachLine {line ->
    def multiPart
    for (part in line.split(/,/)) {
        if (!part.trim()) continue         // for empty parts 
        if (part =~ /^\s*\"/) {            // beginning of a multipart
            multiPart = part
            continue
        } else if (part =~ /"\s*$/) {      // end of the multipart
            multiPart += "," + part
            println multiPart.replaceAll(/"/, "").trim()
            multiPart = null
            continue
        }        
        if (multiPart) {
            multiPart += "," + part
        } else {
            println part.trim()
        }        
    }
}

Output (you can copy the code directly into the GroovyConsole to run it):

COL1
COL2
COL3
COL4
3920
10163
ST. PAUL, MN
TWIN CITIES
3920
10163
ST. PAUL, MN
TWIN CITIES
Bla,Bla, Bla
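
The same split-and-reassemble idea can also be wrapped in a function that returns the fields instead of printing them. The following is only a sketch, not a full CSV parser: the name `splitCsvLine` is hypothetical, and unlike the code above it keeps empty fields and accepts a quoted field that contains no comma. It assumes that a field never contains an escaped quote (`""`).

```groovy
// Hypothetical helper: split on every comma, then glue back together
// the pieces that belong to one quoted field.
List<String> splitCsvLine(String line) {
    def fields = []
    def pending = null                      // pieces of an unfinished quoted field
    for (part in line.split(',', -1)) {     // limit -1 keeps trailing empty pieces
        if (pending == null) {
            if (part.trim().startsWith('"')) {
                pending = part              // a quoted field starts here
            } else {
                fields << part.trim()       // plain field, complete as is
                continue
            }
        } else {
            pending += ',' + part           // restore the comma that split() removed
        }
        // The quoted field is complete once the collected text ends with a quote.
        def t = pending.trim()
        if (t.length() > 1 && t.endsWith('"')) {
            fields << t.substring(1, t.length() - 1).trim()
            pending = null
        }
    }
    if (pending != null) fields << pending.trim()   // unterminated quote: keep the raw text
    return fields
}

splitCsvLine('3920,10163, "ST. PAUL, MN" ,TWIN CITIES').each { println it }
```

Returning a list per line makes it easy to build the list of rows with `content.readLines().collect { splitCsvLine(it) }`.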