0

I have a CSV file that looks like this:

"2014", "2", "AMC-South", "inpatient", "complication", "1", "2", "2", "13,125.83", "6,562.95"

How can I remove all the quotes and the commas separating the items, so that it looks like this?

2014 2 AMC-South inpatient complication 1 2 2 13,125.83 6,562.95

I need this format so that I can parse the CSV file more easily (using Java). Thanks.

azro
  • `String newLine = oldLine.replace("\"", "");` - btw. your expected result still has commas in it .... – Patrick Artner Feb 09 '18 at 20:51
  • Possible duplicate of [replace String with another in java](https://stackoverflow.com/questions/5216272/replace-string-with-another-in-java) – Patrick Artner Feb 09 '18 at 20:51
  • @PatrickArtner I only need the commas separating the values to be gone. For example, 13,125.83 is a single number –  Feb 09 '18 at 20:56
  • Please take a peek at this answer: https://stackoverflow.com/a/24950812/7505395 - there are CSV parsing library recommendations in it. – Patrick Artner Feb 09 '18 at 21:04
  • Don't re-invent the wheel. I recommend the Apache CSV Parser. Parsing CSV files is not something you should try to do yourself. It's very easy to get this wrong. – Dawood ibn Kareem Feb 09 '18 at 21:15

5 Answers

0

Command line one-liner, using Perl:

$ echo '"2014", "2", "AMC-South", "inpatient", "complication", "1", "2", "2", "13,125.83", "6,562.95"'
"2014", "2", "AMC-South", "inpatient", "complication", "1", "2", "2", "13,125.83", "6,562.95"


$ echo '"2014", "2", "AMC-South", "inpatient", "complication", "1", "2", "2", "13,125.83", "6,562.95"' | perl -pe 's/^"//; s/", "/ /g; s/"$//;'
2014 2 AMC-South inpatient complication 1 2 2 13,125.83 6,562.95

Please note that the space-separated result is only unambiguous for CSV where the fields do not contain white space. That's the reason the CSV has those " around each field.

IMHO you should look for a Java CSV parser module. It will make life much easier in the long run.
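For readers who do not use Perl, the three substitutions above could be translated to Java roughly as follows. This is only a sketch: the class name `PerlStyleStrip` and the helper `strip` are hypothetical, and it carries the same limitation as the one-liner (it assumes every field is quoted and separated by exactly `", "`).

```java
public class PerlStyleStrip {

    // Hypothetical helper mirroring the three Perl substitutions, in order.
    static String strip(String line) {
        return line
                .replaceFirst("^\"", "")     // s/^"//      drop the leading quote
                .replaceAll("\", \"", " ")   // s/", "/ /g  turn each separator into one space
                .replaceFirst("\"$", "");    // s/"$//      drop the trailing quote
    }

    public static void main(String[] args) {
        String line = "\"2014\", \"2\", \"AMC-South\", \"inpatient\", \"complication\", "
                + "\"1\", \"2\", \"2\", \"13,125.83\", \"6,562.95\"";
        // prints: 2014 2 AMC-South inpatient complication 1 2 2 13,125.83 6,562.95
        System.out.println(strip(line));
    }
}
```

The commas inside 13,125.83 and 6,562.95 survive because only the full `", "` sequence is replaced.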

Stefan Becker
  • Just showing the proper regexes which should work with any decent language or tool that uses PCRE. – Stefan Becker Feb 09 '18 at 20:56
  • OP is interested in a Java solution. Regular expressions aside, this solution is completely useless to someone who doesn't know how PERL parses its command line. It's also wrong in many cases - even if this were translated to a Java program, there are loads of CSV files on which it would fail. – Dawood ibn Kareem Feb 09 '18 at 21:18
0

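Here is an outline of the approach:

Java's String.replace() method returns a new string in which every occurrence of the old char or CharSequence is replaced by the new char or CharSequence. It matches the text literally, not as a regular expression.

String replaceString = your_string.replace("string_to_replace", "replacement_string");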

Consider this instead:

replaceAll(String regex, String replacement)

Replaces each substring of this string that matches the given regular expression with the given replacement.
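To make the difference concrete, here is a minimal sketch that applies the pattern `[",]+` with both methods. The class name `LiteralVsRegex` and the shortened sample line are assumptions for illustration only. Note that this particular pattern also strips the commas inside the numbers, which is not what the question asks for.

```java
public class LiteralVsRegex {

    // Hypothetical sample line, shortened from the data in the question.
    static final String LINE = "\"2014\", \"2\", \"13,125.83\", \"6,562.95\"";

    static String literal(String s) {
        // replace() searches for the literal text [",]+ which never occurs, so s comes back unchanged
        return s.replace("[\",]+", "");
    }

    static String regex(String s) {
        // replaceAll() reads [",]+ as a character class: every run of quotes and commas is removed
        return s.replaceAll("[\",]+", "");
    }

    public static void main(String[] args) {
        System.out.println(literal(LINE)); // "2014", "2", "13,125.83", "6,562.95"
        System.out.println(regex(LINE));   // 2014 2 13125.83 6562.95
    }
}
```

A pattern that only targets the separator between fields, rather than every comma, would keep 13,125.83 intact.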

Possible Regex

Remario
0

As a workaround for the fact that several values contain commas, you could split on the character sequence ", " (quote, comma, space, quote). Then all you need to do is remove the leading " from the first element and the trailing " from the last one.

String[] data = scanner.nextLine().split("\", \"");

if (data.length > 0)
{
    // Strings are immutable, so the result has to be assigned back
    data[0] = data[0].replace("\"", "");
    data[data.length - 1] = data[data.length - 1].replace("\"", "");
}

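You could also split on the regex [\D+],[\D+] (a comma with a non-digit character on each side, written "[\\D+],[\\D+]" in a Java string literal) and, after the array is returned, remove any and all " from each string within the array.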
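The split-based idea can be sketched as a small self-contained program. The class name `SplitDemo` and the helper `splitQuoted` are hypothetical, and the separator pattern allows any amount of whitespace after the comma, which is an assumption beyond the single space in the sample data.

```java
public class SplitDemo {

    // Hypothetical helper: splits one fully quoted CSV line into its fields.
    static String[] splitQuoted(String line) {
        // Separator: closing quote, comma, optional whitespace, opening quote
        String[] data = line.split("\",\\s*\"");
        if (data.length > 0) {
            // Only the first and last element still carry a quote
            data[0] = data[0].replace("\"", "");
            int last = data.length - 1;
            data[last] = data[last].replace("\"", "");
        }
        return data;
    }

    public static void main(String[] args) {
        String line = "\"2014\", \"2\", \"AMC-South\", \"13,125.83\", \"6,562.95\"";
        String[] fields = splitQuoted(line);
        // prints: 2014 2 AMC-South 13,125.83 6,562.95
        System.out.println(String.join(" ", fields));
    }
}
```

Keeping the fields in an array avoids re-parsing a space-separated string later, which matters if a field ever contains a space.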

RAZ_Muh_Taz
0

Have you considered using a library to parse data? Apache Commons CSV is great for that - https://commons.apache.org/proper/commons-csv/

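import java.io.File;
import java.nio.charset.StandardCharsets;
import org.apache.commons.csv.*;

File csvData = new File("/path/to/csv");
// UTF-8 is an assumption about the file; ignoring surrounding spaces handles the space after each comma
CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces());
for (CSVRecord record : parser) {
     ...
}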
medicm
0

Regex: ",?

Details:

  • " matches a literal double quote
  • ,? matches a comma between zero and one times

Java code:

String text = "\"2014\", \"2\", \"AMC-South\", \"inpatient\", \"complication\", \"1\", \"2\", \"2\", \"13,125.83\", \"6,562.95\"";
text = text.replaceAll("\",?", "");

System.out.println(text);

Output:

2014 2 AMC-South inpatient complication 1 2 2 13,125.83 6,562.95
Srdjan M.