1

When parsing a .csv file I iterate over the file's column headers and see if one of them equals (ignoring case) comparand id:

String comparand = "id";
for (String header : headerMap.keySet()) {
   if (header.equalsIgnoreCase(comparand)) {
      recordMap.put("_id", csvRecord.get(header));
   } else {
      recordMap.put(header, csvRecord.get(header));
   }
}

The file is read using the UTF-8 charset:

Reader reader = new InputStreamReader(file.getInputStream(), StandardCharsets.UTF_8);

The CSV parser library I use is Apache Commons CSV:

CSVParser csvParser = CSVFormat.DEFAULT
   .withDelimiter(delimiter)
   .withFirstRecordAsHeader()
   .withIgnoreEmptyLines()
   .parse(reader);

Map<String, Integer> headerMap = csvParser.getHeaderMap();

Somehow the above equalsIgnoreCase() evaluates to false while both strings have the value id.

Observing the debugger shows that the header value is a non-compact string (UTF-16) whereas the comparand value is a compact string (ASCII):

Non-compact string vs. compact string

Is this default behavior or a bug? How can I make the equalsIgnoreCase evaluate to true as one would expect?

Karol Dowbecki
  • 38,744
  • 9
  • 58
  • 89
whitebrow
  • 1,610
  • 16
  • 21

1 Answers1

4

Your header value starts with an UTF-16 BOM FFFE. Strip the BOM when reading the header before comparing it with comparand.

Karol Dowbecki
  • 38,744
  • 9
  • 58
  • 89